This post is excerpted from my newsletter. Subscribe to follow this site and you will be automatically added to the newsletter e-mail list.
This year’s Mobile World Congress (MWC) saw record attendance, with reportedly over 100,000 attendees. If it were possible to untangle a single thread from the event, it would have to be the approaching launch of 5G networks. The advent of a new standard is always an important event in the mobile landscape. However, that is a topic for a different day, and in all honesty, it is better covered by others. Instead, I want to focus on what I think could be the key emerging trend – Network Function Virtualization or NFV.
Fantastic, another acronym. To make matters worse, NFV is not exactly new. The idea has been kicking around for over five years. So now readers are split into two groups – those who have never heard of NFV, and those who are tired of hearing about it. Stay with me, it gets better.
For those new to NFV, the basics. NFV is what everyone in the IT world calls virtualization. That is, the ability to run multiple ‘instances’ of an operating system on a single computer. Put another way, it is the ability to share a single server across multiple users (where users may be other machines). Virtualization has been mainstream in enterprise data centers for so long that we are already moving to new frameworks beyond it. (These go by names like microservices or containers, and we will get back to them in a moment.) However, for telecom services, virtualization is still new and largely untried.
Telecom operators already use virtualization in their own internal IT networks, the stuff where their corporate e-mail and other applications run. But for the most part, there is little virtualization in the equipment that powers base stations and the rest of the mobile network. Operator networks still work differently from the other big network we use, the Internet. This is true for a variety of reasons. Telecom networks have much more stringent reliability requirements. If Netflix goes down for a few hours, the worst that happens is you have to go outside into the sunshine or, at worst, actually talk to someone. If a telecom network goes down, then 911 calls stop working, disasters ensue, and operator executives get called to testify in front of Congress or Parliament, or in closed sessions with angry regulators. It also turns out that there are a bunch of technical reasons why virtualized systems cannot handle telecom workloads. Again, more on that in a moment.
The way things stand today, every service provided by a telecom network – connectivity, roaming, billing, messaging, etc. – is handled by a specific piece of equipment. Each of these boxes has its own capacity, which means that a national network can have dozens or hundreds of each box scattered around a country. As one telecom architect once told me, most of his work is not architecture, it’s plumbing: getting all those boxes working together. This is a lot of complexity, for which read: expensive to install and operate.
The operators would instead like to have all of those services converted to software, and then run that software on a common server. Remove all the boxes, and tie everything together with code. This is exactly what virtualization provides, taking all the functions of the network and running them on shared servers. Thus Network Function Virtualization.
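To make the idea concrete, here is a minimal toy sketch in Python – not a real NFV stack, and all class and service names are invented for illustration – of what “network functions as software on a shared server” means:

```python
# Toy illustration of the NFV idea: each network function is plain
# software, and many of them share one commodity host. All names
# here are hypothetical; real NFV stacks are vastly more complex.

class VirtualNetworkFunction:
    """A network service (billing, messaging, ...) packaged as software."""
    def __init__(self, name: str):
        self.name = name

    def handle(self, request: str) -> str:
        return f"{self.name} processed {request}"


class SharedHost:
    """One commodity server running many functions side by side."""
    def __init__(self):
        self.functions = {}

    def deploy(self, vnf: VirtualNetworkFunction) -> None:
        # Adding a service is a software deployment, not a new box.
        self.functions[vnf.name] = vnf

    def route(self, name: str, request: str) -> str:
        return self.functions[name].handle(request)


host = SharedHost()
for service in ("billing", "messaging", "roaming"):
    host.deploy(VirtualNetworkFunction(service))

print(host.route("billing", "call-record-123"))
# → billing processed call-record-123
```

One box, many services – that is the whole pitch, and everything else in this post is about what it takes to make that picture carrier-grade.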
But Wait, There’s More
The operators see two real benefits from NFV. The first is the ability to consolidate their hardware. By implementing NFV, they hope to buy less hardware. More to the point, they hope to save money by buying the same servers that everyone else uses in their data centers, which tend to be much less expensive than the specialty gear the operators currently use. In theory, this should lead to lower capex, or upfront costs.
Perhaps more important, running a virtualized network could (should?) allow the operators to reduce the costs of running those networks – a reduction in opex as well. With virtualized networks, they do not need to train their operations staff on the intricacies of all those individual boxes. Instead, the hardware team just has to manage a set of standard servers. And the world is filled with IT admins capable of doing that.
However, there is another important benefit of virtualization. Today, if a carrier wants to launch a new service, like voicemail, it needs a lot of time to plan, configure and test it. This is expensive and greatly limits the ability to bring on new features. In theory, running a virtualized network would dramatically reduce that time (i.e. cost) and allow for greater service experimentation. This idea is commonly referred to as ‘service agility’, a term which appears throughout NFV literature.
Will it Happen?
All of this sounds very appealing. Of course, the reality is more complicated. First, whenever it comes to anything to do with the Telecom world, there is the question of standardization. The fact that you can dial a number in San Francisco and a phone in Ulaanbaatar rings is nothing short of a miracle, which we now take entirely for granted. That is the beauty of standardization. The drawback is that standards are designed by committees, with all that entails. NFV standardization is further muddled by the fact that, in principle, the standards piece could become just one more virtualized function. One of the big challenges facing NFV is that it proposes a very different approach to computing, one that looks a lot like the ‘everything is a perpetual beta test’ mode used by the purveyors of the modern Internet. There is something of a clash of cultures involved in the transition to NFV, and this has created some chaos. Creative chaos, but chaos nonetheless.
However, in a big departure from the air interface standards (e.g. 3G, 4G, 5G), the operators may not need to agree on a single version of NFV to move forward. There is enough flexibility in all of this that several operators are well down the path towards their own version of NFV. It is no coincidence that this is largely how web standards work. HTML, for instance, is standardized somewhat after the fact, with browser makers each testing out new features and later getting together to standardize things. This is a radical departure from telecom standards, where everyone has to agree on the standard before the first piece of equipment gets installed.
No one knows how many carriers have started down the path, but from what I could learn at the show, many dozens have already deployed significant virtualization initiatives on live networks. For instance, I heard that one major US carrier already runs 40% of its traffic over some form of NFV system. And I think another one may be even further ahead. Admittedly, there are many different pieces of NFV, and even the most advanced deployments today probably only encompass a small portion of full NFV capabilities.
It turns out there are still numerous technical challenges that need to be addressed for NFV to work. I am not going to dive into the details and block diagrams of NFV architecture here. And so I will skip over the various standards and ‘Open’ consortia tackling NFV. Suffice it to say, there are a lot of moving pieces, but these seem to be coalescing into a smaller number of contending approaches. So progress, but no conclusion yet.
Devil is in the Timing
There is also the very real possibility that NFV could still be many years away. The operators and system vendors are just beginning to explore much of NFV’s promised functionality. There could be serious technical minefields still out there. For instance, there is the seemingly tiny issue of timing.
Modern telecom networks, especially mobile networks, rely on precise timing. Every mobile base station runs on a coordinated clock, with precision measured in microseconds. Again, I will skip over the technical reasons for this, but timing is a very big deal. However, virtual machines (VMs), the basic unit of a virtualized system, turn out to be really bad at timing. This is a feature, not a bug: the original VMs were designed to be as efficient as possible, to cram as much compute into a single server as possible. If the VM performs some calculation, it sends the result off immediately. No sense waiting around and gumming up the system’s communications pathways (aka non-blocking I/O). At MWC, one VM expert told me that big data center virtualized systems can see their timing get many seconds out of sync. This is not a big deal for some systems, but for telecom systems it is a disaster. Most people I speak with about NFV are not even aware of the issue – very few operators have gotten deep enough in their deployments to fully understand the problem – but I think it is a big monster lurking out there. For anyone interested, I know one company that appears to have a pretty elegant solution, ahead of anyone even realizing that there is a problem.
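A back-of-the-envelope sketch shows why even tiny drift matters at telecom scales. The drift rate and the synchronization budget below are invented for illustration; real figures depend on the hardware, the hypervisor and the radio technology in question:

```python
# Toy model of VM clock drift. Both the drift rate and the sync
# budget are illustrative assumptions, not measured values.

def vm_clock(true_time_s: float, drift_ppm: float) -> float:
    """A virtualized clock that runs fast or slow by drift_ppm parts per million."""
    return true_time_s * (1 + drift_ppm / 1_000_000)

# A telecom-style sync budget, in microseconds (illustrative).
BUDGET_US = 1.5

def out_of_budget(true_time_s: float, drift_ppm: float,
                  budget_us: float = BUDGET_US) -> bool:
    """True once accumulated clock error exceeds the sync budget."""
    error_us = abs(vm_clock(true_time_s, drift_ppm) - true_time_s) * 1e6
    return error_us > budget_us

# Even a modest 10 ppm drift accumulates 10 microseconds of error
# after just one second of wall time – well past the budget:
print(out_of_budget(1.0, drift_ppm=10))  # → True
```

The point of the toy is the scaling: error grows linearly with time, so without continuous correction a drifting clock does not stay inside a microsecond budget for more than a fraction of a second, let alone the “many seconds” of desync quoted above.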
Another important issue is the future nature of computing itself. At the start of this piece, I mentioned that the big data center builders of the Internet and enterprise space are rapidly moving beyond virtualization to microservices. Put simply, this idea takes virtualization down to an even more granular level, allowing for even greater flexibility and efficiency in the use of shared server resources. This evolution may further delay implementation of NFV by operators, as everyone needs to get up to speed on the latest technologies. On the other hand, much of what the operators do with their servers is perfectly suited to microservices. There is a natural fit there that could drive efficiency in a meaningful way.
If it does happen, it will be big
So while there are still a lot of questions about NFV, there are some good reasons to care about it.
In theory, NFV could help the operators deal with their perennial problem of their costs growing faster than their revenue. The operators are very worried about ‘breaking their cost curves’. NFV holds the promise of achieving this. Of course, the cynics will point out that this promise has been tied to many, many other past technologies. Even if NFV turns out not to be a miracle cure, it should still be able to generate positive ROI for operators in several ways.
For companies selling into operators, NFV has the potential to translate into a massive wave of revenue dollars. NFV will likely entail the operators building out a very large number of data centers and filling them with equipment. I have heard estimates that go as high as 100,000 data centers. Take this with a grain of salt – these will not be 100k-rack data centers like some of the large Internet providers build. These will be smaller, local affairs. Even after factoring that in, the numbers could still be big.
However, at this point the story takes a twist. There is a very real possibility that the beneficiaries of all that carrier spending will not be the traditional telecom equipment vendors. The whole point of NFV is to break apart the link between hardware and software. All those ‘common’ servers the operators may deploy are usually referred to as commodity hardware, as in low cost.
A Seeming Digression about Pets
Let me construct a hypothetical example – Generic Networks (I would have called it Acme, Inc., but that of course was a real company). GN makes a Hamster Tracker, and operators in many countries sell HamTrac services to their pet-loving customers. Each HamTrac box costs $1 million and can track a million hamsters simultaneously (serious engineering went into this product). If operators want to add chinchilla tracking, that is another $1 million box. One day, GN’s biggest customer, a major US operator, says it is moving to a virtualized system. The operator mandates that all its equipment vendors migrate to commodity hardware. At first, GN just removes the cost of its hardware from the price they charge. Now a software license for HamTrac goes for $800k. Everyone benefits. The operator saves $200k per box, as well as ongoing opex benefits from not having to manage all those boxes. The impact on GN is at least neutral: they get $200k less revenue, but that was all cost to them, so their profits are unchanged.
But what happens when the operator wants to implement ChinTrac? It gets a little harder for GN to argue that product should cost $800k as well, since the two software systems are pretty similar under the hood. And then there is a bigger threat. Four Stanford grads in a shed invent AnimalTrxx, a piece of software that tracks all kinds of animals. They have almost no costs, having built their system largely on free open source software. They can charge $100k for the same capacity. Pretty soon, someone else invents ZooFinder. Previously, these companies needed hardware expertise, inventory, and a massive salesforce, precluding start-ups from entering the space. Under NFV, they just have to prove reliability, and suddenly pet tracking is a highly competitive business.
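The economics of this made-up story can be worked through in a few lines of arithmetic (all figures are the hypothetical numbers above, nothing more):

```python
# Working through the hypothetical HamTrac numbers from the story.
box_price = 1_000_000      # full HamTrac appliance: hardware + software
hardware_cost = 200_000    # the hardware portion GN drops when going software-only
license_price = box_price - hardware_cost  # software-only HamTrac license
startup_price = 100_000    # AnimalTrxx, same capacity on open source

operator_saving = box_price - license_price
print(operator_saving)                  # → 200000 saved per box, capex alone
print(license_price / startup_price)    # → 8.0: incumbent charges 8x the entrant
```

The $200k capex saving is the easy part of the pitch; the 8x price gap the entrant opens up is the part that should worry the incumbent.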
You get the idea.
In reality, there are still a lot of reasons why Telecom software is going to remain with the incumbent vendors for a long time. These companies are already finding ways to re-segment, re-market and re-price their system solutions of hardware, software and services. Many of them also sell other pieces of hardware that are too mission-critical for the operators to ever throw out. Nonetheless, there are good reasons to believe that the traditional equipment vendors are going to have to radically alter their businesses.
While 5G justifiably deserves the headlines this year, the next few years will (finally) see a rise of interest in NFV, or at least in its component parts. When I was preparing to go to the show last month, a lot of people told me that “This is going to be my last year at MWC.” I understand the sentiment. It is crowded, expensive to travel to, and of declining relevance to many segments. Nonetheless, I think many people will start to revisit their plans as the Telcos’ plans for these new technologies firm up. The big wave of capex may start to appear on people’s radars soon. And as NFV unlocks operator agility, this will open the door to new vendors and creative software initiatives.