This post is an excerpt from our newsletter in which we take a deep look at the shifting trends of the semiconductor industry. If you would like a full copy of the note, please contact us directly.
Artificial Intelligence (AI) is the hot topic in semiconductors today (and everywhere else). Google de-cloaked their Tensor Processing Units (TPUs) almost five years ago, and since then AI has become the focus of much attention, to the point that it is now perceived as a national security issue. Setting aside all that heat, it is clear that this area is going to have a big impact on the semiconductor industry. At the very least it has emerged as one of the few product categories that is enjoying double-digit growth, but further ahead it could lead to some disruption in the massive market for data center equipment.
That being said, we need to define what we are talking about. As we commented a few months ago:
“This is the beauty of semiconductors. Many think of them as hardware for processing electric signals, but they are really just software made corporeal.”
This is true for AI chips as well. AI, or Machine Learning (ML; there are differences between the two, but for the rest of this note we will just say ‘AI’ and ‘AI chips’ for simplicity), is just a very advanced form of software. At heart, this software is doing a lot of specialized math.
Much of what we call AI today is some form of Neural Network, which is just a complicated algebraic voting algorithm. These help computers ‘make decisions’ by assigning weights to various variables. Essentially, they are performing linear algebra on multi-dimensional matrices, which are also known as tensors (hence TPU).
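To make the ‘voting’ idea concrete, here is a minimal sketch of a single neural network layer in plain Python. The function names, weights and inputs are all made up for illustration, and real networks stack many such layers with far larger matrices, but the core operation is the same: multiply each input by a weight, add everything up, and pass the result along.

```python
# Illustrative sketch only: one neural-network layer as a weighted "vote".
# All names and numbers below are hypothetical, chosen for easy arithmetic.

def dense_layer(inputs, weights, biases):
    """Return one output per row of `weights`: a weighted sum of the inputs plus a bias."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = bias
        for w, x in zip(row, inputs):
            total += w * x  # each weight sets how much one input "counts"
        outputs.append(total)
    return outputs


def relu(values):
    """A common non-linearity: negative results are clipped to zero."""
    return [max(0.0, v) for v in values]


inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.25],  # weights feeding output 0
    [1.0, 1.0, -1.0],   # weights feeding output 1
]
biases = [0.0, 0.5]

raw = dense_layer(inputs, weights, biases)
activated = relu(raw)
print(raw)        # [-0.75, 0.5]
print(activated)  # [0.0, 0.5]
```

The nested loop is just a matrix-vector multiplication written out by hand, which is the operation that AI chips are built to do quickly.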
This software performs a vast number of these calculations, over and over again. The math here is pretty simple (said someone who scraped by with B’s in college linear algebra); what matters is the vast number of times the software performs it. This kind of math lends itself well to chips that are highly ‘parallelized’, meaning each individual processor is small and simple, but the chip makes up for that by packing in lots of them. This turns out to be the same requirement for processing graphics – it is not the same math, but again there are lots of pixels to calculate on each screen. And so for many years, companies have been using graphics chips (GPUs) to perform their AI tasks. However, as with anything in semiconductors, the more precisely you can fit a task to the design of a chip the faster it will run. So while GPUs are pretty good, there is still room for improvement, hence the interest in purpose-built AI chips.
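A rough sketch of why the volume matters and why parallel hardware helps. The layer sizes below are hypothetical, picked only to show the scale; the point is that one layer already needs tens of millions of multiply-add operations, and that each output can be computed without waiting on any other.

```python
# Illustrative sketch only: counting the work in one layer and showing that
# the individual outputs do not depend on each other. Sizes are hypothetical.

def multiply_adds(batch, n_inputs, n_outputs):
    """Multiply-add operations needed to apply one dense layer to a batch of inputs."""
    return batch * n_inputs * n_outputs


def dot(row, vector):
    """Weighted sum of one row of weights against the input vector."""
    return sum(w * x for w, x in zip(row, vector))


ops = multiply_adds(batch=64, n_inputs=1024, n_outputs=1024)
print(ops)  # 67108864 multiply-adds for a single layer

# Each output only needs its own row of weights, so the rows can be handled
# in any order (or all at once on separate small processors).
weights = [[1, 2], [3, 4], [5, 6]]
vector = [10, 1]
in_order = [dot(row, vector) for row in weights]
reversed_order = [dot(row, vector) for row in reversed(weights)][::-1]
print(in_order == reversed_order)  # True
```

On a CPU these rows are mostly worked through a few at a time; a GPU or AI chip hands each one to its own small processor, which is where the speed-up comes from.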
AI algorithms use chips to perform two primary functions – usually called training and inference. Training, as the name implies, refers to taking a large dataset and searching for patterns. These learnings are then loaded (i.e. programmed) onto inference chips, which make decisions on real-time data using those learnings. Training is a big process that typically uses banks of the most powerful (i.e. expensive) GPUs, while inference takes place on chips tailored to where they are used. This distinction has important implications for the market.
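The split between training and inference can be sketched with a toy model. This is an assumption-laden miniature, not how production systems are built: the ‘model’ is a single straight line, the data is four made-up points, and the function names are ours. What it does show is the difference in workload: training loops over the whole dataset thousands of times, while inference is one cheap calculation using the numbers training produced.

```python
# Illustrative sketch only: a toy split between training and inference.
# The model (a straight line), the data and the function names are hypothetical.

def train(samples, steps=2000, learning_rate=0.05):
    """Fit y = w * x + b to the samples by plain gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(steps):  # the expensive part: many passes over all the data
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def infer(model, x):
    """Apply the already-learned weights to one new input: a single multiply-add."""
    w, b = model
    return w * x + b


training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1

model = train(training_data)      # done once, in the data center
prediction = infer(model, 10.0)   # done many times, wherever the model is deployed
print(round(model[0], 3), round(model[1], 3), round(prediction, 3))
```

Only the pair of numbers returned by train needs to travel to the inference chip, which is why that chip can be much smaller and more power-efficient than the hardware used for training.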
Training tasks tend to be done inside data centers. These tasks can take days and thus need constant power and ready access to memory and networking. By contrast, inference tasks can take place anywhere. Google uses AI to power search results and so all its work, both training and inference, takes place inside its data centers. Apple, on the other hand, appears to be using a lot of AI in its phones to make photos look better, which means that the inference work is all done on the phone and these chips have to be optimized for power efficiency. Looking a bit further ahead, autonomous vehicles will likely use a lot of AI, and all that inference will have to be done in the vehicle (you know, to make decisions about collision avoidance).
A final wrinkle in classifying these products is subtle but very important. AI chips beat out GPUs in performance because they are tailored to specific AI software. If the chip does not fit the software well then that performance advantage over GPUs is lessened, and other factors (e.g. programming tools) may tip the balance back to GPUs. This is important in AI because everyone’s “AI software” is a little different. Above we touched on the topic of neural networks, but that is just a generic label for a very broad suite of software. There are dozens of different types of these and the math at their heart tends to be slightly different – Multiply, Multiply then Divide versus Multiply, Divide, then Multiply. These distinctions matter not only to middle schoolers struggling with algebra; they also translate into very different chip layouts. One company’s AI requirements can be very different from another company’s, and when these differences are burned into silicon they result in a wide range of performance, which, again, may tip the balance back towards GPUs.
All this means there are really three markets for AI chips – training in the data center, inference in the data center, and inference on the device (also known as on the edge of the network, or just ‘the edge’). Each of these is already a large market and likely to grow at a very healthy clip for many years to come.
As of this writing, there are roughly twenty start-ups working on AI chips. We will not list them here; a quick Google search will yield numerous examples. For over a decade, venture investors have shunned chip start-ups. The VCs’ view was that the upfront costs and capital intensity of chip companies far outweighed the potential returns for markets that were generally saturating. The growth in AI chips altered this equation and made the sector viable once again for venture dollars.
Nonetheless, we think it is unlikely that more than one or two of these companies will survive independently beyond five years. While the demand for these chips is substantial, the number of customers for them is tightly constrained, especially for data center products. Moreover, the big chip companies will not sit still indefinitely (all recent evidence to the contrary). As these markets mature to the point where revenue can be measured in billions, the M&A engines will turn on. Companies like Nvidia, which make GPUs and so are already under threat, are already fighting this battle through AI-optimized products and investments in software tooling. For many uses, GPUs will be perfectly sufficient. But every major chip company has to at least pay lip service to AI, and eventually will have to take substantive action.
However, when we net all of this out against the changes in the chip market we have detailed elsewhere, we can see a number of potential turning points which may alter the broader semis industry’s trajectory.
To start, the market for cloud service providers is immense, accounting for roughly half the world’s CPU, GPU, storage and networking consumption. These are big companies waging a big fight against each other, and so the potential for in-house semis is correspondingly very big. Given their order volumes these companies have been the least affected by the pricing changes wrought by semis consolidation. Nonetheless, the fact that they are all looking at their own solutions indicates that the traditional vendors are not meeting their needs here.
We believe the future of the semis industry will be decided in this space. The decision these cloud companies face is among: a) building their own chip; b) making a bet on an emerging start-up; or c) waiting for the big companies to get their act together.
Google has clearly pulled far ahead in the design of its own AI chip, the TPU. They claim that their latest announced version of the TPU (their fourth) can handle both inference and, crucially, training as well. And we can assume that they have fifth- and sixth-generation versions well underway. They clearly view TPUs as a major source of strategic advantage. The other big cloud service providers may be forced to follow a similar path. There are certainly a lot of rumors in the press that Facebook feels this way. And while we have not touched on China too much in this post, given the combination of Alibaba and Baidu’s very public AI ambitions and the Chinese government’s strong position on AI, it seems highly likely that these companies are working on their own chips as well. Given the scale of these companies’ purchases, they could remove this category from the chip companies’ revenue base entirely.
The market for data center silicon is about $30 billion a year, of which about 50% is CPUs and 15% is GPUs (the rest is largely memory and networking). Google has said that the move to TPUs will save them half of the data centers they would otherwise need to build if they only worked with CPUs and GPUs. These are big numbers, and the big chip companies risk seeing growth for their traditional products diminished by this shift. It is too soon to tell exactly how this will play out. CPUs are not going away any time soon. Nonetheless, it is likely that the move to AI chips will have some big impact somewhere.
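For a sense of scale, the split quoted above can be turned into dollar figures. This is back-of-the-envelope arithmetic on the rounded numbers in the text (about $30 billion, about 50% CPUs, about 15% GPUs), not a forecast; the variable names are ours.

```python
# Back-of-the-envelope arithmetic on the rounded figures quoted in the text.
# Values are in millions of dollars so the math stays in whole numbers.

TOTAL_MARKET_MUSD = 30_000  # "about $30 billion a year"
SHARE_PERCENT = {"CPUs": 50, "GPUs": 15}

segments = {
    name: TOTAL_MARKET_MUSD * percent // 100
    for name, percent in SHARE_PERCENT.items()
}
# Whatever is left over is the "largely memory and networking" remainder.
segments["memory, networking and other"] = TOTAL_MARKET_MUSD - sum(segments.values())

for name, value in segments.items():
    print(f"{name}: ${value / 1000:.1f} billion")
# CPUs: $15.0 billion
# GPUs: $4.5 billion
# memory, networking and other: $10.5 billion
```

On these rough numbers, CPUs and GPUs together account for a little under $20 billion a year of data center silicon, which is the pool of spending most directly exposed to a shift toward AI chips.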
The other big question will be the shape of the market for inference chips on the edge, or in devices. This market is a bit tricky to forecast because the range of end devices is so wide. We call them inference chips, but the inference chip for a car looks very different from an inference chip for a smartphone camera. There will likely be several tiers of products here, ranging from full-power, full-feature chips for cars to low-power, low-cost products that go in throwaway video cameras that do things like count the cars at a stop light or watch crowds in public spaces.
Our best guess is that this market will look a lot like the broader chip market. There will be a small number of in-house design teams (e.g. Tesla, maybe), a handful of larger semis companies selling to top-tier customers, and then huge volumes coming from low-cost vendors just now emerging in China. It is very possible that the first global-tier Chinese chip company will rise to prominence on the back of AI chips.
This is probably the most vibrant part of the semiconductor industry right now and its development will likely have big implications for the broader semis industry.