Depending on who you ask, the market for AI semiconductors is somewhere between a few billion and infinity dollars. We are firmly in the realm of high expectations for the category, and pinning down more precise market size estimates matters for the large group of people making investment decisions in the coming year.
Right now, the world is fixated on Nvidia’s dominant position in the market, but it is important to pick that apart a bit. Nvidia is clearly the leader in training chips, but training only makes up about 10%-20% of the demand for AI chips. The far larger market is for inference chips, which take trained models and answer user questions. That market is much bigger, and no one, not even Nvidia, has a lock on it.
Next, we have to break the market down even further. Users will need inference done both at the edge and in the cloud. Cloud inference will be a function of data center demand. The total market for data center semis today is about $50 billion (not counting memory, which is admittedly a big omission). This inference market is already large but also fairly fragmented. Nvidia likely has the largest share here too, as so much AI work is done on GPUs. AMD is targeting this market as well, but it comes at it trailing Nvidia by a wide margin. And this is a place where the hyperscalers are using a lot of their homegrown chips – AWS with Inferentia and Google with TPUs, to name just two. It is also worth remembering that much of this work is still done on CPUs, especially with high-end GPUs in short supply. This corner of the market is going to remain highly competitive for the foreseeable future. Like AMD, all the other CPU, GPU and accelerator vendors are going to fight over this market with a healthy mix of products.
Which brings us to the other big opportunity – Inference at the Edge. Edge is a widely abused term, but for our purposes here we are really referring to any device in the hands of an end-user. Today this largely consists of smartphones and PCs, but is expanding into other areas like cameras, robots, industrial systems and cars. Forecasting the size of this market is not easy. Beyond the widening scope of usage, much of the silicon for these devices is likely to be bundled into some System on a Chip (SoC) that runs all the functions of those devices.
The best example of this is the iPhone, which already dedicates considerable chip area to AI cores in its A-Series processor. By some metrics, AI content already occupies 20% of the A-Series chip, which is sizable when we consider that the other 80% has to run everything else on the phone. And there are many, many other companies pursuing SoC AI strategies.
Of course, a major question in the AI sphere nowadays is how much compute power is needed to run the latest generative models, like the GPT family of large language models (LLMs) and Stable Diffusion. There is clearly immense interest in getting these models to run on the lightest compute footprint possible, and the open source community has made major advances on this front in a short period of time.
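To put rough numbers on what a lighter footprint buys, here is a back-of-the-envelope sketch in Python. The parameter counts and bit widths are illustrative assumptions, not measurements of any particular model, and the figures ignore memory needed for activations and caches.

```python
# Rough estimate of the memory needed just to hold an LLM's weights at
# different levels of quantization. Purely illustrative numbers.

def weight_memory_gb(num_params_billions: float, bits_per_weight: int) -> float:
    """Memory (in GB) required to store the weights alone."""
    bytes_per_weight = bits_per_weight / 8
    return num_params_billions * 1e9 * bytes_per_weight / 1e9

for params in (7, 13, 70):        # hypothetical model sizes, in billions of parameters
    for bits in (16, 8, 4):       # fp16, int8, 4-bit quantization
        print(f"{params}B params @ {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

The point of the arithmetic: a hypothetical 7-billion-parameter model shrinks from roughly 14 GB of weights at fp16 to roughly 3.5 GB at 4-bit, which is the difference between needing a data center GPU and plausibly fitting in the memory of a high-end phone or laptop.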
All of this means that the Edge Inference market is likely to remain highly fragmented. Probably the best way to think of this is to assume that for existing categories, like phones and PCs, the AI silicon will come from the companies that already provide the chips powering those devices, like Qualcomm and Intel/AMD.
As we said at the outset, finding reliable forecasts for the size of the AI silicon market is very hard to do right now. This is not helped by the fact that no one is exactly clear on how LLMs and other models will actually be used for real work. Then there is the question of what to count and how to count it. If a hyperscaler buys a few hundred thousand CPUs which will run neural networks as well as more traditional workloads, or someone adds a few dozen square millimeters of AI blocks to their SoC – how do we count those? Right now, our rough rule of thumb is that the market for AI silicon will break down to about 15% for training, 45% for data center inference and 40% for edge inference.
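As a purely illustrative sketch of how that rule of thumb plays out, the snippet below applies the 15/45/40 split to a hypothetical total market figure; the $100 billion number is an assumption for arithmetic's sake, not a forecast.

```python
# Apply the rough 15/45/40 rule of thumb to a hypothetical AI silicon TAM.
RULE_OF_THUMB = {
    "training": 0.15,
    "data center inference": 0.45,
    "edge inference": 0.40,
}

hypothetical_tam_billions = 100  # assumed total AI silicon market, in $B (illustrative only)

for segment, share in RULE_OF_THUMB.items():
    print(f"{segment}: ~${share * hypothetical_tam_billions:.0f}B ({share:.0%})")
```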
This has some important implications for anyone contemplating entering the fray. For the foreseeable future, Nvidia will have a lock on the training market. The data center inference market looks attractive but already has a dozen companies vying for a share of it, including giants like Nvidia, AMD and Intel, plus roll-your-own internal silicon from the customers themselves. Edge inference is also likely to be dominated by the existing vendors of traditional silicon, who are all busily investing in supporting transformers and LLMs. What is left for new entrants? There are basically four options from what we can see.
- Supply IP or chiplets to one of the SoC vendors. This has the advantage of fairly low capital requirements; let your customer write checks to TSMC. There are a lot of customers trying to build SoCs, and while they may want to do everything themselves, many will take the smarter approach of getting help where they need it.
- Build up a big (really big) war chest and go after the data center market. This is challenging for many reasons, not least that the number of customers is small, but the rewards are massive.
- Find some new edge device that could benefit from a tailored solution. Forget about phones and laptops; instead look to cameras, robots, industrial systems, etc. This is not easy either. Some of these devices are very cheap and so cannot support chips with high average selling prices (ASPs). A few years ago, we saw dozens of pitches from companies looking to do low-power AI on cameras and drones. Very few of those survive.
- Finally, there is automotive, the Great Hope of the whole industry. This market is still highly fragmented and somewhat nebulous. There is not a lot of time to gain entry here, but there is massive opportunity.
Put simply, the market for AI silicon is largely already spoken for. This does not mean abandon hope all ye who enter, but companies will need to be incredibly focused and very careful about which markets and customers they choose.