What is going on with AI Factories?

A recurring theme in Nvidia’s discussions of the future data center landscape is the notion of an AI Factory. They have been using the term for a few years, but have recently been leaning on it more heavily. Something about the concept struck a nerve with us and we have been thinking about it a lot lately. On its face, the idea has its appeal, but we think there are some real problems with it underneath. These not only call into question the whole idea, but make us rethink Nvidia’s position, at least a bit.

Not surprisingly for an idea thrown out during a keynote address, Nvidia is not terribly specific about what an AI Factory actually is. From what we can tell, these are essentially standalone data centers, independent from the Internet-giant hyperscalers. They operate large AI-focused, GPU-heavy data centers and target customers keen to move into AI. If we dream big about how AI reshapes the data center, it is possible to envision these becoming viable, powerful players.

That being said, we see a lot of problems here. First is just the nature of cloud computing. Running big data centers at scale, available to any and all users, is incredibly challenging, both financially and technically. There is a reason only a handful of cloud service providers supply infrastructure for others: there are serious economies of scale at play. Over the past ten years, the whole IT landscape was turned over as enterprises everywhere gave up their own data centers and server racks, moving an ever-growing share of their budgets to the Cloud. For many companies, this is just a better solution. Of course, the power of the hyperscalers has grown considerably, as have their margins. Over ten years ago, we took a look at AWS's profits (before Amazon broke them out), and even then they were clearly very profitable. Those profits have grown considerably since, leading industry figures like A16Z's Martin Casado to call for a rethinking of the rush to the Cloud. So maybe an AI Factory makes sense.

The trouble with this argument is that moving from a giant hyperscaler to a GPU-specialized AI Factory does not necessarily change the economics for customers. At heart, converting the capex of buying your own servers into the opex of paying a hyperscaler to provide servers, along with all the staff needed to maintain and secure them, is a fairly compelling economic proposition. Can the AI Factories do any better?
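The capex-versus-opex trade-off at the heart of this argument can be sketched with a toy break-even model. All the figures below (server capex, staffing, cloud rates) are hypothetical placeholders for illustration only, not real quotes from any provider:

```python
# Toy break-even model for the capex-vs-opex argument.
# All figures are hypothetical illustrations, not real pricing.

def own_cost(years, server_capex=500_000, annual_staff_and_power=150_000):
    """Cumulative cost of buying and running your own servers."""
    return server_capex + annual_staff_and_power * years

def rent_cost(years, annual_cloud_bill=300_000):
    """Cumulative cost of renting equivalent capacity from a provider."""
    return annual_cloud_bill * years

def breakeven_year(max_years=20):
    """First year at which owning becomes cheaper than renting, if any."""
    for y in range(1, max_years + 1):
        if own_cost(y) < rent_cost(y):
            return y
    return None

if __name__ == "__main__":
    for y in (1, 3, 5):
        print(f"year {y}: own={own_cost(y):,} rent={rent_cost(y):,}")
    print("break-even year:", breakeven_year())
```

With these placeholder numbers, owning wins after year four; the customer's real question is whether an AI Factory shifts either curve meaningfully versus a hyperscaler.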

Moreover, before AWS, Azure and GCP rose to dominance, there was a period in which programming environment-specific platforms existed. Companies like Heroku made a play as Platform as a Service (PaaS) providers, in contrast to lower-level Infrastructure as a Service (IaaS) providers like AWS. Ultimately, customers realized they could run their own applications directly on IaaS and cut out the PaaS providers. Replace Heroku's Ruby on Rails platform with CoreWeave's AI platform, and the parallels are clear.

Of course, the value proposition offered by today's AI Factories is slightly different than that of PaaS offerings. PaaS was typically tied to a specific software ecosystem, while AI Factories are really tied to the hardware they offer, in this case GPUs, specifically Nvidia GPUs. So maybe that differentiates them from the PaaS of yore, but it opens up a more glaring problem. There are about a dozen companies today offering some form of GPU-centric cloud. The best known of these is CoreWeave, which can credibly point to some differentiated networking capabilities, but push any of the others to define their value proposition and it always comes down to one thing – "We have GPUs". In a world constrained by GPU capacity – or, as Dylan Patel puts it, the "GPU Poors and the GPU Rich" – having H100 availability is valuable. The problem, of course, is that the last time we checked, semiconductors are cyclical. Eventually, Nvidia GPUs will be readily available – maybe this year, maybe 2025, maybe further out – but we will eventually reach that point. When we do, what can these AI Factories offer? As we mentioned above, some of them credibly offer services that make AI workloads run more efficiently, but that essentially puts them in the camp of being a PaaS provider, a position at risk of losing out to the economies of scale that IaaS providers like AWS and Azure can bring to bear.

So we are fairly cautious about AI Factories, but we also think they pose a threat to Nvidia. Probably not Nvidia the company, but definitely NVDA the stock. If you look back at Nvidia's history, they have regularly been subject to dramatic swings in orders, inventory and revenue. In part, this stems from the fact that they have always sold hardware solutions – graphics cards, not just GPUs. That requires a lot of inventory out in the world: inventory of chips, and also finished products, sitting in warehouses and on retail shelves. All of which dampened the company's ability to accurately forecast demand, often creating a whiplash effect. Today, that vulnerability rests in all those AI Factories filling up with H100s, H200s and soon B200s. To be clear, there is no sign of weakness in this field today. Pricing for B200 instances is being quoted at roughly 75% above H100 instances, in line with how the company is positioning the new product. That being said, when conditions turn, this is a large source of demand that could dry up fairly quickly.
