The Changing Face of Compute

The Death of General Compute

We have been writing a lot about how semis and compute systems will be designed and built in the future. We originally wrote this as a series of posts – laying out the landscape, looking at how this will affect large, incumbent chip companies and how it is opening the doors for semis start-ups, and offering some thoughts on where this is all headed. Below, we have compiled all of it into a single post because we feel it may help some readers to see how all the pieces fit together.

Once upon a time, chip companies all specialized in designing one type of chip: Intel made CPUs; Qualcomm made modems; Nvidia made GPUs; Broadcom (pre-Avago) made networking chips. That age is over. The future of semis will be designing ever more specific chips for ever more specific uses. This change will take many years to play out, but the transition has already begun. This is going to upend the semis industry to the same degree that consolidation over the past 20 years has.

There are many causes of this. The simplest is to say Moore’s Law is slowing, so everyone needs to find a new business model. But that really does not explain much, so let’s unpack it. In the misty past before 2010, Moore’s Law meant that chips got ‘faster’ or ‘better’ every two years or so. If a customer needed a special-purpose chip, they could go out and design their own, but by the time they could get that chip into production, new CPUs were arriving, and those usually proved better than the purpose-built chip still under design.
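A toy calculation makes this dynamic concrete. The numbers below are purely illustrative assumptions, not measured data, but they show why a custom design struggled to win in the era of fast Moore's Law scaling:

```python
# Toy model of the pre-2010 dynamic, with illustrative assumed numbers:
# a custom chip taped out today competes against the general-purpose CPU
# that ships when the custom design finally reaches production.

def cpu_performance(years_from_now, doubling_period=2.0):
    """Relative CPU performance, assuming Moore's Law-style doubling."""
    return 2 ** (years_from_now / doubling_period)

design_time = 2.0       # assumed years to get a custom chip into production
custom_speedup = 1.5    # assumed advantage over today's CPU on the workload

# By the time the custom chip ships, general CPUs have improved this much:
cpu_at_launch = cpu_performance(design_time)

print(f"General CPU at custom-chip launch: {cpu_at_launch:.1f}x")
print(f"Custom chip at launch:             {custom_speedup:.1f}x")
print("Custom chip wins" if custom_speedup > cpu_at_launch else "General CPU wins")
```

With these assumptions the general CPU wins; the custom part would need more than a 2x workload-specific advantage just to break even. Once the doubling period stretches out, that break-even bar drops fast.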

Then Moore’s Law slowed. We lack sufficient PhDs to say it is over, but it has definitely slowed. So everyone now has to work a bit harder to squeeze performance gains out of their silicon designs. Most obviously, this has opened the doors to all the Roll Your Own silicon coming out of hardware and hyperscaler companies, but the changes are set to blow way past that.

The whole point of a semiconductor is to run some form of software. As we said, in the past, we could win performance gains for that software from denser chips, but now companies are going to have to look at the software side of the problem a bit more closely. Google rolled out its TPU because they wanted something that ran their AI algorithms better. They rolled out the VCU for the same reason, and that chip was actually designed by software engineers. Same story for Apple and its M- and A- series processors. In all of these, the whole point is to optimize the silicon for the software.

Not everyone is going to want or be able to roll their own chips, and so we are starting to see a host of intermediary chips that are neither single-type, general purpose compute nor entirely customized. The DPUs from Pensando, recently acquired by AMD, are a good example of this intermediary step.

Once upon a time, data centers were essentially warehouses full of CPUs. Now they have to house GPUs, AI accelerators, funky networking loads and a bunch of FPGAs too. This is often called heterogeneous compute, and it is the opposite of that past CPU uniformity.

Nor are these changes only happening in data centers. The whole notion of “Edge Compute” looks increasingly to be an exercise in custom and semi-custom silicon popping up in all kinds of places – cars, factories and smart cities – to name just a few.

Ultimately, the major chip companies are going to have to decide how to address these changes. Building custom chips is not a great business, and designing semi-custom chips is full of risks – not least picking the right designs, supporting them and hoping they land on target. Established companies are already starting to position themselves for this, and for the first time in a decade the door for start-ups is starting to open a crack.

Big Chip Companies are Becoming the Home Depot of Roll Your Own Silicon

We recently reviewed the Analyst Day slides from some of the leading semiconductor companies, and a clear theme emerged. The large companies are all shifting in a similar direction, posing some potential challenges for their long-term positions. Yesterday, we touched on the trend spreading through the industry. More and more customers are looking for special purpose chips, a coping mechanism for dealing with the slowdown in Moore’s Law. And the big players are all looking to support those customers.

Designing a chip is a complex process. First comes laying out the chip and making sure the design works, then preparing all the files required for the foundry, and when the chips come back from the fabs there is a lot of work involved in getting them debugged and working. It has gotten increasingly easy to do that first bit – designing the chip – but the rest is still a fairly laborious process, and the big chip companies think they can help others get through all of it.

Every chip design company has a large labor force that handles “operations” – managing the whole process of chip production from initial set-up with the foundry all the way through to planning production. Companies like Google and Facebook want to design their own chips, but they do not necessarily want to do all of this “back end” work themselves. If you are only producing two or three chips, it usually does not make economic sense to have those teams internally. By contrast, the chip companies have to have these teams anyway; adding a third party’s design to the workflow does not add much burden.

And there is more to it. Part of designing a chip is the ‘fun’ bit of mapping software requirements to logic gates, but there are many far less ‘sexy’ parts of every chip design – the I/O blocks, memory, SERDES and interconnects, and more. The big chip companies already have access to this intellectual property (IP), and it is often easier for the hyperscalers to obtain this IP from the chip companies than to muck about with their own licenses.

For a period, there was considerable concern among the big chip companies that their biggest customers would do an end run around them by designing their own silicon. Rather than fight this trend, the incumbents have decided to embrace it by helping those customers.

At least that is the strategic logic. The commercial logic is a bit more muddled. Put simply, providing operations services to customers is not a great business, or at least it deviates pretty far from how the chip companies would prefer to operate. This service revenue is not recurring; every outside design can be put up for bid. The work involved is fairly labor intensive, so gross margins are not that exciting – typically below corporate gross margins. On the other hand, much of the cost associated with this work can be amortized over a labor force that the chip companies have to maintain anyway, so contribution margins may make it worth the effort.

That being said, we think the incumbents have reached a sensible conclusion that offering these operations services can yield strategic benefits down the line. Bring a customer in for the engineering support, and then leverage that relationship to work their other chips into the mix.

We wrote a few weeks back about Qualcomm seemingly doing just that. And they are not alone. AMD dedicated several slides to this topic in their CEO’s keynote at their recent analyst day.

Source: AMD

Marvell also devoted considerable airtime to the subject at their event.

Source: Marvell

Marvell also provided this handy graphic illustrating the whole strategy around building semi-custom heterogeneous compute solutions.

Source: Marvell

But the strongest outreach effort comes from Broadcom, the Godfather of this whole segment. They held an investor event entirely focused on custom silicon, hosted by our friends at Deutsche Bank. This is already more than a $2 billion business for Broadcom, who largely got into it by helping Apple build its A-series chips for iPhones ten years or so ago.

Source: Broadcom

When it is all said and done, the chip companies are never going to really love this business. It devalues their design teams, which are the ultimate source of their differentiation. Nor is it clear how viable this business really is over the long term. Many companies have tried it in the past, but none of them lasted. Probably the best known are LSI and eASIC, who both ended up in decidedly unexciting sales to larger companies with broader product portfolios. The case of eASIC in particular is illuminating, as that company really struggled for relevance, let alone profitability.

That being said, fighting the trend looks futile at this stage. So instead, the big companies are all making the smart choice to diversify their offerings, and possibly/maybe/hopefully use this as a means to drive customers into their sales funnel for catalog chips. This is a strategy borne of the necessity of a rapidly changing market, and is thus a viable hedge. If current trends continue and more non-chip companies build more of their own chips, the semis incumbents will be well positioned to support them.

This is opening the door for start-ups as well

More customers want more customized chips, and the slowing of Moore’s Law has opened the door (and shoved everyone through it) to make this economically viable. Last week, we pointed out that this trend is both a threat and an opportunity for the large, incumbent chip makers. Here we want to explore what this means for new chip companies.

At the most basic level, we are seeing companies shift from buying general purpose chips to buying more tailored systems on a chip (SoC). Instead of buying two chips – a discrete GPU and a discrete CPU – they might buy a single SoC that contains both a CPU and a GPU and maybe a few other functions. The best known example of this is Apple’s M-series processors, which have both of those plus a bunch of AI accelerators and a few other blocks. This makes sense for the customer, but it turns out to be a really challenging problem for the vendors.

Chip companies have to employ large teams of designers and other engineers. This is highly skilled labor and the costs add up. There are also large up-front costs for licenses for things like EDA tools. And then there are tape-out costs – upfront payments to the foundry to begin production – which start in the tens of millions of dollars for advanced nodes. The advantage of catalog, general purpose chips is that they offer maximum leverage on these fixed costs. Design a successful chip, and then move the team on to the next generation while racking up healthy contribution margins from each chip sale.

By contrast, custom chips are very risky because sales are entirely dependent on getting that one chip right. Every customer wants their own chip, but that evaporates the operational leverage of the model. Tape-out costs alone doom custom parts for most customers. The worst case occurs when the chip company designs an SoC with the wrong mix of sub-systems and no one buys it; the chip company is out all that money.
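The operating-leverage argument above can be sketched with some round numbers. All figures here are assumptions for illustration; real NRE and tape-out costs vary widely by node and design:

```python
# Sketch of the fixed-cost amortization argument behind catalog vs. custom
# chips. NRE (design + tape-out) and unit costs are assumed round numbers.

def cost_per_chip(nre, volume, unit_cost):
    """All-in cost per chip: fixed NRE amortized over volume, plus marginal cost."""
    return nre / volume + unit_cost

NRE = 100_000_000   # assumed design + tape-out cost at an advanced node ($)
UNIT = 50           # assumed marginal manufacturing cost per chip ($)

# A catalog part sold across many customers vs. a custom part for one customer.
catalog = cost_per_chip(NRE, volume=10_000_000, unit_cost=UNIT)
custom = cost_per_chip(NRE, volume=500_000, unit_cost=UNIT)

print(f"Catalog part, 10M units:  ${catalog:.0f}/chip")
print(f"Custom part, 500K units:  ${custom:.0f}/chip")
```

Under these assumptions the catalog part costs $60 per chip all-in while the custom part costs $250, even though the silicon itself is identical in cost. That gap is the leverage a chip company gives up every time it builds a part for a single customer.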

This problem has become very pronounced in the seemingly hot category of AI accelerators. Five years ago, VCs poured money into a few dozen companies making these chips. And so far, almost none of those companies have generated significant revenue (although a few had some very good exits). The problem is that they thought they could build general purpose accelerators, when it turned out customers wanted something much more specific to their individual needs.

That being said, the shift away from general compute has cracked open the door, essentially fragmenting a once largely monolithic industry. Tailored SoC solutions will, by definition, create niche opportunities, which should provide plenty of toeholds for smaller companies to enter the fray. Everyone wants Tesla and GM as a customer, but how do Rivian and Nikola get the big chip companies to return their calls?

There are a few approaches start-ups can take to squaring this circle.

EdgeQ is a good example. They are building a chip for 5G networks. This field was dominated by Texas Instruments, Ericsson, NXP and Nokia for years. But the new wireless standard is, if nothing else, a giant welcome sign for new entrants to the market. This process is further aided by the Open RAN project, which is bringing new hardware makers into the business. By building a chip around the 5G standard, EdgeQ has created a chip that theoretically may be of great interest to emerging hardware vendors in the space. Not so long ago, anyone entering the market for wireless hardware would have just turned to Intel (who is, not surprisingly, a big proponent of Open RAN), but the industry will clearly benefit from something more customized to the workloads. EdgeQ does not have to build a different chip for each customer, but can build a solution tailored to the standards and build on from that beachhead.

Another example is Indie Semiconductor. Indie is probably best known for its eye-popping SPAC fundraising, but they will eventually be better known for their specialization in the automotive semis segment. Selling chips to auto makers is painful. Product cycles are, by semis standards, ridiculously long – often five to seven years, or longer. The incumbent auto makers are all large companies with very demanding purchase standards as well as some grueling specifications and qualifications (data center CPUs do not need to contend with high external temperatures, humidity and constant vibration). Indie has essentially built its model around handling all of that. Most importantly, they have a process in place for designing chips that customers feel are customized to them, but are sufficiently interchangeable that Indie does not have to re-invent the wheel for each customer. Indie is a bit different in that most of the parts they make are at trailing edge processes and thus not subject to the vagaries of Moore’s Law. However, the company clearly has ambitions to move further up the stack. Their work in the unglamorous parts of auto semis lets them build relationships with the big customers and thus win themselves a seat at the table for conversations around those more advanced chips.

These are just two ways that start-ups can benefit from the shift in the market; there are many other paths. That being said, both examples highlight important common threads. Both have chosen to go after specific industries – their websites do not have dozens of industry use cases listed. Both are also focused on specific outside frames of reference – auto qualifications for Indie, wireless standards for EdgeQ. We would generalize more broadly to say companies should focus on specific software stacks – not exactly true in Indie’s case yet, but very true for EdgeQ (wireless standards are just a very peculiar form of software). There are many other outside frames upon which start-ups can hang their hats.

As we have mentioned often in the past, US Venture investors have been generally avoiding semis for a decade. The shift in compute trends greatly alters the return calculations these investors need to make when assessing potential outcomes. And this should open the door to funding for many more chip and system start-ups.

Things are gloomy now, but interesting changes are coming

The markets are falling. Crypto is cratering. Moore’s Law is slowing, maybe ending. The wireless networks are running out of ways to advance to new generations, with 6G very far away. And worst of all, it is summertime in San Francisco. We are living in gloomy times. Or are we?

We were thinking about all this recently while listening to an interview in which a semiconductor executive talked about advances in 6G, which he described as coming soon and likely to be very important. Soon after, we were listening to a software executive at a networking company talk about all the performance advances we can expect from Moore’s Law, paving the way for new features in his products. Both were taking for granted that the other side’s advances would continue far into the future.

In reality, neither of those things can be taken as a given. Moore’s Law is clearly slowing, with performance advances taking longer to arrive and costs per transistor moving steadily upward. Dig into the wireless standards, and the truth comes out – we ran out of ways to easily manipulate wireless spectrum with 4G. Cellular radios already modulate frequency, amplitude, phase, time and power – essentially every element of the signal. 5G did not bring much in the way of new modulation schemes, so instead of squeezing more gains out of existing spectrum (via those modulations), the operators are now just looking to add more spectrum. Both sides still have more tricks they can pull out of their “let’s really test the laws of physics” bag, but it is safe to say that two of the underpinnings of the past two decades of productivity growth are not going to come as easily as they once did.
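There is a simple information-theoretic reason operators reach for more spectrum: the Shannon capacity formula, C = B log2(1 + SNR), grows linearly with bandwidth but only logarithmically with signal quality. A small sketch with illustrative channel numbers:

```python
# Why adding spectrum beats squeezing the signal harder: Shannon capacity
# C = B * log2(1 + SNR) is linear in bandwidth B, logarithmic in SNR.
# Channel numbers below are illustrative, not any real deployment.
import math

def capacity_mbps(bandwidth_mhz, snr_db):
    """Shannon channel capacity in Mbit/s for a given bandwidth and SNR."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_mhz * math.log2(1 + snr_linear)

base = capacity_mbps(100, 20)  # a 100 MHz channel at 20 dB SNR

# Doubling bandwidth doubles capacity...
print(f"2x bandwidth: {capacity_mbps(200, 20) / base:.2f}x capacity")
# ...while doubling signal power (+3 dB SNR) adds far less.
print(f"2x power:     {capacity_mbps(100, 23) / base:.2f}x capacity")
```

At a 20 dB SNR, doubling the bandwidth yields exactly 2x capacity, while doubling transmit power yields well under 1.2x. Once modulation has already exploited frequency, amplitude, phase, time and power, bandwidth is the only lever that still scales linearly.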

So that’s it. The digital revolution has maxed out. Time to move to our mountain compounds, install those Faraday cages and disappear off the grid. Thank you for reading…

Or maybe there is another option. Over the last few posts, we have laid out the case for a new way to approach designing digital systems. These will combine hardware, software and semis into more tailored systems that solve more specific problems. And this is just one approach. There are advances in power and energy, space networks, materials, and many more systems in the works. Finding these and commercializing them will not be easy, but it is eminently possible. There are people out there with the technical expertise. The demand is large already, and growing huge. The only thing really missing is a little bit of capital to get it all off the ground. Watch this space.

Change is not easy, especially coming off of 20+ years of low interest rates and free performance gains from some key technologies. A whole bunch of people are going to have to do a whole bunch of work. But there is little stopping us other than imagination and will. We are actually increasingly optimistic, despite the fact that it is 50 degrees Fahrenheit in mid-July and overcast.

6 responses to “The Changing Face of Compute”

  1. One thing that you seem to be leaving out is the gradual realization of the terrible vulnerabilities of the major developed economies to disruptions to the supply of semiconductors. The next decade is going to see massive efforts in Europe, the US, and China to develop a “local” or at least near shore semiconductor supply. There is going to be a massive sloshing around of capital.

• Part of what I’m working up to is how the US should direct its resources to address those challenges. Should we spend $100+ billion to catch up to TSMC? Or are we better off developing other things – new materials, new compute systems, new power sources?

  2. Pingback: The Feel of Bare Metal | Digits to Dollars·

  3. Pingback: I Am Buying Cadence Design And You Should Too - Public News Time·

4. I wonder if one solution to the SoC problem could be modules. Yes, the design of the interconnect/substrate is problematic, but modules are cheaper than SoCs and it is easier (and less costly) to make changes.

    • Yes
      Advanced packaging is definitely part of the solution to a slowdown. It’s one of the many ways we are going to eke performance gains that we used to get “for free” from increases in transistor density (aka Moore’s Law).
A related topic is chiplets, which are like tiles composing a system. Each individual chiplet can be produced at a different node. Not quite as fast as an SoC, but still better than discrete packages.
