Last week as part of their Re:Invent developer conference, Amazon Web Services (AWS) unveiled two new chips. Both are updates of chips that have been on the market for a while – Graviton3 and Trainium2. This demonstrates that AWS is as serious about its home-grown semiconductors as all of its hyperscaler peers, but it also shows some interesting twists in the market.
Trainium is AWS’s AI training chip: a powerful matrix-multiplication engine, capable of doing the heavy work of training AI models (hence the name). It is billed as a threat to Nvidia, which it arguably is, but we are not as worried about the implications for Nvidia as we are about what their other chip means for Intel.
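To see why a training chip is essentially a matrix-multiplication engine, consider that a single dense-layer forward pass is one matrix multiply, and training repeats this (plus its transposed counterparts in backpropagation) billions of times. A minimal sketch, with toy dimensions of our own choosing – nothing here reflects Trainium's actual specs:

```python
# Why AI training is matmul-heavy: one dense layer's forward pass
# is a single matrix multiply. Dimensions below are illustrative only.

def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, plain Python."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

# Toy "layer": batch of 2 inputs, 3 features each, mapped to 2 outputs.
x = [[1, 2, 3],
     [4, 5, 6]]      # activations (batch x features)
w = [[1, 2],
     [3, 4],
     [5, 6]]         # weights (features x outputs)

y = matmul(x, w)     # forward pass: one matrix multiply
print(y)             # [[22, 28], [49, 64]]
```

A dedicated accelerator wins by doing exactly this inner loop – multiply-accumulate over large arrays – in massively parallel hardware instead of one element at a time.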
The next chip is Graviton3 – an Arm-based server CPU. This is formidable for a number of reasons. First, AWS says it is 25% more performant than its predecessor. We have learned to be a bit cautious about figures the hyperscalers release about their chips. We know first hand that they have a full suite of internal test metrics which the rest of us will never see, and it will be hard for outsiders to run more publicly available benchmarks. Anandtech or it didn’t happen. Tom’s Hardware was able to decode a bit more from the announcement. With that grain of salt taken, Graviton3 looks very strong – more performance and less power than other leading CPUs. We saw several people commenting on Twitter that the new chip is competitive not only with Intel (which is struggling with its manufacturing processes) but also with AMD (which is not struggling with those issues). For those of us who have been extolling the virtues of Arm in the data center, this is a big milestone.
Graviton3 demonstrates just how serious AWS is about silicon. When they first unveiled Graviton, we cautioned that the first chip looked like a trial, more proof of concept than commercial product. Now the results are in: the trial was successful, and AWS is betting heavily on roll-your-own CPUs. Graviton3 is a big jump over Graviton2, and we can assume that Graviton4 is already in the works.
This is particularly bad news for Intel. AWS is the largest customer in the data center server industry. They probably once consumed close to 20% of Intel’s data center CPUs. Intel’s data center group contributes about 40% of Intel’s profits, so if AWS shifts entirely to Graviton that becomes a major dent in Intel’s profitability.
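The rough size of that dent can be sketched with the figures above. The assumption that AWS’s share of units translates one-for-one into profit contribution is ours, purely for illustration:

```python
# Back-of-envelope on Intel's exposure, using the figures cited above.
# Assumption (ours): AWS's ~20% of data-center CPU units maps directly
# onto profit contribution, with uniform margins across that revenue.

aws_share_of_dc_cpus = 0.20   # AWS ~20% of Intel's data-center CPUs (per the text)
dc_share_of_profit = 0.40     # data-center group ~40% of Intel's profits (per the text)

# If AWS shifted entirely to Graviton, the profit at risk would be roughly:
profit_at_risk = aws_share_of_dc_cpus * dc_share_of_profit
print(f"~{profit_at_risk:.0%} of Intel's total profit")  # ~8%
```

Single-digit percent of total profit is a real dent, though spread over years rather than lost overnight – which is the point of the next paragraph.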
So, time to short Intel? Probably not. Intel has been bleeding share to AMD for over a year, so a portion of this reduction is already reflected in its numbers. Worse is coming, but the process has already begun.
It is also important to remember that AWS uses silicon differently than other hyperscalers. Our central tenet on roll-your-own silicon is that it only makes sense when the chip delivers some form of strategic advantage. Building your own chip just for the cost savings is not enough of a justification. Take Google as an example. They started designing their own chips for machine learning almost ten years ago, around the time they identified AI/ML as their strategic core. They run a lot of AI tensor workloads on their TPUs, so the ROI is immense.
By contrast, AWS does not just run a single workload, they run all the workloads – supporting the compute needs of thousands of very diverse customers. Comparing CPUs – x86 vs Arm – depends heavily on what workload is being tested. Some loads lend themselves to Arm, others to x86 (yes, there are still plenty that run better on x86). AWS is going to continue to offer x86 compute instances for a long time, because so many of their customers want to run x86 workloads. AWS’ compute needs are highly heterogeneous, so they have to support many different kinds of CPUs.
That being said, AWS is very tight-lipped about what runs where. AWS conducts a huge amount of compute “under the hood”, coordinating and managing all their traffic. The code for that was originally written and optimized for x86, but that could change, and probably has. We are starting to suspect that AWS is moving a growing share of its own internal workloads to its own silicon; this would make sense given their ability to customize silicon to meet the needs of their software team.
As far as we know, AWS has not said anything publicly about this, but their investment is clear. And last week’s announcements show their silicon team is highly capable. In fact, the thing that surprised us the most about last week’s news was that they did not announce more of their own chips – no update to their AI inference chip Inferentia, no video coding unit (VCU), nothing about networking. Their hyperscale peers are all working along these lines, so it is curious that AWS did not announce more. Maybe their team is constrained, or maybe the event itself was constrained – there was a lot going on at Re:Invent. So we will be keeping an eye out for those announcements, as well as more AWS instance types supported by internal silicon. Setting aside these quibbles, AWS’ seriousness in semis is very clear.