An important topic in semis, which we get asked about frequently, is the idea of abstraction. One of the magical elements of modern technology is the separation between all the various layers of a system. An elderly grandfather can type away in some application without the need to understand how anything but that application works. The developer who coded the application does not need to understand how the hardware works; she just has to understand the intricacies of whatever programming environment she is using. The engineers who design the chips in that hardware do not need to understand how to bend the laws of physics to physically manufacture those chips. Computing is built on this idea of layers abstracted from each other.
This works incredibly well, until it doesn’t. Grandpa’s application crashes, and now it would really help if he understood a little bit about the underlying operating system so he could debug it himself, without the need to call his son for remote IT help. That developer actually wants to understand a bit about the hardware to get better performance, or is… Heaven Help Them… writing a mobile application and needs to understand the agony of Android fragmentation just to get the app to work. And that semi designer is going to save themself a lot of debugging pain if they know enough about semis manufacturing to lay out circuits in a way that does not lead to manufacturing defects. Knowing something about the layer immediately below where you operate can be incredibly helpful.
Unspoken in all of this is that there are penalties for abstraction. In 99% of cases, the benefits of abstraction outweigh these penalties, but anyone looking to push performance to its maximum quickly finds themselves in that 1%.
Let us give an example from the programming world. A common task in many programs is sorting some data into a specific order – say alphabetizing a list of names, or ranking a list of customer orders by sales date. Sorting is a fundamental topic in computer science, meriting entire courses in school. Most modern programming languages come with a built-in Sort function, and newcomers to the language are encouraged to use that function. But let’s say that newcomer is now employed and starts to research the best way to sort something for a particular application. A quick Google search on that language’s Sort function is guaranteed to return a half dozen videos with titles like “Why the built-in Sort function is awful – Here are 5 tips for better Sorting.” They could stay abstract, but in an effort to look diligent, they are going to jump into that rabbit hole. Go deep enough and they may end up writing code for a Sort function they designed themself. Or they may switch to an entirely new programming language that is harder to use but offers much finer control over code execution. The difference is a few dozen hours of work, or a few hundred if a new language is involved, all for a 10% increase in Sort speed. If the data they are sorting is a few hundred rows, that was a waste. But if the data is a few billion rows then someone is getting a promotion.
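To make this concrete, here is a toy sketch in Python of how dropping down a layer can pay off. The built-in `sorted` is a general-purpose comparison sort; if you happen to know your data is integers in a small, fixed range, a specialized counting sort can skip comparisons entirely. The `counting_sort` function below is our own illustrative name, not part of any standard library, and this is a sketch of the idea rather than production code:

```python
import random

def counting_sort(values, max_value):
    """Sort integers known to lie in [0, max_value].

    A general comparison sort runs in O(n log n). By exploiting
    knowledge of the data (a bounded integer range), counting sort
    runs in O(n + max_value) -- the payoff for peeking one layer down.
    """
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result

# Sanity check: the specialized sort agrees with the built-in one.
data = [random.randrange(100) for _ in range(10_000)]
assert counting_sort(data, 99) == sorted(data)
```

On a few hundred rows the difference is noise; on billions of rows of narrow integer keys, this kind of specialization is exactly the 10% (or much more) the videos are promising.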
Taking all of this to the extreme leads to one of the most important trends in semis today – custom silicon. Here, the company building the custom silicon is breaking the abstraction between hardware and semis. This distinction did not exist 60 years ago when semis were first getting started. Back then companies built both. Then a whole list of good reasons led us to the very abstracted industry structure we have today, but everything on that list is an economic reason, not a physical limitation. And now a growing list of reasons is pushing the pendulum back in the other direction.
A truism in programming is that the closer software runs to “bare metal” the better the performance, and here bare metal is semis. This usually means working in “lower level” languages that sacrifice ease of use for simpler, more direct methods that have to move through fewer layers of abstraction to access the silicon. Building custom silicon is working the other way around, stripping out all the unnecessary blocks in the chip so that software speaks directly to the parts of the chip that matter most to it. Not everyone needs it or can accomplish it, but for those who can the benefits can be massive.
Cerebras’ wafer-scale chips are a good example — everything about their design is optimized for machine learning workloads.