Amazon's AI leap with Inferentia and Trainium chips

Amazon's focus on microchip innovation has led to the creation of two key chips – Inferentia and Trainium.

A logo for Amazon Web Services (AWS) is seen at the Collision conference in Toronto, Ontario, Canada, June 23, 2022. REUTERS/Chris Helgren/File Photo

The backstory: In 2015, Amazon took a big step by acquiring Annapurna Labs, a chip startup from Israel. This marked Amazon's initial foray into chip development. Then, in 2018, the company entered the server chip market with a chip called Graviton. What sets Graviton apart is that it's based on Arm architecture, which made it a challenge to rivals like AMD and Intel, whose x86 chips were leading the market.

Meanwhile, Amazon Web Services (AWS) has soared to the top, becoming the world's biggest cloud computing provider. By 2022, AWS had captured a 40% market share, according to Gartner. And AWS contributed 70% of Amazon's overall US$7.7 billion operating profit in the second quarter of this year.

More recently: AWS is pushing into artificial intelligence (AI). It announced in June that it’d be investing US$100 million in a generative AI innovation center. On the hardware front, AWS shook things up in July with new AI acceleration hardware built on Nvidia H100 GPUs.

Amazon isn't directly taking on ChatGPT, but a leaked email from CEO Andy Jassy revealed that the company is building expansive large language models of its own. On Amazon's second-quarter earnings call, Jassy emphasized AI's pivotal role for the company, which offers more than 20 machine learning services to heavyweight clients like Philips, 3M, Old Mutual and HSBC.

The development: Amazon's focus on microchip innovation has led to the creation of two key chips: Inferentia and Trainium. Trainium is built for training AI models, and Inferentia for running them once they're trained, a step known as inference. They’re also an alternative to the popular Nvidia chips that people use for this sort of tech, which have become expensive and a bit hard to come by. Amazon started making custom chips back in 2013, with one called Nitro. Now, Nitro is everywhere in AWS servers, with more than 20 million of the chips in use.

A recent behind-the-scenes look by CNBC into Amazon's chip lab in Austin showed how the company’s developing and testing Trainium and Inferentia. Trainium is the newer of the two, hitting the scene in 2021. It has since gotten much better at its job of training machine learning models, and it costs less, too: about half as much as other methods. Inferentia, meanwhile, has been around since 2019 and is already on its second generation. For now, Nvidia’s GPUs are still in the lead when it comes to training AI, but Amazon’s cloud dominance could give it an edge in the market.

Key comments:

“The entire world would like more chips for doing generative AI, whether that’s GPUs or whether that’s Amazon’s own chips that we’re designing,” said Amazon Web Services CEO Adam Selipsky to CNBC in an interview in June. “I think that we’re in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want.”

“Amazon is not used to chasing markets. Amazon is used to creating markets. And I think for the first time in a long time, they are finding themselves on the back foot, and they are working to play catch up,” said Chirag Dekate, VP analyst at Gartner. “I think the true differentiation is the technical capabilities that they’re bringing to bear. Because guess what? Microsoft does not have Trainium or Inferentia.”

“Let’s rewind the clock even before ChatGPT. It’s not like after that happened, suddenly we hurried and came up with a plan because you can’t engineer a chip in that quick a time, let alone you can’t build a Bedrock service in a matter of 2 to 3 months,” said Swami Sivasubramanian, AWS’ VP of database, analytics and machine learning.