Data Center Chips in 2024: Top Trends and Releases

Now that Nvidia has announced its Blackwell GPUs, Intel, AMD, hyperscalers, and AI chip startups are preparing to launch their own data center chips in 2024. We break down all new and upcoming releases.

Wylie Wong, Regular Contributor

April 11, 2024

14 Min Read

Introduction

Nvidia recently made a big splash by announcing its next-generation Blackwell GPUs, but the rest of 2024 promises to be a busy year in the data center chip market as rival chipmakers are poised to release new processors.

AMD and Intel are expected to launch new competing data center CPUs, while other chipmakers, including hyperscalers and startups, plan to unveil new AI chips to meet the soaring demand for AI workloads, analysts say.

Fittingly, Intel on Tuesday (April 9) confirmed that its new Gaudi 3 AI accelerator for training and inferencing is expected to be generally available in the third quarter of 2024, while Meta on Wednesday (April 10) announced that its next-generation AI inferencing processor is in production and already running in its data centers.

While server shipments are expected to grow by about 6%, from 10.8 million units in 2023 to 11.5 million in 2024, server revenue is expected to jump 59% year-over-year in 2024, an indication that processors remain a hot, growing market, said Manoj Sukumaran, a principal analyst of data center IT at Omdia. In fact, over the next five years, server revenue is expected to more than double to $270 billion by 2028.
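
As a quick back-of-the-envelope check on those Omdia projections, the arithmetic below reproduces the shipment-growth figure and the revenue base implied by the doubling claim. It is a sketch using only the numbers cited above; the implied base is an inference rather than a reported figure.

```python
# Quick sanity check on the Omdia projections cited above. Shipment figures
# are from the article; the "implied base" is an inference from the claim
# that revenue will more than double to $270 billion by 2028.
shipments_2023 = 10.8e6
shipments_2024 = 11.5e6
unit_growth = (shipments_2024 - shipments_2023) / shipments_2023
print(f"Unit shipment growth: {unit_growth:.1%}")  # ~6.5%, i.e. roughly 6%

revenue_2028 = 270e9
implied_base = revenue_2028 / 2  # "more than double" implies a base below this
print(f"Implied revenue base for doubling: under ${implied_base / 1e9:.0f}B")
```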

“Even though the unit shipments are not growing significantly, revenue is growing quite fast because there is a lot of silicon going inside these servers, and as a result, server prices are going up significantly,” Sukumaran told Data Center Knowledge. “This is a huge opportunity for silicon vendors.”

Co-Processors Are a Hot Commodity

Data center operators have a large appetite for ‘co-processors’ – microprocessors designed to supplement and augment the capabilities of the primary processor.

Traditionally, the data center server market was CPU-centric with CPUs being the most expensive component in general-purpose servers, Sukumaran said. Just over 11% of servers had a co-processor in 2020, but by 2028, more than 60% of servers are expected to include co-processors, which not only increase compute capacity but also improve efficiency, he said.

Co-processors like Nvidia’s H100 and AMD’s MI300 GPUs, Google Cloud’s Tensor Processing Units (TPUs), and other custom application-specific integrated circuits (ASICs) are popular because they enable AI training, AI inferencing, database acceleration, the offloading of network and security functions, and video transcoding, Sukumaran said.

Video transcoding is a process that enables Netflix, YouTube, and other streaming media to optimize video quality for different user devices, from TVs to smartphones, the analyst noted. 
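
To make the transcoding workload concrete, here is a minimal sketch that converts one source video into several device-friendly renditions using the open-source ffmpeg command-line tool (assumed to be installed); the filenames, resolutions, and bitrates are illustrative and not tied to any vendor or service mentioned here.

```python
# Illustrative sketch: transcode one source video into renditions sized for
# different devices, using the open-source ffmpeg CLI (assumed installed).
# Filenames, codecs, resolutions, and bitrates are examples only.
import subprocess

RENDITIONS = [
    ("1080p", "1920x1080", "5000k"),  # living-room TVs
    ("720p", "1280x720", "2800k"),    # laptops and tablets
    ("480p", "854x480", "1400k"),     # phones on cellular connections
]

def transcode(source: str) -> None:
    """Produce one H.264/AAC output file per target rendition."""
    for name, size, bitrate in RENDITIONS:
        subprocess.run(
            [
                "ffmpeg", "-y", "-i", source,
                "-c:v", "libx264", "-b:v", bitrate, "-s", size,
                "-c:a", "aac",
                f"output_{name}.mp4",
            ],
            check=True,
        )

if __name__ == "__main__":
    transcode("source_video.mp4")  # hypothetical input file
```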

AMD and Intel vs. Arm CPUs

The CPU market remains lucrative. Intel is still the market share leader, but AMD and Arm-based CPUs from the likes of startup Ampere and from cloud service providers have chipped away at Intel’s dominance in recent years.

While Intel owns 61% of the CPU market, AMD has gained significant traction, growing from less than 10% of server unit shipments in 2020 to 27% in 2023, according to Omdia. Arm CPUs captured 9% of the market last year.

“The Arm software ecosystem has matured quite well over the past few years, and the lower power consumption and high-core densities of Arm CPUs are appealing to cloud service providers,” Sukumaran said.

In fact, Google Cloud on Tuesday (April 9) announced that its first Arm-based CPUs, called Google Axion Processors, will be available to customers later this year.

Intel aims to regain its footing in the CPU market this year by releasing next-generation server processors. The new Intel Xeon 6 processors with E-cores, formerly code-named ‘Sierra Forest,’ are expected to be available in the second quarter of 2024 and are designed for hyperscalers and cloud service providers that want power efficiency and performance.

That will be followed soon after by the launch of the new Intel Xeon 6 processors with P-cores, formerly code-named Granite Rapids, which focus on high performance.

AMD, however, is not sitting still and plans to release its fifth-generation EPYC CPU called Turin.

“AMD has been far and away the performance leader and has done an amazing job stealing market share from Intel,” said Matt Kimball, vice president and principal analyst at Moor Insights & Strategy. “Almost all of it has been in the cloud with hyperscalers, and AMD is looking to further extend its gains with on-premises enterprises as well. 2024 is where you will see Intel emerge as competitive again with server-side CPUs from a performance perspective.”

Chipmakers Begin Focusing on AI Inferencing

Companies across all verticals are racing to build AI models, so AI training will remain huge. But in 2024, the AI inferencing chip market will begin to emerge, said Jim McGregor, founder and principal analyst at Tirias Research.

“There is a shift toward inferencing processing,” he said. “We’re seeing a lot of AI workloads and generative AI workloads come out. They’ve trained the models. Now, they need to run them over and over again, and they want to run those workloads as efficiently as possible. So expect to see new products from vendors.”

Nvidia dominates the AI space with its GPUs, but AMD has produced a viable competitive offering with the December release of its Instinct MI300 series GPU for AI training and inferencing, McGregor said.

While GPUs and even CPUs are used for both training and inferencing, an increasing number of companies – including Qualcomm, hyperscalers like Amazon Web Services (AWS) and Meta, and AI chip startups like Groq, Tenstorrent, and Untether AI – have built or are developing chips specifically for AI inferencing. Analysts also say these chips are more energy-efficient.

When organizations deploy an Nvidia H100 or an AMD MI300, those GPUs are well-suited for training because they are big chips with large core counts, high performance, and high-bandwidth memory, Kimball said.

“Inferencing is a more lightweight task. They don’t need the power of an H100 or MI300,” he said.
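
A back-of-the-envelope memory estimate illustrates why. The sketch below assumes a hypothetical 70-billion-parameter model with fp16 weights and a standard Adam optimizer, and it ignores activations, KV caches, and parallelism overheads; none of these numbers are vendor figures.

```python
# Back-of-the-envelope memory estimate for a hypothetical 70B-parameter model,
# illustrating why inference needs far less hardware than training.
# Assumptions: fp16 weights, standard Adam optimizer (two fp32 moment estimates
# per parameter); activations, KV caches, and parallelism overheads are ignored.
PARAMS = 70e9
BYTES_FP16 = 2
BYTES_FP32 = 4

weights_gb = PARAMS * BYTES_FP16 / 1e9        # needed for training and inference
gradients_gb = PARAMS * BYTES_FP16 / 1e9      # training only
optimizer_gb = PARAMS * 2 * BYTES_FP32 / 1e9  # training only (Adam moments)

inference_gb = weights_gb
training_gb = weights_gb + gradients_gb + optimizer_gb

print(f"Inference (weights only): ~{inference_gb:.0f} GB")               # ~140 GB
print(f"Training (weights + grads + optimizer): ~{training_gb:.0f} GB")  # ~840 GB
```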

Top Data Center Chips for 2024 – An Expanding List

Here’s a list of processors that are expected to come out in 2024. Data Center Knowledge will update this story as companies make new announcements and release new products.  

AMD


AMD Instinct MI300X OAM Card

AMD plans to launch Turin, its next-generation server processor, during the second half of 2024, AMD CEO Lisa Su told analysts during the company’s 2023 fourth-quarter earnings call in January. Turin is based on the company’s new Zen 5 core.

“Turin is a drop-in replacement for existing 4th Gen EPYC platforms that extends our performance, efficiency and TCO leadership with the addition of our next-gen Zen 5 core, new memory expansion capabilities, and higher core counts,” she said on the earnings call.

No specific details of Turin are available. But Kimball, the Moor Insights & Strategy analyst, said Turin will be significant. “AMD will look to further differentiate themselves from Intel from a performance and performance-per-watt perspective,” he said.

AMD has also seen huge demand for its Instinct MI300 accelerators, including the MI300X GPU, since their launch in December. The company plans to aggressively ramp up production of the MI300 this year for cloud, enterprise and supercomputing customers, Su said during the earnings call.  

Intel


Intel 5th Gen Xeon chip

Intel plans to release several major chips this year: its Gaudi 3 AI accelerator and its next-generation Xeon server processors.

Gaudi 3, built for AI training and inferencing, is aimed at the enterprise market and is designed to compete against Nvidia’s and AMD’s GPUs. The AI chip will offer four times more AI compute and 1.5 times more memory bandwidth than its predecessor, the Gaudi 2, Intel executives said at the company’s Intel Vision 2024 event in Phoenix this week.

Gaudi 3 is projected to deliver 50% faster training and inferencing times and 40% better power efficiency for inferencing when compared to Nvidia’s H100 GPU, Intel executives added.

“This is going to be competitive with massive power savings and a lower price,” said Kimball, the analyst.

As for its next-generation Intel Xeon 6 processors, Sierra Forest will include a version that features 288 cores, which would be the largest core count in the industry. It’s also the company’s first “E-core” server processor designed to balance performance with energy efficiency.

Granite Rapids is a “P-core” server processor that’s designed for best performance. It will offer two to three times better performance for AI workloads over Sapphire Rapids, the company said.

Gaudi 3 will be available to OEMs in the second quarter of 2024, with general availability anticipated in the third quarter, an Intel spokesperson said. Sierra Forest, now called Intel Xeon 6 processors with E-cores, is expected to be available in the second quarter of 2024. Granite Rapids, now called Intel Xeon 6 processors with P-cores, is expected to launch “soon after,” the spokesperson said.

The news follows Intel’s launch of its fifth-generation Xeon CPU last year.

Nvidia


Nvidia GB200 Grace Blackwell Superchip

Nvidia in mid-March announced that it will start shipping next-generation Blackwell GPUs later this year, which analysts say will enable the chip giant to continue to dominate the AI chip market.

The new family of Blackwell GPUs – designed for cloud providers and enterprises – offers 20 petaflops of AI performance on a single GPU and will enable organizations to train AI models four times faster, improve AI inferencing performance by 30 times, and do so with up to 25 times better energy efficiency than Nvidia’s previous-generation Hopper architecture chips, executives said.

Nvidia will also ship the Hopper-based H200 in the second quarter of 2024. The company recently announced new benchmarks showing the H200 is the most powerful platform for running generative AI workloads: an H200 performs 45% faster than an H100 when running inference on a 70-billion-parameter Llama 2 model, the company said.

Ampere


AmpereOne Chip

Ampere did not respond to Data Center Knowledge’s request for information on its chip plans for 2024. But last May, the startup, led by former Intel president Renee James, announced a new family of custom-designed, Arm-compatible server processors that feature up to 192 cores.

That processor, called AmpereOne, is designed for cloud service providers and simultaneously delivers high performance with high power efficiency, company executives said.

AWS


AWS Trainium2 chip

AWS is among the hyperscalers that partner with large chipmakers such as Nvidia, AMD, and Intel and use their processors to provide cloud services to customers. But hyperscalers also find it advantageous and cost-effective to build their own custom chips to power their data centers and the cloud services they offer.

AWS this year will launch Graviton4, an Arm-based CPU for general-purpose workloads, and Trainium2 for AI training. Last year, it also unveiled Inferentia2, its second-generation AI inferencing chip, said Gadi Hutt, senior director of product and business development at AWS’ Annapurna Labs.

“Our goal is to give customers the freedom of choice and give them high-performance at significantly lower cost,” Hutt said.

Trainium2 will feature four times the compute and three times the memory of the first-generation Trainium processor. While AWS uses the first Trainium chip in clusters of 60,000 chips, Trainium2 will be available in clusters of 100,000 chips, Hutt said.

Microsoft Azure


Microsoft Azure Maia 100 AI Accelerator

Microsoft recently announced the Microsoft Azure Maia 100 AI Accelerator for AI and generative AI tasks and Cobalt 100 CPU, an Arm-based processor for general-purpose compute workloads.

The company in November said it would begin rolling out the two processors in early 2024 to initially power Microsoft services, such as Microsoft Copilot and Azure OpenAI Service.

The Maia AI accelerator is designed for both AI training and inferencing, while the Cobalt CPU is an energy-efficient chip that’s designed to deliver good performance-per-watt, the company said.

Google Cloud


Google Axion Processor

Google Cloud is a trailblazer among hyperscalers, having begun developing its custom Tensor Processing Units (TPUs) in 2013. TPUs, designed for AI training and inferencing, are available to customers on Google Cloud. The processors also power Google services such as Search, YouTube, Gmail, and Google Maps.

The company launched its fifth-generation TPU late last year. The Cloud TPU v5p can train models 2.8 times faster than its predecessor, the company said.

Google Cloud on Tuesday (April 9) announced that it has developed its first Arm-based CPUs, called Google Axion Processors. The new CPUs, built using the Arm Neoverse V2 CPU, will be available to Google Cloud customers later this year.

The company said customers will be able to use Axion in many Google Cloud services, including Google Compute Engine, Google Kubernetes Engine, Dataproc, Dataflow and Cloud Batch.

Kimball, the analyst, expects AMD and Intel will take a revenue hit as Google Cloud begins to deploy its own CPU for its customers.

“For Google, it’s a story of ‘I’ve got the right performance at the right power envelope at the right cost structure to deliver services to my customers.’ That’s why it’s important to Google,” he said. “They look across their data center. They have a power budget, they have certain SLAs, and they have certain performance requirements they have to meet. They designed a chip that meets all those requirements very specifically.”

Meta

Meta has deployed a next-generation custom chip for AI inferencing in its data centers this year, the company announced on Wednesday (April 10).

The next-generation AI inferencing chip, previously code-named Artemis, is part of the company’s Meta Training and Inference Accelerator (MTIA) family of custom-made chips designed for Meta’s AI workloads.

Meta introduced its first-generation AI inferencing chip, MTIA v1, last year. The next-generation chip offers three times better performance and 1.5 times better performance-per-watt than the first-generation chip, the company said.

Cerebras


Cerebras logo on server rack

AI hardware startup Cerebras Systems introduced its third-generation AI processor, the WSE-3, in mid-March. The wafer-scale chip doubles the performance of its predecessor and competes against Nvidia at the high end of the AI training market.

The company in mid-March also partnered with Qualcomm to provide AI inferencing to its customers. Models trained on Cerebras’ hardware are optimized to run inferencing on Qualcomm’s Cloud AI 100 Ultra accelerator.

Groq


Groq AI chip

Groq is a Mountain View, California-based AI chip startup that has built the LPU Inference Engine to run large language models, generative AI applications and other AI workloads.

Groq, which released its first AI inferencing chip in 2020, is targeting hyperscalers, enterprises, the public sector, AI startups and developers. The company will release its next-generation chip in 2025, a company spokesperson said.

Tenstorrent


Tenstorrent Wormhole Two Chip

Tenstorrent is a Toronto-based AI inferencing startup with a strong pedigree: its CEO is Jim Keller, a chip architect who has worked at Apple, AMD, Tesla and Intel and helped design AMD’s Zen architecture and chips for early Apple iPads and iPhones.

The company began taking orders for its Wormhole AI inferencing chips this year, with a formal launch planned for later this year, said Bob Grim, Tenstorrent’s vice president of strategy and corporate communications.

Tenstorrent is selling servers powered by 32 Wormhole chips to enterprises, labs, and any organization that needs high-performance computing, he said. The company is currently focused on AI inferencing, but because its chips can also power AI training, it plans to support training workloads in the future, Grim said.

Untether AI

Untether AI is a Toronto-based AI chip startup that builds chips for energy-efficient AI inferencing.

The company – whose president is Chris Walker, a former Intel corporate vice president and general manager – shipped its first product in 2021 and plans to make its second-generation SpeedAI240 chip available this year, a company spokesperson said.

Untether AI’s chips are designed for a variety of form factors, from single-chip devices for embedded applications to four-chip PCI Express accelerator cards, so its processors are used everywhere from the edge to the data center, the spokesperson said.

About the Author

Wylie Wong

Regular Contributor

Wylie Wong is a journalist and freelance writer specializing in technology, business and sports. He previously worked at CNET, Computerworld and CRN and loves covering and learning about the advances and ever-changing dynamics of the technology industry. On the sports front, Wylie is co-author of Giants: Where Have You Gone, a where-are-they-now book on former San Francisco Giants. He previously launched and wrote a Giants blog for the San Jose Mercury News, and in recent years, has enjoyed writing about the intersection of technology and sports.
