The Battle for AI Backend Network Dominance: Ethernet vs. InfiniBand

TL;DR:

  • AI workloads are driving a 50% increase in data center switch spending.
  • AI backend networks currently represent less than 15% of the total data center switch market.
  • By 2027, AI backend network switches are expected to capture 40% of the market.
  • Generative AI applications are dealing with trillions of parameters, leading to the need for large clusters of accelerated nodes.
  • Ethernet and InfiniBand are in competition for dominance in the AI backend network market.
  • InfiniBand is expected to maintain its lead, but Ethernet is projected to reach a 20% market share by 2027.
  • The choice between Ethernet and InfiniBand depends on factors like network speed, congestion control, and adaptive routing.

Main AI News:

In the realm of data center AI networking, a pivotal question looms large: Will Ethernet or InfiniBand take the spotlight? As artificial intelligence (AI) workloads surge, so too does the demand for the backend network infrastructure that underpins these applications. According to Dell’Oro Group’s research, this surge will drive a remarkable 50% increase in data center switch spending. However, the ultimate victor in the battle for dominance in the burgeoning $10 billion AI backend network fabric market remains a matter of contention.

Presently, AI backend networks account for a modest fraction, less than 15%, of the total data center switch market expenditure. Nevertheless, analysts project a seismic shift, with AI backend network switches poised to claim a substantial 40% share of the data center switch market by 2027.

In this new era of AI, characterized by generative AI (genAI) applications grappling with an ever-expanding number of parameters, the stakes are exceptionally high. Sameh Boujelbene, Vice President at Dell’Oro, underscores this, noting that genAI applications are ushering in a new era in the age of AI, “standing out for the sheer number of parameters that they have to deal with.”

Today, large AI applications are handling trillions of parameters, a figure projected to grow tenfold each year. Coping with such rapid growth necessitates deploying thousands, or even hundreds of thousands, of accelerated nodes. To connect these nodes in vast clusters, a data center-scale fabric, known as the AI backend network, becomes indispensable. This network differs significantly from the traditional frontend network primarily used to link general-purpose servers.
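To make the scale argument concrete, here is a rough back-of-the-envelope sketch in Python. It compounds the tenfold yearly growth cited above and estimates the minimum number of accelerators needed just to hold a model's weights. The function names, the 2 bytes per parameter (16-bit weights), and the 80 GB of memory per accelerator are illustrative assumptions, not figures from Dell’Oro. Real training clusters need far more capacity, since optimizer state, activations, and data parallelism are ignored here.

```python
def projected_params(start_params, years, growth_per_year=10):
    """Parameter count after `years` of compounding growth (tenfold per year, per the article)."""
    return start_params * growth_per_year ** years


def min_accelerators(num_params, bytes_per_param=2, mem_bytes_per_accelerator=80 * 10**9):
    """Lower bound on accelerators needed to hold the weights alone.

    bytes_per_param and mem_bytes_per_accelerator are assumed values
    (16-bit weights, 80 GB of memory per device), chosen for illustration only.
    """
    total_bytes = num_params * bytes_per_param
    # Integer ceiling division avoids floating-point rounding on very large counts.
    return -(-total_bytes // mem_bytes_per_accelerator)


if __name__ == "__main__":
    start = 10**12  # one trillion parameters today (order of magnitude from the article)
    for year in range(4):
        params = projected_params(start, year)
        print(f"year {year}: {params:.0e} parameters, "
              f"at least {min_accelerators(params)} accelerators for weights alone")
```

Under these assumptions the weights-only lower bound grows from tens of accelerators to tens of thousands within three years, which is why the interconnect between nodes becomes a first-order design concern.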

Dell’Oro Group’s analysis extends to the AI backend networks constructed by major cloud service providers, including Google, Amazon Web Services, Microsoft, Meta, Alibaba, Tencent, ByteDance, and Baidu. Based on the current deployment of high-end accelerated servers, analysts assert that Microsoft holds the largest share, followed by Google and then Meta.

Amid this landscape, the rivalry between InfiniBand and Ethernet intensifies, as manufacturers on each side vie for supremacy in the AI backend network market. While InfiniBand is anticipated to maintain its lead, Ethernet is also poised for substantial growth, with an expected 20% market share by 2027, according to Boujelbene.

Yet, amidst the projections and market forecasts, a critical question persists: “What is the most suitable fabric that can scale to hundreds of thousands – and potentially millions – of accelerated nodes while ensuring the lowest job completion time?” This debate remains far from resolved.

Some may argue that Ethernet holds a one-generation lead over InfiniBand in terms of network speed. However, network speed is only one facet of the equation. InfiniBand and Ethernet approach congestion control and adaptive routing differently, factors that can significantly impact their suitability for specific enterprise needs. As Boujelbene notes, “InfiniBand used to be ahead of Ethernet in terms of performance in AI backend networks, but we are seeing significant improvement on Ethernet to close the gap.”

Conclusion:

The escalating demand for AI backend networks is reshaping the data center switch market. While InfiniBand maintains an edge, Ethernet is rapidly closing the gap. As the battle for dominance unfolds, cloud service providers may lean towards InfiniBand, while large enterprises favor Ethernet. The key lies in adapting to the evolving needs of AI applications and network performance requirements.
