New PCIe switch fabric aims to remove networking bottleneck for large AI clusters in the cloud.
There’s a new type of switch that could soon be showing up in AI-optimized data centers: a PCIe 6 fabric switch.
Astera Labs this week unveiled its new Scorpio fabric switch, which it says is the industry’s first PCIe 6 switch. These switches are specifically designed for AI workloads in accelerated computing platforms deployed at cloud scale. The Scorpio switches aim to deliver maximum system utilization and uptime for scale-out PCIe 6 connectivity and scale-up GPU clustering in AI servers.
The portfolio includes two product lines: Scorpio P-Series for GPU-to-CPU/NIC/SSD PCIe 6 connectivity and Scorpio X-Series for back-end GPU clustering.
Founded in 2017, Astera Labs is a semiconductor company that specializes in connectivity technologies for AI and cloud infrastructure. The company held its initial public offering on March 20, 2024, and offers product lines that support Ethernet, PCIe and CXL (Compute Express Link) connectivity.
“We’re really founded and continue to focus on solving connectivity bottlenecks within AI and cloud infrastructure,” Ahmad Danesh, associate vice president, product line management at Astera Labs, told Network World.
Scorpio PCIe 6.0 fabric switch
GPUs and other AI accelerators typically connect to a server motherboard through a PCIe (PCI Express) slot. To date, however, there hasn’t been a dedicated switch fabric technology for interconnecting PCIe 6 links. That’s the primary innovation that Astera Labs is now bringing to market with its Scorpio technology.
The P-Series fabric switch is designed to optimize data transfer between various components of an AI server.
“The biggest use case is NIC-to-GPU data ingest,” Danesh said. “How do you get from the scale-out network into the GPU so you can maximize GPU utilization, keep the GPU fed with data, as well as getting it back out?”
Danesh explained that the fundamental job of the Scorpio P-Series is to provide high-performance peer-to-peer traffic between the GPU and all of its key resources, whether that’s the CPU, the NICs or the SSDs.
The X-Series targets scale-up clusters, facilitating high-speed GPU-to-GPU communication. Unlike general-purpose switches, the X-Series offers platform-specific customizations.
“We can actually do platform-specific customization, depending on the different GPU, to really eke out performance, eke out the best bandwidth efficiency as well as the highest reliability,” Danesh said.
How the Scorpio switch fabric is deployed
While Astera Labs refers to Scorpio as a switch fabric, the actual technology the company currently provides is the silicon and software.
Astera Labs has not announced plans to build its own physical chassis or rack-level hardware for the Scorpio switch. The company sells the Scorpio silicon and evaluation boards and works with cloud service providers and OEMs for the actual hardware deployment.
Danesh explained that the Scorpio switch can be deployed in different form factors, such as a top-of-rack switch or integrated onto the same PCB (printed circuit board) as the AI accelerators.
The Scorpio technology will benefit from Astera Labs’ Cosmos software stack, which integrates connectivity, system management and optimization capabilities. Danesh noted that Cosmos provides a significant amount of telemetry and diagnostic information so that users can see exactly what’s going on at a link level.
“With our switching we can actually build more predictable performance as well, where each of these data paths is isolated and the GPUs can be fed more consistently with data,” Danesh said.
Astera Labs is already shipping pre-production quantities of Scorpio chips to customers, with full production slated for 2025.