The speed of Meta’s AI Research SuperCluster would dwarf that of the current world’s fastest supercomputer.

Credit: Gerd Altmann

Facebook’s parent company Meta said it is building the world’s largest AI supercomputer to power machine learning and natural language processing for its metaverse project.

The new machine, called the AI Research SuperCluster (RSC), will contain 16,000 Nvidia A100 GPUs and 4,000 AMD Epyc Rome 7742 processors. It will comprise 2,000 Nvidia DGX-A100 nodes, each with eight GPUs and two Epyc processors. Meta expects to complete construction this year.

RSC is already partially built, with 760 of the DGX-A100 systems deployed. Meta researchers have already started using RSC to train large models in natural language processing (NLP) and computer vision, with the goal of eventually training models with trillions of parameters, according to Meta.

“Meta has developed what we believe is the world’s fastest supercomputer. We’re calling it RSC for AI Research SuperCluster, and it’ll be complete later this year. The experiences we’re building for the metaverse require enormous compute power (quintillions of operations/second!) and RSC will enable new AI models that can learn from trillions of examples, understand hundreds of languages, and more,” said CEO Mark Zuckerberg in an emailed statement.

RSC is expected to hit a peak performance of 5 exaFLOPS in mixed-precision processing (FP16 and FP32), which would rocket it to the top of the Top500 supercomputer list, whose top-performing system reaches 442 petaFLOPS. It is being built in partnership with Penguin Computing, a specialist in HPC systems. Meta is not disclosing where the system is located.
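The hardware figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the node counts and performance numbers are taken from the article, while the per-GPU peak of 312 teraFLOPS is an assumption drawn from Nvidia’s published A100 specification for dense FP16 Tensor Core throughput, not something Meta states. The function and variable names are made up for this example.

```python
# Figures quoted in the article.
NODES = 2_000
GPUS_PER_NODE = 8
CPUS_PER_NODE = 2
NODES_DEPLOYED = 760
RSC_PEAK_EXAFLOPS = 5.0
TOP500_LEADER_PFLOPS = 442.0

# Assumption (not from the article): Nvidia's published peak for one A100,
# dense FP16 on Tensor Cores, in teraFLOPS.
A100_PEAK_FP16_TFLOPS = 312.0


def summarize_rsc():
    """Derive totals and ratios from the quoted RSC figures."""
    total_gpus = NODES * GPUS_PER_NODE
    total_cpus = NODES * CPUS_PER_NODE
    return {
        "total_gpus": total_gpus,
        "total_cpus": total_cpus,
        # Share of the planned nodes that are already deployed.
        "deployed_fraction": NODES_DEPLOYED / NODES,
        # 1 exaFLOPS = 1,000 petaFLOPS.
        "ratio_to_top500_leader": RSC_PEAK_EXAFLOPS * 1_000 / TOP500_LEADER_PFLOPS,
        # 1 exaFLOPS = 1,000,000 teraFLOPS.
        "implied_tflops_per_gpu": RSC_PEAK_EXAFLOPS * 1_000_000 / total_gpus,
        # Peak estimated from the assumed per-GPU figure.
        "estimated_peak_exaflops": total_gpus * A100_PEAK_FP16_TFLOPS / 1_000_000,
    }


if __name__ == "__main__":
    for name, value in summarize_rsc().items():
        print(f"{name}: {value}")
```

Under these assumptions, the node counts reproduce the quoted 16,000 GPUs and 4,000 processors, and the 5 exaFLOPS figure works out to roughly 312 teraFLOPS per GPU, which is consistent with the assumed A100 peak. The ratio of about 11 compares the two headline numbers as quoted; the Top500 figure is a measured FP64 Linpack result, so this is not a like-for-like benchmark comparison.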
“RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; seamlessly analyze text, images, and video together; develop new augmented reality tools; and much more,” Kevin Lee, a technical program manager, and Shubho Sengupta, a software engineer, both at Meta, wrote in a blog post.

“We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together,” they wrote.

In addition to all of the processing power, RSC also has 175 petabytes of Pure Storage FlashArray capacity, 46 petabytes of cache storage, and 10 petabytes of Pure’s object storage.

RSC is estimated to be nine times faster than Meta’s previous research cluster, which is made up of 22,000 of Nvidia’s older-generation V100 GPUs, and 20 times faster than its current AI systems. Meta does not plan to retire the old system.

The company is focused on building machine-learning models for automated tasks centered on content. It wanted this infrastructure in order to train models with more than a trillion parameters on data sets as large as an exabyte, with the goal of getting its arms around all the content generated on its platform.

“By doing this, we can help advance research to perform downstream tasks such as identifying harmful content on our platforms as well as research into embodied AI and multimodal AI to help improve user experiences on our family of apps. We believe this is the first time performance, reliability, security, and privacy have been tackled at such a scale,” Lee and Sengupta wrote.