Google this week unveiled technical details of its method for efficiently transferring upwards of 1.2 exabytes of data (more than a billion gigabytes) every day.

Details of the data copy service, known as Effingo, were presented in a technical paper during a delegate session at SIGCOMM 2024, the annual conference of the ACM Special Interest Group on Data Communication, which wrapped up today in Sydney, Australia.

Its authors note in the paper that “WAN bandwidth is never too broad — and the speed of light stubbornly constant. These two fundamental constraints force globally distributed systems to carefully replicate data close to where they are processed or served. A large organization owning such systems adds dimensions of complexity with ever-changing network topologies, strict requirements on failure domains, multiple competing transfers, and layers of software and hardware with multiple kinds of quotas.”

According to the seven Google scientists who authored the paper, on a typical day Effingo transfers over an exabyte of data across dozens of clusters spread across continents and serves more than 10,000 users.

They called managed data transfer “an enabler, an unsung hero of large-scale, globally distributed systems” because it “reduces the network latency from across-globe hundreds to in-continent dozens of milliseconds.” This, they wrote, enables “the illusion of interactive work” for users.

However, they noted, the goal of most data transfer management systems “is to transfer when it is optimal to do so — in contrast to a last-minute transfer at the moment data needs to be consumed. Such systems provide a standard interface to the resources, an interface that mediates between users’ needs, budgets and system goals.”

Effingo is different in that it “has requirements and features uncommon in reported large-scale data transfer systems.” Rather than optimizing for transfer time, it optimizes for smooth bandwidth usage while controlling network costs, for example by shaping the copy tree to minimize the use of expensive links such as subsea cables (a simplified illustration appears below).

Its other design requirements included client isolation, which prevents transfers by one client from affecting those of other clients; isolated failure domains, so that a copy between two clusters does not depend on a third cluster; data residency constraints, which prohibit copies from being made to any location not explicitly specified by the client; and data integrity checks to prevent data loss or corruption. In addition, the system must continue to operate even when its dependencies are slow or temporarily unavailable.

The paper details how Google achieved each of these goals, with a section on lessons learned chronicling Effingo’s evolution. It emphasizes, however, that Effingo is still a work in progress and continuously evolving. The authors said that Google plans to improve CPU usage during cross-data-center transfers, improve integration with resource management systems, and enhance the control loop so it can scale out transfers faster.
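Google has not released Effingo’s tree-building code, but the general idea of a cost-aware copy tree can be sketched in a few lines of Python. The cluster names, per-gigabyte link prices, and greedy Prim-style attachment below are illustrative assumptions for this article, not Google’s actual algorithm or data.

```python
# Hypothetical sketch of a cost-aware copy tree. Cluster names, link costs,
# and the greedy strategy are assumptions for illustration only.

# Per-gigabyte cost of moving data between clusters; subsea links cost more.
LINK_COST = {
    ("us-east", "us-west"): 1.0,
    ("us-east", "eu-west"): 8.0,      # subsea
    ("us-west", "asia-east"): 9.0,    # subsea
    ("eu-west", "asia-east"): 7.0,    # subsea
    ("eu-west", "eu-north"): 1.5,
}

def cost(a, b):
    """Symmetric lookup of the per-GB cost of the link between two clusters."""
    return LINK_COST.get((a, b)) or LINK_COST.get((b, a), float("inf"))

def copy_tree(source, destinations):
    """Attach each destination to the cheapest already-reached cluster
    (Prim-style), so expensive subsea links are traversed as rarely as possible."""
    reached = {source}
    pending = set(destinations) - reached
    edges = []
    while pending:
        parent, child = min(
            ((p, c) for p in reached for c in pending),
            key=lambda pc: cost(*pc),
        )
        edges.append((parent, child, cost(parent, child)))
        reached.add(child)
        pending.discard(child)
    return edges

if __name__ == "__main__":
    tree = copy_tree("us-east", ["us-west", "eu-west", "eu-north", "asia-east"])
    for parent, child, c in tree:
        print(f"{parent} -> {child}  (cost {c}/GB)")
```

In this toy example, the transatlantic link is crossed once and European and Asian clusters then fan out from in-region copies, rather than every destination pulling directly from the source over subsea cables.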
Nabeel Sherif, principal advisory director at Info-Tech Research Group, sees great value in the service. “While there might be considerations around cost and sustainability for such a resource- and network-intensive use case, the ability for organizations to greatly increase the scale and distance of their georedundancy means being able to achieve better user experiences, as well as removing some of the limitations of making data accessible to applications that don’t sit very close by,” he said today. This, he said, “can be a game changer in both the areas of business continuity, global reach for web applications, and many other types of collaborations.”