Sustainability efforts and high-density AI-based applications are sparking new and revamped approaches to data center cooling.
Whether it’s to save money, reduce carbon emissions, comply with regulations or accommodate high-powered AI workloads, enterprises are looking to operate more energy-efficient data centers.
A key to achieving those goals is selecting the right technology to help cool equipment and components. Adequate cooling is vital because overheating can degrade equipment performance and reliability, as well as the energy efficiency of the data center as a whole.
Determining the best way to cool servers and other data center equipment depends on a number of factors, including the type of cooling method, how to distribute cooling to equipment, how to control the cooling system, and how to measure cooling performance.
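One common yardstick for that last factor is power usage effectiveness (PUE), the ratio of total facility power to the power consumed by IT equipment; the closer to 1.0, the less energy the facility spends on cooling and other overhead. The minimal sketch below shows the arithmetic; all figures are hypothetical, not drawn from IDC or any operator.

```python
# Minimal sketch: quantifying cooling overhead with PUE (power usage
# effectiveness) = total facility power / IT equipment power.
# All figures below are hypothetical.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """PUE = total facility power / IT power; the ideal value is 1.0."""
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

it_load = 1_000.0   # kW drawn by servers, storage and network gear
cooling = 350.0     # kW drawn by the cooling plant
other = 150.0       # kW for lighting, UPS losses, etc.
print(f"PUE: {pue(it_load + cooling + other, it_load):.2f}")  # -> PUE: 1.50
```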
Three types of air cooling
Today’s data center cooling methods are divided into two main categories: air cooling and liquid cooling. Air cooling includes three main approaches, each with its own pros and cons, according to research from International Data Corp. (IDC).
- Perimeter computer room air conditioning (CRAC) places units along the data center’s perimeter to distribute cool air to equipment. The units draw ambient air from outside, cool it and distribute it through the data center using fans and ducts. The hot air from equipment is then drawn back to the units and exhausted to the outside. The advantages of this method are that it’s simple and cost-effective, flexible and scalable, and relatively easy to maintain, IDC says. The downsides are that it can be less efficient than other cooling approaches, can distribute cool air unevenly (particularly in large data centers) and requires more floor space.
- Perimeter CRAC with raised floor containment uses the same perimeter-mounted units, fans and ducts as the first method, but adds containment to prevent the mixing of hot and cold air, according to IDC. On the plus side, this method can deliver improved cooling efficiency, reduced energy costs and reduced risk of hot spots, IDC says. On the negative side, it can bring increased initial costs and complexity, reduced space utilization and increased risk of condensation.
- Row-based cooling with containment uses dedicated cooling units to cool individual rows of server racks. The units are usually mounted in the row of racks, overhead or under the floor, IDC says, and containment separates the hot air from the cold air. The advantages are improved cooling efficiency, decreased energy costs and reduced risk of hot spots, IDC notes, while the cons are increased initial costs and complexity, reduced space utilization and increased risk of condensation.
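However the cool air is delivered, the volume a rack needs follows from the same sensible-heat relationship between heat load, airflow and temperature rise. The sketch below is a rough sizing illustration using textbook air properties; the rack load and temperature rise are assumptions, not IDC figures.

```python
# Minimal sketch: estimating the cool-air volume a rack needs from the
# sensible-heat relation Q = rho * cp * flow * dT. The air properties
# are textbook values; the rack load and temperature rise are assumed.

AIR_DENSITY = 1.2  # kg/m^3, near sea level at ~20 C
AIR_CP = 1005.0    # J/(kg*K), specific heat of air

def required_airflow_m3s(heat_load_w: float, delta_t_k: float) -> float:
    """Volumetric airflow (m^3/s) needed to remove heat_load_w at delta_t_k."""
    return heat_load_w / (AIR_DENSITY * AIR_CP * delta_t_k)

rack_kw = 10.0   # hypothetical rack load
delta_t = 12.0   # assumed temperature rise across the rack, in kelvin
flow = required_airflow_m3s(rack_kw * 1000, delta_t)
print(f"{flow:.2f} m^3/s (~{flow * 2118.9:.0f} CFM)")  # ~0.69 m^3/s, ~1,460 CFM
```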
Three types of liquid cooling
Air cooling has its limitations, says Arno van Gennip, vice president, global IBX operations engineering, at data center colocation provider Equinix. “As we see the amount of power used by new processors steadily increasing, air cooling will not be sufficient for certain applications,” van Gennip says. This has led to the rise of the other main category of data center cooling technologies: liquid cooling, including liquid immersion and direct-to-chip liquid cooling.
“At Equinix, we are seeing liquid cooling reemerging as technology for supporting high-density data centers,” van Gennip adds. “While air cooling has been the dominant approach, companies are now exploring liquid cooling, thanks to its ability to transfer heat more efficiently than air.”
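A quick calculation makes that efficiency gap concrete: water’s much higher density and specific heat let it carry the same heat load in a tiny fraction of the volumetric flow that air requires. The comparison below uses textbook fluid properties and a hypothetical 100 kW load; it is an illustration, not an Equinix design figure.

```python
# Minimal sketch: why liquid transfers heat more efficiently than air.
# Both cases remove 100 kW at a 10 K temperature rise; fluid properties
# are textbook values and the load is hypothetical.

def flow_m3s(heat_w: float, density: float, cp: float, dt_k: float) -> float:
    """Volumetric flow needed: Q = rho * cp * flow * dT, solved for flow."""
    return heat_w / (density * cp * dt_k)

LOAD_W, DT = 100_000.0, 10.0
air = flow_m3s(LOAD_W, 1.2, 1005.0, DT)      # air:   ~8.3 m^3/s
water = flow_m3s(LOAD_W, 997.0, 4186.0, DT)  # water: ~0.0024 m^3/s
print(f"air:   {air:.2f} m^3/s")
print(f"water: {water * 60_000:.0f} L/min")  # ~144 L/min, ~3,500x less volume
```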
- Immersion cooling: With immersion cooling, servers and other IT equipment are submerged in a non-conductive dielectric liquid, which eliminates any risk of electrical shock, IDC says. Immersion cooling can be either single-phase, where the coolant stays in liquid form through the entire process, or two-phase, where it’s converted to a gas and then back to a liquid when it is cooled. The IDC research says this method provides the best heat dissipation, and improves energy efficiency and environmental impact. On the downside, it requires server modification and is complex to implement and maintain. While immersion cooling can allow organizations to achieve high power densities within the data center, “it also requires the most substantial changes to server technology and data center architecture,” van Gennip says. “Because it’s such a radical departure from traditional methods of deploying IT equipment, immersion cooling can often have substantial upfront costs and considerations.”
- Direct-to-chip: Direct-to-chip cooling circulates a liquid coolant directly over a CPU or other heat-generating component, absorbing heat from the equipment, then using a heat exchanger to dissipate that heat to ambient air or water. As with immersion cooling, this method can be either single-phase or two-phase, according to the IDC report. The advantages are high heat dissipation, improved energy efficiency and environmental impact, and space savings compared with immersion cooling, according to IDC. The disadvantages are that it requires server modification and is complex to implement and maintain. Though direct-to-chip fits in a standard footprint, “it still requires architectural changes and additional equipment to deliver liquid to the cabinet and distribute it to the individual servers,” van Gennip says.
- Air-assisted liquid cooling: Another approach is air-assisted liquid cooling (AALC), where air cooling is augmented by liquid-filled radiators attached to the rear of a data center’s racks, says Chris Sharp, CTO at Digital Realty, a provider of data center services and colocation. “This lets the customer deploy densities above the typical 35 kW ceiling of air-only configurations, sometimes going up to 70 kW, with the added complexity for the data center operator of needing to bring liquid to the rack,” he says. It’s a good middle ground between air-only and direct liquid cooling, Sharp says, as it does not require modification of an organization’s equipment. Direct liquid cooling can cool more than 100 kW in many cases, he says, making it most suitable for highly dense racks, such as those used for the highest-performance generative AI use cases. “But it brings the most complexity in terms of operation and deployment,” Sharp says. “The data center operator must bring liquid to the rack and the customer must also have compatible equipment where heat blocks are attached to the components that generate heat inside their servers.”
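Taken together, Sharp’s density figures sketch a rough decision ladder for matching racks to cooling methods. The snippet below encodes those thresholds purely as an illustration; the cutoffs are approximate guides from his comments, not hard limits of any product.

```python
# Illustrative only: mapping Sharp's rack-density figures to cooling
# approaches (air to ~35 kW, AALC to ~70 kW, direct liquid cooling for
# ~100 kW and up). Thresholds are rough guides, not specifications.

def suggest_cooling(rack_kw: float) -> str:
    if rack_kw <= 35:
        return "air cooling (perimeter CRAC or row-based)"
    if rack_kw <= 70:
        return "air-assisted liquid cooling (rear-mounted radiators)"
    return "direct liquid cooling (direct-to-chip or immersion)"

for density in (15, 50, 120):  # hypothetical rack densities in kW
    print(f"{density:>4} kW rack -> {suggest_cooling(density)}")
```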
Even with direct liquid cooling, some air cooling is still needed, Sharp says. “Not everything will be liquid cooled within a particular server,” he says. “This means starting with air and adding liquid over time is typically a good move.”
The majority of data center facilities have been designed for air cooling, Sharp says. In many cases, existing facilities can be retrofitted to employ AALC or direct liquid cooling, though this might be a challenge for facilities that aren’t designed in a modular fashion, he says.
“The data center operator must weigh the value of bringing liquid to the facility for the expected customer footprint; not every customer or deployment will want or need liquid cooling,” Sharp says. “The majority of new facility builds should be liquid capable; that is, either with liquid cooling available at their initial commissioning, or have it be easy to add later.”
Emerging data center cooling technologies
Data center planners and builders can consider several newer cooling technologies becoming available, including geothermal cooling.
Geothermal cooling uses the relatively constant temperature of the earth below the surface to cool data center equipment, says Sean Graham, research director, cloud to edge datacenter trends at IDC. “Heat from the equipment is transferred to a fluid, which is then circulated through underground heat exchangers,” Graham says. “The fluid cools as it passes through the heat exchangers, and it is then returned to the data center to cool the equipment.”
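Conceptually, the buried loop behaves like a heat exchanger working against a nearly constant-temperature earth sink, which can be approximated with the standard NTU method. In the sketch below, the loop conductance, flow rate and temperatures are invented for illustration; real designs call for detailed geothermal modeling.

```python
# Minimal sketch: one pass through a ground loop, modeled as a heat
# exchanger against a constant-temperature earth sink (NTU method).
# UA, flow rate and temperatures are hypothetical.

import math

def loop_outlet_temp_c(t_in_c: float, t_ground_c: float,
                       ua_w_per_k: float, m_dot_kg_s: float,
                       cp: float = 4186.0) -> float:
    """Coolant temperature after one pass through the buried exchanger."""
    ntu = ua_w_per_k / (m_dot_kg_s * cp)   # number of transfer units
    effectiveness = 1 - math.exp(-ntu)     # fraction of max possible heat moved
    return t_in_c - effectiveness * (t_in_c - t_ground_c)

t_out = loop_outlet_temp_c(t_in_c=35.0, t_ground_c=12.0,
                           ua_w_per_k=20_000.0, m_dot_kg_s=5.0)
print(f"return temperature: {t_out:.1f} C")  # ~20.8 C, cooler than it left
```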
The industry already has realized the benefits of air-assisted cooling, which can use a combination of airside economizers, chilled water economizers and evaporative cooling, Graham says. “These technologies are proven to reduce energy consumption and improve sustainability,” Graham says. “The biggest drawback of these approaches is that their effectiveness is limited to certain climates. Geothermal cooling solves that problem.”
Geothermal cooling can significantly reduce energy consumption and water consumption, and it can also help to improve the reliability of data center equipment, Graham says. “The benefit of geothermal cooling relative to [air-assisted cooling] is that it could be used in all geographies,” he says.
“Geothermal cooling is interesting; it would need some geographic feature that would allow you to get rid of the waste heat,” adds Tony Harvey, senior director and analyst, infrastructure and operations, at research firm Gartner.
“Similar to geothermal power, you have to be in a location where you can use it,” Harvey says. “In addition, there will be environmental impacts to the heat sink. Creating a warm area in a lake, for example, could cause a change in the local aquatic life. This may not be a bad thing, but it is an impact. Leaks would also have to be considered especially if the refrigerant is not plain water.”
Using AI to improve data center cooling
Using AI to address a problem created, in part, by the cooling demands of AI-based applications is an intriguing approach. “AI is already used in some data centers to optimize cooling [and] their use cases and capabilities are expanding,” Graham says. AI could be used to predict cooling needs and adjust cooling capacity accordingly, and algorithms could dynamically adjust the cooling systems in real time to ensure optimal performance, Graham says.
AI could also provide recommendations for equipment upgrades or changes in configuration to improve cooling efficiency, Graham says.
Another area where the technology can help is performance analytics. “AI provides valuable insights into the performance of the cooling systems, helping to identify areas for improvement and ensuring that the systems are operating efficiently,” Graham says. By optimizing cooling systems, AI helps in reducing energy consumption, leading to significant cost savings and a reduction in a data center’s carbon footprint, Graham says.
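In practice, the predict-and-adjust loop Graham describes can start as simply as a regression over historical telemetry. The sketch below is a stand-in for a real AI/ML pipeline: the data is fabricated and the linear model deliberately minimal, but it shows the shape of the loop.

```python
# Illustrative only: predicting cooling demand from IT load and outdoor
# temperature, the first half of the predict-and-adjust loop described
# above. Data and model are stand-ins for a real AI/ML pipeline.

import numpy as np

# Hypothetical history: columns are [IT load kW, outdoor temp C];
# targets are observed cooling-plant draw in kW.
X = np.array([[800, 20], [900, 25], [1000, 30], [950, 22]], dtype=float)
y = np.array([240, 300, 380, 310], dtype=float)

# Fit a least-squares linear model, with a column of ones for the intercept.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predicted_cooling_kw(it_kw: float, outdoor_c: float) -> float:
    return float(np.array([it_kw, outdoor_c, 1.0]) @ coef)

# A controller would compare the forecast with current capacity and
# adjust setpoints or staged units in real time.
print(f"predicted cooling demand: {predicted_cooling_kw(980, 28):.0f} kW")
```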
The use of AI for data center cooling is in its infancy, Sharp says. “Today, this technology can be used to assist with the thermal modeling of the facility from a fluid dynamics perspective, to attempt to isolate and prevent hot spots,” he says. “In the future, AI will likely be used in more operational systems that can ingest large amounts of operational data from equipment in the data center in order to optimize cooling across a range of rack densities in a single site.”
AI is driving an increasing average rack density in the data center, which itself creates the need for more liquid cooling, Sharp says. “Innovations in this area, including non-electrically conductive coolants, industry-standard backplane connectors to easily connect customer equipment to the facility’s liquid loop, and intelligent flow control systems, are all promising for improving the safety, accessibility, performance and environmental impact of the data center,” he says.
To decrease the energy usage of cooling systems, Equinix is trialing the use of AI/machine learning technology as a superseding control system on top of the existing control systems, van Gennip says. “The initial results look very promising,” he says.
Which technologies are ideal for existing data centers vs. newly constructed facilities? “Ultimately, the right cooling methodology for any data center will depend on a number of factors including its design, the climate, environmental goals, cooling requirements, budget, etc.,” Graham says.
Newer data centers, especially those focused on high-density workloads such as generative AI, are most likely better suited for direct-to-chip or immersion cooling, Graham says.