Liquid Cooling vs Air Cooling for GPU Servers

Brenden Reeves

Air cooling blows air over heatsinks (metal fins that conduct heat away from the GPU). Liquid cooling pumps coolant through cold plates bolted to the GPU. Both methods remove heat, but they set different ceilings on what a data center can support.

Liquid cooling fits more GPUs per rack: fewer racks, less floor space, fewer network switches, and shorter cable runs between GPUs, which reduces communication latency for training workloads. Most B200 systems ship air-cooled, but the GB200 NVL72 rack is liquid-only, and NVIDIA's next-generation Rubin GPUs (1,800-2,300W TDP) will require liquid cooling across the board. Most data centers do not yet support liquid cooling. [6]

[Figure: Normalized 42U rack using 10U air and 4U liquid nodes. Air-cooled: 4 nodes x 8 GPUs = 32 GPUs per rack; liquid-cooled: 8 nodes x 8 GPUs = 64 GPUs per rack. Actual rack layouts vary by OEM.]

[Figure: Illustrative facility PUE by rack density. PUE (1.0-2.2) plotted against average rack power density (0-120 kW); air cooling hits its practical limit near 40 kW per rack, while liquid (D2C) holds a lower PUE at higher densities. Illustrative trend based on DCPulse and Uptime Institute data.]

PUE (Power Usage Effectiveness) measures total facility power divided by IT equipment power. PUE 1.0 means zero cooling overhead.

Why GPU servers need so much cooling

Every watt a GPU consumes becomes heat. TDP (Thermal Design Power) is the maximum sustained heat the chip generates under load. If the cooling system cannot remove that heat fast enough, the GPU throttles its clock speed to protect itself, and performance drops.
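
One way to catch throttling in practice is to watch clocks alongside temperature and power. A minimal monitoring sketch, assuming a Linux host with nvidia-smi on the PATH; the query fields are standard, but the 1,980 MHz reference clock is a placeholder you would swap for your GPU's rated boost clock:

```python
import subprocess
import time

# Fields supported by `nvidia-smi --query-gpu`; see `nvidia-smi --help-query-gpu`.
QUERY = "temperature.gpu,power.draw,clocks.sm"
EXPECTED_SM_MHZ = 1980  # placeholder rated clock, not a spec for any GPU below

def sample_gpus():
    """Return one (temp_C, power_W, sm_clock_MHz) tuple per GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    return [tuple(float(v) for v in line.split(", ")) for line in out.strip().splitlines()]

while True:
    for i, (temp, power, clock) in enumerate(sample_gpus()):
        # A clock that stays well below the rated boost under load suggests throttling.
        if clock < 0.9 * EXPECTED_SM_MHZ:
            print(f"GPU{i}: {temp:.0f}C {power:.0f}W {clock:.0f}MHz <- possible thermal throttle")
    time.sleep(5)
```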

GPU  | Architecture | Year | TDP (SXM form factor)
A100 | Ampere       | 2020 | 400W
H100 | Hopper       | 2022 | 700W
H200 | Hopper       | 2024 | 700W
B200 | Blackwell    | 2024 | 1,000W

Sources: NVIDIA A100 datasheet [1], H100 datasheet [4], H200 datasheet [5], DGX B200 User Guide [2]

An 8-GPU B200 server consumes 8 kW in GPU power alone. CPUs, memory, network cards, fans, and power supply losses add significantly to that. [2]

NVIDIA's GB200 NVL72, a 72-GPU rack-scale Blackwell system, consumes 120-132 kW total. [3] [9] As of early 2026, most data centers run 10-30 kW per rack, and few exceed 30 kW. [6] A single NVL72 rack draws 4-12x that.
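
The multiple is simple arithmetic on those figures; a quick check using the low-end 120 kW NVL72 number from the text:

```python
nvl72_kw = 120               # low-end GB200 NVL72 draw cited above
typical_rack_kw = (10, 30)   # typical range per the Uptime Institute survey

ratios = [nvl72_kw / kw for kw in typical_rack_kw]
print(f"One NVL72 draws {min(ratios):.0f}x to {max(ratios):.0f}x a typical rack")
# -> One NVL72 draws 4x to 12x a typical rack
```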

How air cooling works

Fans push ambient air across heatsinks attached to the GPU and CPU packages. Cool air enters from the front of the server, absorbs heat as it passes over the fins, and exits hot from the rear.

Data centers organize this into hot aisle/cold aisle containment: physical barriers separate cool intake air from hot exhaust so the exhaust cannot recirculate. Cold air from CRAC (Computer Room Air Conditioning) or CRAH (Computer Room Air Handler) units feeds the cold aisle. Servers draw it in, heat it, and exhaust it into the hot aisle, which routes back to the cooling units.

ASHRAE TC 9.9, the technical committee that sets thermal guidelines for data center equipment, recommends an inlet air temperature of 18-27°C for server hardware. [7] Operating within that range extends equipment life and keeps energy costs predictable.

The physics set the limit. Air has a low specific heat capacity (the energy needed to raise a kilogram by one degree), about 1 kJ per kg per °C, and it is also roughly 800x less dense than water. Per unit volume, water therefore carries about 3,400x more heat, so a far smaller flow of coolant moves the same load. Fan power scales with the cube of fan speed, so a 10% increase in airflow demands about 33% more fan energy. [8]
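
Both numbers fall out of textbook constants. A short sketch using standard sea-level values for air and water; the constants are generic physical properties, not figures from the cited sources:

```python
# Fan affinity law: power scales with the cube of fan speed, P2 = P1 * (N2/N1)**3.
airflow_increase = 1.10
extra_fan_power = airflow_increase**3 - 1
print(f"{extra_fan_power:.0%} more fan power for 10% more airflow")   # -> 33%

# Volumetric heat capacity = density * specific heat: heat absorbed per m^3 per K.
air_j_per_m3k = 1.2 * 1005      # ~1.2 kg/m^3 * ~1.0 kJ/kg.K
water_j_per_m3k = 998 * 4186    # ~998 kg/m^3 * ~4.2 kJ/kg.K
print(f"Water moves ~{water_j_per_m3k / air_j_per_m3k:,.0f}x more heat per unit volume")
# -> ~3,464x, the rounded ~3,400x figure above
```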

The Uptime Institute's 2025 Global Data Center Survey found that 67% of existing data centers cannot support modern GPU power densities. [6]

How liquid cooling works

In direct-to-chip (D2C) cooling, a cold plate (a metal block with internal channels) mounts directly on the GPU. Coolant flows through those channels, absorbs heat at the source, and carries it to a CDU (Coolant Distribution Unit), the heat exchanger that sits outside or beside the rack. D2C handles the hottest components, GPUs and CPUs, with liquid, while fans still cool lower-power parts like memory modules and storage drives.

The CDU operates two loops. A secondary loop circulates filtered coolant between the CDU and the cold plates inside the servers. A primary loop connects to the facility's chilled water supply or dry coolers outside the building. The CDU transfers heat from the server loop to the facility loop, then sends cooled fluid back to the cold plates.
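
The secondary loop is sized with the steady-state heat balance Q = ṁ·c_p·ΔT. A back-of-envelope sketch for one 8x B200 node, where the 10°C loop temperature rise is an illustrative assumption rather than a vendor spec:

```python
# Q = m_dot * c_p * dT -> required coolant mass flow for a given heat load.
heat_load_w = 8_000   # GPU power alone for one 8x B200 node (from the text)
c_p = 4186            # specific heat of water, J/(kg.K)
delta_t_k = 10        # coolant rise across the cold plates, K (assumed)

m_dot = heat_load_w / (c_p * delta_t_k)    # kg/s
l_per_min = m_dot * 60 / 0.998             # water is ~0.998 kg per liter
print(f"{m_dot:.2f} kg/s ~= {l_per_min:.1f} L/min per node")
# -> 0.19 kg/s ~= 11.5 L/min through the secondary loop
```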

CDUs come in two types. Liquid-to-liquid (L2L) CDUs connect to a facility chilled water plant, the standard for large deployments. Liquid-to-air (L2A) CDUs reject heat to air through built-in fans, useful for smaller installations or sites without chilled water infrastructure.

[Figure: Direct-to-chip cooling, two-loop architecture. The secondary loop runs coolant from the CDU heat exchanger through the rack manifold to cold plates on each GPU; the primary loop connects the CDU to the facility chilled water plant or dry coolers. The CDU isolates server coolant from the facility water supply.]

A second approach, immersion cooling, submerges the entire server in dielectric fluid (a non-conductive liquid).

                     | Direct-to-chip (D2C) | Immersion
How it works         | Cold plates on GPUs/CPUs; coolant loops to a CDU | Server submerged in dielectric fluid. Single-phase keeps the fluid liquid throughout; two-phase lets it evaporate at the hot surface and condense on a cooler surface above, transferring more heat per cycle.
Heat captured        | Up to 98% of system heat through liquid [9] | 100% (all components submerged)
PUE                  | ~1.15 [8] | 1.03-1.08 [10]
Server compatibility | Standard chassis with cold plate retrofit | Purpose-built tanks and enclosures
Maturity             | Production standard for Blackwell [9] | Niche; scaling expected 2026-2027 [10]

Direct-to-chip is the production standard for Blackwell-class hardware. Supermicro, Dell, and NVIDIA's own GB200 NVL72 all use D2C. [9] [11] [3] IDTechEx projects two-phase immersion will begin scaling in 2026-2027 as GPU TDPs push past the limits of single-phase systems. [10]

Air vs liquid tradeoffs

                       | Air cooling | Liquid cooling (D2C)
Max rack density       | 25-40 kW | 80-250+ kW
Facility requirements  | Raised floors or containment; CRAC/CRAH units | CDU per rack or row; piping; chilled water or dry coolers; leak detection
Maintenance            | Low: replace fans, clean filters | Higher: trained technicians, coolant management, pump servicing
Upfront cost           | Lower | Higher (CDUs, piping, plumbing)
Energy cost at density | Higher (fan power scales with the cube of speed) | Lower (PUE advantage compounds over time)
GPU thermal headroom   | Limited at high TDP | Better: lower junction temps, sustained boost clocks

Sources: Uptime Institute (2025) [6], DCPulse [8], Supermicro [9]

The cost trade-off depends on scale. A single 8-GPU server with air cooling is cheaper to deploy: no plumbing, no CDU, no coolant management. At rack scale or higher, liquid cooling's density advantage changes the math. In Supermicro's published HGX B200 examples, the liquid-cooled design fits 8 systems and 64 GPUs in a 42U rack, while the air-cooled design fits 4 systems and 32 GPUs in a 42U rack. [9] The exact mix depends on the chassis and rack, but the pattern is consistent: liquid cooling buys density.
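
At fleet scale, the density gap compounds into floor space. A sketch using the Supermicro figures above; the 1,024-GPU fleet size is an arbitrary illustration:

```python
import math

fleet_gpus = 1024                                    # illustrative cluster size
gpus_per_42u_rack = {"air": 32, "liquid (D2C)": 64}  # Supermicro HGX B200 examples [9]

for method, per_rack in gpus_per_42u_rack.items():
    racks = math.ceil(fleet_gpus / per_rack)
    print(f"{method}: {racks} racks for {fleet_gpus} GPUs")
# -> air: 32 racks; liquid (D2C): 16 racks, with correspondingly shorter cable runs
```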

PUE is a multiplier on every watt of IT load. In a 10 MW IT deployment, PUE 1.8 means 18 MW total facility draw, 8 MW of it just cooling and power distribution. PUE 1.15 drops that overhead to 1.5 MW. The 6.5 MW difference costs about $4 million per year at $0.07/kWh. [8]
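
The arithmetic behind that figure, reproduced directly from the numbers in the paragraph:

```python
IT_LOAD_MW = 10
PRICE_PER_KWH = 0.07
HOURS_PER_YEAR = 8760

def overhead_mw(pue: float) -> float:
    """Facility power beyond the IT load itself (cooling, distribution)."""
    return IT_LOAD_MW * (pue - 1)

gap_mw = overhead_mw(1.8) - overhead_mw(1.15)     # 8.0 - 1.5 = 6.5 MW
annual_cost = gap_mw * 1_000 * HOURS_PER_YEAR * PRICE_PER_KWH
print(f"{gap_mw:.1f} MW overhead gap -> ${annual_cost / 1e6:.1f}M per year")
# -> 6.5 MW overhead gap -> $4.0M per year
```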

What the hardware dictates

The cooling method is not always a free choice. NVIDIA and OEM (Original Equipment Manufacturer) server designs sometimes make it for you. Server chassis are measured in rack units (U), where 1U equals 1.75 inches of vertical space. A standard rack is 42U tall.

System                  | Cooling     | Form factor  | Practical takeaway
GB200 NVL72             | Liquid only | Full rack    | Requires facility liquid cooling infrastructure
HGX B200 (liquid)       | Liquid      | 4U per node  | 2x GPU density vs air-cooled HGX B200
HGX B200 (air)          | Air         | 10U per node | No plumbing or CDU needed
DGX B200                | Air         | 10U per node | NVIDIA turnkey; no OEM customization
Dell XE9680 (H100/H200) | Air         | 6U per node  | Fits Hopper-era TDP with air only
Dell XE9680L            | Liquid      | 4U per node  | Same baseboard as XE9680 in less space

Sources: NVIDIA [2] [3], Supermicro [9], Dell [11] [12]

The form factor difference comes from the cooling hardware itself. An air-cooled server needs tall heatsinks and rows of high-speed fans to force enough airflow across the GPUs. Those components take physical space, which is why Supermicro's air-cooled HGX B200 is a 10U chassis. Replace the heatsinks and fans with compact cold plates and manifold tubing, and the same baseboard fits in 4U. [9]

Within the same HGX B200 product family, cooling changes the density ceiling. Supermicro's published examples show 32 GPUs in its air-cooled rack design and 64 GPUs in its liquid-cooled rack design. [9]
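
The U-space arithmetic is easy to check. A sketch assuming 2U is held back for switches and manifolds; that reservation is an assumption for illustration, and actual rack budgets vary by design:

```python
RACK_U, RESERVED_U = 42, 2   # 42U rack; 2U held for switches/manifolds (assumed)

for name, node_u in [("air-cooled HGX B200", 10), ("liquid-cooled HGX B200", 4)]:
    nodes = (RACK_U - RESERVED_U) // node_u
    print(f"{name}: {nodes} nodes x 8 GPUs = {nodes * 8} GPUs by space alone")
# Air: 4 nodes = 32 GPUs, matching the published design. Liquid: 10 nodes fit by
# height, yet the published design ships 8 -- suggesting power and networking,
# not U-space, set the ceiling there.
```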

For GB200 NVL72 deployments, the data center must have liquid cooling infrastructure in place before the hardware arrives. Retrofitting an air-cooled facility means adding CDUs, running pipes, and potentially upgrading the building's chilled water capacity. That work takes months.

References

  1. NVIDIA, "A100 Tensor Core GPU Datasheet" (2020). https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-nvidia-us-2188504-web.pdf
  2. NVIDIA, "DGX B200 User Guide" (2025). https://docs.nvidia.com/dgx/dgxb200-user-guide/introduction-to-dgxb200.html
  3. NVIDIA, "GB200 NVL72" (accessed March 2026). https://www.nvidia.com/en-us/data-center/gb200-nvl72/
  4. NVIDIA, "H100 Tensor Core GPU Datasheet" (2022). https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet
  5. NVIDIA, "H200 Tensor Core GPU Datasheet" (2024). https://www.nvidia.com/en-us/data-center/h200/
  6. Uptime Institute, "Global Data Center Survey 2025" (2025). https://intelligence.uptimeinstitute.com/resource/uptime-institute-global-data-center-survey-2025
  7. ASHRAE TC 9.9, "Thermal Guidelines for Data Processing Environments, 5th Edition" (2021). https://www.ashrae.org/technical-resources/bookstore/datacom-series
  8. DCPulse, "How Rack Power Impacts PUE in AI Data Centers" (2025). https://dcpulse.com/article/the-density-dividend-how-rack-power-impacts-pue-efficiency
  9. Supermicro, "Supermicro Ramps Full Production of NVIDIA Blackwell Rack-Scale Solutions" (2025). https://www.supermicro.com/en/pressreleases/supermicro-ramps-full-nvidia-blackwell-rack-scale-solutions-nvidia-hgx-b200
  10. IDTechEx, "Two-Phase Cold Plate Cooling Will Take Off as Early as 2026-2027" (2025). https://www.idtechex.com/en/research-article/two-phase-cold-plate-cooling-will-take-off-as-early-as-2026-2027/34068
  11. Dell Technologies, "PowerEdge XE9680L Spec Sheet" (2025). https://www.delltechnologies.com/asset/en-us/products/servers/technical-support/poweredge-xe9680l-spec-sheet.pdf
  12. Dell Technologies, "PowerEdge XE9680 Spec Sheet" (2024). https://www.dell.com/en-us/shop/ipovw/poweredge-xe9680

Frequently Asked Questions

What is TDP and what happens if GPU cooling cannot keep up?

Every watt a GPU consumes becomes heat. TDP (Thermal Design Power) is the maximum sustained heat the chip generates under load. If the cooling system cannot remove that heat fast enough, the GPU throttles its clock speed to protect itself, and performance drops.

How much power does a GB200 NVL72 rack draw compared to typical data center racks?

The GB200 NVL72 packs 72 Blackwell GPUs into one rack at 120-132 kW total. As of early 2026, most data centers run 10-30 kW per rack, and few exceed 30 kW. A single NVL72 rack draws 4-12x that.

What is direct-to-chip cooling versus immersion cooling?

Direct-to-chip (D2C) cooling handles the hottest components, GPUs and CPUs, with liquid while fans still cool lower-power parts like memory DIMMs and NVMe drives. Immersion cooling submerges the entire server in dielectric fluid, a non-conductive liquid.

How does liquid cooling change rack density for HGX B200?

In Supermicro's published HGX B200 examples, the liquid-cooled design fits 8 systems and 64 GPUs in a 42U rack, while the air-cooled design fits 4 systems and 32 GPUs. The pattern is consistent: liquid cooling doubles density.
