How Much Power Would a Data Center with 30,000 GPUs Consume in a Year?

Javier Reyes

Data Analyst at aterio.io

The rapid expansion of artificial intelligence (AI) and high-performance computing (HPC) workloads has fueled the demand for massive GPU-powered data centers. To illustrate the scale of power consumption, we'll examine a data center housing 30,000 GPUs and analyze its energy requirements.

To keep things simple, we've structured the cost model around the NVIDIA DGX SuperPOD, using reported figures for the DGX H100. The H100 has emerged as a benchmark for measuring performance in the chip industry, setting the standard for high-performance computing and AI workloads. Now let's dive into the numbers.

Key Assumptions:

1. PUE (Power Usage Effectiveness):

  • The PUE metric measures the energy efficiency of a data center.
  • A PUE of 1.15, which is the average for AWS data centers in North America, means that for every 1 watt used by IT equipment, an additional 0.15 watts is used for overhead like cooling, lighting, and power delivery.

2. GPU Utilization Rate:

  • For multi-node deep learning (DL) and HPC workloads, we assume an 80% utilization rate. For comparison, a data center dedicated to cloud services has a utilization rate that, as reported by Citi, ranges between 60% and 70%.
  • These workloads often involve distributed training of machine learning models or computationally intensive simulations across multiple GPUs, requiring continuous and synchronized operation.

3. GPU Power Consumption:

  • As reported by NVIDIA (source), each H100 GPU consumes approximately 700 watts under load.

4. Additional Rack Components:

  • Each rack contains networking switches, storage, and other supporting infrastructure consuming an additional ~4,615 watts, assuming one DGX system per rack. Together with the eight GPUs (8 × 700 W = 5,600 W), this puts each rack at roughly 10.2 kW, in line with the rated maximum power of a DGX H100 system.

5. Number of GPUs:

  • The data center contains 30,000 GPUs, housed in 3,750 DGX servers (8 GPUs per server).

Step-by-Step Power Consumption Analysis:

1. IT Power Consumption (Critical Load):

The critical IT power represents the energy consumed directly by the GPUs and supporting infrastructure (e.g., CPUs, networking, storage).

Critical IT load = (30,000 GPUs × 700 W) + (3,750 racks × 4,615 W) = 21,000,000 W + 17,306,250 W ≈ 38.31 MW
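As a sanity check, here is a minimal Python sketch of this step, using only the figures listed in the assumptions above (variable names are illustrative):

```python
# Step 1: critical IT load, based on the assumptions stated above.
NUM_GPUS = 30_000
GPU_POWER_W = 700        # approximate H100 draw under load
NUM_RACKS = 3_750        # one DGX system (8 GPUs) per rack
RACK_OVERHEAD_W = 4_615  # switches, storage, and other per-rack infrastructure

gpu_load_w = NUM_GPUS * GPU_POWER_W            # 21,000,000 W
overhead_w = NUM_RACKS * RACK_OVERHEAD_W       # 17,306,250 W
critical_it_load_w = gpu_load_w + overhead_w   # 38,306,250 W

print(f"Critical IT load: {critical_it_load_w / 1e6:.2f} MW")  # ~38.31 MW
```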

2. Accounting for PUE and Utilization Rate:

The PUE accounts for additional power required for cooling, power delivery, and other overhead, while the utilization rate determines how efficiently the GPUs are being used during workloads. With a PUE of 1.15 and an 80% utilization rate:

Total average power draw = 38.31 MW × 1.15 (PUE) × 0.80 (utilization) ≈ 35.24 MW
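Continuing the same sketch, with the 38.31 MW figure carried over from step 1:

```python
# Step 2: apply PUE and utilization to the critical IT load from step 1.
CRITICAL_IT_LOAD_W = 38_306_250
PUE = 1.15          # facility overhead multiplier (cooling, lighting, power delivery)
UTILIZATION = 0.80  # assumed average utilization for multi-node DL/HPC workloads

total_power_w = CRITICAL_IT_LOAD_W * PUE * UTILIZATION

print(f"Average facility power draw: {total_power_w / 1e6:.2f} MW")  # ~35.24 MW
```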

3. Annual Energy Consumption:

To calculate the annual energy consumption, we multiply the total power consumption by the number of hours in a year (8,760), since the data center runs 24/7.

We also use the average price of electricity for the US in October 2024, as reported by the EIA: 8.21¢/kWh.

Annual energy consumption = 35.24 MW × 8,760 hours ≈ 308,718 MWh (about 308.7 GWh)

Annual electricity cost ≈ 308,717,730 kWh × $0.0821/kWh ≈ $25.35 million
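The final step of the sketch turns average power draw into annual energy and cost, using the EIA price quoted above:

```python
# Step 3: annual energy consumption and electricity cost.
AVG_POWER_KW = 35_241.75     # average facility draw from step 2, in kW
HOURS_PER_YEAR = 8_760
PRICE_PER_KWH = 0.0821       # average US price, October 2024 (EIA)

annual_energy_kwh = AVG_POWER_KW * HOURS_PER_YEAR    # ~308.7 GWh
annual_cost_usd = annual_energy_kwh * PRICE_PER_KWH  # ~$25.35 million

print(f"Annual energy: {annual_energy_kwh / 1e6:.1f} GWh")
print(f"Annual electricity cost: ${annual_cost_usd / 1e6:.2f} million")
```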

Conclusion

A data center housing 30,000 GPUs, operating at an 80% utilization rate for multi-node DL and HPC workloads, incurs an annual electricity cost of $25.35 million, accounting for a significant portion of the total operating costs (TOC). Energy alone can represent approximately 30-40% of a data center’s TOC, underscoring the critical importance of optimizing energy efficiency through strategies like improving PUE and utilization rates.

For a better understanding of the data center market in the US, download our U.S. Data Centers and Power Demand report.