Understanding GPU’s Energy and Environmental Impact – Part I

Nour Rteil | Sept. 18, 2025

We have recently integrated a GPU power estimation feature into our software to help our clients understand and optimise the energy consumption of their AI-accelerated servers.

The GPU has become the beating heart of modern artificial intelligence (AI). From powering large language models (LLMs) to driving real-time generative AI applications, GPUs deliver the parallelism that makes state-of-the-art training and inference possible.

According to the 2024 U.S. Data Center Energy Usage Report, AI servers account for 23% of total US data centre electricity use and are expected to consume 70-80% of it (240-380 TWh annually) by 2028.

Greenpeace estimates that the electricity consumed to manufacture AI hardware will rise to between 11,550 GWh and 37,238 GWh by 2030, up to 170 times the 2023 demand, highlighting the urgency of improving both the operational and lifecycle efficiency of GPUs.

In response, researchers have spent the past decade building tools and models to understand, estimate, and optimize GPU power. Alongside this, a new wave of lifecycle assessments (LCA) is emerging, placing GPU power in the broader context of environmental sustainability.

Higher Environmental Impact due to Increased Complexity in Chip Manufacturing

The environmental footprint of GPUs begins long before they are turned on. Semiconductor fabrication, packaging, and distribution add substantial embodied emissions. Today, several trends are shaping GPU manufacturing and adding to its complexity, notably:


  1. Shrinking Process Nodes: Modern GPUs like NVIDIA’s A100 or H100 are fabricated at 5–7 nm process nodes, requiring extreme ultraviolet lithography (EUV). Smaller transistors generally improve performance-per-watt, but the energy intensity of chip manufacturing itself rises due to equipment demands and the consumption of chemicals and materials (photoresists, solvents, ultrapure water).

  2. Concentration of Foundries: Nearly all advanced GPU chips are fabricated by TSMC (Taiwan) or Samsung (South Korea). In addition, SK Hynix (South Korea), Samsung, and Micron manufacture the memory chips used in GPUs. This geographic concentration creates both supply-chain risks and carbon and water bottlenecks, since regional energy mixes and water-stress risks differ by location.

  3. Memory and packaging challenges: High-bandwidth memory (HBM) integration, now standard in AI GPUs, increases embodied impact. HBM involves 3D stacking of DRAM dies with through-silicon vias (TSVs) and requires advanced packaging such as chip-on-wafer-on-substrate (CoWoS), adding thermal, yield, and manufacturing complexity and raising embodied energy and materials use.

This year, NVIDIA published the Product Carbon Footprint (PCF) for its H100 baseboard with eight H100 SXM cards, the first vendor-published assessment offering transparency into the embodied environmental impact of its hardware. NVIDIA estimates the embodied footprint at 1,312 kg CO2e (around 164 kg CO2e per card), with memory contributing 42% of the material impact, followed by ICs (25%) and thermal components (18%). Below is a compiled summary of all embodied-emission figures published at the time of writing.
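The arithmetic behind those PCF figures can be sketched as follows. The baseboard total, card count, and component shares are the published numbers; the "other" bucket is simply the unattributed remainder.

```python
# Sketch: deriving the per-card embodied footprint from NVIDIA's published
# PCF figure for the 8x H100 SXM baseboard. Component shares are the
# percentages quoted above; "other" is the unattributed remainder.

BASEBOARD_KG_CO2E = 1312   # published embodied footprint, kg CO2e
CARDS_PER_BOARD = 8

per_card = BASEBOARD_KG_CO2E / CARDS_PER_BOARD
print(f"Per-card embodied footprint: {per_card:.0f} kg CO2e")  # 164 kg

shares = {"memory": 0.42, "ICs": 0.25, "thermal": 0.18}
shares["other"] = 1 - sum(shares.values())
for part, frac in shares.items():
    print(f"{part:>7}: {frac * BASEBOARD_KG_CO2E:.0f} kg CO2e")
```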

GPU Operational Power Demands

GPU power modelling and optimisation has been a central research challenge for more than a decade. As GPUs became essential not only for graphics but also for high-performance computing and cloud workloads, researchers have developed progressively better models - from early analytical efforts to data-driven approaches.

Early efforts focused on runtime performance counters and analytical modelling. In 2010, one of the first validated GPU power models was introduced, showing how microarchitectural events could be mapped to energy predictions. Since then, several studies advanced statistical and regression-based models, enabling more accurate real-time power estimation.
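The regression approach above can be illustrated with a toy fit: power is modelled as a baseline plus a linear combination of activity counters. The counter names, coefficients, and synthetic data below are invented for illustration, not taken from any of the cited studies.

```python
import numpy as np

# Toy counter-based regression power model: per-interval power is fit as
# idle baseline + linear combination of normalised activity counters.
rng = np.random.default_rng(0)
n = 200
counters = rng.random((n, 3))                 # e.g. SM occupancy, DRAM, FP ops
true_coeffs = np.array([120.0, 60.0, 40.0])   # watts per unit activity (made up)
idle_w = 50.0                                 # baseline (idle) watts (made up)
power = idle_w + counters @ true_coeffs + rng.normal(0.0, 2.0, n)

# Ordinary least squares over [intercept, counters]
X = np.column_stack([np.ones(n), counters])
fit, *_ = np.linalg.lstsq(X, power, rcond=None)
print(f"estimated idle power: {fit[0]:.1f} W")
print(f"estimated counter coefficients: {fit[1:].round(1)}")
```

Real models of this kind are trained on measured power traces rather than synthetic data, but the fitting step is essentially the same.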

More recent research has shifted to machine learning–based approaches across diverse workloads and explored task allocation strategies to improve performance-per-watt in large-scale AI workloads.

Today, state-of-the-art models integrate multiple layers of measurement, simulation, and workload characterization, capturing not just average power but also transient spikes and peak thermal design power (TDP). The clear trend is toward system-aware modelling to optimize GPU performance, energy efficiency, and thermal reliability.

GPU Thermal Design Power (TDP) 

Alongside modelling efforts, empirical data shows how TDP has scaled over time. Our preliminary analysis of workstation GPUs highlights three distinct phases:


  • Pre-2010: [10–800 W], average 105.9 W, across 138 GPUs

    800 W outliers: NVIDIA Tesla S879 (2007), Tesla S1075 (2008), Tesla S1070 (2008)

  • 2010–2020: [11–900 W], average 147.9 W, across 388 GPUs

    900 W outlier: NVIDIA Tesla S2050 (2011)

  • Post-2020: [15–2,400 W], average 260.1 W, across 138 GPUs

    2,400 W outlier: Intel Data Center GPU Max Subsystem (Ponte Vecchio)

TDP values for workstation GPUs are plotted in the figure below across their release years to demonstrate this increase.
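The phase breakdown above can be reproduced with a simple bucketing pass over a (release year, TDP) dataset. The sample records below are placeholders for illustration, not our full 685-GPU dataset.

```python
# Sketch: bucketing GPU TDP records into the three release-era phases
# described above. The records here are illustrative placeholders.
from statistics import mean

records = [  # (release_year, tdp_watts)
    (2006, 110), (2008, 800), (2012, 150), (2016, 250),
    (2021, 300), (2023, 700),
]

def phase(year):
    if year < 2010:
        return "pre-2010"
    elif year <= 2020:
        return "2010-2020"
    return "post-2020"

buckets = {}
for year, tdp in records:
    buckets.setdefault(phase(year), []).append(tdp)

for name, tdps in buckets.items():
    print(f"{name}: average {mean(tdps):.1f} W across {len(tdps)} GPUs, "
          f"range [{min(tdps)}-{max(tdps)} W]")
```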

Figure 1. TDP vs Release Year for 685 Workstation GPUs

GPU Idle Power Estimations

The 2024 U.S. Data Center Energy Usage Report estimates that AI servers consume idle power equal to roughly 20% of their rated power. To validate this estimate, we compiled idle power values from several published studies and calculated the corresponding percentages relative to each server’s TDP, as summarized in Table 1. The resulting average is approximately 21.4%, which supports our decision to use the 20% figure in our models.
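The validation step and the resulting model can be sketched as below. The (idle, TDP) pairs are placeholders standing in for the published figures compiled in Table 1, whose real average is roughly 21.4%.

```python
# Sketch: checking the ~20% idle-power assumption against compiled
# (idle_watts, tdp_watts) pairs. These pairs are illustrative placeholders,
# not the actual values from the published studies.
samples = [(140, 700), (80, 400), (65, 300), (75, 350)]

ratios = [idle / tdp for idle, tdp in samples]
avg_pct = 100 * sum(ratios) / len(ratios)
print(f"average idle share of TDP: {avg_pct:.1f}%")

def estimate_idle_power(tdp_watts, idle_fraction=0.20):
    """Idle power estimate used in our model: a fixed fraction of TDP."""
    return idle_fraction * tdp_watts

print(estimate_idle_power(700))  # 140.0 W for a 700 W card
```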

Table 1. Published Studies with GPU Idle Power Consumption Estimates


Beyond Watts and Carbon Emissions: Full LCAs of AI Hardware

While power modelling provides insight into the energy consumption of AI hardware, a holistic understanding requires a full LCA.

Falk et al. (2025) performed a cradle-to-grave LCA of NVIDIA's A100 GPUs, providing the only multi-criteria environmental impact assessment of a GPU to date. Their study reveals that the use phase dominates 11 of 16 impact categories when modelling BLOOM training, and 10 of 16 when modelling GPT-4 training, including acidification, climate change, EF particulate matter, land use, fossil resource depletion, and water use.

The manufacturing phase dominates in human toxicity, ozone depletion, and minerals and metals depletion. This highlights the significance of the manufacturing, operational, and end-of-life phases in the overall environmental impact of AI hardware.

Although the study has limitations, it provides valuable data for sustainable AI development, highlighting impacts that extend beyond carbon emissions - which can be largely driven by location (grid mix), power purchase agreements (PPAs), and sometimes carbon credits.

In part 2 of this series, we will examine GPU performance per watt trends and discuss strategies to optimise and reduce their environmental footprint.