Sustainable Open-Source AI Requires Tracking the Cumulative Footprint of Derivatives
- 76–170 ML of water consumed to train GPT-4 (estimated)
- 1.3B+ gallons/year used by a single data center (e.g., Google Council Bluffs)
- 1 in 4 data centers may face water scarcity by 2050
Water consumption from on-site cooling + off-site electricity generation
- 415 TWh → 945 TWh projected data center electricity use (2024-2030)
- 15% annual growth rate (4× faster than total electricity demand)
- 30% annual growth for AI-specific servers
IEA projections showing aggregate consumption rising despite efficiency gains
Lower marginal costs → increased usage → higher aggregate demand. Without coordination, efficiency improvements can be overwhelmed by scale: thousands of derivative models (fine-tunes, LoRA adapters, quantizations) create cumulative impacts that can exceed those of base model training.
Efficiency gains are critical, but reducing AI's aggregate environmental impact also requires coordination infrastructure. Open-source ecosystems need ecosystem-level carbon accounting to enable measurement, disclosure, and shared targets that make emissions visible, assign responsibility, and prevent the tragedy of the commons.
Abstract
The open-source Artificial Intelligence (AI) ecosystem has grown explosively, with Hugging Face now hosting over 2 million models. While this growth democratizes AI, it also introduces a coordination gap: downstream derivatives incur energy use, water consumption, and emissions that remain largely unobserved and inconsistently disclosed, limiting collective oversight and masking cumulative impact.
While quantization, pruning, and efficient fine-tuning help make individual models more efficient, cheaper training and inference can also lead to more experimentation and deployment, which may outweigh these gains through rebound effects. A single foundation model like Meta Llama can spawn hundreds of derivatives within months, each consuming additional compute.
This dynamic represents the tragedy of the commons, where individually rational actions, such as fine-tuning models for specific use cases, can collectively increase total energy and water use. Here, the commons are the atmosphere and freshwater resources, and the open-source ecosystem currently lacks governance mechanisms to coordinate responsible resource use.
We propose Data and Impact Accounting (DIA), a lightweight coordination mechanism that provides ecosystem-level visibility into carbon and water footprints without restricting open-source development. DIA combines standardized reporting, automated integration, and public aggregation dashboards.
Key Figures
Visual evidence of the hidden environmental reality in AI ecosystems
Meta reports that pretraining Llama 3 (8B and 70B combined) emitted approximately 2,290 tCO₂eq. However, research documents 146+ derivatives for a single model family. Even if most derivatives are cheaper individually, the aggregate emissions across hundreds can exceed base model training by multiples. Precise estimation is currently impossible because derivative compute is rarely disclosed—this motivates DIA.
Model Training Footprints (2020-2024)
Training emissions and water consumption of selected GenAI models. Models marked with ⋆ are open-source. Tree equivalent assumes 25 kg CO₂/tree/year. Water in megalitres (ML; 1 ML = 10⁶ L).
| Model | Year | Params | Open | tCO₂eq | Tree Equiv. | Water (ML) | R/Est. |
|---|---|---|---|---|---|---|---|
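As a worked example of the tree-equivalent column: at 25 kg CO₂ absorbed per tree per year, Llama 3's reported ~2,290 tCO₂eq corresponds to roughly 91,600 tree-years (2,290,000 kg ÷ 25 kg per tree per year).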
Carbon vs Water (Training Phase)
Each point represents a model from the table. Axes use a logarithmic scale to show order-of-magnitude differences.
📊 Data Notes & Methodology
- Water values: Estimated using total Water Usage Effectiveness (WUE_total) ranges of 1.8-4.0 L/kWh, combining on-site cooling with the water consumed off-site in electricity generation (water not returned locally).
- Reported vs Estimated: "R" indicates values disclosed by model creators; "Est." uses GPU-hours, TDP, PUE, and carbon intensity assumptions (see paper Appendix A, Equations 1-3).
- Training phase only: This table covers training phase emissions and water use. The paper emphasizes that inference and derivative proliferation can dominate lifecycle impact—these downstream costs are largely untracked.
- Upper bounds: TDP-based estimates provide upper bounds; actual power draw typically ranges from 60-80% of TDP depending on utilization patterns during training.
- GPT-4 range: Based on IEA's 42.4 GWh estimate with carbon intensity 0.1-0.445 kgCO₂/kWh and WUE 1.2-4.4 L/kWh depending on cooling and grid source. These are third-party estimates, not audited disclosures.
Data and Impact Accounting (DIA)
A lightweight, non-regulatory transparency infrastructure for ecosystem-level sustainability coordination
Minimal footprint schema embedded in model cards or repository metadata (an illustrative sketch follows this list):
- Hardware type and device count
- Training duration (GPU-hours)
- Estimated electricity use (kWh)
- Estimated water use (L) or facility WUE (L/kWh)
- Grid carbon intensity (kgCO₂/kWh) or training region as proxy
- Model lineage (base model(s) and major downstream derivatives)
- For inference: standardized per-query benchmarks, optional aggregate usage reporting, deployment efficiency metadata
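To make the schema concrete, the sketch below shows one way such a record could be represented in code. The field names and types are illustrative assumptions, not a finalized DIA specification.

```python
# Illustrative DIA footprint record; field names are hypothetical, not a standard.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FootprintRecord:
    model_id: str                               # e.g. "org/model-name" on a model hub
    hardware: str                               # accelerator type, e.g. "NVIDIA A100 80GB"
    device_count: int                           # number of accelerators used
    gpu_hours: float                            # total training duration in GPU-hours
    energy_kwh: Optional[float] = None          # estimated electricity use
    water_l: Optional[float] = None             # estimated water use, if known
    wue_l_per_kwh: Optional[float] = None       # facility WUE when water_l is unknown
    carbon_intensity: Optional[float] = None    # grid intensity in kgCO2/kWh
    training_region: Optional[str] = None       # proxy when carbon intensity is unknown
    base_models: list[str] = field(default_factory=list)   # lineage: parent model(s)
    derivatives: list[str] = field(default_factory=list)   # known downstream models
    inference_wh_per_query: Optional[float] = None  # standardized per-query benchmark
```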
Automated measurement tools integrated into training pipelines (a usage example follows this list):
- CodeCarbon for automated energy/emissions tracking
- ML CO₂ Impact Calculator for job-level estimation
- Cloud provider sustainability APIs for location-adjusted data
- MLPerf Inference for standardized benchmarking protocols
- Region-based defaults with explicit data-quality tiers when facility-level WUE unavailable
- Generate reports with minimal manual effort
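As an illustration of how little instrumentation this requires, the sketch below wraps a training run with CodeCarbon's EmissionsTracker; train_model is a placeholder for an existing training entry point, and the project name is arbitrary.

```python
# Minimal sketch: produce an energy/emissions estimate as a side effect of training.
from codecarbon import EmissionsTracker

def train_model():
    ...  # existing training or fine-tuning code goes here

tracker = EmissionsTracker(project_name="my-finetune")  # logs to emissions.csv by default
tracker.start()
try:
    train_model()
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2eq for the tracked run
    print(f"Estimated emissions: {emissions_kg:.2f} kg CO2eq")
```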
Public registry or dashboard summarizing reported footprints (an aggregation sketch follows this list):
- Aggregate data across releases and model families
- Track trends over time and identify high-impact families
- Benchmark efficiency improvements at ecosystem scale
- Enable comparative analysis across lineages
- Estimate deployment-phase impacts via download statistics and voluntary provider reporting
- Natural candidates: existing model hubs like Hugging Face
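The sketch below illustrates the kind of rollup a hub dashboard could compute over self-reported records for one model family; it assumes records follow the hypothetical FootprintRecord schema sketched earlier.

```python
# Illustrative family-level aggregation over self-reported footprint records.
def aggregate_family(records):
    """Sum reported training footprints for a base model and its derivatives."""
    total_kwh = sum(r.energy_kwh or 0.0 for r in records)
    total_co2_t = sum(
        (r.energy_kwh or 0.0) * (r.carbon_intensity or 0.0) / 1000.0  # kg -> tCO2eq
        for r in records
    )
    total_water_ml = sum(
        ((r.water_l if r.water_l is not None
          else (r.energy_kwh or 0.0) * (r.wue_l_per_kwh or 0.0)) / 1e6)  # L -> ML
        for r in records
    )
    reported = sum(r.energy_kwh is not None for r in records)
    return {
        "models": len(records),
        "reported_fraction": reported / max(len(records), 1),
        "energy_kwh": total_kwh,
        "emissions_tco2eq": total_co2_t,
        "water_ml": total_water_ml,
    }
```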
- Voluntary & low-friction: Adoption driven by social incentives and community norms, similar to model cards
- Imperfect is acceptable: Approximate estimates based on hardware and duration sufficient for directional insight
- Preserves open-source benefits: No barriers to entry; small teams provide minimal info, framework focuses on aggregate patterns
- Positive feedback loop: Making efficiency visible and comparable creates incentive for optimization
- Not a regulation: Does not restrict who can train, fine-tune, or release models
- Not a gate: Does not block access or participation in open-source ecosystem
- Not auditing: Goal is visibility into trends and relative impacts, not policing individual projects
- Not complete solution: Foundational infrastructure that can support complementary mechanisms (compute budgets, shared targets)
Methodology: Estimating Environmental Footprint
When direct measurements are unavailable, the paper estimates training electricity from GPU-hours and hardware power, then derives CO₂ emissions and water consumption following established ML carbon accounting methodology.
E_train = (H_GPU × P_avg × PUE) / 1000, giving electricity in kWh, where H_GPU is total GPU-hours, P_avg is average device power draw in watts, and PUE is the facility's power usage effectiveness.
C_train = (E_train × CI) / 1000, giving emissions in tCO₂eq, where CI is grid carbon intensity in kgCO₂/kWh.
W_train = E_train × WUE_total, giving water consumption in litres, where WUE_total is total water usage effectiveness in L/kWh (typical range: 1.8-4.0 L/kWh). Report water in megalitres: W(ML) = W(L) / 10⁶.
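As a worked illustration of these equations, the sketch below plugs in hypothetical inputs; the numbers are placeholders, not estimates for any real model.

```python
# Worked sketch of the estimation pipeline; all inputs are hypothetical placeholders.
def estimate_footprint(gpu_hours, avg_power_w, pue, ci_kg_per_kwh, wue_l_per_kwh):
    energy_kwh = gpu_hours * avg_power_w * pue / 1000   # E_train
    carbon_t = energy_kwh * ci_kg_per_kwh / 1000        # C_train (kg -> t)
    water_ml = energy_kwh * wue_l_per_kwh / 1e6         # W_train (L -> ML)
    return energy_kwh, carbon_t, water_ml

# Example: 100,000 GPU-hours at 400 W average draw, PUE 1.2,
# grid intensity 0.4 kgCO2/kWh, WUE_total 2.0 L/kWh.
e, c, w = estimate_footprint(100_000, 400, 1.2, 0.4, 2.0)
print(f"{e:,.0f} kWh, {c:.1f} tCO2eq, {w:.2f} ML water")  # 48,000 kWh, 19.2 tCO2eq, 0.10 ML
```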
Beyond carbon emissions, AI consumes substantial amounts of water for evaporative cooling in data centers and indirectly through electricity generation. Unlike carbon, water impacts are highly localized and depend on basin-level scarcity.
Scale: a mid-sized data center uses roughly 300,000 gallons of water per day (about the daily use of 1,000 households); hyperscale facilities can use up to 5 million gallons/day. Risk: an MSCI analysis found that 1 in 4 data-center assets may face increased water scarcity by 2050.
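For reference, 300,000 US gallons is about 1.1 ML and 5 million gallons is about 19 ML (1 US gallon ≈ 3.785 L).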
Closed models (e.g., GPT-4): Trained and served centrally via API; inference scales with demand but remains centrally metered in provider data centers. Organizational boundaries enable internal accountability.
Open models (e.g., Llama 3): Trained once, then branch into many derivatives (fine-tunes, quantizations, adapters, merges) produced by independent users. This diffuses environmental impacts across a distributed ecosystem, making aggregate footprint harder to quantify and enabling the tragedy of the commons.
Call to Action
Concrete steps the ML community can take toward ecosystem-level sustainability
Researchers & Practitioners
- Include emissions and water estimates in model cards and paper submissions
- Use tools like CodeCarbon or Carbontracker to measure training costs
- Document base models used and incremental compute required for derivatives
- Estimate inference footprints for deployed systems
- Remember: imperfect estimates are better than no estimates
Conference Organizers & Reviewers
- Encourage environmental reporting in reproducibility checklists
- Provide graduated expectations (lightweight vs. resource-intensive work)
- Recognize efficiency as a first-class contribution, not secondary consideration
- Consider environmental impact when evaluating scaling-focused work
Model Hub Operators
- Implement standardized metadata fields for carbon and water reporting
- Develop dashboards aggregating data across model families and derivatives
- Surface efficiency metrics alongside accuracy benchmarks in discovery
- Make sustainability visible and actionable for users
Cloud Providers & Hardware Vendors
- Expose per-job carbon intensity and water usage through standardized APIs
- Provide users with actionable data on environmental cost of workloads
- Enable carbon-aware scheduling by default
- Support ecosystem-level reporting standards
Funding Agencies
- Require environmental impact statements in grant proposals for compute-intensive research
- Consider efficiency and sustainability as evaluation criteria alongside scientific merit
- Support research on measurement and coordination infrastructure
- Fund development of low-friction reporting tools
Open-Source Labs & Foundations
- Lead by example with comprehensive environmental reporting for flagship releases
- Invest in tooling that reduces reporting friction
- Participate in developing community standards for sustainability accounting
- Share lessons learned and best practices publicly
We propose that by 2027, major open-weight model releases include standardized DIA reports covering training emissions, water usage, and documented lineage. Achieving this requires no regulatory mandate—only coordination. The tools exist; the data can be collected; the community has demonstrated its capacity for collective action on challenges like model cards, dataset documentation, and reproducibility standards.
Earth does not distinguish between emissions from open and closed-source models, between base models and derivatives, or between training and inference. The infrastructure we build today will determine whether open-source AI develops responsibly or experiences uncoordinated growth. What remains is the decision to act.
Acknowledgements
Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute.
GWT acknowledges support from the Natural Sciences and Engineering Research Council (NSERC), the Canada Research Chairs program, and the Canadian Institute for Advanced Research (CIFAR) Canada CIFAR AI Chairs program.
For more information about Vector Institute partners, visit vectorinstitute.ai/#partners
Citation
Use the BibTeX below to cite this work. Authors and venue details will be updated after de-anonymization.
@article{Raza2026SustainableAI,
title={Sustainable Open-Source AI Requires Tracking the Cumulative Footprint of Derivatives},
author={Raza, Shaina and Eyriay, Iuliia and Radwan, Ahmed Y and Lesperance, Nate and Pandya, Deval and Kocak, Sedef Akinli and Taylor, Graham W.},
journal={arXiv preprint arXiv:2601.21632},
year={2026},
doi={10.48550/arXiv.2601.21632}
}
Data attribution: Model training footprint data (Table 1) is compiled from public disclosures, published papers, and prior analyses following established ML carbon accounting methodology. See paper Section 2.3 and Appendix A for detailed estimation procedures and references for original sources.