Credit: Sanket Mishra from Pexels
No matter which questions we ask an
AI, the model will come up with an answer. To produce this
information—regardless of whether the answer is correct or not—the model uses
tokens. Tokens are words or parts of words that are converted into a string of
numbers that can be processed by the LLM.
This conversion, as well as other
computing processes, produces CO2 emissions. Many users,
however, are unaware of the substantial carbon footprint associated with these
technologies. Now, researchers in Germany have measured and compared the CO2 emissions of different, already trained LLMs using a set of
standardized questions.
"The environmental impact of questioning trained LLMs is strongly determined by their reasoning
approach, with explicit reasoning processes significantly driving up energy consumption and carbon emissions," said Maximilian Dauner, a researcher at Hochschule
München University of Applied Sciences and first author of the Frontiers
in Communication study.
"We found that
reasoning-enabled models produced up to 50 times more CO2 emissions than concise response models."
'Thinking' AI causes most emissions
The researchers evaluated 14 LLMs
ranging from seven to 72 billion parameters on 1,000 benchmark questions across
diverse subjects. Parameters determine how LLMs learn and process information.
Reasoning models, on average,
created 543.5 "thinking" tokens per question, whereas concise models
required just 37.7 tokens per question. Thinking tokens are additional tokens
that reasoning LLMs generate before producing an answer.
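To put the two averages side by side, the gap can be worked out with a few lines of Python. This is only an illustrative sketch: the two figures come from the article, while the constant and function names are made up for this example, and the comparison assumes the two per-question averages are directly comparable.

```python
# Average tokens per question as reported in the article.
REASONING_THINKING_TOKENS = 543.5  # "thinking" tokens, reasoning models
CONCISE_TOKENS = 37.7              # tokens, concise models


def token_ratio(reasoning: float, concise: float) -> float:
    """Return how many times more tokens the reasoning models generate."""
    if concise <= 0:
        raise ValueError("concise token count must be positive")
    return reasoning / concise


ratio = token_ratio(REASONING_THINKING_TOKENS, CONCISE_TOKENS)
print(f"Reasoning models generate roughly {ratio:.1f} times as many tokens per question")
```

On these averages the reasoning models generate a little over 14 times as many tokens per question as the concise ones.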
A higher token footprint always
means higher CO2 emissions. It doesn't, however, necessarily mean
the resulting answers are more correct, as elaborate detail is not always
essential for correctness.
The most accurate model was the
reasoning-enabled Cogito model with 70 billion parameters, reaching 84.9%
accuracy. The model produced three times more CO2 emissions than similar-sized models that generated concise answers.
"Currently, we see a clear
accuracy-sustainability trade-off inherent in LLM technologies," said
Dauner. "None of the models that kept emissions below 500 grams of CO2 equivalent achieved higher than 80% accuracy on answering the 1,000
questions correctly." CO2 equivalent is the unit used
to measure the climate impact of various greenhouse gases.
Subject matter also resulted in
significantly different levels of CO2 emissions. Questions that required lengthy reasoning processes, for
example abstract algebra or philosophy, led to up to six times higher emissions
than more straightforward subjects, like high school history.
Practicing thoughtful use
The researchers said they hope
their work will cause people to make more informed decisions about their own AI
use. "Users can significantly reduce emissions by prompting AI to generate
concise answers or limiting the use of high-capacity models to tasks that
genuinely require that power," Dauner pointed out.
Choice of model, for instance, can
make a significant difference in CO2 emissions. For example, having DeepSeek R1 (70 billion parameters)
answer 600,000 questions would create CO2 emissions equal to a round-trip flight from London to New York.
Meanwhile, Qwen 2.5 (72 billion
parameters) can answer more than three times as many questions (about 1.9
million) with similar accuracy rates while generating the same emissions.
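The comparison can also be read as emissions per question. The sketch below assumes both models are given the same total emissions budget (the one flight in the article's example) and uses only the two question counts quoted above; the names are hypothetical, and no absolute CO2 figure for the flight is assumed.

```python
# Questions each model can answer for the same total emissions (from the article).
DEEPSEEK_R1_QUESTIONS = 600_000
QWEN_2_5_QUESTIONS = 1_900_000  # "about 1.9 million"


def per_question_emission_ratio(questions_a: int, questions_b: int) -> float:
    """Ratio of model A's per-question emissions to model B's.

    With an identical total budget E, A emits E / questions_a per question
    and B emits E / questions_b, so the ratio is questions_b / questions_a.
    """
    if questions_a <= 0 or questions_b <= 0:
        raise ValueError("question counts must be positive")
    return questions_b / questions_a


r1_vs_qwen = per_question_emission_ratio(DEEPSEEK_R1_QUESTIONS, QWEN_2_5_QUESTIONS)
print(f"DeepSeek R1 emits about {r1_vs_qwen:.1f} times as much CO2 per question as Qwen 2.5")
```

Because the budget cancels out, the ratio depends only on the two question counts, which is why it matches the "more than three times" figure without needing the flight's actual emissions.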
The researchers said that their
results may be impacted by the choice of hardware used in the study, an
emission factor that may vary regionally depending on local energy grid mixes,
and the examined models. These factors may limit the generalizability of the
results.
"If users know the exact CO2 cost of their AI-generated outputs, such as casually turning themselves into an action figure, they might be more selective and thoughtful about when and how they use these technologies," Dauner concluded.
Source: Some AI prompts could cause 50 times more CO₂ emissions than others, researchers find