[Figure] Words showing increased frequency in 2024. (A) Frequencies in 2024 and frequency ratios (r), with both axes on a log scale. Only a subset of points is labeled for visual clarity. The dashed line shows the threshold defining excess words (see text). Words with r > 90 are shown at r = 90. Excess words were manually annotated as content words (blue) or style words (orange). (B) The same, but with the frequency gap (δ) on the vertical axis; words with δ > 0.05 are shown at δ = 0.05. Credit: Science Advances (2025). DOI: 10.1126/sciadv.adt3813
Chances are that you have unknowingly encountered compelling online content that was created, either wholly or in part, by some version of a large language model (LLM). As these AI tools, such as ChatGPT and Google Gemini, become more proficient at generating near-human-quality writing, it has become more difficult to distinguish purely human writing from content that was either modified or entirely generated by LLMs.
This spike in questionable authorship has raised concerns in the academic community that AI-generated content has been quietly creeping into peer-reviewed publications.
To shed light on just how widespread LLM content is in academic writing, a team of U.S. and German researchers analyzed more than 15 million biomedical abstracts on PubMed to determine whether LLMs have had a detectable impact on specific word choices in journal articles.
Their investigation revealed that since the emergence of LLMs, there has been a corresponding increase in the frequency of certain stylistic word choices in the academic literature. These data suggest that at least 13.5% of the papers published in 2024 were written with some amount of LLM processing. The results appear in the open-access journal Science Advances.
Since the release of ChatGPT less than three years ago, the prevalence of artificial intelligence (AI) and LLM content on the web has exploded, raising concerns about the accuracy and integrity of some research.
Past efforts to quantify the rise of LLM use in academic writing, however, were limited by their reliance on sets of human- and LLM-generated text. This setup, the authors note, "…can introduce biases, as it requires assumptions on which models scientists use for their LLM-assisted writing, and how exactly they prompt them."
In an effort to avoid these limitations, the authors of the latest study instead examined changes in the excess use of certain words before and after the public release of ChatGPT to uncover any telltale trends.
The researchers modeled their investigation on prior COVID-19 public-health research, which inferred COVID-19's impact on mortality from excess deaths, that is, deaths beyond what pre-pandemic baselines predicted.
Applying the same before-and-after approach, the new study analyzed patterns of excess word use before and after the emergence of LLMs. The researchers found that after the release of LLMs, there was a significant shift away from the excess use of "content words" toward the excess use of "stylistic and flowery" word choices, such as "showcasing," "pivotal," and "grappling."
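The figure caption above mentions the study's two measures: a frequency ratio (r) and a frequency gap (δ) between a word's usage in 2024 and a pre-LLM baseline. The sketch below illustrates that general idea in Python; it is not the authors' code, and the word counts, corpus sizes, and cutoff values (`min_ratio`, `min_gap`) are illustrative assumptions, not the thresholds used in the paper.

```python
def excess_words(baseline_counts, current_counts,
                 baseline_total, current_total,
                 min_ratio=1.5, min_gap=0.01):
    """Flag words whose usage rose sharply versus a pre-LLM baseline.

    r     = current frequency / baseline frequency  (frequency ratio)
    delta = current frequency - baseline frequency  (frequency gap)
    Cutoffs are illustrative, not the paper's actual thresholds.
    """
    flagged = {}
    for word, count in current_counts.items():
        p_now = count / current_total
        p_then = baseline_counts.get(word, 0) / baseline_total
        # A word never seen in the baseline gets an infinite ratio.
        r = p_now / p_then if p_then > 0 else float("inf")
        delta = p_now - p_then
        if r > min_ratio or delta > min_gap:
            flagged[word] = (r, delta)
    return flagged

# Toy example with made-up abstract word counts:
before = {"delves": 2, "results": 500, "showcasing": 1}
after = {"delves": 40, "results": 520, "showcasing": 25}
hits = excess_words(before, after, baseline_total=10_000, current_total=10_000)
# Stylistic words with sharp rises are flagged; stable words like
# "results" fall below both cutoffs and are not.
```

In this toy run, "delves" (r = 20) and "showcasing" (r = 25) are flagged while "results" (r ≈ 1.04, δ = 0.002) is not, mirroring how a sharp relative rise stands out even when absolute frequencies stay small.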
By manually assigning parts of speech to each excess word, the authors determined that before 2024, 79.2% of excess word choices were nouns. In 2024 there was a clearly identifiable shift: 66% of excess word choices were verbs and 14% were adjectives.
The team also identified notable differences in LLM usage between research fields, countries, and venues.
Source: Massive study detects AI fingerprints in millions of scientific papers