Reading brain activity with advanced technologies is not a new concept.
However, most techniques have focused on identifying single words associated
with an object or action a person is seeing or thinking of, or matching up
brain signals that correspond to spoken words. Some methods used caption
databases or deep neural networks, but these approaches were limited by
database word coverage or introduced information not present in the brain.
Generating detailed, structured descriptions of complex visual perceptions or
thoughts remains difficult.
A study recently published in Science Advances
takes a new approach. The researchers behind it have developed what
they refer to as a "mind-captioning" technique that uses an iterative
optimization process, where a masked language model (MLM) generates text
descriptions by aligning text features with brain-decoded features.
The technique also incorporates linear models, trained on brain activity
recorded with functional magnetic resonance imaging (fMRI), that decode
semantic features derived from a deep language model. The result is a detailed
text description of what a participant is seeing, decoded from their brain
activity.
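As a rough illustration of the decoding step, the linear models can be thought of as a regularized regression from voxel patterns to language-model feature vectors. The sketch below is a toy stand-in with synthetic data and a closed-form ridge solution; the study's actual features came from a deep language model (DeBERTa-large), not the random vectors used here.

```python
import numpy as np

# Toy sketch: linear models (here, ridge regression) map fMRI voxel
# patterns to semantic feature vectors of a language model. All data
# below is synthetic and for illustration only.

rng = np.random.default_rng(0)
n_trials, n_voxels, n_features = 200, 500, 64

# Hypothetical ground-truth linear mapping from brain space to feature space.
W_true = rng.normal(size=(n_voxels, n_features))
X = rng.normal(size=(n_trials, n_voxels))                        # simulated fMRI patterns
Y = X @ W_true + 0.1 * rng.normal(size=(n_trials, n_features))   # simulated LM features

# Closed-form ridge regression: W = (X^T X + alpha*I)^(-1) X^T Y
alpha = 1.0
W = np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ Y)

# Decode semantic features for a new "scan"; in the full pipeline this
# vector would then guide text generation.
x_new = rng.normal(size=(1, n_voxels))
decoded = x_new @ W
print(decoded.shape)  # one feature vector per scan
```

In the study, one such decoder per feature dimension is fit on training scans, then applied to held-out brain activity before any text is generated.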
Generating video captions from human perception
For the first part of the experiment, six people watched 2,196 short videos
while their brain activity was scanned with fMRI. The videos featured various
random objects, scenes, actions, and events, and the six subjects were native
Japanese speakers and non-native English speakers.
The same videos had previously been captioned by other viewers in a
crowdsourced text-annotation effort. A pretrained language model,
DeBERTa-large, extracted semantic features from those captions. These features
were matched to brain activity, and text was then generated through an
iterative process by the masked language model RoBERTa-large.
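The iterative generation loop can be illustrated with a deliberately simplified toy: a bag-of-words vector stands in for the language-model text features, and a greedy word-replacement loop stands in for the masked-language-model (RoBERTa-large) candidate scoring. Everything below, including the vocabulary, the feature function, and the "decoded" target, is invented for illustration and is not the study's method.

```python
import numpy as np

# Toy stand-in for the iterative optimization: start from a rough word
# sequence and greedily swap words so that a simple bag-of-words "text
# feature" aligns better with a target (brain-decoded) feature vector.

VOCAB = ["a", "dog", "cat", "runs", "sleeps", "park", "sofa", "in", "on", "the"]
IDX = {w: i for i, w in enumerate(VOCAB)}

def features(words):
    """Bag-of-words vector as a crude stand-in for LM text features."""
    v = np.zeros(len(VOCAB))
    for w in words:
        v[IDX[w]] += 1.0
    return v

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Pretend this vector was decoded from fMRI while the subject watched
# "a dog runs in the park".
target = features("a dog runs in the park".split())

# Initial fragmented guess, then greedy per-position refinement.
sentence = "a cat sleeps on the sofa".split()
for _ in range(5):  # a few optimization passes
    for pos in range(len(sentence)):
        best = max(VOCAB, key=lambda w: cosine(
            features(sentence[:pos] + [w] + sentence[pos + 1:]), target))
        sentence[pos] = best

# The loop recovers the target words, though bag-of-words features ignore
# word order, so the order may differ.
print(" ".join(sentence))
```

Because bag-of-words features carry no word order, this toy can only recover the right set of words; the study's language-model features encode structure, which is why its iterative descriptions converge to coherent sentences rather than word lists.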
"Initially, the descriptions were fragmented and lacked clear meaning.
However, through iterative optimization, these descriptions naturally evolved
to have a coherent structure and effectively capture the key aspects of the
viewed videos. Notably, the resultant descriptions accurately reflected the
content, including the dynamic changes in the viewed events. Furthermore, even
when specific objects were not correctly identified, the descriptions still
successfully conveyed the presence of interactions among multiple
objects," the study authors explain.
The team then evaluated accuracy by testing whether each generated
description was more similar to the correct caption than to incorrect ones,
across candidate sets of various sizes; accuracy was around 50 percent. They
note that this level of accuracy surpasses other current approaches and holds
promise for future improvement.
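This identification analysis is straightforward to sketch: score the generated description against the correct caption and a set of distractors in a shared feature space, and count a trial as correct when the true caption scores highest. The feature vectors below are random stand-ins, not the study's actual language-model features.

```python
import numpy as np

# Sketch of caption identification: the generated description is compared
# with the correct caption plus 99 distractors; the trial counts as correct
# if the true caption is the most similar candidate.

rng = np.random.default_rng(1)
n_candidates, dim = 100, 64

true_caption = rng.normal(size=dim)
# A generated description resembling the true caption, plus noise.
generated = true_caption + 0.5 * rng.normal(size=dim)
distractors = rng.normal(size=(n_candidates - 1, dim))

candidates = np.vstack([true_caption[None, :], distractors])

def cosine_sim(a, B):
    """Cosine similarity of vector a against each row of B."""
    return (B @ a) / (np.linalg.norm(B, axis=1) * np.linalg.norm(a))

scores = cosine_sim(generated, candidates)
identified = int(np.argmax(scores) == 0)  # index 0 holds the true caption
print("correct" if identified else "incorrect")
```

Chance level under this setup is 1 in 100, which is why the roughly 40 to 50 percent identification rates reported in the study are well above chance.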
Reading memories
The same six participants were later asked to recall the videos while
undergoing fMRI, to test the method's ability to read recalled memories rather
than direct visual experience.
The results for this part of the experiment were also promising.
"The analysis successfully generated descriptions that accurately
reflected the content of the recalled videos, although accuracy varied among
individuals. These descriptions were more similar to the captions of the
recalled videos than to irrelevant ones, with proficient subjects achieving
nearly 40% accuracy in identifying recalled videos from 100 candidates,"
the study authors write.
For people who have a diminished or lost capacity to speak, such as those
who have had a stroke, this new technology could eventually serve as a way to
restore communication. Because the system has proven capable of picking up on
deeper meanings and relationships rather than simple word associations, it
could allow these individuals to regain more of their communication ability
than some other brain-computer interface methods permit. Still, further
optimization is necessary before the technology reaches that point.
Ethical considerations and future directions
Despite the promising applications of mind-captioning devices capable of
reading human thought, there are legitimate concerns about privacy and the
potential misuse of brain-to-text technology.
The researchers involved in the study note that consent will remain a major
ethical consideration when employing mind-reading techniques. Before these
technologies see widespread use, important questions about mental privacy and
the future of brain-computer interfaces will need to be addressed.
Still, the study offers up a new tool for scientific research into how the
brain represents complex experiences and a potential boon for nonverbal
individuals.
The study authors write, "Together, our approach balances interpretability, generalizability, and performance—establishing a transparent framework for decoding nonverbal thought into language and paving the way for systematic investigation of how structured semantics are encoded across the human brain."
Source: 'Mind-captioning' technique can read human thoughts from brain scans
