Performing a new task solely on the basis of verbal or written instructions, and then describing it so that someone else can reproduce it, is a cornerstone of human communication, and one that still challenges artificial intelligence (AI).
A team from the University of Geneva (UNIGE) has succeeded in modeling an artificial neural network capable of this cognitive feat. After learning and performing a series of basic tasks, this AI was able to provide a linguistic description of them to a "sister" AI, which then carried them out in turn. These promising results, of particular interest for robotics, are published in Nature Neuroscience.
Performing a new task with no prior training, solely on the basis of verbal or written instructions, is a capability unique to humans. Once a task is learned, we are also able to describe it so that another person can reproduce it. This dual ability distinguishes us from other species, which need numerous trials, accompanied by positive or negative reinforcement signals, to learn a new task, and cannot then communicate it to their peers.
A subfield of artificial intelligence, natural language processing, seeks to recreate this human capability with machines that understand and respond to spoken or written language. This approach relies on artificial neural networks, which are inspired by biological neurons and the way they transmit electrical signals in the brain. However, the neural computations that could support the cognitive feat described above remain poorly understood.
"Currently, conversational agents using AI are capable of integrating linguistic information to produce text or an image. But, to our knowledge, they are not yet able to translate a verbal or written instruction into a sensorimotor action, let alone explain it afterwards to another artificial intelligence for it to replicate," states Alexandre Pouget, a full professor in the Department of Basic Neuroscience at the Faculty of Medicine of UNIGE.
A scaled-down brain model
The researcher and his team have succeeded in developing an artificial neural model endowed with this dual capability, although it first has to be trained. "We started from an existing artificial neural network model, S-Bert, which has 300 million neurons and is pre-trained in language understanding.
We 'connected' it to another, simpler network of a few thousand neurons," explains Reidar Riveland, a doctoral student in the Department of Basic Neuroscience at the Faculty of Medicine of UNIGE, and lead author of the study.
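To make this setup concrete, here is a minimal sketch, in PyTorch with the sentence-transformers library, of how a pre-trained sentence encoder can be wired to a small recurrent sensorimotor network. The encoder name, layer sizes and two-way "pointing" output are illustrative assumptions, not the architecture used in the study.

```python
# Minimal sketch (illustrative, not the authors' code): a pre-trained sentence
# encoder embeds a written instruction; a small recurrent network combines that
# embedding with a sensory stimulus and produces a motor ("pointing") output.
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

# Stand-in for the pre-trained language model (the study used S-Bert).
encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")  # 384-dim embeddings

class SensorimotorRNN(nn.Module):
    """Small recurrent network driven by stimulus features plus the instruction embedding."""
    def __init__(self, stim_dim=64, instr_dim=384, hidden_dim=256, out_dim=2):
        super().__init__()
        self.rnn = nn.GRU(stim_dim + instr_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, out_dim)  # e.g. point left vs. right

    def forward(self, stimulus, instruction_embedding):
        # Repeat the fixed instruction embedding at every time step of the trial.
        steps = stimulus.shape[1]
        instr = instruction_embedding.unsqueeze(1).expand(-1, steps, -1)
        hidden, _ = self.rnn(torch.cat([stimulus, instr], dim=-1))
        return self.readout(hidden)  # motor output over time

# Embed one instruction and run the network on a dummy stimulus sequence.
embedding = encoder.encode(["Point to the side where the stimulus appears."],
                           convert_to_tensor=True)         # shape (1, 384)
stimulus = torch.randn(1, 100, 64)                         # (batch, time steps, features)
motor_output = SensorimotorRNN()(stimulus, embedding)      # shape (1, 100, 2)
```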
In the first stage of the experiment, the neuroscientists trained this network to simulate Wernicke's area, the part of the brain that enables us to perceive and interpret language. In the second stage, the network was trained to reproduce Broca's area, which, under the influence of Wernicke's area, handles the production and articulation of words. The entire process was run on standard laptops. Written instructions, in English, were then transmitted to the AI.
For example: pointing to the location, left or right, where a stimulus is perceived; responding in the direction opposite to a stimulus; or, more demanding still, indicating which of two visual stimuli with a slight difference in contrast is the brighter. The scientists then evaluated the model's output, which simulates the intention to move, or in this case, to point.
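The kinds of trials just described can be sketched schematically as instruction/stimulus/response triplets. The snippet below is a toy illustration under that assumption, not the study's actual task battery or evaluation procedure.

```python
# Toy sketch of the trial types mentioned above (illustrative only): each trial
# pairs a written instruction with a stimulus and the correct "pointing" response.
import random

def make_trial(task):
    side = random.choice(["left", "right"])
    if task == "point_to_stimulus":
        return ("Point to the side where the stimulus appears.", side, side)
    if task == "point_away":
        opposite = "right" if side == "left" else "left"
        return ("Respond on the side opposite to the stimulus.", side, opposite)
    if task == "pick_brighter":
        left_contrast, right_contrast = random.random(), random.random()
        answer = "left" if left_contrast > right_contrast else "right"
        return ("Point to the brighter of the two stimuli.",
                (round(left_contrast, 2), round(right_contrast, 2)), answer)
    raise ValueError(f"unknown task: {task}")

for task in ["point_to_stimulus", "point_away", "pick_brighter"]:
    instruction, stimulus, target = make_trial(task)
    print(f"{task}: '{instruction}' | stimulus: {stimulus} | correct response: {target}")
```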
"Once these tasks were learned, the network was able to describe them to a second network - a copy of the first - so it could replicate them. To our knowledge, facilitating purely linguistic dialogue between two AIs is a first," rejoices Alexandre Pouget, who led this work.
For future humanoids
This modeling opens new horizons for understanding the interaction between language and behavior. It is particularly promising for the robotics sector, where developing technologies that enable machines to communicate is a key challenge.
"The network we have developed is very small in size. Nothing now prevents us from developing, on this basis, much more complex networks that could be integrated into humanoid robots capable of understanding us but also comprehending each other," conclude the two researchers.