Huawei’s Noah’s Ark Lab in Paris has released a pre-print research article that presents an innovative concept called “embodied artificial intelligence” (E-AI), which they believe could be the next crucial step towards achieving artificial general intelligence (AGI).
AGI, also known as “human-level AI” or “strong AI,” refers to an AI system that, given the necessary resources, could perform any intellectual task a human can. While there is no consensus on what qualifies an AI system as a general intelligence, companies like OpenAI have been founded specifically to pursue this technology.
Since the introduction of generative pre-trained transformer (GPT) technology in the late 2010s, many experts working on AGI have held that “scale is all you need”: that transformers far larger than anything currently feasible would eventually yield an AGI model. The Huawei team, however, argues in its paper that large language models, such as OpenAI’s ChatGPT and Google’s Gemini, lack the ability to understand the real world because they do not exist within it.
According to the researchers, for AI agents to truly interact with the real world, models must be embodied in a form that enables perception, action, memory, and learning. Perception means giving the AI system the ability to acquire raw real-time data from the world and to process and encode it into a latent learning space. In essence, the AI needs its own “eyes” and “ears” so it can attend to the relevant aspects of the real world and understand it well enough to function as a general intelligence.
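To make that idea concrete, here is a minimal, illustrative sketch of what a perception module might look like: raw sensor readings are compressed into a fixed-size latent vector. This is not code from the paper; the `SensorEncoder` name, the random weights, and the toy camera frame are all assumptions used purely for illustration.

```python
import numpy as np


class SensorEncoder:
    """Toy perception module: maps raw sensor readings (e.g. pixels or
    audio samples) into a fixed-size latent vector, standing in for the
    "latent learning space" described in the article. Weights are random
    here; a real system would learn them from experience."""

    def __init__(self, input_dim: int, latent_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(scale=0.01, size=(input_dim, latent_dim))

    def encode(self, raw_observation: np.ndarray) -> np.ndarray:
        # Project the raw data into the latent space and squash with tanh,
        # mimicking how a learned perception module compresses its input.
        return np.tanh(raw_observation @ self.weights)


# Example: encode a flattened 32x32 grayscale "camera frame".
encoder = SensorEncoder(input_dim=32 * 32, latent_dim=64)
frame = np.random.default_rng(1).random(32 * 32)
latent = encoder.encode(frame)
print(latent.shape)  # (64,)
```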
In addition to perceiving, AI agents must be able to take actions and observe their outcomes. Current AI models are “pre-trained,” rather like a student who is handed the test and the answers at the same time. The researchers argue that if an AI can instead act on its own and perceive the results of those actions as new memories, it can learn about the world through trial and error, much as living creatures do.
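The act-observe-remember cycle can be sketched in a few lines. Again, this is not the researchers’ framework, only a hedged illustration: the `env_step` and `choose_action` callables and the fake environment are hypothetical stand-ins.

```python
import random
from collections import deque


def embodied_loop(env_step, choose_action, steps: int = 100):
    """Toy perceive-act-remember loop: the agent acts, observes the
    outcome, and stores the experience as a new memory for later learning.
    `env_step` and `choose_action` are hypothetical callables, not an API
    from the Huawei paper."""
    memory = deque(maxlen=1000)  # replay-style store of past experiences
    observation = None
    for _ in range(steps):
        action = choose_action(observation)
        next_observation, outcome = env_step(action)  # act, then perceive the result
        memory.append((observation, action, outcome, next_observation))
        observation = next_observation
    return memory


# Minimal fake environment: the outcome is better when the action happens
# to match a hidden target, so the agent can only find out by trying.
def fake_env(action):
    target = 3
    reward = 1.0 if action == target else 0.0
    return {"last_action": action}, reward


memories = embodied_loop(fake_env, lambda obs: random.randint(0, 5), steps=20)
print(len(memories), "experiences stored")
```

In a loop like this, learning would come from replaying the stored experiences to improve the action-selection policy, which is the trial-and-error dynamic the researchers liken to living creatures.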
The researchers present a theoretical framework in their paper outlining how a large language model (LLM) or other foundation model could be embodied to achieve these goals in the future. They also acknowledge that there are numerous challenges to overcome. One major obstacle is that the most powerful LLMs currently run on massive cloud networks, making embodiment a difficult proposition with current technology.