OpenAI is reportedly developing an autonomous artificial intelligence (AI) assistant that can take control of a user’s device and perform tasks on the user’s behalf. The potential new product was first reported by The Information, citing an anonymous source familiar with the matter. OpenAI has not yet responded to requests for comment, but action agents are a logical next step beyond generative AI systems like ChatGPT.
Generative AI systems such as ChatGPT and Google’s Gemini are designed to produce human-like media, including text, images, audio, and video. To make these models perform real-world actions, such as operating a robot, developers typically have to integrate them with external applications, translating the model’s free-form output into a structured, machine-readable format.
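To make the integration step concrete, here is a minimal sketch of the pattern described above: the model is prompted to emit a structured "tool call" (JSON, in this sketch), which application code then parses and routes to a real function. The `copy_file` action and registry names are hypothetical illustrations, not any vendor's actual API.

```python
import json

# Hypothetical action exposed to the model; a real integration would
# actually move data between platforms instead of reporting the intent.
def copy_file(source: str, destination: str) -> str:
    return f"copied {source} -> {destination}"

# Registry mapping action names the model may emit to concrete handlers.
ACTIONS = {"copy_file": copy_file}

def dispatch(model_output: str) -> str:
    """Translate the model's structured output into a concrete action."""
    call = json.loads(model_output)  # e.g. {"action": ..., "args": {...}}
    handler = ACTIONS[call["action"]]
    return handler(**call["args"])

# Example: the model has replied with a JSON-formatted action request.
reply = '{"action": "copy_file", "args": {"source": "notes.txt", "destination": "backup/notes.txt"}}'
print(dispatch(reply))  # → copied notes.txt -> backup/notes.txt
```

The key design point is that the model never executes anything directly: its text is validated and interpreted by ordinary application code, which is where permissions and safety checks can live.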
The technology underlying most smart assistants is less advanced than what powers ChatGPT, Gemini, or even Amazon’s newer generative AI products. It is reasonable to assume that a virtual assistant built on large language model technology would have far greater potential for autonomous action than the simpler systems behind previous generations of smart assistants.
Until more is revealed about OpenAI’s reported autonomous action agents, we can only speculate about their capabilities. According to The Information’s report, the new AI system would be able to operate a user’s device to perform requested tasks; for example, a user could ask the AI to copy data from one platform to another. In theory, an AI system with sufficient device privileges could perform any interaction a human user can, such as swiping, tapping, clicking, double-clicking, typing, and even solving CAPTCHA puzzles.
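The primitive interactions listed above can be thought of as a small command vocabulary. The following is a speculative sketch of how such a vocabulary might be modeled; all type and function names are hypothetical, and the `describe` helper stands in for the kind of audit trail a user might want to review before granting an agent device privileges.

```python
from dataclasses import dataclass

# Hypothetical primitive UI actions an agent could emit.
@dataclass
class Tap:
    x: int
    y: int

@dataclass
class Swipe:
    x1: int
    y1: int
    x2: int
    y2: int

@dataclass
class TypeText:
    text: str

def describe(action) -> str:
    """Render an action as a human-readable audit-log entry."""
    if isinstance(action, Tap):
        return f"tap at ({action.x}, {action.y})"
    if isinstance(action, Swipe):
        return f"swipe ({action.x1}, {action.y1}) -> ({action.x2}, {action.y2})"
    if isinstance(action, TypeText):
        return f"type {action.text!r}"
    raise ValueError(f"unknown action: {action!r}")

# A short action plan, printed for user review before execution.
plan = [Tap(120, 480), TypeText("quarterly-report.xlsx"), Swipe(200, 800, 200, 300)]
for step in plan:
    print(describe(step))
```

Structuring agent behavior as discrete, inspectable actions like these, rather than raw device control, is one plausible way to keep a human in the loop.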
However, the road to autonomous assistance systems is filled with privacy and security challenges. Current state-of-the-art generative AI systems depend on connectivity to massive cloud compute centers. While some AI functions can run locally on laptops and smartphones, it is unlikely that an AI action agent, as envisioned, could operate solely on an onboard AI chip. Streaming a user’s on-device activity to corporate servers poses a privacy risk in itself, and granting such a system unrestricted access to private information compounds it with a security threat: smartphones exchanging agent data on a global scale could become a significant new cyberthreat surface.