Google’s AI model, Gemini, is set to be integrated across the tech giant’s products, including Gmail, YouTube, and the company’s smartphones.
At the I/O 2024 developer conference on May 14, Google CEO Sundar Pichai used his keynote speech to reveal some of the areas where the AI model will be implemented. Pichai underscored the importance of AI by mentioning it 121 times during his 110-minute speech, with Gemini, which launched in December 2023, taking center stage.
Google is incorporating the large language model (LLM) into its offerings, such as Android, Search, and Gmail. Here’s what users can expect in the future:
Sundar Pichai at Google I/O 2024. Source: Google
Enhanced App Interactions:
Gemini will have more context and will be able to interact with applications. In an upcoming update, users will be able to call upon Gemini from within apps, for example to drag and drop an AI-generated image into a message. YouTube users will also be able to tap “Ask this video” to have the AI find specific information within the video.
Gemini Integration in Gmail:
Google’s email platform, Gmail, will also benefit from AI integration. Users will be able to search, summarize, and draft their emails using Gemini. The AI assistant will also be capable of taking action on emails for more complex tasks, such as assisting with processing e-commerce returns by searching the inbox, finding the receipt, and filling out online forms.
Introduction of Gemini Live:
Google has introduced a new feature called Gemini Live, which allows users to engage in “in-depth” voice chats with the AI on their smartphones. Users can interrupt the chatbot mid-answer for clarification, and the AI will adapt to their speech patterns in real time. Additionally, Gemini can respond to and understand physical surroundings through photos or videos captured on the device.
Screenshot from Gemini promotional video. Source: Google
Advancements in Multimodal AI:
Google is developing intelligent AI agents that can reason, plan, and complete complex multi-step tasks under user supervision. The term “multimodal” refers to the AI’s ability to go beyond text and handle image, audio, and video inputs. Early use cases include automating shopping returns and exploring new cities.
Other Updates:
Google plans to replace Google Assistant on Android with Gemini, fully integrating it into the mobile operating system. A new feature called “Ask Photos” enables users to search their photo library using natural language queries powered by Gemini. The AI can understand context, recognize objects and people, and provide summaries of photo memories in response to questions. Additionally, Google Maps will display AI-generated summaries of places and areas, utilizing insights from the platform’s mapping data.