Google’s I/O 2024 event has been a hotbed of AI announcements, with the tech giant unveiling several significant updates to its AI offerings. One of the most exciting revelations is the introduction of multimodal capabilities to Gemini Nano, Google’s on-device large language model (LLM).
Currently, Gemini Nano, a lightweight model designed for on-device AI tasks, can only process text inputs. With the addition of multimodal capabilities, however, it will be able to accept and process audio, images, and files in addition to text.
This advancement will enable Gemini Nano to gather contextual information from various sources, including sounds, images, and spoken language, significantly enhancing its capabilities and usefulness. For instance, users will be able to ask Gemini Nano to extract information from YouTube videos or interpret diagrams and graphs, unlocking a whole new realm of possibilities.
Coming to Pixel later this year, we’ll be introducing our latest model, Gemini Nano with Multimodality.
This means your phone will not just be able to process text input but also understand more information in context like sights, sounds and spoken language. #GoogleIO pic.twitter.com/1yTujAl1W7
— Made by Google (@madebygoogle) May 14, 2024
While Google has announced that multimodal capabilities will be rolled out to Gemini Nano starting with Pixel phones later this year, the specifics of which Pixel models will receive the update and the exact timeline are yet to be revealed.
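To make the idea of on-device multimodal prompting a bit more concrete, here is a minimal Kotlin sketch. The types and names below (PromptPart, OnDeviceModel, FakeNanoModel) are hypothetical stand-ins invented purely for illustration; they are not Google's actual Gemini Nano APIs, which Google has not detailed for this feature. The point is simply that one request can mix modalities, and that inference happens on the device rather than in the cloud.

```kotlin
// Purely illustrative: these types are NOT the real Gemini Nano SDK.
// A prompt is a list of parts, each of which can be a different modality.
sealed interface PromptPart {
    data class Text(val text: String) : PromptPart
    data class Image(val bytes: ByteArray, val mimeType: String = "image/png") : PromptPart
    data class Audio(val bytes: ByteArray, val mimeType: String = "audio/wav") : PromptPart
}

// Stand-in for an on-device model runtime; a real implementation would load
// model weights and run inference locally, never sending the inputs off the phone.
interface OnDeviceModel {
    fun generate(parts: List<PromptPart>): String
}

// Stub so the sketch actually runs end to end.
class FakeNanoModel : OnDeviceModel {
    override fun generate(parts: List<PromptPart>): String {
        val modalities = parts.joinToString { it::class.simpleName ?: "?" }
        return "(stub) response to a prompt containing: $modalities"
    }
}

fun main() {
    val model: OnDeviceModel = FakeNanoModel()
    // The core idea behind "multimodal" input: text and an image (or audio clip)
    // are combined into a single request, so the model gets context from both.
    val answer = model.generate(
        listOf(
            PromptPart.Text("Summarize the trend shown in this chart in one sentence."),
            PromptPart.Image(bytes = ByteArray(0)) // placeholder image bytes
        )
    )
    println(answer)
}
```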
In addition to the multimodal capabilities, Google unveiled several other AI-powered features coming to Android devices. One notable addition is enhanced homework support in Circle to Search: students will be able to circle a word problem in subjects like math or physics directly on the screen and receive step-by-step guidance for solving it.
Circle to Search can now help with homework—directly from your Pixel phone or tablet.
When you circle the exact part of a prompt you're stuck on, you'll get step-by-step guidance to solve physics word problems without leaving your digital info sheet or syllabus. #GoogleIO pic.twitter.com/Fsmtcu7emn
— Made by Google (@madebygoogle) May 14, 2024
Moreover, Google is working on a more convenient Gemini overlay interface for Android. It will simplify tasks like dropping generated images into apps, and it will let users ask Gemini to extract information from YouTube videos or answer questions about the contents of PDF files (the latter for Gemini Advanced subscribers).
Other upcoming AI-powered features include real-time scam detection during phone calls and multimodal support in TalkBack, which will help better describe images to the visually impaired, with or without a network connection.
Gemini Nano’s multimodal capabilities are coming to TalkBack later this year.
People who experience blindness or low vision will get richer & clearer details of what’s happening in an image—whether it’s about a photo in a text or style of clothes when shopping online. #GoogleIO pic.twitter.com/JIs2DYhkg4
— Made by Google (@madebygoogle) May 14, 2024
As Google continues to push the boundaries of AI integration across its products and services, it’s clear that Pixel phones, and Android devices in general, are poised to become even smarter and more useful thanks to Gemini Nano’s multimodal capabilities and other AI-driven advancements. That said, I personally think Google needs to take it easy with the “AI” talk. Android Authority’s Rita El Khoury highlighted that Google said “Gemini” (including “Gem” and “Gems”) 170 times during its keynote. Phew! So much so that the folks at TechCrunch had to make things easier for everyone with a quick recap of the event:
In case you missed today's #GoogleIO keynote presentation, we summed it up for you pic.twitter.com/TdMDTSmc88
— TechCrunch (@TechCrunch) May 14, 2024