Member-only story

Google’s Gemini 2.0 Flash Now Lets You Edit Images In Natural Language

Published in

Generative AI

6 min readMar 13, 2025

Native image editing in Google AI Studio. Image by Jim Clyde Monge

Google’s Gemini 2.0 Flash model now lets you make image edits using natural language natively. Unlike earlier multimodal systems that relied on pairing separate models (such as using a language model together with Imagen 3 for image generation) Gemini 2.0 Flash handles multimodality by generating images directly within the same system that processes text. This eliminates the need for inter-model communication, reducing latency significantly.

Since Gemini 2.0 Flash no longer depends on Imagen 3, it offers faster responses and smoother interactions. Plus, you can even embed longer text directly onto images!

Check out this example where I transformed Google Deepmind’s CEO, Sir Demis Hassabis, into a long hair dude.

Here’s another example showing Gemini adding a chocolate drizzle to plain croissants.

This is mind blowing because no other aspect of the original image was changed except for the added chocolate drizzle — which, by the way, looks incredibly realistic.

Here’s how it works

To get started, head over to Google’s AI Studio, login with your Google account and set the model to Gemini 2.0 Flash Experimental. Make sure also that the Output format is set to “Images and text”.

Google AI Studio with Gemini 2.0 Flash experimental. Image by Jim Clyde Monge

Then, upload your image file by clicking on the “+” button at the bottom-right corner of the prompt field. To illustrate, here’s a playful edit I made to an image of a fox. I dressed him in a puffer jacket because he might feel chilly up there in the icy mountains!

Generative AI

Google’s Gemini 2.0 Flash Now Lets You Edit Images In Natural Language

Here’s how it works

Published in Generative AI

Written by Jim Clyde Monge

Responses (3)