Hi everyone, I’m James.
Moments ago, Google quietly dropped a bombshell in the middle of the night: it officially released a brand-new open-source AI coding tool—Gemini CLI.
Yes, you read that right: open source.
GitHub:
https://github.com/google-gemini/gemini-cli
Wild stat: within just three hours of going open source, the repo had already racked up 5.1K stars—and it was still climbing fast as I was writing.
What’s more, Google basically stuffed its most powerful Gemini 2.5 Pro model straight into the place developers live every day—the command line terminal (aka the beloved “black window”).
Google also published an official demo video.
Why the demo is impressive boils down to a few things:
Natural-language control for complex tasks
The user doesn’t babysit the model with step-by-step instructions. They just say, “Make me a 30-second video showing the story of a ginger cat’s adventures around Australia.” That’s a high-level creative brief, not a granular checklist.
Autonomous planning & task decomposition
After receiving the request, Gemini CLI doesn’t immediately start doing stuff. It thinks and plans first, showing status messages such as “Refining the Narrative Structure.” It breaks the big “make a video” goal into executable subtasks, such as:
- Craft a four-scene story.
- For each scene, generate images, short video clips, and voiceover.
- Assemble everything into a final video.
- Clean up temporary files.
Multimodal orchestration
This is the coolest part. To pull it off, Gemini CLI acts like a conductor and calls multiple Google AI models in sequence:
- imagen_t2i (Imagen) for text-to-image — creates still images of the cat in different Australian settings.
- veo_i2v (Veo) for image-to-video — turns those stills into motion clips.
- chirp_tts (Chirp) for text-to-speech — generates narration for each scene.
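To make the "conductor" idea concrete, here is a minimal sketch of that call order in shell. The three function names are borrowed from the tool names visible in the demo, but the bodies are stubs I made up for illustration: they only print file names, and their arguments are assumptions, not the real tool signatures.

```shell
# Stub tools named after the ones in the demo. The real tools call Google's models;
# these placeholders only print the file name they would produce (assumed signatures).
imagen_t2i() { echo "scene$1.png"; }     # text-to-image
veo_i2v()    { echo "${1%.png}.mp4"; }   # image-to-video
chirp_tts()  { echo "scene$1.wav"; }     # text-to-speech

# Run the three tools in order for each of the four scenes and list the outputs.
plan_video() {
  for scene in 1 2 3 4; do
    image=$(imagen_t2i "$scene")
    clip=$(veo_i2v "$image")
    voice=$(chirp_tts "$scene")
    echo "$image -> $clip + $voice"
  done
}

plan_video
```

The point is the ordering: each clip depends on the still generated just before it, while the narration only depends on the scene itself.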
Seamless interaction with your local machine
It doesn’t just call cloud models; it also touches your local file system. In the demo you can clearly see it run shell commands:
- mkdir -p ginger_cat_adventure to create a working folder;
- rm ... to delete all intermediate files at the end, leaving only the final video.
Self-healing error handling
At 00:11, there’s an “Error: failed to execute tool.” The pipeline doesn’t crash. The agent recognizes the issue, shows “Adjusting Directory Paths,” and keeps going. That’s a nice bit of robustness and autonomy.
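The local-file side of the demo (create a working folder, recover from a failed step, clean up at the end) can be sketched in a few lines of shell. This is purely my own illustration of the pattern, not Gemini CLI's code: run_pipeline, the tmp subfolder, and the file names are all hypothetical, and the "tool failure" is simulated by copying from a path that doesn't exist.

```shell
# Hypothetical sketch of the pattern seen in the demo, not Gemini CLI's actual code.
run_pipeline() {
  workdir="$1"
  mkdir -p "$workdir/tmp"   # -p: create parent folders, no error if they already exist

  # The first attempt reads from a path that does not exist, standing in for the tool failure.
  if ! cp "$workdir/scenes/clip.txt" "$workdir/final.txt" 2>/dev/null; then
    echo "Error: failed to execute tool; adjusting directory paths" >&2
    echo "placeholder clip" > "$workdir/tmp/clip.txt"
    cp "$workdir/tmp/clip.txt" "$workdir/final.txt" || return 1
  fi

  rm -rf "$workdir/tmp"     # delete intermediates, keep only the final file
}
```

Calling run_pipeline ginger_cat_adventure leaves a folder containing only final.txt. The failure is handled inside the function instead of aborting the whole run.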
By the way, the breakdown above is what Gemini 2.5 Pro gave me when I asked it to summarize a muted copy of the video. Its multimodal understanding is kind of insane.
Gemini CLI is laser-focused on the terminal: it gives your command line the ability to think, understand, and execute complex tasks.
In one line: Gemini CLI turns your terminal into a local AI agent.
And a few things here are straight-up legendary:
1) A free tier that’s off the charts
This is Google being ridiculously generous, and it deserves the top spot.
To make sure every developer can try it without friction, Google is offering an almost over-the-top free plan:
- Free access to Gemini 2.5 Pro—yes, the flagship model with a 1,000,000-token context window.
- High rate limits: up to 60 requests per minute and 1,000 requests per day.
Translation: you can pretty much use AI in your terminal nonstop without worrying.
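To put those two limits in perspective, here is a quick back-of-the-envelope calculation using only the numbers quoted above. The variable names are mine, and shell arithmetic is integer-only, so the results are rounded as noted in the comments.

```shell
REQUESTS_PER_MINUTE=60
REQUESTS_PER_DAY=1000

# Whole minutes of sustained full-speed use before the daily cap is hit (rounds down).
minutes_to_exhaust=$(( REQUESTS_PER_DAY / REQUESTS_PER_MINUTE ))

# Seconds between requests needed to spread the daily cap over 24 hours (rounds up).
seconds_per_day=$(( 24 * 60 * 60 ))
even_spacing=$(( (seconds_per_day + REQUESTS_PER_DAY - 1) / REQUESTS_PER_DAY ))

echo "Daily quota lasts about $minutes_to_exhaust minutes at the per-minute limit"
echo "One request every $even_spacing seconds spreads the quota over a full day"
```

So the per-minute limit is what you feel in a burst, while the daily cap is the one that matters for scripted or long-running use.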
While other tools are nickel-and-diming by the token, Google’s basically saying “drink up.”
It’s hard not to see this cranking competitive pressure up to 11.
2) More than coding—it’s an AI Agent
The power of Gemini CLI comes from its built-in tool set, which takes it far beyond simple code generation.
- Large codebase comprehension & editing. Thanks to that 1M-token context window, you can feed in an entire medium-to-large project. (For comparison, the Augment tool I covered previously has a ~200K-token window, so Gemini CLI’s is roughly 5× larger.) Reference: in my previous write-up on Augment, I discussed how it’s starting to eat into mid-level developer workflows. (“Kangaroo Inn,” a Chinese tech blog on WeChat, China’s largest social app, ran a piece titled roughly, “Crushing Cursor! Augment Is Taking Mid-Level Devs’ Jobs…”)
- Automation for DevOps chores. Think: summarize all relevant pull requests from the last 24 hours, or untangle a gnarly git rebase conflict. These tedious ops tasks can be delegated.
- Powerful tool & MCP integrations. It can call external tools like Imagen (images), Veo (video), even connect to Lyria to create music.
- Built-in Google Search. It can automatically look up the latest info and reason over it for answers.
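If you're wondering whether your own project would fit in a 1M-token window, a rough check is easy to script. The sketch below assumes about 4 bytes per token, a common rule of thumb for English text and code, not Gemini's actual tokenizer. Both function names are hypothetical, and skipping .git and node_modules is my own choice.

```shell
# Hypothetical helper: rough token estimate for all regular files under a directory.
# Assumes ~4 bytes per token (rule of thumb only). Binary files are counted too,
# so the estimate errs on the high side.
estimate_tokens() {
  dir="${1:-.}"
  total_bytes=$(find "$dir" -type f ! -path '*/.git/*' ! -path '*/node_modules/*' \
    -exec cat {} + | wc -c | tr -d ' ')
  echo $(( total_bytes / 4 ))
}

# Succeeds if the estimate is within the limit (default: 1,000,000 tokens).
fits_in_context() {
  [ "$(estimate_tokens "$1")" -le "${2:-1000000}" ]
}
```

Usage: fits_in_context ./my-project && echo "probably fits". Treat the answer as a ballpark figure, since real token counts vary with language and content.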
3) Open source and highly extensible
Gemini CLI is fully open source under the Apache 2.0 license.
That means transparency and security: you can inspect every line of code—no mystery “backdoors.”
It also means the global dev community can contribute features and fixes. Expect it to evolve fast—that’s the magic of open source.
On top of that, it has built-in support for the MCP (Model Context Protocol) standard and supports system-level prompt configuration via a GEMINI.md file.
You can shape it to your workflow and team conventions—tune it into an assistant that “gets” you.
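Here is a minimal sketch of what setting one up could look like, assuming the file is picked up from the project root. The helper name and the three convention lines are made-up examples, not an official template.

```shell
# Hypothetical helper: write a starter GEMINI.md into a project directory.
# It refuses to overwrite an existing file.
write_gemini_md() {
  target="${1:-.}/GEMINI.md"
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it alone" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
# Project conventions (example content, adapt freely)

- Use TypeScript with strict mode enabled.
- Write tests for every new function.
- Keep commit messages under 72 characters.
EOF
}
```

Run write_gemini_md . from your project root, then edit the file to match how your team actually works.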
How to install & run it
Super simple.
Prerequisite: Node.js 18+ installed on your machine.
Check your version with:
node -v
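If you'd rather not eyeball the version number, a small check like this does the comparison for you. node_version_ok is a helper name I made up; it only looks at the major version in the string that node -v prints.

```shell
# Hypothetical helper: succeeds if a version string such as "v20.11.1" is at least 18.
node_version_ok() {
  version="${1#v}"         # strip the leading "v"
  major="${version%%.*}"   # keep everything before the first dot
  [ "$major" -ge 18 ] 2>/dev/null
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node -v)"; then
    echo "Node.js $(node -v) is new enough"
  else
    echo "Node.js $(node -v) is too old; upgrade to 18 or later"
  fi
else
  echo "Node.js is not installed"
fi
```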
Step 1: Run the CLI
In your terminal, run either of the following:
npx https://github.com/google-gemini/gemini-cli
or install it globally and then launch it with the gemini command:
npm install -g @google/gemini-cli
gemini
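After a global install the CLI is available as the gemini command, so a tiny wrapper can choose between the two launch options for you. gemini_launch_command is a hypothetical helper name; it only prints the command to run and doesn't start anything.

```shell
# Hypothetical helper: print the command to launch Gemini CLI, preferring a global install.
gemini_launch_command() {
  if command -v gemini >/dev/null 2>&1; then
    echo "gemini"
  else
    echo "npx https://github.com/google-gemini/gemini-cli"
  fi
}
```

Handy in setup scripts shared with teammates who may or may not have installed it globally.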
Step 2: Pick a theme and sign in
On first run, you’ll be prompted to choose a theme.
Use the arrow keys to select one and hit Enter. I picked “github dark.”
Next, you’ll see a screen where you press Enter to “Login with Google.”
A browser window will pop up asking you to sign in with your personal Google account.
Once authorized, you’re automatically on the free plan.
You can start chatting in the terminal window right away.
If you’d rather use a specific model—or if you outgrow the free quota—go to Google AI Studio to generate an API key, then set it via an environment variable:
export GEMINI_API_KEY="your_api_key_here"
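Since an export only lasts for the current shell session, it's worth confirming the variable is actually set before you launch. The helper below is my own sketch (check_gemini_key is a made-up name). It reports the key's length rather than echoing the secret.

```shell
# Hypothetical helper: report whether GEMINI_API_KEY is set, without printing the key itself.
check_gemini_key() {
  if [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is set (${#GEMINI_API_KEY} characters)"
  else
    echo "GEMINI_API_KEY is not set; run: export GEMINI_API_KEY=\"your_api_key_here\"" >&2
    return 1
  fi
}
```

To make the key survive new terminal windows, add the export line to your shell profile (for example ~/.bashrc or ~/.zshrc, depending on your shell), and keep that file out of version control.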
Final thoughts
The open-sourcing of Google Gemini CLI feels like a statement: a new command-line interaction paradigm has arrived.
It marries the convenience of natural language with the efficiency of the terminal, and hands it to every developer in an exceptionally open (open source) and generous (huge free tier) way.
This year belongs to agents—and to builders (not just traditional programmers).
I’m continuing to test Gemini CLI and build some fun workflows with it.
Today’s post was a quick tour to help you get it installed and running.
It’s late here, so I’m calling it a night. Good night 😴
Thanks for the ongoing support—I’ll keep sharing practical AI deep dives.
If you’re into this kind of content, hit follow.
Questions? Drop them in the comments.