Google Releases Gemma 4 — Most Capable Open Models Yet, Under Apache 2.0

Michael Ouroumis · 2 min read
Google DeepMind has released Gemma 4, a family of four open-weight models that represent a major step forward for the open AI ecosystem — and a strategic shift in how Google distributes its frontier research.

Four Models, One Architecture

The release spans the full compute spectrum. At the bottom: Effective 2B and 4B models purpose-built for on-device inference on phones, tablets, and edge hardware. At the top: a 26B Mixture-of-Experts model and a 31B Dense model aimed at cloud and data center workloads.

All four are derived from the same research that produced Gemini 3, Google's proprietary frontier model. The 31B Dense variant currently sits at #3 on the Arena AI text leaderboard among open-weight models, with the 26B MoE variant at #6.

Perhaps most significantly, the entire family ships under Apache 2.0, a permissive license that allows commercial use, modification, and redistribution, subject mainly to attribution and notice requirements. Previous Gemma releases carried more restrictive terms that limited commercial deployment.

Multimodal by Default

Every Gemma 4 model processes video and images natively, supporting variable resolutions and excelling at visual reasoning tasks including OCR, chart interpretation, and document understanding. The edge-targeted E2B and E4B models add native audio input for speech recognition and understanding — a first for Google's open model line.

All variants support over 140 languages out of the box, a reflection of Gemini 3's multilingual training corpus.

Context windows range from 128K tokens for the edge models to 256K for the 26B and 31B variants — long enough to process entire codebases, lengthy documents, or extended video sequences in a single pass.
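
To make those window sizes concrete, here is a rough sketch that estimates whether a body of text would fit in each one. The four-characters-per-token ratio is a common rule of thumb for English text, not a property of the Gemma 4 tokenizer, and reading "K" as 1,000 is likewise an assumption; the function and dictionary names are hypothetical.

```python
# Context window sizes as reported in the article, in tokens.
# Assumption: "K" is read as 1,000 tokens.
CONTEXT_WINDOWS = {
    "edge (E2B/E4B)": 128_000,
    "26B MoE / 31B Dense": 256_000,
}

# Assumption: rough rule of thumb for English text, not the real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits(text: str, window_tokens: int, reserve_for_output: int = 0) -> bool:
    """Return True if the estimated prompt plus reserved output fits the window."""
    return estimate_tokens(text) + reserve_for_output <= window_tokens


if __name__ == "__main__":
    document = "x" * 600_000  # stand-in for a large codebase dump
    for label, window in CONTEXT_WINDOWS.items():
        print(label, fits(document, window))
```

For a real deployment, the estimate would be replaced by a count from the model's own tokenizer, since token density varies widely across languages and across code versus prose.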

Built for Agentic Workflows

Google explicitly designed Gemma 4 for the agentic AI workflows that have become the dominant deployment pattern in 2026. The models include native support for structured tool calling, multi-step planning, and autonomous task execution.
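
For readers unfamiliar with structured tool calling, the general pattern looks something like the sketch below: the model emits a machine-readable request naming a function and its arguments, and the host application parses it and runs the matching code. The JSON shape used here (`name` plus `arguments`) is an assumed generic format, not Gemma 4's documented output, and `get_weather` is a hypothetical tool invented for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry mapping tool names the model may request to local functions.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool call and run the matching registered function.

    Assumed format: {"name": "<tool>", "arguments": {...}}.
    """
    call = json.loads(raw)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**args)


if __name__ == "__main__":
    model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
    print(dispatch_tool_call(model_output))
```

In an agent loop, the returned string would be fed back to the model as the tool result so it can plan its next step.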

Android developers get early access through the AICore Developer Preview, which integrates Gemma 4 directly into the Android runtime for on-device agent capabilities without cloud round-trips.

The Strategic Play

The Apache 2.0 licensing is the real headline. By making its most capable open models fully permissive, Google is positioning Gemma as the default foundation for commercial AI applications that need to avoid proprietary lock-in.

The timing is pointed. Meta's Llama 4 Maverick uses a custom license with usage restrictions. DeepSeek V4, while impressively cheap to train, operates under Chinese export considerations that make some Western enterprises uneasy.

Google is betting that permissive licensing plus frontier-tier performance will make Gemma 4 the path of least resistance for enterprise adoption — and that widespread Gemma deployment will keep developers inside Google's broader cloud and tooling ecosystem.

The models are available now on Hugging Face, Google Cloud Vertex AI, and through the Kaggle platform.
