
Apple Is Using Google's Gemini to Train Smaller On-Device AI Models

Michael Ouroumis · 2 min read

Apple's partnership with Google runs deeper than previously understood. According to a new report from The Information, Apple has been granted "complete access" to Google's Gemini model inside its own data centers — and is using that access to train smaller AI models for deployment on its devices.

The arrangement, stemming from the deal announced in January 2026, leverages a technique called distillation: a frontier-scale "teacher" model is used to generate training data and supervision signals for a smaller "student" model, which can then run efficiently on-device.
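The teacher–student setup described above can be sketched in a few lines. This is an illustrative toy, not Apple's or Google's actual pipeline: a hypothetical "teacher" (standing in for a frontier model like Gemini) emits a softened probability distribution, and a tiny "student" is nudged toward it by gradient descent on the KL divergence — the standard soft-label distillation objective.

```python
# Toy sketch of knowledge distillation (illustrative only, not Apple's pipeline).
# A "teacher" produces temperature-softened probabilities; a "student" is
# trained to match them by minimizing KL divergence over its own logits.
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature yields softer targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

TEMPERATURE = 2.0

# Hypothetical teacher logits for one input (stand-in for a frontier model).
teacher_logits = [4.0, 1.0, 0.2]
soft_targets = softmax(teacher_logits, temperature=TEMPERATURE)

# Tiny "student": starts uninformed, then follows the gradient of the KL loss.
student_logits = [0.0, 0.0, 0.0]
lr = 0.5
for _ in range(200):
    q = softmax(student_logits, temperature=TEMPERATURE)
    # Gradient of KL(p || softmax(z/T)) with respect to z_i is (q_i - p_i) / T.
    grad = [(qi - pi) / TEMPERATURE for qi, pi in zip(q, soft_targets)]
    student_logits = [z - lr * g for z, g in zip(student_logits, grad)]

final_kl = kl_divergence(soft_targets, softmax(student_logits, temperature=TEMPERATURE))
print(f"KL after distillation: {final_kl:.6f}")
```

In a real pipeline the student is a full neural network and the teacher's soft outputs (or generated text) supervise it across millions of examples, but the objective is the same: match the larger model's output distribution, not just its hard labels.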

Why Distillation Matters

Model distillation has become one of the most important techniques in practical AI deployment. Training a model that can match GPT-4 or Gemini Ultra on a broad range of tasks requires enormous compute resources. But distilling a focused version of those capabilities into a smaller model — one optimized for specific Apple use cases — is far more tractable.

The resulting models can run on iPhone, iPad, and Mac hardware without constant cloud inference, preserving privacy and reducing latency. It's the same general approach Apple used to build its Apple Intelligence features, though the use of Gemini as a teacher is new.

The Strategic Calculus

For Apple, the arrangement is pragmatic. Building frontier-scale models internally would require massive investment in data centers and research talent. Using Google's model as a scaffold allows Apple's teams to focus on the distillation pipeline, device optimization, and Apple-specific fine-tuning — areas where they already excel.

For Google, it deepens the commercial relationship with Apple's enormous install base, even as the two companies compete in AI assistants and mobile software. Notably, it also means Gemini's capabilities — however indirectly — end up powering Apple's on-device AI features.

Privacy Implications

Apple has been careful to position its AI features around on-device processing and Private Cloud Compute. If Gemini is being used purely as a training-time teacher model, with no inference happening via Google's servers during normal device use, that's largely consistent with Apple's privacy narrative.

The more sensitive question is what training data flows through this arrangement — something neither Apple nor Google has commented on publicly.

What to Watch

The distillation strategy suggests Apple is serious about closing the capability gap with Google and OpenAI without abandoning its hardware-first, privacy-first positioning. If the approach works, it could become a template for how large device manufacturers build competitive AI without the resources of a frontier lab.

Expect more details to emerge as Apple Intelligence features roll out in the next major iOS and macOS releases.

