Google Gemma 4 Runs Natively on iPhone With Full Offline AI Inference

April 15, 2026

What happened

Google’s open-source Gemma 4 model family reportedly now runs directly on iPhones, performing full local inference with no cloud calls. The capability is exposed through a new Google AI Edge Gallery app on the App Store: according to the reports, users pick a model variant and run inference entirely offline. That matters because on-device AI is no longer theoretical; it is running on a phone in your pocket.

Performance and variants

Early benchmarks reportedly place the 31B Gemma 4 variant roughly alongside Qwen 3.5’s 27B model: competitive, though not a decisive win. More important for phones are the smaller, mobile-first E2B and E4B variants. Google reportedly steers users toward E2B because it is lighter, faster, and tuned for real-world memory and thermal limits. Under the hood, inference is said to run on the iPhone’s GPU, which keeps latency low. Fast responses combined with local compute are what make on-device use practical.
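To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the space needed to hold model weights at a given precision, using the generic formula parameters × bits per weight ÷ 8. The parameter counts are nominal figures read off the variant names (2, 4, and 31 billion), which is an assumption and may not match the models' actual sizes. The quantization levels shown are illustrative, not what the app is known to ship.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Ignores activations, KV cache, and runtime overhead, so real
    usage will be higher than this estimate.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal sizes inferred from the variant names (an assumption,
# not published specifications).
VARIANTS = {"E2B": 2.0, "E4B": 4.0, "31B": 31.0}


def footprint_table(bits_options=(16, 8, 4)):
    """Return {variant: {bits: GiB}} for each illustrative precision."""
    return {
        name: {bits: round(weight_memory_gib(size, bits), 2) for bits in bits_options}
        for name, size in VARIANTS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{bits}-bit: {gib} GiB" for bits, gib in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, a nominal 2-billion-parameter model at 4 bits per weight needs a little under 1 GiB for weights, while a 31-billion-parameter model at 16 bits needs tens of GiB. That gap illustrates why a phone-oriented default would favor the smallest variant.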

How to try it and why it matters

Getting started reportedly requires only downloading Google AI Edge Gallery; there are no API keys and no cloud bill. The app is said to bundle text, image, and voice features plus an extensible Skills framework, positioning Gemma 4 not as a demo but as a platform for on-device experimentation. Offline inference changes the calculus for privacy-sensitive and field use cases such as healthcare, remote enterprise work, and any scenario where sending data to the cloud is a non-starter.

The big picture

This looks like a genuine product shift: edge AI moving from lab demos to consumer hardware. It is an invitation to developers and a headache for incumbents who have built businesses around cloud-only models. Whether the ecosystem follows remains to be seen, but if the reports hold, Google has made the case that powerful AI can, and will, live on the device.

Sources: gizmoweek.com, Hacker News