On-Device AI: Apple, Google & Microsoft

Beyond running models yourself, the major platform vendors have integrated AI directly into their operating systems. These features use dedicated hardware acceleration and run partly or entirely on-device.

Apple Intelligence (iOS 18.1+ / macOS Sequoia 15.1+)

Requires iPhone 15 Pro or any iPhone 16, or an Apple Silicon Mac.

What runs fully on-device:
- Writing tools (rewrite, proofread, tone adjustment)
- Photo cleanup and object removal
- Smart Reply suggestions in Mail and Messages
- Notification summaries
- Basic Siri requests

What uses Private Cloud Compute: More complex requests route to Apple's Private Cloud Compute (PCC) infrastructure. PCC servers run on Apple Silicon, and requests are processed without Apple being able to access their content. Apple also publishes the PCC software for external audit, so independent security researchers can verify the code running on PCC nodes.
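The split between on-device and Private Cloud Compute handling can be pictured as a small routing decision. The sketch below is purely illustrative: Apple's actual routing logic is not described here, and the task identifiers, the `route_request` function, and the `max_local_tokens` budget are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Task identifiers are hypothetical; they mirror the on-device list above.
ON_DEVICE_TASKS = {
    "writing_tools",
    "photo_cleanup",
    "smart_reply",
    "notification_summary",
    "basic_siri",
}


@dataclass
class Route:
    target: str  # "on_device" or "private_cloud"
    reason: str


def route_request(task: str, prompt_tokens: int, max_local_tokens: int = 2048) -> Route:
    """Stay local when the task is supported and the input is small enough.

    max_local_tokens is an assumed budget, not a published figure.
    """
    if task not in ON_DEVICE_TASKS:
        return Route("private_cloud", "task not supported by the local model")
    if prompt_tokens > max_local_tokens:
        return Route("private_cloud", "input exceeds the assumed local budget")
    return Route("on_device", "supported task within local limits")


print(route_request("smart_reply", 120).target)       # on_device
print(route_request("writing_tools", 10_000).target)  # private_cloud
```

The point of the sketch is the shape of the decision: a short allow-list of tasks plus a size limit, with everything else escalated.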

Google Gemini Nano

Embedded in Pixel phones (Pixel 8+) and select Android devices. Powers:
- Live Translate: real-time translation of conversations and calls
- Call screening: understanding call content to screen spam
- Summarization in Recorder: summaries of on-device audio transcriptions
- Gboard smart suggestions

Gemini Nano runs entirely on the device's tensor processing unit, so supported features need no connectivity.

Microsoft Copilot+ PCs

Requires an NPU capable of 40+ TOPS (trillions of operations per second). Qualifying devices get:
- Phi Silica: a Phi model optimized for NPU inference, built into Windows, accessible to developers via the Windows AI API
- Cocreator in Paint: on-device image generation
- Live Captions: real-time audio translation, processed locally
- Recall: timeline-based semantic search of past screen content (opt-in, processed locally)
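To make the 40 TOPS threshold concrete, here is a rough back-of-the-envelope calculation. It uses the common convention that one multiply-accumulate (MAC) counts as two operations. The NPU figures (16,384 MAC units at 1.4 GHz) are hypothetical and do not describe any specific chip. Vendor TOPS ratings are usually quoted at low precision such as INT8, so treat this as a sketch of the arithmetic only.

```python
COPILOT_PLUS_MIN_TOPS = 40.0  # threshold stated in the text above


def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Theoretical peak throughput; each multiply-accumulate counts as 2 ops."""
    return 2 * mac_units * clock_hz / 1e12


def meets_copilot_plus(npu_tops: float) -> bool:
    """True when the NPU rating reaches the 40 TOPS requirement."""
    return npu_tops >= COPILOT_PLUS_MIN_TOPS


# Hypothetical NPU: 16,384 MAC units clocked at 1.4 GHz.
tops = peak_tops(16_384, 1.4e9)
print(f"{tops:.1f} TOPS, qualifies: {meets_copilot_plus(tops)}")
# prints "45.9 TOPS, qualifies: True"
```

Peak figures like this are upper bounds; sustained throughput on real workloads is typically lower.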

On-Device vs. Apps You Install

- Integration depth: OS AI can access your photos, messages, and calendar with proper permissions.
- Optimization: Runs on purpose-built silicon (Neural Engine, TPU, NPU).
- Control: You cannot swap models, adjust parameters, or inspect what the model is doing.
- Capability ceiling: On-device models are intentionally small; complex tasks escalate to cloud.

Don't expect from on-device AI: complex multi-step reasoning, code generation, long document analysis, or up-to-date world knowledge. These remain the domain of cloud models and larger local models.
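One way to see why on-device models stay small is to estimate how much memory the weights alone occupy: parameter count times bits per weight, divided by eight. The model sizes below are hypothetical examples, not the specifications of any vendor's model, and the estimate ignores activations and other runtime memory.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


# Hypothetical model sizes, for illustration only.
examples = [
    ("3B parameters, 4-bit", 3, 4),
    ("3B parameters, 16-bit", 3, 16),
    ("70B parameters, 4-bit", 70, 4),
]

for name, params, bits in examples:
    print(f"{name}: ~{weights_gb(params, bits):.1f} GB")
# 3B parameters, 4-bit: ~1.5 GB
# 3B parameters, 16-bit: ~6.0 GB
# 70B parameters, 4-bit: ~35.0 GB
```

A phone or laptop that shares memory with the rest of the system can spare a gigabyte or two for a model, but not tens of gigabytes, which is consistent with larger tasks escalating to the cloud.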
