Beyond running models yourself, the major platform vendors have integrated AI directly into their operating systems — using dedicated hardware acceleration, partly or entirely on-device.
Apple Intelligence requires an iPhone 15 Pro, any iPhone 16, or an Apple Silicon Mac.
What runs fully on-device:
- Writing tools (rewrite, proofread, tone adjustment)
- Photo cleanup and object removal
- Smart Reply suggestions in Mail and Messages
- Notification summaries
- Basic Siri requests
What uses Private Cloud Compute: More complex requests are routed to Apple's Private Cloud Compute (PCC) infrastructure. PCC servers run on Apple Silicon, and requests are processed in a way that prevents Apple from accessing their content. Apple also publishes the PCC software for external audit, so independent security researchers can verify the code running on PCC nodes.
Gemini Nano is embedded in Pixel phones (Pixel 8 and later) and select Android devices. It powers:
- Live Translate: real-time translation of conversations and calls
- Call screening: understanding call content to screen spam
- Summarization in Recorder: on-device summaries of transcribed audio
- Gboard smart suggestions
Runs entirely on the device's tensor processing unit. No connectivity required for supported features.
Copilot+ PCs require an NPU capable of 40+ TOPS. Qualifying devices get:
- Phi Silica: a Phi model optimized for NPU inference, built into Windows and accessible to developers via the Windows AI API
- Cocreator in Paint: on-device image generation
- Live Captions: real-time captioning and translation of audio, processed locally
- Recall: timeline-based semantic search of past screen content (opt-in, processed locally)
Integration depth: OS AI can access your photos, messages, and calendar with proper permissions.
Optimization: Runs on purpose-built silicon (Neural Engine, TPU, NPU).
Control: You cannot swap models, adjust parameters, or inspect what the model is doing.
Capability ceiling: On-device models are intentionally small; complex tasks escalate to cloud.
Don't expect from on-device AI: complex multi-step reasoning, code generation, long document analysis, or up-to-date world knowledge. These remain the domain of cloud models and larger local models.
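The split between on-device and cloud handling can be pictured as a simple routing decision. The sketch below is purely illustrative and does not reflect how any vendor actually routes requests: the task labels, the `MAX_LOCAL_TOKENS` budget, and the `choose_route` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


# Hypothetical task labels, loosely based on the feature lists in this section.
ON_DEVICE_TASKS = {"rewrite", "proofread", "smart_reply", "notification_summary"}
HEAVY_TASKS = {"multi_step_reasoning", "code_generation", "long_document_analysis"}

# Assumed input budget for a small local model; not a vendor figure.
MAX_LOCAL_TOKENS = 4_000


@dataclass
class Request:
    task: str
    input_tokens: int
    needs_fresh_knowledge: bool = False


def choose_route(req: Request) -> Route:
    """Pick where a request runs under the assumptions stated above."""
    if req.task in HEAVY_TASKS:
        return Route.CLOUD  # beyond the small model's capability ceiling
    if req.needs_fresh_knowledge:
        return Route.CLOUD  # local weights carry no up-to-date world knowledge
    if req.input_tokens > MAX_LOCAL_TOKENS:
        return Route.CLOUD  # input too long for the assumed local budget
    if req.task in ON_DEVICE_TASKS:
        return Route.ON_DEVICE
    return Route.CLOUD  # unknown task types default to the larger model
```

For example, `choose_route(Request("proofread", 300))` yields `Route.ON_DEVICE`, while the same call with `"code_generation"` yields `Route.CLOUD`. Real systems make this decision inside the OS, and as noted above, it is not something the user can inspect or override.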