Run large language models on your Android device. No internet needed. Connect remote servers when you want. The LM Studio experience, now on mobile.
A complete chat experience designed for privacy-first AI on mobile
Run models entirely on your device using LiteRT-LM and llama.cpp. No data leaves your phone. Chat history stays local.
Connect to OpenAI, Anthropic, Ollama, LM Studio, or any OpenAI-compatible server over LAN or internet.
Search and download thousands of GGUF and LiteRT-LM models directly from Hugging Face with pause, resume, and progress tracking.
Real-time token-by-token streaming across all providers. Watch the AI think and respond as it generates text.
Full support for chain-of-thought reasoning from models like DeepSeek R1 and QwQ with collapsible think blocks.
Attach images and record audio in your conversations. Vision and audio models supported on LiteRT-LM and compatible providers.
LiteRT-LM models can use tools: weather, location, calculator, alarms, device controls, web search, and 15+ more.
All API keys and tokens stored with AES-256 encryption via the Android Keystore hardware security module.
Fine-tune generation with temperature, top-P, top-K, repeat penalty, max tokens, context size, and thread count controls.
Local engines or remote servers — one unified chat interface
No account, no cloud, no setup required for local chat
Download from Google Play. Runs on Android 9+ on any arm64 device.
Download a GGUF or LiteRT-LM model from Hugging Face, or import one from your device storage.
Select your model and start typing. No internet needed for local models — everything runs on your phone.