We're Sanchit and Shubham, co-founders of RunAnywhere (W26).
TL;DR: Run multimodal AI fully on-device with one SDK, and manage model rollouts + policies from a control plane.
We are already live and open source with ~3.9k stars on GitHub.
Edge AI is inevitable — users want instant responses, full privacy (health, finance, personal data), and AI that actually works on planes, subways, or spotty rural connections.
But shipping it today is brutal: every inference engine has its own APIs on every platform, you end up hand-rolling model download, extraction, and storage plumbing, and there's no graceful fallback when a device can't handle the model.
Result: most teams either give up on local AI or ship a brittle, hacked-together experience.
RunAnywhere isn't just a wrapper around a model. It is a full-stack infrastructure layer for on-device intelligence.
1. The "Boring" Stuff is Built-in We provide a unified API that handles model delivery (downloading with resume support), extraction, and storage management. You don't need to build a file server client inside your app.
2. Multi-Engine & Cross-Platform. We abstract away the inference backend: whether it's llama.cpp, ONNX Runtime, or another engine, you code against one standard SDK (second sketch below).
3. Hybrid Routing (The Control Plane). We believe the future isn't "Local Only"; it's hybrid. RunAnywhere lets you define policies: run the request locally for zero latency and full privacy, and if the device is too hot, too old, or the local model's confidence is low, automatically route the request to the cloud (third sketch below).
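To make point 1 concrete, here's a minimal Kotlin sketch of what a unified model-delivery surface can look like. Every name in it (ModelStore, ModelHandle, and so on) is hypothetical, chosen for illustration; this is the shape of the idea, not RunAnywhere's actual SDK.

```kotlin
// Hypothetical API shape only: these names are NOT RunAnywhere's real
// SDK, just an illustration of a unified model-delivery surface.

data class ModelFile(val id: String, val path: String)

interface ModelHandle {
    suspend fun generate(prompt: String): String
}

interface ModelStore {
    // Resumable download: if the app is killed mid-transfer, the next
    // call continues from the last completed chunk instead of starting over.
    suspend fun download(id: String, onProgress: (Int) -> Unit = {}): ModelFile

    // Extraction and on-disk storage management (e.g. eviction when space
    // is low) happen behind this call; the app only sees a ready handle.
    suspend fun load(file: ModelFile): ModelHandle
}
```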
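For point 2, a sketch of the engine abstraction under the same caveat: the Engine enum and interfaces below are invented for illustration. The point is that app code depends on a single session type no matter which backend actually runs the model.

```kotlin
// Hypothetical: one session interface regardless of which backend
// (llama.cpp, ONNX Runtime, ...) executes the model under the hood.

enum class Engine { LLAMA_CPP, ONNX_RUNTIME, AUTO }

interface InferenceSession {
    suspend fun generate(prompt: String): String
}

// The app asks for a session; backend selection is a configuration
// detail (AUTO would pick the best engine for the model and device).
fun interface SessionFactory {
    fun open(modelPath: String, engine: Engine): InferenceSession
}
```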
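And for point 3, a sketch of what a routing policy might express. The field names and thresholds are made up for illustration, not taken from the SDK; note that a confidence check can only run after a local attempt, with the request re-sent to the cloud if the score comes back too low.

```kotlin
// Hypothetical policy model illustrating "local first, cloud fallback".

data class RoutingPolicy(
    val maxThermalLevel: Int = 2,         // above this: device "too hot"
    val minDeviceRamGb: Int = 4,          // below this: device "too old"
    val minLocalConfidence: Double = 0.7, // checked after a local attempt;
                                          // below this, retry on the cloud
)

sealed interface Route {
    data object Local : Route
    data class Cloud(val reason: String) : Route
}

// Pre-flight decision from device state; the confidence-based fallback
// would apply later, once a local generation has produced a score.
fun decide(policy: RoutingPolicy, thermalLevel: Int, ramGb: Int): Route = when {
    thermalLevel > policy.maxThermalLevel -> Route.Cloud("device too hot")
    ramGb < policy.minDeviceRamGb         -> Route.Cloud("device too old")
    else                                  -> Route.Local
}
```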
Try our demo apps:
We're in full execution mode post-launch and hunting for design partners + early feedback.
Get in touch:
Excited to hear what you're building and how we can make on-device AI actually shippable at scale.