
Developer APIs to let your AI/LLM understand videos and audio.
Cloudglue - Video Understanding Infrastructure
Cloudglue is a Y Combinator-backed startup building developer APIs that turn video and audio into structured, searchable data. Think of us as the Stripe for video understanding - we handle the hard infrastructure (transcription, visual analysis, search, extraction) so developers can build on top of video without managing ML pipelines themselves.
We process millions of minutes of video for customers building search, analytics, and automation products. The engineering problems are real: high-throughput media processing, complex search and retrieval over multimodal data, and APIs that need to be fast, reliable, and a pleasure to use.
Our team has shipped large-scale systems at Snapchat and Amazon, with work presented at AWS re:Invent, KubeCon, NeurIPS, and DEF CON. We’re a small, technical team where engineers have real ownership and direct impact on the product.
The Role
We’re looking for a founding full-stack engineer to build and ship features across Cloudglue’s APIs, developer dashboard, SDKs, and documentation. You’ll be one of the first engineers on the team - this is a high-ownership role where you’ll shape both the product and the engineering culture.
You’ll work across the entire stack, from React frontends to backend services to database queries.
If you like building polished developer products and moving fast across frontend, backend, and infrastructure, this role is for you.
What You’ll Do
What We’re Looking For
Required
Nice to Have
Why Cloudglue?
Video is the largest and most underutilized data source on the internet. Most software still can’t meaningfully work with it. We’re building the infrastructure to change that - and the product surface you build is how developers will interact with it.
You’ll work on genuinely hard engineering problems, ship software that real developers depend on, and have outsized influence on a product that’s defining a new category.