
Developer APIs to let your AI/LLM understand videos and audio.
Cloudglue - Video Understanding Infrastructure
Cloudglue is a Y Combinator-backed startup building developer APIs that turn video and audio into structured, searchable data. Think of us as the Stripe for video understanding: we handle the hard infrastructure (transcription, visual analysis, search, extraction) so developers can build on top of video without managing ML pipelines themselves.
Our team has shipped large-scale systems at Snapchat and Amazon, with work presented at AWS re:Invent, KubeCon, NeurIPS, ICCV, CVPR, and DEF CON. We process millions of minutes of video for customers building search, analytics, and automation products.
We’re a small, technical team where engineers have real ownership and direct impact on the product.
The Role
We’re looking for a founding infrastructure engineer to design and scale the backend systems that power Cloudglue’s video processing pipelines, search and retrieval infrastructure, and async job orchestration. You’ll be one of the first engineers on the team. This is a high-ownership role where you’ll shape both the architecture and the engineering culture.
You’ll work on:
This is a systems-heavy role for someone who enjoys building reliable, high-throughput infrastructure and cares about getting the fundamentals right.
What You’ll Do
What We’re Looking For
Required
Nice to Have
Why Cloudglue?
Video is the largest and most underutilized data source on the internet. Most software still can’t meaningfully work with it. We’re building the infrastructure to change that, and this role sits at the core of it.
If you want to work on:
…this is that role.