How We Built Sub-200ms VM Boot Times (And Why It Matters for AI)
How we achieved sub-200ms VM boot times and why sub-second boot changes everything for AI agents and sandboxes.
When we started building Oblien, our VMs booted in about 4 seconds. That sounds fast for a virtual machine - AWS EC2 takes 30-90 seconds. But 4 seconds isn't fast enough for on-demand AI agent infrastructure.
If a user triggers an agent and waits 4 seconds before anything happens, it feels broken. If the agent needs to spin up 10 sub-agents, that's 40 seconds of setup before any real work begins.
We needed boot times under 200ms. Here's how we got there.
Why Boot Time Is the Key Metric
Fast boot time isn't just about user experience (though that matters). It fundamentally changes what architectures are possible:
With 30-second boot times (EC2)
- You pre-provision pools of VMs and keep them running
- Pay for idle infrastructure "just in case"
- Capacity planning becomes a guessing game
- Dynamic scaling is too slow for user-triggered workloads
With 4-second boot times
- Better, but agents still feel sluggish
- Per-request VMs are possible but noticeable
- Pool-based architectures still preferred
With 130ms boot times
- Create a VM per user request - disposable, no pool needed
- Create a VM per code execution - true sandboxing
- Scale from 0 to 1000 in seconds - no pre-provisioning
- Throw away VMs on error - cheaper than debugging
- Zero idle infrastructure - create when needed, delete when done
The 130ms threshold is where VMs stop feeling like infrastructure and start feeling like function calls.
The Starting Point
Firecracker (created by AWS for Lambda) already boots faster than traditional VMs. A basic Firecracker VM boots in about 125ms for a minimal configuration.
But we need a full Linux environment - package manager, language runtimes, real filesystem, networking, encryption. The naive approach brought us to ~4 seconds.
We identified five bottlenecks and systematically eliminated each one.
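For context on the baseline: a bare Firecracker microVM is configured entirely through a REST API served on a Unix socket, then booted with a single InstanceStart action. The sketch below builds those API calls in Python; the kernel and rootfs paths are illustrative, and actually sending the requests assumes a running `firecracker` process listening on the socket.

```python
import json
import socket
import http.client

def boot_payloads(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """The ordered PUT requests that configure and start a minimal microVM."""
    return [
        ("/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("/boot-source", {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        }),
        ("/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }),
        ("/actions", {"action_type": "InstanceStart"}),
    ]

class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over the Firecracker API Unix socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def launch(socket_path, kernel_path, rootfs_path):
    """Send the configuration sequence to a running firecracker process."""
    conn = UnixHTTPConnection(socket_path)
    for route, body in boot_payloads(kernel_path, rootfs_path):
        conn.request("PUT", route, json.dumps(body),
                     {"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()
        assert resp.status // 100 == 2, f"{route}: HTTP {resp.status}"
```

Everything a production system layers on top - full images, encryption, networking - happens around these four calls, which is why the bottlenecks below matter so much.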
Bottleneck 1: The Guest Kernel
Standard Linux kernels aren't optimized for microVM boot speed. They're built for hardware compatibility and flexibility.
We run a kernel optimized for our workload that boots significantly faster.
Bottleneck 2: Image Loading
Standard container image formats aren't optimized for boot speed. Per-boot image assembly adds seconds.
We eliminated per-workspace image setup entirely. Image loading went from the biggest bottleneck to negligible.
Bottleneck 3: Filesystem Encryption
Every workspace disk is encrypted with AES-256 and a unique key per workspace. Standard encryption setups add hundreds of milliseconds to boot.
We optimized the encryption path to add negligible overhead while maintaining full disk encryption.
For data deletion: destroy the key. Without the key, the data is cryptographic noise.
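This deletion model is often called crypto-shredding. The toy below illustrates the principle with a throwaway XOR keystream - emphatically not the AES-256 used for real disks - just to show that once the key bytes are destroyed, there is no path back to the plaintext.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data against a SHA-256-derived keystream.
    Illustrative only -- real workspace disks use AES-256."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

key = secrets.token_bytes(32)          # unique per-workspace key
ciphertext = keystream_xor(key, b"workspace secrets")

# Normal read path: the key recovers the data.
assert keystream_xor(key, ciphertext) == b"workspace secrets"

# "Deletion": destroy the key. The ciphertext is now irrecoverable noise;
# any other key yields garbage, and brute-forcing 256 bits is infeasible.
key = None
assert keystream_xor(secrets.token_bytes(32), ciphertext) != b"workspace secrets"
```

The practical upside is that "delete this workspace" is a constant-time key destruction, not a scrub of every disk block.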
Bottleneck 4: Userspace Startup
Workspace startup is optimized for speed - your process is running within milliseconds of the kernel booting.
Bottleneck 5: Network Setup
Networking is ready when the workspace boots - no additional setup time.
The Result
By optimizing every phase of the boot path, we reduced total boot time from ~4 seconds to under 130ms. There was no single silver bullet; it was systematic work across every layer of the stack.
The remaining variance (70-130ms) depends on host load and workload characteristics.
Why Those Milliseconds Matter for AI
Disposable sandboxes
An AI agent that executes user code can create a fresh sandbox per execution. At 4 seconds, users feel the delay. At 130ms, it's imperceptible - like opening a new browser tab.
Agent-per-request architecture
Instead of keeping long-lived agent processes waiting for work, create an agent for each request and destroy it after. Simpler. More secure (no state leakage between requests). Only possible when VMs are disposable.
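Sketched with hypothetical stand-ins (`create_workspace` and `destroy_workspace` are placeholders, not Oblien's actual API), the pattern is just a scoped lifetime around each request:

```python
from contextlib import contextmanager

# Hypothetical stand-ins for a real workspace API.
def create_workspace():
    return {"id": "ws-1", "alive": True}   # ~130ms in the real system

def destroy_workspace(ws):
    ws["alive"] = False

@contextmanager
def ephemeral_workspace():
    """One workspace per request: created on entry, destroyed on exit,
    even if the handler raises. No state survives between requests."""
    ws = create_workspace()
    try:
        yield ws
    finally:
        destroy_workspace(ws)

def handle_request(request):
    with ephemeral_workspace() as ws:
        return f"ran {request!r} in {ws['id']}"
```

Because teardown sits in the `finally` path, a crashing handler still leaves nothing behind - the security property falls out of the lifecycle rather than from cleanup code you have to remember to write.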
Burst scaling
Your product goes viral and gets 500 simultaneous users. Each needs an agent workspace. At 130ms per workspace, all 500 are running within seconds. No capacity planning, no over-provisioning.
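Because each boot is independent, bursts parallelize. In the simulation below, `time.sleep` stands in for a real 130ms boot: a thread pool brings 500 workspaces up in roughly the time of a few boot waves, not the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor

BOOT_SECONDS = 0.13  # simulated per-workspace boot time

def create_workspace(i):
    time.sleep(BOOT_SECONDS)   # stand-in for the real boot path
    return f"ws-{i}"

start = time.monotonic()
with ThreadPoolExecutor(max_workers=100) as pool:
    workspaces = list(pool.map(create_workspace, range(500)))
elapsed = time.monotonic() - start

# 500 boots, 100 at a time: ~5 waves of 0.13s each, vs ~65s serially.
assert len(workspaces) == 500
print(f"{len(workspaces)} workspaces in {elapsed:.2f}s")
```

The real limit becomes host capacity and scheduling, not boot latency - which is exactly the trade you want when traffic is spiky.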
Error recovery
An agent's workspace gets into a weird state? Don't debug it - destroy and recreate. 130ms later, you're back to a clean slate. This changes how you handle failures: retry is cheaper than investigation.
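In code, "destroy and recreate" is just a retry loop that gives every attempt a brand-new sandbox (again with hypothetical stand-ins for the real API):

```python
# Hypothetical stand-ins for a real workspace API.
def fresh_workspace():
    return {"attempt_state": {}}

def destroy(ws):
    ws.clear()

def run_with_retries(task, attempts=3):
    """Run task in a brand-new workspace per attempt. A failed attempt's
    workspace is thrown away wholesale -- no debugging of dirty state."""
    last_error = None
    for _ in range(attempts):
        ws = fresh_workspace()
        try:
            return task(ws)
        except Exception as err:        # broad on purpose: sketch only
            last_error = err
            destroy(ws)                 # ~130ms to a clean slate again
    raise last_error

call_count = {"n": 0}
def flaky(ws):
    call_count["n"] += 1
    if call_count["n"] < 3:
        raise RuntimeError("weird state")
    return "ok"

assert run_with_retries(flaky) == "ok"
```

Each retry starts from a known-good image, so transient corruption can never carry over into the next attempt.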
What We Learned
- Measure everything - we profiled every millisecond of the boot path. The biggest wins came from places we didn't expect.
- Don't use general-purpose tools for specific problems - standard Linux infrastructure is designed for flexibility; we needed speed.
- Milliseconds compound - saving 50ms in each of several places adds up fast.
Try It Yourself
Every workspace you create on Oblien goes through this exact boot sequence. Whether you're building an AI product, running code sandboxes, or setting up cloud development environments - your workspaces are ready in ~130ms.
Get started → Oblien Documentation