How to Boot a Full Linux VM in Under 200 Milliseconds
Most VMs take 30–60s to boot. Firecracker does it in under 200ms. Learn the engineering behind sub-second boot for AI agents and sandboxes.
When most people hear "virtual machine," they think of something that takes 30 seconds to a minute to start. An EC2 instance. A DigitalOcean droplet. A VirtualBox window.
But what if a VM could boot in under 200 milliseconds? Not a container pretending to be a VM - a real Linux virtual machine with its own kernel, its own memory space, its own block device, and hardware-level isolation.
That's what Firecracker microVMs achieve. And it changes what you can build, because environments that would have been too expensive to create per-task become practical.
Here's how it works.
Why boot time matters
Boot time determines how you use infrastructure:
- 30-60 seconds → VMs are long-lived. You keep them running. You share them between tasks. You over-provision to avoid startup delays.
- 500ms-2 seconds → Containers. You can scale them up and down, but you still keep warm pools because cold starts hurt.
- Under 200ms → VMs become disposable. Create one per task, per user, per request. Destroy it when done. The cost of starting fresh is negligible.
This matters for AI agents because you want to give each agent task its own isolated environment. If that takes 30 seconds, you'll share environments between tasks (creating security risks). If it takes 130 milliseconds, you create a fresh environment every time.
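The difference is easy to see with back-of-envelope math. A sketch, using the article's illustrative figures (a 30-second boot vs. a 130ms boot in front of a 10-second agent task):

```python
# Back-of-envelope: what share of wall time goes to booting the environment?
# Numbers are illustrative, taken from the article's figures.

def overhead_share(boot_ms: float, task_ms: float) -> float:
    """Fraction of total wall time spent booting, for one task."""
    return boot_ms / (boot_ms + task_ms)

# A 30-second boot in front of a 10-second agent task:
slow = overhead_share(30_000, 10_000)   # 75% of wall time is boot

# A 130ms boot in front of the same task:
fast = overhead_share(130, 10_000)      # ~1.3% of wall time is boot

print(f"{slow:.0%} vs {fast:.1%}")
```

At 75% overhead you amortize by sharing environments; at ~1% you stop thinking about it and boot fresh every time.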
The anatomy of a ~130ms boot
Here's what happens when Oblien creates a workspace:
1. API validation and scheduling
The API receives your request, validates parameters, checks quotas, assigns resources, and schedules the workspace on a host with available capacity.
2. Image resolution
Your chosen image (say, node-22) is already prepared on the host. No pulling, no extracting, no assembly at request time.
3. Filesystem preparation
The workspace gets its own private, encrypted filesystem. Disk encryption is set up with a unique key - every workspace's storage is encrypted independently.
4. VM launch
The host launches a Firecracker microVM inside a hardened security boundary. KVM creates the hardware-isolated VM partition with its own kernel, memory, and virtual devices.
5. Boot and init
The guest kernel boots, the filesystem is mounted, networking is configured, and your workload starts. No unnecessary services - just what's needed to get your workspace running.
Total: ~125-150ms from API call to running process.
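Oblien's internals aren't public, but the Firecracker side of steps 4-5 can be sketched as the sequence of calls a host makes against Firecracker's API socket. This sketch only builds the request plan (in production each call is an HTTP PUT over Firecracker's unix domain socket); the kernel and rootfs paths are placeholders.

```python
# Sketch of the Firecracker API calls behind steps 4-5.
# Builds the request plan only; paths below are placeholders.

def microvm_boot_plan(kernel: str, rootfs: str, vcpus: int, mem_mib: int):
    return [
        # 1. Point Firecracker at a kernel. Minimal boot args are part of
        #    why the guest boots fast: no PCI scan, no extra consoles.
        ("PUT", "/boot-source", {
            "kernel_image_path": kernel,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        }),
        # 2. Attach the workspace's (encrypted) block device as the root disk.
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": rootfs,
            "is_root_device": True,
            "is_read_only": False,
        }),
        # 3. Size the VM.
        ("PUT", "/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        # 4. Boot the guest kernel.
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]

plan = microvm_boot_plan("vmlinux.bin", "rootfs.ext4", vcpus=2, mem_mib=2048)
for method, path, _body in plan:
    print(method, path)
```

Four small API calls and a kernel boot: that's the entire launch path, which is why there's so little left to optimize away.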
What makes this different from containers
Containers boot quickly because they don't boot a kernel - they start a process on the existing kernel with namespace isolation. That's a fundamentally different tradeoff:
| Phase | Container | MicroVM |
|---|---|---|
| Kernel | Shared (no boot) | Own kernel |
| Filesystem | Layered | Encrypted block device |
| Process start | Fork + exec | Boot + exec |
| Total | ~200ms-2s | ~130ms |
| Isolation | Process-level | Hardware-level |
Optimized microVMs achieve boot times comparable to - or faster than - containers, with significantly stronger isolation.
Why this matters for AI and developer tools
Per-task sandboxes
Your AI agent needs to run untrusted code? Create a VM, run the code, destroy the VM. 130ms overhead per task is invisible to the user.
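The create-run-destroy pattern fits naturally into a context manager. A minimal sketch - the `client` object and its `create`/`destroy` methods are hypothetical stand-ins for whatever sandbox API you use, not a real SDK:

```python
# Per-task sandbox lifecycle: create -> run -> destroy, guaranteed.
# `client` is a hypothetical stand-in for your sandbox API.
from contextlib import contextmanager

@contextmanager
def disposable_sandbox(client, image="node-22"):
    sandbox = client.create(image=image)   # ~130ms with a microVM
    try:
        yield sandbox
    finally:
        client.destroy(sandbox)            # torn down even if the task errors

# Usage: each untrusted task gets a fresh, isolated VM.
# with disposable_sandbox(client) as sb:
#     result = client.exec(sb, "python untrusted_task.py")
```

The `finally` block is the point: even if the untrusted code crashes the task, the VM is destroyed and nothing leaks into the next one.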
Per-user environments
Building a SaaS where each user gets their own environment? Create a VM per user on demand. No warm pools, no over-provisioning.
Parallel execution
Need to run 10 tasks simultaneously? Create 10 VMs in parallel. Each boots in ~130ms regardless of how many you create (assuming host resources are available).
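Because each boot is independent, fanning out is just a thread pool around your provisioning call. A sketch, where `create_sandbox` is a stand-in for that call:

```python
# Creating N sandboxes in parallel: boots don't serialize, so wall time
# stays ~one boot (given host capacity). `create_sandbox` is a stand-in
# for your provisioning call.
from concurrent.futures import ThreadPoolExecutor

def create_many(create_sandbox, n: int):
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(create_sandbox, range(n)))

# sandboxes = create_many(lambda i: client.create(image="node-22"), 10)
```

Threads are fine here even in Python: the work is I/O-bound API calls, not CPU, so ten creations overlap almost completely.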
Instant dev environments
Click "new environment" and you're in a terminal in under a second. No waiting. No warm-up. Clone your repo and start coding.
Preview deployments
Push code, boot a VM with the new version, get a URL. Preview deploys become instant instead of a minutes-long CI/CD pipeline.
The snapshot shortcut
~130ms is the cold start. But there's an even faster path: snapshots.
A snapshot captures the entire VM state - memory, disk, running processes - at a point in time. Restoring from a snapshot is significantly faster than a cold boot.
Use this for:
- Prewarmed environments - Take a snapshot after installing dependencies. New workspaces restore the snapshot instead of booting from scratch.
- Instant resume - Pause a workspace, resume it later. The user picks up exactly where they left off, with the same processes, same open files, same terminal state.
- Fast cloning - Need 10 identical environments? Take one snapshot, restore it 10 times.
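In Firecracker terms, the cloning flow above is: pause one prewarmed VM, snapshot it, then load that snapshot into N fresh VMM processes. The sketch below only builds the request plans (the exact snapshot-load parameters vary by Firecracker version, and the file paths are placeholders):

```python
# Sketch of snapshot-based cloning with Firecracker's snapshot API.
# Builds request plans only; paths are placeholders.

def snapshot_plan(snap="/srv/snap.file", mem="/srv/mem.file"):
    # Sent to the prewarmed VM's API socket.
    return [
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_path": snap,       # device + vCPU state
            "mem_file_path": mem,        # guest memory
        }),
    ]

def restore_plan(snap="/srv/snap.file", mem="/srv/mem.file"):
    # Sent to a *new* Firecracker process, one per clone.
    return [
        ("PUT", "/snapshot/load", {
            "snapshot_path": snap,
            "mem_file_path": mem,
            "resume_vm": True,           # running the instant load completes
        }),
    ]

# Ten identical environments from one snapshot:
clones = [restore_plan() for _ in range(10)]
```

One expensive prewarm, then every clone skips kernel boot and dependency install entirely - it resumes mid-execution with memory already populated.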
What you can't do in 130ms (yet)
Honesty time - there are limits:
- Custom Docker images take longer on first use because they need to be prepared. After the first use, they boot at the same speed as built-in images.
- Very large disks (50GB+) take slightly longer to initialize.
- Snapshot restore with large memory (8GB+) takes longer because there's more state to restore.
For most workloads - 2-4 CPUs, 2-8GB RAM, standard images - you'll consistently see sub-200ms boot times.
Why this matters today
The speed of creating environments determines the architecture of your system:
- If environments are expensive to create, you share them. Sharing creates security risks and reliability problems.
- If environments are cheap to create, you use them once and throw them away. Each task, each user, each request gets a fresh, isolated environment.
Firecracker microVMs made the second approach practical. You get VM-level security with serverless-level convenience.
Oblien runs every workspace on Firecracker. Whether you're deploying a permanent agent home or creating a 30-second throwaway sandbox, every workspace is a ~130ms-boot microVM with its own kernel, encrypted disk, and zero-trust networking.