
AWS Lambda MicroVMs for AI agents: architecture, security, costs, and when to use them

Paulo Frugis | Published June 23, 2026 | Updated June 24, 2026 | 11 min read

AWS announced AWS Lambda MicroVMs on June 22, 2026. The service adds a new Lambda resource for dedicated, Firecracker-based execution environments that can run user-supplied or AI-generated code with VM-level isolation, preserved in-session state, dedicated HTTPS endpoints, and direct lifecycle control.

The useful question is not whether MicroVMs are interesting. They are. The useful question is whether they fit a specific workload better than Lambda Functions, a managed code interpreter, containers, or a custom VM platform.

The gap is real: VMs provide strong isolation but usually take too long to start for interactive product experiences; containers start quickly but share a kernel; Lambda Functions provide mature Firecracker isolation, but their request/response model and 15-minute maximum duration do not fit long-running user environments. Lambda MicroVMs combine a dedicated execution boundary, snapshot-based startup, and stateful sessions that can suspend and resume.

The short version

AWS Lambda MicroVMs are best treated as a serverless execution boundary for stateful, interactive sessions: AI-agent sandboxes, user-code environments, notebooks, security analysis, and tools that need a custom runtime plus stronger isolation than a shared container boundary. They are not a complete security model, and they are not a replacement for ordinary Lambda Functions.

Question	Short answer
Where does this fit best?	Dedicated sessions where a user, agent, task, or tenant needs its own isolated environment and endpoint.
Where is it probably too much?	Short, stateless, event-driven work where Lambda Functions already provide the simpler model.
What can block a pilot?	Launch Regions, the ARM64-only architecture, quota limits, session duration, networking, image lifecycle, and cost per session.
What does isolation not solve?	VM-level isolation improves the execution boundary. You still own identity, permissions, egress, tool authorization, logging, and state cleanup.

The key shift: sessions, not invocations

Lambda MicroVMs do not behave like a normal Lambda Function with a handler waiting for events. Your application calls run-microvm when it needs an isolated environment for a user, job, agent, or sandbox. Lambda launches that MicroVM from an image snapshot and returns a dedicated HTTPS endpoint for that session.

Clients then connect to the running application over protocols such as HTTP/2, gRPC, or WebSockets. When the session is idle, the MicroVM can suspend and later resume with memory and disk state intact. That makes the practical unit a session, not an invocation.

This also changes the platform work your team owns. Lambda can vertically scale a running MicroVM above its configured baseline, but it does not automatically create a fleet of MicroVMs behind one shared endpoint. If you need more isolated environments, your application creates them, tracks endpoints, routes users or jobs to the right session, and cleans them up.

AWS has now made this pattern more concrete with its Lambda MicroVMs guide for Claude Managed Agents self-hosted sandboxes and the AWS sample implementation. The sample launches one MicroVM per Claude session from a control plane in your AWS account. That is useful validation of the pattern, but it also makes the ownership line clearer: webhook verification, secrets, image builds, network policy, monitoring, and cleanup are still your platform work.

Before you pick a pilot workload

A workload can look like a good MicroVM candidate and still be a poor first pilot if the Region, runtime, lifecycle, or network model does not fit.

Area	Current launch fact	What it means for planning
Region	US East (N. Virginia, Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Tokyo)	No Canadian Region at launch. Confirm data-location, latency, transfer, and policy fit before choosing the first workload.
Architecture	ARM64	Check native packages, binaries, agents, scanners, and language runtime dependencies.
Session size	Up to 16 vCPUs, 32 GB memory, and 32 GB disk	Decide whether the workload is a session sandbox or needs a larger platform pattern.
Scaling model	Automatic vertical scaling up to 4x the configured baseline; new sessions are created with `run-microvm`	Plan session routing, endpoint tracking, quotas, load placement, and cleanup yourself.
Lifecycle	Suspend/resume within a configured maximum duration of up to 8 hours	Plan external state, user messaging, cleanup, and retry behavior.
Network	Public outbound internet access by default; VPC egress through Lambda Network Connectors	Design egress controls before running sensitive code.
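The launch facts in the table can be turned into a quick pre-pilot check. The sketch below is an assumption-heavy illustration: the limits are copied from the table, the Region codes are the standard AWS codes for the listed Regions, and `WorkloadProfile` and `launch_blockers` are hypothetical names. It treats the session-size figures as hard per-session ceilings, which should be confirmed against the AWS documentation.

```python
from dataclasses import dataclass
from typing import List

# Launch limits copied from the table above; re-check them before relying on this.
LAUNCH_REGIONS = {"us-east-1", "us-east-2", "us-west-2", "eu-west-1", "ap-northeast-1"}
MAX_VCPUS = 16
MAX_MEMORY_GB = 32
MAX_DISK_GB = 32
MAX_DURATION_HOURS = 8


@dataclass
class WorkloadProfile:
    """What a candidate workload needs from one session (hypothetical shape)."""
    region: str
    architecture: str
    vcpus: int
    memory_gb: int
    disk_gb: int
    max_session_hours: float
    needs_gpu: bool = False


def launch_blockers(w: WorkloadProfile) -> List[str]:
    """Return every launch constraint the workload violates; empty means none found."""
    blockers = []
    if w.region not in LAUNCH_REGIONS:
        blockers.append(f"region {w.region} is not a launch Region")
    if w.architecture != "arm64":
        blockers.append(f"architecture {w.architecture} is not ARM64")
    if w.vcpus > MAX_VCPUS:
        blockers.append(f"{w.vcpus} vCPUs exceeds {MAX_VCPUS}")
    if w.memory_gb > MAX_MEMORY_GB:
        blockers.append(f"{w.memory_gb} GB memory exceeds {MAX_MEMORY_GB} GB")
    if w.disk_gb > MAX_DISK_GB:
        blockers.append(f"{w.disk_gb} GB disk exceeds {MAX_DISK_GB} GB")
    if w.max_session_hours > MAX_DURATION_HOURS:
        blockers.append(f"{w.max_session_hours} h session exceeds {MAX_DURATION_HOURS} h")
    if w.needs_gpu:
        blockers.append("GPU is outside the MicroVM launch envelope")
    return blockers
```

An empty list only means the workload clears the published launch limits. Quotas, networking, and cost still need their own review.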

Treat the AWS Lambda MicroVMs documentation as the source of truth before a production build, especially for Regions, quotas, pricing, APIs, lifecycle hooks, and image-version behavior.

When Lambda MicroVMs are the right fit

The pattern is strongest when isolation, session state, and a custom runtime are all central to the product experience. It gets weaker when the workload is just a short task or when launch constraints rule out the design.

Fit	Use MicroVMs when...	Be careful when...
Strong	Each user, task, tenant, or agent needs a dedicated environment with preserved files, processes, dependencies, and an HTTPS endpoint.	The session also needs broad data or tool access; isolation does not replace authorization.
Conditional	The workload is a notebook, analytics session, security scanner, or internal tool where isolation helps but does not settle the design.	Private networking, audit evidence, state cleanup, quotas, or data-governance requirements are still unresolved.
Poor	The work is short, stateless, event-driven, or already fits Lambda Functions cleanly.	The workload needs x86, GPU, sessions longer than the 8-hour maximum, a non-launch Region, or platform controls outside the MicroVM model.

Choose the execution model by workload

The first architecture decision is not “MicroVMs or not.” It is which AWS execution model matches the boundary of the workload.

If the workload is...	Start with...	Why
Short, event-driven, and mostly stateless	Lambda Functions	Least operational overhead for ordinary request handling and background work.
Needs automatic request fanout behind one stable service endpoint	Lambda Functions, API Gateway, ALB, or a container service	MicroVM endpoints belong to individual sessions; your application must route users and jobs to the right MicroVM.
Still a Function workload, but tenant separation must be explicit	Lambda tenant isolation pattern	Keeps the Function model while separating execution boundaries by tenant.
An AI agent only needs a managed code execution tool	AgentCore Code Interpreter or a similar managed sandbox	Reduces the amount of runtime and sandbox infrastructure your team owns.
A dedicated interactive session with custom packages, state, and an endpoint	Lambda MicroVMs	Good candidate when Region, ARM64, lifecycle, networking, and cost constraints fit.
Long-running, GPU-backed, x86-dependent, or host-controlled execution	ECS, EKS, Fargate, EC2, or another platform pattern	These requirements usually exceed the MicroVM launch envelope.

How the architecture works

The lifecycle starts with a MicroVM image. You package the application and Dockerfile, place the artifact in Amazon S3, and call the Lambda MicroVM image API. Lambda builds the image, initializes the application, and captures a ready snapshot.

When the product needs a session, it calls the MicroVM run API. The MicroVM starts from the snapshot, receives a dedicated HTTPS endpoint, and can serve interactive traffic such as HTTP/2, gRPC, or WebSockets depending on the application you expose. When idle, it can suspend while preserving memory and disk state. It resumes when traffic arrives or when the application calls the resume API. It terminates when explicitly ended or when the configured maximum duration is exceeded.
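The session lifecycle in the paragraph above can be modeled as a small state machine, which is useful for tracking sessions in your own control plane. The state and event names here are hypothetical labels for this sketch, not AWS API values, and the sketch assumes the maximum duration can also end a suspended session.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


# Allowed transitions, keyed by (current state, event).
TRANSITIONS = {
    (State.RUNNING, "idle"): State.SUSPENDED,           # idle session suspends
    (State.SUSPENDED, "traffic"): State.RUNNING,        # traffic arrives
    (State.SUSPENDED, "resume_call"): State.RUNNING,    # application asks to resume
    (State.RUNNING, "terminate"): State.TERMINATED,     # explicitly ended
    (State.SUSPENDED, "terminate"): State.TERMINATED,
    (State.RUNNING, "max_duration"): State.TERMINATED,  # lifetime limit exceeded
    (State.SUSPENDED, "max_duration"): State.TERMINATED,  # assumption, see lead-in
}


def next_state(state: State, event: str) -> State:
    """Return the state after an event, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.value}"
        ) from None
```

Terminated is a terminal state in this model: any event after termination raises, which makes stale routing entries easier to catch.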

This lifecycle is powerful because users can return to a warm working state. It is also easy to misuse. Anything placed into the initialized snapshot can appear in more than one session, so secrets, unique identifiers, temporary credentials, connection handles, and per-user state need deliberate lifecycle handling.

What AWS manages and what you still own

Layer	AWS provides	Your team still owns
Execution boundary	Firecracker-based MicroVM isolation and lifecycle APIs.	Workload selection, tenant mapping, image contents, and session cleanup.
Identity	IAM controls for Lambda resources and API calls.	Which principal can launch sessions, pass roles, access data, or invoke tools.
Endpoint access	Dedicated HTTPS endpoint and token-based request authentication.	Token issuance, expiration, allowed ports, client routing, and application-level authorization.
Network	Default outbound public internet access and optional VPC egress through Lambda Network Connectors.	Egress policy, security groups, network ACLs, private service access, and deny-by-default decisions.
State	Memory and disk preservation during suspend/resume.	Which state belongs inside the MicroVM, which state must be externalized, and how sensitive state is removed.
Evidence	Service events and integration points for AWS logging and monitoring.	Correlating activity to user, tenant, agent, approval, tool call, cost, and outcome.

Treat Lambda MicroVMs as the execution boundary, not the whole security architecture. A session can still create material risk if it has broad IAM permissions, unrestricted egress, long-lived endpoint tokens, overpowered tools, or poor logs.

Networking and endpoint authentication

AWS documents public outbound internet access as the default egress path for Lambda MicroVMs. If the session needs RDS, ElastiCache, internal APIs, on-premises systems, or private AWS services, design VPC egress with a Lambda Network Connector and apply the relevant security group and network ACL controls.

Endpoint traffic also needs its own design. Requests to a MicroVM endpoint require an authentication token in the X-aws-proxy-auth header. AWS describes these as encrypted JWE tokens scoped to a specific MicroVM, allowed ports, and an expiration time. That makes token lifetime and port scope architecture decisions, not implementation details.
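As a sketch of what treating token lifetime as a design decision can look like on the client side, the helper below refuses to build a request when the token is close to expiry. Only the `X-aws-proxy-auth` header name comes from the text above. `EndpointToken`, `build_request`, the `expires_at` bookkeeping, and the 30-second margin are assumptions for illustration, and how tokens are issued is deliberately left out.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Optional


class TokenExpired(Exception):
    """Raised when a token is expired or too close to expiry to use."""


@dataclass
class EndpointToken:
    value: str         # opaque token string; never parsed or logged here
    expires_at: float  # Unix time the application recorded at issuance (hypothetical)


def build_request(endpoint: str, path: str, token: EndpointToken,
                  now: Optional[float] = None,
                  min_remaining_s: float = 30.0) -> urllib.request.Request:
    """Build, but do not send, an authenticated request to a session endpoint."""
    current = time.time() if now is None else now
    if token.expires_at - current < min_remaining_s:
        # Fail before sending so the caller can obtain a fresh token.
        raise TokenExpired("token is expired or expires too soon")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + path,
        headers={"X-aws-proxy-auth": token.value},
    )
```

Checking expiry before the call keeps a long-running client from discovering a dead token in the middle of a user action. Port scope would need a similar check if the application exposes more than one port.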

For sensitive workloads, the review question is simple: what can this MicroVM reach, and under whose authority?

The serverless illusion: the infrastructure you still own

Lambda MicroVMs remove a lot of undifferentiated infrastructure work. They do not remove platform ownership. The work shifts from managing hosts to managing images, sessions, state, permissions, egress, evidence, and cleanup.

The image drift problem

A MicroVM starts from an image and initialized snapshot. That is what makes launch fast, but it also means your sandbox environment has a release lifecycle of its own. Dependency updates, OS patches, language runtimes, agent tools, prompt templates, certificates, and bootstrap scripts all need versioning and rollback.

If the platform cannot tell which image version launched which tenant or session, image drift becomes hard to debug. One user may be running a patched sandbox while another is still on yesterday's dependency set. A production pipeline should tag images, test snapshots, promote versions deliberately, and record the image version with each session.
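Recording the image version with each session can be as simple as one extra field on the session record. The sketch below assumes a hypothetical `SessionRecord` shape and version tags chosen by your own pipeline. It shows how that one field answers the two questions drift raises: which sessions run which image, and which sessions are behind the promoted version.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SessionRecord:
    session_id: str
    tenant: str
    image_version: str  # tag recorded at launch time (hypothetical field)


def sessions_by_image(records: Iterable[SessionRecord]) -> Dict[str, List[str]]:
    """Group session IDs by the image version they were launched from."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        grouped[record.image_version].append(record.session_id)
    return dict(grouped)


def stale_sessions(records: Iterable[SessionRecord],
                   promoted_version: str) -> List[SessionRecord]:
    """Return sessions still running an image other than the promoted one."""
    return [r for r in records if r.image_version != promoted_version]
```

Whether a stale session should be drained, left to expire, or terminated is a policy decision. The record only makes the choice visible.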

The stale state problem

Suspend and resume are powerful because a user can return to a warm working state. They also create a new failure boundary. To the MicroVM, memory and disk may look preserved. Outside the MicroVM, database connections may have been closed, bearer tokens may have expired, remote services may have rotated credentials, and long-lived handles may no longer be valid.

Treat resume as a partial reinitialization path, not as proof that the world stayed still. Application code should validate connections, refresh credentials, re-check clocks and leases, rebuild clients when needed, and fail safely before handing execution back to a user workload.

Do not bake secrets into snapshots. Generate per-session credentials after launch or retrieve them at runtime with scoped permissions.
Track image versions per session. Debugging and rollback depend on knowing exactly what environment ran.
Handle resume explicitly. Recycle dead sockets, refresh tokens, and validate external state before continuing work.
Externalize durable state. A MicroVM session is not the system of record.
Design termination paths. The lifecycle limit still requires cleanup, checkpointing, user messaging, and retry behavior.
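The "handle resume explicitly" item can be sketched as a list of checks that run before user work continues. `ResumeCheck` and `on_resume` are hypothetical names, and how your application learns that a resume happened (a lifecycle hook, a clock jump, a failed call) is not shown here and needs to be confirmed against the AWS documentation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ResumeCheck:
    name: str
    is_valid: Callable[[], bool]  # e.g. ping a connection, compare token expiry to now
    repair: Callable[[], None]    # e.g. reconnect, fetch fresh credentials


def on_resume(checks: List[ResumeCheck]) -> List[str]:
    """Validate each external dependency and repair the ones that went stale.

    Returns the names that needed repair. Raises if a dependency is still
    invalid after repair, so the session fails safely instead of handing
    execution back to user code with broken state.
    """
    repaired = []
    for check in checks:
        if check.is_valid():
            continue
        check.repair()
        if not check.is_valid():
            raise RuntimeError(
                f"{check.name} is still invalid after repair; refusing to resume work"
            )
        repaired.append(check.name)
    return repaired
```

Returning the list of repaired dependencies gives the logging layer something concrete to record for each resume.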

Pricing and sizing questions

Do not model MicroVM cost as classic Lambda invocation pricing. The economics are closer to Fargate-style capacity planning: choose a baseline, account for active burst above that baseline, and model one representative session from launch to suspend or termination.

Variable	Why it matters
Baseline resources	You pay for configured baseline compute while the MicroVM is running. Over-provisioning raises the floor for every active session.
Peak resources	A running MicroVM can vertically scale up to 4x its configured baseline during peak activity. Usage above baseline changes the cost of heavy work.
Idle time and suspend policy	Suspended MicroVMs incur no compute charges, but the user experience, resume behavior, and lifecycle policy still need to work.
Snapshot operations and storage	Images, snapshot reads and writes, working files, and preserved state affect startup behavior and cost outside active compute.
Data transfer	Cross-Region, public internet, and private-network paths can change cost and architecture feasibility.
Session orchestration	Horizontal scale means creating more MicroVMs, tracking endpoints, respecting quotas, and terminating sessions when they are done.
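A per-session cost model built from these variables might look like the sketch below. Every rate is a placeholder to be replaced with published AWS pricing, and the billing dimensions are assumptions: the model bills vCPU only, charges snapshot storage for the whole session lifetime, and leaves out memory, snapshot operations, and data transfer. The burst ceiling follows the 4x vertical-scaling limit.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder unit prices; these are NOT AWS prices."""
    baseline_per_vcpu_second: float
    burst_per_vcpu_second: float
    snapshot_storage_per_gb_hour: float


@dataclass
class SessionProfile:
    baseline_vcpus: float
    running_seconds: float     # time spent in the running state
    burst_vcpu_seconds: float  # vCPU-seconds consumed above baseline
    suspended_seconds: float   # modeled with no compute charge
    snapshot_gb: float


def session_cost(p: SessionProfile, r: Rates) -> float:
    """Estimate the cost of one session from launch to termination."""
    # Scaling tops out at 4x baseline, so burst cannot exceed 3x baseline usage.
    max_burst = 3 * p.baseline_vcpus * p.running_seconds
    if p.burst_vcpu_seconds > max_burst:
        raise ValueError("burst exceeds the 4x vertical-scaling ceiling")
    baseline = p.baseline_vcpus * p.running_seconds * r.baseline_per_vcpu_second
    burst = p.burst_vcpu_seconds * r.burst_per_vcpu_second
    lifetime_hours = (p.running_seconds + p.suspended_seconds) / 3600
    storage = p.snapshot_gb * lifetime_hours * r.snapshot_storage_per_gb_hour
    return baseline + burst + storage
```

Running the model for a few representative session shapes (short and bursty, long and mostly suspended) shows quickly whether the baseline or the idle policy dominates the bill.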

What to validate in a pilot

Start with one workload, not a platform rewrite. A credible pilot should produce evidence that can survive architecture, security, and FinOps review.

Launch and resume behavior for your image size and dependency set.
Network policy with default egress disabled or constrained where needed.
Endpoint token scope, expiration, and application authorization.
Runtime identity, IAM permissions, tool access, and per-tenant boundaries.
Snapshot hygiene for secrets, unique state, files, and initialization hooks.
Logging that ties actions to user, tenant, session, agent, approval, and outcome.
Failure behavior at timeout, termination, quota limits, stale connections, and retries.
Cost per representative session, including idle periods and suspended state.

Benchmark methodology we would use

We are not publishing benchmark numbers until we have measured them on a reproducible setup. A useful evaluation should record Region, date, image size, baseline and peak configuration, sample count, launch latency, resume latency, endpoint behavior, VPC versus default egress behavior, state preservation, cleanup behavior, and cost assumptions. The AWS launch blog is a useful baseline for the service model; production evidence still needs workload-specific measurement.
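One way to keep such an evaluation reproducible is to store the setup alongside the samples and summarize them with a fixed percentile method. The sketch below uses hypothetical field names and the nearest-rank percentile definition. It contains no measurements; the sample lists are meant to be filled from your own runs.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BenchmarkRun:
    """Setup plus raw samples for one evaluation (field names are hypothetical)."""
    region: str
    date: str
    image_size_mb: int
    baseline_vcpus: int
    launch_ms: List[float]
    resume_ms: List[float]


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(run: BenchmarkRun) -> Dict[str, object]:
    """Report the setup with p50 and p95 latencies so results can be compared."""
    return {
        "region": run.region,
        "date": run.date,
        "image_size_mb": run.image_size_mb,
        "baseline_vcpus": run.baseline_vcpus,
        "samples": len(run.launch_ms),
        "launch_p50_ms": percentile(run.launch_ms, 50),
        "launch_p95_ms": percentile(run.launch_ms, 95),
        "resume_p50_ms": percentile(run.resume_ms, 50),
        "resume_p95_ms": percentile(run.resume_ms, 95),
    }
```

Reporting the sample count next to the percentiles matters: a p95 from ten launches is a much weaker claim than one from a thousand.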

The point of that benchmark is not to prove that every workload should use MicroVMs. It is to decide whether one workload has a defensible path to production.

Frequently asked questions

Can Lambda MicroVMs replace Lambda Functions?

Not generally. Lambda Functions remain the simpler option for short, stateless, event-driven work. Lambda MicroVMs are for dedicated sessions that need a custom environment, preserved state, isolation, endpoint access, and control over a longer interactive lifecycle.

Do Lambda MicroVMs scale like Lambda Functions?

No. Lambda Functions automatically scale execution environments in response to invocation demand. Lambda MicroVMs can scale vertically within a running session, but horizontal scale is an application concern: create more MicroVMs, route users or jobs to their dedicated endpoints, monitor quotas, and terminate sessions when they are no longer needed.

Are Lambda MicroVMs enough to run untrusted code safely?

No single compute boundary is enough on its own. MicroVM isolation helps, but production safety still depends on IAM, egress control, endpoint tokens, tool permissions, logging, tenant separation, and state cleanup.

Should every AI agent sandbox use Lambda MicroVMs?

No. If a managed code interpreter gives enough isolation and control, use the simpler managed path. Consider MicroVMs when you need a custom runtime, session state, a dedicated endpoint per session, or more control over the execution environment.

What is the biggest design risk?

The biggest risk is assuming “inside a VM” means “safe.” The right question is what the session can access, change, disclose, or authorize beyond the task it was created to perform.

How Elevata can help

Bring one candidate workload. Elevata can help decide whether Lambda MicroVMs are the right execution model and what a safe pilot must prove: workload fit, AWS service comparison, IAM and tenant boundaries, network design, image and snapshot lifecycle, logging evidence, cost assumptions, and approval criteria.

Assess a MicroVM sandbox pilot with Elevata. If you are still shaping the broader control model, start with our governed AI-agent sandbox architecture guide.