What Serverless Actually Means
The word "serverless" is one of the cloud industry's better pieces of marketing — evocative but imprecise. There are absolutely servers involved. What "serverless" means is that you, the developer, do not provision, patch, maintain, or think about servers. The cloud provider handles all of that. You write code (a function), upload it, and the platform handles the rest.
The serverless model encompasses two main patterns: Function as a Service (FaaS) — discrete functions that execute in response to events — and managed compute services like AWS Fargate, Google Cloud Run, and Azure Container Apps, which run containers without you managing the underlying hosts. This article focuses primarily on FaaS, with AWS Lambda as the reference implementation.
How Lambda Works
When an event triggers a Lambda function (an HTTP request via API Gateway, a message from SQS, a file upload to S3, a scheduled EventBridge rule, etc.), AWS runs your function code inside a secure, isolated execution environment backed by a Firecracker microVM. If an environment left over from an earlier invocation is idle, Lambda reuses it (a warm start); otherwise it creates and initialises a new one (a cold start). After your function returns a result, the environment stays available for further invocations and is reclaimed after a period of inactivity.
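The lifecycle described above can be seen in a minimal handler sketch. Module-level code runs once when an environment is initialised, while the handler body runs on every invocation. The names `ENVIRONMENT_ID` and `_invocations` are hypothetical and exist only for illustration; reuse of an environment is an optimisation Lambda may apply, not something the code can depend on.

```python
import time
import uuid

# Module-level code runs once per execution environment (during the cold start).
# ENVIRONMENT_ID and _invocations are illustrative names, not part of any AWS API.
ENVIRONMENT_ID = uuid.uuid4().hex
INITIALISED_AT = time.time()
_invocations = 0


def handler(event, context):
    """Lambda entry point: called once for each event delivered to this environment."""
    global _invocations
    _invocations += 1
    return {
        "environment_id": ENVIRONMENT_ID,
        # True only for the first invocation served by this environment.
        "cold_start": _invocations == 1,
        "invocation": _invocations,
        "environment_age_seconds": time.time() - INITIALISED_AT,
    }
```

Calling `handler` twice in the same process mimics two invocations landing on one warm environment: the second call reports `cold_start` as `False` and the same `environment_id`.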
You are billed in 1ms increments for the duration your function runs, multiplied by the memory allocated. AWS Lambda's free tier covers 1 million requests and 400,000 GB-seconds of compute per month — enough to run a significant development workload at no cost.
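As a rough sketch of the billing arithmetic, the function below converts memory size and duration into GB-seconds and checks a workload against the free-tier figures quoted above. The function names are hypothetical, and the sketch assumes the free tier is the only discount in play.

```python
import math

# Free-tier allowances as quoted in the text above.
FREE_TIER_REQUESTS = 1_000_000
FREE_TIER_GB_SECONDS = 400_000


def billed_gb_seconds(memory_mb: int, duration_ms: float, invocations: int) -> float:
    """GB-seconds = memory in GB x billed duration in seconds x invocations.

    Duration is rounded up to the next whole millisecond, matching 1ms billing.
    """
    billed_ms = math.ceil(duration_ms)
    return (memory_mb / 1024) * (billed_ms / 1000) * invocations


def fits_free_tier(memory_mb: int, duration_ms: float, invocations: int) -> bool:
    """True when both the request count and the compute fit within the free tier."""
    return (
        invocations <= FREE_TIER_REQUESTS
        and billed_gb_seconds(memory_mb, duration_ms, invocations) <= FREE_TIER_GB_SECONDS
    )


if __name__ == "__main__":
    # 512MB for 500ms, one million times a month: 250,000 GB-seconds.
    print(billed_gb_seconds(512, 500, 1_000_000))
    print(fits_free_tier(512, 500, 1_000_000))
```

Doubling the memory to 1024MB at the same duration and volume doubles the GB-seconds to 500,000, which no longer fits within the 400,000 GB-second allowance.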
Benefits of Serverless
No server management: no OS patching, no capacity planning, no replacing failed instances. Your operations team can focus on application-level concerns instead.
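Automatic scaling: Lambda scales from 0 to thousands of concurrent executions automatically, up to your account's concurrency quota (1,000 per Region by default, and raisable on request). There are no scaling policies to configure and no capacity to pre-provision for traffic spikes. For bursty workloads, this is transformative.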
Pay-per-execution: you pay only when your code runs. An application that processes 10,000 events per day costs pennies. This is dramatically cheaper than an always-on VM for low-to-moderate volume workloads.
Fast deployment: deploying a Lambda function takes seconds. Combined with infrastructure as code (SAM, Serverless Framework, or Terraform), you can go from code change to production in minutes.
Limitations of Serverless
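Cold starts: when Lambda allocates a new execution environment (after inactivity or for a new concurrent invocation), there's a delay while the environment initialises, typically 200ms to 2 seconds depending on runtime, package size, and initialisation code. For user-facing requests, this can be unacceptable. Provisioned Concurrency keeps a configured number of environments initialised in advance, which removes the delay for the requests they serve, at additional cost.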
Execution time limits: AWS Lambda has a maximum execution timeout of 15 minutes. Long-running processes (video transcoding, complex data transformations, ML training) are not suitable for Lambda — use EC2, ECS, or AWS Batch instead.
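Stateless requirement: Lambda functions cannot rely on state surviving between invocations. A warm environment may keep module-level objects and files in /tmp, which is useful for caching clients and connections, but any environment can be replaced at any time and concurrent invocations run in separate environments. All durable state must be externalised to a database (DynamoDB, RDS), cache (ElastiCache), or object store (S3). This is a good architectural constraint but requires design discipline.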
Vendor lock-in: Lambda functions that use AWS SDK calls, IAM roles, and event source mappings are deeply tied to AWS. Migration to another provider is a significant rewrite. Mitigate by keeping business logic in provider-agnostic code and isolating AWS-specific code in thin adapter layers.
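One way to picture the thin adapter layer is the sketch below. The business logic (`price_order`, a hypothetical example function) knows nothing about AWS, while `lambda_handler` only translates an API Gateway proxy-style event into plain arguments and back. The 20% VAT rate and field names are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Quote:
    subtotal: float
    vat: float
    total: float


def price_order(unit_price: float, quantity: int, vat_rate: float = 0.20) -> Quote:
    """Provider-agnostic business logic: no AWS imports, no event shapes."""
    subtotal = round(unit_price * quantity, 2)
    vat = round(subtotal * vat_rate, 2)
    return Quote(subtotal=subtotal, vat=vat, total=round(subtotal + vat, 2))


def lambda_handler(event, context):
    """Thin AWS adapter: unpack the proxy event, call the core, wrap the response."""
    body = json.loads(event.get("body") or "{}")
    try:
        quote = price_order(float(body["unit_price"]), int(body["quantity"]))
    except (KeyError, TypeError, ValueError):
        return {"statusCode": 400, "body": json.dumps({"error": "invalid order"})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(asdict(quote)),
    }
```

Moving to another platform under this layout means rewriting `lambda_handler` only; `price_order` and its unit tests carry over unchanged.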
Debugging complexity: distributed serverless architectures with dozens of functions are harder to debug than monoliths. Invest in AWS X-Ray distributed tracing and structured logging from the start.
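Structured logging can be sketched with the standard library alone: each log line is a single JSON object that carries the request ID, so log queries can follow one request across functions. The `fields` key passed through `extra` is a convention invented for this sketch, not a feature of the `logging` module, and X-Ray instrumentation is not shown.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "timestamp": record.created,
        }
        # Merge structured fields supplied via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, default=str)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    stream_handler = logging.StreamHandler(stream)
    stream_handler.setFormatter(JsonFormatter())
    logger.addHandler(stream_handler)
    logger.propagate = False
    return logger


logger = build_logger("orders")


def handler(event, context):
    # aws_request_id is set on the real Lambda context; fall back when run locally.
    request_id = getattr(context, "aws_request_id", "local")
    logger.info(
        "order received",
        extra={"fields": {"request_id": request_id, "order_id": event.get("order_id")}},
    )
    return {"ok": True}
```

Because every line is valid JSON, fields such as `request_id` can be filtered on directly in a log query tool rather than extracted with regular expressions.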
Serverless Cost Model: The Break-Even Calculation
Lambda pricing (London region, 2026): approximately £0.000017 per GB-second of compute, plus £0.00000020 per request. A function using 512MB running for 500ms consumes 0.25 GB-seconds, so it costs £0.00000425 + £0.00000020 = approximately £0.00000445 per invocation.
For comparison, a t3.medium EC2 instance (2 vCPU, 4GB RAM) in London costs approximately £0.038/hour on-demand, or £0.024/hour with a 1-year Savings Plan. Running 24/7 (about 730 hours), that's roughly £27.70/month on-demand or £17.50/month on the Savings Plan. A Lambda function at 512MB handling 1 million requests/month of 500ms each costs approximately £4.45/month. Lambda wins comfortably at this volume.
But at 100 million requests/month? Lambda costs roughly £445/month, many times the cost of EC2 capacity sized for that load. Against the £17.50/month instance, the break-even for typical API endpoints at 512MB falls at roughly 4 million requests per month for 500ms invocations and roughly 17 million for 100ms invocations. Above that, ECS or EC2 typically wins on cost, provided the instances can actually absorb the traffic.
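The break-even reasoning can be reproduced with a short calculation. This is a sketch under stated assumptions: the unit prices are plain parameters, and the example values (£0.000017 per GB-second, £0.00000020 per request, £17.50/month of fixed compute) are assumptions to replace with current figures for your region. It also ignores the free tier, data transfer, and API Gateway charges.

```python
def lambda_monthly_cost(
    requests: int,
    memory_mb: int,
    duration_ms: float,
    price_per_gb_second: float,
    price_per_request: float,
) -> float:
    """Monthly Lambda cost: compute (GB-seconds) plus the per-request charge."""
    gb_seconds = requests * (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second + requests * price_per_request


def break_even_requests(
    fixed_monthly_cost: float,
    memory_mb: int,
    duration_ms: float,
    price_per_gb_second: float,
    price_per_request: float,
) -> float:
    """Requests per month at which Lambda costs the same as a fixed-price server."""
    per_invocation = lambda_monthly_cost(
        1, memory_mb, duration_ms, price_per_gb_second, price_per_request
    )
    return fixed_monthly_cost / per_invocation


if __name__ == "__main__":
    # Assumed example prices in GBP; substitute current regional pricing.
    GB_SECOND, REQUEST, SERVER = 0.000017, 0.00000020, 17.50
    for duration_ms in (100, 500):
        point = break_even_requests(SERVER, 512, duration_ms, GB_SECOND, REQUEST)
        print(f"{duration_ms}ms: break-even near {point:,.0f} requests/month")
```

Because the per-invocation cost is dominated by duration times memory, halving either one roughly doubles the break-even volume, which is why tuning memory size and execution time matters before concluding that Lambda is too expensive.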
When Serverless Wins
Event-driven processing: processing S3 file uploads, handling SQS messages, reacting to database change streams — these are perfect Lambda use cases. The function only runs when there's work to do.
API backends with variable traffic: a startup with unpredictable traffic that spikes during launches or promotions benefits enormously from serverless auto-scaling. You pay nothing during quiet periods and scale automatically during spikes.
Scheduled jobs: nightly reports, daily data sync, hourly health checks — EventBridge Scheduler + Lambda is far cheaper and simpler than maintaining a cron server.
Data processing pipelines: Lambda integrates natively with Kinesis, SQS, S3, DynamoDB Streams, and SNS. Fan-out processing, data validation, enrichment, and routing are all excellent Lambda use cases.
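To make the event-driven case concrete, here is a minimal sketch of a handler for S3 upload notifications. It relies on the documented S3 event shape (`Records`, `s3.bucket.name`, `s3.object.key`) and decodes the key, which S3 delivers URL-encoded. `process_upload` is a hypothetical placeholder for the real work.

```python
from urllib.parse import unquote_plus


def extract_uploads(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every S3 record in the event."""
    uploads = []
    for record in event.get("Records", []):
        if record.get("eventSource") != "aws:s3":
            continue  # ignore records from other sources
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces sent as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        uploads.append((bucket, key))
    return uploads


def process_upload(bucket: str, key: str) -> None:
    """Hypothetical placeholder: validate, transform, or route the object here."""
    print(f"processing s3://{bucket}/{key}")


def handler(event, context):
    """Lambda entry point for S3 notifications: runs only when files arrive."""
    uploads = extract_uploads(event)
    for bucket, key in uploads:
        process_upload(bucket, key)
    return {"processed": len(uploads)}
```

Keeping the parsing in `extract_uploads` separate from the work in `process_upload` makes the handler easy to unit test with a hand-written event, without touching S3.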
When Serverless Loses
Long-running processes: anything exceeding 15 minutes cannot run on Lambda. Video transcoding, ML model training, large file processing, and complex report generation need ECS, Batch, or EC2.
High-throughput consistent workloads: a high-traffic SaaS with millions of requests per hour is often cheaper on ECS with a Compute Savings Plan. Run the numbers before assuming serverless is cheapest at scale.
Latency-sensitive applications: cold starts, even mitigated, add variance to response times. Applications requiring consistent sub-100ms latency are better served by always-warm container services.
Stateful workflows: if your application needs to maintain session state, connections, or in-memory state between requests, Lambda is a poor fit. AWS Step Functions can orchestrate complex stateful workflows using Lambda functions as steps, but the design complexity increases significantly.
FastAPI on Lambda with Mangum
Mangum is an ASGI adapter that lets you run FastAPI (or any ASGI framework) on AWS Lambda with API Gateway. This is an excellent pattern for teams that want FastAPI's developer experience with serverless scaling and zero server management.
```text
# requirements.txt
fastapi==0.111.0
mangum==0.17.0
pydantic==2.7.1
```

```python
# main.py: FastAPI app adapted for Lambda via Mangum
import os
import uuid
from decimal import Decimal

import boto3
from fastapi import FastAPI, HTTPException
from mangum import Mangum
from pydantic import BaseModel

app = FastAPI(
    title="My API",
    version="1.0.0",
    # Set root_path for API Gateway stage
    root_path=os.getenv("API_GATEWAY_BASE_PATH", ""),
)


class Item(BaseModel):
    name: str
    price: float
    in_stock: bool = True


# DynamoDB resource, initialised outside the handler for warm reuse
dynamodb = boto3.resource("dynamodb", region_name="eu-west-2")
table = dynamodb.Table(os.environ["TABLE_NAME"])


@app.get("/health")
async def health_check():
    return {"status": "healthy"}


@app.get("/items/{item_id}")
async def get_item(item_id: str):
    response = table.get_item(Key={"pk": item_id})
    if "Item" not in response:
        raise HTTPException(status_code=404, detail="Item not found")
    return response["Item"]


@app.post("/items")
async def create_item(item: Item):
    item_id = str(uuid.uuid4())
    data = item.model_dump()
    # boto3's DynamoDB resource rejects Python floats, so store the price as a Decimal
    table.put_item(Item={"pk": item_id, **data, "price": Decimal(str(item.price))})
    return {"id": item_id, **data}


# Mangum adapter: this is the Lambda handler
# lifespan="off" skips ASGI lifespan (startup/shutdown) events, which this app does not use
handler = Mangum(app, lifespan="off")
```
Serverless vs Container vs VM: Full Comparison
| Dimension | Serverless (Lambda) | Container (ECS/EKS) | VM (EC2) |
|---|---|---|---|
| Ops overhead | Minimal | Medium | High |
| Scaling | Automatic (0 to thousands) | Configured auto-scaling | Manual or auto-scaling group |
| Cost model | Pay per execution | Pay per running container hour | Pay per instance hour |
| Cold start latency | 200ms–2s | None (always warm) | None (always warm) |
| Max execution time | 15 minutes | Unlimited | Unlimited |
| State management | Stateless — external only | Stateless recommended, in-memory possible | Stateful possible (caution) |
| Deployment speed | Seconds | Minutes (image build + push) | Minutes to hours |
| Vendor lock-in | High | Low (containers portable) | Medium |
| Local development | Complex (SAM local, etc.) | Excellent (Docker Compose) | Excellent |
| Best for | Event-driven, variable traffic, scheduled jobs | Web apps, APIs, microservices at scale | Legacy apps, compliance requirements, GPU workloads |
Not Sure If Serverless Is Right for Your Project?
SpiderHunts Technologies designs cloud architectures that match your actual workload patterns — serverless where it makes sense, containers or VMs where it doesn't. Get an architecture review and we'll give you a clear recommendation.