How to Scale a SaaS Platform From 100 to 100,000 Users

A practical, stage-by-stage guide to scaling a SaaS product — what to do at 1k, 10k, and 100k users, where the real bottlenecks are, and when to invest in infrastructure vs product.

By SpiderHunts Technologies  ·  23 May 2026  ·  10 min read

TL;DR

  • The database is almost always your first bottleneck — profile queries and add indexes before buying bigger servers
  • Make your application stateless from day one — this is the key enabler of horizontal scaling
  • Add Redis caching at 1k–5k users, read replicas at 10k–20k users
  • Use a CDN (Cloudflare) for static assets from day one — it's free and dramatically reduces server load
  • Move long-running work to background job queues — never do it in an API request

The Golden Rule: Don't Optimise Before You Need To

Premature optimisation is the biggest scaling mistake. Many SaaS products invest weeks in distributed systems and microservices architecture before they have 1,000 users — and never need them. Build a simple, well-structured monolith first. Optimise when you have measured evidence of a bottleneck, not when you imagine one.

The scaling stages below tell you what the real problems are at each tier — and what actually fixes them.

Stage 1: 0–1,000 Users — Survive Launch

Primary concern: Does the product work correctly? Can users sign up, pay, and complete the core workflow?

  • Single application server (1–2 vCPUs, 2–4GB RAM)
  • Managed PostgreSQL (smallest tier — 1 vCPU, 1–2GB)
  • Cloudflare CDN for static assets and DDoS protection (free tier)
  • Sentry for error tracking — you need to know when things break
  • Basic request logging so you can debug production issues
  • Infrastructure cost: ~£100–£200/month

Stage 2: 1,000–10,000 Users — Performance Matters

Primary concern: Slow pages and API timeouts. The database is now feeling the load.

  • Profile slow queries: Enable PostgreSQL's pg_stat_statements and identify queries over 100ms. Most slow queries have a missing index.
  • Add Redis caching: Cache expensive, frequently-accessed reads (dashboard aggregates, plan data, user profile) with a 60–300 second TTL.
  • Background jobs: Move email sending, PDF generation, AI inference, and webhook delivery to Celery workers. API responses stay under 200ms.
  • Upgrade database instance: Move to 2–4 vCPU PostgreSQL. Tune shared_buffers and work_mem.
  • Infrastructure cost: ~£400–£800/month
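
The caching bullet above follows the cache-aside pattern: check the cache, fall back to the database on a miss, then store the result with a TTL. Below is a minimal sketch of that flow. `TTLCache` is a hypothetical in-memory stand-in that only mimics the shape of Redis GET and SETEX so the example runs without a server; `get_dashboard_summary`, the key format, and the 120-second TTL are likewise assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for Redis GET / SETEX (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like a miss
            return None
        return value

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def get_dashboard_summary(
    cache: TTLCache,
    user_id: int,
    load_from_db: Callable[[int], str],
    ttl_seconds: int = 120,  # assumed value inside the 60-300 second range
) -> str:
    """Cache-aside read: serve from cache, otherwise load and store with a TTL."""
    key = f"dashboard:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)  # the expensive query runs only on a miss
    cache.setex(key, ttl_seconds, value)
    return value
```

With a real Redis client the two cache calls would be replaced by the client's own get and set-with-expiry operations; the control flow stays the same.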

Stage 3: 10,000–50,000 Users — Horizontal Scaling

Primary concern: A single application server is CPU- or memory-bound during peak hours.

  • Horizontal application scaling: Deploy 3–5 application server instances behind an AWS ALB or Nginx load balancer. This is only possible if your app is stateless (no session state held in application memory).
  • Read replica for PostgreSQL: Route SELECT queries that can tolerate slightly stale data (reports, dashboards, list views) to a read replica. Replicas lag the primary slightly, so keep read-after-write paths on the primary. This can remove 40–70% of the read load from the primary.
  • Auto-scaling groups: Set CPU threshold rules — automatically add instances when CPU > 70%, remove when CPU < 30%.
  • Connection pooling: Add PgBouncer (transaction mode) between app servers and PostgreSQL. Prevents connection exhaustion with many app instances.
  • Infrastructure cost: ~£1,000–£3,000/month
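
The read-replica rule above can be sketched as a small routing function. This is an illustration under stated assumptions, not a production router: `choose_database`, the `needs_fresh_read` flag, and the two target names are hypothetical, and the statement check is a deliberately simple prefix match.

```python
PRIMARY = "primary"
REPLICA = "replica"


def choose_database(sql: str, needs_fresh_read: bool = False) -> str:
    """Pick a target for one statement.

    Plain SELECTs that tolerate replication lag go to the replica.
    Writes, locking reads, and reads that must see the latest write
    stay on the primary.
    """
    statement = sql.lstrip().upper()
    is_read = statement.startswith("SELECT")
    takes_lock = "FOR UPDATE" in statement  # locking reads belong on the primary
    if is_read and not takes_lock and not needs_fresh_read:
        return REPLICA
    return PRIMARY


print(choose_database("SELECT * FROM reports WHERE month = '2026-05'"))
print(choose_database("UPDATE users SET plan = 'pro' WHERE id = 1"))
```

In practice most frameworks let you make this choice per query or per request, which is usually safer than parsing SQL text.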

Stage 4: 50,000–100,000 Users — Distributed Systems

Primary concern: Specific features or services become bottlenecks, and the monolith can't scale its parts independently.

  • Extract hot services: If one feature (e.g., AI processing, media handling) dominates server load, extract it to a separate service that can scale independently.
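  • Multi-region or database sharding: If you serve multiple geographies, consider multi-region deployments with regional PostgreSQL read replicas to reduce read latency. Treat sharding as a last resort, for when a single primary can no longer absorb the write load.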
  • Dedicated worker fleet: Scale background job workers separately from API servers based on queue depth.
  • CDN for API responses: Cache public API responses (e.g., public listing pages) at the CDN edge for sub-10ms response times globally.
  • Infrastructure cost: ~£5,000–£15,000/month
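
Scaling the worker fleet on queue depth comes down to simple arithmetic: how many workers are needed to drain the current backlog within a target time. The sketch below shows one way to compute it. The function name, the throughput figure, the drain target, and the min/max limits are all assumptions for illustration; real values should come from your own measurements.

```python
import math


def desired_workers(
    queue_depth: int,
    jobs_per_worker_per_minute: float,
    target_drain_minutes: float,
    min_workers: int = 1,
    max_workers: int = 20,
) -> int:
    """Workers needed to drain the backlog within the target, clamped to limits."""
    if jobs_per_worker_per_minute <= 0 or target_drain_minutes <= 0:
        raise ValueError("throughput and drain target must be positive")
    capacity_per_worker = jobs_per_worker_per_minute * target_drain_minutes
    needed = math.ceil(queue_depth / capacity_per_worker)
    return max(min_workers, min(max_workers, needed))


# 600 queued jobs, 30 jobs/min per worker, drain within 5 minutes
print(desired_workers(600, 30, 5))
```

The upper clamp matters as much as the formula: it stops a runaway queue from launching more workers than the database can serve.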

The Scaling Decisions That Matter Most

Decision | Impact | When
Stateless application servers | Enables horizontal scaling; must be decided at build time | Day 1
Database indexes on query-heavy columns | 10–100× query speedup; free performance gain | 1k–5k users
Redis caching for expensive reads | Reduces DB load 30–60% for read-heavy operations | 1k–5k users
Background job queues | Keeps API response times fast; prevents timeouts | Before launch
Read replica | Removes 40–70% of read load from primary | 10k–20k users
Horizontal app scaling | Linear throughput increase with server count | 10k–20k users
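
The index row in the table above is easy to check on your own schema by comparing the query plan before and after adding the index. The sketch below uses Python's built-in SQLite purely as a stand-in so it runs anywhere; on PostgreSQL you would inspect the plan with EXPLAIN instead. The `orders` table, its columns, and the index name are hypothetical.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


def build_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (user_id, total) VALUES (?, ?)",
        [(i % 500, i * 1.5) for i in range(5000)],
    )
    conn.commit()
    return conn


LOOKUP = "SELECT id, total FROM orders WHERE user_id = ?"

conn = build_demo_db()
plan_before = query_plan(conn, LOOKUP, (42,))  # full table scan
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(conn, LOOKUP, (42,))  # lookup through the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between database engines and versions; what matters is the change from a full scan to an index lookup.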

The Most Common Scaling Mistakes

Buying bigger servers instead of fixing the code

Vertical scaling (a bigger machine) can easily cost 10× more than fixing a missing index or an N+1 query problem. Profile first, scale second. Adding one missing index can make a query 100× faster, with no infrastructure change needed.
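
To make the N+1 problem concrete, the sketch below computes the same per-user totals two ways and counts the queries each approach issues. It uses Python's built-in SQLite as a stand-in so it runs without a database server; the `users` and `invoices` tables and the `CountingConnection` wrapper are hypothetical.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.query_count = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(user_count: int = 50) -> CountingConnection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoices (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
        """
    )
    ids = range(1, user_count + 1)
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                     [(i, f"user{i}") for i in ids])
    conn.executemany("INSERT INTO invoices (user_id, amount) VALUES (?, ?)",
                     [(i, 10 * i) for i in ids])
    conn.commit()
    return CountingConnection(conn)


def totals_n_plus_one(db: CountingConnection) -> dict:
    """One query for the users, then one more query per user."""
    totals = {}
    for user_id, name in db.query("SELECT id, name FROM users"):
        rows = db.query(
            "SELECT COALESCE(SUM(amount), 0) FROM invoices WHERE user_id = ?",
            (user_id,),
        )
        totals[name] = rows[0][0]
    return totals


def totals_single_query(db: CountingConnection) -> dict:
    """Same result from a single JOIN with GROUP BY."""
    rows = db.query(
        """
        SELECT u.name, COALESCE(SUM(i.amount), 0)
        FROM users u LEFT JOIN invoices i ON i.user_id = u.id
        GROUP BY u.id, u.name
        """
    )
    return dict(rows)
```

With 50 users the first version issues 51 queries and the second issues one. Each extra query also pays a network round trip, which is why the gap grows with the size of the list.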

Doing heavy work in API requests

Any operation that takes more than 500ms (AI inference, PDF generation, report aggregation, sending emails) must go into a background job. Synchronous long-running requests hold connections, exhaust thread pools, and cause timeouts under load.
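
The shape of the fix is the same whatever queue you use: the API handler records the job and returns at once, and a separate worker does the slow part. The sketch below shows that shape using only the standard library. The in-process queue and thread are stand-ins for a real broker and worker process such as Celery, and `handle_report_request`, `build_report`, and the `status` dictionary are hypothetical names.

```python
import queue
import threading
import uuid

jobs: "queue.Queue[tuple]" = queue.Queue()
status: dict = {}  # job_id -> "queued" | "done" | "failed"


def build_report(user_id: int) -> str:
    """Placeholder for slow work (PDF generation, AI inference, and so on)."""
    return f"report for user {user_id}"


def handle_report_request(user_id: int) -> dict:
    """API handler: enqueue the job and return immediately."""
    job_id = str(uuid.uuid4())
    status[job_id] = "queued"
    jobs.put((job_id, user_id))
    return {"job_id": job_id, "status": "queued"}  # client polls for the result


def worker() -> None:
    """Runs outside the request path and processes jobs one at a time."""
    while True:
        job_id, user_id = jobs.get()
        try:
            build_report(user_id)
            status[job_id] = "done"
        except Exception:
            status[job_id] = "failed"
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

response = handle_report_request(user_id=1)
jobs.join()  # only for the demo: wait until the worker has finished
print(response["status"], "->", status[response["job_id"]])
```

An in-process queue loses its jobs when the server restarts, so a production setup needs a durable broker; the handler and worker split is the part that carries over.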

Storing session state in application memory

If your application stores user sessions in memory, you can't scale horizontally: a user's next request might go to a different server with no knowledge of their session. Use JWTs or Redis-stored sessions from day one.
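
The reason signed tokens make servers interchangeable is that any instance holding the shared secret can verify a token without looking anything up. The sketch below illustrates that idea with a simplified HMAC-signed token built from the standard library. It is not a standards-compliant JWT, and in production a maintained JWT library is the safer choice; `issue_token`, `verify_token`, and the payload fields are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(secret: bytes, user_id: int, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Create a signed token carrying the user id and an expiry time."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued + ttl_seconds}).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[int]:
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        return None
    return claims["sub"]


# "Server A" issues the token; "server B" verifies it with the same secret.
shared_secret = b"replace-with-a-real-secret"
token = issue_token(shared_secret, user_id=42)
print(verify_token(shared_secret, token))
```

Redis-stored sessions reach the same goal differently: the session lives in a shared store rather than in the token, which makes revocation simpler at the cost of a lookup per request.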
