Load Balancer Capacity Calculator

A pool of servers behind a load balancer must carry peak traffic with headroom to spare and keep serving if a node fails. This calculator takes your peak requests per second (rps), the requests per second one server can handle, a target utilization and a redundancy count, then returns the number of servers to provision and the spare capacity the pool will have.

Capacity sizing formula

Effective per-server = per-server capacity * (target utilization / 100)
Servers for load = ceil(peak rps / effective per-server)
Total servers = servers for load + redundancy spares
Pool capacity = total servers * per-server capacity
Headroom = (pool capacity - peak rps) / pool capacity * 100

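The utilization target shrinks each server's usable share so the pool is not run flat out; the redundancy spares let the pool keep serving peak traffic if a node is lost. Headroom is measured against the full pool, spares included, with every server healthy.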
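The five formulas above can be sketched as a small Python function. This is a minimal illustration, not the calculator's actual implementation: the names `size_pool` and `PoolSize` and the input validation are assumptions added here.

```python
import math
from dataclasses import dataclass


@dataclass
class PoolSize:
    """Results of the sizing formulas (hypothetical container, not from the page)."""
    effective_per_server: float  # rps one server carries at the target utilization
    servers_for_load: int        # servers needed for peak traffic, before spares
    total_servers: int           # servers for load plus redundancy spares
    pool_capacity: float         # raw rps of the whole pool, spares included
    headroom_percent: float      # unused share of pool capacity at peak


def size_pool(peak_rps, per_server_rps, target_utilization_pct, spares):
    """Apply the sizing formulas in the order they are listed."""
    # Assumed validation: the formulas divide by these values.
    if peak_rps < 0 or per_server_rps <= 0 or spares < 0:
        raise ValueError("peak and spares must be >= 0, per-server capacity > 0")
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target utilization must be in (0, 100]")

    effective = per_server_rps * (target_utilization_pct / 100)
    servers_for_load = math.ceil(peak_rps / effective)
    total = servers_for_load + spares
    pool_capacity = total * per_server_rps
    # A zero-server pool (zero peak, zero spares) has no capacity to measure.
    headroom = (pool_capacity - peak_rps) / pool_capacity * 100 if total else 0.0
    return PoolSize(effective, servers_for_load, total, pool_capacity, headroom)


# Inputs from the worked example: 10,000 rps peak, 2,000 rps per server,
# 70 percent target utilization, one spare.
result = size_pool(10_000, 2_000, 70, 1)
print(result.total_servers, round(result.headroom_percent, 2))
```

Because the division uses floating point, an input that lands exactly on a whole number of servers could in rare cases round up by one; integer or rational arithmetic would avoid that if it matters.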

Worked example

Peak 10,000 rps, each server handles 2,000 rps, target utilization 70 percent, one redundancy spare.

Effective per-server = 2,000 * 0.70 = 1,400 rps
Servers for load = ceil(10,000 / 1,400) = ceil(7.14) = 8
Total servers = 8 + 1 = 9
Pool capacity = 9 * 2,000 = 18,000 rps
Headroom = (18,000 - 10,000) / 18,000 * 100 = 44.44 percent

Load balancer capacity: frequently asked questions

How many servers do I need behind a load balancer?

Divide the peak requests per second by the per-server capacity to get the raw server count, then divide by your target utilization expressed as a fraction (for example 0.70) so servers are not run flat out. Round up to a whole number of servers, then add redundancy (for example N+1) so the pool survives a node failure at peak.

Why divide by a target utilization?

Running servers at 100 percent leaves no headroom for traffic spikes, latency growth or background work. Sizing to a target such as 70 percent utilization means each server carries at most 70 percent of its maximum at peak, leaving at least 30 percent of slack.
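To see how the target changes the result, the sketch below sweeps several utilization targets for a fixed load. The 10,000 rps peak and 2,000 rps per server are taken from the worked example; the helper name `servers_for_load` and the set of targets are illustrative assumptions.

```python
import math

PEAK_RPS = 10_000        # peak load from the worked example
PER_SERVER_RPS = 2_000   # per-server capacity from the worked example


def servers_for_load(peak_rps, per_server_rps, target_utilization_pct):
    """Servers needed to carry peak at the given target, before spares."""
    effective = per_server_rps * (target_utilization_pct / 100)
    return math.ceil(peak_rps / effective)


# Map each target utilization (percent) to the server count it implies.
sweep = {
    pct: servers_for_load(PEAK_RPS, PER_SERVER_RPS, pct)
    for pct in (100, 90, 80, 70, 60, 50)
}

for pct, count in sweep.items():
    # Actual utilization at peak is at or below the target because of rounding up.
    actual = PEAK_RPS / (count * PER_SERVER_RPS) * 100
    print(f"target {pct:3d}% -> {count:2d} servers, {actual:.1f}% used at peak")
```

Lowering the target from 100 to 50 percent doubles the count from 5 to 10 servers in this example, which is the cost of the extra slack.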

What does N+1 redundancy mean?

N+1 means you provision one more server than the number strictly needed to carry the load, so that if a single server fails the remaining servers can still serve peak traffic. For higher resilience you can provision N+2 or more.
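One way to check a redundancy choice is to compute per-server utilization at peak once some servers are gone. The sketch below does that for the 9-server pool from the worked example. The function name `utilization_after_failures` is hypothetical, and the calculation assumes the load balancer spreads traffic evenly across surviving servers.

```python
def utilization_after_failures(peak_rps, per_server_rps, total_servers, failed):
    """Percent of each surviving server's capacity used at peak.

    Assumes traffic is spread evenly over the servers that remain.
    """
    surviving = total_servers - failed
    if surviving <= 0:
        raise ValueError("no servers left to carry traffic")
    return peak_rps / (surviving * per_server_rps) * 100


# Worked-example pool: 9 servers of 2,000 rps each, 10,000 rps peak.
for failed in (0, 1, 2):
    pct = utilization_after_failures(10_000, 2_000, 9, failed)
    print(f"{failed} failed -> {pct:.1f}% utilization per server")
```

With one server lost the survivors run at 62.5 percent, still under the 70 percent target. A second failure pushes them to about 71 percent, just over the target but well short of saturation.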

Is this the same as queueing-theory sizing?

No. This is straightforward throughput-based capacity planning. It does not model queue waiting time. For latency-sensitive systems also consider a queueing model, but throughput sizing is the standard first-pass capacity estimate.

Sources and notes

  • The sizing arithmetic (load divided by effective per-server capacity, plus redundancy) is standard throughput-based capacity planning.
  • Per-server capacity is measured by load-testing your own service; enter the figure you observe.

Reviewed by the CalculatorHub team, edited by James Graham, 19 June 2026. See our methodology.