Load Balancer Capacity Calculator
A pool of servers behind a load balancer has to carry peak traffic while leaving headroom and surviving a node failure. This calculator takes your peak requests per second (rps), the requests per second one server can handle, a target utilization and a redundancy count, then returns the number of servers to provision and the spare capacity the pool will have.
Capacity sizing formula
Effective per-server = per-server capacity * (target utilization / 100)
Servers for load = ceil(peak rps / effective per-server)
Total servers = servers for load + redundancy spares
Pool capacity = total servers * per-server capacity
Headroom = (pool capacity - peak rps) / pool capacity * 100
The utilization target shrinks each server's usable share so the pool is not run flat out; the redundancy spares keep peak service possible if a node is lost.
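As a rough illustration, the formula above can be sketched in Python. The function name `size_pool` and its parameter names are hypothetical, chosen for this example; they are not taken from the calculator itself.

```python
import math


def size_pool(peak_rps, per_server_rps, target_utilization_pct, spares):
    """Return (total_servers, pool_capacity_rps, headroom_pct).

    Hypothetical helper that follows the sizing formula step by step.
    """
    if peak_rps <= 0 or per_server_rps <= 0 or spares < 0:
        raise ValueError("peak and per-server rps must be positive, spares non-negative")
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target utilization must be in (0, 100]")

    # Usable capacity per server once the utilization target is applied.
    effective_per_server = per_server_rps * (target_utilization_pct / 100)
    # Whole servers needed to carry peak at that utilization.
    servers_for_load = math.ceil(peak_rps / effective_per_server)
    total_servers = servers_for_load + spares
    # Pool capacity uses the full per-server figure, not the derated one.
    pool_capacity = total_servers * per_server_rps
    headroom_pct = (pool_capacity - peak_rps) / pool_capacity * 100
    return total_servers, pool_capacity, headroom_pct
```

Called as `size_pool(10_000, 2_000, 70, 1)`, this sketch returns 9 servers, 18,000 rps of pool capacity and a headroom of about 44.44 percent.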
Worked example
Peak 10,000 rps, each server handles 2,000 rps, target utilization 70 percent. Effective per-server = 2,000 times 0.70 = 1,400 rps. Servers for load = ceil(10,000 / 1,400) = ceil(7.14) = 8. With one spare, total = 9 servers. Pool capacity = 9 times 2,000 = 18,000 rps. Headroom at peak = (18,000 minus 10,000) / 18,000 = 44.44 percent.
Load balancer capacity: frequently asked questions
How many servers do I need behind a load balancer?
Divide the peak requests per second by the per-server capacity to get the raw server count, then divide by your target utilization expressed as a fraction (0.70 for 70 percent) so servers are not run flat out. Round up to a whole number, then add redundancy spares (one spare for N+1, for example) so the pool survives a node failure at peak.
Why divide by a target utilization?
Running servers at 100 percent leaves no headroom for traffic spikes, latency growth or background work. Sizing to a target such as 70 percent utilization means that at peak each server handles at most 70 percent of its maximum, leaving at least 30 percent of slack.
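To see how the target changes the server count, here is a small sketch under assumed inputs (10,000 rps peak, 2,000 rps per server). The helper name `servers_for_load` is hypothetical and redundancy spares are left out.

```python
import math


def servers_for_load(peak_rps, per_server_rps, target_utilization_pct):
    # Usable capacity per server once the utilization target is applied.
    effective = per_server_rps * (target_utilization_pct / 100)
    return math.ceil(peak_rps / effective)


# Illustrative inputs only: 10,000 rps peak, 2,000 rps per server.
for pct in (100, 70, 50):
    count = servers_for_load(10_000, 2_000, pct)
    # Utilization each server actually sees at peak with that many servers.
    actual = 10_000 / (count * 2_000) * 100
    print(f"target {pct}%: {count} servers, {actual:.1f}% utilization at peak")
```

For these inputs the counts are 5, 8 and 10 servers. Because the count is rounded up, the utilization actually reached at peak can sit below the target: 62.5 percent rather than 70 in the middle case.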
What does N+1 redundancy mean?
N+1 means you provision one more server than the number strictly needed to carry the load, so that if a single server fails the remaining servers can still serve peak traffic. For higher resilience you can provision N+2 or more.
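One way to check a redundancy choice is to compute per-server utilization at peak after some servers are lost. The sketch below assumes the load balancer spreads traffic evenly across the remaining servers; the function name `utilization_after_failures` is hypothetical.

```python
def utilization_after_failures(peak_rps, per_server_rps, total_servers, failed):
    """Per-server utilization (percent) at peak once `failed` servers are lost.

    Assumes traffic is spread evenly over the surviving servers.
    """
    remaining = total_servers - failed
    if remaining <= 0:
        raise ValueError("no servers left to carry traffic")
    return peak_rps / (remaining * per_server_rps) * 100


# Illustrative pool: 9 servers of 2,000 rps each, 10,000 rps peak.
for failed in (0, 1, 2):
    pct = utilization_after_failures(10_000, 2_000, 9, failed)
    print(f"{failed} failed: {pct:.2f}% utilization at peak")
```

With these figures a single failure leaves the pool at 62.5 percent, still under a 70 percent target. A second failure pushes it to roughly 71.43 percent, which is over the target but below saturation.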
Is this the same as queueing-theory sizing?
No. This is throughput-based capacity planning: it compares peak load against server capacity and does not model queue waiting time. For latency-sensitive systems, also consider a queueing model; throughput sizing remains the standard first-pass capacity estimate.
Sources and notes
- The sizing arithmetic (load divided by effective per-server capacity, plus redundancy) is standard throughput-based capacity planning.
- Per-server capacity is measured by load-testing your own service; enter the figure you observe.
Reviewed by the CalculatorHub team, edited by James Graham, 19 June 2026. See our methodology.