Mean Time to Recovery (MTTR) Calculator
Mean Time to Recovery (MTTR) is a DevOps and site reliability engineering (SRE) metric that measures how quickly a team restores service after an incident. It is one of the four DORA key metrics, which DORA's research links to both IT and organizational performance. This calculator takes the durations of recent incidents in minutes, computes the mean (average) and 90th percentile recovery time, classifies your performance against DORA's Elite/High/Medium/Low performer thresholds, and estimates annual downtime at your current MTTR and incident frequency.
MTTR formula
MTTR = total_downtime_minutes / number_of_incidents
annual_downtime_hrs = MTTR_mins * incidents_per_month * 12 / 60
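As a rough illustration of the two formulas above, the following Python sketch computes the mean, a 90th percentile, and the annual downtime from a list of incident durations. The function names, the sample durations, and the nearest-rank percentile method are assumptions made for this example; the percentile method the calculator itself uses is not stated here.

```python
def mttr_minutes(durations):
    """Mean recovery time: total downtime divided by the incident count."""
    if not durations:
        raise ValueError("at least one incident duration is required")
    return sum(durations) / len(durations)


def p90_minutes(durations):
    """90th percentile using the nearest-rank method (an assumed choice)."""
    if not durations:
        raise ValueError("at least one incident duration is required")
    ordered = sorted(durations)
    # 1-based rank = ceil(0.9 * n), computed with integers to avoid float rounding.
    rank = (9 * len(ordered) + 9) // 10
    return ordered[rank - 1]


def annual_downtime_hours(mttr_mins, incidents_per_month):
    """annual_downtime_hrs = MTTR_mins * incidents_per_month * 12 / 60"""
    return mttr_mins * incidents_per_month * 12 / 60


# Hypothetical incident durations in minutes.
durations = [30, 45, 20, 120, 35]

mttr = mttr_minutes(durations)            # 250 / 5 = 50.0 minutes
p90 = p90_minutes(durations)              # 120 minutes (the slowest of five)
annual = annual_downtime_hours(mttr, 4)   # 50 * 4 * 12 / 60 = 40.0 hours

print(f"MTTR: {mttr} min, p90: {p90} min, annual downtime: {annual} h")
```

In this sample the single 120-minute incident pulls the 90th percentile well above the 50-minute mean, which is why it can help to read both numbers together.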
DORA MTTR performance thresholds
- Elite: MTTR less than 1 hour (60 minutes).
- High: MTTR less than 1 day (counted here as 480 minutes, one 8-hour business day).
- Medium: MTTR between 1 day and 1 week.
- Low: MTTR more than 1 week.
- Elite teams typically reduce MTTR through automated rollback, canary deployments, and thorough incident runbooks.
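The thresholds above can be sketched as a small classification function. This is a minimal example, not the calculator's actual code: the name `classify_mttr` is hypothetical, the 480-minute day follows the list above, and the 5-day business week is an assumption, since the list does not say how many minutes a week contains. Both values are parameters so a 24-hour day and 7-day week can be used instead.

```python
def classify_mttr(mttr_mins, day_minutes=480, days_per_week=5):
    """Map an MTTR in minutes to a DORA performance tier.

    day_minutes=480 follows the 8-hour business day in the list above.
    days_per_week=5 is an assumption; pass 1440 and 7 for calendar time.
    """
    if mttr_mins < 0:
        raise ValueError("MTTR cannot be negative")
    week_minutes = day_minutes * days_per_week
    if mttr_mins < 60:
        return "Elite"
    if mttr_mins < day_minutes:
        return "High"
    if mttr_mins <= week_minutes:
        return "Medium"
    return "Low"


print(classify_mttr(50))    # Elite
print(classify_mttr(300))   # High
print(classify_mttr(1000))  # Medium with business-day defaults
print(classify_mttr(1000, day_minutes=1440, days_per_week=7))  # High with calendar time
```

The last two calls show why the day-length convention matters: the same 1000-minute MTTR lands in different tiers depending on whether a day is 8 or 24 hours.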
Frequently asked questions
What is mean time to recovery (MTTR)?
MTTR is the average time it takes to restore service after a failure or outage. It is calculated as the total downtime across all incidents divided by the number of incidents. MTTR is one of the four DORA key metrics for DevOps performance and a key indicator of an organization's ability to recover from failures.
What is a good MTTR?
DORA's 2023 State of DevOps report classifies elite performers as having MTTR of less than 1 hour. High performers recover in less than 1 day. Medium performers take 1 day to 1 week. Low performers take more than 1 week. Reducing MTTR requires investment in monitoring, on-call processes, runbooks, and deployment rollback capabilities.
What is the difference between MTTR and MTBF?
MTTR measures how long it takes to fix a failure. MTBF (Mean Time Between Failures) measures how long a system runs between failures. Together they determine availability: Availability = MTBF / (MTBF + MTTR). A system with high MTBF (rare failures) and low MTTR (fast recovery) has the highest availability.
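To make the availability formula concrete, here is a short Python sketch. The function name and the sample figures are hypothetical; MTBF and MTTR only need to share the same unit (minutes here).

```python
def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR), with both in the same unit."""
    if mtbf < 0 or mttr < 0 or mtbf + mttr == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf / (mtbf + mttr)


# Hypothetical system: runs 9950 minutes between failures, takes 50 to recover.
baseline = availability(mtbf=9950, mttr=50)   # 9950 / 10000 = 0.995
faster = availability(mtbf=9950, mttr=25)     # same failure rate, half the MTTR

print(f"baseline: {baseline:.4%}, with halved MTTR: {faster:.4%}")
```

Halving MTTR while MTBF stays fixed raises availability, which matches the point that fast recovery and rare failures both contribute.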
How do I reduce MTTR?
Key strategies include: better monitoring and alerting to detect failures faster, on-call runbooks that enable fast diagnosis, automated rollback for bad deployments, chaos engineering to find weaknesses before they cause outages, post-incident reviews (blameless retrospectives) to identify and fix root causes, and feature flags to disable broken features without redeployment.
What is the difference between MTTR, MTTD, and MTTF?
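MTTD (Mean Time to Detect) measures how long it takes to notice a failure. MTTR (Mean Time to Recovery) measures the full restoration time, from failure to resolution, so it includes the detection time. MTTF (Mean Time to Failure) measures the expected lifespan of non-repairable components. For repairable systems, when MTBF is counted from the start of one failure to the start of the next, MTBF = MTTF + MTTR. Under that convention availability is MTTF / (MTTF + MTTR), whereas the formula Availability = MTBF / (MTBF + MTTR) treats MTBF as uptime only.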
Official sources
- DORA: DORA Research Program - State of DevOps Reports.
- IEEE: IEEE 610.12 - Standard Glossary of Software Engineering Terminology.
Reviewed by the CalculatorHub team, edited by James Graham, 14 June 2026. See our methodology.