It’s easy to get scared away by acronyms. And that’s fair!

But once you learn what they stand for and where they come from, it’s almost inconvenient to use them in their full form.

It’s the same for MTBF, MTTF, and MTTR. They make a lot of sense when we break them down:

MTBF (Mean Time Between Failures)

Mean Time Between Failures (MTBF) is the average time between failures of a repairable device or asset. It measures the reliability and availability of that device or asset. The higher the MTBF value, the more reliable the system.

The aim is to have as high an MTBF as possible, ideally in the hundreds or thousands of hours.

Benefits of Calculating MTBF

There are a few benefits to calculating MTBF:

MTBF Calculation Example

Here is the equation to calculate MTBF:

MTBF = Total Uptime / # of Failures

Imagine that a production line is scheduled to run 130 hours in a week and has 4 outages during that time. The first two outages last 2 hours each and the other two last 3 hours each.

Total Working Time: 130 Hours

Number of Failures: 4 Outages

Total Failure Time: (2 × 2 Hours) + (2 × 3 Hours) = 10 Hours

(130 - 10) / 4 = 30

MTBF = 30 hours

This means that when the operation is live, the average time between failures is 30 hours. If we go a step further and calculate the failure rate, it would be:

Failure Rate = 1 / MTBF

1 / 30 ≈ 0.033 failures per hour
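As a quick check, the MTBF and failure rate arithmetic above can be sketched in Python. The function name and variable names are illustrative assumptions, not part of any particular tool; the figures are the ones from the example.

```python
def mtbf_hours(total_time_hours: float, outage_hours: list[float]) -> float:
    """MTBF = total uptime / number of failures."""
    if not outage_hours:
        raise ValueError("MTBF is undefined when there are no failures")
    # Uptime is the scheduled running time minus the time lost to outages.
    uptime = total_time_hours - sum(outage_hours)
    return uptime / len(outage_hours)


# Example from the text: 130 hours in the week, outages of 2, 2, 3, and 3 hours.
outages = [2, 2, 3, 3]
weekly_mtbf = mtbf_hours(130, outages)
failure_rate = 1 / weekly_mtbf

print(f"MTBF: {weekly_mtbf:.1f} hours")
print(f"Failure rate: {failure_rate:.3f} failures per hour")
```

Separating uptime from total scheduled time is the step that is easiest to miss: dividing the full 130 hours by 4 would overstate the MTBF.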

MTTF (Mean Time To Failure)

Mean Time To Failure (MTTF) is the average time until a non-repairable device or asset fails. It measures how long a device or asset can reliably be used before failing completely, and it helps operators predict when they should expect to replace it or run regular diagnostics. It’s synonymous with device lifespan.

Obviously, the longer the MTTF, the less a company has to spend on replacing that device or asset.

Benefits of Calculating MTTF

As with MTBF, there are benefits to calculating MTTF:

  • Measure the reliability of a device/asset

  • Insight into which device/asset would best fit production

MTTF Calculation Example

Here is the equation to calculate MTTF:

MTTF = Total Lifespan of Devices or Assets / # of Devices or Assets

Imagine that a production line has a total of 3 devices of the same kind. Device one completely failed at 5,200 hours, device two at 4,200 hours, and the third at 5,600 hours.

Total Lifespan of Devices: 5,200 + 4,200 + 5,600 = 15,000 hours

Number of Devices: 3

15,000 / 3 = 5,000

MTTF = 5,000 hours

This means that this particular device has an average lifespan of 5,000 hours. Using this metric, companies can determine whether this brand of device or asset is right for their production or whether they need to switch to a longer-lasting, higher-performance solution.
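The same MTTF calculation can be written as a short Python sketch. The function and variable names are assumptions made for illustration; the lifespans are the three from the example.

```python
def mttf_hours(lifespans_hours: list[float]) -> float:
    """MTTF = total lifespan of all devices / number of devices."""
    if not lifespans_hours:
        raise ValueError("MTTF needs at least one device lifespan")
    return sum(lifespans_hours) / len(lifespans_hours)


# Hours at which each of the three devices failed completely.
lifespans = [5200, 4200, 5600]
mttf = mttf_hours(lifespans)

print(f"MTTF: {mttf:,.0f} hours")
```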

MTTR (Mean Time To Repair)

MTTR can stand for several different things: Mean Time to Repair, Recovery, Resolution, Resolve, Restore, or Respond. Of the six, the most commonly used are Repair and Recovery.

MTTR is the average time required to repair a failed device or asset that is ‘repairable’. It is measured from the moment operations personnel identify an unplanned failure to the moment the failure has been corrected and the device or asset is up and running again.

Benefits of Calculating MTTR

There are several benefits to calculating MTTR:

  • Understand the operation’s capacity to react to failures

  • Identify frequent repair incidents and plan accordingly

  • Measure against previous MTTR to shorten downtime

MTTR Calculation Example

Here is the equation to calculate MTTR:

MTTR = Total Time Spent Repairing / # of Repairs

Imagine that a production line has 3 devices that went down. The first one was down for 4 hours, the second 2 hours, and the third 3.

Total Time Spent Repairing: 4 + 2 + 3 = 9

Number of Repairs: 3

9 / 3 = 3

MTTR = 3 hours

This means that the average repair time across all three devices is 3 hours. MTTR can vary drastically with the type of device, the industry, and the size of the production line. However, as a general rule, a good average is 5 hours or less.
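In practice, repair times usually come from timestamps rather than pre-computed hours. The following is a minimal sketch, assuming each repair is recorded as a pair of (failure identified, asset restored) timestamps; the dates and the function name are hypothetical and chosen so the durations match the 4, 2, and 3 hours in the example.

```python
from datetime import datetime


def mttr_hours(repairs: list[tuple[datetime, datetime]]) -> float:
    """MTTR = total time spent repairing / number of repairs."""
    if not repairs:
        raise ValueError("MTTR is undefined when there are no repairs")
    total_seconds = sum(
        (restored - identified).total_seconds()
        for identified, restored in repairs
    )
    return total_seconds / 3600 / len(repairs)


# Hypothetical (identified, restored) timestamps for the three repairs.
repairs = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 13, 0)),      # 4 hours
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 16, 0)),     # 2 hours
    (datetime(2024, 1, 11, 7, 30), datetime(2024, 1, 11, 10, 30)),  # 3 hours
]

mttr = mttr_hours(repairs)
print(f"MTTR: {mttr:.1f} hours")
```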

Other Incident Metrics: MTTD, MTTA, MDT

Although MTBF, MTTF, and MTTR are the three main incident metrics, here are some others you may come across in operations.

MTTD (Mean Time to Detect)

Mean time to detect is the average time it takes for a system to detect a device or asset failure from the moment the failure occurs.

MTTD = total time from actual failure to failure detection / # of failures

MTTA (Mean Time to Acknowledge)

Mean time to acknowledge is the average time it takes for repair work to begin from when the device or asset failed.

MTTA = total time it takes to acknowledge failures / # of failures
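MTTD and MTTA follow the same pattern as the other metrics: sum the relevant intervals and divide by the number of failures. Here is a minimal sketch that follows the definitions given above (MTTA measured from the failure to the start of repair work). The `Incident` record, its field names, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    failed_at: datetime          # when the failure actually occurred
    detected_at: datetime        # when the system detected it
    repair_started_at: datetime  # when repair work began


def mean_minutes(durations: list[timedelta]) -> float:
    """Average a list of durations and return the result in minutes."""
    if not durations:
        raise ValueError("need at least one duration")
    return sum(d.total_seconds() for d in durations) / 60 / len(durations)


# Hypothetical incident log for one day.
incidents = [
    Incident(datetime(2024, 1, 8, 8, 0), datetime(2024, 1, 8, 8, 10), datetime(2024, 1, 8, 8, 40)),
    Incident(datetime(2024, 1, 8, 13, 0), datetime(2024, 1, 8, 13, 20), datetime(2024, 1, 8, 13, 50)),
    Incident(datetime(2024, 1, 8, 22, 0), datetime(2024, 1, 8, 22, 30), datetime(2024, 1, 8, 23, 30)),
]

mttd = mean_minutes([i.detected_at - i.failed_at for i in incidents])
mtta = mean_minutes([i.repair_started_at - i.failed_at for i in incidents])

print(f"MTTD: {mttd:.0f} minutes")
print(f"MTTA: {mtta:.0f} minutes")
```

Some teams measure MTTA from the moment an alert is raised rather than from the failure itself; in that case the second calculation would subtract `detected_at` instead of `failed_at`.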

MDT (Mean Down Time)

Mean downtime is simply the average total time that a device or asset is down. This measurement includes both scheduled and unscheduled downtime.
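The article gives no formula for MDT, so the sketch below assumes the usual form: total downtime divided by the number of downtime events, with scheduled and unscheduled events counted together. The event list and names are hypothetical.

```python
def mean_downtime_hours(downtime_events: list[tuple[str, float]]) -> float:
    """MDT = total downtime / number of downtime events (assumed formula)."""
    if not downtime_events:
        raise ValueError("MDT is undefined when there are no downtime events")
    # Scheduled and unscheduled events are deliberately not filtered apart.
    total = sum(hours for _kind, hours in downtime_events)
    return total / len(downtime_events)


# Hypothetical (kind, hours down) records for one asset.
events = [
    ("scheduled", 6.0),    # planned maintenance
    ("unscheduled", 2.0),  # unplanned failure
    ("unscheduled", 4.0),  # unplanned failure
]

mdt = mean_downtime_hours(events)
print(f"MDT: {mdt:.1f} hours")
```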

Incident Metrics on Autopilot with Digital Solutions

Incident metrics may not tell the full story of how a failure occurred, but they are the most important performance indicators of how well an operations line is optimizing its production. Ideally, operators should try to shave down the average times of incident metrics over time.

One way of doing this is by putting incident metrics on autopilot using digital solutions like Tulip.

Tulip collects data from operations personnel, machines, and tools during production, so you can get an accurate view of incident metrics like MTBF, MTTF, MTTR, and more. You can either hook up IoT devices to track when assets go down or have operators directly enter asset failures through the Tulip app. With the data in hand, you can conduct analyses into the effects of your continuous improvement efforts over time by using Tulip’s real-time analytics tools.

To increase production visibility, you can also embed the analytics into your app to create a dashboard of your incident metrics over time and monitor their improvements by shifts and lines.

See How Tulip can Help Your Operations Calculate Incident Metrics on Autopilot

Learn how you can gain real-time visibility into production and improve traceability processes with a 30-day free trial.
