Chapter One: Defining Root Cause and the Benefits of Root Cause Analysis (RCA)
What is a Root Cause?
A root cause is defined as an event or factor that results in nonconformity within a manufacturing environment. Once a root cause is identified, manufacturers should take steps to permanently eliminate the cause through continuous process improvement.
A root cause isn’t just any contributing factor. It needs to be an underlying cause, so that the person identifying the source of the problem can prevent it from occurring in the future. Furthermore, the root cause needs to be identifiable.
Identifying the source of nonconformities through Root Cause Analysis
A Root Cause Analysis (RCA) is one of the primary tools used by manufacturers to identify the contributing factors to quality issues within their operations.
Root cause analyses are often conducted by process engineers in manufacturing to identify what, how, or why a precipitating event occurred.
This approach to problem-solving is usually used when the consequence involves a “safety, health, environmental, quality, reliability, or production impact”.
Identifying the underlying cause of the problem, or root cause, empowers the analyst to identify and implement a potential solution to the problem. These solutions vary and can include process change or other remedies. Unlike leading indicator-based analysis, root cause analysis is a reaction to an existing or historical problem. The goal is to prevent it from happening again in the future.
If a root cause analysis isn’t turning up a true root cause of the problem, the analyst should consider leveraging another analytical tool to solve the problem.
The root cause also needs to be within management’s control to fix. For example, if a change to trade policy cut off the only source of a specific material, that is out of management’s control. They don’t set government policy! Finally, in order to be a root cause, the issue needs to have a solution that will prevent recurring issues. If the solution won’t prevent recurrences, you are likely looking at a causal factor rather than a root cause.
Chapter Two: How to Complete a Root Cause Analysis
Before you can complete a root cause analysis, you must collect as much data as possible about the events and people involved in the lead-up.
If you're primarily basing your analysis on employee testimony, it’s important that you establish a framework for collecting hard data as soon as possible. People can be unreliable, and their memories are vulnerable to suggestion, especially after time has passed.
The next steps you should take to complete a root cause analysis will depend on your approach. There are generally 5 popular approaches to root cause analysis:
- Events and causal factor analysis
- Change analysis
- Barrier analysis
- Risk tree analysis
- Kepner-Tregoe problem solving
In the next section, we'll dive into the 5 different approaches that manufacturers can take to conduct a root cause analysis.
Chapter Three: Types of Root Cause Analysis
What is a Causal Factor Analysis?
Causal factor analysis requires identifying all of the contributing events that led to the problem. Avoid the pitfall of focusing on the most obvious, final contributors. One of the ways to delve into all of the causes is by leveraging the “5 Whys”. The 5 Whys is “an iterative interrogative technique used to explore the cause-and-effect relationships underlying a particular problem.”
For example, imagine your cost of scrap increased over the last quarter. If you were creating a causal factor analysis for this increase, you might start with the most obvious question. What changed during that time period? The answer might be that a specific line is producing more scrap. If you ask again–why is that line generating more scrap?–you might uncover that there has been significant operator turnover over the last period. Ask why again and you could learn that a few of your experienced operators retired.
Continue this process long enough and you will uncover many more contributing factors, with far more detail than “there was an increase in scrap”. This line of questioning is called the “5 Whys” because five rounds of “why” is a common benchmark for how far to push the questioning. Using the 5 Whys can help you identify the causal factors that contributed to the problem you would like to prevent in the future.
For basic challenges, the 5 Whys themselves can be enough to get to the root cause of the problem. In a simple, non-technical case, such as a gate that only stays shut when a brick is propped against it, a clear solution would be to replace the latch so the gate closes without needing the brick.
With more sophisticated problems, the 5 Whys might not be enough to solve the root of the issue. Let’s add some color to the example above. What happens if you factor in that the plant waited to hire new team members until a week before the experienced operators retired? What if new operators usually had a couple of months of training before having that same level of responsibility?
It’s not just the operators that caused the problem. There was a series of causal factors that influenced the poor performance.
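The scrap example above can be written down as a simple chain of questions and answers. The following is only a minimal sketch: the `why_chain` data restates the example from this section, and the helper names `format_why_chain` and `deepest_cause` are hypothetical, chosen for illustration.

```python
# Each entry pairs a "why" question with the answer uncovered for it.
# The content restates the scrap example; the structure is an assumption.
why_chain = [
    ("Why did the cost of scrap increase?",
     "A specific line is producing more scrap."),
    ("Why is that line generating more scrap?",
     "There has been significant operator turnover."),
    ("Why was there operator turnover?",
     "A few experienced operators retired."),
]


def format_why_chain(problem, chain):
    """Return the chain as indented text lines, one level deeper per 'why'."""
    lines = [f"Problem: {problem}"]
    for depth, (question, answer) in enumerate(chain, start=1):
        lines.append(f"{'  ' * depth}Why {depth}: {question} -> {answer}")
    return lines


def deepest_cause(chain):
    """Return the answer to the last 'why' asked so far."""
    return chain[-1][1]


for line in format_why_chain("Cost of scrap increased last quarter", why_chain):
    print(line)
```

The last answer in the chain is only the deepest cause found so far. As the retirement and training details show, it may still be one causal factor among several rather than the root cause.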
What is Change Analysis?
Change Analysis is a root cause analysis technique that focuses on a specific problem or problematic event. This type of analysis seeks to expose which deviation from the regular procedure, or change, drove the unfavorable event. This is the type of analysis manufacturing folks typically think of when discussing root cause analysis.
Change analysis is easy to learn and apply. Looking for a deviation from a norm also results in clear corrective action. This provides concrete next steps for anyone conducting the analysis. Furthermore, it makes it easier to detect unusual root causes.
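At its core, change analysis compares the conditions of the problem run against the standard procedure and lists what differs. A minimal sketch of that comparison follows. The parameter names and values are hypothetical (the training figures loosely follow the operator example from the previous section), and `find_deviations` is an illustrative helper, not part of any particular tool.

```python
def find_deviations(standard, actual):
    """Return {parameter: (expected, observed)} for every parameter that differs.

    Parameters missing on either side show up with None on that side.
    """
    deviations = {}
    for name in sorted(set(standard) | set(actual)):
        expected = standard.get(name)
        observed = actual.get(name)
        if expected != observed:
            deviations[name] = (expected, observed)
    return deviations


# Hypothetical standard procedure versus the conditions during the problem.
standard_run = {
    "operator_training_weeks": 8,
    "line_speed_ppm": 40,
    "material_supplier": "A",
}
problem_run = {
    "operator_training_weeks": 1,
    "line_speed_ppm": 40,
    "material_supplier": "A",
}

deviations = find_deviations(standard_run, problem_run)
print(deviations)
```

Each deviation is a candidate cause to test, not a conclusion. The comparison is only as good as the definition of the standard run.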
Consider applying a different type of root cause analysis if your standard process isn’t well-defined enough to provide a good basis for comparison. Also, depending on how variable your processes are, the number of moving parts might significantly increase the scope of this type of analysis.
Whether you decide to apply this type of analysis or another form of root cause analysis, make sure to test your assumptions. In the worst-case scenario, you’ll determine that your hypothesis is inconclusive or fail to find an actual root cause. This result, while unpleasant, is better than drawing an incorrect conclusion that causes additional issues in the future.
What is Barrier Analysis?
Barrier analysis is a systematic process used to identify failures of physical, administrative, and procedural barriers that should have prevented the adverse event. This analysis identifies why the barriers failed and determines which types of corrective action are needed to prevent them from failing again in the future.
Start your barrier analysis by identifying all of the barriers that were in place before the adverse event occurred. Review each barrier to determine if it was functioning under normal operating conditions. If there was a deviation in operating conditions, was it performing its intended function under these conditions? Did the barrier help decrease the total cost of the adverse event? Was the barrier’s design strong enough to fulfill its intended purpose? Finally, review whether it was built, maintained, and inspected appropriately leading up to the event.
Use these questions with each barrier to identify how the barriers failed to prevent the event. Note that this may not be the best type of root cause analysis depending on what you are investigating and the state of the existing process or setup leading up to your event.
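One way to keep the review consistent is to record the answers to the same questions for every barrier and then list the questions each barrier failed. The sketch below assumes yes/no answers. The field names and the two example barriers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BarrierReview:
    """Answers to the review questions for one barrier (illustrative fields)."""
    name: str
    kind: str                  # "physical", "administrative", or "procedural"
    performed_function: bool   # did it do its job under the conditions at the time?
    reduced_event_cost: bool   # did it help decrease the total cost of the event?
    design_adequate: bool      # was the design strong enough for its purpose?
    properly_maintained: bool  # built, maintained, and inspected appropriately?

    def failed_checks(self):
        """Return the names of the review questions this barrier did not pass."""
        checks = {
            "performed_function": self.performed_function,
            "reduced_event_cost": self.reduced_event_cost,
            "design_adequate": self.design_adequate,
            "properly_maintained": self.properly_maintained,
        }
        return [label for label, passed in checks.items() if not passed]


def barriers_needing_action(reviews):
    """Map each barrier with at least one failed check to its failed checks."""
    return {r.name: r.failed_checks() for r in reviews if r.failed_checks()}


# Hypothetical barriers that were in place before the adverse event.
reviews = [
    BarrierReview("Machine guard", "physical", True, True, True, True),
    BarrierReview("Changeover checklist", "procedural", False, False, True, False),
]
action_items = barriers_needing_action(reviews)
print(action_items)
```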
What is Risk Tree Analysis?
Risk tree analysis, like the previous two analyses we’ve reviewed, is used to analyze the effects of a failed system after an adverse event has occurred. Event trees were developed during the WASH-1400 nuclear power plant safety study in 1974. A fault tree analysis under certain circumstances becomes large and unruly. The event tree was developed to help identify which pathway creates the most significant risk for a failure in a system without requiring each path to be mapped out in the tree.
The risk tree analysis has a few benefits. First, it helps you identify multiple coexisting contributors to failure. This provides multiple layers of detail. On the flip side, the amount of detail available in this analysis can make it easy to overlook subtle differences between branches. Also, this is a more complex form of root cause analysis. The person conducting the analysis needs training and some experience to ensure success.
What is the Kepner-Tregoe method?
The Kepner-Tregoe method of root cause analysis became famous when NASA used it to bring the Apollo 13 team home. It’s a structured methodology for gathering, prioritizing, and evaluating information. Like other forms of root cause analysis, the Kepner-Tregoe method is a systematic approach to solving a problem and analyzing risk.
The first step in this methodology is to identify problems and classify them by level of concern. Then, set the priority level by potential impact, urgency, and growth. Next, decide what action to take or which step to take next. Finally, make a plan for who will be involved, what they will do, where they’re involved, and when they take part. Be sure to scope the extent of each person’s involvement.
The next step to applying this analysis is to determine which objectives must be accomplished, as well as which ones you want to accomplish that aren’t absolutely necessary. This will help you evaluate your options against your objectives so you can determine the best possible choice of action.
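The two steps above, prioritizing concerns and then weighing options against objectives, can be sketched in a few lines. This is a simplified illustration, not the official Kepner-Tregoe worksheet. The 1 to 3 rating scales, the choice to sum the three ratings, the weights, and all example data are assumptions.

```python
def rank_concerns(concerns):
    """Sort concerns by impact + urgency + growth, highest first."""
    return sorted(
        concerns,
        key=lambda c: c["impact"] + c["urgency"] + c["growth"],
        reverse=True,
    )


def best_option(options, musts, want_weights):
    """Drop options that miss any must, then pick the best weighted-want score."""
    viable = [o for o in options if all(o["meets"].get(m, False) for m in musts)]
    if not viable:
        return None

    def score(option):
        return sum(
            weight * option["ratings"].get(want, 0)
            for want, weight in want_weights.items()
        )

    return max(viable, key=score)["name"]


# Hypothetical concerns rated 1 (low) to 3 (high).
concerns = [
    {"name": "Late supplier delivery", "impact": 2, "urgency": 3, "growth": 1},
    {"name": "Scrap increase on one line", "impact": 3, "urgency": 2, "growth": 3},
    {"name": "Label misprints", "impact": 1, "urgency": 1, "growth": 1},
]

# Hypothetical options: "meets" covers the musts, "ratings" covers the wants.
options = [
    {"name": "Extend operator training",
     "meets": {"no_safety_risk": True, "within_budget": True},
     "ratings": {"fast_to_implement": 2, "low_disruption": 3}},
    {"name": "Replace the line",
     "meets": {"no_safety_risk": True, "within_budget": False},
     "ratings": {"fast_to_implement": 1, "low_disruption": 1}},
    {"name": "Add end-of-line inspection",
     "meets": {"no_safety_risk": True, "within_budget": True},
     "ratings": {"fast_to_implement": 3, "low_disruption": 2}},
]
musts = ["no_safety_risk", "within_budget"]
want_weights = {"fast_to_implement": 1, "low_disruption": 3}

top_concern = rank_concerns(concerns)[0]["name"]
choice = best_option(options, musts, want_weights)
print(top_concern, "->", choice)
```

Treating musts as pass/fail filters and wants as weighted scores keeps a non-negotiable objective from being traded away for a high score elsewhere.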
The key benefit of the Kepner-Tregoe analysis is the ability to prioritize and focus the analysis. By weighing and setting objectives, this type of analysis provides a more direct review of an issue.
Chapter Four: Challenges with Root Cause Analysis
Causal Factors vs a Root Cause
One challenge when conducting a root cause analysis is ensuring you are identifying root causes rather than causal factors. A causal factor is any behavior, omission, or deficiency that, if corrected, eliminated, or avoided, probably would have prevented the event. A root cause is a factor that if eliminated would definitely prevent a recurrence.
Root cause analysis purists focus on identifying a root cause over a causal factor. However, many of the processes where root cause analysis is applied generate adverse events because of human error. Removing a specific manifestation of the error doesn’t necessarily highlight how the type of mistake can be repeated. Ultimately this specific focus can ignore a systematic error.
The most basic requirement for root cause analysis is data. Collecting as much data as possible throughout the process you are examining will improve the quality and efficiency of the root cause analysis. However, this is another focus for root cause analysis critics. Oftentimes data collection about the precipitating event begins after the event. It also requires multiple testimonies and interviews. This qualitative data collection can be unreliable, especially if these interviews need to occur days, weeks, or even months after the event.
Easy vs Lasting Solutions
Unfortunately, solutions to these events can be complex. Instead of removing a piece from an existing process, the best solution may be a complete redesign, new technology implementation, or other large-scale adjustments. Even when such changes are necessary, administrators, managers, and leaders tend to look for quick and easy solutions.
Prioritizing a solution that doesn’t solve the problem in the long term, while easier, can lead to event recurrence.
If the root cause analysis is seen as a quest to identify culpability, you might be in trouble. The data collection process could be compromised if the root cause analysis looks like a way of finding someone to blame for the event. Balance identifying who is at fault with examining whatever system produced the unintended result. It’s unlikely a single person created the issue out of malice. However, if this were the case, accountability for the individual and organization would be necessary.
Proponents of root cause analysis often encourage groups to collaborate and brainstorm during the process. However, some critics argue that this promotes groupthink and stifles creative approaches and analyses. If you’re building an RCA team, include team members from different groups and functional areas to promote fruitful collaboration.
Tools to Overcome Challenges & Apply Root Cause Analysis
Companies use tools like Tulip to collect data from their people, processes, and machines in real time. This empowers them to conduct root cause analysis after smaller events and enables faster, more efficient improvement of these processes. Furthermore, precise data on operators, machines, and changes to procedures makes it easier to avoid the challenges to root cause analysis highlighted above.
Tools for Root Cause Analysis
There are a number of different tools and techniques that can be used when conducting a Root Cause Analysis. We've written comprehensively about tools used for root cause analysis in the past, so we'll just summarize them here.
Pareto Charts - Pareto charts make it easy to analyze large quantities of production data at a glance. By displaying the most common sources of a defect in descending order, Pareto charts can help teams prioritize improvements for maximum impact.
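The data behind a Pareto chart is just a count per defect source, sorted in descending order, with a running cumulative percentage. A minimal sketch follows. The defect log is hypothetical and `pareto_table` is an illustrative helper name.

```python
from collections import Counter


def pareto_table(defects):
    """Return (cause, count, cumulative_percent) rows, most common cause first."""
    counts = Counter(defects)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect log: one entry per rejected part.
defect_log = ["scratch"] * 5 + ["misalignment"] * 3 + ["burr"] * 2

for cause, count, cumulative in pareto_table(defect_log):
    print(f"{cause:<14}{count:>3}{cumulative:>8}%")
```

In this made-up log, the top two causes account for 80% of rejects, which is the kind of concentration a Pareto chart is meant to expose.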
The 5 Whys - We've spoken quite a bit about the 5 Whys as an investigative technique throughout this guide. To summarize, the 5 Whys can help process engineers conduct a root cause analysis by drilling down on the core reason why an event occurred. The 5 Whys technique is most effective when investigating a rudimentary problem that doesn't require quantitative analysis.
Fishbone Diagrams - Fishbone Diagrams, also known as Cause-and-Effect Diagrams, are useful when there are a number of potential sources of a problem that can be categorized into different buckets. This tool is particularly effective when the root cause of a problem is entirely unknown.
Scatter Diagrams - Scatter Diagrams or Scatter Plots are visual representations of the relationship between two sets of data. To use a scatter diagram as a root cause analysis tool, an individual would plot an independent variable (the suspected root cause) on the x-axis and the dependent variable (the resulting problem) on the y-axis. If a clear pattern emerges after plotting these two variables, you can suspect the variables may be correlated.
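A scatter diagram is read by eye, but the strength of a linear pattern can also be summarized with the Pearson correlation coefficient. That coefficient is not part of the description above and is added here as an assumption about how one might quantify the pattern. The data points are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: months of operator experience (x) vs. scrap rate in % (y).
experience_months = [1, 2, 3, 4, 5]
scrap_rate_pct = [10, 9, 6, 5, 2]

r = pearson_r(experience_months, scrap_rate_pct)
print(f"r = {r:.2f}")
```

A value near -1 or +1 suggests a strong linear relationship, and a value near 0 suggests none. Either way, correlation alone does not establish that the suspected variable is the root cause.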
Failure Mode and Effect Analysis (FMEA) - A Failure Mode and Effect Analysis is a tool that can be used at any stage of production and involves identifying and exploring all potential points of failure within a design, process, or product, as well as the potential effect that the failure might cause. FMEA often involves a cross-functional group of stakeholders that are familiar with the design, process, or product and can help document potential root causes before they actually occur.
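This guide does not describe how FMEA teams score failure modes. A widely used convention, assumed here, is the Risk Priority Number (RPN): severity, occurrence, and detection are each rated from 1 to 10 and multiplied together. The sketch below ranks hypothetical failure modes by RPN. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One potential failure with 1-10 ratings (assumed RPN convention)."""
    description: str
    severity: int    # 10 = most severe effect
    occurrence: int  # 10 = most likely to occur
    detection: int   # 10 = least likely to be detected before it causes harm

    def __post_init__(self):
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")

    @property
    def rpn(self):
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for a single process step.
modes = [
    FailureMode("Operator loads wrong material", 7, 3, 4),
    FailureMode("Fixture wears out of tolerance", 5, 6, 6),
    FailureMode("Label printed incorrectly", 2, 4, 2),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:>4}  {mode.description}")
```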