William A. Levinson

Lean

CAPA, FMEA, and the Process Approach

The AIAG offers a clearly defined and powerful synergy between the three

Published: Monday, December 19, 2022 - 13:03

Corrective action and preventive action (CAPA) is probably the most important process in any quality management system because so much else depends on it. This includes not only its traditional role as a response to defects, nonconformances, customer complaints, and audit findings, but also outputs of the management review. It can even address all seven Toyota production system wastes if we redefine as a “nonconformance” any gap between the current state and a potential or desirable future state. AIAG’s CQI-22, Cost of Poor Quality Guide,[1] recommends that we compare “the ideal state for how work processes should perform” against “the current reality.”

Inadequate CAPA is a leading source of ISO 9001:2015[2] and IATF 16949:2016[3] findings, and FDA Form 483 citations.[4] “Five Signs Your Company Is in Dire Need of Root Cause Analysis and Corrective Action Training” reinforces this point even further.[5] While ISO 9001:2015 doesn’t have a specific requirement for preventive action, one could argue that clause 6.1.1 (c), “prevent, or reduce, undesired effects,” constitutes an implied requirement.

In addition, ISO 9001:2008 clause 8.5.3 did require preventive action, which further supports the view that the current clause (6.1) on actions to address risks and opportunities requires it by implication.

There’s no ambiguity whatsoever in IATF 16949:2016, where Clause 6.1.2.2 quotes a good part of ISO 9001:2008 Clause 8.5.3 with an explicit requirement for a preventive action process. The bottom line is that, regardless of explicit requirements, common sense says we should seek to prevent trouble before it happens instead of waiting for a defect, nonconformance, or customer complaint to tell us there’s a problem.

IATF 16949:2016 has additional material that’s well worth reading, and it’s by no means limited to automotive applications. Clause 8.5.1.1, which is not found in ISO 9001:2015, requires control plans that are in turn deliverables from failure mode effects analysis (FMEA), a very powerful tool for proactive identification and suppression of poor quality before it has the chance to create even one defect or nonconformance. IATF 16949:2016 Clause 10.2.3, which doesn’t appear in ISO 9001:2015 either, requires a documented process for problem solving, including root cause analysis, systemic corrective actions, and deployment to similar applications. Clause 10.2.4 requires a documented process for error proofing (poka-yoke) where applicable.

The Automotive Industry Action Group’s (AIAG’s) CQI-20, Effective Problem Solving,[6] and AIAG/VDA’s Failure Mode Effects Analysis manual[7] offer a clearly defined and powerful synergy between CAPA, FMEA, and the process approach. Their diligent use should prevent most poor quality from happening at all, and make short work of whatever trouble sources were initially overlooked.

Process approach, FMEA, and CAPA

A process for product or service realization consists of a series of clearly defined steps or operations, and these are the focus elements of the AIAG/VDA’s relatively new FMEA approach. Each step can be expanded, e.g., on a spreadsheet, to include the process FMEA (PFMEA) and the control plan for the process in question. The person who does the job doesn’t need to see the PFMEA for daily activities, but it can and should be available to him or her. This synergy extends to the job breakdown sheet, which defines, for each step, 1) what to do; 2) how to do it; and 3) why it should be done. It also relates to standard work, which consists of 1) the sequence of operations; 2) standard inventory; and 3) takt time, or the pace the operation must maintain to meet downstream demand.
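The side-by-side layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `ProcessStep`, its field names, and the helper `takt_time_seconds` are hypothetical and do not come from the AIAG/VDA manual, and the sample step borrows the handle-screw example discussed later in this article. The takt time helper uses the usual definition of available working time divided by downstream demand.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessStep:
    """One row of a hypothetical combined work-instruction / PFMEA / control-plan sheet."""

    # Job breakdown sheet columns
    what: str
    how: str
    why: str
    # PFMEA column, kept alongside the step but not needed for routine work
    failure_modes: list[str] = field(default_factory=list)
    # Control plan columns
    prevention_controls: list[str] = field(default_factory=list)
    detection_controls: list[str] = field(default_factory=list)

    def operator_view(self) -> dict[str, str]:
        """Return only the job breakdown columns used in daily activities."""
        return {"what": self.what, "how": self.how, "why": self.why}


def takt_time_seconds(available_seconds: float, units_demanded: int) -> float:
    """Pace the operation must maintain: available time per unit of demand."""
    if units_demanded <= 0:
        raise ValueError("units_demanded must be positive")
    return available_seconds / units_demanded


# Illustrative values only
step = ProcessStep(
    what="Install handle screw",
    how="Take one screw from the supply and drive it into the handle",
    why="Keeps the handle attached",
    failure_modes=["Handle screw not installed"],
)
```

Keeping the PFMEA and control plan fields on the same record as the work instruction means a change to any one of them is visible next to the others, while `operator_view` shows only what the job itself requires.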

It has always been known, meanwhile, that FMEAs must be updated to reflect newly discovered failure modes or mechanisms, as might be identified from CAPA. The new AIAG/VDA manual expands on this to turn PFMEA into a proactive form of CAPA. Planners work with the traditional elements of the cause and effect diagram to identify in advance likely failure causes (previously known as failure mechanisms) rather than wait for them to make their presence known.

Consider, for example, a job breakdown sheet that is more than 200 years old: Baron Friedrich Wilhelm von Steuben’s Regulations for the Order and Discipline of the Troops of the United States. Each step clearly defines what to do, how to do it, and also the number of required motions. One of the steps to load a musket was “Ram down-Cartridge! One motion. Ram the cartridge well down the barrel, and instantly recovering and seizing the rammer back handed by the middle, draw it quite out, turn it, and enter it as far as the lower pipe, placing at the same time the edge of the hand on the butt end of the rammer, with the fingers extended.” A potential failure mode might consist of breaking the ramrod during the ramming step. The failure effect would be a musket rendered useless until the soldier could get a replacement. The Severity rating would almost certainly be 10, the worst possible, because the soldier would be disarmed in the middle of a battle.

One could speculate on the possible failure mechanisms or causes, such as wear or stress concentrators in the ramrod or the soldier ramming too hard due to the obvious stress of battle. These would fall into the work elements “machine” and “man,” respectively. Prussia’s Leopold I of Anhalt-Dessau, “the Old Dessauer,” addressed both failure causes by introducing iron ramrods to replace the wooden ones then commonly in use. The soldier no longer had to “be careful” not to ram down the charge too hard lest his ramrod break. This mattered because it is difficult to “be careful” while under enemy fire and trying to keep up with the tempo of the firing drill. One must instead make the failure impossible. The lesson carries over into modern civilian applications, where “operator error” is a red flag that the problem’s actual root cause hasn’t been identified.

CQI-20 meanwhile addresses three potential root causes for poor quality and relates them directly to the prevention and detection controls depicted by the AIAG/VDA FMEA manual.

Root causes and process controls

Traditional CAPA seeks to identify “the root cause” of a quality problem. CQI-20, however, explicitly discusses three root causes for poor quality. These are the 1) occurrence, 2) escape, and 3) systemic root causes. There’s a direct relationship between the first two and the maxim “Don’t take it, don’t make it, don’t pass it along,” with “it” referring to poor quality,[8] and also to the prevention and detection controls in FMEA.

1. The occurrence root cause, which is usually the traditional one, is why the defect or nonconformance was created in the first place. This relates directly to “don’t make it,” which is the mission of the prevention controls, i.e., process controls that disable failure mechanisms or failure causes so poor quality is never created.

2. The escape root cause is why the defect or nonconformance reached the next internal or external customer, if it did. This relates to “don’t pass it along,” which is the mission of the detection controls that ensure that poor quality does not reach the next internal or external customer. A Shigeo Shingo case study described, for example, a probe that moved up and down in concert with a drill to ensure that there was a hole in the most recently drilled item. If there wasn’t, it meant the drill bit had broken, so the workstation stopped and alerted the operator. This didn’t prevent generation of poor quality, but it limited it to a single item that required rework and didn’t pass it along to the next internal customer.

3. The systemic root cause is why the issue wasn’t anticipated ahead of time, or even addressed in the design phase. CQI-20 (page 87) defines this as “the earliest point in the system that could have prevented the Occurrence and Escape Root Causes but failed to do so....” “Systemic” refers to something that pervades the organization, and this ties in with read across/replicate process or best practice deployment, along with organizational knowledge as depicted by ISO 9001:2015.
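The relationship between the three root causes and the follow-up each one calls for can be written down as a simple lookup. This is a minimal sketch, not part of CQI-20: the enum, the `FOLLOW_UP` table, and the function name are hypothetical, and the action wording paraphrases the three items above.

```python
from enum import Enum


class RootCause(Enum):
    OCCURRENCE = "occurrence"  # why the defect was created
    ESCAPE = "escape"          # why it reached the next customer
    SYSTEMIC = "systemic"      # why it was not anticipated


# Hypothetical lookup table; wording paraphrases the text above.
FOLLOW_UP = {
    RootCause.OCCURRENCE: "review prevention controls (don't make it)",
    RootCause.ESCAPE: "review detection controls (don't pass it along)",
    RootCause.SYSTEMIC: "read across to similar processes and update organizational knowledge",
}


def follow_up_actions(found: set[RootCause]) -> list[str]:
    """Return follow-up actions for the identified root causes, in a fixed order."""
    return [FOLLOW_UP[cause] for cause in RootCause if cause in found]


actions = follow_up_actions({RootCause.ESCAPE, RootCause.OCCURRENCE})
```

A CAPA that names only an occurrence root cause would return a single action here, which is a quick visual prompt that the escape and systemic questions have not yet been asked.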

Lessons learned from a CAPA for one process or operation must be deployed to similar processes and operations throughout the organization because, as stated by Henry Ford 96 years ago, “the benefit of our experience cannot be thrown away.”[9] While PFMEA was always a form of proactive CAPA, the new approach reinforces this role substantially and makes it more likely that the occurrence and/or escape failure causes will be disabled before they can generate even one defect or nonconformance.

Occurrence and detection ratings are now functions of process controls

The AIAG/VDA FMEA manual also takes a new approach to the occurrence and detection ratings. The occurrence rating was previously based on the frequency of occurrence of the nonconformance or defect in question. This is very hard to quantify if quality is relatively good, e.g., one defect per 100,000 opportunities. Maybe we will never even make enough parts to get a single defect under these conditions.
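A short calculation shows why such a frequency is so hard to observe. The sketch below assumes that units fail independently with the same probability, which is a simplification introduced here for illustration; the function names are hypothetical.

```python
import math


def prob_zero_defects(p_defect: float, n_units: int) -> float:
    """Probability of seeing no defects at all in n independent units."""
    return (1.0 - p_defect) ** n_units


def units_needed(p_defect: float, confidence: float) -> int:
    """Units required to see at least one defect with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_defect))


# One defect per 100,000 opportunities, production run of 10,000 units
p_none = prob_zero_defects(1e-5, 10_000)  # roughly 0.905
n_95 = units_needed(1e-5, 0.95)           # a little under 300,000 units
```

Under these assumptions, a run of 10,000 units has about a 90% chance of producing no defects at all, and close to 300,000 units would be needed for a 95% chance of seeing even one. A rating based on observed frequency therefore has almost no data to work from.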

One can, of course, estimate nonconforming fractions in defects per million opportunities with a process capability analysis. But there’s a very big elephant in that living room. Process capability analysis estimates the nonconforming fraction due to random or common cause variation. Failure mechanisms, or failure causes as they are now known, are almost universally special or assignable causes.
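For reference, the capability-based estimate looks like the sketch below. It assumes a normally distributed, in-control characteristic, so it captures common cause variation only; the function names and the sample numbers are hypothetical.

```python
from statistics import NormalDist


def nonconforming_fraction(mean: float, sigma: float, lsl: float, usl: float) -> float:
    """Fraction outside the specification limits for a normal, in-control process."""
    dist = NormalDist(mean, sigma)
    below = dist.cdf(lsl)
    above = 1.0 - dist.cdf(usl)
    return below + above


def cpk(mean: float, sigma: float, lsl: float, usl: float) -> float:
    """Process capability index relative to the nearer specification limit."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma)


# Illustrative process: centered, with limits four standard deviations away
fraction = nonconforming_fraction(mean=10.0, sigma=0.1, lsl=9.6, usl=10.4)
dpmo = 1e6 * fraction  # about 63 defects per million opportunities
index = cpk(mean=10.0, sigma=0.1, lsl=9.6, usl=10.4)  # about 1.33
```

Nothing in this calculation accounts for a broken drill bit or a forgotten screw. Those are assignable causes, and the normal model cannot predict how often they will happen.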

The new approach, however, bases the occurrence rating not on the estimated frequency of occurrence but instead on the nature of prevention controls, such as error-proofing devices. Controls that don’t rely on human vigilance get better (i.e., lower) occurrence ratings than those that do. The detection rating is based similarly on the nature of the controls that detect poor quality before it leaves a workstation. Controls such as automatic go/no-go gauges that don’t rely on vigilance or judgment get better (lower) detection ratings than those that do.

This leads to a new synergy in which the occurrence root cause from CAPA reflects inadequate or nonexistent prevention controls, and the escape root cause reflects inadequate or nonexistent detection controls. A CAPA deliverable should therefore include appropriate changes to the process’s detection and/or prevention controls, and the FMEA should then reflect the changes in question.

CAPA, PFMEA, and control plans

The control plan is a natural extension of PFMEA, and the combination of the two is known as a dynamic control plan. Controls can address process characteristics, which are measurable and controllable during product realization, and/or product characteristics, which are usually measurable or otherwise assessable only after the product is made.

Process characteristics are usually the focus of prevention controls as well as other process controls, e.g., on tool speeds and stock feed rates. Product characteristics are usually the focus of detection controls, as well as sampling and statistical process control plans. There are some exceptions in which a product characteristic, such as etching progress on a semiconductor wafer, can be evaluated during product realization and used as a process control (e.g., endpoint detection). The usual rule is, however, that process controls and prevention controls apply to process characteristics, and detection controls to product characteristics.

PFMEA as proactive CAPA

The new AIAG/VDA manual’s approach to PFMEA reinforces the relationship between this and CAPA, and suggests that PFMEA is essentially proactive CAPA. The planners are supposed to identify the work elements for each process step, and the manual recommends the 4M analysis (man, machine, material, and environment) for doing so. These are familiar as four of the six traditional elements of the cause and effect diagram, to which we can add method and measurement as stated by the AIAG/VDA reference (page 86).

CAPA brainstorming will often use the cause and effect diagram to identify potential root causes of what the new FMEA approach calls a failure cause (previously called a failure mechanism). PFMEA starts with the proactive identification of failure modes and then looks for potential failure causes among the work elements; that is, it does what CAPA does but before anything bad has happened.

The AIAG/VDA manual (pages 87–89) treats the process step as the focus element. The negative of the process step, which means it does something it shouldn’t, or doesn’t do something it should, is the failure mode. The failure effect is the consequence of the failure mode, and it also is a negative; something happened that should not have happened, or something that should have happened did not. The negative of the process work element is meanwhile the failure cause.

Shigeo Shingo provides the example of vacuum cleaners that were assembled without handle screws because the workers forgot this step.[10] The failure mode, which is the starting point in FMEA, is that the handle screw is not installed. The failure effect is probably that the handle can fall off, and the failure cause is ostensibly “worker error.” Anybody who has read Shigeo Shingo case studies knows, however, that a statement like “defects must be prevented by worker vigilance” is a sure sign that defects are occurring because if it is possible to do the job wrong, it will eventually be done wrong.

If we look instead at the occurrence ratings in the AIAG/VDA manual (page 111), we see that “best practice” behavioral controls, i.e., those that rely on worker vigilance, can get no better than a 2 rating on a 1–10 scale, where 1 means the failure mode cannot occur, and 10 means there are no prevention controls. In practice, however, reminding workers to “be careful” will be only somewhat effective (O=7) or of little effectiveness (O=8 or 9). The detection ratings (pages 113–114) will not allow a detection rating of less than 6 for an inspection that relies on human vigilance. Machine-based methods and jidoka (autonomation) can, however, earn far better detection ratings, and they must be able to prevent the escape of nonconforming product.
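The two rating floors quoted above lend themselves to a simple sanity check on a proposed FMEA row. The sketch below encodes only those two floors and the 1–10 scale as stated in this article. It does not reproduce the AIAG/VDA rating tables, and the function and constant names are hypothetical.

```python
# Floors as quoted in the text above; not the full AIAG/VDA tables.
MIN_OCCURRENCE_BEHAVIORAL = 2  # best case for vigilance-based prevention
MIN_DETECTION_HUMAN = 6        # best case for vigilance-based inspection


def check_ratings(
    occurrence: int,
    detection: int,
    prevention_relies_on_vigilance: bool,
    detection_relies_on_vigilance: bool,
) -> list[str]:
    """Return a list of problems with the proposed ratings (empty if none found)."""
    problems = []
    for name, value in (("occurrence", occurrence), ("detection", detection)):
        if not 1 <= value <= 10:
            problems.append(f"{name} rating {value} is outside the 1-10 scale")
    if prevention_relies_on_vigilance and occurrence < MIN_OCCURRENCE_BEHAVIORAL:
        problems.append("occurrence rating is too low for a behavioral control")
    if detection_relies_on_vigilance and detection < MIN_DETECTION_HUMAN:
        problems.append("detection rating is too low for human inspection")
    return problems


# A row that claims excellent ratings while relying on people to "be careful"
issues = check_ratings(1, 3, True, True)
```

A check like this will not pick the right rating, but it can flag a row whose numbers are more optimistic than the nature of its controls allows.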

The solution depicted by Shingo was to install a mechanical detector that would determine whether the worker had taken the necessary screw from the supply, and apparently another that would detect the absence of the screw and prevent the vacuum cleaner from leaving the workstation if the screw was not present. The former could qualify as a prevention control, i.e., one that makes a mistake impossible, and the latter a detection control that notifies the operator and prevents escape of nonconforming work.

This was a CAPA project that began with a known problem (vacuum cleaners with missing screws) and delivered permanent corrective action in the form of prevention and detection controls. PFMEA would begin with the focus element (“install screw”) whose negative (“screw not installed”) is the failure mode. It would then look at the work elements to identify potential failure causes, such as the worker forgetting to install the screw. It would almost certainly deliver the same prevention and detection controls, which reinforces the perception that PFMEA is, in fact, proactive CAPA.

How to do it

The next step is to deploy PFMEA and effective problem solving into every product or service realization process in the organization. We must, of course, have processes, and these should be documented. Processes consist of sequences of clearly defined steps or operations, and these are the focus elements for the new PFMEA approach.

1. Identify the work elements (man, machine, material, method, measurement, and environment) for each process step and identify the potential failure causes that could generate a failure mode. That is, what could happen to cause something to happen that should not happen, or something that should happen to not happen? This is essentially proactive CAPA because it seeks to identify the potential failure modes and their causes in advance. The Shingo case study (and there are others like it) cited the example of a screw not being installed in a handle. A subtle and seemingly insignificant change in a supplied material could be another. The people who actually do the job are often in an ideal position to identify potential failure causes that others overlook. If, for example, a worker complains, “I have to remind myself repeatedly to not forget to do this,” that’s a potential failure cause; reminding people to “be careful” is not a real control or corrective action.

2. Identify the prevention and detection controls that can disable the failure causes or, if this doesn’t happen, detect the failures and prevent nonconforming work from reaching the next internal or external customer. Ask, “Which controls prevent the generation of poor quality?” and “Which controls intercept nonconforming work before it can leave this operation?”
• Critical process characteristics will generally require prevention controls.
• Critical product characteristics, if measurable or otherwise appraisable only after product realization, will generally require detection controls.
• The people who actually do the job are often well situated to do this as well.

3. Document this information in FMEA and control plan format. Its organization suggests that it can appear side by side with the work instructions in, for example, spreadsheet format. The worker doesn’t have to see this information for routine activities, but it should be available to him or her.

4. If the controls don’t prevent the generation of poor quality, the output of the CAPA process should include whatever new controls are implemented as permanent corrective action. The corresponding PFMEA that accompanies the process must be updated to reflect these new controls.
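Step 4 and the read-across that follows it can be sketched using the vacuum cleaner case. Everything named here is an assumption made for illustration: `PfmeaEntry`, `CapaResult`, `apply_capa`, `read_across`, the `operation_type` tag used to find similar operations, and the second "cover assembly" process, which does not appear in Shingo's case study.

```python
from dataclasses import dataclass, field


@dataclass
class PfmeaEntry:
    process: str
    operation_type: str  # hypothetical tag used to find similar operations
    failure_mode: str
    failure_causes: list[str]
    prevention_controls: list[str] = field(default_factory=list)
    detection_controls: list[str] = field(default_factory=list)


@dataclass
class CapaResult:
    failure_mode: str
    new_prevention_controls: list[str]
    new_detection_controls: list[str]


def apply_capa(entry: PfmeaEntry, capa: CapaResult) -> None:
    """Record the CAPA's permanent corrective actions in the PFMEA entry."""
    if capa.failure_mode != entry.failure_mode:
        raise ValueError("CAPA does not address this failure mode")
    for control in capa.new_prevention_controls:
        if control not in entry.prevention_controls:
            entry.prevention_controls.append(control)
    for control in capa.new_detection_controls:
        if control not in entry.detection_controls:
            entry.detection_controls.append(control)


def read_across(source: PfmeaEntry, entries: list[PfmeaEntry]) -> list[PfmeaEntry]:
    """List other entries with the same operation type as lessons-learned candidates."""
    return [e for e in entries
            if e is not source and e.operation_type == source.operation_type]


handle = PfmeaEntry("Vacuum cleaner assembly", "screw installation",
                    "Handle screw not installed", ["Worker forgets the step"])
cover = PfmeaEntry("Cover assembly (hypothetical)", "screw installation",
                   "Cover screw not installed", ["Worker forgets the step"])

capa = CapaResult(
    failure_mode="Handle screw not installed",
    new_prevention_controls=["Detector confirms a screw was taken from the supply"],
    new_detection_controls=["Unit without screw cannot leave the workstation"],
)
apply_capa(handle, capa)
candidates = read_across(handle, [handle, cover])
```

The read-across step is the part most easily skipped in practice. Returning the candidate list explicitly makes it a deliverable of the CAPA rather than an afterthought.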

We have seen so far, then, that PFMEA is what we do to avoid the need for CAPA, and CAPA is what we do if PFMEA overlooks a failure cause. We must in all cases, though, share the lessons learned with related activities to avoid having to solve the same problem, or a similar problem, more than once.

Summary

Ineffective CAPA is a major source of ISO 9001 and IATF 16949 audit findings, and FDA Form 483 citations. The Boudreaux reference shows how it can even result in elevation of minor nonconformances to major ones due to inadequate corrective action and, in extreme circumstances, loss of ISO 9001 or API (American Petroleum Institute) certification. In any event, nobody wants to have to solve the same problems more than once, or solve similar ones in related applications due to failure to deploy lessons learned, because time spent on containment and correction of problems does not add value.

The good news is, however, that AIAG’s CQI-20, Effective Problem Solving, is probably the best CAPA process ever developed, and it’s apparently based on the older but still excellent 8D (Eight Disciplines) process. It’s relatively easy to understand and use, and it will work on almost anything, including wastes unrelated to poor quality. It’s also highly synergistic with the relatively new AIAG/VDA failure mode effects analysis process due to the correlation between root causes and controls. The new PFMEA approach fits in perfectly with the process approach because it makes the process steps the focus elements of the risk analysis. This suggests that diligent parallel application of CQI-20 and the AIAG/VDA FMEA manual will largely eliminate the risks associated with ineffective CAPA and contribute instead to bottom-line performance by suppressing root causes before they can even make their presence known.

References

1. Automotive Industry Action Group. CQI-22, The Cost of Poor Quality Guide. 2012, page 26.

2. Ouellette, Penny. “ISO 9001:2015 Implementation: The Good, the Bad and the Trending.” Quality, Nov. 8, 2018.

3. Brown, Robert. “Beyond the IATF Transition: Analysis of Non-Conformities and Next Steps.” BSI webinar, Jan. 22, 2019.

4. Durivage, Mark. “An Introduction To qFMEA – A Tool For QMS Risk Management.” Pharmaceutical Online, May 8, 2017.

5. Boudreaux, Miriam. “Five Signs Your Company Is in Dire Need of Root Cause Analysis and Corrective Action Training.” Quality Digest, Jan. 7, 2020.

6. Automotive Industry Action Group. CQI-20, Effective Problem Solving. 2018.

7. AIAG/VDA (Verband der Automobilindustrie). FMEA Handbook. 2019.

8. Cutcher-Gershenfeld, Joel; Brooks, Dan; and Mulloy, Martin. Inside the Ford-UAW Transformation. MIT Press, 2015.

9. Ford, Henry, and Crowther, Samuel. Today and Tomorrow. Doubleday, Page & Company, 1926 (reprint available from Productivity Press, 1988), page 85.

10. Shingo, Shigeo. Zero Quality Control: Source Inspection and the Poka-Yoke System. Productivity Press, 1986, page 221.

About The Author

William A. Levinson

William A. Levinson, P.E., FASQ, CQE, CMQOE, is the principal of Levinson Productivity Systems P.C. and the author of the book The Expanded and Annotated My Life and Work: Henry Ford’s Universal Code for World-Class Success (Productivity Press, 2013).