William A. Levinson

Lean

When Assignable Cause Masquerades as Common Cause

Deciding whether you need CAPA or a bigger boat

Published: Wednesday, September 27, 2023 - 11:03

The difference between common (or random) cause and special (or assignable) cause variation is the foundation of statistical process control (SPC). An SPC chart prevents tampering or overadjustment by assuming that the process is in control, i.e., that special or assignable causes are absent, unless a point goes outside the control limits. An out-of-control signal is strong evidence that there has been a change in the process mean or variation. An out-of-control signal on an attribute control chart is similarly evidence of an increase in the defect or nonconformance rate.

The question arises, however, whether events like workplace injuries, medical mistakes, hospital-acquired infections, and so on are in fact due to random or common cause variation, even if their rates follow binomial or Poisson distributions. Addison’s disease and syphilis have both been called “the Great Pretender” because their symptoms resemble those of other diseases. Special or assignable cause problems can similarly masquerade as random or common cause if their metrics fit the usual np (number of nonconformances) or c (defect count) control charts.
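To see how easily counts of assignable-cause events can pass for common cause, consider a minimal sketch (in Python, with hypothetical numbers) of the standard 3-sigma limits for a c chart:

```python
import math

def c_chart_limits(counts):
    """3-sigma limits for a c (defect count) chart: c-bar +/- 3*sqrt(c-bar)."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar, c_bar + half_width

# Hypothetical weekly incident counts; in this story, every single
# incident has an assignable cause.
weekly_incidents = [2, 4, 3, 5, 1, 3, 4, 2, 3, 5]
lcl, center, ucl = c_chart_limits(weekly_incidents)
print(f"LCL = {lcl:.2f}, center = {center:.2f}, UCL = {ucl:.2f}")
# LCL = 0.00, center = 3.20, UCL = 8.57; every point is "in control"
```

Nothing in the arithmetic knows, or can know, whether the underlying incidents were preventable.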

The exponential distribution is a standard model for rare events, where the metric is the time between occurrences, such as days between lost-worktime injuries. Sufficiently infrequent workplace injuries could conform to this distribution and convince chart users that the injuries are, in fact, random variation.
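A short simulation makes this concrete. The intervals below are hypothetical, and the power transform is the one commonly used to put exponential time-between-events data on an individuals (XmR) chart; even though every injury in this scenario is assumed to have an assignable cause, no point signals:

```python
import random

random.seed(7)

# Hypothetical days between lost-worktime injuries. Each injury in this
# scenario has an assignable cause, but rare independent events still
# produce exponentially distributed intervals.
intervals = [random.expovariate(1.0 / 45.0) for _ in range(30)]

# Transform toward normality (the y = x**(1/3.6) power transform often
# used for t charts), then apply individuals-chart limits.
y = [x ** (1 / 3.6) for x in intervals]
moving_ranges = [abs(a - b) for a, b in zip(y[1:], y)]
y_bar = sum(y) / len(y)
mr_bar = sum(moving_ranges) / len(moving_ranges)
ucl = y_bar + 2.66 * mr_bar  # standard XmR chart constant
lcl = y_bar - 2.66 * mr_bar

outside = [v for v in y if v > ucl or v < lcl]
print(f"points outside limits: {len(outside)} of {len(y)}")  # typically 0
```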

We also know it’s important not to track more than one process, or in the case of attribute data more than one kind of defect or nonconformance, on a single control chart. Aggregation makes an out-of-control signal less likely if one of the attributes does begin to cause trouble, and if we do get an out-of-control signal, the chart won’t show which attribute is responsible. It’s similarly futile to keep a single control chart for an aggregate of safety incidents with a wide array of underlying causes and effects.
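Here is a sketch of the dilution effect, using hypothetical Poisson rates: three defect types share one c chart, one type doubles its rate partway through (a genuine assignable-cause problem), and the aggregate chart rarely notices:

```python
import math
import random

random.seed(3)

def poisson(lam):
    """Knuth's method for Poisson random variates; fine for small lambda."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= random.random()
        if p <= limit:
            return k - 1

# Defect types A, B, C charted together. Partway through, type C's rate
# doubles from 1 to 2 per period, but A and B dilute the shift.
baseline = [poisson(2) + poisson(2) + poisson(1) for _ in range(25)]
shifted = [poisson(2) + poisson(2) + poisson(2) for _ in range(25)]

c_bar = sum(baseline) / len(baseline)
ucl = c_bar + 3 * math.sqrt(c_bar)
signals = sum(v > ucl for v in shifted)
print(f"UCL = {ucl:.1f}; signals after the shift: {signals} of {len(shifted)}")
```

A chart kept on type C alone would have a far better chance of catching the change, and it would point directly at the responsible attribute.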

Are control charts applicable to safety incidents or medical mistakes?

Some very authoritative sources recommend using control charts for workplace injuries, medical mistakes, and so on. According to a 2014 public health report, “Statistical process control charts have recently been used for public health monitoring, predominantly in healthcare and hospital applications, such as the surveillance of patient wait times or the frequency of surgical failures [e.g., 1–10]. Because the frequency of safety incidents like industrial accidents and motor vehicle crashes will follow a similar probability distribution, the use of control charts for their surveillance has also been recommended [11–15]. These control chart uses can be extended to military applications, such as monitoring active-duty Army injuries.”1

This reference includes control charts for “injuries per 1,000 soldiers,” and the points are all inside the control limits. The reference does cite a decrease in the injury rate, and this could well be due to corrective and preventive action (CAPA) that removed the root causes of the incidents in question to prevent recurrence. That is, CAPA for special or assignable cause problems will make them less frequent, so their aggregated count will exhibit a decrease. The presence of control limits could, however, have the unintended consequence of implying that these incidents result from random variation rather than assignable causes.

Another reference claims, “Deming estimated that common causes may be responsible for as much as 99% of all the accidents in work systems, not the unsafe actions or at-risk behaviors of workers.”2 Although one might be reluctant to challenge W. Edwards Deming, the truth is that almost all safety incidents have assignable causes. I’ve yet to see the Occupational Safety and Health Administration (OSHA) or the Chemical Safety Board write one off to random variation. When OSHA fines somebody for an unsafe workplace, it’s always for an assignable cause, because OSHA cites a rule and how it was violated (e.g., no fall protection). If Deming contended that 99% of all incidents are due to management-controllable factors, that’s another matter entirely; but those factors are ultimately special or assignable causes. If a problem has an identifiable root cause, it’s a special or assignable cause by definition.

Rethinking common vs. assignable cause

Quality practitioners equate common cause with random cause variation. Random variation is exactly what it says: process and quality characteristics always experience some variation. Common cause relates to factors that aren’t controllable by the workers, and Deming’s Red Bead demonstration shows why it’s worse than useless to reward or penalize workers for them. If these factors are correctable by management, however, it might be better not to equate them with random variation.

The Ford Motor Co. presented an outstanding example of this more than 100 years ago.3 “Even the simple little sewing machine, of which there are 150 in one department, did not escape the watchful eyes of the safety department. Every now and then the needle of one of these high speed machines would run through an operator’s finger. Sometimes the needle would break after perforating a finger, and a minor operation would become necessary. When such accidents began to occur at the rate of three and four a day, the safety department looked into the matter and devised a little 75-cent guard which makes it impossible for the operator to get his finger in the way of the needle.”

The reference says the accidents took place at a rate of three and four a day; let’s assume an average of 3.5 per day. It’s quite likely that the daily count would have fit a Poisson distribution for undesirable random arrivals, and it would probably have served as a textbook example for a c (defect count) control chart. If we view common or random cause as something inherent to the system in which people must work (in this case, an unguarded moving sharp object), then this was a common cause problem. The fact that it was possible to put a finger under the needle shows, however, that the root cause was in the machine (equipment) category of the cause-and-effect diagram. And the guards (figure 1), once installed, eliminated the problem completely, which underscores that Ford was dealing with special, assignable, or correctable cause variation.
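Running the numbers under that assumption (a minimal sketch, not Ford’s actual data) shows how convincing the masquerade would have been:

```python
import math

# Assume an average of 3.5 needle injuries per day, per the 1920 account.
c_bar = 3.5
half_width = 3 * math.sqrt(c_bar)
ucl = c_bar + half_width            # about 9.11
lcl = max(0.0, c_bar - half_width)  # 0

print(f"c chart: LCL = {lcl:.2f}, center = {c_bar}, UCL = {ucl:.2f}")
# Daily counts of 3 or 4 sit comfortably inside these limits, so the
# chart would have pronounced an entirely correctable hazard "in control."
```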

Make no mistake: CAPA is, or at least should be, mandatory for every safety incident or near miss, regardless of the frequency of occurrence, because it almost certainly has a correctable cause.


Figure 1: Sewing machine finger guard

Shigeo Shingo offered several case studies that involved workers forgetting to install or include parts.4 It’s quite conceivable that these nonconformances might have followed a binomial or Poisson distribution, and their counts could have been tracked on an np (number nonconforming) or c (defect count) chart. This might convince many process owners that this was random or common cause variation, especially if no points were above the upper control limit. Shingo determined, however, that the root cause was machine and/or method (as opposed to manpower) because the job design permitted the mistakes to happen. Installing simple error-proofing controls that made it impossible to forget to do something fixed these problems entirely.
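An np-chart version of Shingo’s scenario (the rate and sample size below are hypothetical) makes the same point: a small but entirely correctable omission rate produces counts that look perfectly stable:

```python
import math
import random

random.seed(11)

# A worker omits a part on 1% of units because the job design allows it.
# With 200 units per day, the daily counts are binomial.
n, p = 200, 0.01
daily_counts = [sum(random.random() < p for _ in range(n)) for _ in range(30)]

np_bar = sum(daily_counts) / len(daily_counts)
sigma = math.sqrt(np_bar * (1 - np_bar / n))
ucl = np_bar + 3 * sigma
print(f"np chart UCL = {ucl:.2f}; largest daily count = {max(daily_counts)}")
# The counts typically stay inside the limits. The chart says "in control";
# Shingo's error-proofing says the correct rate is zero.
```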

If we accept the premise that something management-controllable, like a job design that allows mistakes, is common cause variation, then these problems were common cause variation. The fact that specific, assignable causes were found and removed, however, argues otherwise.

Is a known cause always a special cause?

Does the fact that we know a problem’s root cause always make it a special or assignable cause? Suppose a 19th-century army recognizes that a musketeer is unlikely to hit his target from beyond 50–100 yards because muskets are inherently incapable of precise fire, as shown in figure 2. The only way to improve the situation is to rearm the entire army with rifles, which everybody eventually did.


Figure 2: Gun target example

The prevailing variation in musket fire, however, had to be classified as common cause because the tool was simply not capable of better performance. There was no adjustment a soldier could make to improve this performance, and adjustment in response to common or random cause variation (i.e., tampering) actually makes matters worse. If, however, the shot group from a firearm was centered elsewhere than the bull’s-eye, this was special or assignable cause because the back sight could be adjusted to correct the problem the same way a machine tool that is operating off nominal can be adjusted to bring it back to center.
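Deming’s funnel experiment captures this distinction, and a simulation shows why adjusting against purely random scatter backfires. The model below is hypothetical: each shot lands at the aim point plus Gaussian scatter, and the tampering strategy re-aims after every shot by the full size of the last miss (funnel rule 2):

```python
import random
import statistics

random.seed(42)

SIGMA = 10.0   # inherent scatter of the firearm
SHOTS = 10_000

def fire(adjust):
    """Return the spread of misses, with or without shot-by-shot re-aiming."""
    aim, misses = 0.0, []
    for _ in range(SHOTS):
        miss = aim + random.gauss(0.0, SIGMA)
        misses.append(miss)
        if adjust:
            aim -= miss  # tampering: compensate for common-cause scatter
    return statistics.stdev(misses)

print(f"leave the sights alone: sd = {fire(False):.1f}")  # about SIGMA
print(f"adjust after each shot: sd = {fire(True):.1f}")   # about SIGMA * 1.41
```

Leaving the sights alone yields a spread of about sigma; chasing each miss inflates it by a factor of roughly the square root of two. Re-centering a consistently off-center shot group, by contrast, is a one-time correction of an assignable cause.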

Another example involves particle-inflicted defects on semiconductor devices. These devices are so small that even microscopic particles will damage or destroy them during fabrication. Thus the cause is known, but the only way to improve the situation is to get a better clean room with an air filtration system that will reduce the particle count, or get better process equipment and chemicals; the latter also must be relatively particle-free.

The takeaway from these examples is that if the problem’s root cause is known but we can solve it only with a large capital investment, retooling, or whatever, we can construe it as common cause variation. This is emphatically not true, however, of safety incidents and medical mistakes.

Joseph Juran and Frank Gryna reinforce this interpretation.5 “Random in this sense means of unknown and insignificant cause, as distinguished from the mathematical definition of random—without cause.” If a root cause analysis (RCA) performed in the course of corrective and preventive action can find a cause, that cause is assignable and not random.

Conclusion

The fact that nonconformance data—and safety incidents and medical mistakes are obviously nonconformances—may fit an attribute distribution and behave in the expected manner on an attribute control chart doesn’t make them random or common cause variation that we must accept in the absence of major capital investments or other overhauls. We must recognize upfront that the aggregate of multiple special-cause incidents can masquerade as binomial or Poisson data. We also need to realize that OSHA violations involve failures to conform to a very specific regulation or standard (such as fall protection), which are special or assignable causes by definition.

Medical regulatory agencies such as Medicare, meanwhile, do not treat events like surgery on the wrong body part, surgery on the wrong patient, or serious medication errors as things that “just happen”; they deny payment for them.6 These are “never events” that should never occur, so common or random cause variation is not an acceptable explanation.

This underscores the conclusion that any accident or near miss requires corrective and preventive action regardless of whether the count or frequency of these events falls inside traditional control limits, and even raises questions as to whether control limits (which imply the presence of a random underlying distribution) should be used at all.

In summary: If the only way to improve the situation involves extensive retooling, capital investments, and so on, as in “You’re going to need a bigger boat” from the movie Jaws, it’s common cause variation. The issue isn’t urgent because it’s not practical to take immediate action on it. But it is important. If a competitor gets a bigger boat, a superior rifle, a better cleanroom, or a tool with less variation, we will eventually be in trouble.

If the issue has an identifiable root cause that can be removed with corrective and preventive action, it’s special or assignable cause variation regardless of whether the metric is inside the control limits. CAPA is mandatory when the issue involves worker or customer safety, and highly advisable when it involves basic quality.

References
1. Schuh, Anna, and Michelle Canham-Chervak. “Statistical Process Control Charts for Public Health Monitoring.” Public Health Report, U.S. Army Public Health Command, 2014.
2. Smith, Thomas. “Variation and Its Impact on Safety Management.” EHS Today, 2010.
3. Resnick, Louis. “How Henry Ford Saves Men and Money.” National Safety News, 1920.
4. Shingo, Shigeo. Zero Quality Control: Source Inspection and the Poka-Yoke System. Routledge, 1986.
5. Juran, Joseph, and Frank Gryna. Juran’s Quality Control Handbook, Fourth Edition. McGraw-Hill, 1988.
6. Centers for Medicare & Medicaid Services. “Eliminating Serious, Preventable, and Costly Medical Errors—Never Events.” 2006.


About The Author

William A. Levinson

William A. Levinson, P.E., FASQ, CQE, CMQOE, is the principal of Levinson Productivity Systems P.C. and the author of the book The Expanded and Annotated My Life and Work: Henry Ford’s Universal Code for World-Class Success (Productivity Press, 2013).