“I’m shoveling two feet of your partly cloudy off my sidewalk” is an old joke about what happens when meteorologists get the forecast wrong, and there is a similar running joke among quality practitioners: “Your centered Six Sigma process is delivering 580 defects per million opportunities!” That’s 580,000 times the expected one per billion at each specification limit, and it can really happen.


This is because traditional process capability calculations assume that the critical-to-quality characteristic follows a normal distribution, and the central limit theorem does *not* help us with individual measurements.
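The arithmetic behind the opening comparison can be checked with a minimal sketch that uses only the Python standard library. It assumes a perfectly centered, normally distributed characteristic with each specification limit six standard deviations from the mean; the function and variable names are illustrative and do not come from the article.

```python
import math

def normal_upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Assumption: centered process, specification limits at +/- 6 sigma.
six_sigma_tail = normal_upper_tail(6.0)   # fraction beyond ONE limit

observed_fraction = 580e-6                # 580 defects per million opportunities

# Ratio against the rounded "one per billion" figure quoted in the text.
ratio_vs_nominal = observed_fraction / 1e-9

# Ratio against the computed normal tail area at 6 sigma.
ratio_vs_computed = observed_fraction / six_sigma_tail

print(f"Tail beyond one limit: {six_sigma_tail:.3e}")
print(f"Ratio vs. one per billion: {ratio_vs_nominal:,.0f}")
print(f"Ratio vs. computed tail: {ratio_vs_computed:,.0f}")
```

The computed tail is slightly below one per billion, so the quoted ratio of 580,000 is a rounded figure. The point of the sketch is only that the normal model predicts a vanishingly small tail, which is why a skewed distribution can miss it by orders of magnitude.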

“The universe is full of dead people who lived by assumption,” says science fiction writer Alan Dean Foster, and another take on this is “assume makes an ass of u and me.” This is certainly true if your so-called Six Sigma process is sending 0.5-percent nonconforming parts or materials to your customer. The rest of this article will show exactly how that can happen.

…


## Comments

## Where are you Dr Wheeler?

Looking forward to Don's response to this Six Sigma rubbish.

## "Rubbish"

ADB - can you provide your (professional and respectful) explanation as to why this is "rubbish"?

## "Six Sigma rubbish"

I think the comment of "Six Sigma rubbish" was not meant to be disrespectful. It was aimed at Don Wheeler's mantra that one should not overemphasize the question of distribution models when it comes to controlling quality effectively.

ref.: Wheeler and Chambers, "Understanding Statistical Quality Control", 2nd ed., chapter 4.4.

## Probability plots can help

I agree with the overriding premise that if your data is highly skewed, the capability indices will be wrong. In the end, all methods except empirical are integrating underneath the curve. My company has a similar parameter to the one you illustrate, in that we have actual values out of specification but the Ppk >> 2.0. To advertise a Ppk > 2 and routinely have observed values out of spec from a small sample size is indeed nonsense, so something has to be done. It is in the how that we diverge in thought.

Several comments on your article.

The argument about 313 vs 580 and 46% difference is a bit specious. It would be informative to the audience about how large the confidence intervals are for capability indices based upon such a small sample size (in this case, 30 – very small). Statistically, I’m willing to wager any amount of money that there is no statistical difference in these estimates. Your claim of an exact ppmd diminishes the article since you are treating these as point estimates that have no variability; just the opposite – they have huge variability. You know the magnitude in which your process lives and that is knowledge.
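To give a feel for the scale of that uncertainty, here is a rough sketch using Bissell's large-sample approximation for the standard error of a capability index. This is a swapped-in, normal-theory formula, so it inherits the very assumption under debate and is not the method used in the article. The point estimate of 2.0 is a hypothetical input; the sample size of 30 is the one mentioned above.

```python
import math
from statistics import NormalDist

def ppk_interval_normal(ppk_hat, n, confidence=0.95):
    """Approximate two-sided interval for Ppk (Bissell's approximation).

    Assumes normally distributed data, so treat the result as an
    indication of scale only, not as an exact interval.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    std_err = math.sqrt(1.0 / (9.0 * n) + ppk_hat ** 2 / (2.0 * (n - 1)))
    return ppk_hat - z * std_err, ppk_hat + z * std_err

# Hypothetical point estimate of 2.0 from 30 measurements.
lower, upper = ppk_interval_normal(2.0, 30)
print(f"n=30:  {lower:.2f} to {upper:.2f}")

# The same estimate from a larger sample gives a narrower interval.
lower_big, upper_big = ppk_interval_normal(2.0, 300)
print(f"n=300: {lower_big:.2f} to {upper_big:.2f}")
```

Under these assumptions the interval for 30 measurements spans roughly half a unit of Ppk on either side of the estimate, which supports the commenter's view that point estimates from samples this small should not be read as exact.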

WL: ALWAYS test the distribution for goodness of fit: Goodness of fit tests must be used with great care. They are highly dependent upon sample size and outliers, so this is just bad advice without a lot more knowledge. Just because you can doesn’t mean you should. In the event a goodness of fit test is appropriate, use the probability plot. Use your eyes, not the p-values. After time series/control charts, probability plots are one of our most powerful tools to identify what is really happening.

WL: Fit the appropriate non-normal distribution to the data: Here is where I disagree the most. Most people have no clue what an appropriate non-normal distribution might be, so telling the user to choose something in Minitab is fraught with danger. The problem with non-normal distributions is all of the work going into modeling a part of the distribution you don’t care about. All you care about is the distribution of the tail so you can integrate. Through the use of probability plots with the tail area and regression, you can come up with a reasonable estimate. But that’s ALL it is. Just an estimate that is likely close to the truth (which is your original point).

## Question on 1.08 Capability

When I compute the capability with a Gamma distribution, I get 1.08 relative to the lower specification limit, which you don't care about since it is bounded by 0. I get 1.38 relative to the upper specification limit. This is still considerably worse than 2 Ppk when using the normal distribution, but much closer than 1.08. So, my question is, shouldn't we be using the 1.38 instead of 1.08?
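For context on how a figure such as 1.08 can arise, one common convention for non-normal data converts the estimated nonconforming fraction beyond a specification limit into an equivalent index: find the standard normal quantile z with the same tail area and divide by 3. The sketch below illustrates that convention only. Whether the article or the commenter's software used it is an assumption, and the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def equivalent_ppk(tail_fraction):
    """Index whose normal tail area equals the given nonconforming fraction."""
    z = _STD_NORMAL.inv_cdf(1.0 - tail_fraction)
    return z / 3.0

def tail_fraction_from_ppk(ppk):
    """Nonconforming fraction beyond one limit implied by an index under normality."""
    return _STD_NORMAL.cdf(-3.0 * ppk)

# 580 defects per million is the figure from the article's opening.
print(f"Equivalent index for 580 ppm: {equivalent_ppk(580e-6):.2f}")

# Going the other way: an index of 2 under the normal model.
print(f"Tail fraction for an index of 2: {tail_fraction_from_ppk(2.0):.2e}")
```

Under this convention a tail of 580 per million corresponds to an index of roughly 1.08, while an index of 2 corresponds to about one per billion. The index that governs the nonconforming fraction is the one tied to the limit where that tail actually lies.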

## Distribution selection

I would emphatically NOT suggest that the practitioner let Minitab select what looks like the best distribution. In the case study, however, the gamma distribution makes physical sense because the quality characteristic involves undesirable random arrivals, but on a continuous scale. The gamma distribution is the continuous scale analogue of the Poisson distribution. If, for example, a Weibull distribution provided an even better fit, I would still go with the gamma distribution because there is no physical explanation for the Weibull.

If, on the other hand, I had something that failed at its weakest point, I would first try an extreme value distribution. For reliability studies of products with constant hazard rates, the exponential distribution is known to model the cycles or time to failure.

You are correct that the confidence interval will be rather wide for only 30 measurements. This can be computed for Weibull distributions, and I was able to do it for a gamma distribution as well (using the method in the Lawless reference). This means we can be 95% sure, for example, that Ppk is greater than a certain amount, which will in turn be much less than the point estimate. The width of this interval decreases as the sample increases.
