Published: Monday, May 2, 2022 - 12:03

Students are told that they need to check their data for normality before doing virtually any data analysis. And today’s software encourages this by automatically providing normal probability plots and lack-of-fit statistics as part of the output. So it’s not surprising that many think this is the first step in data analysis.

The practice of “checking for normality” has become so widespread that I have even found it listed as a prerequisite for using a distribution-free nonparametric technique! Yet there is little consensus about what to do if your data are found to “not be normally distributed.” If you switch to some other analysis, you are likely to find it too is hidden behind the “check for normality” obstacle. So you are left needing to customize your analysis by fitting some probability model to your data before you can proceed—and this opens the door to all kinds of complexities.

The histogram in Figure 1 represents 20 days of production for one product. It is clearly not going to pass any test for normality. But is there some other probability model that might be used?

If we attempt to fit some skewed probability model to these data, we typically begin by estimating the mean and standard deviation. These 160 data have an average of 52.33, and an estimated within-subgroup standard deviation of 2.88. This results in natural process limits of 43.69 and 60.97. When we apply these limits to this histogram we find 19 of the 160 values outside this interval, which is 12 percent of the data.
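The arithmetic behind these limits can be checked with a short sketch. It uses only the summary values quoted above; the variable names are illustrative and the raw data are not reproduced here.

```python
# Summary values quoted in the text; variable names are illustrative.
average = 52.33          # average of the 160 values
sigma_within = 2.88      # estimated within-subgroup standard deviation
n_values = 160
n_outside = 19           # values falling outside the limits

# Natural process limits: average plus or minus three sigma.
lower_limit = average - 3 * sigma_within
upper_limit = average + 3 * sigma_within

# Fraction of the data outside the limits.
fraction_outside = n_outside / n_values

print(f"natural process limits: {lower_limit:.2f} to {upper_limit:.2f}")
print(f"fraction outside the limits: {fraction_outside:.0%}")
```

The three-sigma half-width is 3 × 2.88 = 8.64, which gives the limits of 43.69 and 60.97, and 19/160 rounds to 12 percent.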

So to fit this histogram, we are looking for a skewed probability model having a mean of 52.33, a standard deviation of 2.88, with 12 percent outside the three-sigma limits. And this is where we come up against a mathematical fact of life. *No such probability model exists!*

No mound-shaped, unimodal probability model, regardless of how skewed that model might be, can ever have more than 2 percent outside the interval defined by the mean plus or minus three standard deviations. This limitation is imposed by the laws of rotational inertia and cannot be violated.
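As a rough cross-check, two general inequalities that this article does not cite can be compared with the observed 12 percent. The Chebyshev inequality bounds the fraction outside the mean plus or minus k standard deviations for any model with a finite variance, and the Vysochanskij-Petunin inequality does the same for any unimodal model. Both are looser than the 2 percent figure quoted above for mound-shaped models, so the sketch below is only a sanity check under those stated assumptions, not the author's derivation.

```python
import math


def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|X - mean| >= k * sd) for any model with finite variance."""
    return min(1.0, 1.0 / k**2)


def vysochanskij_petunin_bound(k: float) -> float:
    """Upper bound on the same tail fraction for any unimodal model.

    The inequality in this form holds only for k >= sqrt(8/3).
    """
    if k < math.sqrt(8.0 / 3.0):
        raise ValueError("bound in this form requires k >= sqrt(8/3)")
    return 4.0 / (9.0 * k**2)


k = 3.0
observed = 19 / 160  # fraction of the histogram outside the three-sigma limits

print(f"Chebyshev bound (any model):     {chebyshev_bound(k):.3f}")
print(f"Vysochanskij-Petunin (unimodal): {vysochanskij_petunin_bound(k):.3f}")
print(f"observed fraction:               {observed:.3f}")
print("observed exceeds both bounds:",
      observed > chebyshev_bound(k) and observed > vysochanskij_petunin_bound(k))
```

At k = 3 the Chebyshev bound is 1/9 and the unimodal bound is 4/81, so a model with the stated mean and standard deviation cannot place 19/160 of its area outside the three-sigma interval.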

Rather than following the white rabbit down the hole of checking for normality, we need to think about a fundamental assumption that is an actual prerequisite for the use of most statistical techniques. This is the assumption that the data are logically homogeneous. This means that the data need to be “of a similar kind or nature, having no discordant elements, and of a uniform structure throughout.” In practice this means that there are *no unknown changes* in the conditions under which the data are collected. And the primary technique for examining this assumption of homogeneity is the process behavior chart (also known as a control chart).

But how can a process behavior chart work without reference to a probability model? In answer to this question, Walter Shewhart wrote the following on page 35 of *Statistical Method from the Viewpoint of Quality Control*:

“We next come to the requirement that the criterion [a process behavior chart] shall be as simple as possible and adaptable to a continuing and self-correcting operation. Experience shows that the process of detecting and eliminating assignable causes of variability so as to attain a state of statistical control is a long one. From time to time the chart limits must be revised as assignable causes are found and eliminated.

“A simple procedure is used for establishing the limits without the use of probability tables because it does not seem that much is to be gained during the process of weeding out assignable causes by trying to set up exact probability limits upon the basis of assumptions that we know from experience do not hold until the state of statistical control has been reached. This is particularly true since such probabilities do not indicate the probability of detecting assignable causes but simply the probability of looking for such causes when they do not exist, which is of secondary importance until a state of statistical control has been reached. Then too, as already indicated, the design of an efficient criterion for the important job of indicating the presence of assignable causes depends more upon [rational sampling and rational subgrouping] than it does upon the use of any exact mathematical distribution.”

As Shewhart observes, there are two mistakes we can make when we analyze data, but fortunately we cannot make them both at the same time. When our data contain signals, we are open to the mistake of missing these signals. It is only when our data *do not contain signals* that we are open to the mistake of getting false alarms. Shewhart argues here that as long as your process is being operated unpredictably it is changing, and the only mistake you need concern yourself with is the mistake of missing a signal of these changes. You need not be concerned with the probability of false alarms until you have finally learned how to operate your process predictably.

But when we impose a probability model upon our data, and use that model to compute probability limits, we are only computing the probability of a false alarm. This is why checking your data for normality before placing them on a process behavior chart is to get things backward.

One of the few statistical techniques that is not built on any distributional assumption is the process behavior chart. So, while no probability model exists that will fit the mean, variance, and percentage outside three-sigma limits for Figure 1, we can still place the data from Figure 1 on a process behavior chart. When we do this we find ample evidence of a lack of homogeneity within these data.

[Figure 2: Process behavior chart for the data of Figure 1]

So here is an example where the process behavior chart works *even when no probability model exists*.

The process behavior chart allows us to detect signals of process changes when they occur and as they occur. We need not be concerned with the probability of false alarms in Figure 2. Rather we need to identify the assignable causes of the unpredictable behavior from day to day.
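A minimal sketch of how such limits can be computed without any probability model is given here. It makes several assumptions that the text does not state: the data are organized as one subgroup of eight values per day (160 values over 20 days), an average and range chart is used, and the usual tabled chart constants for subgroups of size eight apply. The data are synthetic and purely illustrative. They are not the data of Figure 1, and the function names are hypothetical.

```python
import random
from statistics import mean

# Tabled chart constants for subgroups of size n = 8 (assumed subgroup size).
A2, D3, D4 = 0.373, 0.136, 1.864


def average_and_range_limits(subgroups):
    """Compute subgroup averages, subgroup ranges, and three-sigma chart limits."""
    averages = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_average = mean(averages)
    average_range = mean(ranges)
    return {
        "averages": averages,
        "ranges": ranges,
        "average_limits": (grand_average - A2 * average_range,
                           grand_average + A2 * average_range),
        "range_limits": (D3 * average_range, D4 * average_range),
    }


def signals(values, limits):
    """Return the 1-based positions of values that fall outside the limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values, start=1) if v < lower or v > upper]


# SYNTHETIC example: a hypothetical process whose level shifts after day 10.
random.seed(1)
day_levels = [50.0] * 10 + [56.0] * 10
subgroups = [[random.gauss(level, 2.0) for _ in range(8)] for level in day_levels]

chart = average_and_range_limits(subgroups)
print("average chart limits:", chart["average_limits"])
print("days with averages outside the limits:",
      signals(chart["averages"], chart["average_limits"]))
print("days with ranges outside the limits:",
      signals(chart["ranges"], chart["range_limits"]))
```

The limits come from the average range within the subgroups, so they reflect the routine day-to-day variation rather than any fitted distribution. Days whose averages fall outside the limits are the ones to investigate for assignable causes.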

The “skewness” of the histogram in Figure 1 is the result of the process going on walkabout. This process has different personalities on different days. Attempting to fit a probability model to such data is always going to be an exercise in futility.

The secret foundation of most statistical techniques is “Assume the data are homogeneous.” But the secret of data analysis is “Your data are rarely homogeneous.” And the premier technique for examining your data for homogeneity is the process behavior chart.

Shewhart clearly contradicts the claim that we have to fit a probability model to the data prior to computing the limits for a process behavior chart. If you and Shewhart see things differently, who do you think is right? Continuing in this vein, Shewhart concluded the epilogue of his 1939 book with the following:

“Throughout this monograph care has been taken to keep in the foreground the distinction between the distribution theory of formal mathematical statistics and the use of such theory in statistical techniques designed to serve some practical end. Distribution theory rests upon a framework of mathematics, whereas the validity of statistical techniques can only be determined empirically. ... The technique involved in the operation of statistical control [i.e. a process behavior chart] has been thoroughly tested and not found wanting, whereas the formal mathematical theory of distribution[s] constitutes a generating plant for new techniques to be tried.”

Mathematical theory can only approximate what happens in practice. In order for any data analysis technique to be useful it will, of necessity, have to be robust to the assumptions of mathematical theory. Otherwise it would not work in practice. When we turn the assumptions of mathematical theory into requirements that must be satisfied before we use some data analysis technique, we do nothing but add unnecessary complexity to our analysis.

Another aspect of the absurdity of turning mathematical assumptions into preconditions for practice lies in the fact that the techniques for checking on the preconditions will generally be much less robust than the analysis techniques being qualified. As Francis Anscombe, Fellow of the American Statistical Association, said: “Checking your data for normality prior to placing them on a control chart is like setting to sea in a rowboat to see if the Queen Mary can sail.”

## Comments

## How about Central Limit Theorem?

I agree that we don't need to check a data sample for normality, but I think this is due to the Central Limit Theorem (CLT). With the CLT, the underlying distribution doesn't matter. Does that conflict with this article at any point? I agree with the article that a control chart is needed for finding assignable causes and has nothing to do with the distribution of the data. Am I wrong? Also, if an experiment is small in scale (10 samples, limited by resources, for example), a control chart also has quite limited meaning, is that right?

## Great article, as always

Great article, as always, Don.

You say: "The practice of “checking for normality” has become so widespread." The reason is that keeping it simple doesn't make money for consultants, nor does it help the sale of irrelevant and unnecessary statistical software. In most circumstances, Process Behavior Charts can be drawn manually, and doing so gives better insight into what's happening.

Keeping it simple and sticking to the fundamentals is what clients need to learn. Clients need to learn to avoid buying into the latest fads, farce and fraud.

Everyone should purchase your brilliant book "Normality and the Process Behavior Chart". It is very easy reading and most entertaining.

Tony

## Who do you think is right, you or Shewhart? :)

Thank you Dr. Donald J. Wheeler.

I study all your articles with great interest. There is always a lot of useful information in your articles for those who work with real processes. And a subtle sense of humor, as in "who do you think is right, you or Shewhart?", adds a special relevance to the articles.

Kind Regards,

Sergey Grigoryev

Scientific Director at Center AQT (Advanced Quality Tools)

DEMING.PRO

## Control Chart Origin

Even as a non-statistician, it seems clear to me that Shewhart did not have a distribution requirement for using process behavior charts. So why do highly educated folks still refuse to accept Shewhart's rationale? Any thoughts on that? I just don't get it.

Rich

## Why on transformation?

There are two reasons I can think of off-hand for transforming data (and I agree with Wheeler that the underlying assumption of homogeneity is theoretically correct but pragmatically not in play):

1. Need to understand the distribution of the product being sent in the truck

2. Modelling data for which realistic prediction intervals are required

Other than that, transformation should never be discussed, in my opinion. Unfortunately, it is propagated all of the time in training because we confuse these two specific needs with all other applications. As Wheeler rightfully argues, if we have unusual values, finding a distribution that makes them IN control defeats the whole purpose. If a probability plot fails the A-D test but shows that the cause is merely an outlier, then the data ARE close to normal. The failure is the co-mingling of purposes, and Wheeler is correct that concerns over normality don’t apply here until we have a “stable” process. Then limits have a basis in probability. I was fortunate to spend 15 years in an industry where nothing was normal. We solved lots of problems without ever transforming the data.

## Highly educated folks are human too

Thanks to Donald J. Wheeler I have been learning much about common statistical misconceptions. Where do they come from? Consider reading "Statistics Done Wrong: The Woefully Complete Guide" by Alex Reinhart. At the end he ponders the problem of misconceptions, hypothesizing that the standard lecture teaching model is to blame. As I recall, his point is that students come to the class with prior knowledge -- misconceptions -- and in the class those misconceptions don't get revealed or they don't get pummeled into non-existence with the lecture model. He also said, probably paraphrasing, "Misconceptions are like cockroaches. They are everywhere, even when you least expect them. And they are impervious to nuclear weapons." These students go on to become pharmaceutical researchers, doctors, scientists, etc. I extend his concept to this: humans become rather enamored of what they believe. When facts are presented that call into question those beliefs, the beliefs often live on, in defiance of the facts. Yes, even engineering, math, science and technology specialists are subject to this human trait.

## Good question

Thank you, I struggle with the same question. Of course, the first hard question is whether you are mad or if everyone else is mad (statistics professors included). If the conclusion is that the people who likely have a higher IQ than you (the statistics professors) are wrong, then the question is why they are wrong.

Maybe it is because we are all mostly just bald apes trying to do our best in the hierarchies, and that intelligent people are just better at arithmetic. The ability to ask the basic questions, however, might be something that is spread much more scarcely amongst the species.

Or maybe there are other, more specific answers that lend themselves better to constructive criticism of classical SPC. It would be great if someone dared ask text-book authors like Douglas Montgomery how they respond to the criticism. That would require people to be able to ask the basic questions, and not just be good at arithmetic. As mentioned, the answer on that is not obvious.

## Montgomery.

My paper here exposes some of Montgomery's nonsense:

https://www.linkedin.com/pulse/control-charts-keep-simple-dr-tony-burns/

He did not respond.

Dr Wheeler's book "Normality and the Process Behavior Chart" proves Dr Shewhart's assertion by testing 1143 different distributions. The book is an essential read.