Making Decisions in a Non-Normal World

The power of the central limit theorem

Throughout the last couple of articles, I have explained and illustrated that understanding the random sampling distribution (RSD) of a statistic is key to understanding the entire basis of inferential statistics. Which is just a fancy way of saying “avoiding career-terminating decisions.” This month I’ll show you how the central limit theorem is your best friend, statistically speaking.

…


Comments

Back to the dark ages

This sort of nonsense, which is a product of the even greater rubbish of "Six Sigma", has sent quality backwards a century, to the days before Shewhart. You would do well to read Dr Wheeler's excellent book "Normality and the Process Behaviour Chart". He examines 1143 distributions in detail and shows that Shewhart charts work well for the entire range of skewness and kurtosis in this article. (see page 88)

Dark Ages + Heretic = Time to Fetch the Wood!

Hello again ADB. Glad to see you are still reading! Or maybe you didn't read the article, since your comment doesn't actually pertain to it. That would make me sad.
.
I am not catching which bit is rubbish - perhaps you could specify. The Central Limit Theorem has been around far longer than Six Sigma, and is pretty well depended upon for hypothesis testing to work as a real-world heuristic and (as I demonstrate in the article) is a statement of fact. I don't know of anybody, Wheeler included, who would argue that it is wrong or unneeded, and a number of his examples rely on it. Wheeler would say it is not needed for control charts to work even though it applies to the subgroup means, but that is a different topic about which experts disagree*. The article is not about control charts.
.
My article deals with hypothesis testing and (peripherally) sample size calculation. As I show, the CLT allows you to make decisions about shifts in average regardless of the population distribution. Good luck finding something by Wheeler that contradicts that!
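.
To make that concrete, here is a minimal simulation sketch (not taken from the article; the exponential population, the sample size of 30, and all names are assumptions chosen for illustration). It draws individual values from a strongly skewed population and compares their shape with the shape of the distribution of sample means.

```python
import random
import statistics

def skewness(values):
    """Third standardized moment (population form) of a list of numbers."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def sample_means(draw, n, reps, rng):
    """Means of `reps` independent samples of size `n`, each value from draw(rng)."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

def draw_exponential(rng):
    """Assumed skewed population: exponential with mean 1 and sd 1."""
    return rng.expovariate(1.0)

rng = random.Random(42)

individuals = [draw_exponential(rng) for _ in range(20000)]
means_n30 = sample_means(draw_exponential, n=30, reps=5000, rng=rng)

skew_individuals = skewness(individuals)   # theory: 2 for an exponential
skew_means = skewness(means_n30)           # theory: 2 / sqrt(30), about 0.37
sd_means = statistics.pstdev(means_n30)    # theory: 1 / sqrt(30), about 0.18

print(f"skewness of individuals:  {skew_individuals:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
print(f"sd of sample means:       {sd_means:.3f}")
```

The individual values stay heavily skewed, while the means of samples of 30 are far closer to symmetric and have the spread the CLT predicts. That is what lets a test on the average work even when the population is not normal.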
.
So I would recommend that you re-evaluate what you think you learned from Wheeler's book and consider if this article has anything to do with it. To paraphrase Inigo Montoya, "I don't think that book means what you think it means." I understand that your habitual hagiography of Dr. Wheeler may impede your ability to think critically on the matter, but I encourage you to approach it scientifically, as opposed to arguing by authority (and incorrectly interpreting authority at that). No one, not myself and not Dr. Wheeler, is above critical examination.
.
Out of curiosity - if you choose to have faith that we don't need to do anything special for non-normal populations, how would you handle capability calculations for a non-normal process - use the mean and standard deviation as estimated from the dispersion chart? Because that is going to get you a really wrong answer. Even Wheeler doesn't advise ignoring non-normality for capability.
_____________________________________________________________________
*Control charts do not require a process to be normally distributed to work - they are there to detect special causes, which of course could very well make a population fail a normality test. However, certain adjustments do need to be made for non-normal populations. The individuals chart would need to be adjusted for the population distribution in order to determine if it is in control, and capability is calculated differently for non-normal populations for all control charts.
.
>edited to add a note about control charts and normality

Stats Geek

Steven,

Most of the readers of QD are not "totally stats geek" types, which may be the problem. As Quality Engineers we are tasked to improve processes/systems. We frankly don't like statistics or have no time to play with numbers and histograms. What we need, and what Dr. Wheeler provides, are simple tools to help us understand the UNDERLYING PROCESS that produces the numbers so we can improve these processes. Examples include:

- If a process is stable: perform capability or Wheeler's EP&U metric
- If a process is not stable: hunt down those special causes first before performing a Capability analysis
- If a process is not stable: hunt down those special causes first before running a DOE

Now for a QA Engineer that's useful stuff. As a degreed stats guy you could contribute much in this needed area of applied data analysis. You would be shocked how poorly data analysis is performed in industry. It's awful.

You are right that not everyone agrees with Dr. Wheeler but the same could be said about Deming.

Rich

Stats Geek? Me??? :-)

Hi Rich!
.
Hmm, possibly true, but that doesn't answer why ADB (who has a PhD in stats if I recall correctly) goes all frothing over a topic that this article isn't about. One might suspect that he had read the title and then posted his comment, if one were uncharitably inclined. Now I'll chat about your points, but note that the entire article above has nothing to do with ADB's objection and your points are in response to it, rather than to the article. Heck, I'll chat about anything, obviously. :-)
.
Control charts are very powerful tools for understanding a process - I love using them and teaching them and they are essential tools in the toolkit. No one ever said that you can't use a control chart if the process is non-normal, or even if a process is out of control. After all, that is the whole purpose of the thing! Anyone saying either of those things is presenting a strawman argument. But this article wasn't about that.
.
There are simple tools to understand and improve a process: e.g. the Seven Basic Tools. Nothing wrong with that, but they will only take you so far. If you are in health-care, you can probably make HUGE improvements just using those tools to pick the low-hanging fruit. If you are in manufacturing, you have probably been using them for years. To get the "higher, sweeter fruit" you are going to have to learn a new level of understanding about the strengths (and limitations) of statistics, including experimental design. Statistics is just another word for data-based decision-making. By the way, I would not be shocked at the level of analysis in industry - I have been in it consulting and teaching since 1991. I know just how bad it is, even at big companies with lots of Master Black Belts. I like to think my little articles might help influence a change in that. Oh and don't forget, I started off as a process and product engineer, so I know whereof I speak, and it ain't from no ivory stats tower my friend! I tell you from experience - learning to love that extra level of knowledge will in fact make a QE's job easier and make them more effective.
.
No lie - most of what I see stats-lacking Black Belts, Master Black Belts, and Quality Engineers do as part of their job is 1) not needed and 2) misleading them. (I won't say "stats-hating" since I haven't had a chance to teach them yet!) I particularly dislike what I call "black box" statistics, where people are taught to just put the data into the software and push the button, without understanding anything about what they are doing. This is where ADB and I agree.
.
If a process is stable and non-normally distributed, calculating the capability of the process to meet customer requirements will give you the wrong answer for two reasons: first, the estimate of the variance will be wrong, since using the dispersion metric to estimate the true process variability rests on a non-robust assumption of normality; and second, the capability indices assume normality the way most people calculate them. So you have two big errors that result in a capability index that is just not all that related to what you actually have. The capability of such processes can easily be calculated; you just can't use what pops out of some software's control chart for it.
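.
Here is a rough sketch of the second error only (the normality assumption inside the index). The lognormal process, the upper specification limit of 4.0, and the variable names are hypothetical, not data from the article. It compares the out-of-spec fraction implied by a normal-theory index with what the simulated process actually produces.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical stable but right-skewed process (lognormal), for illustration only.
data = [rng.lognormvariate(0.0, 0.6) for _ in range(50000)]
usl = 4.0  # hypothetical upper specification limit

mean = statistics.fmean(data)
sd = statistics.stdev(data)

# Normal-theory view: one-sided index and the tail fraction it implies.
ppu_normal = (usl - mean) / (3 * sd)
predicted_beyond_usl = 1 - statistics.NormalDist(mean, sd).cdf(usl)

# What the simulated process actually does.
observed_beyond_usl = sum(x > usl for x in data) / len(data)

print(f"normal-theory index (upper side): {ppu_normal:.2f}")
print(f"predicted fraction above USL:     {predicted_beyond_usl:.5f}")
print(f"observed fraction above USL:      {observed_beyond_usl:.5f}")
```

For this assumed process the index looks respectable (theoretically a little under 1.2), yet the true fraction above the limit is about 1%, while the normal model predicts roughly 0.02%. The index is answering a question about a distribution the process does not have.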
.
You *can* run a DOE on a process that is out of control... but it will be more expensive (due to a larger sample size, process controls implemented for the experiment, etc.) and your results will obviously not confirm as frequently as you would expect from the alpha level you chose. Still, it can be (and has been) done.
.
Actually, Wheeler and I probably agree on more than we disagree. But what is the fun in that? Or another way of saying it, what do you learn from that? It is points of disagreement that lead to new knowledge. Otherwise I would write a bunch of articles saying, "Yeah me too!" Bleahh.

Normality?

As Dr. Deming said in a four-day seminar I attended in Cincinnati (1991): "Normal distribution? ... I never saw one." This leads me to wonder: why are we so obsessed with "checking for normality"? The so-called Goodness of Fit tests should actually be called Lack of Fit tests (and sometimes they are called this). Given enough data, virtually any data set collected from a "real" process will show a lack of fit for the normal distribution, even if the histogram "looks" like a normal bell curve; in that case, the lack of fit generally shows up near the tails of the distribution. So, the world in general IS non-normal.
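
The "given enough data" effect can be sketched with a small simulation. Everything here is an assumption for illustration: the nearly normal population (a standard normal plus a small exponential component), the sample sizes, and the choice of the Jarque-Bera moment test, which is used only because its p-value is easy to compute with the standard library. Its chi-square approximation is rough for small samples.

```python
import math
import random
import statistics

def jarque_bera_p_value(sample):
    """Asymptotic p-value of the Jarque-Bera normality test (chi-square, 2 df)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurtosis ** 2 / 4)
    return math.exp(-jb / 2)  # survival function of chi-square with 2 df

def draw_nearly_normal(rng):
    """Assumed population: almost normal, with a slight right skew."""
    return rng.gauss(0, 1) + 0.3 * rng.expovariate(1.0)

rng = random.Random(2024)
p_values = {}
for n in (50, 500, 5000, 200000):
    sample = [draw_nearly_normal(rng) for _ in range(n)]
    p_values[n] = jarque_bera_p_value(sample)
    print(f"n = {n:>6}: p = {p_values[n]:.4g}")
```

The population never changes, and a histogram of it would look like a bell curve, but at the largest sample size the test has enough power to flag the small departure.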

Deming Knew from Normality

Hi Steve, and thanks for reading!
.
Yep, Deming said that, but he was very aware of the theoretical basis for making decisions using statistics and testing for normality - my colleague Dr. Jeff Luftig worked for Deming at Ford (and is mentioned by Deming in Out of the Crisis) and he is a fanatic on testing for normality. You have merely misinterpreted what Deming's comment was intended to demonstrate. (If I recall correctly, he made that comment when talking about the difference between enumerative and analytical statistics and would not have meant to convey that testing for normality was unimportant in decision-making.)
.
As I quoted Box in the article, “Essentially, all models are wrong, but some are useful.” We test for normality in order to determine if it is *useful* to use the normal distribution as an approximation. The normal distribution has a sound theoretical basis for occurring, at least some of the time, and it is incredibly useful as an approximation since it allows you to do a lot of powerful tests with it. (Power here means "able to see effects for a lot less time, trouble, and money than otherwise").
.
If it fails a test for normality, by definition it is too different from a normal distribution for that approximation to be useful. Using the normal distribution as an approximation would mislead us a lot more frequently than a test's stated alpha and beta error rates. Don't be worried about that, though, since there are a number of ways we can handle distributions that are not normal. We just don't want to use the normal approximation when it might mislead us (as in Figure 7 above).
.
As to real process data, that is why I recommend using an alpha of 0.05 with the Anderson-Darling test for n<25 and the skewness and kurtosis moment tests for larger sample sizes. This is a good balance between falsely rejecting a distribution that is reasonably normal (alpha error), and missing a significant deviation from the normal probabilities (beta error). In my experience, about 30-40% of real processes are reasonably approximated by the normal distribution (higher when dealing with measurement error).
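.
For the small-sample case, a minimal sketch of an Anderson-Darling check at alpha = 0.05 might look like the following. The function name and the two synthetic 20-point samples are hypothetical. The small-sample adjustment and the 0.752 critical value are the commonly tabulated ones for the case where the mean and standard deviation are estimated from the data; treat them as assumptions to verify against your own reference.

```python
import math
import statistics

def anderson_darling_normal(sample, critical_value=0.752):
    """Return (adjusted A-squared, reject) for a normality check at about alpha = 0.05."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    std_normal = statistics.NormalDist()
    # CDF of the standardized, sorted observations under the fitted normal.
    cdf = [std_normal.cdf((x - mean) / sd) for x in sorted(sample)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a_squared = -n - total / n
    adjusted = a_squared * (1 + 0.75 / n + 2.25 / n ** 2)
    return adjusted, adjusted > critical_value

# Two synthetic samples of n = 20 built from normal quantiles (illustration only).
quantiles = [statistics.NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
bell_shaped = [10 + 2 * z for z in quantiles]      # idealized normal-looking data
right_skewed = [math.exp(2 * z) for z in quantiles]  # strongly lognormal-looking data

stat_bell, reject_bell = anderson_darling_normal(bell_shaped)
stat_skew, reject_skew = anderson_darling_normal(right_skewed)

print(f"bell-shaped:  A*2 = {stat_bell:.3f}, reject normality: {reject_bell}")
print(f"right-skewed: A*2 = {stat_skew:.3f}, reject normality: {reject_skew}")
```

The idealized bell-shaped sample passes and the strongly skewed one fails, which is the behavior you want from the check at this sample size.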
.
But that is only part of the story too - a process that is out of control (as determined by a control chart) might fail (or heck, even pass) a normality test too. That is why you need to understand all four characteristics of a data set: shape, spread, location, and through-time behavior in order to understand what is going on.
.
You say, "The world in general IS non-normal," to which I say, "Yep. Thank goodness for the CLT and the RSD which allow us to use the heuristic of statistics to help us make (some) decisions!"