John Flaig

Six Sigma

A Bell-Shaped Distribution Does Not Imply Only Common Cause Variation

Random does not imply normal

Published: Monday, September 18, 2017 - 12:03

Story update 9/26/2017: The words "distribution of" were inadvertently left out of the last sentence of the second paragraph.

Some practitioners think that if data from a process have a “bell-shaped” histogram, then the system is experiencing only common cause variation (i.e., random variation). This is incorrect and reflects a fundamental misunderstanding about the relationship between distribution shape and the variation in a system. However, even knowledgeable people sometimes make this mistake.

For example, paraphrasing from a popular Six Sigma textbook, when most values fall in the middle and tail off in either direction, we have statistical evidence of common cause variation.1,2 This is an invalid statement, and the misunderstanding probably stems from the fact that if we were sampling means from a stable process, the central limit theorem would assure us that the sample means would be approximately normally distributed. However, even though the histogram of the subgroup means is bell-shaped, the process itself may still be non-normal or be experiencing special or systematic causes of variation (i.e., it may be out of control). To determine the correct status of the process, we must look at the control chart of the individual observations, not the distribution of subgroup means.
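The central limit theorem point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential process, the subgroup size of 25, and the seed are arbitrary choices made for illustration and do not come from the article.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

rng = random.Random(2017)  # ASSUMPTION: arbitrary fixed seed so the run is repeatable

# ASSUMPTION: a stable but strongly skewed process (exponential individual values).
individuals = [rng.expovariate(1.0) for _ in range(20000)]

# Subgroups of n consecutive values and their means.
n = 25  # ASSUMPTION: illustrative subgroup size
subgroup_means = [
    statistics.fmean(individuals[i:i + n])
    for i in range(0, len(individuals), n)
]

skew_individuals = skewness(individuals)
skew_means = skewness(subgroup_means)
print(f"skewness of individuals:    {skew_individuals:.2f}")
print(f"skewness of subgroup means: {skew_means:.2f}")
```

The subgroup means come out far more symmetric than the individual values, even though the underlying process is not normal. The shape of the means therefore says little about the shape, or the stability, of the process that produced them.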

The fact that a “normal” distribution shape does not imply process stability is known as the Quetelet Fallacy and is documented in The History of Statistics.3 You may be surprised to learn that many educated people, including statisticians and engineers, either have no knowledge of the fallacy or believe the underlying conjecture to be true, and that this belief has a long history. The first documented counterexample was Sir Francis Galton’s famous sweet pea experiment of 1875, which exposed the Quetelet conjecture as false.4

A proof is given below for the argument that a normal or bell-shaped histogram does not imply that the system is experiencing only common cause variation, and conversely a system experiencing only common cause variation will not necessarily have a normal distribution of observations.

Theorem: Normal does not imply Random, and Random does not imply Normal
Part 1. The proof that “Random does not imply Normal” is obvious because you can generate random (i.e., common cause) distributions that are uniform, triangular, Weibull, Poisson, Cauchy, etc., and yes, even Normal (see JMP or Minitab for examples). Also, Walter A. Shewhart’s figure 9 in his 1931 book, Economic Control of Quality of Manufactured Product, contains an example. It is the histogram of the modulus of rupture for Sitka spruce trees. The histogram is skewed, but Shewhart observes that the process is at least approximately in a state of statistical control.5

Part 2. The claim that “Normal implies Random” is false, as the counterexample below illustrates. In this example the histogram is bell-shaped, but the system is experiencing both special cause (in this case systematic) variation and common cause (i.e., random) variation. In the graph, the slope of the polynomial trend line characterizes the special cause (systematic) variation, and the spread of the points about the trend line characterizes the common cause (random) variation.

Clothing sales data for spring, summer, and fall (× 1,000 units)
{1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9}
Period 1: May, June (six weeks, new marketing dialog)
Period 2: July, August (seven weeks, old marketing dialog)
Period 3: September, October (six weeks, new marketing dialog)
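As a rough check on the data above, the following sketch tallies the 19 values into a text histogram. The values are copied from the list as printed, which appears to be sorted by size rather than by week; the histogram does not depend on that order. The variable names are illustrative.

```python
from collections import Counter

# Sales values (x 1,000 units) exactly as listed in the article.
sales = [1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9]

# Tally how often each value occurs and print one bar per value.
counts = Counter(sales)
for value in range(min(sales), max(sales) + 1):
    print(f"{value:>2} | {'#' * counts[value]}")

mean = sum(sales) / len(sales)

# The tally is symmetric about 5 if value v occurs as often as value 10 - v.
is_symmetric = all(counts[v] == counts[10 - v] for v in range(1, 10))

# Reordering the observations leaves the histogram unchanged.
same_histogram_any_order = Counter(reversed(sales)) == counts

print(f"mean = {mean}, symmetric = {is_symmetric}")
```

The tally is symmetric about its mean and peaks in the middle, which is why the histogram looks bell-shaped. It is also identical for any ordering of the same 19 values, so it carries no information about how sales behaved over time.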

Histogram of the sales data

The graph of the sales over time shows the effect of the marketing programs in the spring and fall. This change in performance was caused by systematic changes in the process (i.e., the marketing initiatives) and not just random variation.

Plot of sales performance over time
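To see how time order can expose what the histogram hides, here is a minimal sketch of individuals (XmR) chart limits computed from the average moving range with the usual 2.66 scaling factor. The article does not give the week-by-week sequence, so the ordering used here (the values in the order they are listed, which amounts to a steady upward trend) is an assumption made purely for illustration. The function names are hypothetical.

```python
# Sales values (x 1,000 units) as listed in the article.
sales = [1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9]

def xmr_limits(series):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(series, series[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(series) / len(series)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two points.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

def out_of_limits(series):
    """Return the observations that fall outside the natural process limits."""
    lcl, _, ucl = xmr_limits(series)
    return [x for x in series if x < lcl or x > ucl]

# ASSUMPTION: the listed order is treated as the time order. The true weekly
# sequence is not given in the article.
assumed_order = sales
lcl, center, ucl = xmr_limits(assumed_order)
signals = out_of_limits(assumed_order)

print(f"limits: {lcl:.2f} to {ucl:.2f}, center {center:.2f}")
print(f"points outside limits: {signals}")
```

Under this assumed ordering, several of the lowest and highest values fall outside the limits, so the chart signals special cause variation even though the histogram of the same values is bell-shaped.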

Irrespective of the shape of the distribution, a good way to arrive at the correct conclusion regarding process stability is by looking at a control chart of the behavior of the individual observations from the process, or for highly skewed distributions, by using the F* test6 [Cruthis, 1993] or the Dixon and Massey z-test7 where z ~ N(0, 1) and is given by:

1. Eckes, G. The Six Sigma Revolution. New York: John Wiley & Sons, 2001, p. 97.
2. Eckes, G. Six Sigma for Everyone. New York: John Wiley & Sons, 2003, pp. 72, 73.
3. Stigler, S. M. The History of Statistics. Cambridge, MA: The Belknap Press of Harvard University Press, 1986.
4. Wheeler, D. J. personal communications, 2016.
5. Shewhart, W. A. Economic Control of Quality of Manufactured Product. New York: D. Van Nostrand, 1931. (Republished in 1980 by the American Society for Quality Control, Milwaukee, WI.)
6. Cruthis, E. N. and S. E. Rigdon. “Comparing Two Estimates of Variance to Determine the Stability of a Process,” Quality Engineering, vol. 5, no. 1, 1993.
7. Dixon, W. J. and F. J. Massey. Introduction to Statistical Analysis. New York: McGraw-Hill, 1969.



About The Author

John Flaig

John J. Flaig, Ph.D., is a fellow of the American Society for Quality and is managing director of Applied Technology at www.e-at-usa.com, a training and consulting company. Flaig has given lectures and seminars in Europe, Asia, and throughout the United States. His special interests are in statistical process control, process capability analysis, supplier management, design of experiments, and process optimization. He was formerly a member of the Editorial Board of Quality Engineering, a journal of the ASQ, and associate editor of Quality Technology and Quantitative Management, a journal of the International Chinese Association of Quantitative Management.


Still not sure...

I'm still not sure what you're saying here, John. Does not an in-control R or S chart imply the presence of only common cause variation? 

I agree that to look at capability you have to examine the distribution of individuals, not averages, but I believe that is a different question. 

s-chart and stability


A stable s-chart implies that the factors that control the variance are experiencing only common cause variation. However, the system itself could still be unstable if the factors that control location were experiencing special cause variation (i.e., the mean was unstable).

An interesting case is the x-chart, where all the information about the process is in the chart (i.e., it is a sufficient statistic). If the x-chart is stable, does that imply that both the mean and the variance are experiencing only common cause variation?



Rip, Sorry, I had a typo in

Rip, sorry, I had a typo in the sentence. "To determine the correct status of the process, we must look at the control chart of the individual observations, not the DISTRIBUTION of subgroup means." That is, distribution shape does not imply common cause. Also, your statement about in-control R or S charts is correct.

Typo fixed in text

Hi guys, thanks for pointing out the typo. It has been fixed.

Steve and Rip, The Central

Steve and Rip,

The Central Limit Theorem tells us that the subgroup means will be approximately normally distributed, but just because this distribution is normal does not imply common cause variation of the system. The same is true for individual observations.

Individual Charts

Thanks for the article.  I do not agree with the statement:  "To determine the correct status of the process, we must look at the control chart of the individual observations, not the subgroup means"

I don't see why the fact that the means from a STABLE process follow a normal distribution would lead you to disqualify the use of an averages chart for assessing process stability. Certainly means from an unstable process do not necessarily follow a bell. And if we are sampling correctly (to capture common cause variation only), the xbar chart should certainly detect the instability.

A fundamental issue with individual charts is that they do NOT detect small process changes quickly. Conversely, charts of averages can detect small process changes by determining the appropriate sample size. Many of my clients need to detect much smaller process changes than can be detected quickly/reliably with an "I chart", so Xbar charts are much more useful. Where sampling is costly, a CUSUM chart on the individuals may be used.
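The sensitivity difference described above can be sketched numerically. This is an illustration under simplifying assumptions that are not part of the comment: normally distributed, independent observations, a known process sigma, and only the "one point beyond the 3-sigma limits" rule. The function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def detection_probability(shift_sigma, n):
    """Chance that one plotted point falls outside 3-sigma limits after the
    process mean shifts by shift_sigma (in units of the process sigma).
    n is the subgroup size; n = 1 corresponds to an individuals chart."""
    d = shift_sigma * math.sqrt(n)  # shift measured in standard errors of the mean
    return (1.0 - normal_cdf(3.0 - d)) + normal_cdf(-3.0 - d)

# ASSUMPTION: a one-sigma shift and these subgroup sizes are illustrative choices.
for n in (1, 4, 9):
    p = detection_probability(1.0, n)
    print(f"n={n}: P(signal per point)={p:.4f}, average run length={1 / p:.1f}")
```

Under these assumptions a one-sigma shift is flagged by only about 2 percent of the points on an individuals chart, compared with roughly 16 percent of subgroup means when n = 4 and about half of them when n = 9. That is the trade-off between sampling cost and speed of detection the comment refers to.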

Great point

I used to teach ASQ Black Belt exam prep classes using the Black Belt Primers from QCI. They contained an example that stated that one could determine stability by looking at a histogram. I don't know whether they have corrected that or not. 

I think I agree with Steve, but I'd have to know what you meant by "status" of the process. I don't know of any practical utility in running an XmR chart on data for an XBarR chart, unless you have a rule seven violation. If you've subgrouped well, and the XbarR chart shows a stable process, then you have a stable process. If you want to know the shape of the underlying data, then a discrete plot or a histogram will tell you that (but it still won't say anything about stability).