Published: Thursday, September 10, 2015 - 15:40

It’s a cold winter’s night in northern New Hampshire. You go out to the woodshed to grab a couple more logs, but as you approach, you hear a rustling inside the shed. You’ve gotten close enough to know you have a critter in the woodpile. You run back inside, bolt the door, hunker down with your .30–06, and prepare for a cold, fireless night.

Analyzing data using common tools like F-tests, t-tests, transformations, and ANOVA methods is a lot like that scenario. They can tell you that you’ve got a critter in the woodshed, but they can’t tell you whether it’s a possum or a black bear. You need to take a look inside to figure this out. Limiting data analysis to the results that you get from the tools cited above is almost always going to lead to missed information and, often, to wrong decisions. Charting is the way to take a look inside your data.

In this article, I will explore data sets that illustrate this point. I’ve chosen two specific sets, but the truths are basic and apply to all data sets; there are many I could have chosen from to use as examples. Both data sets used here are real, nonsimulated data. First I’ll look at the groups using classical methods, and then I’ll look again using control chart methods. I’ll leave the final decision about charting *your* data to you.

A major change was made on a process. Data were collected from parts made before the change and compared to a group made after the change. The question (as initially proposed) was, “Are the two groups the same?” Rather than going into the complexities of asking all of the questions needed to validate this question, or getting into confidence intervals and errors of the estimates, I’ll reword the question as, “With this data set, are there detectable differences between the two groups?” because this is probably what the asker really wanted to know.

To answer this question, I perform a test for the differences in standard deviations (or variances) and a test of the differences in averages (means). Excel quickly provided the statistics below. It can also provide tests of both the variances and the means, but for these analyses I’ll use Minitab. I’ll conduct the test of the variances first, because whether the variances differ significantly determines which version of the test of the averages is correct.

Both tests above indicate that I could expect to see differences in standard deviations this large, or larger, just due to sampling. This means I can assume equal variances for testing the difference in means.

The t-test indicates that, given the number of samples and the variation of the processes, the probability of seeing a difference in averages this large from sampling variation alone is essentially nil.

The answer to the question, “With this data set, are there detectable differences between the two groups?” is “Yes, there are detectable differences.” The variation is about the same, but there’s a significant difference in the averages.
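For readers who want to see the arithmetic behind these two classical tests, here is a minimal sketch using only Python's standard library. It computes the ratio of the two sample variances and the pooled (equal-variance) two-sample t statistic. The function name `compare_groups` and the measurements are hypothetical, made up for illustration; they are not the data analyzed in this article, and turning the statistics into P values would still require F and t tables or a statistics package.

```python
import math
import statistics


def compare_groups(before, after):
    """Return the variance ratio and the pooled two-sample t statistic."""
    n1, n2 = len(before), len(after)
    m1, m2 = statistics.mean(before), statistics.mean(after)
    v1, v2 = statistics.variance(before), statistics.variance(after)

    # Variance ratio, larger sample variance over the smaller one.
    f_ratio = max(v1, v2) / min(v1, v2)

    # Pooled variance, appropriate only if equal variances can be assumed.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_stat = (m2 - m1) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    return {"f_ratio": f_ratio, "t_stat": t_stat, "df": n1 + n2 - 2}


# Hypothetical measurements, NOT the article's data.
before = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7]
after = [10.9, 11.2, 10.8, 11.1, 11.4, 10.7, 11.0, 11.3]

result = compare_groups(before, after)
print(result)
```

Note that both statistics are built entirely from each group's mean and variance. The order in which the parts were made never enters the calculation.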

The data describe a process that makes several parts at a time. Grouping by the time each set came off the machine would reduce the data to a few averages and standard deviations. Sometimes this is enough to give an indication if anything else is going on, but it would be sketchy: A lot could still be hiding in that woodshed. I’m going to use a moving range chart. I can still separate it into larger groups if I want to, but this method will allow me to see all of the data.

The chart quickly reveals several things I didn’t get from the initial analysis:

• The range chart indicates that even though the total spread of the data might not be as great, the “before” process has more moment-to-moment variation.

• The measure of dispersion used in the classic tests of mean and variance assumes that the data are homogeneous. It can’t differentiate between variation caused by capability and variation caused by stability.

• Both processes show a lack of stability. Any attempts at estimating defect rates will be fallacious no matter how you transform the data. You may find a model that fits the data today, but it will be meaningless tomorrow. Again, these transformations count on a homogeneous data set.

• Both processes exhibit a lot of “freaks” that are outside of the normal model used for the control chart; however, nothing in the conventional methods even hints at this, especially if we just transform the data to make them disappear.

• The “after” process is drifting, inflating its total spread. This is visible on the average chart. Shifting of the average in the “after” group is confounding standard analysis and requires further investigation.

• Based on the range over which the “after” process drifted, it’s not unreasonable to assume that, if stabilized, the process could be adjusted to run at about the same average as the “before” data, meaning that the difference in averages may not really be a problem.
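The points above can be sketched with a rough individuals and moving range (XmR) calculation. The snippet below uses the conventional scaling constants for moving ranges of two consecutive values (2.66 for the individuals limits, 3.267 for the upper range limit, and 1.128 to convert the average moving range into a sigma estimate). The function name `xmr_summary` and the drifting series are hypothetical and are not the data from this article.

```python
import statistics


def xmr_summary(values):
    """Individuals and moving range (XmR) limits for time-ordered values."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.mean(values)
    mr_bar = statistics.mean(moving_ranges)
    return {
        "center": center,
        "mr_bar": mr_bar,
        "lower_limit": center - 2.66 * mr_bar,
        "upper_limit": center + 2.66 * mr_bar,
        "upper_range_limit": 3.267 * mr_bar,
        # Point-to-point (short-term) sigma from the moving ranges.
        "sigma_within": mr_bar / 1.128,
        # Global sigma, which treats all values as one homogeneous lump.
        "sigma_global": statistics.stdev(values),
    }


# Hypothetical series that drifts slowly upward, NOT the article's data.
drifting = [10.0, 10.1, 10.0, 10.2, 10.3, 10.2,
            10.4, 10.5, 10.4, 10.6, 10.7, 10.6]

summary = xmr_summary(drifting)
outside = [x for x in drifting
           if not summary["lower_limit"] <= x <= summary["upper_limit"]]

print(summary)
print("Points outside the limits:", outside)
```

In this made-up series, the global standard deviation comes out roughly twice the moving-range estimate because the drift inflates the total spread. A summary statistic alone cannot reveal that, but the chart limits, which are based on point-to-point variation, flag it.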

“These [samples] were taken and measured in sequential order,” reads the original note about these data from 1989. “No adjustments were made during sample collection.” To arrive at the data, 125 parts were taken and measured in the shortest time-frame possible, a classic capability study if ever there was one. The data, as well as both types of capability analysis that are available in Minitab, follow:

Although the data don’t form a perfect bell curve and appear to be a little heavy on the low end, the P value on the Anderson-Darling probability chart indicates that I could expect to see a value this high about 12 times out of a hundred with normal data. Kurtosis and skew values are also relatively small. There’s no indication that these data are anything but normal, so I won’t transform the data. The Cp is 0.88, the Cpk is 0.55 (according to two out of three of the methods)—an unacceptable process, but a sound capability analysis. So the question is, “What is wrong with this?”

The short answer is, “Just about everything.” The basic assumption behind all of the above conclusions is flawed. The results above, computed from summary statistics, assume that the data are from a stable, consistent, homogeneous source. Both the control chart and the point plot tell us at a glance that this is not a valid assumption. This process is unstable. Any conclusions drawn from the summary statistics, as well as any transformations that I might have done, are meaningless if the data don’t all come from the same process. These parameters and tests assume that if a given value has a certain probability of occurring in one sampling, it will have the same probability of occurring in any other sampling. This is not true in an unstable process.
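To make the dependence on homogeneity concrete, here is a small sketch that computes Cp and Cpk twice for the same values: once with the overall standard deviation and once with a short-term sigma taken from the average moving range. The specification limits, the data, and the helper names (`capability`, `within_sigma`) are all hypothetical. This is not a reproduction of the Minitab analysis above, and Minitab's own sigma estimators may differ in detail.

```python
import statistics


def within_sigma(values):
    """Short-term sigma estimated from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return statistics.mean(moving_ranges) / 1.128


def capability(values, lsl, usl, sigma):
    """Return (Cp, Cpk) for the given spec limits and sigma estimate."""
    mean = statistics.mean(values)
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical measurements with a shift halfway through the run.
shifted = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9,
           5.6, 5.5, 5.7, 5.6, 5.5, 5.7]
LSL, USL = 4.5, 6.0  # hypothetical specification limits

overall = capability(shifted, LSL, USL, statistics.stdev(shifted))
within = capability(shifted, LSL, USL, within_sigma(shifted))

print("Cp, Cpk using overall sigma:", overall)
print("Cp, Cpk using within sigma: ", within)
```

When the two sets of indexes disagree this much, the disagreement is itself a symptom that the data did not come from a single stable process. Neither pair of numbers predicts what the process will do next.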

Although we are told to “always chart our data,” usually that occurs as an afterthought. Often, when charts are readily available, such as in the Minitab analysis above, they are only glanced at or even ignored. Charting the data (and looking at and understanding the chart) must be the first step in an analysis, not the last. Only by doing so can a meaningful decision be made about any additional analysis.

There are a lot more unstable processes out there than stable ones. When properly used, control charts can tell you so much more about your data than conventional statistical methods, including whether your (stable) process is actually skewed or not. During the last several years, I’ve heard less and less about the use of control charts, especially as an analytical tool. Designed experiments are sexy, but unless you’re using ANOM techniques, who gives a moment’s thought to within-treatment stability?

“The problem is not with the choice of the model or with the mathematics, but rather with the assumption that the data were homogeneous,” says Donald J. Wheeler in the *Quality Digest Daily* article, “Why We Keep Having 100-Year Floods.” “Anytime we compute a summary statistic, or fit a probability model, or do just about anything else in statistics, there is an implicit assumption that the data, on some level, are homogeneous. If this assumption of homogeneity is incorrect, then all of our computations, and all of our conclusions, are questionable.”

## Comments

## How Elegant, (deceptively) Simple and Eloquently Stated

Hi, Douglas,

All I can say is BRAVO! for a clear article that should, but unfortunately won't, eliminate hours of legalized torture that goes in the name of statistical training for a "belt." As I like to say, there is no "app" for critical thinking applied to a simple plot of data over time.

Regarding the Normal distribution: I saw a live broadcast of a Deming 4-day seminar and someone mentioned the Normal distribution in a question to him. He gave that famous terrifying scowl and GROWLED, "Normal distribution? I've never seen one!" -- end of answer.

I don't feel so alone in my approach to data any more. Thank you!

Kind regards,

Davis Balestracci

## Transforming Data

For anyone who may not have seen it, Dr. Wheeler expressed it well in his response to comments on his latest article:

"The point is not about how we find an estimate of the parameters for a probability model. But rather that regardless of how we estimate our parameters, the whole process is filled with uncertainty, and that these uncertainties will have the greatest impact upon the extreme critical values. The statistical approach and Shewhart's approach are diametrically opposite, and until someone understands this, they cannot begin to understand how Shewhart's distribution free approach can work."

This is the "roll of the dice" that defines any data transformation.

## Transforming Data

Douglas,

Thanks for the nice article. There is just one point of discomfort for me when you state: "There’s no indication that these data are anything but normal, so I won’t transform the data." In my 40+ years utilizing Process Behavior Charts, I cannot recollect a single time I have had to worry about normality or transform data before constructing a chart to get the understanding of the process I needed. Shewhart himself dispelled the notion that the normal distribution was important to the use of control charts: "The normal distribution is neither a prerequisite for nor a consequence of statistical control."

## Transforming Data

Hi Steve,

I guess I didn't get it across well, but that was my point. I think many these days transform their data as the first step. If I ever did (and I never have either), it would only be on a known homogeneous population with strong, and I repeat STRONG, statistical evidence that it was something other than normal, and I would not do it for my control charting. I am sitting here trying to think of a situation where it would be really important to perform a transformation, but even in determining predicted proportions defective at extreme tails (on said homogeneous population), it is really a roll of the dice whether you get a better estimate or not. I think usually it is done these days when people don't like the results of the normal model and want to present prettier numbers.

## Transforming

Thank you, Douglas. Well said!