Good Limits From Bad Data (Part III)

When you use rational sampling and rational subgrouping, you will have powerful charts.

by Donald J. Wheeler

In March and April, this column illustrated the difference between the right and wrong ways of computing control chart limits. Now I would like to discuss how you can make the charts work for you.

The calculation of control limits is not the end of the exercise, but rather the beginning. The chief advantage of control charts is the way they enable people to reliably separate potential signals from the probable noise that is common in all types of data. This ability to characterize the behavior of a process as predictable or unpredictable, and thereby to know when to intervene and when not to intervene, is the real outcome of the use of Shewhart's charts. The computations are part of the techniques, but the real objective is insight, not numbers.

To gain these insights, you will need to organize your data appropriately. This organization of the data has been called rational sampling and rational subgrouping.

First, you must know the context for the data. This involves the particulars of how the data were obtained, as well as some appreciation for the process or operations represented by the data.

Rational sampling involves collecting data in such a way that the interesting characteristics of the process are evident in the data. For example, if you are interested in evaluating the impact of a new policy on the operations of a single office, you will need to collect data that pertain to that office, rather than to a whole region.

Rational subgrouping has to do with how the data are organized for charting purposes. This is closely linked to the correct ways of computing limits. With average and range charts (X-bar and R charts), there will be k subgroups of data. The right way to compute limits for these charts involves the computation of some measure of dispersion within each subgroup (such as the range for each subgroup). These k measures are then combined into an average measure of dispersion (such as the average range) or a median measure of dispersion (such as a median range), and this combined measure of dispersion is then used to compute the limits.
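As a rough illustration of the average-range approach just described, the following Python sketch computes limits for average and range charts from k equal-sized subgroups. The function name, the dictionary keys and the restriction to subgroup sizes of two through five are assumptions made for this example; the constants are the commonly tabulated, rounded scaling factors for the average range.

```python
from statistics import mean

# Commonly tabulated scaling factors for the average range (rounded),
# indexed by subgroup size n. Only n = 2 through 5 are included here.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def xbar_r_limits(subgroups):
    """Compute average and range chart limits from within-subgroup ranges.

    `subgroups` is a list of k equal-sized lists of measurements.
    (Hypothetical helper written for this illustration.)
    """
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in A2:
        raise ValueError("this sketch only covers subgroup sizes 2 to 5")

    # One average and one range per subgroup.
    averages = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]

    # Combine the k ranges into a single average range.
    grand_average = mean(averages)
    average_range = mean(ranges)

    return {
        "grand_average": grand_average,
        "average_range": average_range,
        "xbar_lcl": grand_average - A2[n] * average_range,
        "xbar_ucl": grand_average + A2[n] * average_range,
        "range_lcl": D3[n] * average_range,
        "range_ucl": D4[n] * average_range,
    }


# Made-up data: three subgroups of four values each.
example = [[10, 12, 11, 13], [11, 11, 12, 10], [10, 14, 12, 12]]
limits = xbar_r_limits(example)
```

Note that the dispersion is measured inside each subgroup first and only then averaged; the spread of all the values taken together is never used.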

The objective of the control chart is to separate the probable noise from the potential signals. The variation within the subgroups will be used to set up the limits, which we shall use as our filters. Therefore, we will want the variation within the subgroups to represent the probable noise, i.e., we want each subgroup to be logically homogeneous. Shewhart said that we should organize the data into subgroups based upon our judgment that the data within any one subgroup were collected under essentially the same conditions.

In order to have a meaningful subgrouping, you must take the context of the data into account as you create the subgroups. You have to actively and intelligently organize the data into subgroups in order to have effective average and range charts. When you place two or more values together in a single subgroup, you are making a judgment that, for your purposes, these data only differ due to background noise. If they have the potential to differ due to some signal, then they do not belong in the same subgroup.
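To see why this judgment matters, consider a small made-up example in Python. The two "machines" and their values are hypothetical, invented only for this sketch, and the helper function is likewise an assumption rather than part of any standard package. The same eight values are subgrouped two ways: once keeping each machine's values together, and once pairing a value from one machine with a value from the other.

```python
from statistics import mean

A2_FOR_N2 = 1.880  # commonly tabulated factor for subgroups of size two


def averages_outside_limits(subgroups):
    """Return the subgroup averages that fall outside the average chart limits.

    Limits are based on the average of the within-subgroup ranges.
    (Hypothetical helper for subgroups of size two only.)
    """
    averages = [mean(s) for s in subgroups]
    average_range = mean(max(s) - min(s) for s in subgroups)
    center = mean(averages)
    half_width = A2_FOR_N2 * average_range
    return [a for a in averages if abs(a - center) > half_width]


# Invented data: machine B runs about four units higher than machine A.
machine_a = [10, 11, 10, 11]
machine_b = [14, 15, 14, 15]

# Subgrouping 1: each subgroup holds values from a single machine.
by_machine = [machine_a[0:2], machine_a[2:4], machine_b[0:2], machine_b[2:4]]

# Subgrouping 2: each subgroup mixes one value from each machine.
mixed = [list(pair) for pair in zip(machine_a, machine_b)]

signals_by_machine = averages_outside_limits(by_machine)
signals_mixed = averages_outside_limits(mixed)
```

With the first arrangement, the within-subgroup ranges reflect only background noise, so the limits are tight and every subgroup average falls outside them. With the second arrangement, the difference between the machines is buried inside each subgroup, the ranges are inflated, and no average falls outside the much wider limits.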

This is why the average chart looks for differences between the subgroups while the range chart checks for consistency within the subgroups. This difference between the charts is inherent in the structure of the computations -- ignore it at your own risk.

But what if every value has the potential to be different from its neighbors, such as happens with monthly or weekly values? With periodically collected data, the chart of preference is the chart for individual values and a moving range (the XmR chart). Here, each point is allowed to sink or swim on its own. The moving range approach to computing limits uses short-term variation to set long-term limits. In this sense, it is like the average chart, where we use the variation within the subgroups to set the limits for the variation between the subgroups.
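The moving range computation can be sketched in a few lines of Python. The function name and dictionary keys below are assumptions made for this illustration; the scaling factors 2.66 and 3.267 are the commonly tabulated, rounded values used with the average of two-point moving ranges.

```python
from statistics import mean


def xmr_limits(values):
    """Compute XmR chart limits from the average two-point moving range.

    `values` is a time-ordered list of individual measurements.
    (Hypothetical helper written for this illustration.)
    """
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")

    # Short-term variation: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]

    center = mean(values)
    average_moving_range = mean(moving_ranges)

    return {
        "center": center,
        "average_moving_range": average_moving_range,
        "x_lower_limit": center - 2.66 * average_moving_range,
        "x_upper_limit": center + 2.66 * average_moving_range,
        "mr_upper_limit": 3.267 * average_moving_range,
    }


# Made-up monthly values, in time order.
monthly = [10, 12, 11, 13, 12, 14]
limits = xmr_limits(monthly)
```

Because the moving ranges depend on the order of the values, the data must be kept in their original time sequence for the limits to mean anything.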

While the right ways of computing limits will allow you to get good limits from bad data, the chart will be no better than your organization of the data. When you use rational sampling and rational subgrouping, you will have powerful charts.

If you organize your data poorly, you can end up with weak charts that obscure the signals. Until you have the opportunity to develop subgrouping skills, it is good to remember that it is hard to mess up the subgrouping on an XmR chart.