In the 1940s the War Production Board trained approximately 50,000 individuals in how to use process behavior charts (also known as control charts). At that time the computations were done by hand, and the emphasis was on making things as easy as possible for those doing these computations. As a result, sets of scaling factors were created for use in computing limits for the different charts.
With this approach, the number of scaling factors increased as the number of different charts increased, resulting in a veritable alphabet soup. Some of these scaling factors date back to 1935, while others are of more recent vintage. Some of these scaling factors were created for purely academic situations while others were created for use with the data. Since different references would give different collections of these scaling factors, it was inevitable that over time notational differences would occur. In 1995 I collected all 35 of these scaling factors together, along with their derivations and formulas, in my Advanced Topics in Statistical Process Control, Second Edition (SPC Press, 2004). In this article I have extracted the 22 scaling factors which can be used to compute correct limits directly from the data. Thus, this article is intended as a quick, yet complete, reference guide that will help the user to know when each scaling factor is to be used.
Process behavior charts will generally consist of two parts: a chart for location and a chart for dispersion. The first step in determining which chart to use is to consider how the data are organized. Here there are two major categories. The data may be organized into k rational subgroups of size n, where each subgroup is logically homogeneous, or the data may consist of a stream of k individual values where these values are logically comparable. These two categories are represented in the first column of figure 1.
When your data consist of a sequence of k individual values you should place these values on a chart for individual values and a moving range, as shown in the last row of figure 1.
When your data are arranged into rational subgroups you will have to determine what summary statistic will be used to characterize the location and the dispersion for each of the k subgroups.
Commonly we will use the k within-subgroup averages to characterize the location of each subgroup. Since the idea of rational subgrouping is to have homogeneous subgroups, there is rarely any need to use the within-subgroup medians. (Formerly we would occasionally use the within-subgroup medians in order to minimize the arithmetic for the benefit of the operators, but software has essentially made this usage obsolete.) This choice for the within-subgroup measure of location is shown in column two of figure 1.
Next you will need to determine which within-subgroup measure of dispersion you are going to use. You might use the subgroup ranges, the subgroup standard deviation statistics, or the root mean square deviation statistics.
For the sake of clarity, the root mean square deviation (RMS deviation) is defined by its name (read in reverse) and is the measure of dispersion with n in the denominator.
The standard deviation statistic differs from the RMS deviation in that it has (n–1) in the denominator.
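The three within-subgroup measures of dispersion can be sketched in a few lines of Python. This is only an illustration: the function name `subgroup_dispersion` and the five sample values are hypothetical and do not come from any figure in this article.

```python
import math

def subgroup_dispersion(values):
    """Return the range, standard deviation, and RMS deviation of one subgroup."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the subgroup average.
    sum_sq = sum((x - mean) ** 2 for x in values)
    return {
        "range": max(values) - min(values),
        "std_dev": math.sqrt(sum_sq / (n - 1)),  # (n - 1) in the denominator
        "rms_dev": math.sqrt(sum_sq / n),        # n in the denominator
    }

# Hypothetical subgroup of size n = 5.
stats = subgroup_dispersion([4, 6, 5, 7, 3])
print(stats)
```

Because the two statistics differ only in the denominator, the RMS deviation is always slightly smaller than the standard deviation statistic for the same subgroup.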
With subgroup sizes of n = 15 or less there is little practical difference between the within-subgroup range, the within-subgroup standard deviation, and the within-subgroup RMS deviation. They work the same and result in comparable charts. Your choice is more a matter of your preferences than anything else. These choices are listed in the third column of figure 1. (However, as shown in figure 1, when using subgroup medians there is no reason to use anything other than the subgroup ranges.) Once you have made the choices offered in columns two and three of figure 1, the type of chart you will need to use is listed in the last column.
Once you know which type of chart you will be using, you will need to determine which summary within-subgroup measure of dispersion will be used to obtain your limits. As may be seen in the second column of figure 2, your choice here will consist of the average of your within-subgroup measures of dispersion or the median of your within-subgroup measures of dispersion. In general, for the default computation, the average of your within-subgroup measures of dispersion is recommended. However, in those cases where some extremely large values may have inflated this average, you may switch to using the median of your within-subgroup measures of dispersion to improve the sensitivity of your analysis.
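To see why the median can be the more sensitive summary, consider the following sketch. The subgroup ranges are made up for illustration, with one deliberately extreme value.

```python
from statistics import mean, median

# Hypothetical within-subgroup ranges; the last one is an extreme value.
subgroup_ranges = [4, 5, 3, 6, 4, 5, 21]

average_range = mean(subgroup_ranges)   # pulled upward by the extreme range
median_range = median(subgroup_ranges)  # largely unaffected by it

print(f"average range = {average_range:.2f}")
print(f"median range  = {median_range}")
```

Here the single large range inflates the average range, which would widen the limits, while the median range stays close to the typical value. Note that the average and the median of the dispersion statistics each require their own scaling factors, so the two summaries are not interchangeable within a formula.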
The limits for the chart for subgroup averages and the limits for the chart for subgroup medians will be found using equation 1:
Equation 1: Grand Average ± Scaling Factor * Summary Within-Subgroup Measure of Dispersion
The grand average is the typical central line for the average chart and may also be used for a median chart. Occasionally it may be replaced by the median of the subgroup averages or even, in rare cases, by a nominal value. The scaling factors denoted by the letter A are used with equation 1.
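As a minimal sketch of equation 1, the following Python computes limits for an average chart from the average range. The data are hypothetical, the function name `average_chart_limits` is illustrative, and the value 0.577 is the scaling factor commonly tabulated for use with the average range when n = 5; it is assumed here and should be checked against figure 3 before use.

```python
from statistics import mean

# Assumed scaling factor for the average range with n = 5 (verify in figure 3).
A2_N5 = 0.577

def average_chart_limits(subgroups, scaling_factor):
    """Equation 1: grand average +/- scaling factor * average range."""
    averages = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_average = mean(averages)
    half_width = scaling_factor * mean(ranges)
    return grand_average - half_width, grand_average, grand_average + half_width

# Hypothetical data: k = 4 subgroups of size n = 5.
subgroups = [
    [10, 12, 11, 13, 14],
    [11, 9, 12, 10, 13],
    [12, 14, 13, 11, 10],
    [9, 11, 10, 12, 13],
]

lower, central, upper = average_chart_limits(subgroups, A2_N5)
print(f"lower = {lower:.3f}, central line = {central:.3f}, upper = {upper:.3f}")
```

The same function would serve for other summary measures of dispersion if the ranges were replaced by the chosen statistic and the matching scaling factor were supplied.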
The central line for the chart for dispersion is simply the summary within-subgroup measure of dispersion used. The limits for this chart are found using equations 2 and 3:
Equation 2: Lower Limit = Lower Scaling Factor * Summary Within-Subgroup Measure of Dispersion
Equation 3: Upper Limit = Upper Scaling Factor * Summary Within-Subgroup Measure of Dispersion
Depending upon the statistic used, these scaling factors are denoted by the letters B or D. These scaling factors come in pairs. When the lower limit would be smaller than zero, figure 3 will simply not list a value for the lower scaling factor.
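Equations 2 and 3 might be sketched as follows for a range chart. The ranges are hypothetical, the function name is illustrative, and the factors are the values commonly tabulated for the average range with n = 5 (no lower factor, upper factor 2.114); treat them as assumptions and confirm them against figure 3.

```python
from statistics import mean

# Assumed scaling factors for the average range with n = 5 (verify in figure 3).
D3_N5 = None   # no lower factor listed, so the chart has no lower limit
D4_N5 = 2.114

def dispersion_chart_limits(dispersions, lower_factor, upper_factor):
    """Equations 2 and 3: limits are scaling factors times the central line."""
    central_line = mean(dispersions)
    # A missing lower factor means the chart has no lower limit.
    lower = None if lower_factor is None else lower_factor * central_line
    upper = upper_factor * central_line
    return lower, central_line, upper

# Hypothetical within-subgroup ranges for subgroups of size n = 5.
lower, central, upper = dispersion_chart_limits([4, 5, 3, 6, 2], D3_N5, D4_N5)
print(lower, central, upper)
```

Representing the missing lower factor as `None` rather than zero is a design choice made here to keep "no lower limit" distinct from a limit that happens to equal zero.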
When working with subgrouped data there will be occasions where the limits for individual values will also be required. (For example, these limits may be needed for comparison with the specification limits.) These values may also be computed from the same information used to compute the limits defined above. Limits for individual values may be found using equation 4:
Equation 4: Average ± Scaling Factor * Summary Within-Subgroup Measure of Dispersion
The average of the original data is the central line for these limits. The scaling factors for use with equation 4 are denoted by the letter E.
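Equation 4 can be illustrated in the same way. In this sketch the data are hypothetical, the function name `individual_value_limits` is illustrative, and 1.290 is the scaling factor commonly tabulated for use with the average range when n = 5; it is an assumption to be verified against figure 3.

```python
from statistics import mean

# Assumed scaling factor for the average range with n = 5 (verify in figure 3).
E2_N5 = 1.290

def individual_value_limits(subgroups, scaling_factor):
    """Equation 4: average of all values +/- scaling factor * average range."""
    all_values = [x for s in subgroups for x in s]
    average = mean(all_values)
    average_range = mean(max(s) - min(s) for s in subgroups)
    half_width = scaling_factor * average_range
    return average - half_width, average, average + half_width

# Hypothetical data: k = 4 subgroups of size n = 5.
subgroups = [
    [10, 12, 11, 13, 14],
    [11, 9, 12, 10, 13],
    [12, 14, 13, 11, 10],
    [9, 11, 10, 12, 13],
]

lower, central, upper = individual_value_limits(subgroups, E2_N5)
print(f"lower = {lower:.2f}, central line = {central:.2f}, upper = {upper:.2f}")
```

These limits are wider than the limits for subgroup averages computed from the same data, since individual values vary more than averages do.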
When these generic formulas are used with the scaling factors given in figure 3 they will all yield the appropriate three-sigma limits. These limits will filter out virtually all of the routine variation (regardless of the shape of the histogram) allowing you to identify any potential signals contained within the data.
The values for A6 and A9 are only given for odd values of n. This is because the rationale for using a median chart precludes the use of even values of n.
As noted earlier, there are 13 additional scaling factors. The scaling factors A, A8, D1, D2, B1, B2, B5, and B6 are not intended for use with data. Rather they are for the academic situation where the standard deviation of the process is said to be known. Since this essentially never happens in practice, these scaling factors were not included in this summary. In my January 7 column, “The Right and Wrong Ways of Computing Limits,” we found the pooled variance approach to computing limits to be almost right. This approach uses scaling factors A7, B7, B8, B11, and B12. Since some of these scaling factors depend upon both n and k, and since the results are less robust than the approaches given above, these scaling factors were not included in this summary.
The software available today has added many different (and incorrect) options to the computation of limits for process behavior charts. The first foundation of the process behavior chart approach to making sense of your data is the use of within-subgroup measures of dispersion computed using rational (i.e., logically homogeneous) subgroups. The use of any other measure of dispersion is wrong. Likewise, irrational subgroupings are inappropriate. The second foundation of the process behavior chart is the use of three-sigma limits. The within-subgroup dispersion computed using rational subgroups creates the robustness that allows you to get useful limits in spite of changes that may occur within your data. The three-sigma limits are sufficiently conservative to work with any type of data and still filter out virtually all of the routine variation, thereby minimizing false alarms while allowing you to detect those signals that are of economic importance.
Do not attempt to use any other measure of dispersion. Do not attempt to use something other than three-sigma limits. Virtually all of the correct ways of computing the appropriate limits are covered here. Use any other approach at your own peril.