Hence, with a single-stream process (i.e., no parallel paths such as a multiple-cavity die), a good sampling plan would be to take five consecutive pieces to make each subgroup. Subgroups should be taken at those times when expert knowledge of the process gives the greatest concern for manufacturing problems such as:
Such a sampling scheme will provide the smallest possible variation within the subgroups, thereby providing the greatest opportunity for a statistical signal to serve as a roadmap for process improvement.
Unfortunately, improper sampling plans are sometimes used, which tends to inflate the control limits. A serious error in the use of the Xbar & R chart is to blindly believe a control chart that indicates that a process is in control. This error is serious because it robs you of your "roadmap for improvement." Artificially inflated control limits may easily give a false indication that a process is free from nonrandom variation, when that isn’t the case at all. This renders the control chart worse than useless—no information is better than misinformation.
The usual cause of inflated control limits is the existence of a systematic within-subgroup pattern (e.g., the first reading of a subgroup is usually the highest within that subgroup). The problem of inflated control limits due to systematic within-subgroup stratification is so common that one of Lloyd S. Nelson’s rules of lack of control evidence deals with it: 15 points in a row inside the 1-sigma limits. Subgroup stratification is even found in the following example from K. Ishikawa’s Guide to Quality Control (Asian Productivity Organization, 1982).
In the manufacture of resin parts, a critical measurement was made on five parts each day for 25 days. The first part was measured at 6:00 a.m., and another part was measured every four hours thereafter. The data were kept in time order to make an Xbar & R chart with the usual 3-sigma limits, having 25 subgroups (days) of size five (times of the day) (Figure 1). The highest values on the Xbar chart and on the R chart both occurred on Day 19. With random normal data the Xbar values and R values are independent, so the probability of such a coincidence is too small to attribute the signal to chance. However, we’ll overlook this, as Ishikawa did, and discuss the inflated control limits.
Ishikawa concluded that the process was in a state of control with only common-cause variation arising from what appears to be random data. This conclusion was grossly in error and was the result of inflated control limits. The mistake easily could have been avoided.
Figure 1: Xbar & R chart subgrouped by day with subgroups of size 5
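As a rough check on the chart described above, the limits can be recomputed from the values printed in Table 1. This is a minimal sketch, not Ishikawa's own calculation: it assumes the standard tabulated constants for subgroups of size five (A2 = 0.577, D3 = 0, D4 = 2.114), and the variable names are illustrative.

```python
# Resin data as printed in Table 1: one row per day, columns are the
# 6 a.m., 10 a.m., 2 p.m., 6 p.m. and 10 p.m. readings.
DATA = [
    [14.0, 12.6, 13.2, 13.1, 12.1], [13.2, 13.3, 12.7, 13.4, 12.1],
    [13.5, 12.8, 13.0, 12.8, 12.4], [13.9, 12.4, 13.3, 13.1, 13.2],
    [13.0, 13.0, 12.1, 12.2, 13.3], [13.7, 12.0, 12.5, 12.4, 12.4],
    [13.9, 12.1, 12.7, 13.4, 13.0], [13.4, 13.6, 13.0, 12.4, 13.5],
    [14.4, 12.4, 12.2, 12.4, 12.5], [13.3, 12.4, 12.6, 12.9, 12.8],
    [13.3, 12.8, 13.0, 13.0, 13.1], [13.6, 12.5, 13.3, 13.5, 12.8],
    [13.4, 13.3, 12.0, 13.0, 13.1], [13.9, 13.1, 13.5, 12.6, 12.8],
    [14.2, 12.7, 12.9, 12.9, 12.5], [13.6, 12.6, 12.4, 12.5, 12.2],
    [14.0, 13.2, 12.4, 13.0, 13.0], [13.1, 12.9, 13.5, 12.3, 12.8],
    [14.6, 13.7, 13.4, 12.2, 12.5], [13.9, 13.0, 13.0, 13.2, 12.6],
    [13.3, 12.7, 12.6, 12.8, 12.7], [13.9, 12.4, 12.7, 12.4, 12.8],
    [13.2, 12.3, 12.6, 13.1, 12.7], [13.2, 12.8, 12.8, 12.3, 12.6],
    [13.3, 12.8, 12.2, 12.3, 13.0],
]

# Assumed: standard control-chart constants for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114

xbars = [sum(day) / len(day) for day in DATA]      # daily averages
ranges = [max(day) - min(day) for day in DATA]     # daily ranges
grand_mean = sum(xbars) / len(xbars)
rbar = sum(ranges) / len(ranges)

# 3-sigma limits for the Xbar chart and the R chart.
xbar_lcl, xbar_ucl = grand_mean - A2 * rbar, grand_mean + A2 * rbar
r_lcl, r_ucl = D3 * rbar, D4 * rbar

# Days (1-based) that fall outside either set of limits.
out_xbar = [d for d, x in enumerate(xbars, start=1)
            if not xbar_lcl <= x <= xbar_ucl]
out_r = [d for d, r in enumerate(ranges, start=1)
         if not r_lcl <= r <= r_ucl]

# Days with the highest average and the highest range.
day_max_xbar = xbars.index(max(xbars)) + 1
day_max_r = ranges.index(max(ranges)) + 1

print(f"Xbar chart: {xbar_lcl:.2f} to {xbar_ucl:.2f}, out: {out_xbar}")
print(f"R chart:    {r_lcl:.2f} to {r_ucl:.2f}, out: {out_r}")
print(f"highest Xbar on day {day_max_xbar}, highest R on day {day_max_r}")
```

Under these assumptions no subgroup falls outside either set of limits, and the highest average and the highest range both land on Day 19, which is the picture the chart gives.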
Dr. W. Edwards Deming strongly advocated the use of run charts, simple time-ordered plots of the individual measurements, for data analysis. During a conversation with Dr. Deming in 1980, one of his associates stated that a particular client "didn’t have enough sense to make a run chart before he made a control chart." The run chart for the resin data is shown in Figure 2, where it’s been interrupted between days making the within-day pattern clearly visible. The run chart enables you to focus on peculiarities in the individual measurements, which might suggest nonrandom special-cause influence.
Figure 2: Run chart interrupted by day.
The most glaring feature of the run chart is that in 21 of the 25 days the startup measurement was the highest. It’s not random. The false message delivered by the Xbar & R chart was due to poor sampling, resulting in within-subgroup stratification and inflated control limits. The variation between times of the day, which should have been made part of the between-subgroup variation, was improperly made part of the within-subgroup variation, inflating the control limits and rendering the control chart useless.
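The count of days on which the startup reading was the highest can be checked directly from Table 1. The sketch below also attaches a rough probability to that count. The probability is an added illustration, not part of the original analysis: it assumes that, if the five readings within a day were exchangeable and ties are ignored, each time slot would have a one-in-five chance of holding the day's highest value.

```python
from math import comb

# Resin data as printed in Table 1: one row per day, columns are the
# 6 a.m., 10 a.m., 2 p.m., 6 p.m. and 10 p.m. readings.
DATA = [
    [14.0, 12.6, 13.2, 13.1, 12.1], [13.2, 13.3, 12.7, 13.4, 12.1],
    [13.5, 12.8, 13.0, 12.8, 12.4], [13.9, 12.4, 13.3, 13.1, 13.2],
    [13.0, 13.0, 12.1, 12.2, 13.3], [13.7, 12.0, 12.5, 12.4, 12.4],
    [13.9, 12.1, 12.7, 13.4, 13.0], [13.4, 13.6, 13.0, 12.4, 13.5],
    [14.4, 12.4, 12.2, 12.4, 12.5], [13.3, 12.4, 12.6, 12.9, 12.8],
    [13.3, 12.8, 13.0, 13.0, 13.1], [13.6, 12.5, 13.3, 13.5, 12.8],
    [13.4, 13.3, 12.0, 13.0, 13.1], [13.9, 13.1, 13.5, 12.6, 12.8],
    [14.2, 12.7, 12.9, 12.9, 12.5], [13.6, 12.6, 12.4, 12.5, 12.2],
    [14.0, 13.2, 12.4, 13.0, 13.0], [13.1, 12.9, 13.5, 12.3, 12.8],
    [14.6, 13.7, 13.4, 12.2, 12.5], [13.9, 13.0, 13.0, 13.2, 12.6],
    [13.3, 12.7, 12.6, 12.8, 12.7], [13.9, 12.4, 12.7, 12.4, 12.8],
    [13.2, 12.3, 12.6, 13.1, 12.7], [13.2, 12.8, 12.8, 12.3, 12.6],
    [13.3, 12.8, 12.2, 12.3, 13.0],
]

# Days (1-based) on which the 6 a.m. startup reading is the day's highest.
startup_highest = [d for d, row in enumerate(DATA, start=1)
                   if row[0] == max(row)]
k, n = len(startup_highest), len(DATA)

# Assumption for illustration only: with exchangeable readings, the first
# slot holds the maximum with probability 1/5 on any given day.
p = 1 / 5

# Binomial upper tail: chance of k or more such days out of n.
tail = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

print(f"startup reading highest on {k} of {n} days")
print(f"chance of {k} or more under the 1-in-5 assumption: {tail:.1e}")
```

The exceptions are Days 2, 5, 8, and 18. Under the stated assumption, the tail probability is far too small for the pattern to be put down to chance.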
We have achieved our initial purpose for analysis. The observed data fluctuations can’t be attributed to common-cause variation. But there’s still considerably more to be learned from the interrupted run chart in Figure 2. In addition to the nonrandom within-day variation, Figure 2 shows that there was also nonrandom between-day variation. A reference line can be opportunely set at 13.65, skimming off the 12 highest points. Eleven of the 12 are 6:00 a.m. startup measurements, but the twelfth point is particularly interesting. This very high 10:00 a.m. reading occurred on Day 19, the day that also had the highest of the 125 readings and the highest range of five readings. It’s not random.
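The skimming step can be reproduced from the values in Table 1. The sketch below is illustrative only; the names `REFERENCE`, `above`, and `by_time` are hypothetical, and the 13.65 reference line is the one chosen in the text.

```python
from collections import Counter

# Resin data as printed in Table 1: one row per day, columns follow TIMES.
TIMES = ["6 a.m.", "10 a.m.", "2 p.m.", "6 p.m.", "10 p.m."]
DATA = [
    [14.0, 12.6, 13.2, 13.1, 12.1], [13.2, 13.3, 12.7, 13.4, 12.1],
    [13.5, 12.8, 13.0, 12.8, 12.4], [13.9, 12.4, 13.3, 13.1, 13.2],
    [13.0, 13.0, 12.1, 12.2, 13.3], [13.7, 12.0, 12.5, 12.4, 12.4],
    [13.9, 12.1, 12.7, 13.4, 13.0], [13.4, 13.6, 13.0, 12.4, 13.5],
    [14.4, 12.4, 12.2, 12.4, 12.5], [13.3, 12.4, 12.6, 12.9, 12.8],
    [13.3, 12.8, 13.0, 13.0, 13.1], [13.6, 12.5, 13.3, 13.5, 12.8],
    [13.4, 13.3, 12.0, 13.0, 13.1], [13.9, 13.1, 13.5, 12.6, 12.8],
    [14.2, 12.7, 12.9, 12.9, 12.5], [13.6, 12.6, 12.4, 12.5, 12.2],
    [14.0, 13.2, 12.4, 13.0, 13.0], [13.1, 12.9, 13.5, 12.3, 12.8],
    [14.6, 13.7, 13.4, 12.2, 12.5], [13.9, 13.0, 13.0, 13.2, 12.6],
    [13.3, 12.7, 12.6, 12.8, 12.7], [13.9, 12.4, 12.7, 12.4, 12.8],
    [13.2, 12.3, 12.6, 13.1, 12.7], [13.2, 12.8, 12.8, 12.3, 12.6],
    [13.3, 12.8, 12.2, 12.3, 13.0],
]

REFERENCE = 13.65  # reference line used in the text

# Every reading above the reference line, as (day, time of day, value).
above = [(day, TIMES[j], value)
         for day, row in enumerate(DATA, start=1)
         for j, value in enumerate(row)
         if value > REFERENCE]

# How the skimmed points split across times of day.
by_time = Counter(time for _, time, _ in above)

# Skimmed points that are not startup readings.
non_startup = [point for point in above if point[1] != "6 a.m."]

# The single highest reading of all 125.
highest = max(above, key=lambda point: point[2])

print(f"{len(above)} readings above {REFERENCE}: {dict(by_time)}")
print(f"non-startup points above the line: {non_startup}")
print(f"highest reading: day {highest[0]}, {highest[1]}, {highest[2]}")
```

Listing the skimmed points by day and time makes the same pattern visible without drawing the chart.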
The numerical data for this example were without explanatory notations. Totally lacking is the sorely needed information on day of the week, date, periods when shut down, and the conditions under which the measurements were taken. Without this information, one might reasonably suppose that the startup problem was most severe on Day 19, possibly after a prolonged shutdown. Clearly such conjecture after the fact is no substitute for complete annotation of the numerical data.
Finally, reconsider the R chart in Figure 1, noting that the highest subgroup range value (well within the inflated control limits) occurred on Day 19, the same day as the other problems above. It may now be seen that this range would very likely have been out of control had the limits not been so severely inflated.
An alternative to the interrupted run chart is to identify patterns in the extreme values of the data (having nothing to do with whether these very high or very low points might be considered as statistical outliers). This has been done for the 25 by 5 array of the resin data in Table 1, where all values greater than 13.65 have been put in parentheses. The highest of the 125 values, the 6:00 a.m. reading on Day 19, has been put in double parentheses.
The only value in parentheses that wasn’t a 6:00 a.m. startup measurement was the 10:00 a.m. measurement on Day 19. It’s not random. As a helpful variation of this technique, the highest (and again, the lowest) values in each subgroup of five may be identified.
The alternative approach of searching out patterns in the extreme data points gives results equivalent to the interrupted run chart. It’s important to note that the ability of these two methods to discover systematic within-subgroup stratification isn’t hampered by the occurrence of even very large spikes of between-day variation. That is to say, these two methods work well in problems with two-way variation. This won’t be the case in the next method of detecting within-subgroup stratification.
Table 1: Resin data by day and time of day (values greater than 13.65 in parentheses) |
Day | 6 a.m. | 10 a.m. | 2 p.m. | 6 p.m. | 10 p.m. |
1 | (14) | 12.6 | 13.2 | 13.1 | 12.1 |
2 | 13.2 | 13.3 | 12.7 | 13.4 | 12.1 |
3 | 13.5 | 12.8 | 13 | 12.8 | 12.4 |
4 | (13.9) | 12.4 | 13.3 | 13.1 | 13.2 |
5 | 13 | 13 | 12.1 | 12.2 | 13.3 |
6 | (13.7) | 12 | 12.5 | 12.4 | 12.4 |
7 | (13.9) | 12.1 | 12.7 | 13.4 | 13 |
8 | 13.4 | 13.6 | 13 | 12.4 | 13.5 |
9 | (14.4) | 12.4 | 12.2 | 12.4 | 12.5 |
10 | 13.3 | 12.4 | 12.6 | 12.9 | 12.8 |
11 | 13.3 | 12.8 | 13 | 13 | 13.1 |
12 | 13.6 | 12.5 | 13.3 | 13.5 | 12.8 |
13 | 13.4 | 13.3 | 12 | 13 | 13.1 |
14 | (13.9) | 13.1 | 13.5 | 12.6 | 12.8 |
15 | (14.2) | 12.7 | 12.9 | 12.9 | 12.5 |
16 | 13.6 | 12.6 | 12.4 | 12.5 | 12.2 |
17 | (14) | 13.2 | 12.4 | 13 | 13 |
18 | 13.1 | 12.9 | 13.5 | 12.3 | 12.8 |
19 | ((14.6)) | (13.7) | 13.4 | 12.2 | 12.5 |
20 | (13.9) | 13 | 13 | 13.2 | 12.6 |
21 | 13.3 | 12.7 | 12.6 | 12.8 | 12.7 |
22 | (13.9) | 12.4 | 12.7 | 12.4 | 12.8 |
23 | 13.2 | 12.3 | 12.6 | 13.1 | 12.7 |
24 | 13.2 | 12.8 | 12.8 | 12.3 | 12.6 |
25 | 13.3 | 12.8 | 12.2 | 12.3 | 13 |
A third method of detecting within-subgroup stratification is to make an Xbar & s chart on the transpose of the data (Figure 3). In this example, that means subgrouping the data by time of day rather than by day, which gives five subgroups of size 25 (days). An R chart isn’t suitable because the subgroup sizes are too large. Limits of 2.5 sigma are the proper choice with five subgroups in order to keep the false-alarm (alpha) risk similar to that of 3-sigma limits with 25 subgroups. The Xbar & s chart, with the startup readings far beyond the upper control limit, does an excellent job of showing the source of the inflated control limits in Figure 1. A study of six methods for the "detection of lack of within-subgroup homogeneity" in the absence of two-way variation found this method the best of the six. However, this method may not be helpful in the presence of severe two-way variation.
Figure 3: Xbar & s chart subgrouped by time-of-day, 2.5-sigma limits
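The transposed chart can be sketched from the Table 1 values as follows. This is one plausible construction rather than the exact calculation behind the figure: it assumes the process sigma is estimated as the average subgroup standard deviation divided by the usual bias constant c4, and that both charts use the 2.5-sigma width named in the text. The helper `c4` and the constant `K` are illustrative names.

```python
from math import gamma, sqrt
from statistics import mean, stdev

# Resin data as printed in Table 1: one row per day, columns follow TIMES.
TIMES = ["6 a.m.", "10 a.m.", "2 p.m.", "6 p.m.", "10 p.m."]
DATA = [
    [14.0, 12.6, 13.2, 13.1, 12.1], [13.2, 13.3, 12.7, 13.4, 12.1],
    [13.5, 12.8, 13.0, 12.8, 12.4], [13.9, 12.4, 13.3, 13.1, 13.2],
    [13.0, 13.0, 12.1, 12.2, 13.3], [13.7, 12.0, 12.5, 12.4, 12.4],
    [13.9, 12.1, 12.7, 13.4, 13.0], [13.4, 13.6, 13.0, 12.4, 13.5],
    [14.4, 12.4, 12.2, 12.4, 12.5], [13.3, 12.4, 12.6, 12.9, 12.8],
    [13.3, 12.8, 13.0, 13.0, 13.1], [13.6, 12.5, 13.3, 13.5, 12.8],
    [13.4, 13.3, 12.0, 13.0, 13.1], [13.9, 13.1, 13.5, 12.6, 12.8],
    [14.2, 12.7, 12.9, 12.9, 12.5], [13.6, 12.6, 12.4, 12.5, 12.2],
    [14.0, 13.2, 12.4, 13.0, 13.0], [13.1, 12.9, 13.5, 12.3, 12.8],
    [14.6, 13.7, 13.4, 12.2, 12.5], [13.9, 13.0, 13.0, 13.2, 12.6],
    [13.3, 12.7, 12.6, 12.8, 12.7], [13.9, 12.4, 12.7, 12.4, 12.8],
    [13.2, 12.3, 12.6, 13.1, 12.7], [13.2, 12.8, 12.8, 12.3, 12.6],
    [13.3, 12.8, 12.2, 12.3, 13.0],
]


def c4(n):
    """Bias constant for the sample standard deviation of normal data."""
    return sqrt(2.0 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)


K = 2.5  # limit width in sigma units, as the text recommends here

columns = list(zip(*DATA))            # transpose: 5 subgroups of 25 days
n = len(columns[0])
xbars = [mean(col) for col in columns]
sds = [stdev(col) for col in columns]  # sample standard deviations
grand_mean = mean(xbars)
sbar = mean(sds)
sigma_hat = sbar / c4(n)               # assumed sigma estimate

# Xbar chart limits.
xbar_lcl = grand_mean - K * sigma_hat / sqrt(n)
xbar_ucl = grand_mean + K * sigma_hat / sqrt(n)

# s chart limits (lower limit floored at zero).
s_half = K * sigma_hat * sqrt(1 - c4(n) ** 2)
s_lcl, s_ucl = max(0.0, sbar - s_half), sbar + s_half

print(f"Xbar limits: {xbar_lcl:.3f} to {xbar_ucl:.3f}")
print(f"s limits:    {s_lcl:.3f} to {s_ucl:.3f}")
for time, xb, sd in zip(TIMES, xbars, sds):
    flag = "  <-- outside Xbar limits" if not xbar_lcl <= xb <= xbar_ucl else ""
    print(f"{time:>7}: mean {xb:.3f}, s {sd:.3f}{flag}")
```

With the data subgrouped this way, the time-of-day difference moves from within-subgroup to between-subgroup variation, so the 6 a.m. average is judged against limits that no longer contain the startup effect.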
In retrospect we see that two processes were present in this example: the startup process and the normal process. When the two processes were wrongfully analyzed as a single process, as in Figure 1, the "noise" of the differences between the two processes blocked the information needed to make improvement to either.
The cardinal rule here is that unless an assignable cause of variation is removed, it must be taken into account in the data analysis. The two processes must be tracked separately. For improvement, the best course of action would probably be to concentrate on the startup problem to learn how to minimize the startup effect. In any event, the "bad" startup product has to be separated from the bulk of the "good" product, reassigning the "bad" product or reworking or scrapping as necessary.
All of the above methods for discovering within-subgroup stratification depend upon this stratification being systematic. In this example, if the time-order of the observations within the day had been lost, the stratification and inflated control limits wouldn’t have changed, but the source would have been hidden. It’s critical to preserve any natural order that exists in the data. The usual tests for stratification or "centerline-hugging" are seldom powerful enough to be helpful.
One more example from the literature will be mentioned briefly, this one involving three-way variation. The silicon content was measured for each of five heats within each of three shifts for five consecutive days. An Xbar & R chart, as well as an interrupted I chart for the individual measurements, led to the erroneous conclusion that there was only common-cause variation present. The analysis error was due to inflated control limits. It could readily be seen by looking at the pattern of the extreme values in the data table that the highest measurement on four of the five days was on the last heat of the third shift.
Further, the three highest readings of the 75 were on the final heats of days 1, 3, and 5. It’s not random. This was apparently a shut-down problem. The data aberrations could probably have been found also by using the interrupted run chart on individuals, once the inflated I chart limits had been removed. A reference line to skim off the high values like the one in Figure 2 would have been helpful.
When you think a process is in control and it isn’t, you’re at a loss to know how to go about improving it. Searching out the irregularities of the process is a fun game, but it’s also bread and butter. It tells you how to improve the process.