
We Do Need Good Measurement Systems

Knowing a measurement system’s variability and stability over time is valuable

Published: Wednesday, May 24, 2017 - 12:03

In his February 2017 Quality Digest column, “Don’t We Need Good Measurements?” Donald J. Wheeler argues that a measurement system contributing up to 80 percent of the overall variation (on the variance scale) is good enough to detect persistent mean shifts when using a process behavior (control) chart. As a result, he concludes that assessing the quality of the measurement system before implementing the chart is likely a waste of resources and time.

We disagree with both his argument and conclusion. We suggest that you first look at Wheeler’s December 2010 column, “The Intraclass Correlation Coefficient,” a reference he kindly provided us. This column describes the intra-class correlation and provides additional details about the example discussed in the 2017 article. 

Wheeler uses the model

$$X = Y + E$$

(observed measurement $X$ = product value $Y$ + measurement error $E$) and figure 1 below (copied from his February 2017 article) to make his argument. Based on the model and assuming independence of the product value and the measurement error, we have $\sigma_x^2 = \sigma_p^2 + \sigma_e^2$, where $\sigma_e^2$ quantifies the variability due to the measurement system, $\sigma_p^2$ the variability of the product values, and $\sigma_x^2$ the observable variability of the measured product observations. The model and definitions apply when both the process and measurement system are stable.

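The intra-class correlation, the ratio of the product variance to the total observed variance,

$$\rho = \frac{\sigma_p^2}{\sigma_x^2} = 1 - \frac{\sigma_e^2}{\sigma_x^2}$$

tells us what proportions of the routine (i.e., common cause) variance are due to the process ($\rho$) and the measurement system ($1 - \rho$).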

Consider figure 1 (or figure 6 in the December 2010 column). The horizontal axis has the intra-class correlation, $\rho$, varying from 1.00 to 0.00 or, equivalently, the percentage of variation due to measurement error increasing from 0 to 100 percent. The vertical axis gives the probability of one or more signals in the next 10 sampling periods for an X chart using rule 1 or rules 1–4 from Western Electric’s Statistical Quality Control Handbook (Western Electric Co., 1958, p. 25). We focus on the use of rule 1, but our conclusions apply to the chart using the four rules.

Without loss of generality, we let the in-control mean be 0 and the in-control variance of the measured values be $\sigma_x^2 = 1$. In Wheeler’s article, the goal of the charting is to detect a mean shift of $3\sigma_p$, three standard deviations of the product values. The key point is that the properties of the X chart used to construct figure 1 depend on the overall standard deviation $\sigma_x$, not on the components of the variance. So, using 3-sigma limits when $\sigma_x = 1$, the lower and upper control limits are –3 and 3. We get a signal when an observation is greater than 3 or smaller than –3, regardless of the relative magnitude of the measurement error variation. So the title of figure 1 is misleading because the chart does not show the effect of changing the relative contribution of the measurement error variation. The mean shift and the intra-class correlation are both changing.

What does figure 1 show? Look at the horizontal axis. The label “100%” corresponds to no process variation (i.e., $\sigma_p = 0$), so there the mean shift is $3\sigma_p = 0$. In other words, the value on the vertical axis corresponding to 100-percent measurement variation is the probability of one or more false alarms over the next 10 observations. Similarly, corresponding to the label “0%,” we have $\sigma_p = 1$ (and $\sigma_e = 0$), and a 3-sigma shift corresponds to a mean shift of $3\sigma_p = 3$. Such a shift is almost certain to be detected by the 10th observation. So figure 1 shows the sensitivity to various mean shifts using an X chart and a combination of Western Electric rules. The chart tells us nothing about the effect of changing the relative contribution of measurement error.

Figure 1: How measurement error affects the X chart
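
The reading of figure 1 given above can be checked numerically. The following is a minimal sketch, not Wheeler’s calculation: it assumes independent, normally distributed observations, an overall standard deviation fixed at 1, and rule 1 only. The function names `norm_cdf` and `prob_signal` are our own illustrative choices. As the intra-class correlation falls, the only thing that changes in the calculation is the size of the mean shift.

```python
from math import erfc, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))


def prob_signal(shift, sigma_x=1.0, n_obs=10):
    """Chance of at least one point outside the 3-sigma limits (rule 1)
    in the next n_obs observations after a persistent mean shift."""
    ucl, lcl = 3.0 * sigma_x, -3.0 * sigma_x
    p_above = 1.0 - norm_cdf((ucl - shift) / sigma_x)
    p_below = norm_cdf((lcl - shift) / sigma_x)
    return 1.0 - (1.0 - p_above - p_below) ** n_obs


# Overall standard deviation fixed at 1 (assumption), so the product
# standard deviation is sqrt(rho) and the shift of three product standard
# deviations shrinks as the intra-class correlation rho falls.
for rho in (1.0, 0.8, 0.5, 0.2, 0.0):
    shift = 3.0 * sqrt(rho)
    print(f"rho={rho:.1f}  shift={shift:.2f}  P(signal)={prob_signal(shift):.3f}")
```

At an intra-class correlation of 0 the shift is 0 and the result is simply the false alarm rate over 10 observations; at 1 the shift is 3 units and a signal is nearly certain.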

Consider the process observations. Again, without loss of generality, we let the in-control mean be 0. To demonstrate the effect of measurement error, we fix the product standard deviation at $\sigma_p = 1$ and look at the ability of the chart to detect fixed mean shifts as the measurement standard deviation $\sigma_e$ increases. Now we have $\sigma_x^2 = 1 + \sigma_e^2$ and, based on the overall variation, the 3-sigma control limits are $\pm 3\sqrt{1 + \sigma_e^2}$. We see that the control limits inflate as we increase $\sigma_e$.

Figure 2 shows the probability of one or more signals (using Western Electric’s rule 1) on the X chart over the next 10 observations as a function of the measurement standard deviation $\sigma_e$. We include the chance of detecting persistent one-, two-, or three-sigma mean shifts (by assumption, mean shifts of 1, 2, and 3 units, since $\sigma_p = 1$ is fixed). For example, if $\sigma_e = 1$ ($\rho = 0.5$), then the chance of a signal for a 3-sigma shift is about 0.9, not substantially less than when there is no or little measurement error. However, for detecting smaller mean shifts, the effect of measurement error is much larger. For instance, the chance of detecting a 2-sigma or 1-sigma mean shift is reduced by about 50 percent due to the measurement variation. So, in our view, substantial measurement variation does adversely affect the performance of the X chart.


Figure 2: Probability of detecting a 1-, 2-, or 3-sigma shift in the next 10 observations when the process variation is $\sigma_p = 1$
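
The calculation behind a plot like figure 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used to draw the figure: observations are independent and normal, the product standard deviation is fixed at 1, the control limits are set at three overall standard deviations, and only rule 1 is applied. The names `norm_cdf` and `prob_signal` are hypothetical.

```python
from math import erfc, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))


def prob_signal(shift, sigma_e, sigma_p=1.0, n_obs=10):
    """Chance of at least one rule 1 signal in the next n_obs observations
    when the process mean shifts by `shift` units."""
    # Measurement error inflates the overall standard deviation,
    # and with it the 3-sigma control limits.
    sigma_x = sqrt(sigma_p ** 2 + sigma_e ** 2)
    ucl, lcl = 3.0 * sigma_x, -3.0 * sigma_x
    p_above = 1.0 - norm_cdf((ucl - shift) / sigma_x)
    p_below = norm_cdf((lcl - shift) / sigma_x)
    return 1.0 - (1.0 - p_above - p_below) ** n_obs


# Shifts of 1, 2, and 3 units for increasing measurement standard deviation.
for sigma_e in (0.0, 0.5, 1.0, 2.0):
    probs = [prob_signal(shift, sigma_e) for shift in (1.0, 2.0, 3.0)]
    print(f"sigma_e={sigma_e:.1f}  " + "  ".join(f"{p:.2f}" for p in probs))
```

With the measurement standard deviation equal to the product standard deviation, the 3-unit shift is still detected with probability close to 0.9, while the detection probabilities for the 1-unit and 2-unit shifts fall to roughly half of their values with no measurement error.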

Wheeler next uses figure 3 (copied below from figure 2 in his February 2017 column and figure 1 in the December 2010 column) as an example to boost his contention that even if there is substantial variation due to the measurement system, a process performance chart can signal the actions of assignable causes.

Figure 3: Average and range chart for product 2131 created using a measurement system having an intra-class correlation coefficient of 0.52

From Wheeler’s December 2010 article, we learn that figure 3 describes the behavior of the process during the month of June. Eight parts are selected at the start of every hour each day of production (the subgroup) and measured in the lab. We also learn that the lab measured a known standard once a week for the first 25 weeks of the year, a time period that includes most of June. Figure 4 (a copy of figure 2 from the December 2010 column) is an X and moving range chart of the measurements of the known standard.

Figure 4: XmR chart for repeated measurements of a known standard using test method 65

From figure 4, the measurement system appears stable, as does the within-day variability. We can use the average moving range in figure 4 to estimate the measurement variability $\sigma_e$ and the average range in figure 3 to estimate the overall process variation $\sigma_x$. One minus the ratio of the two estimated variances, $\hat{\sigma}_e^2 / \hat{\sigma}_x^2$, gives us the estimate $\hat{\rho} = 0.52$ for the intra-class correlation. So, in this example, the measurement system contributes almost half of the within-day variation.
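
The range-based estimate described above can be sketched as follows. The input values are hypothetical round numbers chosen for illustration, not the values read from figures 3 and 4. We also assume subgroups of eight parts, so the usual bias-correction constants are 1.128 for moving ranges of two values and 2.847 for ranges of eight values. The function name `estimate_icc` is our own.

```python
D2_N2 = 1.128  # d2 constant for ranges of 2 values (moving ranges)
D2_N8 = 2.847  # d2 constant for ranges of 8 values (assumed subgroup size)


def estimate_icc(avg_moving_range, avg_subgroup_range,
                 d2_mr=D2_N2, d2_r=D2_N8):
    """Estimate the intra-class correlation from two average ranges."""
    # Repeated measurements of a known standard vary only through
    # measurement error, so their moving ranges estimate that error.
    sigma_e = avg_moving_range / d2_mr
    # Within-subgroup ranges of measured parts reflect product variation
    # plus measurement error, i.e., the overall routine variation.
    sigma_x = avg_subgroup_range / d2_r
    return 1.0 - (sigma_e / sigma_x) ** 2


# Hypothetical inputs, NOT taken from the article's figures.
icc = estimate_icc(avg_moving_range=1.128, avg_subgroup_range=5.694)
print(f"estimated intra-class correlation: {icc:.2f}")
```

With these illustrative inputs the measurement standard deviation is estimated as 1 and the overall standard deviation as 2, so a quarter of the variance is attributed to the measurement system.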

In the data from the 20 subgroups displayed on the X-bar chart in figure 3, there are plenty of signals (we count 11 using the four Western Electric rules, as in figure 1). So Wheeler is correct in his assertion that the performance chart can signal the action of an assignable cause even if the intra-class correlation is small. In fact, if the process shifts are large enough, even a fourth-class monitor, as defined by Wheeler in his December 2010 column, will do. But as shown in figure 2, a noisy measurement system makes the detection of smaller shifts less likely.

In the summary section of the February 2017 column, Wheeler states that the measurement system is adequate as long as the intra-class correlation, $\rho$, exceeds 0.20. We cannot know if this is the case without a study of the measurement system to estimate $\rho$. We can set up a performance chart without this knowledge, but there is a price to be paid. In the example, the process managers must have been concerned about the measurement system since they carried out a simple but effective continuing assessment of the measurement system as shown in figure 4.

Suppose the process managers had followed Wheeler’s advice and had not assessed the measurement system. In that case they would not know the intra-class correlation. The control limits and centerlines on figure 3 are calculated from the June data, so the chart was constructed after all of the June data were available. Looking at the chart, the day-to-day variation dominates the within-day variation (in other words, there are lots of signals!). Without the information provided by figure 4, the instability could be due to the measurement system, to the rest of the process, or to some combination of the two. Looking back over the month, it will be very difficult to isolate the assignable cause(s). All we know is that these causes act day to day and not within days. Knowing that the measurement system is not the source of the instability would simplify the task.

Additional comments

1. Wheeler argues that it is often unnecessary to assess the measurement system but then introduces a new classification (monitoring class). We cannot know which class we fall into without a measurement system assessment, so the value of the classification is unclear.

2. Many authors (e.g., Geoff Vining, author of the article “Technical Advice: Phase I and Phase II Control Charts”) recommend that we implement a process behavior chart in two phases. In phase I, we collect data to establish the control limits for use in phase II. We may omit some data if they correspond to instability. If we treat the June data as phase I, it is difficult to decide which subgroups to delete since the process is so unstable. We cannot trust the centerline of the average chart, and hence the control limits, to apply in the future.

3. In phase I, if we have a period of stability, we can assess the potential capability of the process by comparison to the process specifications. Based on the capability, we may reconsider proceeding with the process behavior chart. Note that the variability from the measurement system deflates the capability.

4. Phase I is an ideal time to assess the measurement system. An easy approach is to measure each of the sampled parts twice.

5. We are not given the reason that measurements are taken hourly. If today’s measurements are used for tomorrow’s setup, then the large measurement system variation might explain the large day-to-day variation seen in figure 3.

6. In figure 2, we show that if the measurement system is a large source of variation, then it is more difficult to detect mean shifts. In this case, assuming the measurement system itself is in control, a simple way to increase the sensitivity of the chart is to measure each part in the subgroup two or more times and average. Note that the control limits need to be adjusted accordingly.

7. If the goal is continuous improvement of the process, then using a process behavior (i.e., control) chart may be an inefficient way to proceed. We provide a step-by-step approach to reduce variation in medium- to high-volume processes in Statistical Engineering: An Algorithm for Reducing Variation in Manufacturing Processes (American Society for Quality, 2005). Often, the first step is to carry out a multivari study. Control charts play a minor role in the “Holding the Gains” step of our algorithm.
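
The suggestion in comment 6 can be illustrated with a short sketch. It is a simplification under stated assumptions: an individuals (X) chart rather than a subgroup chart, independent normal observations, a product standard deviation of 1, independent repeat measurements, and rule 1 only. Averaging k repeat measurements of a part divides the measurement variance by k, and the control limits are recomputed from the reduced overall variance. The function names are hypothetical.

```python
from math import erfc, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))


def prob_signal_averaged(shift, sigma_e, k, sigma_p=1.0, n_obs=10):
    """Chance of at least one rule 1 signal in the next n_obs plotted
    values when each value is the average of k repeat measurements."""
    # Averaging k independent measurements cuts the measurement variance
    # to sigma_e**2 / k; the product variance is unaffected.
    sigma_x = sqrt(sigma_p ** 2 + sigma_e ** 2 / k)
    ucl, lcl = 3.0 * sigma_x, -3.0 * sigma_x  # limits adjusted accordingly
    p_above = 1.0 - norm_cdf((ucl - shift) / sigma_x)
    p_below = norm_cdf((lcl - shift) / sigma_x)
    return 1.0 - (1.0 - p_above - p_below) ** n_obs


# A 2-unit shift with measurement error as large as the product variation.
for k in (1, 2, 4):
    p = prob_signal_averaged(shift=2.0, sigma_e=1.0, k=k)
    print(f"k={k}  P(signal)={p:.2f}")
```

The detection probability rises with the number of repeat measurements and approaches the value obtained with no measurement error, at the cost of the extra measurements.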

In conclusion, we disagree with Wheeler’s assertion that conducting an initial measurement system assessment is “overhead” without value when deciding to set up a process performance chart and then establishing and using it. We think that knowing the measurement system’s variability and having some idea of its stability over time is valuable for isolating assignable causes and improving the process.

For further discussion of the effect of measurement error on the performance of a Shewhart chart, see “Effect of Measurement Error on Shewhart Control Charts,” by K. W. Linna and W. H. Woodall (Journal of Quality Technology, 2001, pp. 213–222).


About The Authors

Stefan H. Steiner

Stefan H. Steiner is a professor in the Statistics and Actuarial Science department and director of the Business and Industrial Statistics Research Group at the University of Waterloo, Canada.

R. Jock MacKay

R. Jock MacKay (retired) is an adjunct professor in the Statistics and Actuarial Science department at the University of Waterloo, Canada.


Different interpretation

Interesting article.  I enjoy reading alternative viewpoints on this subject.

My interpretation of Wheeler's discussion of intraclass correlation and measurement systems differs.  I understood his argument as prioritizing process behavior charts over measurement system analyses.  He demonstrated that a poor - but stable - measurement system is still capable of detecting "signals of exceptional variation" provided the charts are constructed and interpreted properly.  Wheeler goes on to imply that the absence of process signals should prompt the investigation of the measurement system.

To me, the implicit point of Wheeler's discussion was that in the reality of limited time and money, one is better off charting the process measurement data to identify the "signals of exceptional variation" so they can be dealt with.  The process behavior charts are robust enough to overcome measurement error in the presence of exceptional variation.  Once the process is stable (homogeneous), a deeper investigation of the measurement system may be in order.

Good Summary

BILLS, I could not have said it better or more concisely.

Thank you!


Right or wrong, congratulations on your boldness in attacking the world's leading expert on quality.  Too often people are afraid to speak up.  Only by stepping forward into open discussion, can quality progress.  Very much looking forward to Dr Wheeler's enlightening reply.  No matter who proves correct, we will all learn.