Find the Signals by Filtering Out the Noise

The foundation of modern statistics

Pawel Czerwinski / Unsplash

Donald J. Wheeler
Mon, 12/08/2025 - 12:03

One of the principles for understanding data is that while some data contain signals, all data contain noise. Therefore, before you can detect the signals you’ll have to filter out the noise. This act of filtration is the essence of all data analysis techniques. It’s the foundation for our use of data and all the predictions we make based on those data. In this column, we’ll look at the mechanism used by all modern data analysis techniques to filter out the noise.


Given a collection of data, it’s common to begin with the computation of some summary statistics for location and dispersion. Averages and medians are used to characterize location, while either the range statistic or the standard deviation statistic is used to characterize dispersion. This much is taught in every introductory class. However, what’s usually not taught is that the structures within our data will often create alternate ways of computing these measures of dispersion. Understanding the roles of these different methods of computation is essential for anyone who wishes to analyze data.

Perhaps the most common type of structure for a dataset is to have k subgroups of size n where the n values within each subgroup were collected under the same set of conditions. This structure is found in virtually all types of experimental data, as well as in most types of data coming from a production process. To illustrate the alternate ways of computing measures of dispersion, we’ll use a simple dataset consisting of k = 3 subgroups of size n = 8, as shown in Figure 1.


Figure 1: Dataset One


Figure 2: Method One for estimating dispersion

Method One with Dataset One

The first method of computing a measure of dispersion is the method taught in introductory classes in statistics. All of the data from the k subgroups of size n are collected into one large histogram of size nk, and a single dispersion statistic is found using all nk values. This dispersion statistic is then used to estimate a dispersion parameter such as the standard deviation for the distribution of X, SD(X).

As shown in Figure 3, the range of all 24 values is 6. The bias correction factor for ranges of 24 values is 3.895. Dividing 6 by 3.895 yields an unbiased estimate of the standard deviation of the distribution of X of 1.540.

The global standard deviation statistic is 1.551. The bias correction factor for this statistic when it’s based on 24 values is 0.9892. Dividing 1.551 by 0.9892 yields an unbiased estimate of the standard deviation of the distribution of X of 1.568.


Figure 3: Method One with Dataset One

Since the original data are given to the nearest whole number, there is no practical difference between the two estimates of SD(X) shown in Figure 3. Whether we use the range statistic or the standard deviation statistic will not substantially affect our analysis.
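The Method One arithmetic can be sketched in a few lines of Python. The bias-correction constants (3.895 for the range of 24 values, 0.9892 for the standard deviation statistic of 24 values) are the ones quoted above; the dataset here is a hypothetical stand-in, since the actual values of Dataset One appear only in Figure 1.

```python
import statistics

# Bias-correction constants quoted in the text for nk = 24 values
D2_24 = 3.895   # for the global range
C4_24 = 0.9892  # for the global standard deviation statistic

def method_one(values):
    """Global (Method One) estimates of SD(X) from all nk values lumped together."""
    global_range = max(values) - min(values)
    range_estimate = global_range / D2_24
    sd_estimate = statistics.stdev(values) / C4_24
    return range_estimate, sd_estimate

# Hypothetical 24-value dataset (not the article's Dataset One)
data = [4, 5, 6, 3, 7, 5, 4, 6,
        5, 3, 4, 6, 2, 5, 4, 3,
        6, 5, 7, 4, 8, 5, 6, 4]
r_est, s_est = method_one(data)
```

Because this hypothetical dataset also happens to have a global range of 6, its range-based estimate matches the 1.540 computed above; the standard-deviation-based estimate depends on the individual values.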

Method Two with Dataset One

While Method One ignores the subgroups, Method Two respects the subgroup structure within the data. Here, we calculate a dispersion statistic for each subgroup. These separate dispersion statistics are then averaged, and the average dispersion statistic is used to form an unbiased estimate for the standard deviation parameter of the distribution of X.


Figure 4: Method Two for estimating dispersion

Using Dataset One, we compute a dispersion statistic for each of the three subgroups. Because the subgroups are all the same size, we can average the statistics prior to dividing by the common bias correction factor.

As shown in Figure 5, the subgroup ranges are respectively 5, 5, and 3. The average range is 4.333, and the bias correction factor for ranges of eight data is 2.847. Dividing 4.333 by 2.847, we estimate the standard deviation for the distribution of X to be 1.522.

The subgroup standard deviation statistics are respectively 1.690, 1.690, and 1.195. The average standard deviation statistic is 1.525, and the bias correction factor is 0.9650. Dividing 1.525 by 0.9650, we estimate the standard deviation for the distribution of X to be 1.580.


Figure 5: Method Two with Dataset One

As before, there’s no practical difference between the two estimates shown in Figure 5. Neither is there any practical difference between the estimates in Figure 3 and those in Figure 5. The four estimates obtained using the two different measures of dispersion and the two different methods are all very similar.
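The Method Two arithmetic can be sketched directly from the subgroup statistics given above: subgroup ranges of 5, 5, and 3; subgroup standard deviations of 1.690, 1.690, and 1.195; and the bias-correction constants for subgroups of size eight quoted in the text.

```python
# Bias-correction constants from the text for subgroups of size n = 8
D2_8 = 2.847    # for subgroup ranges
C4_8 = 0.9650   # for subgroup standard deviations

def method_two(subgroup_ranges, subgroup_sds):
    """Within-subgroup (Method Two) estimates of SD(X):
    average the subgroup dispersion statistics, then divide
    by the common bias-correction factor."""
    avg_range = sum(subgroup_ranges) / len(subgroup_ranges)
    avg_sd = sum(subgroup_sds) / len(subgroup_sds)
    return avg_range / D2_8, avg_sd / C4_8

# Subgroup statistics for Dataset One, as given in the article
r_est, s_est = method_two([5, 5, 3], [1.690, 1.690, 1.195])
```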

Method Three with Dataset One

The third method will probably seem rather strange. It’s certainly indirect. Instead of working with the individual values as the first two methods do, the third method works with the subgroup averages. These subgroup averages are used to obtain a dispersion statistic, and this dispersion statistic is then used to estimate the standard deviation parameter of the distribution of X.


Figure 6: Method Three for estimating dispersion

For Dataset One, the subgroup averages are respectively 5.0, 4.0, and 5.0. The range of these three averages is 1.00. The bias correction factor for the range of three values is 1.693. Since each of these averages represents eight original data, we’ll have to multiply by the square root of 8 and divide by the bias correction factor to estimate the standard deviation parameter for the distribution of X. When we do this with the values above, we obtain an estimate of SD(X) of 1.671.

Using the three subgroup averages, we compute a standard deviation statistic of 0.5774. Dividing by the bias correction factor of 0.8862 and multiplying by the square root of 8, we obtain an unbiased estimate of the standard deviation of the distribution of X of 1.843.


Figure 7: Method Three with Dataset One

Once again, there’s no practical difference between using the range and using the standard deviation statistic. Here, the two estimates are slightly larger than before, but not by any appreciable amount.
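The Method Three arithmetic can likewise be sketched from the subgroup averages of 5.0, 4.0, and 5.0, using the bias-correction constants for three values quoted above and multiplying by the square root of the subgroup size.

```python
import math
import statistics

D2_3 = 1.693    # bias correction for the range of k = 3 values (from the text)
C4_3 = 0.8862   # bias correction for the sd statistic of 3 values (from the text)
N = 8           # subgroup size

def method_three(subgroup_averages):
    """Between-subgroup (Method Three) estimates of SD(X):
    a dispersion statistic of the subgroup averages, bias-corrected,
    then scaled up by sqrt(n)."""
    rng = max(subgroup_averages) - min(subgroup_averages)
    range_est = (rng / D2_3) * math.sqrt(N)
    sd_est = (statistics.stdev(subgroup_averages) / C4_3) * math.sqrt(N)
    return range_est, sd_est

# Subgroup averages for Dataset One
r_est, s_est = method_three([5.0, 4.0, 5.0])
```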


Figure 8: Summary of three methods for Dataset One

As summarized in Figure 8, we’ve just obtained six unbiased estimates for the standard deviation parameter for the distribution of X using three different methods and two different statistics. These six values are listed along with their coefficients of variation (c.v.). The first four unbiased estimates are all quite similar because they all have similar coefficients of variation. The last two unbiased estimates aren’t as cozy as the first four because they have much larger coefficients of variation and therefore have more uncertainty attached.

Before we attempt to draw any lesson from this example, we need to know that Dataset One has a very special property. When we place Dataset One on an average and range chart, we end up with Figure 9. There, we see no evidence of any differences between the three subgroups. Dataset One contains no signals. It’s pure noise.


Figure 9: Average and range chart for Dataset One

Therefore, at this point we can reasonably conclude that when the data are homogeneous and contain no signals, the three methods will yield similar values for unbiased estimates of SD(X) regardless of whether we use the range or the standard deviation statistic.

Dataset Two

But what happens in the presence of signals? After all, the objective is to filter out the noise so we can detect any signals that might be present. To see how signals affect our estimates of SD(X), we’ll modify Dataset One by inserting two signals. Specifically, we’ll shift subgroup two down by two units while we shift subgroup three up by four units. This will result in Dataset Two, which is shown in Figure 10. As may be seen in the average and range chart in Figure 11, these changes have introduced two distinct signals.


Figure 10: Dataset Two


Figure 11: Average and range chart for Dataset Two

Method One with Dataset Two

Method One uses all 24 values in Dataset Two to compute global measures of dispersion. As shown in Figure 12, the global range is 10.0, which results in an unbiased estimate of the standard deviation parameter of 2.567. The global standard deviation statistic is 3.279, which gives an unbiased estimate of the standard deviation parameter of 3.315.


Figure 12: Method One with Dataset Two

Method Two with Dataset Two

Using Method Two, we compute a dispersion statistic for each of the three subgroups. Because the subgroups are all the same size, we can average the statistics prior to dividing by the common bias correction factor. As shown in Figure 13, the average range is 4.333, and the bias correction factor for ranges of eight data is 2.847. Dividing 4.333 by 2.847, we estimate the standard deviation for the distribution of X to be 1.522.

The average standard deviation statistic is 1.525, and the bias correction factor is 0.9650. Dividing 1.525 by 0.9650, we estimate the standard deviation for the distribution of X to be 1.580.


Figure 13: Method Two with Dataset Two

The Method Two estimates of SD(X) for Dataset Two are exactly the same as those obtained for Dataset One in Figure 5. Thus, the Method Two estimates are not affected by the signals introduced by shifting the subgroup averages.

Method Three with Dataset Two

For Dataset Two, the subgroup averages are respectively 5.0, 2.0, and 9.0. The range of these three averages is 7.00. The bias correction factor for the range of three values is 1.693. Since each of these averages represents eight original data, we’ll have to multiply by the square root of 8 and divide by the bias correction factor to estimate the standard deviation parameter for the distribution of X. When we do this with the values above, we obtain an estimate of SD(X) of 11.693.

The standard deviation statistic for the three subgroup averages is 3.512. Dividing by the bias correction factor of 0.8862 and multiplying by the square root of 8, we obtain an unbiased estimate of the standard deviation of the distribution of X of 11.209.


Figure 14: Method Three with Dataset Two

These Method Three estimates of SD(X) are roughly seven times larger than the values found in Figure 7. Thus, the signals introduced by shifting the subgroup averages have severely inflated both of the Method Three estimates.
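Running the same Method Three arithmetic on both sets of subgroup averages shows the inflation directly; the constants are the same as before, and this is a sketch of the computation rather than the article's own figures.

```python
import math
import statistics

D2_3 = 1.693    # bias correction for the range of 3 values
C4_3 = 0.8862   # bias correction for the sd statistic of 3 values
N = 8           # subgroup size

def method_three(subgroup_averages):
    """Between-subgroup (Method Three) estimates of SD(X)."""
    rng = max(subgroup_averages) - min(subgroup_averages)
    return ((rng / D2_3) * math.sqrt(N),
            (statistics.stdev(subgroup_averages) / C4_3) * math.sqrt(N))

quiet = method_three([5.0, 4.0, 5.0])    # Dataset One: pure noise
shifted = method_three([5.0, 2.0, 9.0])  # Dataset Two: two signals present
```

The two deliberate shifts inflate both between-subgroup estimates roughly sevenfold, which is exactly why Method Three cannot serve as a noise filter.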

When we summarize the results of the three methods with Dataset Two, we get the table in Figure 15. We’ve obtained six unbiased estimates of SD(X) using three different methods and two different statistics, yet these six values differ by almost an order of magnitude.


Figure 15: Summary of three methods for Dataset Two

The differences left to right in Figure 15 show the effects of using the different dispersion statistics. The differences top to bottom reveal the differences due to using the different methods. Clearly, the differences left to right pale in comparison with those top to bottom. The key to filtering out the noise so we can detect the signals doesn’t depend on whether we use the standard deviation statistic or the range, but rather on which method we employ to compute that dispersion statistic.

Method One estimates of dispersion are commonly known as the total variation or the overall variation. Method One is used for description. It implicitly assumes that the data are globally homogeneous. When the data aren’t globally homogeneous, this method will be inflated by the signals contained within the data, and the value obtained will no longer estimate SD(X).


Figure 16: Total or overall variation

Method Two estimates of dispersion are commonly known as the within-subgroup variation. Method Two is used for analysis. Whenever we seek to filter out the noise to detect signals, we use Method Two to establish the filter. Method Two implicitly assumes that the data are homogeneous within the subgroups, but it places no requirement of homogeneity on the different subgroups. Thus, even when the subgroups differ, Method Two will provide a useful estimate of SD(X).


Figure 17: Within-subgroup variation

Method Three estimates of dispersion are commonly known as the between-subgroup variation. Method Three is used for comparison purposes. It assumes that the subgroup averages are globally homogeneous. When Method Three is computed, it’s generally compared with Method Two, the idea being that any signals present in the data will affect Method Three more than they affect Method Two. When the subgroups differ, Method Three will not provide an estimate of SD(X).


Figure 18: Between-subgroup variation

Separating the signals from the noise

The essence of every statistical analysis is the separation of the signals from the noise. We want to find the signals so we can use this knowledge constructively. We want to ignore the noise where there’s nothing to be learned. To this end, we begin by filtering out the noise. And for the past 100 years, the standard technique for filtering out the noise has been Method Two. To illustrate this point, Figure 19 shows the average chart for Dataset Two with limits computed using each of the three methods. Only Method Two correctly identifies the two signals we deliberately buried in Dataset Two.

So when it comes to filtering out the noise, you have a choice between Method Two, Method Two, or Method Two. Any method is right as long as it’s Method Two!

Method One is inappropriate for filtering out the noise because it gets inflated by the signals. Method One has always been wrong for analysis, and it will always be wrong. Trying to use Method One for analysis is so wrong that it has a name. It’s known as Quetelet’s Fallacy, and it’s the reason there was so little progress in statistical analysis in the 19th century.


Figure 19: Average charts for Dataset Two

Method Three is completely inappropriate for filtering out the noise because it will be severely inflated in the presence of signals. If you use Method Three to filter out the noise, you’ll have to wait a very long time before you detect a signal. So, while there are analysis techniques that make use of the Method Three (between-subgroup) estimate of dispersion, they do so only to compare it with a Method Two (within-subgroup) estimate of dispersion.

Thus, the foundation of all modern data analysis techniques is the use of Method Two to filter out the noise. This is the foundation for the analysis of variance. This is the foundation for the analysis of means. And this is the foundation for Shewhart’s process behavior charts. Ignore this foundation and you’ll undermine your whole analysis.

Many analysis techniques from the 19th century, such as Benjamin Peirce’s test for outliers, are built on the use of Method One to filter out the noise. As may be seen in Figure 19, this approach will let you occasionally detect a signal. But it will cause you to miss other signals.

In fact, many techniques developed in the 20th century also suffer from Quetelet’s Fallacy. Among these are Grubbs’ test for outliers, the Levey-Jennings control chart, and the Tukey control chart. Moreover, virtually every piece of statistical software available today allows the user to choose Method One for creating control charts and performing various other statistical tests. Nevertheless, this error on the part of naive programmers doesn’t make it right or even acceptable to use Method One for analysis.

So while there are proper uses of Method One and Method Three, they are never appropriate for filtering out the noise. The only correct method for filtering out the noise is Method Two. Understanding this point is the beginning of competence for every data analyst.

You now know the difference between modern data analysis techniques and naive analysis techniques. Naive techniques use Method One or Method Three to filter out the noise. Today, all sorts of new naive techniques are being created by those who know no better. Let the user beware.

To help with this problem of identifying naive techniques, Figure 20 contains a listing of 27 of the more commonly encountered within-subgroup estimators of both the standard deviation parameter and the variance parameter. There, we see the hallmark of the within-subgroup approach: Each estimator is based on either the average or the median of a collection of k within-subgroup measures of dispersion. Method One and Method Three each use a single measure of dispersion.

Now you know the importance of using the right method, and you know what the right method will look like in practice. Although this may be more than you ever wanted to know about statistics, it’s essential knowledge for all who seek to understand their data.


Figure 20: Some within-subgroup estimators

This article is based on material found in Advanced Topics in Statistical Process Control (second edition, 2004, SPC Press). Used with permission.

Donald J. Wheeler’s complete “Understanding SPC” seminar may be streamed for free; for details, see spcpress.com.


© 2025 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
“Quality Digest” is a trademark owned by Quality Circle Institute Inc.
