
Davis Balestracci

Statistics

‘Any Theory Is Correct in Its Own World...

‘...but the problem is that the theory may not make contact with this world’

Published: Tuesday, September 3, 2019 - 11:03

As statistical methods become more embedded in everyday organizational quality improvement efforts, I find that a key concept is often woefully misunderstood, if it is even taught at all. W. Edwards Deming distinguished between two types of statistical study, which he called “enumerative” and “analytic.”

The key need in quality improvement is that statistics relate to reality, which then lays the foundation for a theory of using statistics (analytic). Whether you realize it or not, virtually all college courses and many belt courses are taught from a population-based (enumerative) perspective, whose purpose is estimation.

In a real-world environment, this becomes questionable at best because everyday processes are usually not static populations. Deming was emphatic that the purpose of statistics in improvement is prediction; the question becomes, “What other knowledge beyond probability theory is needed to form a basis for action in the real world?”

Think of population-based statistics as studying a static pond, and a designed study as going even further to create a custom-made pond like a swimming pool—a sanitized version of a pond, much easier to study and sample because of reduction of "nuisance" (i.e., everyday) variation.

Beyond design of the actual study circumstances, the statistical data processes now come into play: 1) measurement definition, 2) appropriate data collection (which includes the process of choosing the sample), so that 3) any statistical analysis is appropriate, and 4) correct interpretation of the analysis results.

In a research study, the variation of each of these processes should be (and usually is) tightly controlled. This makes the application of enumerative methods valid... for the specific sample of experimental units chosen for that specific study.

Ignore variation or study it?

Inevitably, as results from a study are applied, the controlled study environment gives way to the everyday one. [Two images in the original contrast a calm swimming pool with whitewater rapids.]

What was easy in a "swimming pool" environment now becomes much more complicated; the real world is more like whitewater rapids. Not only that, but uncontrolled variation manifests in the four statistical data processes as well:

"Random sample" has an entirely different meaning in a minimally controlled, semi-chaotic environment; in fact, a true random sample is usually not possible. A good example is an everyday medical environment with patients flowing in and out: Except in rare cases, you cannot take repeated samples from the exact same population.

Analytic statistics are very concerned with where and how one should sample.  

For example, we may take a group of patients who attend a particular clinic and suffer from arthritis. But the resulting sample is not necessarily a random sample of the patients who will be treated in the future at that same clinic. Still less is it a random sample of the patients who will be treated in any other clinic.

In fact, the patients who will be treated in the future will depend on choices that others have not yet made. And those choices will depend on the results of any study we are doing, and on studies by other people that may be carried out in the future. R. A. Fisher called it “a hypothetical infinite population” that neither yet exists nor ever will exist, i.e., imaginary.

And there is an additional issue: How does the impact of variation in a particular environment on a theoretical result compare to what could happen in yet another environment? The same is true for any benchmarking.

The late David Kerridge, one of the world’s leading Deming thinkers, wrote:

“Suppose that we compare two antibiotics in the treatment of some infection. We conclude that one did better in our tests. How does that help us? 

“Suppose that all our testing was done in one hospital in New York in [2013]. But we may want to use the antibiotic in Africa in [2016]. It is quite possible that the best antibiotic in New York is not the same as the best in a refugee camp in Zaire. In New York the strains of bacteria may be different, and the problems of transport and storage really are different. If the antibiotic is freshly made and stored in efficient refrigerators, it may be excellent. It may not work at all if transported to a camp with poor storage facilities. 

“And even if the same antibiotic works in both places, how long will it go on working? This will depend on how carefully it is used, and how quickly resistant strains of bacteria build up.

“This may seem an extreme case, and it is. But in every application of statistics, we have to decide how far we can trust results obtained at one time and under one set of circumstances as a guide to what will happen at some other time under new circumstances.” 

Statistical theory, as it is stated in most textbooks (enumerative), simply analyzes what would happen if we took repeated, strictly random samples from the same population under circumstances in which nothing changes with time. Enumerative analyses or studies either naively assume no possible influence of outside variation or have the luxury of tightly controlling it as part of a study's design.
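As a toy illustration (none of these numbers come from the article), a quick simulation contrasts the two situations: repeated random sampling from a static population estimates its mean well, but the same enumerative calculation applied to early data from a drifting process says little about where that process will be later.

```python
import random

random.seed(1)

def sample_mean(xs):
    return sum(xs) / len(xs)

# Static "pond": every observation comes from the same distribution,
# so a random sample estimates the population mean well.
static = [random.gauss(50, 5) for _ in range(1000)]
estimate = sample_mean(random.sample(static, 100))

# Drifting "whitewater" process: the level shifts 0.05 per time step.
drifting = [random.gauss(50 + 0.05 * t, 5) for t in range(1000)]
past_estimate = sample_mean(drifting[:200])    # what the study saw
future_mean = sample_mean(drifting[800:])      # what will actually be encountered

print(f"static-population estimate: {estimate:.1f}")
print(f"estimate from early data:   {past_estimate:.1f}")
print(f"actual later process level: {future_mean:.1f}")
```

The enumerative summary of the early data is a perfectly valid description of the past; it is the prediction of the future that fails when the process is not static.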

Unfortunately, the potential influence of outside variation usually continues to be ignored even after the study. When it's time to actually apply the result, analytic statistics' purpose is to anticipate and formally study the manifestations of such outside, uncontrolled variation. Gaining this additional information inherently improves the situation, because when you understand—rather than ignore—the sources of uncertainty, you understand how to reduce them.

In the case of medicine, analytic statistics are totally different from the clinical trial mindset in which most physicians have been taught, and in which "tight control" is an understatement. A good example of this stark contrast can be demonstrated in the case of hospital-acquired infections. Let's say that a statistically significant result from a well-designed enumerative study has been found to eliminate them, i.e., apply the result, and you shouldn't have them. With enumerative thinking, the post-application tendency would then be to treat any occurrence of an infection (undesirable variation) as a special cause; this is why people are drowning in root cause analyses. Such a strategy would be helpful only in the case of an actual outbreak.

But in terms of everyday work, one usually has to take the view that the environment might be “perfectly designed” to have such infections, in which case a common-cause strategy would be warranted. It is only by “plotting the dots”—the basis of analytic statistics—that you will be able to distinguish between the two.
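"Plotting the dots" can be as simple as an individuals (XmR) chart. Here is a minimal sketch with invented monthly infection counts; the 2.66 scaling constant is the standard XmR factor. Only a point outside the natural process limits signals a special cause worth a root cause hunt—everything inside them calls for a common-cause strategy.

```python
def xmr_limits(data):
    """Natural process limits for an individuals (XmR) chart."""
    center = sum(data) / len(data)
    moving_ranges = [abs(a - b) for a, b in zip(data, data[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    # 2.66 is the standard XmR constant (3 / d2, where d2 = 1.128).
    return center - 2.66 * avg_mr, center, center + 2.66 * avg_mr

# Hypothetical monthly infection counts; the last month jumps.
infections = [4, 6, 5, 7, 5, 4, 6, 5, 6, 14]
lcl, center, ucl = xmr_limits(infections)
signals = [x for x in infections if x > ucl or x < lcl]

print(f"limits: {lcl:.1f} to {ucl:.1f}; special-cause signals: {signals}")
```

In this made-up series, only the final count falls outside the limits; the earlier month-to-month bouncing is common-cause variation that no amount of point-by-point root cause analysis will explain.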

And even if one is careful enough to take this view and use the currently in-vogue technique of "rapid-cycle PDSA" (plan-do-study-act), everyday variation rears its ugly head again.

Besides the subsequent study of the variation of the actual application process, do people even consider studying the variation in each of the four data processes? If not, there is a very real danger: People could act on the basis of interpreting variation due solely to any or all of the four data processes. When these processes are ad hoc and not formally designed, there is real danger for variation from these data processes to overshadow and cloud the study of any variation—beneficial or otherwise—caused by the actual process being tested.

In many applications I’ve observed, I barely see even cursory consideration of the environmental (cultural) effects of variation on the four data processes. The result? Vague plans leading to vague data and vague results. Wouldn’t it be easier to test the study result if the data processes had their variation minimized as part of the plan so as not to cloud the interpretation of the tested process application? This is a most nontrivial process.

“If I had to reduce my message to management to just a few words, I’d say it all has to do with reducing variation.”
—W. Edwards Deming


About The Author


Davis Balestracci

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.

Comments

Static vs. Dynamic Population Analysis

This is an important issue for the construction materials industry as well, where the 'population' may be consumed and replenished as fast as it can be sampled.


Prediction is the Problem

Good article, Davis. Prediction is central to Deming's idea of analytic studies in statistics. The following is from the foreword to Quality Improvement Through Planned Experimentation by Moen, Nolan, and Provost, written by Deming in 1990:

“…Prediction is the problem, whether we are talking about applied science, research and development, engineering, or management in industry, education, or government. The question is, what do the data tell us? How do they help us to predict?

Unfortunately, the statistical methods in textbooks and in the classroom do not tell the student that the problem in the use of data is prediction. What the student learns is how to calculate a variety of tests (t-test, F-test, chi-square, goodness of fit, etc.) in order to announce that the difference between the two methods or treatments is either significant or not significant. Unfortunately, such calculations are a mere formality. Significance or the lack of it provides no degree of belief—high, moderate, or low—about prediction of performance in the future, which is the only reason to carry out the comparison, test, or experiment in the first place.

Any symmetric function of a set of numbers almost always throws away a large portion of the information in the data. Thus, interchange of any two numbers in the calculation of the mean of a set of numbers, their variance, or their fourth moment does not change the mean, variance, or fourth moment. A statistical test is a symmetric function of the data.

In contrast, interchange of two points in a plot of points may make a big difference in the message that the data are trying to convey for prediction.

The plot of points conserves the information derived from the comparison or experiment. It is for this reason that the methods taught in this book are a major contribution to statistical methods as an aid to engineers, as well as to those in industry, education, or government who are trying to understand the meaning of figures derived from comparisons or experiments. The authors are to be commended for their contributions to statistical methods.”
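Deming's point about symmetric functions is easy to demonstrate in a few lines (the numbers below are hypothetical): reordering the data leaves the mean and variance untouched, while a time-ordered look at the very same values can tell a completely different story.

```python
from statistics import mean, variance

steady = [5, 7, 4, 8, 3, 6, 2, 9]   # bounces around its average
trending = sorted(steady)           # same numbers, now a steady climb

# Symmetric functions: order of the values cannot matter.
assert mean(steady) == mean(trending)
assert variance(steady) == variance(trending)

# A crude run-chart summary: count upward steps in time order.
def upward_steps(xs):
    return sum(1 for a, b in zip(xs, xs[1:]) if b > a)

print(upward_steps(steady), upward_steps(trending))
```

The t-test sees identical data sets; the plot of points sees a stable process in one case and a trend in the other, which is exactly the information that matters for prediction.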

                                                                                          W. Edwards Deming

                                                                                          Washington, July 14, 1990