The Wisdom of David Kerridge
I discovered a wonderful unpublished paper by David and Sarah Kerridge called “Statistics and Reality” online at http://homepage.mac.com/dfkerridge/.Public/DEN/Reality.pdf [link changed from print version of this article. This is a valid link as of 11/30/2007]. Its influence on my thinking has been nothing short of profound. As statistical methods get more embedded in the everyday job of organizational quality transformation, I feel that now is the time to get us “back to basics,” because the basics are woefully misunderstood, if taught at all. Kerridge is an academic at the University of Aberdeen in Scotland, and I consider him one of the leading Deming proponents in the world today. This wisdom will be helpful for those of you struggling with the “theory of knowledge” aspect of Deming’s “profound knowledge.”
Deming distinguished between two types of statistical study, which he called “enumerative” and “analytic.” This distinction considers how statistics relates to reality, and it lays the foundation for a theory of using statistics that makes plain what other knowledge, beyond probability theory, is needed to form a basis for action in the real world.
As Donald Wheeler stated in one of his columns more than 10 years ago, there are three kinds of statistics:
• Descriptive: What can I say about this specific widget?
• Enumerative: What can I say about this specific group of widgets?
• Analytic: What can I say about the process that produced this specific widget or group of widgets?
Let’s suppose there’s a claim that, as a result of a new policing system, murders in a particular major city have been reduced by 27 percent, a result that would be extremely desirable if that kind of reduction could be produced in other cities, or other countries, by using the same methods. But there are a great many questions to ask before benchmarking, even if the action is to design an experiment to find out more.
Counting the number of murders in different years is an “enumerative” problem (i.e., defining “murder” and counting them for this specific city). Interpreting the change is an “analytic” problem.
Could the 27 percent reduction be due to chance? If we imagine a set of constant conditions that would lead on average to 100 murders, then on the simplest mathematical model (independent Poisson counts, whose standard deviation is the square root of the mean, here 10) we can expect the number we actually see to be anything between 70 and 130, i.e., 100 ± 30. If there were 130 murders one year and 70 the next, many people would think that there had been a great improvement. But this could be just chance, and it is the least of our problems.
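The chance-alone argument can be checked with a quick simulation. This is a minimal sketch assuming the simplest model the text mentions, independent Poisson counts with mean 100; the seed and sample size are arbitrary choices for illustration:

```python
import numpy as np

# Simulate many "years" of counts under constant conditions:
# independent Poisson counts with a mean of 100 per year.
rng = np.random.default_rng(42)
counts = rng.poisson(lam=100, size=100_000)

# The standard deviation of a Poisson count is sqrt(mean) = 10,
# so nearly all counts fall within 100 +/- 3*10, i.e., 70 to 130.
within = np.mean((counts >= 70) & (counts <= 130))
print(f"fraction of years between 70 and 130: {within:.4f}")

# Even with nothing changing, some year-to-year comparisons will
# show a drop of 27% or more purely by chance.
drops = (counts[1:] - counts[:-1]) / counts[:-1]
print(f"fraction of year pairs with a >=27% drop: {np.mean(drops <= -0.27):.4f}")
```

The first figure comes out near 99.7 percent, the familiar three-sigma coverage; the second shows that a seemingly dramatic one-year reduction can appear with no change in the underlying system at all.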
The murders may be related, for example, due to a war between drug barons. If so, the model is wrong, because it is calculated as if the murders were independent. Or the methods of counting might have changed from one year to the next. (Are we counting all suspicious deaths, or only cases solved?) Without knowing such facts, we can’t predict from these figures what will happen next year. So, if we want to draw the conclusion that the 27 percent reduction is a “real” one, that is, one that will continue in the future, we must use knowledge about the circumstances that isn’t given by those figures alone.
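The independence point can also be illustrated by simulation. The sketch below adds an occasional “drug war” year in which a burst of related murders occurs; the 1-in-5 rate and the burst size are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
n_years = 100_000

# Baseline: independent murders, Poisson with mean 100 per year.
baseline = rng.poisson(lam=100, size=n_years)

# In roughly 1 year in 5 (an invented rate), a drug war adds a burst
# of related murders, so counts within a year are no longer independent.
war_year = rng.random(n_years) < 0.2
burst = rng.poisson(lam=30, size=n_years) * war_year
total = baseline + burst

# Poisson theory predicts sd = sqrt(mean); clustering inflates it.
print(f"mean: {total.mean():.1f}")
print(f"observed sd: {total.std():.1f}  vs  Poisson sd: {np.sqrt(total.mean()):.1f}")
```

The observed standard deviation comes out well above the square root of the mean, so limits computed from the independent-Poisson model would understate the year-to-year variation and invite false alarms of “improvement” or “deterioration.”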
We can predict with even less accuracy what would happen in a different city, or a different country. The causes of crime, or the effect of a change in policing methods, may be completely different.
This is the context of the distinction between enumerative and analytic uses of statistics. Some things can be determined by calculation alone, others require judgment or knowledge of the subject, and still others are almost unknowable. Your actions to get more information improve when you understand the sources of uncertainty, because you then understand how to reduce it.
Most mathematical statisticians state statistical problems in terms of repeated sampling from the same population. This leads to a very simple mathematical theory, but doesn’t relate to the real needs of the statistical user. You can’t take repeated samples from exactly the same population, except in rare cases.
In every application of statistics, we have to decide how far we can trust results obtained at one time, and under one set of circumstances, as a guide to what will happen at some other time, and under new circumstances. Statistical theory, as it’s stated in most textbooks, simply analyzes what would happen if we repeatedly took strictly random samples, from the same population, under circumstances in which nothing changes with time.
This does tell us something. It tells us what would happen under the most favorable imaginable circumstances. In almost all applications, we don’t want a description of the past but a prediction of the future. For this we must rely on theoretical knowledge of the subject, at least as much as on the theory of probability.
So, get your head around these concepts, and I’ll give you more of Kerridge’s wisdom next month.
Davis Balestracci is a member of the American Society for Quality and past chair of its statistics division. He would love to wake up your conferences with his dynamic style and unique, entertaining insights into the places where process, statistics, and quality meet. Visit his Web site at www.dbharmony.com.
