In my March 7, 2012, column, “An Elegantly Simple but Counterintuitive Approach to Analysis,” I suggested the necessity of formally assessing the stability of the process producing any data—a step not usually taught in academic statistics courses. This is done by plotting the data in their naturally occurring time order with the median drawn in as a reference line—a plot known as a run chart.
This simple technique is more likely to stimulate appropriate initial dialogue than more complicated statistical analysis. It also shows the power of plotting samples calculated more frequently—monthly, as opposed to, say, quarterly or every six months. If volumes were high enough, weekly could even be used.
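The mechanics are minimal, which is the point. Here is a sketch in Python using only the standard library; the monthly mortality percentages are invented for illustration and are not data from the column:

```python
import statistics

# Hypothetical monthly mortality percentages, one point per month,
# in naturally occurring time order -- illustrative numbers only.
monthly_pct = [3.1, 2.8, 3.4, 2.9, 3.0, 3.3, 2.7, 3.2, 2.9, 3.1, 2.6, 3.0]

# The run chart's reference line is the median, not the mean.
median = statistics.median(monthly_pct)  # 3.0 for these numbers

# A text rendering of the run chart: each point marked above/below the median.
for month, value in enumerate(monthly_pct, start=1):
    side = "above" if value > median else "below" if value < median else "on"
    print(f"Month {month:2d}: {value:.1f}%  ({side} median {median:.1f}%)")
```

With a plotting library the same values would be drawn as a connected time series with a horizontal median line, but even this bare listing invites the right first question: is the process stable over time?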
Here is the (alleged) summary of three hospitals’ mortality performance from my previous column:
| N | Mean | SE Mean | St Dev | Min | Q1 | Median | Q3 | Max |
|---|------|---------|--------|-----|-----|--------|-----|-----|
Can you envision a meeting with this result among pages and pages of similar one-point summaries of other key performance indicators? Perhaps in the ubiquitous “red-yellow-green” stoplight format, with a few pretty bar graphs, and maybe a few trend lines thrown in for good measure?
Would you prefer to make a decision with one-point summaries or time plots (i.e., run charts) of monthly results, similar to the following examples from my earlier column? They both result in follow-up actions, but which type of summary allows the more appropriate follow-up actions?
Regarding common computer-generated statistics: After considering the attached run charts, what do the averages of Hospital 1 and Hospital 2 mean? I’ll tell you: If I stick my right foot in a bucket of boiling water and my left foot in a bucket of ice water, on the average, I’m pretty comfortable.
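The bucket joke has a precise statistical version: two series can have identical averages while behaving in opposite ways. A quick sketch with invented numbers (not the hospitals’ actual data):

```python
import statistics

# Two hypothetical monthly mortality series -- illustrative only:
# one steadily rising, one steadily falling.
hospital_1 = [2.0, 2.2, 2.4, 2.6, 2.8, 3.0]  # trending up
hospital_2 = [3.0, 2.8, 2.6, 2.4, 2.2, 2.0]  # trending down

mean_1 = statistics.mean(hospital_1)
mean_2 = statistics.mean(hospital_2)
print(mean_1, mean_2)  # both 2.5 -- the one-number summary hides opposite trends
```

A summary table would score these two hospitals as identical; a run chart would show one improving and one deteriorating.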
Utilizing run charts offers a much higher-yield strategy: generating new conversations, especially about Hospitals 1 and 2, via different types of questions from those typically asked in the past:
• What changes occurred during this time?
• Have any beneficial gains been made and held?
• Is Hospital 3 even capable of achieving desired goals consistently?
• What is different about these three processes and resulting practice environments?
• What is truly different about their results, and are these differences appropriate?
There is great power in using smaller samples taken more frequently over time. But what about the process that produces these data? Ignoring the time element implicit in every data set can lead to incorrect statistical conclusions. This introduces the concept of “analytic” statistics, which are process-oriented, with a goal of predicting future process results; vs. “enumerative” statistics, which are summary-oriented, with a goal of accurately estimating the current state under the assumption that it won’t change.
The latter are generally what is taught in most basic statistics courses, yet they have limited applicability in an improvement context, where unstable processes are the rule, not the exception. And only analytic statistics can expose and deal appropriately with the variation causing the process “instability.” As hinted above, the enumerative framework simply ignores its presence.
Note that the ultimate conclusion reached by the local statistical guru was that one would need to “benchmark” a “cutting-edge” hospital’s result and copy the “best practices” discovered. However, in looking at Hospital 2, one sees two distinct shifts in the data, as well as a current performance averaging around 2.3 percent. What if one could study aspects of Hospital 2’s process, determine the reasons for the shifts, and implement appropriate process changes made more recently systemwide?
Or are the differences due to Hospital 1 acquiring some state-of-the-art cardiac technology—with the system implication that the other two hospitals would send it their more complicated cases? In that case, this might appropriately explain Hospital 2’s recent drop in mortality and Hospital 1’s corresponding increase.
But wait a minute, if Hospital 3 were doing that as well, why didn’t we observe a corresponding drop in its mortality? Could Hospital 3 learn some things from Hospital 2? Or is it that they are not appropriately referring difficult cases to Hospital 1?
Isn’t asking questions like these an easier solution than implementing a systemwide redesign based on copying best practices from somewhere else? That assumes the benchmarking process for deciding whom to copy was even carried out appropriately. As shown, there might be no need to benchmark an outside organization at all. Even when done well, benchmarking invariably runs into the tired, predictable “not invented here” or “we’re different” syndromes upon your return.
Regardless, note that what gets these more productive conversations started is plotting the dots!
The statistics needed for improvement are far easier than ever imagined. However, the philosophy within which to use them—i.e., statistical thinking—seems to be quite counterintuitive to many statistical practitioners, very counterintuitive to the cultures they work in, and fiercely resisted by executives. Many times it is hidden in the middle of “belt” training, among all the rarely used techniques it usually invalidates.
My respected colleague, Dr. Donald Wheeler, paraphrases two rules of Walter Shewhart for the presentation of data:
1. Data should always be presented in such a way that preserves the evidence in the data for all the predictions that might be made from these data.
2. Whenever an average, range, or histogram is used to summarize data, the summary should not mislead the user into taking any action that the user would not take if the data were presented in a time series.
If nothing else, the statistical thinking approach will make many jobs easier by helping practitioners recognize when to walk out of time-wasting meetings, freeing up a great deal of time. It will also help practitioners gain the cultural respect they deserve as improvement professionals because data collection and analysis will be simpler and more efficient—and ultimately more effective. That respect will be magnified by practitioners’ ability to recognize and stop inappropriate, yet well-meaning, responses to variation—responses that make people’s jobs more complicated and time-consuming without adding any value to the organization.
Almost all improvement experts agree that merely plotting a process’s output over time is one of the simplest, most elegant, and most powerful tools for gaining deep understanding of any situation. Even before plotting, one must ask questions, clarify objectives, contemplate action, and review current use of the data.
Questioning from this statistical-thinking perspective leads immediately to unexpected and deeper understanding of the process. The end results will also be valuable baselines for key processes and honest dialogue to determine meaningful goals and action.
Contrast this approach with the more typical one of imposing arbitrary numerical goals for desired performance. These are then retrofitted onto the process and enforced by exhortation that treats any deviation of process performance from the goal as unique and needing explanation—known as a special-cause strategy.
And these days there’s no escaping benchmarking.