Davis Balestracci
Published: Wednesday, June 14, 2017 - 11:03
My last column mentioned how doctors and hospitals are currently being victimized with draconian reactions to rankings, either interpreted literally or filtered through the results of some type of statistical analysis. Besides the potential serious financial consequences of using rankings in the current craze of “pay for performance,” many hard-working people are stigmatized inappropriately through what has been called a “public blaming and shaming” strategy. Is it any wonder why many physicians are so angry these days?
Rankings with alleged helpful feedback to those allegedly needing it are also used as a cost-cutting measure to identify and motivate alleged poor performers. Many are analyzed and interpreted using an analysis that, based on courses people have taken, intuitively feels appropriate, but should actually be avoided at all costs.
In an effort to reduce unnecessary expensive prescriptions, a pharmacy administrator developed a proposal to monitor and compare individual physicians’ tendencies to prescribe the most expensive drug within a class. Data were obtained for each of a peer group of 51 physicians; specifically, the total number of prescriptions written and, of that number, how many were for the targeted drug.
Someone was kind enough to send me this proposal, which included the data—while begging me not to be identified as the source. I quote it verbatim (adding my emphases). Given the 51 physician results:
“1. Data will be tested for the normal distribution.
“2. If distribution is normal—physicians whose prescribing deviates greater than one or two standard deviations from the mean are identified as outliers.
“3. If distribution is not normal—examine distribution of data and establish an arbitrary cutoff point above which physicians should receive feedback (this cutoff point is subjective and variable based on the distribution of ratio data).”
For my own amusement, I tested the data for normality and it “passed” (p-value of 0.277, which is > 0.05). Yes, I said “for my own amusement” because this test is moot and inappropriate for percentage data like this (the number of prescriptions in the denominators ranged from 30 to 217). The computer will do anything you want.
The scary issue here is the proposed ensuing analysis that will result from whether the data are deemed normally distributed or not. If data are normally distributed, doesn’t that mean there are no outliers? But suppose outliers are present—doesn’t this mean they’re atypical? In fact, wouldn’t their presence tend to inflate the traditional calculation of standard deviation? But wait, the data passed the normality test. It’s all so confusing!
Yet that doesn’t seem to stop our quality police from lowering the “Gotcha!” threshold to two or even one standard deviation to find outliers. In my experience, I am shocked at the extent to which this has become common practice.
Returning to the protocol, even scarier is what’s proposed if the distribution is deemed not normal: establish an arbitrary cutoff point for either what the administrator feels performance should be, or the point that will expose a pre-determined arbitrary percentage (ending in “0” or “5,” of course) of alleged bad performers and/or reward a similar arbitrary percentage of good performers.
I’ll play his game. Because the data pass the normality test, the graph below shows the suggested analysis with one, two, and three standard deviation lines drawn in around the mean. The standard deviation of the 51 numbers was 10.7.
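To make the one, two, and three standard deviation lines described above concrete, here is a minimal sketch of the arithmetic. It uses only two figures quoted in the article (the standard deviation of 10.7 and the one standard deviation lower limit of 5.15 percent). The mean is not stated in the article; the sketch infers it as 5.15 + 10.7, which is an assumption. The function name `sd_limits` is hypothetical, and lower limits are floored at zero because a percentage cannot be negative.

```python
# Figures quoted in the article
SD = 10.7          # standard deviation of the 51 physician percentages
LOWER_1SD = 5.15   # one-standard-deviation lower limit, in percent

# Assumption: the mean is inferred as (lower limit + 1 SD); the article
# does not state it directly.
MEAN = LOWER_1SD + SD


def sd_limits(mean, sd, k_values=(1, 2, 3)):
    """Return {k: (lower, upper)} for mean +/- k*sd.

    Lower limits are floored at 0 because a percentage cannot be negative.
    """
    return {k: (max(0.0, mean - k * sd), mean + k * sd) for k in k_values}


limits = sd_limits(MEAN, SD)
for k, (lower, upper) in limits.items():
    print(f"{k} SD: lower = {lower:.2f}%, upper = {upper:.2f}%")
```

Under these assumptions the upper "Gotcha!" lines fall near 26.55, 37.25, and 47.95 percent, which shows how much the set of flagged physicians depends on which multiple of the standard deviation the analyst happens to choose.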
Depending on the analyst’s mood and the standard deviation criterion subjectively selected, he or she could claim to statistically find one—or 10—“high utilizers” who would receive helpful feedback. Just curious: How does the analyst intend to deal with the 10 performances below the one standard deviation limit of 5.15 percent—and the three zeroes?
He or she could have just as easily decided that “less than 15 percent” should be a standard, resulting in 27 physician high utilizers who would receive feedback. There is also the common alternative arbitrary strategy: Let’s go after... oops, I mean give feedback to... the—pick one—top quartile, top 10 percent, top 15 percent, top 20 percent.
Another option would be to set a tough stretch goal of “less than 10 percent,” with the following choices:
The high-utilizer feedback was a thick packet of professional journal articles considered the gold standard of evidence-based practices and rationale. When I present this example and its proposed actions to a roomful of doctors, they erupt in laughter. When I ask what they do with such feedback, without fail, I see a beautifully synchronized collective pantomime of throwing things into the garbage.
For those of you in education, government, manufacturing, or administration, is this scenario similar to many conversations you routinely experience in any meetings you attend? How much waste in time, money, and morale do analyses and resulting meetings like this cost you? “Unknown or unknowable?” (Does it matter?)
Much of this results from teaching people what Donald Wheeler calls “superstitious nonsense” in the guise of statistics (especially that relating to the normal distribution). Most such material is pretty much useless when it comes to application in a real-world, everyday environment and causes far more confusion and problems than it solves.
Is it possible to change those conversations to make them more productive? More about that next time when I revisit these data.
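The earlier remark that the denominators ranged from 30 to 217 prescriptions can be illustrated with a short sketch of why a single standard deviation is a poor yardstick for percentages built on different counts. It assumes a simple binomial model for each physician's percentage, which the article does not specify. The 15 percent rate is a hypothetical illustrative value, not a figure computed from the article's data, and the function name `binomial_sd_percent` is likewise hypothetical.

```python
import math


def binomial_sd_percent(rate_percent, n):
    """Standard deviation, in percentage points, of an observed percentage
    based on n prescriptions when the underlying rate is rate_percent.

    Assumes a simple binomial model (an assumption, not the article's method).
    """
    p = rate_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


# Hypothetical underlying rate used only for illustration.
ILLUSTRATIVE_RATE = 15.0

# Smallest and largest denominators mentioned in the article.
for n in (30, 217):
    sd = binomial_sd_percent(ILLUSTRATIVE_RATE, n)
    print(f"n = {n:3d}: expected SD of the percentage = {sd:.2f} points")
```

Under this model, a physician with 30 prescriptions is expected to vary roughly 2.7 times as much as one with 217 at the same underlying rate, so one fixed set of standard deviation lines treats unlike cases alike.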
Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.
What Do ‘Above’ and ‘Below’ Average Really Mean?
Not knowing can have very serious consequences
A real example
And the consequences? Pick one…
• Financially reward the 16 physicians below 10 percent?
• Perhaps offer a bonus for those below the one standard deviation threshold of 5.15 percent?
• You could reward the bottom quartile (or 10 or 15 percent), which, along with the previous scheme, would no doubt cause displeasure among the doctors below 10 percent who didn’t get rewarded.
• Should everyone above 10 percent receive feedback? Should there be a financial penalty for a certain level above 10 percent?
About The Author
Davis Balestracci
© 2023 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
“Quality Digest” is a trademark owned by Quality Circle Institute, Inc.
Comments
Whether or not you understand statistics.....
A statement in a previous Balestracci article sums up the situation:
"Whether or not you understand statistics, you are already using statistics!"
This is a real problem, with sometimes dire and even deadly consequences.
As early as 1916, Walter A. Shewhart began to think about and do something about "data sanity."
Great to see Davis Balestracci expand on Shewhart's work. Is anybody listening? We darned well better!
"In the summer of 1916, Walter worked with the Western Electric Co. in New York (an integral part of the Bell Telephone system). He was examined by the medical officers of the firm who all thought he was suffering from tuberculosis. All tests were negative, but he had small fluctuations of temperature which seven medical men accepted as convincing proof of tuberculosis. Walter then started taking temperatures of other people, and found that his own father had similar fluctuations although he had never been ill except only once in his life. Walter discovered that most people have temperature fluctuations; so that his own case could not possibly be considered as abnormal, and he decided not to worry any further about T.B. This clearly was an authentic early piece of work in Quality Control."
Source: P. C. Mahalanobis, "Walter A. Shewhart and Statistical Quality Control in India," Sankhyā: The Indian Journal of Statistics (1933-1960), Vol. 9, No. 1 (Oct. 1948), pp. 51-60. Published by Indian Statistical Institute.
Ugh!
Based on what you cite as the proposal, the pharmacy administrator demonstrates a profound lack of understanding of variation. (I suspect there wasn't much curiosity to understand the system that generated the data, either.) In the span of about 15 seconds I went through disbelief, anger, and disgust as I recognized the very real impact that decisions based on such superstitious nonsense have on people's lives.
You keep doing what you do, Davis. I am following in Dr. Wheeler's and your footsteps to debunk these methods, and help those open to listening with better analyses.
All the best, Shrikant Kalegaonkar (https://twitter.com/shrikale or https://shrikale.wordpress.com)
Distributed folk
Reminds me of the fact that in the general population, 50% of people are below average intelligence. Those who argue are usually in this group. (Median is approximately equal to mean for IQ. Yes, and it's pretty normally distributed.) I had an acquaintance once who was very excited that she scored 99% in an IQ test. Then we have those with an IQ between 0 and 25, who are considered idiots; those with an IQ between 26 and 50, who are considered imbeciles; and those with an IQ between 51 and 70, who are considered morons. Does this mean it is better to be considered a moron than the village idiot?
Still, I would prefer my physician to be a genius rather than an idiot.