Customer satisfaction surveys are all the rage these days. Healthcare has a couple of “800-pound gorilla” surveyors whose services (and nontrivial expense) have been pretty much forced upon it. In many cases, targets are set that are used to drive reimbursement.
A client once shared a 186-page quarterly summary from one such vendor (most of it worthless; I gave up at page 28). I noticed two frequent tendencies: 1) when current percentile performance was below the 50th percentile, the number was color-coded red (obviously nobody wants to be below average); 2) next to each number was what the vendor called a “trend” showing the latest three months’ performances as a line graph.
Let’s suppose a system with two hospitals has its quarterly meeting to assess patient satisfaction progress, and the following (real) data are presented. “Top box” means the percentage of responses that were 9s or 10s on a scale of 1 to 10. I’m sure there were red, yellow, green targets, but I have no idea what they were, and frankly, I don’t care.
How do you compare these performances, and what action should be taken?
What might you do quietly behind the scenes?
With a little digging, you find the last four years of monthly data and plot run charts...
...which pretty much render the previous data tables and trend analyses worthless. For at least the past year, both hospitals have exhibited common cause performance, i.e., nothing has changed for either. In the case of Hospital 1, nothing has changed in four years! Hospital 2 exhibited a “needle bump” from month 25 to month 26.
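For readers who want to try this on their own monthly numbers, here is a minimal sketch of one common run-chart test: the longest run of consecutive points on one side of the median. The function name, the threshold of eight, and both data series are illustrative assumptions, not the hospitals' data or necessarily the exact rule used for the charts described here.

```python
from statistics import median

def longest_run_about_median(values):
    """Length of the longest run of consecutive points on one side of the median.

    Points exactly on the median are skipped: they neither extend nor break a run.
    """
    center = median(values)
    longest = current = 0
    last_side = 0
    for v in values:
        if v == center:
            continue
        side = 1 if v > center else -1
        current = current + 1 if side == last_side else 1
        last_side = side
        longest = max(longest, current)
    return longest

# Hypothetical monthly top-box percentages (made up for illustration)
stable = [78, 74, 81, 76, 80, 73, 79, 75, 82, 77, 72, 80]
shifted = [70, 72, 69, 71, 68, 73, 70, 71, 80, 82, 79, 81, 83, 80, 82, 81]

RUN_RULE = 8  # assumed threshold; published run-chart rules vary slightly

for name, series in [("stable", stable), ("shifted", shifted)]:
    run = longest_run_about_median(series)
    verdict = "possible special cause" if run >= RUN_RULE else "no signal"
    print(f"{name}: longest run = {run} ({verdict})")
```

A series that bounces back and forth across its median produces only short runs, while a step change like a "needle bump" tends to leave a long run on each side.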
I’ve had experience with several such data sets. The amount of inherent month-to-month variation is incredible. But, hey, don’t take my word for it. Here are the data replotted as control charts, with Hospital 2’s data adjusted for the step change at month 26:
You will notice that these two hospitals currently have identical overall performance (i.e., average). Also, individual monthly performances of top-box percentages between 65 percent and 90 percent can occur simply due to common cause.
• If this 25-percent common cause range encompasses arbitrary red, yellow, and green goals, it just adds to the futility and unwitting damage caused by using summaries such as those at the beginning of this column.
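A common cause range like the one quoted above typically comes from the limits of an individuals control chart. The sketch below uses the textbook calculation (mean plus or minus 2.66 times the average moving range) as an assumption; some practitioners prefer the median moving range with a different constant, and the data here are invented. For a series with a step change, the same function would be applied separately to each stable segment.

```python
from statistics import mean

def individuals_limits(values):
    """Return (lower, center, upper) common cause limits for an individuals chart.

    Uses the average moving range between consecutive points and the
    conventional 2.66 multiplier.
    """
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

def points_outside(values):
    """Indices (0-based) of points falling outside the common cause limits."""
    lower, _, upper = individuals_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical monthly top-box percentages (made up for illustration)
monthly = [78, 74, 81, 76, 80, 73, 79, 75, 82, 77, 72, 80]

lower, center, upper = individuals_limits(monthly)
print(f"average = {center:.1f}, common cause range = {lower:.1f} to {upper:.1f}")
print("months outside limits:", points_outside(monthly))
```

Any single month inside the limits tells you nothing new about the process, whichever color a target table would have painted it.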
The percentile rank is virtually worthless as an isolated one-time measure (common cause range of pretty much zero to 100 percent). But the overall average of a stable period could have some use.
• Do you see a problem if organizations set a green goal of 90th percentile (unfortunately, typical)? This is a case where using a traffic light display could have very toxic consequences.
• In this case, both hospitals currently average in the 60th percentile—slightly above average—and have been performing at that level for at least the past two years.
I’m curious: If there has been no change in satisfaction in more than a year, why is this organization continuing expensive data collection to tell it the same thing? I guess it’s important for them to know, “Did the TV call button work?”
Beware of soliciting and implementing “good ideas.” Isn’t that what’s probably been going on for the past year (and longer) in the two hospitals above?
The result? Someone will react to charts showing no improvement in almost two years by suggesting the tired, “Maybe it’s time to have a refresher in customer satisfaction training and hold people accountable.” Ah, yes, the classic, “Smile or else!”
• Statistics on performance by themselves do not improve performance.
• There is no such thing as “improving customer satisfaction” in general.
• Vague solutions to a vague problem will yield vague results.
Focus... focus... focus... what is the 20 percent of your organization that is causing 80 percent of your customer dissatisfaction?
You can now quietly seek out several past summaries over a stable period (demonstrated by chart behavior), which in this case would be almost two years. One can apply the Pareto principle by tallying bottom box scores, or use the aggregated performance of appropriately grouped questions to do some p-chart analyses of means. This could expose which aspects of customer satisfaction have excessive negative scores. You could also break it down by, say, diagnostic services (e.g., lab, X-ray), in-patient surgical, one-day surgery, or department. Here’s where innovation can apply, a much better use of energy than having to find out, “Why did we go from the 77th percentile to the 44th percentile?” (Hospital 1 in table above.)
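As a rough illustration of that stratification, the sketch below tallies bottom box counts by service area into a Pareto table and then screens each area's bottom box proportion against p-chart style limits around the overall proportion. The counts, the area names, and the plain three-sigma limits are all assumptions for illustration; a formal analysis of means would use slightly different critical values.

```python
from math import sqrt

def pareto(counts):
    """Rows of (category, count, cumulative percent), largest count first."""
    total = sum(counts.values())
    rows, cumulative = [], 0
    for category, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += n
        rows.append((category, n, 100 * cumulative / total))
    return rows

def p_chart_flags(bottom, surveyed, sigma=3.0):
    """Flag groups whose bottom box proportion falls outside
    p_bar +/- sigma * sqrt(p_bar * (1 - p_bar) / n) for that group's n."""
    p_bar = sum(bottom.values()) / sum(surveyed.values())
    flags = {}
    for group, n in surveyed.items():
        half_width = sigma * sqrt(p_bar * (1 - p_bar) / n)
        p = bottom[group] / n
        if p > p_bar + half_width:
            flags[group] = "high"
        elif p < p_bar - half_width:
            flags[group] = "low"
    return p_bar, flags

# Hypothetical counts aggregated over a stable period (made up for illustration)
bottom_box = {"lab": 12, "x-ray": 15, "inpatient surgical": 60, "one-day surgery": 13}
responses = {"lab": 400, "x-ray": 400, "inpatient surgical": 400, "one-day surgery": 400}

for category, n, cum_pct in pareto(bottom_box):
    print(f"{category:20s} {n:4d} {cum_pct:6.1f}%")

p_bar, flags = p_chart_flags(bottom_box, responses)
print(f"overall bottom box proportion = {p_bar:.4f}; flagged: {flags}")
```

With these invented numbers, one area accounts for most of the dissatisfaction and is the only one outside the limits, which is the kind of focused opportunity worth someone's energy.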
Can you think of one... or two... or three such opportunities in your current work? Wouldn’t that make your work life far more interesting? What if you got a reputation for getting results that stop the futile, unproductive reactions to (and dreaded meetings about) data such as those presented at the beginning of this column?
Is conscious improvement starting to sound more feasible?
Comments
Stable results
When discussing customer satisfaction surveys I've heard similar suggestions that there is no point in collecting the data if the results are stable.
I see your larger point about monitoring not changing anything, but I don't understand why results that indicate no change would be less valuable than those showing an improvement or decline.
I agree with you
Thanks for commenting. Deming: "Statistics on performance do not improve performance." My point is that if all you're going to do with the result is react the same way you always do, you're going to get the same result. Doing another expensive survey is waste -- you may already have a wealth of information in the most recent stable history. If possible, why not take some of the data already collected to see whether you can aggregate enough to stratify down to a focused opportunity (as I suggest)?
If all you have is the ranking itself and not the actual survey data, you have no idea what to do and need to do some temporary collection that digs into process inputs. If that's the case, put your money there for now rather than yet another survey that gives you the same answer.
Subsequent survey results can show whether you've "bumped the needle" with any intervention.
This is a very typical scenario. In such cases, people tend to react to the latest result with no idea of the true context of variation. Right now, percentile ranking is a number that makes executives perspire ("Can't be below 50th!") and even drives some reimbursement. The charts show that the number in isolation is worthless -- the average of its history, on the other hand, lets you make a relatively accurate estimate of your ranking, which is about as far as its usefulness goes. But, as I see it currently practiced, it's sheer crazy-making. It's like someone thinking that weighing themselves 10 times a day is going to decrease their weight.
Thanks
That's a useful way of thinking about it. Thanks for the thoughtful and in-depth response.