



Published: 01/11/2012
One of the cornerstones of quality and lean Six Sigma is data: “We insist on it.” “Don’t tell us what you think the situation is; let the data do the talking.” “In God we trust—all others bring data.” You get the idea.
An unfortunate side effect of this emphasis is the proliferation of useless data. If the useless data weren’t used, collecting them would merely be a waste of time. But if a person’s performance is being measured by these data, you can bet your last euro that the measurements will get a lot of attention, and they will drive a lot of behavior. And if the system doesn’t change, there’s still one way to make the measurements look better: cheat.
I often open my face-to-face training sessions with Dr. Deming’s Red Bead Experiment. It’s a great icebreaker, and it introduces some important statistical ideas. The experiment is actually a game with very simple rules. “Willing workers” are required to use a paddle with holes in it to sample beads from a container of red and white beads. “We don’t want any red beads,” the workers are told. To drive the point home, there are “quality inspectors” to check the samples for the unwanted red beads and to record the results, and “supervisors” to use the results to “coach” and discipline the hapless willing workers.
Before the game concludes, there are always participants who, seeing a bunch of red beads on their paddle, quickly dump the sample back before the count can be made. Others deliberately pick out red beads and throw them back. Still others bring partially filled paddles to the quality inspectors. There are all manner of ways to try to beat the system. And this is just a fun game, played for no stakes at all. Imagine what people do when real consequences are on the line, such as pay and promotions.
The most serious games are probably played in totalitarian countries, where factory managers are measured and sometimes executed when the results are less than required by the authorities. According to the History Learning Site, in Stalin’s Russia:
Factories took to inflating their production figures and the products produced were frequently so poor that they could not be used—even if the factory producing those goods appeared to be meeting its target. The punishment for failure was severe.
In the book Eat the Rich (Atlantic Monthly Press, 1999), author P. J. O’Rourke tells us that in the former USSR:
The trouble wasn’t that factory managers disobeyed orders. The trouble was that they obeyed them precisely. If a shoe factory was told to produce 1,000 shoes, it produced 1,000 baby shoes because they were the cheapest and easiest to make. If it was told to produce 1,000 men’s shoes, it made them all one size. If it was told to produce 1,000 shoes in a variety for men, women, and children, it produced 998 baby shoes, one pump, and a wingtip. If it was told to produce 3,000 pounds of shoes, it produced one enormous pair of concrete sneakers.
Perhaps O’Rourke is exaggerating, but the point is still essentially valid: Metrics can—and probably will—be gamed. In lean Six Sigma there’s a common metric-gaming activity that I call Denominator Improvement. One of the most popular metrics is defects per million opportunities, or DPMO. The formula itself is quite simple: DPMO = 1,000,000 × Defects / Opportunities. If someone’s performance is being measured using DPMO, he can make the metric look better by reducing defects (the numerator) or by increasing the number of opportunities (the denominator).
For example, we might be interested in the number of typing errors in this post. The DPMO metric might be 1,000,000 x Errors/Total Words. But if this number didn’t look good enough, I might also use 1,000,000 x Errors/Total Letters, or 1,000,000 x Errors/Total Characters, counting spaces and punctuation.
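To make the denominator game concrete, here is a minimal Python sketch; the error and word counts are hypothetical illustration values, not counts from this article.

def dpmo(defects, opportunities):
    # Defects per million opportunities: 1,000,000 x defects / opportunities
    return 1_000_000 * defects / opportunities

# Hypothetical counts for a short article (illustration only)
errors = 5
total_words = 800
total_letters = 4000
total_characters = 4800   # letters plus spaces and punctuation

print(dpmo(errors, total_words))       # 6250.0  per word
print(dpmo(errors, total_letters))     # 1250.0  per letter
print(dpmo(errors, total_characters))  # ~1041.7 per character

The defect count never changes; only the choice of denominator does, yet the reported DPMO improves roughly sixfold.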
The solution to metrics gaming is to use metrics to guide improvement, not to measure the performance of people. Metrics should be limited to numbers that quantify an important outcome (a Y metric) or an input that is critical to the quality of the outcome (a CTQ or X metric). The reason for quantifying these things is to discover, validate, and use a transfer function—a model of the cause-and-effect relationship, Y = f(x)—to guide improvement planning and activity. When metrics serve a useful purpose such as this, the tendency to manipulate and game them is, if not eliminated, at least reduced.
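As a sketch of what a transfer function can look like in practice, the following Python fragment fits a straight line through made-up process data; the variable names, the data, and the assumption of a linear relationship are all illustrative, not drawn from any particular study.

import numpy as np

# Hypothetical process data (illustration only): an input x that is
# critical to quality and the outcome y we actually care about
x = np.array([150.0, 160.0, 170.0, 180.0, 190.0, 200.0])
y = np.array([42.1, 44.0, 46.3, 47.9, 50.2, 51.8])

# Discover a simple transfer function Y = f(x); a straight line is
# assumed here purely for the sketch
slope, intercept = np.polyfit(x, y, deg=1)

# Use the model to plan improvement: what x is needed for a target Y?
target_y = 49.0
required_x = (target_y - intercept) / slope
print(f"Y = {slope:.3f}*x + {intercept:.1f}; for Y = {target_y}, aim for x of about {required_x:.0f}")

Used this way, the numbers feed a model that guides the next improvement step rather than a scoreboard that ranks the people doing the work.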
Links:
[1] http://www.historylearningsite.co.uk/Stalin.htm
[2] http://www.amazon.com/Eat-Rich-Treatise-Economics-ORourke/dp/0871137607