Davis Balestracci
Published: Thursday, April 23, 2015 - 15:43

Ah... Baseball

Welcome to baseball season! I always do a baseball-themed article around this time, and I found my topic after stumbling on a recent article asking: How accurate are umpires when calling balls and strikes?

From what I understand, since 2008 home plate umpires have been electronically monitored every game and given immediate feedback on their accuracy—i.e., the number of actual balls they called as strikes, and vice versa. Using the aggregated data from the 2008–2013 seasons, the author observed that wrong calls were made 15 percent of the time (the average of both rates combined), which, according to him, “is just too high.” He provided a table of umpires whose inaccuracy rate was 15 percent or higher—38 umpires out of approximately 80 (tsk, tsk... that’s close to half of them). He also listed the top 10 most accurate umpires. Oops—I mean the 10 umpires who happened to have the lowest rates of wrong calls.

Kill the umpire!

I was curious and found the data source. This site was unbelievable—you can slice and dice the data any way you want. I obtained the 2014 data for all umpires. It was displayed as two figures using a common presentation—left-axis data as a bar graph and right-axis data as a line graph, with the individual umpires on the horizontal axis. By hovering my cursor over each “dot,” I obtained and entered the data on each umpire, converted the two graphs to p-charts, and added a third chart for the combined rates.

According to the first p-chart, umpires 7, 9, 11, 12, 14, 28, 75, and 82 had above-average mistake rates, and umpires 35, 38, 52, 71, and 78 had below-average mistake rates. According to the second p-chart, umpires 5, 26, 55, and 85 had above-average mistake rates, and umpires 9, 28, 30, and 72 had below-average mistake rates. Note that no umpire was either good at both or poor at both.

I was curious about the overall wrong-call rate, so I combined them. In the combined chart, umpires 11, 12, 14, 33, 55, and 82 had above-average mistake rates (umpires 11, 12, and 14 appeared previously, but not in both), and umpires 15, 35, 52, 71, and 88 had below-average mistake rates (umpires 35, 52, and 71 appeared previously, but not in both). Of course there are the 10 lowest rates, but there is truly only a “top five” in accuracy (umpires 15, 35, 52, 71, and 88).

How might these three p-charts change the current conversation?
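For readers who want to try this on their own data, here is a minimal sketch of the p-chart arithmetic behind those three charts. The umpire names and counts below are hypothetical placeholders (not the actual 2014 figures); the logic is the standard p-chart recipe: each umpire's wrong-call proportion, a common center line, and 3-sigma limits that widen or narrow with each umpire's number of calls.

```python
# A minimal p-chart sketch using hypothetical counts, not the actual 2014 umpire data.
# For each umpire with n total calls and x wrong calls:
#   p      = x / n                      individual wrong-call proportion
#   p_bar  = sum(x) / sum(n)            center line (overall proportion)
#   limits = p_bar +/- 3 * sqrt(p_bar * (1 - p_bar) / n)
# Only umpires falling outside their own limits are distinguishable from the average.

import math

# Hypothetical (umpire, wrong_calls, total_calls) rows for illustration only.
data = [
    ("U07", 410, 2500),
    ("U15", 290, 2600),
    ("U35", 300, 2700),
    ("U55", 450, 2400),
]

p_bar = sum(x for _, x, _ in data) / sum(n for _, _, n in data)

for umpire, x, n in data:
    p = x / n
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    lcl, ucl = p_bar - half_width, p_bar + half_width
    if p > ucl:
        verdict = "above-average mistake rate"
    elif p < lcl:
        verdict = "below-average mistake rate"
    else:
        verdict = "indistinguishable from the overall average"
    print(f"{umpire}: p = {p:.3f}, limits = ({lcl:.3f}, {ucl:.3f}) -> {verdict}")
```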
The correlation is significant

One might wonder whether the two individual types of errors are related, based on a theory that if someone were a “bad” umpire, he would have high rates of both, and vice versa. What is the correlation between the two?

Correlation = –0.242 (p-value = 0.021, which is < 0.05: statistically significant... or is it?)

Let’s clear things up with a scatter plot, with a trend line of course. Many people don’t realize that a trend line is an implicit regression and that any regression has at least three diagnostics. The data point at the lower right happens to be a whopping outlier, which invalidates the analysis. In fact, after eliminating that point and looking at the correlation of what remains:

Correlation = –0.135 (p-value = 0.207, which is > 0.05: not statistically significant)

As Ellis Ott used to say, “First, you plot the data, then you plot the data, then you plot the data.”
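As a quick illustration of that diagnostic step, the sketch below (using NumPy and SciPy) computes the Pearson correlation and its p-value with and without a single influential point. The rate pairs are simulated stand-ins for the two wrong-call rates, not the actual umpire data, so the printed values will not match the –0.242 and –0.135 above; the point is how much one outlier can move both r and its p-value.

```python
# A minimal sketch of "check the correlation, then check it again without the outlier."
# The (rate_a, rate_b) pairs are hypothetical stand-ins, not the actual 2014 umpire data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-umpire rates for the two kinds of wrong calls.
rate_a = rng.normal(0.14, 0.02, size=80)
rate_b = rng.normal(0.18, 0.02, size=80)

# One influential point in the lower right, like the outlier in the scatter plot.
rate_a = np.append(rate_a, 0.25)
rate_b = np.append(rate_b, 0.08)

r_all, p_all = stats.pearsonr(rate_a, rate_b)

# Drop the outlier and recompute; a "significant" r driven by one point is fragile.
mask = ~((rate_a > 0.22) & (rate_b < 0.10))
r_trim, p_trim = stats.pearsonr(rate_a[mask], rate_b[mask])

print(f"with outlier:    r = {r_all:+.3f}, p = {p_all:.3f}")
print(f"without outlier: r = {r_trim:+.3f}, p = {p_trim:.3f}")
```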
Given a set of numbers...

• 10 percent will be the top 10 percent, and a different 10 percent will be the bottom 10 percent.
• An arbitrary number ending in 0 or 5 percent will be the top (same number) percent, and a different (same number) percent will be the bottom (same number) percent.
• 10 people will be the top 10, and 10 different people will be the bottom 10.

Our ranking-obsessed society continues its quest to find the best and worst of everything. As I hope this has shown, there is no pre-set percentage of outliers—and there is also the possibility of no outliers!

I remember an illustration in one of Deming’s books where he took a figure similar to my p-charts and wrote on the chart, about the performances between the common-cause limits: “These cannot be ranked.” Based on the given data, they are indistinguishable from each other and from the overall average.

More data might shed further light—some umpires currently near either limit might then have a big enough denominator to indeed declare them above or below average. There will also be the poor person whose performance, as in the umpire analysis above, could, for example, go from 15th best (No. 15) to 15th worst (No. 75)—through no fault of his own or change in his performance—provided the others maintained their current “process” as well.
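To make the “big enough denominator” point concrete, here is a small sketch assuming an overall wrong-call rate of 15 percent (the article’s figure): the 3-sigma p-chart limits tighten as an umpire’s number of calls grows, so a rate a couple of percentage points from the center line is indistinguishable at small n but becomes a signal at large n.

```python
# How p-chart limits tighten with sample size, assuming an overall rate of 0.15.
# Half-width of the 3-sigma limits: 3 * sqrt(p_bar * (1 - p_bar) / n)

import math

p_bar = 0.15  # assumed overall wrong-call rate (the article's 15 percent)

for n in (500, 1000, 2000, 4000, 8000):
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    print(f"n = {n:5d}: limits = {p_bar:.3f} +/- {half_width:.3f}")
```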
What does this lack of basic knowledge about variation cost our society? How would analyses like these change conversations to make subsequent actions productive rather than the status quo of increasing confusion, conflict, complexity, and chaos?

The article’s author felt that a 15-percent wrong-call rate was too high. Well, it’s what the current system is perfectly designed to get. He may not like it, and other people may not like it, but that’s what it is, and ranking to death won’t solve a thing. And—horrors!—note that half of the umpires were above average.

Until next time...
Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach, as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.