Davis Balestracci
Published: Monday, July 13, 2015 - 16:05

This is a continuation of my last column, which I’ve written to honor my late dad, who loved golf. As promised, let’s look at the Masters golf tournament final four-round scores for the 55 players who survived the cut. We’ll analyze them and then give the analysis a twist based on the ongoing enumerative vs. analytic conundrum.

Analyzing the four-round final scores with analysis of variance (ANOVA): S = 2.46032

It was interesting that “Round” showed significance (p < 0.05) and “Golfer” did not. Hole placements on the greens are changed for every round, and some can significantly increase a hole’s difficulty. Sometimes the weather during a round can also be a factor. Curious, I did an analysis of means (ANOM) by round:

Given that the p-value was somewhat marginal, and that the ANOM criteria are more conservative, none of the rounds is flagged by ANOM (not that it matters).

Below is the ANOM for the Masters golf tournament final scores (range calculated by nonparametric box plot method: 271.5 to 299.5):

As you can see, Jordan Spieth, who was pretty much on fire the entire tournament, was a true champion, statistically different from the rest of the field.

Applying the least significant difference criterion described in my last column to this sample, 1.96 × sqrt(2 × (4 × 6.053)) ≈ 13.6, scores that differ by 14 or less aren’t statistically different. However, because the p-value for “Golfer” (0.116) wasn’t significant, this criterion probably should not be used, which is the recommendation of most statistical textbooks.

To get a more conservative figure, I used the “Studentized” range criterion for multiple comparisons from my trusty old Statistical Methods, by George Snedecor and William Cochran (first published in 1937). Unfortunately, the table went to only 20 comparisons maximum. But even using that, one would declare a difference only if two final scores differed by more than 25, which, as you may note, encompasses the score range of golfers No. 2 (274) through No. 55 (297) and is in line with the p-value of 0.116, hinting at no differences.

Now, suppose you’re at a meeting with the purpose of ranking 55 employees from best to worst. You hand out a similar table and tell people that a difference of 14 “might” show a difference but, more conservatively, only a difference greater than 25 should be considered different. Can you envision the chaos resulting from the variation in how people would perceive and interpret the variation? I can easily imagine “little circles” being drawn and discussion about “above average” performers, “below average” performers, and “quartiles,” along with puzzlement as to who might be different from the person with the best score.

Can you see how the elegant simplicity of the ANOM would frame a different, perhaps more productive, conversation? Applying the ANOM to this scenario of employee ranking, there is one “superstar” and 54 “average” employees: no one above or below average, no top quartile, no second quartile, no third quartile... and no bottom quartile.

One could almost consider the results of golfers No. 2 through No. 55 a lottery. Using the resulting variation of this tournament as an example, the four-round scores of a golfer for two consecutive tournaments can differ, due to common cause, by as much as 18. Let’s apply this to golfer No. 2 (274). The resulting score of 292 would now place him 50th in the current pack due just to common cause, which in this case means a difference between a prize of $880,000 and $25,000! How do you think the golf world would treat this difference: as common or special cause? (I hope you’ve concluded that it should be common cause.)

For those of you in healthcare who are slaves to the current patient-satisfaction survey nonsense, how are your survey-to-survey changes in rankings and percentiles treated? What light could ANOM shed?
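The figure of 18 quoted above is not derived in the text. A minimal sketch of one way to reproduce it follows, assuming the limit is the usual upper control limit for a range of two values: the constant 3.268 (which also appears later in this column as the multiplier of an average range) times the standard constant d2 = 1.128, applied to the standard deviation of a four-round total. The variable names and the choice of method are assumptions made for illustration, not a statement of how the number was actually computed.

```python
import math

# Per-round standard deviation S from the Masters ANOVA.
s_round = 2.46032
rounds = 4

# Control chart constants for ranges of two values.
# Using them this way is an assumption, not the column's stated method.
D4_N2 = 3.268  # upper limit multiplier for an average range (n = 2)
D2_N2 = 1.128  # expected range of two values, in standard deviations

# Standard deviation of a four-round total, assuming independent rounds.
sd_total = s_round * math.sqrt(rounds)

# Largest difference between two four-round totals expected from common cause.
max_common_cause_diff = D4_N2 * D2_N2 * sd_total

# Golfer No. 2 shot 274; add the rounded common-cause allowance.
worst_case_score = 274 + round(max_common_cause_diff)

print(f"common-cause limit: {max_common_cause_diff:.1f}")
print(f"worst-case score for golfer No. 2: {worst_case_score}")
```

Under these assumptions the limit rounds to 18 and the worst-case score to 292, matching the numbers in the text.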
If these were all the data you had, what would this analysis allow you to predict about these golfers at the next major tournament, which happened to be the U.S. Open on June 20, 2015? Recall that there are three types of statistics. For this example:

Descriptive statistics: What can I say about this specific golfer’s score?
Enumerative statistics: What can I say about this specific group of golfers’ scores?

This was an enumerative analysis. All of these analyses and conclusions have been based on this specific data set, and action was taken on this specific group.

Some of you might ask if this is a random sample. It is, of sorts: 55 elite golfers of varying ability participating in the 2015 Masters. But is it truly random? It’s hardly sampling with replacement.

How does it compare with the “sample” from the U.S. Open? Of these 55 golfers, 48 participated and 16 missed the cut (in line with my last column’s ANOM result of a ~76% rate of making the cut). So, from that group, 32 remained, joined by 43 other golfers who also made the cut. Care to predict?

For the 32 golfers who made the cut for both tournaments, let’s compare the results. Prepare to be surprised:

Masters finish | Golfer | U.S. Open finish | Rank difference | U.S. Open score minus Masters score
1 | Jordan Spieth | 1 | 0 | 0
2 | Phil Mickelson | 64 | -62 | 19
2 | Justin Rose | 27 | -25 | 11
4 | Rory McIlroy | 9 | -5 | 4
5 | Hideki Matsuyama | 18 | -13 | 6
6 | Paul Casey | 39 | -33 | 7
6 | Dustin Johnson | 2 | 4 | -3
6 | Ian Poulter | 54 | -48 | 12
9 | Zach Johnson | 72 | -63* | 15
12 | Kevin Na | 46 | -34 | 6
17 | Sergio Garcia | 18 | -1 | 0
19 | Louis Oosthuizen | 2 | 17 | -8
19 | Henrik Stenson | 27 | -8 | 1
22 | Keegan Bradley | 27 | -5 | -1
22 | Angel Cabrera | 64 | -42 | 7
22 | Ernie Els | 54 | -32 | 5
22 | Patrick Reed | 14 | 8 | -4
28 | Jason Day | 9 | 19 | -7
28 | Morgan Hoffmann | 27 | 1 | -2
28 | Webb Simpson | 46 | -18 | 1
33 | Chris Kirk | 75 | -42 | 13
33 | Brooks Koepka | 18 | 15 | -5
33 | Ryan Palmer | 52 | -19 | 2
38 | Charl Schwartzel | 7 | 31 | -11
38 | Adam Scott | 4 | 34 | -12
38 | John Senden | 14 | 24 | -7
38 | Cameron Tringale | 54 | -16 | 2
38 | Jimmy Walker | 58 | -20 | 3
46 | Matt Kuchar | 12 | 34** | -9
46 | Lee Westwood | 50 | -4 | -1
48 | Geoff Ogilvy | 18 | 30 | -8
49 | Jason Dufner | 18 | 31 | -9

* largest drop

For this group of 32, was there a significant difference between the U.S. Open scores and the Masters scores?

t-test of (U.S. Open score minus Masters score):

Variable | N | Mean | StDev | T | p

Since there’s no significance, let’s treat them as two “replicates” and take the range for each golfer:

Average range: 6.281; R max = 3.268 × 6.281 ~ 20
Median range: 6.0; R max = 3.865 × 6.0 ~ 23
Using the ANOVA standard deviation of 2.8: R max ~ 21

(The maximum observed difference between two scores was 19: Phil Mickelson.)

If you read my last column, do you remember when I calculated that two tournament scores could differ by as much as 18?

And, by the way, the U.S. Open standard deviation for an individual round, obtained from the ANOVA for all the golfers playing four rounds, was 2.80, compared to 2.46 for the Masters. For those of you who watched, this was a very frustrating course, but the Masters course is no picnic, either.

For those of you who are interested, here is the ANOM for the U.S. Open final scores:

Note that, unlike his unbelievable Masters performance, Jordan Spieth was not a “true” champion in this case.
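The paired comparison and the range limits above can be re-derived from the 32 score differences in the table. The sketch below uses only the Python standard library; the variable names are illustrative. One assumption is labeled in the code: Patrick Reed’s difference is entered as -4, the sign that reproduces the reported mean of 0.84375 and standard deviation of 7.90308. The multipliers 3.268 and 3.865 are the ones quoted in the text.

```python
import math
import statistics

# U.S. Open score minus Masters score, in table order (Spieth first).
# Assumption: Patrick Reed (17th entry) is -4, the sign that matches
# the reported mean and standard deviation.
diffs = [0, 19, 11, 4, 6, 7, -3, 12, 15, 6, 0, -8, 1, -1, 7, 5,
         -4, -7, -2, 1, 13, -5, 2, -11, -12, -7, 2, 3, -9, -1, -8, -9]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)

# Paired t statistic: mean difference over its standard error.
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# Treat the two tournaments as replicates: range = absolute difference.
ranges = [abs(d) for d in diffs]
avg_range = statistics.mean(ranges)
median_range = statistics.median(ranges)

# Upper limits on a single range, using the multipliers from the text.
r_max_from_avg = 3.268 * avg_range
r_max_from_median = 3.865 * median_range

print(f"N={n} mean={mean_diff:.5f} sd={sd_diff:.5f} t={t_stat:.2f}")
print(f"average range={avg_range:.3f} R max={r_max_from_avg:.1f}")
print(f"median range={median_range:.1f} R max={r_max_from_median:.1f}")
print(f"largest observed range={max(ranges)}")
```

The p-value is left out because it needs a t distribution, which the standard library does not provide.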
For those of you who watched the exciting conclusion (down to the last putt!), the last three or four holes saw three people interchanging the leader position seemingly at random. Everyone’s putting was (common cause) erratic, and Louis Oosthuizen came out of nowhere to almost win via the back door! No one was truly in “the zone.”

Not much. So, we now have two individual enumerative analyses. Suppose I did similar calculations of “significant differences” for the U.S. Open and passed those out with these data in addition to the Masters analysis and data, with the goal of ranking these golfers. You’d be there until Christmas.

Do you think this might apply to some of your organizational data? As Deming said, “Management is prediction.” How do you see through the heavy fog of common cause to make good management decisions?

Are you as amazed as I am at the amount of common cause? And do you see why, on any given day, any professional golfer is probably capable of winning any tournament or just as easily missing the cut?

But in this golf example, despite the amount of common cause, one could ask, “Are some players more consistently at the top of the leader board regardless of the tournament and competition?”

Looking at data over time raises the analytic statistics question: “What can I say about the process that produced both of these tournament results, both of these groups of golfers, and the individual golfers’ results?”

I can just picture my dad if I tried to explain this all to him. As if it were yesterday, I see that wrinkled brow and twinkle in his eye as he says, “Where did you come from?”

I miss you, Dad.

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended, as an approach to leadership, in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.

More Golf, Statistically
True champion or lottery winner?
Source | DF | SS | MS | F | P
Round | 3 | 51.382 | 17.127 | 2.83 | 0.040
Golfer | 54 | 420.800 | 7.793 | 1.29 | 0.116 (!)
Error | 162 | 980.618 | 6.053 | |
Total | 219 | 1452.800 | | |
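The derived columns of the ANOVA table can be re-checked from its sums of squares and degrees of freedom, and the same error mean square feeds the least significant difference of about 14 quoted in the text, 1.96 × sqrt(2 × (4 × MS error)). The following is a minimal sketch of that arithmetic; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
import math

# Degrees of freedom and sums of squares as printed in the ANOVA table.
anova = {
    "Round": {"df": 3, "ss": 51.382},
    "Golfer": {"df": 54, "ss": 420.800},
    "Error": {"df": 162, "ss": 980.618},
}


def mean_square(row):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return row["ss"] / row["df"]


ms_error = mean_square(anova["Error"])

# S is the per-round standard deviation: the square root of MS error.
s = math.sqrt(ms_error)

# Each F ratio compares a factor's mean square with MS error.
f_round = mean_square(anova["Round"]) / ms_error
f_golfer = mean_square(anova["Golfer"]) / ms_error

# Least significant difference between two four-round totals.
rounds = 4
lsd = 1.96 * math.sqrt(2 * rounds * ms_error)

print(f"MS error={ms_error:.3f} S={s:.5f}")
print(f"F(Round)={f_round:.2f} F(Golfer)={f_golfer:.2f}")
print(f"LSD for four-round totals={lsd:.1f}")
```

The sums of squares for Round, Golfer, and Error also add up to the Total row, a quick check that the table was transcribed correctly.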
Be careful about declaring ‘differences’
Uncomfortable analogy?
How does this compare with the U.S. Open round three?
** largest gain
Diff | 32 | 0.843750 | 7.903080 | 0.60 | 0.550 (No)

Bottom line: What can you predict from these two analyses?
About The Author
Davis Balestracci
© 2023 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
“Quality Digest” is a trademark owned by Quality Circle Institute, Inc.