Eston Martz

Lessons From a Statistical Analysis Gone Wrong, Part 1

Watching horses and eating crow

Published: Monday, July 27, 2015 - 11:32

I don’t like the taste of crow, which is a shame, because I’m about to eat a huge helping of it.

I’m going to tell you how I messed up an analysis. But in the process, I learned some new lessons and was reminded of some older ones I should remember to apply more carefully.

This failure starts in a victory

Photo of American Pharoah used under Creative Commons license 2.0. Source: Maryland GovPics

My mistake originated in the 2015 Triple Crown victory of American Pharoah. I’m no racing enthusiast, but I knew this horse had ended almost four decades of Triple Crown disappointments, and that was exciting. I’d never seen a Triple Crown won before. It hadn’t happened since 1978.

So when an acquaintance asked to contribute a guest post to the Minitab Blog that compared American Pharoah with previous Triple Crown contenders, including the record-shattering Secretariat, who took the Triple Crown in 1973, I eagerly accepted.

In reviewing the post, I checked and replicated the contributor’s analysis. It was a fun post, and I was excited about publishing it. A few days after it went live, however, I had to remove it because the analysis was not acceptable.

To explain how I made my mistake, I’ll need to review that analysis.

Comparing American Pharoah and Secretariat

In the post, we used Minitab’s statistical software to compare Secretariat’s performance to other winners of Triple Crown races.

Since 1926, the Belmont Stakes has been the longest of the three races at 1.5 miles. The analysis began by plotting 89 years of winning times on an individuals chart (I-chart):

Only two data points were outside of the I-chart’s control limits:
• The fastest winner, Secretariat’s 1973 time of 144 seconds
• The slowest winner, High Echelon’s 1970 time of 154 seconds

The average winning time was 148.81 seconds, which Secretariat beat by more than 4 seconds.
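For readers who want to see how I-chart limits of this kind are typically derived, here is a minimal Python sketch using the common textbook formula: the center line is the mean, and the limits sit 2.66 average moving ranges on either side. The winning times below are made up for illustration and are not the Belmont data, and the function names are hypothetical; Minitab’s own options for estimating variation may differ from this simple version.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (center, lower limit, upper limit) for an individuals (I) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving ranges: absolute differences between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is 3 / d2, where d2 = 1.128 for moving ranges of two points.
    half_width = 2.66 * mr_bar
    return center, center - half_width, center + half_width


def out_of_control(values):
    """Return (index, value) pairs that fall outside the control limits."""
    _, lcl, ucl = individuals_chart_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]


# Made-up winning times in seconds (an assumption, NOT the real race results).
times = [149.0, 148.6, 149.2, 148.8, 149.4, 148.4,
         149.0, 144.0, 148.8, 149.2, 148.6, 149.0]

center, lcl, ucl = individuals_chart_limits(times)
flagged = out_of_control(times)
print(f"center={center:.2f}, limits=({lcl:.2f}, {ucl:.2f}), flagged={flagged}")
```

In this toy series, only the single unusually fast time falls outside the limits, which mirrors how an exceptional run would stand out on the chart.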

Applying a capability approach to the race data

Next, the analysis approached the data from a capability perspective: Secretariat’s time was used as a lower spec limit, and the analysis sought to assess the probability of another horse beating that time.

The way you assess capability depends on the distribution of your data, and a normality test in Minitab showed this data to be non-normal.
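As a rough illustration of what a normality check does, the sketch below computes the Anderson-Darling statistic in plain Python. The column does not say which test was run, so the choice of Anderson-Darling is an assumption, as are the function names and both sample data sets, which are synthetic. The cutoff of 0.752 is the commonly tabulated 5-percent critical value for the adjusted statistic when the mean and standard deviation are estimated from the data.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

# Commonly tabulated 5% critical value for the adjusted statistic
# when both mean and standard deviation are estimated from the sample.
CRITICAL_5_PERCENT = 0.752


def anderson_darling_normal(values):
    """Return the small-sample-adjusted Anderson-Darling statistic."""
    n = len(values)
    if n < 8:
        raise ValueError("this sketch expects at least 8 observations")
    fitted = NormalDist(mean(values), stdev(values))
    tiny = 1e-12  # keep the logarithms finite for extreme points
    cdf = [min(max(fitted.cdf(x), tiny), 1 - tiny) for x in sorted(values)]
    total = sum(
        (2 * i - 1) * (log(cdf[i - 1]) + log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a_squared = -n - total / n
    # Adjustment for estimating the parameters from the data.
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)


def looks_non_normal(values):
    """True when the statistic exceeds the 5% critical value."""
    return anderson_darling_normal(values) > CRITICAL_5_PERCENT


# Synthetic samples (assumptions for illustration, not race data).
quantiles = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
bell_shaped = [148.8 + 1.5 * q for q in quantiles]
right_skewed = [147.0 + exp(q) for q in quantiles]

print(looks_non_normal(bell_shaped), looks_non_normal(right_skewed))
```

A large statistic means the data sit far from the fitted normal curve, especially in the tails, which is exactly where a capability estimate is most sensitive.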

When you run Minitab’s normal capability analysis, you can elect to apply the Johnson transformation, which can automatically transform many non-normal distributions before the capability analysis is performed. This is an extremely convenient feature, but here is where I made my mistake.

Running the capability analysis with Johnson transformation, using Secretariat’s 144-second time as a lower spec limit, produced the following output:

The analysis found a 0.36-percent chance of any horse beating Secretariat’s time, making it very unlikely indeed.
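Mechanically, a figure like this comes from transforming both the data and the spec limit to a normal scale, then reading off the tail area below the transformed limit. The sketch below shows that step for one member of the Johnson family, the unbounded (SU) form z = gamma + eta * asinh((x - epsilon) / lambda). Every parameter value here is hypothetical, the function names are invented for illustration, and the result is not meant to reproduce the number above. It also assumes a transformation that fits the data acceptably has already been found.

```python
from math import asinh
from statistics import NormalDist


def johnson_su_z(x, gamma, eta, epsilon, lam):
    """Map x to the normal scale with a Johnson SU transformation."""
    if eta <= 0 or lam <= 0:
        raise ValueError("eta and lam must be positive")
    return gamma + eta * asinh((x - epsilon) / lam)


def prob_below_lower_spec(lsl, gamma, eta, epsilon, lam):
    """Estimated probability of a value below the lower spec limit."""
    z_lsl = johnson_su_z(lsl, gamma, eta, epsilon, lam)
    # After transformation the data are treated as standard normal,
    # so the tail area below the transformed limit is the answer.
    return NormalDist().cdf(z_lsl)


# Hypothetical parameters (assumptions, NOT fitted to any race data).
params = dict(gamma=0.5, eta=1.5, epsilon=148.0, lam=2.0)

example_probability = prob_below_lower_spec(144.0, **params)
print(f"P(time < 144 s) under these made-up parameters: {example_probability:.4f}")
```

Because the answer depends entirely on the fitted parameters, the estimate is only as trustworthy as the transformation behind it.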

The same method was applied to data from the other two events making up horse racing’s Triple Crown, the Kentucky Derby and the Preakness:

We found a 5.54-percent chance of a horse beating Secretariat’s Kentucky Derby time.

We found a 3.5-percent probability of a horse beating Secretariat’s Preakness time.

Despite the billions of dollars and countless hours of effort spent trying to make thoroughbred horses faster during the past 43 years, no one has yet beaten “Big Red,” as Secretariat was known. The analysis, therefore, indicated that although American Pharoah may be a great horse, he is no Secretariat.

That conclusion may well be true, but it turns out we can’t use this analysis to make that assertion.

My mistake is discovered, and the analysis unravels

Here’s where I start chewing those crow feathers. A day or so after sharing the post about American Pharoah, a reader sent the following comment:
“Why does Minitab allow a Johnson Transformation on this data when using Quality Tools > Capability Analysis > Normal > Transform, but does not allow a transformation when using Quality Tools > Johnson Transformation? Or could I be doing something wrong?”

Interesting question. In all honesty, it hadn’t even occurred to me to try to run the Johnson transformation on the data by itself. If the Johnson transformation worked when performed as part of the capability analysis, though, it ought to work when applied outside of that analysis, too.

I suspected the person who asked this question might have just checked a wrong option in the dialog box, so I tried running the Johnson transformation on the data by itself.

The following note appeared in Minitab’s session window:

Uh oh.

Our reader hadn’t done anything wrong, but it looked as though I had made an error somewhere. But where?

I’ll show you exactly where I made my mistake in my next column.


About The Author

Eston Martz

For Eston Martz, analyzing data is an extremely powerful tool that helps us understand the world—which is why statistics is central to quality improvement methods such as lean and Six Sigma. While working as a writer, Martz began to appreciate the beauty in a robust, thorough analysis and wanted to learn more. To the astonishment of his friends, he started a master’s degree in applied statistics. Since joining Minitab, Martz has learned that a lot of people feel the same way about statistics as he used to. That’s why he writes for Minitab’s blog: “I’ve overcome the fear of statistics and acquired a real passion for it,” says Martz. “And if I can learn to understand and apply statistics, so can you.”