Eston Martz

Six Sigma

Lessons From a Statistical Analysis Gone Wrong, Part 1

Watching horses and eating crow

Published: Monday, July 27, 2015 - 12:32

I don’t like the taste of crow, which is a shame, because I’m about to eat a huge helping of it.

I’m going to tell you how I messed up an analysis. But in the process, I learned some new lessons and was reminded of some older ones I should remember to apply more carefully.

This failure starts in a victory


Photo of American Pharoah used under Creative Commons license 2.0. Source: Maryland GovPics

My mistake originated in the 2015 Triple Crown victory of American Pharoah. I’m no racing enthusiast, but I knew this horse had ended almost four decades of Triple Crown disappointments, and that was exciting. I’d never seen a Triple Crown won before. It hadn’t happened since 1978.

So when an acquaintance asked to contribute a guest post to the Minitab Blog that compared American Pharoah with previous Triple Crown contenders, including the record-shattering Secretariat, who took the Triple Crown in 1973, I eagerly accepted.

In reviewing the post, I checked and replicated the contributor’s analysis. It was a fun post, and I was excited about publishing it. A few days after it went live, however, I had to remove it because the analysis was not acceptable.

To explain how I made my mistake, I’ll need to review that analysis.

Comparing American Pharoah and Secretariat

In the post, we used Minitab's statistical software to compare Secretariat’s performance to other winners of Triple Crown races.

Since 1926, the Belmont Stakes has been the longest of the three races at 1.5 miles. The analysis began by charting 89 years of winning horse times:

Only two data points were outside of the I-chart’s control limits:
• The fastest winner, Secretariat’s 1973 time of 144 seconds
• The slowest winner, High Echelon’s 1970 time of 154 seconds

The average winning time was 148.81 seconds, which Secretariat beat by more than 4 seconds.
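For readers who want to see where I-chart control limits come from, here is a minimal sketch in Python. It uses the standard individuals-chart formula (center line plus or minus 2.66 times the average moving range). The function name and the sample times are hypothetical and are not the actual Belmont Stakes data or Minitab's own code.

```python
from statistics import mean

def individuals_chart_limits(values):
    """Return (lower limit, center line, upper limit) for an I-chart."""
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is 3 / d2, where d2 = 1.128 for moving ranges of two points.
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

# Hypothetical winning times in seconds, NOT the real race data.
times = [149.2, 148.6, 150.1, 147.9, 149.0, 144.0, 148.8, 149.5, 148.2, 149.9]
lcl, center, ucl = individuals_chart_limits(times)
outside = [t for t in times if t < lcl or t > ucl]
print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
print("Points outside the limits:", outside)
```

Any point that falls below the lower limit or above the upper limit is flagged, which is how the two winners listed above would stand out on the chart.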

Applying a capability approach to the race data

Next, the analysis approached the data from a capability perspective: Secretariat’s time was used as a lower spec limit, and the analysis sought to assess the probability of another horse beating that time.

The way you assess capability depends on the distribution of your data, and a normality test in Minitab showed this data to be non-normal.
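To make the idea of a normality test concrete, here is a rough sketch that computes an Anderson-Darling statistic in Python. The column does not say which test was run, so Anderson-Darling is an assumption here. The function name and sample data are hypothetical, and 0.752 is the commonly tabulated 5-percent critical value for the adjusted statistic when the mean and standard deviation are estimated from the data.

```python
from math import log
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(values):
    """Return the small-sample-adjusted Anderson-Darling statistic
    for a normal distribution fitted to the data."""
    x = sorted(values)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    # Weighted sum over the ordered data (i is zero-based here).
    total = sum(
        (2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a_squared = -n - total / n
    # Adjustment used when mean and standard deviation are estimated.
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)

# Hypothetical, deliberately skewed sample; NOT the race data.
sample = [1.0, 1.1, 1.2, 1.3, 1.5, 1.8, 2.4, 3.9, 7.5, 16.0]
statistic = anderson_darling_normal(sample)
print(f"Adjusted A-squared = {statistic:.3f}")
print("Reject normality at the 5% level:", statistic > 0.752)
```

A statistic above the critical value means the data are unlikely to have come from a normal distribution, which is the situation described above.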

When you run Minitab’s normal capability analysis, you can elect to apply the Johnson transformation, which can automatically transform many non-normal distributions before the capability analysis is performed. This is an extremely convenient feature, but here is where I made my mistake.

Running the capability analysis with Johnson transformation, using Secretariat’s 144-second time as a lower spec limit, produced the following output:

The analysis found a 0.36-percent chance of any horse beating Secretariat’s time, making it very unlikely indeed.
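As an illustration of how a tail probability like this can be computed after a transformation, here is a small Python sketch. It assumes the Johnson SU family and made-up parameter values. The column does not report which Johnson family or parameters Minitab selected, so the function names and numbers below are purely hypothetical.

```python
from math import asinh
from statistics import NormalDist

def johnson_su(x, gamma, eta, epsilon, lam):
    """Johnson SU transformation: maps x onto a standard normal scale."""
    return gamma + eta * asinh((x - epsilon) / lam)

def prob_below(limit, gamma, eta, epsilon, lam):
    """Probability of a value below `limit`, assuming the transformed
    data follow a standard normal distribution."""
    z = johnson_su(limit, gamma, eta, epsilon, lam)
    return NormalDist().cdf(z)

# Hypothetical parameters, NOT the values fitted in the original analysis.
params = dict(gamma=0.0, eta=1.5, epsilon=148.8, lam=2.0)
p = prob_below(144.0, **params)
print(f"Estimated chance of beating 144 seconds: {p:.2%}")
```

The spec limit is pushed through the same transformation as the data, and the probability is then read from the standard normal distribution. A figure produced this way is only as trustworthy as the fit of the transformation itself.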

The same method was applied to data from the other two events making up horse racing’s Triple Crown, the Kentucky Derby and the Preakness:

We found a 5.54-percent chance of a horse beating Secretariat’s Kentucky Derby time.

We found a 3.5-percent probability of a horse beating Secretariat’s Preakness time.

Despite the billions of dollars and countless hours spent trying to make thoroughbred horses faster during the past 43 years, no one has yet beaten “Big Red,” as Secretariat was known. The analysis, therefore, indicated that although American Pharoah may be a great horse, he is no Secretariat.

That conclusion may well be true, but it turns out we can’t use this analysis to make that assertion.

My mistake is discovered, and the analysis unravels

Here’s where I start chewing those crow feathers. A day or so after sharing the post about American Pharoah, a reader sent the following comment:
 “Why does Minitab allow a Johnson Transformation on this data when using Quality Tools > Capability Analysis > Normal > Transform, but does not allow a transformation when using Quality Tools > Johnson Transformation? Or could I be doing something wrong?”

Interesting question. In all honesty, it hadn’t even occurred to me to try to run the Johnson transformation on the data by itself. If the Johnson transformation worked when performed as part of the capability analysis, though, it ought to work when applied outside of that analysis, too.

I suspected the person who asked this question might have just checked a wrong option in the dialog box, so I tried running the Johnson transformation on the data by itself.

The following note appeared in Minitab’s session window:

Uh oh.

Our reader hadn’t done anything wrong, but it was starting to look as though I had made an error somewhere. But where?

I’ll show you exactly where I made my mistake in my next column.

About The Author

Eston Martz

For Eston Martz, analyzing data is an extremely powerful tool that helps us understand the world—which is why statistics is central to quality improvement methods such as lean and Six Sigma. While working as a writer, Martz began to appreciate the beauty in a robust, thorough analysis and wanted to learn more. To the astonishment of his friends, he started a master’s degree in applied statistics. Since joining Minitab, Martz has learned that a lot of people feel the same way about statistics as he used to. That’s why he writes for Minitab’s blog: “I’ve overcome the fear of statistics and acquired a real passion for it,” says Martz. “And if I can learn to understand and apply statistics, so can you.”