



© 2021 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
Published: 02/22/2021
One type of error in statistical testing is working very hard to correctly answer the wrong question. This error is made when the experiment is being formulated.
Even with perfectly stated null and alternative hypotheses, sometimes we are simply investigating the wrong question.
Let’s say we really want to select the best vendor for a critical component of our design. We define the best vendor as one whose solution or component is the most durable. OK, we can set up an experiment to determine which vendor provides a solution that is the most durable.
We set up and conduct a flawless hypothesis test to compare the two leading solutions. We can see very clear results. Vendor A’s solution is, statistically, significantly more durable than Vendor B’s solution.
Yet neither solution is durable enough. We should instead have been evaluating whether either solution could meet our reliability requirements.
Oops.
Even if we perfectly answer a question in our work, if it’s not the right question, then the work is for naught.
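To make the vendor example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the hours-to-failure figures are invented, the 1,000-hour requirement is hypothetical, and a simple permutation test stands in for whatever hypothesis test we might actually have run. It asks both questions of the same data: the one the experiment posed, and the one that mattered.

```python
import random
import statistics

# Hypothetical hours-to-failure data, invented purely for illustration.
vendor_a = [820, 790, 845, 810, 805, 830, 815, 800, 825, 795]
vendor_b = [700, 720, 690, 710, 705, 695, 715, 725, 685, 708]

# Hypothetical reliability requirement: a unit must last at least this long.
REQUIRED_HOURS = 1000


def permutation_p_value(x, y, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(x) - statistics.mean(y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        left, right = pooled[:len(x)], pooled[len(x):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_resamples + 1)


def fraction_meeting(sample, required):
    """Share of tested units that lasted at least the required time."""
    return sum(value >= required for value in sample) / len(sample)


# Question 1 (the one the experiment asked): is A more durable than B?
p_value = permutation_p_value(vendor_a, vendor_b)
a_beats_b = (
    statistics.mean(vendor_a) > statistics.mean(vendor_b) and p_value < 0.05
)

# Question 2 (the one that mattered): does either vendor meet the requirement?
a_ok = fraction_meeting(vendor_a, REQUIRED_HOURS)
b_ok = fraction_meeting(vendor_b, REQUIRED_HOURS)

print(f"A more durable than B: {a_beats_b} (p = {p_value:.4f})")
print(f"Fraction meeting requirement: A = {a_ok:.0%}, B = {b_ok:.0%}")
```

With these made-up numbers, the first question gets a clear and statistically significant answer, while the second shows that no tested unit from either vendor reaches the required life. The comparison was done correctly; it was just the wrong comparison.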
Jerzy Neyman and Egon Pearson referred to type I and type II errors as “errors of the first kind” and “errors of the second kind,” respectively. This led others to consider further types of errors, naming them “errors of the third kind,” and so forth.
In a paper published in 1947, Florence N. David, an occasional colleague of Neyman and Pearson, suggested she might need to extend Neyman and Pearson’s sources of error to a third source, possibly “choosing the test falsely to suit the significance of the sample.”
Frederick Mosteller, in 1948, defined type III error as “correctly rejecting the null hypothesis for the wrong reason.”
Extending Mosteller’s definition, Henry Kaiser in 1966 defined such a type III error as coming to an “incorrect decision of direction following a rejected two-tailed test of hypothesis.”
Allyn Kimball, in 1957, suggested a definition close to how I consider a type III error, as “the error committed by giving the right answer to the wrong problem.”
And so on.... There is no one widely accepted definition for an error of the third kind or for type III errors. Yet for any of the above definitions, the error is one to guard against by careful consideration when designing, conducting, and analyzing statistical tests.
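Of these definitions, Kaiser’s directional one is the easiest to see in a quick simulation. The sketch below is only an illustration under stated assumptions: a one-sample, two-tailed z-test with a known standard deviation, a small true effect that is positive by construction, and parameter values (`true_effect`, `sigma`, `n`, `alpha`, `trials`) that are all hypothetical. It counts how often the test rejects the null hypothesis while the sample points in the wrong direction.

```python
import random
from statistics import NormalDist


def simulate_direction_errors(true_effect=0.1, sigma=1.0, n=10,
                              alpha=0.05, trials=20_000, seed=1):
    """Count rejections of H0: mean = 0, and rejections with the wrong sign.

    Assumes true_effect > 0, so any rejection with a negative test
    statistic is a directional error in Kaiser's sense.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    std_error = sigma / n ** 0.5
    rejections = 0
    wrong_direction = 0
    for _ in range(trials):
        sample = [rng.gauss(true_effect, sigma) for _ in range(n)]
        z = (sum(sample) / n) / std_error
        if abs(z) > z_crit:
            rejections += 1
            if z < 0:  # rejected, but pointing the wrong way
                wrong_direction += 1
    return rejections, wrong_direction


rejections, wrong_direction = simulate_direction_errors()
print(f"Rejections: {rejections}")
print(f"Rejections in the wrong direction: {wrong_direction}")
print(f"Share of rejections with the wrong sign: {wrong_direction / rejections:.1%}")
```

When the true effect is small relative to the noise, a noticeable share of the “significant” results can carry the wrong sign, even though each rejection of the null hypothesis is technically correct.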
An obvious situation, in hindsight, is the experimenter solving the wrong problem or asking the wrong question. The cause could simply be a lack of the information needed to recognize the error. Another cause could be focusing on the first question that comes to mind, or on the most interesting one.
Another set of situations may involve a deliberate or unconscious effort to connect the experimental results to an expected outcome. This sometimes occurs when results are reinterpreted because they don’t agree with the desired outcome.
Another set arises from simply doing what we have always done. In this case, the experimenter may not even have connected the experiment to a suitable hypothesis that would enable analysis. We can do a test or experiment perfectly well, yet it has no meaningful result or influence on any future work.
Other situations exist. If you spot one or more that I missed, please add your thoughts in the comment section below.
First published on the Accendo Reliability blog: https://accendoreliability.com/beware-type-iii-error/