Published: Wednesday, August 23, 2017 - 11:03

It’s been a while since I’ve written about statistics, so in this column I will look at the rules of three and five. Both are heuristics, or rules of thumb, that tell us what we can conclude from a given sample size.

Let’s assume that you are looking at a binomial event (pass or fail). You tested 30 randomly selected samples, and the results yielded no failures. Then, based on the rule of three, you can state at a 95-percent confidence level that the upper bound for the failure rate is 3/30 = 10%; in other words, the reliability is at least 90 percent. The rule is written as:

p = 3/n

where p is the 95-percent upper bound on the failure rate, and n is the sample size. The rule applies only when no failures were observed in the n samples.

Thus, if you used 300 samples and again saw no failures, you could state with 95-percent confidence that the process is *at least* 99-percent reliable, based on p = 3/300 = 1%. Another way to express this is to say that with 95-percent confidence, fewer than 1 in 100 units will fail under the same conditions.

This rule can be derived from the binomial distribution. The 95-percent confidence comes from the alpha value of 0.05. The value calculated from the rule-of-three formula gets more accurate with a sample size of 20 or more.
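For readers who want to see where the 3 comes from, here is a minimal sketch. With zero failures in n trials, the 95-percent upper bound p is the failure rate at which n straight passes would happen only 5 percent of the time, that is, (1 – p)^n = 0.05. Solving gives p = 1 – 0.05^(1/n), and because –ln(0.05) is about 2.996, this is roughly 3/n. The function names below are my own, chosen only for illustration.

```python
def rule_of_three_bound(n):
    """Rule-of-three approximation: 95% upper bound on the failure
    rate after n samples with zero failures."""
    return 3.0 / n


def exact_upper_bound(n, confidence=0.95):
    """Exact binomial bound for zero failures: the failure rate p at
    which n straight passes has probability (1 - confidence)."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


# Compare the approximation with the exact value for a few sample sizes.
for n in (5, 10, 20, 30, 100, 300):
    approx = rule_of_three_bound(n)
    exact = exact_upper_bound(n)
    print(f"n={n:4d}  3/n={approx:.4f}  exact={exact:.4f}  gap={approx - exact:.4f}")
```

The rule of three always lands slightly above the exact bound, so it errs on the conservative side, and the gap shrinks as n grows.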

I came across the rule of five in Douglas Hubbard’s informative book, *How to Measure Anything* (Wiley, third edition 2014). Hubbard states the rule of five as: “There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population.”

This is a really neat heuristic because you can actually tell a lot from a sample size of five. The median is the 50th percentile value of a population, the point where half of the population is above it, and half of the population is below it. Hubbard points out the probability of picking a value above or below the median is 50 percent—the same as a coin toss. Thus, we can calculate that the probability of getting five heads in a row is 0.5^5 or 3.125%. This would be the same for getting five tails in a row.

Then the probability of not getting all heads or all tails is (100 – (3.125 + 3.125)) or 93.75%. Thus, we can state that the chance of at least one value out of five being above the median *and* at least one value being below the median is 93.75 percent.
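The arithmetic above can be checked with a short sketch. The first function repeats the coin-toss calculation; the second is a simulation that draws samples of five and counts how often the median falls between the smallest and largest values. The simulation assumes a uniform population on 0 to 1 (true median 0.5) purely for convenience; the function names, seed, and trial count are illustrative choices of mine, not anything from Hubbard’s book.

```python
import random


def rule_of_five_probability(sample_size=5):
    """Chance that the population median lies between the smallest and
    largest values of a random sample, from the coin-toss argument."""
    p_all_one_side = 0.5 ** sample_size   # all above, or all below
    return 1.0 - 2.0 * p_all_one_side


def simulate_rule_of_five(trials=100_000, sample_size=5, seed=0):
    """Estimate the same chance by simulation, assuming a uniform(0, 1)
    population whose true median is 0.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(sample_size)]
        if min(sample) < 0.5 < max(sample):
            hits += 1
    return hits / trials


print(f"calculated: {rule_of_five_probability():.4%}")
print(f"simulated:  {simulate_rule_of_five():.4%}")
```

The simulated figure should land close to the calculated 93.75 percent, and the same argument works for any continuous population because only the position of each value relative to the median matters.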

Readers should keep in mind that both rules require randomly selected samples. The rule of three is a version of Bayes’ success run theorem and Wilks’ one-sided tolerance calculation. I invite readers to check out my articles, “Relationship Between AQL/RQL and Reliability/Confidence,” “Reliability/Confidence Level Calculator (With c = 0, 1....., n),” and “Wilk’s One-Sided Tolerance Spreadsheet,” which shed more light on this.

When we use random samples to represent a population, we are calculating a statistic: a value computed from the sample that estimates a parameter, which is the true value for the population. The larger the sample size, the better the statistic can represent the parameter, and the better your estimate will be.

I will finish with a story based on chance and probability:

It was the day of the final exam, and an undergraduate psychology major was totally hung over from the previous night. He was somewhat relieved to find that the exam was a true/false test. He had taken a basic stat course and did remember his professor once performing a coin-flipping experiment. In a moment of clarity, he decided to flip a coin he had in his pocket to determine the answer for each question. The psychology professor watched the student for the entire two hours of the exam as he was flipping the coin... writing the answer... flipping the coin... writing the answer, on and on.

At the end of the two hours, everyone else had left the room except for this one student. The professor walked up to his desk and angrily interrupted the student, saying: “Listen, it is obvious that you did not study for this exam since you didn’t even open the question booklet. If you are just flipping a coin for your answer, why is it taking you so long?”

The stunned student looked up at the professor and replied bitterly (still flipping the coin): “Shhh! I am checking my answers!”

Always keep on learning....