Harish Jose


Rules of Three and Five

Tips for sample sizes

Published: Wednesday, August 23, 2017 - 12:03

It’s been a while since I’ve written about statistics. So in this column, I will be looking at the rules of three and five. These are heuristics, or rules of thumb, that can help us out. They are associated with sample sizes.

Rule of three

Let’s assume that you are looking at a binomial event (pass or fail). You tested 30 randomly selected samples and observed no failures. Then, based on the rule of three, you can state at a 95-percent confidence level that the upper bound on the failure rate is 3/30 = 10 percent; in other words, the reliability is at least 90 percent. The rule is written as:

p = 3/n

where p is the 95-percent upper bound on the failure rate, and n is the sample size (with zero failures observed).

Thus, if you tested 300 samples and again saw no failures, you could state with 95-percent confidence that the process is at least 99-percent reliable, based on p = 3/300 = 1 percent. Another way to express this is to say that, with 95-percent confidence, no more than 1 in 100 units will fail under the same conditions.

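This rule can be derived from the binomial distribution. If the true failure rate is p, the chance of seeing zero failures in n trials is (1 – p)^n. The 95-percent confidence comes from setting that chance equal to the alpha value of 0.05, which gives p = –ln(0.05)/n, or approximately 3/n. The value calculated from the rule-of-three formula is a close approximation once the sample size is about 20 or more.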
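As a rough check on the binomial reasoning, the following Python sketch compares the exact zero-failure bound, obtained by solving (1 – p)^n = 0.05 for p, with the 3/n shortcut. The function names and the sample sizes in the loop are illustrative choices, not taken from the column.

```python
import math


def exact_upper_bound(n, confidence=0.95):
    """Exact one-sided upper bound on the failure rate after n trials
    with zero failures: solve (1 - p)**n = 1 - confidence for p."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


def rule_of_three(n):
    """Rule-of-three approximation to the same 95-percent bound."""
    return 3.0 / n


# Illustrative sample sizes (assumed for this sketch).
for n in (10, 20, 30, 100, 300):
    exact = exact_upper_bound(n)
    approx = rule_of_three(n)
    print(f"n={n:4d}  exact={exact:.4f}  rule of three={approx:.4f}  "
          f"gap={approx - exact:.4f}")

# The shortcut works because -ln(0.05) is very close to 3.
print(f"-ln(0.05) = {-math.log(0.05):.4f}")
```

For n = 30 the exact bound comes out at roughly 9.5 percent against the rule's 10 percent, so the shortcut errs slightly on the conservative side, and the gap shrinks as n grows.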

Rule of five

I came across the rule of five in Douglas Hubbard’s informative book, How to Measure Anything (Wiley, third edition 2014). Hubbard states the rule of five as: “There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population.”

This is a really neat heuristic because you can actually tell a lot from a sample size of five. The median is the 50th percentile value of a population, the point where half of the population is above it and half is below it. Hubbard points out that the probability of a randomly picked value falling above the median is 50 percent, the same as getting heads on a coin toss. Thus, the probability of all five values falling above the median is the same as the probability of getting five heads in a row: 0.5^5, or 3.125 percent. The same holds for all five values falling below the median (five tails in a row).

Then the probability of not getting all heads or all tails is (100 – (3.125 + 3.125)), or 93.75 percent. Thus, we can state that the chance of at least one value out of five being above the median and at least one value being below the median is 93.75 percent.
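The coin-toss arithmetic can also be checked by simulation. The Python sketch below computes the exact figure and then draws many random samples of five from a skewed population to see how often the sample range brackets the median. The lognormal population (median 1.0), the seed, and the number of trials are assumptions made for illustration; any continuous population with a known median should give a similar result.

```python
import random


def exact_probability(sample_size=5):
    """Chance that a random sample is neither entirely above
    nor entirely below the population median."""
    return 1.0 - 2.0 * 0.5 ** sample_size


def simulate(trials=100_000, sample_size=5, seed=1):
    """Estimate the same chance by drawing samples from a lognormal
    population (an assumed example). With mu = 0 its median is exp(0) = 1."""
    rng = random.Random(seed)
    median = 1.0
    hits = 0
    for _ in range(trials):
        sample = [rng.lognormvariate(0.0, 1.0) for _ in range(sample_size)]
        # Count the trial when the median lies between the smallest
        # and largest values of the sample.
        if min(sample) < median < max(sample):
            hits += 1
    return hits / trials


print(f"exact:     {exact_probability():.4f}")
print(f"simulated: {simulate():.4f}")
```

The simulated share should land close to the exact 0.9375, even though the population is far from symmetric, which is the point of the heuristic: it depends only on random sampling, not on the shape of the distribution.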

Final words

Readers should keep in mind that both rules require the use of randomly selected samples. The rule of three is a version of Bayes’ success run theorem and Wilks’ one-sided tolerance calculation. I invite readers to check out my articles, “Relationship Between AQL/RQL and Reliability/Confidence,” “Reliability/Confidence Level Calculator (With c = 0, 1....., n),” and “Wilk’s One-Sided Tolerance Spreadsheet,” which shed more light on this.

When we use random samples to represent a population, we are calculating a statistic, a representative value for the parameter. The parameter is the true value for the population, and the statistic is an estimate of it. The larger the sample size, the better the statistic can represent the parameter, and the better your estimate will be.
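A small simulation makes the effect of sample size concrete. The Python sketch below repeatedly draws samples of a given size from a hypothetical normal population (mean 100, standard deviation 15, both chosen purely for illustration) and measures how much the sample mean varies from one sample to the next.

```python
import random
import statistics


def spread_of_sample_means(sample_size, repeats=1000, seed=7):
    """Standard deviation of the sample mean across repeated random
    samples from an assumed normal population (mean 100, sd 15)."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(100.0, 15.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return statistics.stdev(means)


# Illustrative sample sizes (assumed for this sketch).
for n in (5, 50, 500):
    print(f"n={n:3d}  spread of the sample mean = "
          f"{spread_of_sample_means(n):.2f}")
```

The spread should shrink roughly in proportion to the square root of the sample size, so a hundredfold increase in samples buys about a tenfold tighter estimate.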

I will finish with a story based on chance and probability:
It was the day of the final exam, and an undergraduate psychology major was totally hung over from the previous night. He was somewhat relieved to find that the exam was a true/false test. He had taken a basic stat course and did remember his professor once performing a coin-flipping experiment. In a moment of clarity, he decided to flip a coin he had in his pocket to determine the answer for each question. The psychology professor watched the student for the entire two hours of the exam as he was flipping the coin... writing the answer... flipping the coin... writing the answer, on and on.

At the end of the two hours, everyone else had left the room except for this one student. The professor walked up to his desk and angrily interrupted the student, saying: “Listen, it is obvious that you did not study for this exam since you didn’t even open the question booklet. If you are just flipping a coin for your answer, why is it taking you so long?”

The stunned student looked up at the professor and replied bitterly (still flipping the coin): “Shhh! I am checking my answers!”

Always keep on learning....


About The Author

Harish Jose

Harish Jose has more than seven years of experience in the medical device field. He is a graduate of the University of Missouri-Rolla (U.S.), where he obtained a master’s degree in manufacturing engineering and published two articles. Harish is an ASQ member with multiple ASQ certifications, including Quality Engineer, Six Sigma Black Belt, and Reliability Engineer. He is a subject matter expert in lean, data science, database programming, and industrial experiments. Harish publishes frequently on his blog harishnotebook. He can be reached on LinkedIn.