© 2014 Quality Digest Magazine. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.

Published on *Quality Digest* (http://www.qualitydigest.com)

Hint: They’re not about adding and subtracting

**Published:** 12/14/2011

It’s all too easy to make mistakes involving statistics. Statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors, but correctly interpreting the results of an analysis can be even more challenging.

A few years ago, Minitab trainers compiled a list of common statistical mistakes, the ones they encountered repeatedly. Being somewhat math-phobic myself, I expected these mistakes would be primarily mathematical. I was wrong: *Every* mistake on their list involved either the incorrect interpretation of the results of an analysis, or a design flaw that made meaningful analysis impossible.

Here are three of their most commonly observed mistakes that involve drawing an incorrect conclusion from the results of analysis. (I’m sorry to say that, yes, I have made all three of these mistakes at least once.)

**Mistake No. 1:** Not distinguishing between statistical significance and practical significance

It’s important to remember that using statistics, we can find a statistically significant difference that has no discernible effect in the “real world.” In other words, just because a difference *exists* doesn’t make the difference *important*. And you can waste a lot of time and money trying to “correct” a statistically significant difference that doesn’t matter.

Let’s say you love Tastee-O’s cereal. The factory that makes them weighs every cereal box at the end of the filling line, using an automated measuring system. Say that 18,000 boxes are filled per shift, with a target fill weight of 360 grams and a standard deviation of 2.5 grams.

Using statistics, the factory can detect a shift of 0.06 grams in the mean fill weight 90 percent of the time. But just because that 0.06 gram shift is statistically significant doesn’t mean it’s practically significant. A 0.06 gram difference probably amounts to two or three Tastee-O’s—not enough to make you, the customer, notice or care.
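The 90-percent figure can be checked with a quick power calculation. The sketch below is only an illustration under assumptions the article does not state: a two-sided one-sample z-test, a significance level of 0.05, a known standard deviation, and one shift's worth of boxes (18,000) as the sample. The function name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(shift, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a given mean shift.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_shift = shift / standard_error
    # Probability that the test statistic falls in either rejection region
    upper_tail = 1 - std_normal.cdf(z_crit - z_shift)
    lower_tail = std_normal.cdf(-z_crit - z_shift)
    return upper_tail + lower_tail


# Values from the cereal example; alpha = 0.05 is an assumption.
power = power_two_sided_z(shift=0.06, sigma=2.5, n=18_000)
print(f"Power to detect a 0.06 g shift: {power:.2f}")
```

Under these assumptions the power comes out at roughly 0.90, which matches the figure quoted above. With a sample this large, the standard error of the mean is under 0.02 grams, so even a tiny shift stands out statistically.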

In most hypothesis tests, we know that the null hypothesis is not *exactly* true. In this case, we don’t expect the mean fill weight to be precisely 360 grams; we are just trying to see if there is a *meaningful* difference. Instead of a hypothesis test, the cereal maker could use a confidence interval to see how large the difference might be and decide if action is needed.
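Here is a rough sketch of the confidence-interval approach. The observed mean of 360.06 grams and the 1-gram "practically important" threshold are hypothetical values chosen purely for illustration; they are not from the factory example. The interval is a standard z-interval that treats the standard deviation as known.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a mean, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


TARGET = 360.0               # target fill weight in grams (from the example)
observed_mean = 360.06       # hypothetical observed mean for one shift
practical_threshold = 1.0    # hypothetical: smallest difference worth acting on

low, high = mean_confidence_interval(observed_mean, sigma=2.5, n=18_000)

# Statistically significant if the interval excludes the target.
statistically_significant = not (low <= TARGET <= high)
# Practically significant only if the whole interval is at least
# practical_threshold away from the target.
largest_plausible_diff = max(abs(low - TARGET), abs(high - TARGET))
smallest_plausible_diff = 0.0 if not statistically_significant else min(
    abs(low - TARGET), abs(high - TARGET)
)
practically_significant = smallest_plausible_diff >= practical_threshold

print(f"95% CI for mean fill weight: ({low:.3f}, {high:.3f}) g")
print(f"Statistically significant: {statistically_significant}")
print(f"Largest plausible difference from target: {largest_plausible_diff:.3f} g")
print(f"Practically significant: {practically_significant}")
```

With these made-up numbers the interval excludes 360 grams, so the difference is statistically significant, yet the largest plausible difference is still well under a tenth of a gram. That is the kind of result that tells you no action is needed.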

**Mistake No. 2:** Stating that you’ve proved the null hypothesis

In a hypothesis test, you pose a null hypothesis (H0) and an alternative hypothesis (H1). Then you collect data, analyze them, and use statistics to assess whether or not the data support the alternative hypothesis. A p-value above 0.05 indicates “there is not enough evidence to conclude H1 at the 95-percent confidence level.”

In other words, a high p-value means only that we lack enough evidence to conclude the alternative hypothesis; the null hypothesis may or *may not* be true.

For example, we could flip a fair coin three times and test:

H0: Proportion of heads = 0.40

H1: Proportion of heads ≠ 0.40

In this case, we are guaranteed to get a p-value higher than 0.05. Therefore we cannot conclude H1. But not being able to conclude H1 doesn’t prove that H0 is correct or true. This is why we say we “fail to reject” the null hypothesis, rather than we “accept” the null hypothesis.
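To see why the p-value is guaranteed to exceed 0.05, you can enumerate all four possible outcomes of three flips. The sketch below assumes an exact two-sided binomial test that sums the probabilities of every outcome no more likely than the one observed; statistical packages may define the two-sided p-value slightly differently, and the function names here are made up for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_p_value(k, n, p0):
    """Exact two-sided p-value: total probability under H0 of all outcomes
    that are no more likely than the observed count k."""
    observed = binomial_pmf(k, n, p0)
    tolerance = 1e-12  # guards against floating-point ties
    return sum(
        binomial_pmf(i, n, p0)
        for i in range(n + 1)
        if binomial_pmf(i, n, p0) <= observed + tolerance
    )


N_FLIPS = 3
P_NULL = 0.40  # H0: proportion of heads = 0.40

# p-value for every possible result: 0, 1, 2, or 3 heads
p_values = {
    heads: exact_two_sided_p_value(heads, N_FLIPS, P_NULL)
    for heads in range(N_FLIPS + 1)
}

for heads, p_value in p_values.items():
    print(f"{heads} heads: p-value = {p_value:.3f}")

print("Smallest possible p-value:", round(min(p_values.values()), 3))
```

Even the most extreme result, three heads in a row, has a probability of 0.4 × 0.4 × 0.4 = 0.064 under H0, so no outcome of this experiment can ever produce a p-value below 0.05. The test simply has no power to reject H0, even though we know the coin is fair and H0 is false.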

**Mistake No. 3:** Assuming correlation = causation

Simply put, correlation is a linear association between two variables. For example, a house’s size and its price tend to be highly correlated: larger houses have higher prices, while smaller houses have lower prices.

But while it’s tempting to observe the linear relationship between two variables and conclude that a change in one is causing a change in the other, that’s not necessarily so: Statistical evidence of correlation is not evidence of causation.

Consider this example: Data analysis has shown a strong correlation between ice cream sales and murder rates. When ice cream sales are low, the murder rate is low. When ice cream sales are high, the murder rate is high.

So could we conclude that ice cream sales lead to murder? Or vice versa? Of course not! This is a perfect example of correlation not equaling causation. Yes, the murder rate and ice cream sales *are* correlated. During the summer months, both are high. During the winter months, both are low. So when you think beyond the correlation, the data suggest not that the murder rate and ice cream sales affect each other, but rather that both are affected by *another* factor: the weather.
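A small simulation makes the point concrete. Everything below is made-up data, not real sales or crime figures: daily temperature stands in for "the weather" and drives both variables, while ice cream sales and murders have no direct effect on each other. The coefficients and noise levels are arbitrary assumptions. The partial correlation at the end measures what is left of the association once temperature is accounted for.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Simulated data: temperature drives both variables independently.
temperature = [rng.uniform(0, 35) for _ in range(1000)]
ice_cream_sales = [10 * t + rng.gauss(0, 40) for t in temperature]
murders = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw_r = pearson_r(ice_cream_sales, murders)
adjusted_r = partial_r(
    raw_r,
    pearson_r(ice_cream_sales, temperature),
    pearson_r(murders, temperature),
)

print(f"Correlation, ice cream vs. murders: {raw_r:.2f}")
print(f"Same correlation, controlling for temperature: {adjusted_r:.2f}")
```

In this simulation the raw correlation is strong, but it all but disappears once temperature is held fixed. That is what you would expect when a third factor is behind both variables.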

If you’ve ever misinterpreted the significance of a correlation between variables, at least you’ve got company: The media are rife with news stories that equate correlation and causation, especially when it comes to the effects of diet, exercise, chemicals, and other factors on our health.

Have you ever jumped to the wrong conclusion after looking at statistics?
