Simpson’s Paradox and How to Avoid Its Effects

Always look at the many layers of data.

Smita Skrivanek
Mon, 06/21/2010 - 09:04

In the January 2010 issue of MoreNews, we discussed “Simpson’s Paradox,” a well-known phenomenon that can distort causal relationships in data sets in the presence of a confounder or covariate. In this article, we will talk about some practical ways to guard against becoming the victim of this insidious effect.

Simpson’s Paradox is the name given to the phenomenon where the direction of an effect is reversed when you take into account a previously ignored (lurking) variable that significantly affects the relationship.

An example of the paradox

Let’s elaborate on the definition with an example. You’re in charge of a study that compares how two weight-loss techniques—diet and exercise—affect the weight loss of overweight patients. Overall, you had 240 patients participate in the study, with 120 assigned to a weight-loss diet and the remaining 120 assigned to a supervised exercise regimen.

 …

Comments

Submitted by Karen Homa on Fri, 06/25/2010 - 14:01

Yes, but not statistically significant

Confounding is an important concept, so thanks for the article. I'm not sure how you reached the conclusion that 58% is "significantly different" from 48%. A chi-square test comparing these two proportions, with 120 subjects in each group, gives a p-value of 0.093 (a p-value of less than 0.05 is considered significant). If we double the subjects to 240 in each group and keep the weight-loss proportions the same, the p-value is 0.017. Proportions may look significantly different, but it all depends on the size of the denominator. One way to check whether confounding is an issue is to use a logistic regression model. Running the model with just weight loss as the dependent variable (yes or no, coded 1 or 0) and group as the predictor (diet or exercise, coded 1 or 0), similar to the chi-square test, gives a significant p-value of 0.018 (with 240 subjects in each group). When you run the model again with BMI added as a variable, the p-value for the group variable is no longer significant: it is 0.069. The model adjusts for the imbalance in BMI and shows that BMI was a confounder and that the diet and exercise groups really lost weight at similar rates.
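The chi-square figures quoted in this comment can be checked with a short sketch. The counts below are an assumption, since the article's full table is not reproduced here: 70 of 120 dieters (about 58%) and 57 of 120 exercisers (about 48%) losing weight. The helper name `chi_square_2x2` is hypothetical, and the test is the plain Pearson chi-square without a continuity correction, which may differ from what the commenter's software used.

```python
import math

def chi_square_2x2(success_a, n_a, success_b, n_b):
    """Pearson chi-square test (no continuity correction) for two proportions.

    Returns (statistic, p_value). With 1 degree of freedom, the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    fail_a, fail_b = n_a - success_a, n_b - success_b
    n = n_a + n_b
    total_success = success_a + success_b
    total_fail = fail_a + fail_b
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins)
    stat = n * (success_a * fail_b - fail_a * success_b) ** 2 / (
        n_a * n_b * total_success * total_fail
    )
    return stat, math.erfc(math.sqrt(stat / 2))

# Assumed counts: 70/120 on diet (about 58%), 57/120 on exercise (about 48%).
stat, p = chi_square_2x2(70, 120, 57, 120)
# Same proportions with twice as many patients in each group.
stat2, p2 = chi_square_2x2(140, 240, 114, 240)

print(f"120 per group: chi2 = {stat:.2f}, p = {p:.3f}")
print(f"240 per group: chi2 = {stat2:.2f}, p = {p2:.3f}")
```

Under these assumed counts the p-values come out close to the 0.093 and 0.017 quoted above, which illustrates the point about the denominator: the proportions are unchanged, but doubling the sample doubles the test statistic.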

Submitted by Aldous Wong on Fri, 07/02/2010 - 13:09

How to display Simpson's Paradox graphically?

I would recommend the graph shown by Howard Wainer in Chapter 10, "Two Mind-Bending Statistical Paradoxes," of the book Graphic Discovery. Since the graph can't be pasted here, I will describe its construction.

It is a line graph with Y = percentage of patients losing weight and X = proportion of patients with "BMI > 40" (or X = proportion of patients with "30 < BMI < 40").

First line: "Exercise" with end points: (0%, 27.5%), (100%, 87%)
Second line: "Diet" with end points: (0%, 25%), (100%, 75%)

The graph has the appearance of a two-factor interaction plot in DOE.

Overlaid on these two lines are two points:
(1) on the Exercise line: (33%, 48%) - 48% is the result for the study group consisting of 33% (= 40/120) "BMI > 40"
(2) on the Diet line: (67%, 58%) - 58% is the result for the study group consisting of 67% (= 80/120) "BMI > 40"

Now the mystery is clear. Although the Exercise line is clearly above the Diet line, the results of the two studies showed Diet to be better because the group in the Diet study had more "BMI > 40" patients. It is also clear that the 48% Exercise result is just the weighted average of 27.5% and 87%.
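The weighted-average argument in this comment can be written out as a small sketch. It uses only the endpoint rates and subgroup shares quoted above; the function name `overall_rate` is hypothetical, and the rates are the commenter's rounded percentages, not figures from the full article.

```python
def overall_rate(rate_low_bmi, rate_high_bmi, share_high_bmi):
    """Overall success rate (%) as a weighted average of two subgroup rates.

    share_high_bmi is the fraction of the group with BMI > 40; it is the
    X position on the line graph described in the comment.
    """
    return (1 - share_high_bmi) * rate_low_bmi + share_high_bmi * rate_high_bmi

# Endpoints of each line and the BMI > 40 share of each study group,
# as quoted in the comment (rates are rounded percentages).
exercise_overall = overall_rate(27.5, 87.0, 40 / 120)
diet_overall = overall_rate(25.0, 75.0, 80 / 120)

print(f"Exercise overall: {exercise_overall:.1f}%")
print(f"Diet overall:     {diet_overall:.1f}%")
```

Exercise has the higher rate in both BMI subgroups (27.5% vs. 25% and 87% vs. 75%), yet the overall figures come out at roughly 47% for Exercise and 58% for Diet. The Exercise value is slightly below the quoted 48%, presumably because the subgroup rates in the comment are rounded.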

© 2025 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
“Quality Digest” is a trademark owned by Quality Circle Institute Inc.