Davis Balestracci  |  08/01/2008

Analyzing Rare Occurrences of Events

When to use Fisher’s Exact Test

Eighty-four doctors treated 2,973 patients, and an undesirable incident occurred in 13 of the treatments (11 doctors with one incident and one doctor with two incidents), a rate of 0.437 percent. A p-chart analysis of means (ANOM) for these data is shown in figure 1.

This analysis is dubious. A good rule of thumb: Multiplying the overall average rate by the number of cases for an individual should yield an expected count of at least five. At the overall rate of 0.437 percent, each doctor would need more than 1,000 cases to even begin to come close to this!
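As a rough check of this rule of thumb, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative only, and the 199-patient caseload is borrowed from the first doctor in figure 2.

```python
import math

# Totals quoted in the column: 13 incidents among 2,973 patients.
incidents, patients = 13, 2973
overall_rate = incidents / patients  # about 0.00437

# Rule of thumb: overall rate x individual caseload should be at least 5.
min_cases = math.ceil(5 / overall_rate)

# Expected incidents for a doctor with 199 patients (illustrative caseload).
expected_199 = overall_rate * 199

print(f"overall rate: {overall_rate:.5f}")
print(f"cases needed for an expected count of 5: {min_cases}")
print(f"expected incidents in 199 cases: {expected_199:.2f}")
```

With an average of roughly 35 patients per doctor (2,973/84), nobody in this data set is anywhere near the caseload the rule asks for.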

The table in figure 2 uses the technique discussed in last month’s column, “A Handy Technique to Have in Your Back Pocket,” calculating both “uncorrected” and “corrected” chi-square. Similar to the philosophy of ANOM, I take each doctor’s performance out of the aggregate and compare it to the remainder to see whether the two are statistically different. For example, in figure 2, the first doctor treated 199 patients, and the incident occurred in one of them. So, I compared his rate of 1/199 to the remaining 12/2,774.

The chi-square calculation breaks down very quickly as the denominator size decreases; note especially the gap between the “uncorrected” and “corrected” chi-square values.
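To see this for yourself, here is a minimal sketch of the one-doctor-versus-the-rest chi-square comparison. It assumes that “corrected” means the usual Yates continuity correction, which this column does not spell out; the function name and its arguments are hypothetical, and the output is not guaranteed to match figure 2 digit for digit.

```python
def chi_square_2x2(events_a, n_a, events_b, n_b, yates=False):
    """Pearson chi-square for a 2x2 table built from two event rates."""
    a, b = events_a, n_a - events_a   # one doctor: events, non-events
    c, d = events_b, n_b - events_b   # everyone else: events, non-events
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Assumed correction: Yates continuity correction, floored at zero
        # so that it can never inflate the statistic.
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Two doctors from the column: 1 incident in 199 cases, 2 incidents in 14 cases.
for events, cases in [(1, 199), (2, 14)]:
    rest_events, rest_cases = 13 - events, 2973 - cases
    raw = chi_square_2x2(events, cases, rest_events, rest_cases)
    corrected = chi_square_2x2(events, cases, rest_events, rest_cases, yates=True)
    print(f"{events}/{cases} vs. {rest_events}/{rest_cases}: "
          f"uncorrected = {raw:.2f}, corrected = {corrected:.2f}")
```

Under these assumptions, the correction removes a large share of the statistic for the 14-patient doctor, which is a warning sign that the chi-square approximation should not be trusted there.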

With data like these, one has no option but to use the technique known as Fisher’s exact test (available in most good statistical packages). Its resulting p-value is shown in the far right column of figure 2. Using the example of the doctor with one incident out of 199 patients, one has to ask, “If I have a population where 13 out of 2,973 patients experienced an incident, and if I grabbed a random sample of 199 of these 2,973 patients, what is the probability that I would have at least one patient who had an incident?” As you can see in figure 2, in the first row of the Fisher’s exact test column, it is 0.594 (about 60 percent), which is not unusual.
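The “grab a random sample” question is a hypergeometric tail probability, so it can be sketched with nothing more than Python’s standard library. This is an illustration of the one-sided calculation under that sampling model, not the routine any particular statistical package uses; the function name `upper_tail` and its parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def upper_tail(population, events, sample, observed):
    """P(at least `observed` events in a random sample), hypergeometric model."""
    total = comb(population, sample)
    hits = sum(
        comb(events, k) * comb(population - events, sample - k)
        for k in range(observed, min(events, sample) + 1)
    )
    # Exact integer arithmetic first, converted to a float only at the end.
    return float(Fraction(hits, total))


p = upper_tail(population=2973, events=13, sample=199, observed=1)
print(f"P(at least 1 incident in 199 patients) = {p:.3f}")
```

The result should land very close to the 0.594 quoted from figure 2.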

Figure 3 sets up the calculation for the only doctor for whom two patients had the event occur (out of 14 patients). So, one is comparing 2/14 vs. 11/2,959. One now has to calculate the exact probabilities of randomly obtaining zero (p0) and one (p1) event in a random sample of 14, then calculate 1 - (p0 + p1) to answer, “What is the probability of obtaining two or more events in this sample due to sheer randomness?” As you see from the table in figure 2, it is 0.0016 (about 0.2 percent).
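The same steps can be followed literally in code: compute the probability of exactly zero events and exactly one event, then take the complement. This is a sketch assuming the hypergeometric sampling model described above; `hypergeom_pmf` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, population, events, sample):
    """P(exactly k events in a random sample drawn without replacement)."""
    return Fraction(
        comb(events, k) * comb(population - events, sample - k),
        comb(population, sample),
    )


# 13 events among 2,973 patients; the doctor in question treated 14 of them.
p0 = hypergeom_pmf(0, population=2973, events=13, sample=14)
p1 = hypergeom_pmf(1, population=2973, events=13, sample=14)
p_two_or_more = float(1 - (p0 + p1))

print(f"p0 = {float(p0):.4f}, p1 = {float(p1):.4f}")
print(f"P(two or more events in 14 patients) = {p_two_or_more:.4f}")
```

The complement should agree with the figure 2 value of 0.0016 to the precision shown there.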

The question now becomes, “What constitutes an outlier?” To put things in perspective, I’m going to use the technique discussed in my February 2006 column, “Why Three Standard Deviations?” to see what the threshold of probability might be for overall risks of 0.05 and 0.10 (one-tailed).

In this case of 84 simultaneous decisions:

Overall 5-percent risk - p < 0.00061 to declare “significance”

Overall 10-percent risk - p < 0.00125
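These thresholds are consistent with splitting the overall risk across 84 independent decisions, so that each individual p-value must fall below 1 - (1 - overall risk)^(1/84). That formula is an assumption on my part about how the numbers were obtained (the February 2006 column is not reproduced here), but it does reproduce both values to the precision shown. A minimal sketch, with a hypothetical function name:

```python
def per_test_threshold(overall_risk, n_decisions):
    """Per-comparison p-value cutoff that keeps the overall risk at
    `overall_risk`, assuming `n_decisions` independent comparisons."""
    return 1 - (1 - overall_risk) ** (1 / n_decisions)


for risk in (0.05, 0.10):
    cutoff = per_test_threshold(risk, n_decisions=84)
    print(f"overall risk {risk:.2f}: declare significance if p < {cutoff:.5f}")
```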


Only the 2/14 result (p = 0.0016) comes close to these criteria, and even then only to the 10-percent risk level, which it narrowly misses.

There are never any easy answers when rates of rare adverse events regarding human life are being compared and someone’s professional reputation is at stake. At least choose the correct analysis, and beware of packaged “easy answers.”




About The Author

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.