Eston Martz

Statistics

Imprisoned by Statistics

How poor data collection and analysis sent an innocent nurse to jail

Published: Wednesday, February 24, 2016 - 16:31

If you want to convince someone that at least a basic understanding of statistics is an essential life skill, bring up the case of Lucia de Berk. Hers is a story that’s too awful to be true—except that it’s completely true.

A flawed analysis irrevocably altered de Berk’s life and kept her behind bars for a full decade, and the fact that this analysis targeted and harmed just one person makes it more frightening. When tragedy befalls many people, aggregating the harmed individuals into a faceless mass helps us cope with the horror. You can’t play the same trick on yourself when you consider a single innocent woman, sentenced to life in prison, thanks to an erroneous analysis.

The case against Lucia

It started with an infant’s unexpected death at a children’s hospital in The Hague. Administrators subsequently reviewed earlier deaths and near-death incidents, and identified nine other incidents in the previous year they believed were medically suspicious. Dutch prosecutors proceeded to press charges against pediatric nurse Lucia de Berk, who had been responsible for patient care and medication at the time of all of those incidents. In 2003, de Berk was sentenced to life in prison for the murder of four patients and the attempted murder of three.

The guilty verdict, rendered despite a glaring lack of physical or even circumstantial evidence, was based (at least in part) on a prosecution calculation that only a 1 in 342 million chance existed that a nurse’s shifts would coincide with so many suspicious incidents. “In the Lucia de B. case, statistical evidence has been of enormous importance,” a Dutch criminologist said at the time. “I do not see how one could have come to a conviction without it.” The guilty verdict was upheld on appeal, and de Berk spent the next 10 years in prison.

One in 342 million?

If an expert states that the probability of something happening by random chance is just 1 in 342 million, and you’re not a statistician, perhaps you’d be convinced those incidents did not happen by random chance.

But if you’re statistically inclined, perhaps you’d wonder how experts reached this conclusion. That’s exactly what statisticians Richard Gill and Piet Groeneboom, among others, began asking. They soon realized that the prosecution’s 1-in-342-million figure was very, very wrong.

Here’s where the case began to fall apart—and not because the situation was complicated. In fact, the problems should have been readily apparent to anyone with a solid grounding in statistics.

What prosecutors failed to ask

The first question in any analysis should be, “Can you trust your data?” In de Berk’s case, it seems nobody bothered to ask.

Richard Gill graciously attributes this to a kind of culture clash between criminal and scientific investigation. Criminal investigation begins with the assumption that a crime occurred, and proceeds to seek out evidence that identifies a suspect. A scientific approach begins by asking whether a crime was even committed.

In de Berk’s case, investigators took a decidedly nonscientific approach. In gathering data from the hospitals where she worked, they omitted incidents that didn’t involve Lucia from their totals (cherry-picking) and made arbitrary, inconsistent classifications of other incidents. Incredibly, events de Berk could not have been involved in were nonetheless attributed to her. Confirmation and selection bias were hard at work on the prosecution’s behalf.

Further, much of the “data” about events were based on individuals’ memories, which are notoriously unreliable. In a criminal investigation where witnesses know what’s being sought and may have opinions about a suspect’s guilt, relying on memories of events that happened weeks and months ago seems like it would be a particularly dubious decision. Nonetheless, the prosecution’s statistical experts deemed the data gathered under such circumstances trustworthy.

As Gill, one of the few heroes in this sordid and sorry mess, pointed out, “The statistician has to question all his clients’ assumptions and certainly not to jump to the conclusions which the client is aiming for.” Clearly, that didn’t happen here.

Even if the data had been reliable...

So the data used against de Berk didn’t pass the smell test for several reasons. But even if the data had been collected in a defensible manner, the prosecution’s statement about 1 in 342 million odds was still wrong. To arrive at that figure, the prosecution’s statistical expert multiplied p-values from three separate analyses. However, when combining those p-values the expert failed to perform necessary statistical corrections, resulting in a p-value that was far, far lower than it should have been. You can read the details about these calculations in this paper.
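The arithmetic error can be sketched in a few lines of Python. The p-values below are illustrative placeholders, not the actual figures from the case, and Fisher’s method is shown as one standard way to combine independent p-values; the published reanalysis applied its own corrections.

```python
import math

# Hypothetical p-values from three separate analyses (illustrative only,
# not the actual figures from the de Berk case).
p_values = [0.01, 0.02, 0.05]

# Naive approach: multiply the p-values directly. This is the kind of
# calculation that produces absurdly small combined "probabilities."
naive_product = math.prod(p_values)

# Fisher's method: a standard correction for combining k independent
# p-values. The statistic -2 * sum(ln p_i) follows a chi-squared
# distribution with 2k degrees of freedom under the null hypothesis.
k = len(p_values)
statistic = -2 * sum(math.log(p) for p in p_values)

# Chi-squared survival function for even df = 2k has the closed form
# P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
half = statistic / 2
combined_p = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))

print(f"Naive product:     {naive_product:.2e}")   # about 1e-05
print(f"Fisher combined p: {combined_p:.2e}")      # about 7.9e-04, ~80x larger
```

Even with these mild placeholder p-values, the naive product overstates the evidence by nearly two orders of magnitude; with smaller inputs the gap grows dramatically.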

In fact, when statisticians, including Gill, analyzed the prosecution’s data using the proper formulas and corrected numbers, they found the odds that a nurse could experience the pattern of events exhibited in the data could have been as low as 1 in 25.

Justice prevails at last (sort of)

Even though de Berk had exhausted her appeals, thanks to the efforts of Gill and others, the courts finally reevaluated her case in light of the revised analyses. The nurse, now declared innocent of all charges, was released from prison (and quietly given an undisclosed settlement by the Dutch government). But for an innocent defendant, justice remained blind to the statistical problems in this case across 10 years and multiple appeals, during which de Berk experienced a stress-induced stroke. It’s well worth learning more about the role of statistics in her experience if you’re interested in the impact data analysis can have on one person’s life.

At a minimum, what happened to Lucia de Berk should be more than enough evidence that a better understanding of statistics could set you free. Literally.

About The Author

Eston Martz

For Eston Martz, analyzing data is an extremely powerful tool that helps us understand the world—which is why statistics is central to quality improvement methods such as lean and Six Sigma. While working as a writer, Martz began to appreciate the beauty in a robust, thorough analysis and wanted to learn more. To the astonishment of his friends, he started a master’s degree in applied statistics. Since joining Minitab, Martz has learned that a lot of people feel the same way about statistics as he used to. That’s why he writes for Minitab’s blog: “I’ve overcome the fear of statistics and acquired a real passion for it,” says Martz. “And if I can learn to understand and apply statistics, so can you.”

Comments

Another case from Great Britain

Sally Clark was convicted in 1999 of murdering two of her sons. The two children both died suddenly as infants. A pediatric professor testified that the odds of two children from the same (affluent) household dying of SIDS were 1 in 73 million, a figure he obtained by squaring the odds of a single child dying of SIDS (1 in 8,500), i.e., 1/(8,500 × 8,500). There was a LOT wrong with this assessment, but she was convicted and stayed in prison from 1999 until January 2003, when her second appeal ended in an acquittal.
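The expert’s flawed arithmetic is easy to reproduce; the flaw is the independence assumption, not the multiplication itself:

```python
# Reproducing the expert's (flawed) calculation: squaring the single-child
# SIDS rate of 1 in 8,500 on the assumption that two deaths in one
# household are independent events.
single_child_rate = 1 / 8500

claimed_double_rate = single_child_rate ** 2
print(f"1 in {1 / claimed_double_rate:,.0f}")  # 1 in 72,250,000 -- the "1 in 73 million"

# The independence assumption is the flaw: shared genetic or environmental
# factors within a family can make a second SIDS death far more likely
# than the square of the base rate suggests.
```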

The forensic pathologist failed to disclose microbiological test results that indicated that the second son likely died of natural causes.

The 'odds' (actually the death rate) of 1 in 8,500 was obtained by biased selection of risk factors (affluent family, nonsmoking, parents who got along) and the exclusion of others (boys succumb to SIDS more frequently than girls). The professor also assumed, incorrectly, that the deaths of two children from the same household were independent events. That assumption fails if a genetic or environmental component contributed to their sudden deaths. This is the aptly named "prosecutor's fallacy," which demands instead that the relative odds of two competing explanations be assessed: it was actually less likely that the mother murdered both children at different times than that they died of SIDS (or any other cause) at different times. The fallacy also reflects the fact that 'odds' outside the well-controlled environment of gambling are not homogeneously or independently distributed.
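The right comparison can be sketched as follows. Both numbers below are placeholders (the double-murder probability is entirely hypothetical, and the 1-in-73-million figure is itself flawed, as noted above); the point is the structure of the comparison, not the values.

```python
# Prosecutor's fallacy: P(evidence | innocence) is not P(innocence | evidence).
# Given that two infants died, what matters is the relative likelihood of
# the competing explanations. Both figures below are illustrative only.

p_double_sids = 1 / 73_000_000        # the (flawed) figure claimed at trial
p_double_murder = 1 / 2_000_000_000   # hypothetical: double infanticide rarer still

# Relative odds of the two explanations for the same observed deaths:
odds_sids_vs_murder = p_double_sids / p_double_murder
print(f"Under these numbers, SIDS is ~{odds_sids_vs_murder:.0f}x "
      f"more likely than double murder")
```

A tiny P(two SIDS deaths) proves nothing by itself; if the competing explanation is rarer still, the "unlikely" innocent explanation remains the more probable one.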

 

Sally Clark eventually succumbed to alcoholism in 2007. The courts ordered a review of similar cases, and two other women were released from prison for similar 'crimes' built on similar misuse of statistics.

Shonky stats

It may seem extraordinary that people would believe such shonky stats, but how many people have bothered to check the even more shonky stats behind Six Sigma's six sigma? Companies should be embarrassed to admit they have a quality program based on such a lack of quality assurance.