Probability Models Don’t Generate Your Data

Don’t start your analysis by asking, “How are these data distributed?”

The number of major hurricanes in the Atlantic since 1940 (as we considered in my February column, “First, Look at the Data”) are shown as a histogram in figure 1, below. Some analysts would begin their treatment of these data by considering whether they might be distributed according to a Poisson distribution.

The 68 data in figure 1 have an average of 2.60. Using this value as the mean value for a Poisson distribution, we can carry out any one of several tests collectively known as “goodness-of-fit” tests. Skipping over the details, the results show that there’s no detectable lack of fit between the data and a Poisson distribution with a mean of 2.60. Based on this, many analysts would proceed to use techniques that are appropriate for collecting Poisson observations. For example, they might transform the data in some manner, or they might compute probability limits to use in analyzing these data. Such actions would be wrong on several levels.

…

Want to continue?

By logging in you agree to receive communication from Quality Digest. Privacy Policy.

Create a FREE account

Forgot My Password