What About Charts for Count Data?

Count data differ from measurement data in two ways. First, count data possess a certain irreducible discreteness that measurement data do not. Second, every count must have a known "area of opportunity" to be well-defined.

With measurement data, the discreteness of the values is a matter of choice. This is not the case with count data, which are based on the occurrence of discrete events (the so-called attributes). Count data always consist of integral values. This inherent discreteness is, therefore, a characteristic of the data and can be used in establishing control charts.

The area of opportunity for any given count defines the criteria by which the count must be interpreted. Before two counts may be compared, they must have corresponding (i.e., equally sized) areas of opportunity. If the areas of opportunity are not equally sized, then the counts must be converted into rates before they can be compared effectively. The conversion from counts to rates is accomplished by dividing each count by its own area of opportunity.
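The conversion can be sketched in a few lines of Python. This is only an illustration: the function name `counts_to_rates` and the complaint and shipment figures are hypothetical, not data from any real process.

```python
def counts_to_rates(counts, areas):
    """Divide each count by its own area of opportunity."""
    if len(counts) != len(areas):
        raise ValueError("each count needs its own area of opportunity")
    return [count / area for count, area in zip(counts, areas)]


# Hypothetical data: complaints received each month, and the number of
# units shipped that month (in thousands) as the area of opportunity.
complaints = [12, 15, 9]
units_shipped_k = [4.0, 6.0, 3.0]

rates = counts_to_rates(complaints, units_shipped_k)
print(rates)  # complaints per thousand units shipped
```

In this made-up example the month with the most complaints also shipped the most units, so its rate is the lowest of the three. The raw counts alone would suggest the opposite.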

These two distinctive characteristics of count data have been used to justify different approaches for calculating the control limits of attribute charts. Hence, four control charts are commonly associated with count data: the np-chart, the p-chart, the c-chart and the u-chart. However, all four charts are for individual values.

The only difference between an XmR chart and an np-chart, p-chart, c-chart or u-chart is the way they measure dispersion. For any given set of count data, the X-chart and the four types of charts mentioned previously will show the same running records and central lines. The only difference between these charts will be the method used to compute the distance from the central line to the control limits.
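For reference, the XmR computation might be sketched as follows. The scaling constants 2.66 and 3.268 are the customary ones for two-point moving ranges. The function name `xmr_limits` and the counts are hypothetical, chosen only for illustration.

```python
def xmr_limits(values):
    """Central line and limits for an XmR chart of individual values."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        # 2.66 = 3 / 1.128 converts the average moving range to 3-sigma.
        "lower": center - 2.66 * mr_bar,
        "upper": center + 2.66 * mr_bar,
        "mr_bar": mr_bar,
        # Upper limit for the moving range chart itself.
        "mr_upper": 3.268 * mr_bar,
    }


# Hypothetical counts, one per time period.
counts = [10, 12, 8, 11, 9, 13, 10, 7]
limits = xmr_limits(counts)
print(limits)
```

Nothing in this calculation refers to a probability model. The dispersion comes entirely from the point-to-point variation in the data.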

The np-, p-, c- and u-charts all assume that the dispersion is a function of the location. That is, they assume that SD(X) is a function of MEAN(X). Relying on such a relationship between the parameters of a theoretical probability distribution must be justified by verifying a set of conditions. When the conditions are satisfied, the probability model is likely to approximate the behavior of the counts when the process displays a reasonable degree of statistical control.
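For concreteness, the customary three-sigma limits for the four charts can be written as below. The notation is an assumption introduced here: bars denote averages, n is the number of items per sample, n_i is the size of the i-th sample, and a_i is the area of opportunity for the i-th count. The np- and p-chart limits rest on a binomial model and the c- and u-chart limits on a Poisson model, so in every case the dispersion term is computed from the central line alone.

```latex
\begin{align*}
\text{np-chart:} \quad & n\bar{p} \pm 3\sqrt{n\bar{p}\,(1-\bar{p})} \\
\text{p-chart:}  \quad & \bar{p} \pm 3\sqrt{\bar{p}\,(1-\bar{p})/n_i} \\
\text{c-chart:}  \quad & \bar{c} \pm 3\sqrt{\bar{c}} \\
\text{u-chart:}  \quad & \bar{u} \pm 3\sqrt{\bar{u}/a_i}
\end{align*}
```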

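Yet, deciding which probability model is appropriate requires judgment that most students of statistics do not possess. For example, the conditions for using a binomial probability model may be stated as:

1. The area of opportunity for each count must consist of n distinct items.
2. Each of the n items must be classified as either possessing or not possessing some attribute.
3. The probability p that an item possesses the attribute must be the same for every one of the n items.
4. The likelihood that an item possesses the attribute must not be affected by whether the preceding item possessed it.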

If these four conditions apply to your data, then you may use the binomial model to compute an estimate of SD(X) directly from your estimate of MEAN(X). Or, you could simply place the counts (or proportions) on an XmR chart and estimate the dispersion from the moving range chart. You will obtain essentially the same chart either way.

Unlike attribute charts, the XmR chart assumes nothing about the relationship between the location and dispersion. It measures the location directly with the average, and it measures the dispersion directly with the moving ranges. Thus, while the np-, p-, c- and u-charts use theoretical limits, the XmR chart uses empirical limits. The only advantage of theoretical limits is that they include a larger number of degrees of freedom, which means that they stabilize more quickly.

If the theory is correct, and you use an XmR chart, the empirical limits will be similar to the theoretical limits. However, if the theory is wrong, the theoretical limits will be wrong, and the empirical limits will still be correct.
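A small sketch can make both cases concrete. It computes np-chart limits and X-chart limits for two made-up sets of counts, one where a binomial model is plausible and one where the counts vary far more than a binomial model would allow. All names and figures are hypothetical, and the second data set is contrived to exaggerate the effect.

```python
import math


def np_chart_limits(counts, n):
    """Theoretical limits: binomial SD computed from the average count."""
    center = sum(counts) / len(counts)
    p_bar = center / n
    sigma = math.sqrt(n * p_bar * (1 - p_bar))
    return center - 3 * sigma, center + 3 * sigma


def x_chart_limits(counts):
    """Empirical limits: dispersion taken from the average moving range."""
    center = sum(counts) / len(counts)
    moving_ranges = [abs(b - a) for a, b in zip(counts, counts[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return center - 2.66 * mr_bar, center + 2.66 * mr_bar


def points_outside(counts, limits):
    """Return the counts that fall outside the given (lower, upper) limits."""
    lower, upper = limits
    return [c for c in counts if c < lower or c > upper]


# Hypothetical counts of nonconforming items, n = 100 items per sample.
well_behaved = [10, 13, 8, 11, 9, 13, 10, 6]
# Hypothetical counts, n = 1000 items per sample, with routine variation
# much larger than a binomial model with constant p would predict.
overdispersed = [100, 130, 80, 110, 90, 130, 100, 60]

for label, counts, n in [("well behaved", well_behaved, 100),
                         ("overdispersed", overdispersed, 1000)]:
    theoretical = np_chart_limits(counts, n)
    empirical = x_chart_limits(counts)
    print(label)
    print("  np-chart limits:", theoretical,
          "outside:", points_outside(counts, theoretical))
    print("  X-chart limits: ", empirical,
          "outside:", points_outside(counts, empirical))
```

For the first set the two pairs of limits nearly coincide. For the second, the binomial limits are much narrower than the routine variation in the counts and flag three points, while the empirical limits accommodate that variation and flag none.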

You can't go far wrong using an XmR chart with count data, and it is generally easier to work with empirical limits than to verify the conditions for a theoretical model.

© 1996 SPC Press Inc. Telephone (423) 584-5005.