Published on *Quality Digest* (https://www.qualitydigest.com)

**Published:** 07/08/2013

Recent inquiries suggest a need to review the four capability indexes in common use today. A clear understanding of what each index does, and does not do, is essential to clear thinking and good usage. To see how to use the four indexes to tell the story contained in your data, and to learn how to avoid a common pitfall, read on.

Four indexes in common use today are the capability ratio, $C_p$; the performance ratio, $P_p$; the centered capability ratio, $C_{pk}$; and the centered performance ratio, $P_{pk}$. The formulas for these four ratios are:
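Written out from the component definitions that follow, with $\mathrm{Sigma}(X)$ as the within-subgroup dispersion, $s$ as the global standard deviation statistic, and $DNS$ as the distance to the nearer specification, the four ratios can be expressed as below. This is a reconstruction in LaTeX notation based on those definitions, not the article's original typesetting.

```latex
\begin{aligned}
C_p &= \frac{USL - LSL}{6\,\mathrm{Sigma}(X)} &\qquad C_{pk} &= \frac{2\,DNS}{6\,\mathrm{Sigma}(X)} \\[4pt]
P_p &= \frac{USL - LSL}{6\,s} &\qquad P_{pk} &= \frac{2\,DNS}{6\,s}
\end{aligned}
```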

To understand these ratios we need to understand the four components used in their construction. The difference between the specification limits, *USL* – *LSL*, is the specified tolerance. It defines the *total space available* for the process.

The distance to the nearer specification, *DNS*, is the distance from the average to the nearer specification limit. Operating with an average that is closer to one specification than the other effectively narrows the space available to the process. It is like having a process that is centered within limits that have a specified tolerance of 2 *DNS*. Thus, the numerator of both the centered capability ratio and the centered performance ratio characterizes the *effective space available* when the process is not centered within the actual specification limits.

*Sigma(X)* denotes any one of several within-subgroup measures of dispersion. One such measure would be the average of the subgroup ranges divided by the appropriate bias correction factor. Another such measure is the average of the subgroup standard deviation statistics divided by the appropriate bias correction factor. The quantity denoted by 6 *Sigma(X)* represents the *generic space required by a process* when that process is operated up to its full potential.

The global standard deviation statistic, *s*, is the descriptive statistic introduced in every statistics class. Since it is computed using all of the data, it effectively treats the data as one homogeneous group of values. This descriptive statistic is useful for summarizing the past, but if the process is not being operated up to its full potential the changes in the process will tend to inflate this global measure of dispersion. Thus, this measure of dispersion simply describes the past without respect to whether the process has been operated up to its full potential or not. The denominators of 6*s* define the *space used by the process in the past*.
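To make the contrast between the two dispersion measures concrete, here is a minimal Python sketch. It assumes the average-range form of *Sigma(X)* described above; the function names and the data are hypothetical, and the d2 values are the standard bias correction factors for subgroup ranges.

```python
from statistics import mean, stdev

# Standard bias correction factors (d2) for subgroup ranges, keyed by subgroup size.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_x(subgroups):
    """Within-subgroup dispersion: average subgroup range divided by d2."""
    n = len(subgroups[0])
    average_range = mean([max(g) - min(g) for g in subgroups])
    return average_range / D2[n]

def global_s(subgroups):
    """Global standard deviation statistic: all values pooled into one group."""
    return stdev([x for g in subgroups for x in g])

# Hypothetical data: three subgroups of size 5 from a steady process.
steady = [[9, 10, 11, 10, 12], [10, 9, 11, 12, 10], [11, 10, 9, 10, 12]]
# Same subgroups, except the process average shifts up 4 units for the last one.
shifted = steady[:2] + [[x + 4 for x in steady[2]]]

for label, data in (("steady", steady), ("shifted", shifted)):
    print(f"{label}: Sigma(X) = {sigma_x(data):.3f}, s = {global_s(data):.3f}")
```

Because the shift in this toy data set occurs between subgroups, the subgroup ranges (and hence *Sigma(X)*) are unchanged, while the global *s* is inflated by the change in the process average.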

A glance at the formulas above will reveal that the only difference between the capability indexes and the corresponding performance indexes is simply which measure of dispersion is used. The performance indexes use the global standard deviation statistic to describe the past. The capability indexes use a within-subgroup measure of dispersion to approximate the process potential. Whenever and wherever this profound difference between these measures of dispersion is not appreciated it is inevitable that capability confusion will follow.

Depending upon what is happening with the underlying process, the four indexes above can be four estimates of one quantity, four estimates of two different quantities, or even four estimates of four different quantities. This variable nature of what these index numbers represent has complicated their interpretation in practice. As a result, many different explanations have been offered. Unfortunately, some of these explanations have been flawed and even misleading.

Using these four components defined above, we see that the capability ratio, $C_p$, expresses the space available within the specifications as a multiple of the space required by the process when it is centered within the specifications and is operated predictably. It is the *space available* divided by the *space required* under the best possible circumstances.

The performance ratio, $P_p$, expresses the *space available* within the specifications as a multiple of the *space used in the past* by this process. If the process has been operated up to its full potential, the space used in the past and the space required by the process will be essentially the same, and the performance ratio will be quite similar to the capability ratio. If the process has not been operated up to its full potential then the space used by the process in the past will always exceed the space required by the process, and the performance ratio will be smaller than the capability ratio. Thus, the agreement between the capability ratio and the performance ratio will characterize the extent to which the process is, or is not, being operated predictably.

The centered capability ratio, $C_{pk}$, expresses the *effective space available* as a multiple of the *space required* by the process when it is operated predictably at the current average. It is the effective space available divided by the space required. The extent to which the centered capability ratio is smaller than the capability ratio will characterize how far off-center the process is operating.

The centered performance ratio, $P_{pk}$, expresses the *effective space available* as a multiple of the *space used by the process in the past*. This ratio essentially describes the process as it is, where it is, without any consideration of what the process has the potential to do. The extent to which the centered performance ratio is smaller than the performance ratio is a characterization of how far off-center the process has been operated.
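The four comparisons just described reduce to a few lines of arithmetic. The Python sketch below is illustrative only: the function name `capability_indexes`, the `Indexes` container, and the sample inputs are assumptions made here, not part of the original article.

```python
from dataclasses import dataclass

@dataclass
class Indexes:
    cp: float   # space available / space required
    pp: float   # space available / space used in the past
    cpk: float  # effective space available / space required
    ppk: float  # effective space available / space used in the past

def capability_indexes(lsl, usl, average, sigma_x, s):
    """Compute the four ratios from the four components described in the text."""
    tolerance = usl - lsl                        # total space available
    dns = min(usl - average, average - lsl)      # distance to nearer specification
    return Indexes(
        cp=tolerance / (6 * sigma_x),
        pp=tolerance / (6 * s),
        cpk=2 * dns / (6 * sigma_x),
        ppk=2 * dns / (6 * s),
    )

# Hypothetical inputs: specifications 0 to 12, average 4, Sigma(X) = 1, s = 2.
result = capability_indexes(lsl=0.0, usl=12.0, average=4.0, sigma_x=1.0, s=2.0)
print(result)
```

With these made-up inputs the average sits off-center and *s* is twice *Sigma(X)*, so all four ratios differ, which is the situation the text associates with a process operated unpredictably and off-target.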

The relationship between these four indexes may be seen in figure 1. There the top tier represents either the actual capability of a process that is operated predictably, or the hypothetical capability of a process that is operated unpredictably. The bottom tier represents the actual performance of a process that is operated unpredictably. The left side represents what happens when the process is centered at the mid-point of the specifications, while the right side takes into account the effect of having an average value that is not centered at the midpoint of the specifications.

Thus, the top tier of figure 1 is concerned with the process potential, and the bottom tier describes the process performance. As a process is operated ever more closely to its full potential, the values in the bottom tier will move up to be closer to those in the top tier.

The left side implicitly assumes the process is centered within the specifications; the right side takes into account the extent to which the process may be off-center. As a process is operated closer to the center of the specifications the values on the right will move over to be closer to those on the left.

Thus, when a process is operated predictably and on target, the four indexes will be four estimates of the same thing. This will result in the four indexes being close to each other. (Since the indexes are all statistics, they will rarely be exactly the same.)

When a process is operated predictably but is not centered within the specifications, the discrepancy between the right and left sides of figure 1 will quantify the effects of being off-center. With a predictable process, the two indexes on the right side of figure 1 will both estimate the same thing while the two indexes on the left side will be two estimates of another quantity.

When a process is operated unpredictably, the indexes in the bottom row of figure 1 will be smaller than those in the top row, and these discrepancies will quantify the gap due to unpredictable operation.

When a process is operated unpredictably and off-target, the four indexes will represent four different quantities.

Thus, the capability ratio, $C_p$, is the best-case value, and the centered performance ratio, $P_{pk}$, is the worst-case value. The gap between these two values is the opportunity that exists for improving the current process by operating it up to its full potential.

The capability ratio, $C_p$, approximates what can be done without reengineering the process. If this best-case value is good enough, then the current process can be made to operate in such a way as to meet the process requirements. Experience has repeatedly shown that it is cheaper to learn how to operate the existing process predictably and on-target than it is to try to upgrade or reengineer that process.

Thus, by comparing the four capability and performance indexes you can quickly and easily get some idea about how a process is being operated. How close is it to being operated up to its full potential? Is it being operated on-target? Will it be necessary to reengineer the process, or can it be made to meet the process requirements without the trouble and expense of reengineering?

Figure 2 contains 260 observations from a predictable process. The corresponding average and range chart is shown in figure 3. The specifications for this process are 10.0 ± 3.5.

This process has a grand average of 10.15. The specification limits are 6.5 and 13.5. Thus, the distance to nearer specification will be *DNS* = 13.5 – 10.15 = 3.35. The average range is 4.25. With subgroups of size 5 this latter value results in a value for *Sigma(X)* of 4.25/2.326 = 1.83. Finally, the global standard deviation statistic is *s* = 1.847. Thus, the four capability and performance ratios are:
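The arithmetic can be reproduced directly from the summary statistics quoted in this paragraph. The following Python sketch assumes the ratio definitions described earlier (the specified tolerance, or 2 *DNS*, divided by six times the relevant dispersion measure); the variable names are illustrative.

```python
D2_N5 = 2.326            # bias correction factor for ranges of subgroups of size 5

lsl, usl = 6.5, 13.5     # specifications: 10.0 plus or minus 3.5
grand_average = 10.15
average_range = 4.25
s = 1.847                # global standard deviation statistic

sigma_x = average_range / D2_N5                       # about 1.83
dns = min(usl - grand_average, grand_average - lsl)   # 3.35

cp = (usl - lsl) / (6 * sigma_x)
pp = (usl - lsl) / (6 * s)
cpk = 2 * dns / (6 * sigma_x)
ppk = 2 * dns / (6 * s)

# Rounded to two decimals: Cp = 0.64, Pp = 0.63, Cpk = 0.61, Ppk = 0.60
print(f"Cp = {cp:.2f}, Pp = {pp:.2f}, Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```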

Here all four indexes tell the same story. They all might be taken to be estimates of the same quantity. Even without the average and range chart of figure 3 we could tell that this process was being operated predictably and is fairly well-centered within the specifications. The fact that these indexes are all near 60 percent implies that this process is not capable of meeting the specifications even though it is being operated up to its full potential.

Raw materials for a compound are dry-mixed in a pharmaceutical blender. The recipe calls for batches that are supposed to weigh 1,000 kg. If the weight of a batch is off, then presumably the recipe is also off. As each batch is dumped out of the blender the weight is recorded. Figure 4 shows the weights of all 259 batches produced during one week. The values are in time-order by rows. The *XmR* chart for these values is shown in figure 5. The limits shown were based on the first 45 values. There are points outside the limits within this baseline period, and the process deteriorates as the week progresses.

The specifications for the batch weights are 900 kg to 1,100 kg. With an average moving range of 27.84 the value for *Sigma(X)* is 27.84/1.128 = 24.7 kg. The global standard deviation statistic for all 259 values is *s* = 61.3 kg. With an average of 936.9, the *DNS* value is 36.9 kg. Thus, the four indexes are:
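As a check on the arithmetic, the four values can be computed from the statistics quoted in this paragraph. This Python sketch again assumes the ratio definitions described earlier; the helper `ratio` and the `opportunity` variable are names introduced here for illustration.

```python
def ratio(space, dispersion):
    """Space available (or effective space available) over six times the dispersion."""
    return space / (6 * dispersion)

D2_N2 = 1.128                  # bias correction factor for two-point moving ranges

lsl, usl = 900.0, 1100.0       # kg
average = 936.9                # kg
average_moving_range = 27.84   # kg
s = 61.3                       # kg, global standard deviation statistic

sigma_x = average_moving_range / D2_N2         # about 24.7 kg
dns = min(usl - average, average - lsl)        # 36.9 kg

cp = ratio(usl - lsl, sigma_x)
pp = ratio(usl - lsl, s)
cpk = ratio(2 * dns, sigma_x)
ppk = ratio(2 * dns, s)
opportunity = cp - ppk         # gap between best-case and worst-case values

# Rounded to two decimals: Cp = 1.35, Pp = 0.54, Cpk = 0.50, Ppk = 0.20
print(f"Cp = {cp:.2f}, Pp = {pp:.2f}, Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
print(f"Cp - Ppk = {opportunity:.2f}")
```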

The discrepancy between the capability ratio and the performance ratio shows that this process is being operated unpredictably. The discrepancy between the centered performance ratio and the performance ratio shows that the average is not centered within the specifications. The capability ratio describes what the current process is capable of doing when operated predictably and on target. The centered performance ratio describes the train wreck of what they actually accomplished during this week, and the gap between these two indexes describes the opportunity that exists for this process.

As shown in these examples, each of the four index numbers makes a specific comparison between the specified tolerance or the effective space available and either the within-subgroup variation or the global standard deviation statistic. In an effort to distinguish between the capability indexes and the performance indexes the performance indexes have sometimes been called “long-term capability indexes.” This nomenclature is misleading and inappropriate.

The idea behind the terminology of long-term capability is that if you just collect enough data over a long enough period of time you will end up with a good estimate of the process capability. To illustrate how this is supposed to work we will use data from example one to perform a sequence of computations using successively more and more data at each step. Although we would not normally perform the computations in this way in practice, we do so here to see how increasing amounts of data affect the computation of performance and capability ratios.

We begin with the first eight subgroups. The global standard deviation statistic for these 40 values is 1.974. The specifications are 6.5 to 13.5, so our *USL* – *LSL* = 7.0. Using these values we get a performance ratio of 0.591. The average range for these eight subgroups is 4.375, so *Sigma(X)* is 1.881, and with this value we get a capability ratio of 0.620. It is instructive to note how close these values are to the values found using all the data in example one above.

The first 12 subgroups contain 60 values. The global standard deviation statistic for these 60 values is 1.742. Using this value we get a performance ratio of 0.670. The average range for these 12 subgroups is 3.833, so *Sigma(X)* is 1.648, and with this value we get a capability ratio of 0.708.

The first 16 subgroups contain 80 values. The global standard deviation statistic for these 80 values is 1.678. Using this value we get a performance ratio of 0.691. The average range for these 16 subgroups is 3.875, so *Sigma(X)* is 1.666, and with this value we get a capability ratio of 0.700.

Continuing in this manner, adding 20 more values at each step, we get the performance ratios and capability ratios shown in figure 6. There we see that as we use greater amounts of data in the calculations these ratios settle down and get closer and closer to a value near 0.640.
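The step-by-step procedure described above can be sketched in Python as follows. The function name `running_ratios` is an assumption made here, and the data are randomly generated stand-ins, not the 260 observations of figure 2, so the printed values will not match figure 6.

```python
import random
from statistics import mean, stdev

D2_N5 = 2.326  # bias correction factor for ranges of subgroups of size 5

def running_ratios(subgroups, lsl, usl, start=8, step=4):
    """Return (n_values, Pp, Cp) using the first k subgroups, k = start, start + step, ..."""
    tolerance = usl - lsl
    rows = []
    for k in range(start, len(subgroups) + 1, step):
        used = subgroups[:k]
        values = [x for g in used for x in g]
        s = stdev(values)                                          # global statistic
        sigma_x = mean([max(g) - min(g) for g in used]) / D2_N5    # within-subgroup
        rows.append((len(values), tolerance / (6 * s), tolerance / (6 * sigma_x)))
    return rows

# Hypothetical stand-in data: 52 subgroups of size 5 from one unchanging distribution.
rng = random.Random(0)
subgroups = [[rng.gauss(10.15, 1.83) for _ in range(5)] for _ in range(52)]

for n, pp, cp in running_ratios(subgroups, lsl=6.5, usl=13.5):
    print(f"{n:3d} values: Pp = {pp:.3f}, Cp = {cp:.3f}")
```

Starting with eight subgroups and adding four subgroups (20 values) at a time mirrors the sequence of 40, 60, 80, and so on up to 260 values used in the text.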

Of course, as may be seen above, when a process is operated predictably, the capability ratio and the performance ratio both estimate the same quantity. Thus, when a process is operated up to its full potential there is no distinction to be made between the short-term capability and the long-term capability. Both computations describe the actual capability of the predictable process.

The convergence of a statistic to some asymptotic value that occurs with increasing amounts of data that is seen in figure 6 is the idea behind many things we do in statistics. Unfortunately, this convergence only happens when the data are homogeneous. In order to see what happens with a process that is not operated up to its full potential, we shall repeat the exercise above using the data from example two.

The first 40 batch weights have a global standard deviation statistic of 41.60. The specifications are 900 to 1,100, so our specified tolerance is *USL* – *LSL* = 200. Using these values we get a performance ratio of 0.801. The average moving range for these 40 values is 29.10, so *Sigma(X)* is 25.80, and with this value we get a capability ratio of 1.292.

The first 60 batch weights have a global standard deviation statistic of 44.20. Using this value we get a performance ratio of 0.754. The average moving range for these 60 values is 25.76, so *Sigma(X)* is 22.84, and with this value we get a capability ratio of 1.459.

Continuing in this manner, adding 20 more values at each step, we get the performance ratios and capability ratios shown in figure 7. For the sake of comparison, both figure 6 and figure 7 use the same horizontal and vertical scales.
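For individual values, the same exercise uses the average moving range in place of the average subgroup range. The Python sketch below is a hedged illustration: the function name `running_ratios_individuals` is introduced here, and the batch weights are simulated with made-up level changes rather than taken from figure 4, so the output will not reproduce figure 7.

```python
import random
from statistics import mean, stdev

D2_N2 = 1.128  # bias correction factor for two-point moving ranges

def running_ratios_individuals(values, lsl, usl, start=40, step=20):
    """Return (n, Pp, Cp) for the first n values, n = start, start + step, ..., len(values)."""
    tolerance = usl - lsl
    stops = list(range(start, len(values), step)) + [len(values)]
    rows = []
    for n in stops:
        used = values[:n]
        moving_ranges = [abs(b - a) for a, b in zip(used, used[1:])]
        sigma_x = mean(moving_ranges) / D2_N2   # within-subgroup dispersion
        s = stdev(used)                         # global standard deviation statistic
        rows.append((n, tolerance / (6 * s), tolerance / (6 * sigma_x)))
    return rows

# Hypothetical batch weights: the process level drops twice during the week.
rng = random.Random(1)
levels = [1000.0] * 120 + [960.0] * 60 + [920.0] * 79   # 259 values in all
weights = [level + rng.gauss(0, 25) for level in levels]

for n, pp, cp in running_ratios_individuals(weights, lsl=900.0, usl=1100.0):
    print(f"{n:3d} values: Pp = {pp:.3f}, Cp = {cp:.3f}")
```

The moving ranges respond mainly to point-to-point variation, so a change in level affects only the few ranges that straddle it, whereas the global *s* absorbs every change in level across the whole record.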

To what value is the performance ratio curve in figure 7 converging? After 120 values it appears to be approaching 0.80, then with 20 additional values it suddenly drops down to the neighborhood of 0.70. After 180 values it seems to be approaching 0.70, then with 20 more values it drops down to the neighborhood of 0.60. After 240 values we are still in the vicinity of 0.60, but then with 259 values we drop down to 0.54. So which value are you going to use as your long-term capability? 0.80? 0.70? 0.60? or 0.54?

Here we see that even though we use ever greater amounts of data, the ratios do not settle down to any particular value. Neither do we see the agreement between the performance ratio and the capability ratio that was evident in figure 6. Clearly these two ratios characterize different aspects of the data in this case. Both the migration and the estimation of different things happen because this process is changing over time. Because of these changes there is no magic amount of data that will result in a “good number.” The computations are chasing a moving target. The question “What is the long-term capability of this process?” is meaningless simply because there is no such quantity to be estimated regardless of how many data we might use.

With an unpredictable process, as we use greater amounts of data in our computation we eventually combine values that were obtained while the process was acting differently. This combination of unlike values does not prevent us from computing our summary statistics, but it does complicate the interpretation of those statistics. With an unpredictable process there is no single value for the process average, or the process variation, or the process capability. All such notions of process characteristics become chimeras, and any attempt to use our statistics to estimate these nonexistent process characteristics is an exercise in frustration. This is why the idea of long-term capability is just so much nonsense.

However, once we understand that we are working with an unpredictable process, we are free to use our statistics to characterize different aspects of the *data* (as opposed to the process). As noted earlier, the capability ratio of 1.35 computed from the first 45 values of example two provides an approximation of what this process has the potential to do. In the same manner, the centered performance ratio of 0.20 describes what was done during this week. And the difference between these two statistics characterizes the gap between performance and potential. Thus, we may use the capability and performance indexes to identify opportunities even when they do not estimate fixed aspects of the underlying process.

Thus, referring to the performance indexes as long-term capabilities confuses the issue and misleads everyone. They are descriptive statistics that summarize the past. They do not estimate any fixed quantity unless the process is being operated predictably. And they definitely do not describe the indescribable “long-term capability of an unpredictable process.”
