There are more than 1,100 textbooks referring to “short-term process capability,” as distinct from “long term.” Surely 1,100 textbooks can’t be wrong? Let’s apply the first Bull Wrangler test. Does short- and long-term process capability make common sense?
What is capability?
According to ISO 15504—“Information technology—Process assessment,” process capability is the ability of a process to do what it’s supposed to do, or “to meet its purpose.” In Juran’s Quality Control Handbook (McGraw-Hill, 1988), Joseph Juran called it “a competence to produce quality products.” To be assured that our product or service will be capable of meeting its purpose in the future, we must have some means of forecasting the future. Walter Shewhart gave us a tool to do just that: the Shewhart Chart. If a chart shows that a process is in control, the process is predictable. The Shewhart Chart is the only such process-behavior forecasting tool available.
Some may claim that processes can be modeled to predict their future. Forecasting is theoretically possible if all the inputs and physical processes in a system can be modeled precisely from first principles. However, even for the simplest flow system, the mathematics, such as the Navier-Stokes equations, is horrendous. Modeling the myriad variables affecting most real-world processes is futile. It’s no wonder that predicting with any accuracy whether it will rain tomorrow remains impossible, despite the billions of dollars spent on developing mathematical models.
Another approach is to assume a system is a “black box,” look at the distribution of its output, and then carry out hypothesis testing for that distribution. For example, it takes some 3,200 homogeneous data points to verify that data from a process are normally distributed out to just +/– 2.95 sigma. It’s highly unlikely any process will remain totally unchanged while thousands of observations are taken. The futility of attempting to model a process in this manner is illustrated in Donald Wheeler’s recent Quality Digest column, “Why We Keep Having 100-Year Floods.”
Shewhart said: “...we never know f(y, n) in sufficient detail....” Most important, as Wheeler notes in Normality and the Process Behavior Chart (SPC Press, 2000), there is no reason to attempt to model a process or to determine the distribution of its output. A Shewhart Chart works for almost any data distribution; process modeling is not required. Indeed, Wheeler has tested 1,143 distributions to validate Shewhart’s assertion. If a Shewhart Chart shows that a process is out of control, or unstable, then rejects may be produced at any time, no matter what the claimed process capability or where the specification limits have been set.
Cp: Not a magic number
A Shewhart Chart is therefore the real measure of the capability of a process. Although managers would love to condense a Shewhart Chart into a single number, this isn’t possible: no single number can tell us whether a process will be capable in the future. Cp, as a statistic, originated with Juran’s capability ratio of tolerance width to process capability (6σ); see Juran on Quality by Design (Free Press, 1992). Cp gives an indication of the ratio of specification width to process variability, but it assumes a stable process. Although Cp may give managers a target to shoot for, by itself it gives no indication as to whether the process is capable of meeting that target.
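To make the arithmetic concrete, here is a minimal sketch of the Cp ratio just described: tolerance width divided by six standard deviations. The specification limits and measurements below are made up for illustration.

```python
# A minimal sketch of Cp = (USL - LSL) / (6 * sigma).
# Specification limits and data are hypothetical.
import statistics

usl, lsl = 10.6, 9.4             # hypothetical specification limits
data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 9.9, 10.1]

sigma = statistics.stdev(data)   # sample standard deviation
cp = (usl - lsl) / (6 * sigma)
print(f"Cp = {cp:.2f}")

# Note: Cp compares tolerance width with process spread. It assumes
# stability; it says nothing about whether the process will remain
# stable, which only a Shewhart Chart can indicate.
```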
How far into the future?
If a number can’t tell us how predictable a process is, using numbers to differentiate between short- and long-term futures is even more ridiculous. It would suggest that a number can in some way predict that a process might be OK for a short period, then later run off the rails or perhaps experience the unavoidable “shift” that Six Sigma devotees claim. Even more mysteriously, the converse is that these numbers might suggest a process may be bad for a while, but later we can be sure that all will be well. What an amazing claim that two numbers can be produced that forecast a process is going to run off the rails for a “short” undefined period, then of its own accord, correct itself “long term.”
The Lean Six Sigma Institute asserts that the difference between short- and long-term capability is caused by the 1.5 sigma drift or shift, which we should all know by now is nonsense. For a more detailed analysis of this, see the March 2013 Quality Digest column, “Six Sigma Lessons From Deming, Part 1.” Wikipedia claims “the long-term Cpk value will turn out to be 0.5 less than the short-term Cpk value.”
What’s the origin of the terms short-, medium-, and long-term capability? Does “short term” mean in the coming seconds, minutes, or hours? Does “long term” mean the coming years, perhaps? The terms were introduced by Mikel Harry in the first of his attempts to prove the drift or shift he claimed all processes were supposed to experience. His original “long term” was 50 samples. He later revised that number to six subgroups in his second attempt at propping up Six Sigma’s six sigma. In “Process Capability and Process Design,” Kenneth Crow maintains that long term is “one week or longer.” Of course, it can be anything you want and has no mathematical or statistical validity. From this ridiculous beginning, these terms became muddled with process capability.
Harry introduced a “midterm capability” to further muddy the waters. We can quickly dispose of it; thankfully, it hasn’t been picked up by industry as much as the other terms. In Six Sigma Producibility Analysis and Process Characterization (Addison-Wesley, 1992), he claims that “midterm capability” is Cpk. “The calculation requires a substantial amount of data from a fairly stable (and presumably mature) manufacturing process,” he says, but gives no indication of what a “fairly stable” process is supposed to be, nor why a “substantial amount” of data is required compared to his Z.st “short term” and Z.lt “long term.” There isn’t even a hint as to why this offers the slightest prediction of process capability in whatever “midterm” is supposed to be.
Let’s look more closely at some definitions you’ll find in most Six Sigma handbooks. The following are from the Six Sigma Dictionary:
• Z.st—short-term capability: Z.st is a short-term capability index of a process or a part (Z.st = 1.0 is bad performance, Z.st = 4.0 is mediocre, and Z.st = 6.0 is considered ‘world class’).
• Z.lt—long-term capability: Z.lt is the Z bench calculated from the overall standard deviation and the average output of the current process. Used with continuous data, Z.lt represents the overall process capability and can be used to determine the probability of making out-of-spec parts within the current process.
This definition makes the erroneous assumption that Shewhart Charts are probability charts. It assumes that using a number to replace a Shewhart Chart grants an ability to predict the future.
Six Sigma goes on to combine Z.st and Z.lt into what is called the “Z.shift”:
• Z.shift: Z.shift is the difference between Z.st and Z.lt. The larger the Z.shift, the more you are able to improve the control of the special factors identified in the subgroups. Z.shift is usually assumed to be 1.5 (Z.st = Z.lt + 1.5). However, it can be computed precisely for any given process by calculating its “between subgroup variation” using process capability analysis.
ASQ defines Z.shift as the “difference between short-term and long-term Z bench,” where “Z bench” is claimed to be “the resulting Z when all defects are moved into one tail of the distribution (PPM < LSL + PPM > USL), expressed as either long term or short term, depending on the PPM values used in the calculation.”
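To see what this “Z bench” amounts to, here is a hedged sketch of the calculation as quoted: the fractions in each tail are combined and converted back into a single z-value with the normal quantile function. The PPM figures are invented for illustration, and the whole exercise presumes exactly what a single number cannot guarantee: a stable, normally distributed process.

```python
# A hedged sketch of the "Z bench" idea quoted above: convert the
# total out-of-spec fraction (both tails combined) into one z-value.
from scipy.stats import norm

ppm_below_lsl = 500      # hypothetical parts per million below LSL
ppm_above_usl = 1000     # hypothetical parts per million above USL

p_total = (ppm_below_lsl + ppm_above_usl) / 1e6
z_bench = norm.ppf(1 - p_total)   # all defects moved into one tail
print(f"Z bench = {z_bench:.2f}")

# The assumption buried in this arithmetic: the output must be
# normally distributed and the process stable, neither of which a
# single number can guarantee.
```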
Future without time
To gain a better perspective on the real meaning of Six Sigma’s Z.shift equation, let’s start with a simple enumerative study, so popular with Six Sigma folks. Unlike with Shewhart Charts, in this example we must assume normally distributed data. The example illustrates how non-time-based data have been used to erroneously calculate time frames. It is based on the enumerative study that Mikel Harry used in his second attempt to prove his fallacious +/–1.5 sigma drift/shift, which he now recasts as an “allowance” or “correction.” (See Appendix B at the end of this column.) Our example, like Harry’s, is a hypothesis test using 30 samples:
In recent years, many people have taken an interest in housing investment. Suppose we are going to buy a house and want to determine whether one or more of six areas is more or less expensive than the others. We look at the five most recent house prices in each of the six areas and find a lot of variation and price overlap. We calculate the six average prices, but how are we to know whether there is any significant difference, given that each average has a variability associated with it?
We can determine whether there is a difference in the averages by testing the hypothesis based on the following equation:
Equation 1: Overall variation = Variation between groups + Variation within groups
Note that each of these sums is across all the data. (See Appendix A at the end of this column.)
Now, what have house prices to do with processes? Absolutely nothing. We could arrange the six sets of data in any order and get exactly the same result. However, if the sets of data represented a process, the sequence would be critical. For the housing data, the sequence in each group isn’t important. There’s no mention or relevance of time. However, despite time playing no part, Harry used this approach to prove short- and long-term capability.
If we used the same data to represent a process parameter measurement, we might take five samples every hour for six hours. Superficially, this situation looks very similar to the houses example, but the sequence in which house prices are collected has absolutely no effect whatsoever on the result. However, in the case of the process, time and the sequence of measurements are critical. Imagine a run or control chart where the data are mixed up and time is removed. The chart would have no meaning. The goal with process charting is prediction. We want to know, based on the past, what is likely to happen next, so we can better manage our process. This is the very reason for systematic data collection. However, Z.shift and hypothesis tests don’t take time into account.
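The point can be demonstrated in a few lines. The sketch below, with made-up prices for the six areas, computes the three terms of Equation 1, then scrambles both the order of the groups and the order of the values within each group. Every sum of squares is identical after shuffling, because the arithmetic is completely blind to sequence and time.

```python
# Equation 1 ignores ordering: shuffle the data and nothing changes.
# The six groups of five "house prices" below are hypothetical.
import random

groups = [[412, 398, 405, 420, 391],
          [388, 402, 395, 410, 399],
          [425, 417, 430, 409, 421],
          [401, 396, 408, 399, 404],
          [415, 422, 418, 409, 427],
          [393, 405, 400, 397, 411]]

def sums_of_squares(groups):
    values = [x for g in groups for x in g]
    grand = sum(values) / len(values)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within

print(sums_of_squares(groups))

random.shuffle(groups)              # reorder the six groups...
for g in groups:
    random.shuffle(g)               # ...and the values within each group
print(sums_of_squares(groups))      # identical results
```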
The delicious irony in most discussions about short- and long-term capability is that time is implied to be irrelevant in a proof about possible futures.
W. Edwards Deming comments on this in Out of the Crisis (MIT Press, 2000): “Analysis of variance, t-test, confidence intervals, and other statistical techniques taught in books, however interesting, are inappropriate because they provide no basis for prediction.... Chi-square and tests of significance, taught in some statistical courses, have no application here [in quality management].... Some books teach that the use of a control chart is a test of hypothesis: the process is in control, or it is not. Such errors may derail self-study.”
If we look in more detail at the Six Sigma Z.shift equation, as described by Mikel Harry in “How do I determine the value of short term and long term standard deviation?”, we find that he bases his “long term” on the overall variation, the left-hand side of Equation 1 above, and his “short term” on the “variation within groups,” the second term on the right-hand side. Most important, both terms in Equation 1 are summed across all the data, and neither has any reference to time. (See Appendix C at the end of this column.)
To complete the capability mish-mash, we should mention Pp and Ppk. These are defined as the “capability” (or should we say “incapability”) of an out-of-control process. An out-of-control process is not capable. Other sources claim that Pp and Ppk represent the nebulous “long term,” while Cp and Cpk are supposed to be “short term.” One can only shake one’s head in dismay at the merry-go-round.
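For completeness, here is a hedged sketch of the usual Cpk versus Ppk arithmetic as it is commonly presented: Cpk computed from the within-subgroup standard deviation, Ppk from the overall standard deviation. The data and limits are hypothetical, and the pooled SD is just one of several within-subgroup estimators in use (R-bar/d2 is another). Note that both numbers are computed from exactly the same data over exactly the same period.

```python
# A hedged sketch of the common Cpk vs. Ppk arithmetic.
# Data and specification limits are hypothetical.
import statistics

usl, lsl = 10.6, 9.4
subgroups = [[10.1, 9.9, 10.0, 10.2, 9.8],
             [10.0, 10.3, 9.9, 10.1, 10.0],
             [9.8, 10.0, 10.1, 9.9, 10.2]]

all_values = [x for sg in subgroups for x in sg]
mean = statistics.mean(all_values)

# Pooled within-subgroup SD (one common estimator).
ss_within = sum((x - statistics.mean(sg)) ** 2
                for sg in subgroups for x in sg)
df_within = sum(len(sg) - 1 for sg in subgroups)
sd_within = (ss_within / df_within) ** 0.5

sd_overall = statistics.stdev(all_values)

cpk = min(usl - mean, mean - lsl) / (3 * sd_within)
ppk = min(usl - mean, mean - lsl) / (3 * sd_overall)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")

# Both numbers come from exactly the same data and the same period;
# neither says anything about the future unless a Shewhart Chart
# first shows the process to be stable.
```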
Summary
In summary, “short term” is meaningless in the context of the Z.shift equation: it does not represent a period of time. “Long term” means all the data, not data from any especially long period; it covers exactly the same period of time as “short term.” Hence “short term” and “long term” are meaningless and should not be used, and “midterm” is equally nonsensical. Hypothesis testing used in this manner ignores all effects of time, and the Z.shift equation contributes nothing. Ignore Harry’s instruction that, “When computing Cp, it is necessary to employ the short-term standard deviation.”
By contrast, Shewhart Charts do incorporate time and are an effective means of monitoring and predicting process behavior. The broader message is that the key to good quality is to think and to question what you’ve heard, even if it has been parroted in 1,100 textbooks.
Appendices
Appendix A
Equation 1, written out in full for k groups of n values each, is:

Σ(X – X bar bar)² = Σ n(X bar – X bar bar)² + Σ(X – X bar)²

(The first and last sums run over all kn individual values; the middle sum runs over the k groups.)
Where:
X is an individual value
X bar is the average value for a group
X bar bar is the overall average of the values
Appendix B
In an attempt to prop up the fundamental metric of Six Sigma, Mikel Harry presented a new version of his “1.5” in 2003. This time, rather than a “shift,” he suggests it is a “correction,” required because of the error in the estimate of sigma from a finite set of data. The following is a simplified explanation of his approach.
Chi-square tables tell us the likelihood that the sigma for a population will fall within a certain range of the sample standard deviation (SD). For a sample of size n with SD = 1:

sqrt [(n – 1) / Chi-square.upper] < sigma < sqrt [(n – 1) / Chi-square.lower]

We can choose whatever value we like for n, but Harry chooses a special case of n = 30, and a confidence interval of 99.5 percent (rather than the more common 95 percent), giving:

sqrt (29 / 52.34) < sigma < sqrt (29 / 13.12)

or 0.74 < sigma < 1.49.
Harry then takes the right-hand side value, multiplies by 3 for 3-sigma control limits, subtracts 3, and gets his 1.5. If we use more reasonable values, say a control chart with 30 sets of 5 points, at a 95-percent confidence, we get:
0.91 < sigma < 1.11
or a factor of 0.33 instead of 1.5.
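The arithmetic above can be reproduced in a few lines. The chi-square quantile conventions below are my assumption, chosen because they reproduce the bounds quoted in the text.

```python
# A sketch reproducing the confidence intervals above with scipy.
# The quantile conventions are assumptions that match the quoted numbers.
from math import sqrt
from scipy.stats import chi2

def sigma_interval(df, q_lo, q_hi):
    """CI for sigma when the sample SD = 1, from chi-square quantiles."""
    return sqrt(df / chi2.ppf(q_hi, df)), sqrt(df / chi2.ppf(q_lo, df))

# Harry's case: n = 30 (df = 29), bounds at the 0.5 and 99.5 percent
# chi-square quantiles.
lo, hi = sigma_interval(29, 0.005, 0.995)
print(f"{lo:.2f} < sigma < {hi:.2f}")   # 0.74 < sigma < 1.49
print(f"factor = {3 * hi - 3:.2f}")     # roughly 1.5

# The alternative case: 30 subgroups of 5 (150 points; df = 149 assumed
# here), with bounds at the 5 and 95 percent quantiles.
lo, hi = sigma_interval(149, 0.05, 0.95)
print(f"{lo:.2f} < sigma < {hi:.2f}")   # about 0.91 < sigma < 1.11
print(f"factor = {3 * hi - 3:.2f}")     # about 0.33
```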
The value ranges from 1.0 to about 20, depending on the choice of conditions. However, none of these values is required, because Shewhart Chart control limits do not rely on probabilities. As Shewhart points out, they are “economic,” not “probability,” limits, and so none of these corrections has any relevance at all.
Appendix C
Z.shift = Z.st – Z.lt

With:

Z.st = |SL – T| / S.st
S.st = sqrt [SSw / k(n – 1)]
Z.lt = |SL – M| / S.lt
S.lt = sqrt [SSt / (nk – 1)]

where SL is the specification limit, T the target, M the overall average, SSw the sum of squares within groups, SSt the total sum of squares, k the number of groups, and n the group size.
Six Sigma’s Z.shift calculation makes a serious error. It calls the SSwithin figure “short term” and the SSoverall figure “long term.” As the preceding discussion and the equations above make clear, “short term” and “long term” do not occur anywhere: S.st is simply the within-groups SD, and S.lt is the overall SD.
It is also apparent from the calculations that the SSwithin term uses all the data, all nk values from the k groups of size n. Moreover, as discussed above, the test-of-hypothesis approach ignores the element of time altogether.
It might also be noted that SDoverall, the overall standard deviation, is not equal to the sum of SDw and SDb, as is often claimed. The sums-of-squares equation, SSoverall = SSw + SSb, is, however, correct.
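As a final illustration, here is a minimal sketch of the Z.shift arithmetic exactly as laid out above, with a hypothetical specification limit, target, and data. Reordering the groups or the values within them leaves every number unchanged, confirming that nothing in the calculation refers to time.

```python
# A minimal sketch of the Z.shift arithmetic from the equations above.
# Specification limit, target, and data are hypothetical.
import statistics

sl, target = 10.6, 10.0          # hypothetical spec limit (SL) and target (T)
groups = [[10.1, 9.9, 10.0, 10.2, 9.8],
          [10.0, 10.3, 9.9, 10.1, 10.0],
          [9.8, 10.0, 10.1, 9.9, 10.2]]

k, n = len(groups), len(groups[0])
all_values = [x for g in groups for x in g]
m = statistics.mean(all_values)  # overall average, M

ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
ss_total = sum((x - m) ** 2 for x in all_values)

s_st = (ss_within / (k * (n - 1))) ** 0.5   # the so-called "short term" SD
s_lt = (ss_total / (n * k - 1)) ** 0.5      # the so-called "long term" SD

z_st = abs(sl - target) / s_st
z_lt = abs(sl - m) / s_lt
print(f"Z.st = {z_st:.2f}, Z.lt = {z_lt:.2f}, Z.shift = {z_st - z_lt:.2f}")
```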
Comments
The Terminal Man
It's the title of a 1972 Michael Crichton novel, certainly worth reading if you haven't already. It may seem to have little to do with stats formulas, but for one thing: I strongly share the approach of QS 9000 and ISO/TS 16949 (and their like) that stats must be shop-floor business first of all, not foggy ivory-tower exercises. Thank you.
Love It!!
For what it's worth, I've used short-term capability to refer to a single sample set, perhaps one production run, whilst a long-term capability would include additional data or production runs. The primary point of that would be to convey that the initial Cp/Cpk values were from a short-term or small sample size. Of course, we would always subtract 1.5 from the value for long-term estimates! (j/k! - never have seen or agreed with that seemingly arbitrary number)
As you lovingly point out, a Shewhart chart will offer that same information over time.