A report of how a process performs is not only a function of process characteristics and sampling chance differences. It can also depend on sampling approach. For example, one person could describe a process as out of control, which would lead to activities that address process perturbations as abnormalities; another person could describe the same process as being in control.
To illustrate how different interpretations can occur, let's evaluate the time series data in figure 1 [1], which could be the completion time for five randomly selected daily procedural transactions in a hospital, insurance company, or one-shift manufacturing facility. The data are first assessed for process stability and then, if the process is stable, for capability relative to customer specifications of 95 to 105.
A standard statistical control chart guide, based on Walter Shewhart's guidelines [2, 3], would recommend using an x̄ and R control chart for assessing the stability of the process, as illustrated in figure 2.
Some courses teach that you shouldn't generate a control chart with only 10 subgroups, a caution intended to reduce the uncertainty of the standard deviation estimate. Having fewer than 25 subgroups or data points will only reduce the chance of detecting an out-of-control condition. Using a smaller number of data points in a control chart increases a beta-risk equivalent, where a true out-of-control condition may not be detected. However, smaller sample control charts that show an out-of-control condition should still be investigated because it is likely that an out-of-the-norm event occurred, given underlying assumptions used to create the chart.
Whenever a value on a control chart is beyond the upper control limit (UCL) or lower control limit (LCL), the process is said to be out of control. Out-of-control occurrences are called special-cause conditions and can trigger a causal problem investigation. Since so many out-of-control conditions are apparent in figure 2, causal investigation would be occurring very frequently. In addition, no process capability statement should be made about how this unstable process is expected to perform relative to its specification limits.
The x̄ chart in figure 2 is out of control; however, if one were to step back and look at the x̄ chart alone, it does look like the process would continue to produce mean values within the boundaries of the displayed chart axes; i.e., 92 to 108. If the process shows the average values ranging from 92 to 108 over the next 10 or 20 subgroups, should we consider the process stable? If we ignore the control chart definition of stable and replace it with the word "consistent," then we could consider that the process is consistent.
If the process subgroup means are consistent over time, and the x̄ and R chart identifies the process as out of control, or inconsistent, what is wrong? Maybe nothing is wrong. An x̄ and R chart was created by Shewhart to identify any "assignable" cause that causes the process mean to change so that a process operator may adjust the process to return the process mean to the historic values. This is a good question to ask of a single process in order to provide a control signal that drives an adjustment to the process. But it is not the best question to ask when a business wants to assess the overall performance of its business process relative to customer needs.
What is an appropriate generic sampling and control charting approach that would provide a high-level business and/or customer view of a process output, noting that this response may need to include performance differences between worker shifts, equipment, and/or locations? The customer or the business does not care how a process is executed; they expect the process output to be acceptable no matter how it was conducted. I refer to an assessment that addresses these needs as a 30,000-foot-level view.
The 30,000-foot-level view
For the x̄ chart portion of the x̄ and R control chart pair, the UCL and LCL are calculated from the relationships
$$\mathrm{UCL} = \bar{\bar{x}} + A_2\bar{R}, \qquad \mathrm{LCL} = \bar{\bar{x}} - A_2\bar{R}$$
where $\bar{\bar{x}}$ is the overall average of the subgroups, $A_2$ is a constant depending on subgroup size, and $\bar{R}$ is the average range within subgroups.
For the X chart from the XmR chart pair, the UCL and LCL are calculated from the relationships
$$\mathrm{UCL} = \bar{x} + 2.66\,\overline{MR}, \qquad \mathrm{LCL} = \bar{x} - 2.66\,\overline{MR}$$
where $\bar{x}$ is the average of the plotted values and $\overline{MR}$ is the average moving range between subgroups.
The limits for the x̄ chart of an x̄ and R chart pair are derived from within-subgroup variability ($\bar{R}$), while sampling standard deviations for the X chart of an XmR chart pair are calculated from between-subgroup variability ($\overline{MR}$).
The implication of this difference in control-chart calculations is that if there is a large natural-process variance component between subgroups, this variability could result in out-of-control signals for an x̄ and R control chart, while an XmR chart of the same situation could indicate an in-control process.
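The difference between the two limit calculations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis behind the article's figures: the data are simulated (hypothetical day-to-day standard deviation of 4 and within-day standard deviation of 1), the function names are invented for this sketch, and the constants are the standard table values for subgroups of five (A2 = 0.577) and for moving ranges of span two (2.66).

```python
import random
import statistics

A2_N5 = 0.577        # standard A2 constant for subgroup size 5
XMR_CONSTANT = 2.66  # 3 / d2, with d2 = 1.128 for a moving range of span 2


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the x-bar chart of an x-bar and R pair."""
    means = [statistics.fmean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = statistics.fmean(means)
    r_bar = statistics.fmean(ranges)  # within-subgroup variability only
    return grand_mean - A2_N5 * r_bar, grand_mean, grand_mean + A2_N5 * r_bar


def xmr_limits(values):
    """Return (LCL, center, UCL) for the X chart of an XmR pair."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.fmean(values)
    mr_bar = statistics.fmean(moving_ranges)  # between-subgroup variability
    return center - XMR_CONSTANT * mr_bar, center, center + XMR_CONSTANT * mr_bar


def simulate(days=10, per_day=5, between_sd=4.0, within_sd=1.0, seed=1):
    """Hypothetical data: each day has its own mean plus within-day noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(days):
        day_mean = rng.gauss(100.0, between_sd)
        data.append([rng.gauss(day_mean, within_sd) for _ in range(per_day)])
    return data


subgroups = simulate()
means = [statistics.fmean(g) for g in subgroups]

xbar_lcl, _, xbar_ucl = xbar_r_limits(subgroups)
x_lcl, _, x_ucl = xmr_limits(means)  # individuals chart of the daily means

print(f"x-bar chart limits: {xbar_lcl:.2f} to {xbar_ucl:.2f}")
print(f"X chart limits:     {x_lcl:.2f} to {x_ucl:.2f}")
print("means outside x-bar limits:",
      sum(not (xbar_lcl <= m <= xbar_ucl) for m in means))
print("means outside X limits:    ",
      sum(not (x_lcl <= m <= x_ucl) for m in means))
```

Because the x̄ chart limits use only the within-day ranges, they come out much narrower than the X chart limits whenever the day-to-day component dominates, which is the situation described above.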
Control charts only address process stability, not performance relative to customer needs. Process capability and performance statements can provide this form of insight when a process is stable. For processes that have a specification, typical process-capability statements are provided using process-capability indices such as Cp, Cpk, Pp, and Ppk [4]. However, interpreting these indices in terms of how well a process is meeting customer needs can be confusing and deceptive because reported values can be a function of how samples are drawn from the process.
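A small sketch can make the sampling dependence concrete. It assumes the common textbook definitions, in which Cp and Cpk use a within-subgroup standard deviation estimated as R̄/d2, while Pp and Ppk use the overall sample standard deviation. The function name, the equal-size-subgroup assumption, and the two-day data set are hypothetical and chosen only for illustration.

```python
import statistics

D2_N5 = 2.326  # standard d2 constant for subgroup size 5


def capability_indices(subgroups, lsl, usl):
    """Return Cp, Cpk, Pp, and Ppk for equal-size subgroups of five."""
    all_values = [x for g in subgroups for x in g]
    mean = statistics.fmean(all_values)

    # Short-term (within-subgroup) sigma from the average range.
    r_bar = statistics.fmean([max(g) - min(g) for g in subgroups])
    sigma_within = r_bar / D2_N5

    # Long-term (overall) sigma from all values pooled together.
    sigma_overall = statistics.stdev(all_values)

    def index_pair(sigma):
        potential = (usl - lsl) / (6 * sigma)
        actual = min(usl - mean, mean - lsl) / (3 * sigma)
        return potential, actual

    cp, cpk = index_pair(sigma_within)
    pp, ppk = index_pair(sigma_overall)
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}


# Hypothetical data: two days whose means differ by more than the within-day spread.
example = [[96, 97, 98, 99, 100], [100, 101, 102, 103, 104]]
for name, value in capability_indices(example, lsl=95, usl=105).items():
    print(f"{name}: {value:.3f}")
```

When most of the variability sits between subgroups, the within-based indices look better than the overall-based ones for the same data, so the reported capability depends on how the subgroups were formed.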
What is needed to address this potential sampling plan inconsistency is a consistent approach where the effect from this common-cause input noise occurs between subgroupings; e.g., day, week, or month. I refer to the sampling plan that accomplishes this as "infrequent subgrouping/sampling," and this process assessment perspective as a 30,000-foot view. When creating control charts at the 30,000-foot level, we must include between-subgroup variability within our control-chart limit calculations. Also needed is a report of how a process performs relative to customers' needs that is easier to understand than typical process-capability statements.
To address this customer-performance reporting in words that are easily understood, a 30,000-foot assessment provides a process-stability assessment in addition to a percentage nonconformance rate, as figure 3 illustrates using a probability plot.
For this randomly generated data set that had both a within- and between-subgroup variance component, a conclusion based on the two control charts is that the process is stable. The probability plot's null hypothesis test of data normality is rejected at a level of 0.05 because the reported probability plot p-value is less than 0.005; however, visually the best estimate line in the probability plot appears close enough to straight that a rough estimate for nonconformance could be reported in terms that everyone can easily understand—i.e., about 27-percent nonconformance.
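The kind of rough nonconformance estimate read from a probability plot can be approximated by fitting a normal distribution and summing the two tail areas outside the specification limits. The sketch below assumes a simple normal fit using the sample mean and standard deviation; the function name and the five sample values are hypothetical, and it does not reproduce the article's data or its 27-percent figure.

```python
from statistics import NormalDist, fmean, stdev


def estimated_nonconformance(values, lsl, usl):
    """Estimate the fraction outside [lsl, usl] from a fitted normal model."""
    fitted = NormalDist(fmean(values), stdev(values))
    below = fitted.cdf(lsl)        # area under the lower specification limit
    above = 1.0 - fitted.cdf(usl)  # area over the upper specification limit
    return below + above


# Hypothetical completion times, with specification limits of 95 to 105.
sample = [96.0, 104.0, 98.0, 102.0, 100.0]
rate = estimated_nonconformance(sample, lsl=95, usl=105)
print(f"Estimated nonconformance: {rate:.1%}")
```

For the 30,000-foot-level view, the values passed in would be collected so that between-subgroup variability is included in the fitted standard deviation.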
In figure 3's control-chart pair, it could also be noted how within-subgroup stability was assessed over time using a log transformation of standard deviation; i.e., Box-Cox transformation with lambda = zero. Because standard deviation can't be less than zero, a log-normal transformation can, in general, be used to model the skewness of this within-subgroup variability's distribution for 30,000-foot reporting.
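One way to sketch this log-transformed tracking of within-subgroup variability is to compute each subgroup's standard deviation, take its natural log, apply individuals-chart limits to the logged values, and then convert the limits back to the original scale. This is an assumption about the mechanics rather than a reproduction of the software used for figure 3; the function name and sample data are hypothetical, and every subgroup must have a standard deviation greater than zero for the log to exist.

```python
import math
import statistics

XMR_CONSTANT = 2.66  # 3 / d2, with d2 = 1.128 for a moving range of span 2


def log_sd_limits(subgroups):
    """Return (LCL, center, UCL) for subgroup standard deviations.

    Limits are computed on ln(s) and back-transformed, so the lower
    limit is always greater than zero.
    """
    log_sds = [math.log(statistics.stdev(g)) for g in subgroups]
    center = statistics.fmean(log_sds)
    moving_ranges = [abs(b - a) for a, b in zip(log_sds, log_sds[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    lcl = center - XMR_CONSTANT * mr_bar
    ucl = center + XMR_CONSTANT * mr_bar
    return math.exp(lcl), math.exp(center), math.exp(ucl)


# Hypothetical subgroups with standard deviations of 1, 2, 2, and 3.
example = [[1, 2, 3], [2, 4, 6], [1, 3, 5], [0, 3, 6]]
lcl, center, ucl = log_sd_limits(example)
print(f"LCL = {lcl:.3f}, center = {center:.3f}, UCL = {ucl:.3f}")
```

Because the back-transformed lower limit is strictly positive, a reduction in within-subgroup variability can, in principle, fall below it and be signaled.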
Summary
For the presented data, the 30,000-foot-level report changed how one would view the process's performance from considering that the process was not stable using a traditional x̄ and R control chart to a process that has a noncompliance rate of about 27 percent. This approximate unacceptability rate can be expected in the future unless something changes. To improve a process's common-cause level of performance when reported at the 30,000-foot level, the process must be enhanced, e.g., through an improvement project.
When 30,000-foot metric reporting is conducted throughout a business using a system such as Integrated Enterprise Excellence (IEE) [5] that enables automatic predictive scorecard metrics updates, the business as a whole can readily be analyzed to determine where improvement projects should be focused so that the enterprise as a whole benefits.
References
1. Integrated Enterprise Excellence, Volume III Improvement Project Execution: A Management and Black Belt Guide for Going Beyond Lean Six Sigma and the Balanced Scorecard, Forrest W. Breyfogle III, Citius Publishing, 2008.
2. Economic Control of Quality of Manufactured Product, W. A. Shewhart, D. Van Nostrand Co., New York, 1931.
3. Statistical Method from the Viewpoint of Quality Control, W. A. Shewhart, United States Department of Agriculture, 1939.
4. Statistical Process Control (SPC) Reference Manual, Second Edition, Chrysler Corporation, Ford Motor Company, General Motors Corporation, AIAG, 1995.
5. Integrated Enterprise Excellence, Volume II – Business Deployment: A Leader’s Guide for Going Beyond Lean Six Sigma and the Balanced Scorecard, Forrest W. Breyfogle III, Citius Publishing, 2008.
Comments
Three-Way Chart
Don't know enough about this process to say for sure if it is appropriate, but another alternative is to use a three-way chart as described in Donald Wheeler's book "Understanding Statistical Process Control".
Individuals vs. Xbar charts
Agree with the previous post, that 3 way charts (within/between) are a good way to monitor two sources of variation (e.g. within cavity, between cavity).
Also, regardless of the presence of other sources of variation between subgroups, individuals charts will always be less sensitive to process changes than xbar charts. Smaller shifts become detectable as the sample size increases. The inherent potential Type II errors found in individuals charts seem to be rarely discussed (assuming we are using SPC to detect small process changes when they occur).
Rational Subgrouping and Sampling
Whenever I see an Average/Range chart that looks like the one presented in this article, I always question the subgroup rationale. What sources of variation are within and between subgroups? Is the Xbar/R chart the most appropriate one based on context and the data collection scheme?
I would start off with the XmR chart as a first step in your analysis.
Rich DeRoeck
Wrong use of probability plot and transformation
I agree with the fundamental argument that if the variability between subgroups is significantly larger than the variability within subgroups, your X-Bar chart will have limits that say everything is out of control. Limits for the averages should be developed via an ANOVA method (or ANOM, essentially treating the averages as individuals) until such time (if ever) that the source of the large between-group variability is discovered. If one does a quick ANOVA, you find that 96% of the total variability is the variability between “days”.
However, there are two big flaws in the analysis here. Where this went off the track is the notion that one uses a probability plot with all 50 data points (appropriate if you have rational subgroups which is not the case). The graph and subsequent analysis implies that the points are all from the same distribution and clearly they are not. In a situation such as this, where the between group variability overwhelmingly dominates the within variability, a sample size of 1/day is more than sufficient. The conclusion this probability plot provides that the data is not normal is erroneous and is further exacerbated by implying a transformation is required. If one does an Anderson-Darling statistic on the daily averages (n=10), the p-value is 0.27, therefore not rejecting the hypothesis that the data is normal. Even if a transformation is required (which it isn’t), blindly accepting a recommendation of a lognormal distribution makes absolutely no physical sense. Lognormal distributions are inherently used when the data spans several magnitudes and that’s not the case here.
Response to previous comments
Thanks for your comments:
1. Three-way control charting is applied, for example, when there is variability within a part and several parts are selected within subgroups that are tracked over time using a control chart. This was not the case in the example described; however, the three-way control chart does use an individuals chart of subgroup means, which is similar to the described 30,000-foot-level reporting.
2. Agree that the hypothesis of normality from the probability plot would be rejected for this data set; hence, the estimates of non-compliance will have some associated error because of this. However, even for this extreme example, where there are two distinct distributions, the probability plot does not depart from a straight line as badly as one might initially expect; hence, we can still provide a rough estimate of the percentage not conforming. In all likelihood, this non-normality topic would not have been brought up if the capability statement were presented in the article using Minitab’s capability analysis routine, which does not include a normality assessment when providing its process capability reporting. The probability plot non-conformance result noted in the article is the same as what could have been reported using Minitab’s capability analysis routine output "PPM total". Also, the 30,000-foot-level charting approach would lead to the appropriate behavior: understanding what could be done to improve the process.
3. Agreed that the next step that one would take when attempting to reduce the amount of non-conformance in the 30,000-foot-level charting would be to create hypotheses to test out various theories. One theory might create a hypothesis test for assessing variability between days relative to within days, which would be significant. This could then lead to investigation why this is occurring and what might be done to reduce the source of this between subgroup variability. If an improvement were made, the 30,000-foot-level control chart would transition to a new level of stability that had a reduced rate of non-conformance shown in the probability plot. Note how this behavior is quite different than addressing all the special cause conditions that were created using traditional control charts, which could have led to much firefighting.
4. A control charting of standard deviation could have been made without a transformation. The standard deviation would have been in control for this set of data; however, zero would have been within the control limits. Since standard deviation cannot physically be below zero, this leads us to a dilemma. We could have moved the control limit on the lower side to zero; however, there would have been no way to detect if the within-subgroup variability reduced because of a change. A more general approximate approach to address this situation is to take the log of standard deviation. Transformations should be made only when they make physical sense. Since standard deviation can never get below zero, the log-normal distribution is a general, easy to use, transformation that fits fairly well and makes physical sense for this situation.
5. Subgrouping is very important. With 30,000-foot-level reporting, we need to have all common-cause input variability to occur between subgroups.
6. Relative to the type II error comment, the primary goal of a 30,000-foot-level assessment is to not detect small changes but describe at a high level how the process is performing relative to stability and customer needs. When our common-cause variability from this high-level view is not providing what we desire then process improvements are needed. Improvements to the process are then demonstrated when the 30,000-foot-level chart transitions to a new, improved level of stability. Note, this is different than traditional control charting which has a primary intent of identifying when special cause occurs so that these problems can be addressed in a timely fashion.
Transformation confusion followup
#2 - “Rough estimate of the percentage not conforming” – here is where Wheeler makes a strong argument that we should not transform. Our estimate is just that. It is an estimate that has variability. With the data that Breyfogle presented, I bristle at the mere mention of transformation. Where does that obsession come from? It is not borne in the data. There is no justification for it and if we use voodoo statistics, we do become our own worst enemy. Even Breyfogle agrees that it is not from the same population – therefore we NEVER transform just to transform so some black-box statistic comes out. If we want to live in theory and not in reality, then the process capability is 0, because the process is not in control. Obviously managers will not tolerate such a thing. So why are we trying to be so precise if it is “out of control”? If you accept the argument that it should be the subgroup averages that should be tracked (which I think we both agree – I would conclude and agree with any who state there are really 10 data points to worry about – the subgroup averages, given that over 96% of the variability is between), then why are we going through distributional gyrations with 50 data points? I would agree with the paper if it had stopped at the 3 way graph and had gone no further.
As for the control charting of the standard deviation in #4, color me confused. You can easily do a s-chart vs a R-chart. For this, the lower limit is zero for n=5 according to my constants table. Why make it harder? Maybe I didn’t understand your point and I’ll await a reply. But we know from statistical theory that the distribution of standard deviations for low sample sizes is not normally distributed. Why would we force it to be when we already have established methods to deal with it?
Response to: Transformation confusion followup
Sorry about not responding sooner, but I did not see this comment until someone recently pointed it out to me.
It is interesting that merely mentioning data transformation can draw strong emotions from some people. One needs to keep in mind that Wheeler, when making the statement that transformations are not necessary, is using a traditional control charting strategy where focus is given to control charting individual processes; e.g., creating a control chart for each machine individually, where ten machines may be manufacturing a part. Wheeler's primary emphasis is not making process capability statements of how the overall process (e.g., from 10 machines) is doing relative to specification requirements.
What I described in this article is not real-time control charting of individual processes but 30,000-foot-level performance tracking over time, with the inclusion of how the overall process is doing relative to specification needs; i.e., collectively evaluating all ten machines over time and providing a process capability/performance statement relative to how well the overall process (from the 10 machines) is performing relative to customer needs.
When one is making a process capability statement, it is important to consider an appropriate transformation; otherwise, a good fit cannot be made for the estimate, just as a good fit is needed for any engineering estimate made from a model. Some data are not normally distributed by their nature; for example, the time it takes to complete a process. The lower boundary is zero. If the process operates near this boundary, then the distribution will tend to be skewed. A log-normal distribution can fit this situation well as a model for making a process capability/performance statement.
Responses to the specific points made in the comment:
* From a 30,000-foot-level approach, the process was in control; hence, a process capability/performance statement could be made. This statement is even predictive, which means that if this non-conformance rate is undesirable the process needs to be improved.
* The 3-way graph provides no process capability/performance statement, which the 30,000-foot-level charting provides; hence, the 3-way control chart is not adequate for what this article suggests.
* With the R-bar chart providing a 0 boundary, it can be very difficult to determine if the within subgroup variability of the process improved, since it is physically impossible to have an out of control condition below zero. This shortcoming is overcome with 30,000-foot-level chart of standard deviation that has a log-normal transformation.
control charting & capabilitying
I wonder whether USA is an acronym for Using Statistical A ...: if Statistix were an effective Tool, its so much intensive use should have resulted in a better World, whatever the meaning of this Statement. Which is not; for many reasons; maybe the root cause - or reason - is that Statistix is like the mythical Panacea - just a myth. Thank you.