Forrest Breyfogle—New Paradigms

Six Sigma

NOT Transforming the Data Can Be Fatal to Your Analysis

A case study, with real data, describes the need for data transformation.

Published: Wednesday, September 16, 2009 - 09:16

Not surprisingly, there was controversy over Forrest Breyfogle's article, "Non-normal Data: To Transform or Not to Transform," written in response to Donald Wheeler’s article "Do You Have Leptokurtophobia?" Wheeler continued the debate with "Transforming the Data Can Be Fatal to Your Analysis." This article is Breyfogle’s response to Wheeler’s latest column.


In his second article, "Transforming the Data Can Be Fatal to Your Analysis," Donald Wheeler stated: "out of respect for those who are interested in learning how to better analyze data, I feel the need to further explain why the transformation of data can be fatal to your analysis."


My motivation for writing this article is not only improved data analysis but also the formulation of analyses that address a more significant business performance reporting question within the assessment; that is, a paradigm shift from the objectives of the traditional four-step Shewhart system that Wheeler referenced in his article.

The described statistical business performance charting methodology can, for example, reduce firefighting when the approach replaces organizational goal-setting red-yellow-green scorecards, which often have no structured plan for making improvements. This article will show, using real data, why an appropriate data transformation can be essential to determine the best action or non-action to take when applying this overall system in both manufacturing and transactional processes.

Should Traditional Control Charting Procedures Be Enhanced?

I appreciate the work of the many statistical-analysis icons. For example, more than 70 years ago Walter Shewhart introduced statistical control charting. W. Edwards Deming extended this work to the business system with his profound knowledge philosophy and introduced the terminology common and special cause variability. I also respect and appreciate the work that Wheeler has done over the years. For example, his book Understanding Variation has provided insight to many.

However, Wheeler and I have a difference of opinion about the need to transform data when a transformation makes physical sense. The reason for writing this article is to provide additional information on the reasoning for my position. I hope that this supplemental explanation will provide readers with enough insight so that they can make the best logical decision relative to considering data transformations or not.

In his last article, Wheeler commented: "However, rather than offering a critique of the points raised in my original article, he [Breyfogle] chose to ignore the arguments against transforming the data and to simply repeat his mantra of ‘transform, transform, transform.' "

In fact, I had agreed with Wheeler’s stated position that, "If a transformation makes sense both in terms of the original data and the objectives of the analysis, then it will be okay to use that transformation."

What I did take issue with was his statement: "… Therefore, we do not have to pre-qualify our data before we place them on a process behavior chart. We do not need to check the data for normality, nor do we need to define a reference distribution prior to computing limits. Anyone who tells you anything to the contrary is simply trying to complicate your life unnecessarily."

In my article, I stated "I too do not want to complicate people’s lives unnecessarily; however, it is important that someone’s over-simplification does not cause inappropriate behavior."

Wheeler repeatedly criticized my selection of random-data-set parameters and the fact that I did not use real data. However, he failed to comment on the additional points I made about a fundamental issue that was lacking in his original article, one that goes beyond the transformation question. This important issue is the reporting of process performance relative to customer requirements, i.e., a goal or specification limits.

This article will elaborate on the topic using real data, which will lead to the same conclusion as my previous article: For some processes, an appropriate transformation is a very important step in leading to the most suitable action or non-action. In addition, this article will describe how this metric-reporting system can create performance statements, offering a prediction of how the process is performing relative to customer needs. It is important to reiterate that appropriate transformation selection, when necessary, needs to be part of this overall system.

In his second article, Wheeler quoted analysis steps from Shewhart’s Economic Control of Quality of Manufactured Product. These steps, which were initiated over seven decades ago, focus on an approach to identify out-of-control conditions. However, the step sequence did not address whether the process output was capable of achieving a desired performance output level, i.e., expected process non-conformance rate from a stable process.

Wheeler referenced a study "… encompassing more than 1,100 probability models where 97.3 percent of these models had better than 97.5-percent coverage at three-sigma limits." From this statement, we could infer that Wheeler believes industry should be satisfied with a false-signal rate greater than 2 percent. For processes that do not follow a normal distribution, this could translate into huge business costs and equipment downtime spent searching for special-cause conditions that are in fact common cause. After an organization chases phantom special-cause occurrences for some period of time, it is not surprising that it would abandon control charting altogether.
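The inflated false-signal rate for a skewed, zero-bounded metric is easy to demonstrate. The sketch below (illustrative, not from the study Wheeler cited) simulates a perfectly stable exponential process, builds three-sigma individuals-chart limits from the average moving range, and counts how many in-control points nevertheless fall outside the limits; every such point is a false alarm.

```python
# Sketch (not from the article): estimate the false-alarm rate of a
# three-sigma individuals chart when the underlying process is skewed.
# An exponential distribution stands in for a zero-bounded metric such
# as hold time; every stable point beyond the limits is a false alarm.
import random
import statistics

random.seed(1)
data = [random.expovariate(1.0) for _ in range(10_000)]  # stable, skewed process

# Individuals-chart limits from the average moving range (2.66 = 3/1.128).
moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
mr_bar = statistics.mean(moving_ranges)
center = statistics.mean(data)
ucl = center + 2.66 * mr_bar
lcl = center - 2.66 * mr_bar  # negative here, though the metric cannot be

false_alarms = sum(x > ucl or x < lcl for x in data)
print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
print(f"false-alarm rate: {false_alarms / len(data):.1%}")
```

For a normally distributed stable process the same chart would flag only about 0.3 percent of points; the skewed process triggers several times that rate, all of it wasted investigation.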

Wheeler also points out how traditional control charting has been around for more than 70 years and how the limits have been thoroughly proven. I don't disagree with the three sampling standard deviation limits; however, how often is control charting really used throughout businesses? In the 1980s, there was a proliferation of control charts; however, look at us now. Has the usage of these charts continued and, if they are used, are they really applied with the intent originally envisioned by Shewhart? I suggest there is not the widespread usage of control charts that quality-tools training classes would lead students to believe. But why is there not more frequent usage of control charts in both manufacturing and transactional processes?

To address this underutilization, I am suggesting that while the four-step Shewhart methodology has its applicability, we now need to revisit how these concepts can better be used to address the needs of today's businesses, not only in manufacturing but transactional processes as well.

To assess this tool-enhancement need, consider that the output of a process (Y) is a function of its inputs (Xs) and its step sequence, which can be expressed as Y=f(X). The primary purpose of Wheeler's described Shewhart four-step sequence is to identify when a special cause condition occurs so that an appropriate action can be immediately taken. This procedure is most applicable to the tracking of key process input variables (Xs) that are required to maintain a desired output process response for situations where the process has demonstrated that it provides a satisfactory level of performance for the customer-driven output, i.e., Y.

However, from a business point of view, we need to go beyond what is currently academically provided and determine what the enterprise needs most as a whole. One business-management need is an improved performance reporting system for both manufacturing and transactional processes. This enhanced reporting system needs to structurally evaluate and report the Y output of a process for the purpose of leading to the most appropriate action or non-action.

An example of a transactional process that needs such a charting system is telephone hold time in a call center. For situations like this, there is a natural boundary: hold time cannot fall below zero. For this type of situation, a log-normal transformation can often adequately describe the distribution of call-center hold time. Is this distribution a perfect fit? No, but it makes physical sense and is often an adequate representation of what physically happens; that is, a common-cause distribution bounded by zero, with a tail of hold times that can grow long.

From a business viewpoint, what is desired at a high level is a reporting methodology that describes what the customer experiences relative to hold time, the Y output for this process. This is an important business requirement that goes beyond the four-step Shewhart process.

I will now describe how to address this need through an enhancement to the Shewhart four-step control charting system. This statistical business performance charting (SBPC) methodology provides a high-level view of how the process is performing. With SBPC, we are not attempting to manage the process in real time. With this performance measurement system, we consider assignable-cause differences between sites, working shifts, hours of the day, and days of the week to be sources of common-cause input variability to the overall process; in other words, variability that Deming would assign to the responsibility of management.

With the SBPC system, we first evaluate the process for stability. This is accomplished using an individuals chart where there is an infrequent subgrouping time interval so that input variability occurs between subgroups. For example, if we think that Monday's hold time could be larger than the other days of the week because of increased demand, we should consider selecting a weekly subgrouping frequency.
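The subgrouping idea above can be sketched in a few lines. In this illustrative example (the numbers and the Monday effect are invented, not from the article), daily hold-time averages are rolled up to one value per week, so that day-of-week differences act as common-cause variation within each subgroup, and the individuals-chart limits are computed on the weekly series.

```python
# Sketch (illustrative data, not from the article): roll daily call-center
# hold-time averages up to weekly values, then compute individuals-chart
# limits on the weekly series so day-to-day differences are treated as
# common-cause variation within each subgroup.
import random
import statistics

random.seed(7)
# 12 weeks of daily average hold times (minutes); Mondays run longer.
weeks = []
for _ in range(12):
    daily = [random.gauss(4.0 if day == 0 else 3.0, 0.5) for day in range(7)]
    weeks.append(statistics.mean(daily))  # one value per weekly subgroup

mr_bar = statistics.mean(abs(b - a) for a, b in zip(weeks, weeks[1:]))
center = statistics.mean(weeks)
ucl, lcl = center + 2.66 * mr_bar, center - 2.66 * mr_bar
print(f"weekly center = {center:.2f}, limits = ({lcl:.2f}, {ucl:.2f})")
```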

With SBPC, we are not attempting to adjust the number of operators available to respond to phone calls in real time since the company would have other systems to do that. What SBPC does is assess how well these process-management systems are addressing the overall needs of its customers and the business as a whole. The reason for doing this is to determine which of the following actions or non-actions are most appropriate, as described in Table 1.

Table 1: Statistical Business Performance Charting (SBPC) Action Options

  1. Is the process unstable, or did something out of the ordinary occur? If so, determine whether action is required.
  2. Is the process stable and meeting internal and external customer needs? If so, no action is required.
  3. Is the process stable but does not meet internal and external customer needs? If so, process improvement efforts are needed.

The four-step Shewhart model that Wheeler referenced focuses only on step number one.

In my previous article, I used randomly generated data to describe the importance of considering a data transformation when there is a physical reason for such a consideration. I thought that it would be best to use random data since we knew the answer; however, Wheeler repeatedly criticized me for selecting a too-skewed distribution and not using real data.

I will now use real data, which will lead to the same conclusion as my previous article.

Real-data Example

A process needs to change periodically from producing one product to producing another. It is important for the changeover time to be as short as possible since the production line is idle during changeover. The example data used in this discussion give a true enterprise view of a business process: the time required to change from one product to another on a process line. The data set includes six months of data from 14 process lines involving three different types of changeovers, all from a single factory. The factory is consistently managed to rotate through the complete product line as needed to replenish stock as it is purchased by customers. Corporate leadership considers this process, as it is run today, predictable enough to manage a relatively small finished-goods inventory. With this process knowledge, what is the best way to report the process behavior with a chart?

Figure 1 is an individuals chart of changeover time. From this control chart, which uses no transformation, as Wheeler suggests, nine incidents are noted that should have been investigated in real time. In addition, one would conclude that this process is not stable, i.e., that it is out of control. But is it?

Figure 1: Individuals Chart of Changeover Time (Untransformed Data)

One should note that in Figure 1 the lower control limit is a negative number, which makes no physical sense since changeover time cannot be less than zero.

Wheeler makes the statement, "Whenever we have skewed data there will be a boundary value on one side that will fall inside the computed three-sigma limits. When this happens, the boundary value takes precedence over the computed limit and we end up with a one-sided chart." He also says, "The important fact about nonlinear transformations is not that they reduce the false-alarm rate, but rather that they obscure the signals of process change."

These statements seem to me to be contradictory. Consider that the response we are monitoring is time, which has a zero boundary and for which lower values are better. Under Wheeler's guidelines, our lower control limit will often be zero (e.g., Figure 1). Since the purpose of an improvement effort is to reduce changeover time, a genuine reduction can be difficult to detect using this "one-sided" control chart when the lower control limit sits at the boundary condition, zero in this case.

Wheeler's article makes no mention of what the process customer requirements are or the reporting of its capability relative to specifications, which is an important aspect of lean Six Sigma programs. Let's address that point now.

The current process has engineering evaluate any changeover that takes longer than 12 hours. This extra step is expensive and distracts engineering from its core responsibilities. The organization's current reporting system does not show how frequently this engineering intervention occurs.

In his article, Wheeler made no mention of making such a computation for either normal or non-normal distributed situations. This need occurs frequently in industry when processes are to be assessed on how they are performing relative to specification requirements.

Let's consider making this estimate from a normal probability plot of the data, as shown in Figure 2. This estimate would be similar to a practitioner manually calculating the value using a tabular z-value with a calculated sample mean and standard deviation.


Figure 2: Probability Plot of the Untransformed Data

We note from this plot that the data do not follow a straight line; hence, the normal distribution does not appear to be a good model for this data set. Because of this lack of fit, the estimate of the percentage of time exceeding 12 hours, i.e., 46 percent (100 - 54 = 46), is not accurate.
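The kind of z-table calculation described above can be sketched as follows. The data here are simulated skewed changeover times, not the article's data set, so the numbers are illustrative only; the point is that a normal-model tail estimate and the plain empirical fraction disagree when the distribution is skewed.

```python
# Sketch (simulated skewed data, not the article's data set): estimate the
# fraction of changeovers exceeding 12 hours from a fitted normal model
# and compare it with the plain empirical fraction.
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(3)
# Log-normal changeover times (hours): skewed, bounded below by zero.
times = [math.exp(random.gauss(2.0, 0.6)) for _ in range(590)]

spec = 12.0
normal_fit = NormalDist(mean(times), stdev(times))
normal_est = 1 - normal_fit.cdf(spec)            # what a z-table would give
empirical = sum(t > spec for t in times) / len(times)

print(f"normal-model estimate of P(time > 12 h): {normal_est:.1%}")
print(f"empirical fraction over 12 h:            {empirical:.1%}")
```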

We need to highlight that technically we should not be making an assessment such as this because the process is not considered to be in control when plotting untransformed data. For some, if not most, processes that have a long tail, we will probably never appear to have an in-control process, no matter what improvements are made; however, does that make sense?

The output of a process is a function of its steps and input variables. Doesn’t it seem logical to expect some level of natural variability from input variables and the execution of process steps? If we agree to this assumption, shouldn’t we expect a large percentage of process output variability to have a natural state of fluctuation; i.e., be stable?

To me this statement is true for most transactional and manufacturing processes, with the exception of things like naturally auto-correlated data situations such as the stock market. However, with traditional control charting methods, it is often concluded that the process is not stable even when logic tells us that we should expect stability.

Why is there this disconnection between our belief and what traditional control charts tell us? The reason is that underlying control-chart-creation assumptions and practices are often not consistent with what occurs naturally in the real world. One of these practices is not using suitable transformations when they are needed to improve the description of process performance, for instance, when a boundary condition exists.

It is important to keep in mind that the reason for process tracking is to determine which actions or non-actions are most appropriate, as described in Table 1. Let’s now return to our real-data example analysis.

For this type of bounded situation, a log-normal distribution will often fit the data well, since changeover time cannot physically fall below zero, just as in the previously described call-center situation. With the SBPC approach, we first want to assess process stability. If a process has a current region of stability, we can consider the process predictable. Data from the latest region of stability can be treated as a random sample of the future, given that the process continues to operate as it has in the recent region of stability.

When these continuous data are plotted on an appropriate probability plot coordinate system, a prediction statement can be made: What percentage of time will the changeover take longer than 12 hours? Figure 3 shows a control chart in conjunction with a probability plot. A netting-out of the process analysis results is described below the graphics: The process is predictable where about 38 percent of the time it takes longer than 12 hours.
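The prediction step described above can be sketched briefly. Using simulated data rather than the article's (so the resulting percentage is illustrative, not the article's 38 percent), the log-normal fit is obtained by working with the logged changeover times, and the prediction statement is read off the fitted distribution.

```python
# Sketch (simulated data, not the article's): fit a log-normal model by
# working with the logged changeover times, then state the prediction
# "what fraction of changeovers will exceed 12 hours?" from that fit.
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(3)
times = [math.exp(random.gauss(2.0, 0.6)) for _ in range(590)]  # hours

logs = [math.log(t) for t in times]
log_fit = NormalDist(mean(logs), stdev(logs))

spec = 12.0
p_over = 1 - log_fit.cdf(math.log(spec))  # P(time > 12 h) under the fit
print(f"predicted fraction of changeovers over 12 h: {p_over:.1%}")
```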

Unlike the previous non-transformed analysis, we would now conclude that the process is stable, i.e., in control. Unlike the non-transformed analysis, this analysis considers the skewed tails, which we expect from this process, to be the result of common-cause process variability and not a source for special-cause investigation. Because of this, we conclude, for this situation, that the transformation provides an improved process-discovery foundation to build upon, compared with a non-transformed analysis.

The process is predictable where about 38 percent of the time it takes longer than 12 hours.


Figure 3: Report-out.

Let’s now compare the report-outs of both the untransformed (Figure 1) and transformed data (Figure 3). Consider what actions your organization might take if presented each of these report-outs separately.

The Figure 1 report invites attempts to explain common-cause events as though each occurrence has an assignable cause that needs to be addressed. Actions resulting from this line of thinking can lead to much frustration and to unnecessary process-procedural tampering that increases process-response variability, as Deming illustrated in his funnel experiment.

When Figure 3's report-out format is used in our decision making process, we would assess the options in Table 1 to determine which action is most appropriate. With this data-transformed analysis, number three in Table 1 would be the most appropriate action, assuming that we consider the 38 percent frequency of occurrence estimate above 12 hours excessive.

Wheeler stated, "If you are interested in looking for assignable causes you need to use the process behavior chart (control chart) in real time. In a retrospective use of the chart you are unlikely to ever look for any assignable causes …"

This statement is contrary to what is taught and applied in the analyze phase of lean Six Sigma's define-measure-analyze-improve-control (DMAIC) project execution roadmap. Within the DMAIC analyze phase, the practitioner statistically evaluates historical data to gain insight into what might be done to improve the process targeted by the improvement project.

It can often be very difficult to determine assignable causes with the real-time search-for-signals approach that Wheeler suggests, especially when the false-signal rate is amplified because an appropriate transformation was not made. Also, with this approach, we often do not have enough data to test the hypothesis that a particular assignable cause is real; hence, we might think that we have identified an assignable cause from a signal when it was in fact a chance occurrence that did not negatively impact our process. In addition, when we consider that organizations can have thousands of processes, the resources required to support this search-for-signals effort can be huge.

Deming in Out of the Crisis stated, "I should estimate that in my experience most troubles and most possibilities for improvement add up to proportions something like this: 94 percent belong to the system (responsibility of management), 6 percent [are] special."

With a search-for-signal strategy it seems as if we are trying to resolve the 6 percent of issues that Deming estimates. It would seem to me that it would be better to focus our efforts on how we can better address the 94 percent common-cause issues that Deming describes.

To address this matter from a different point of view, consider extending a classical assignable cause investigation from special cause occurrences to determining what needs to be done to improve a process’ common-cause variability response if the process does not meet the needs of the customer or the business.

I have found with the SBPC reporting approach that assignable causes that negatively impact process performance from a common-cause point of view can best be determined by collecting data over some period of time to test hypotheses that assess differences between such factors as machines, operators, day of the week, raw-material lots, and so forth. When undertaking process improvement efforts, the team can use data collected within the most recent region of stability to test compiled hypothesis statements that it thinks could affect the process output level. These analyses can provide guiding insight into process-improvement opportunities. This is a more efficient analytical discovery approach than a search-for-signals strategy in which customer needs are not defined or addressed relative to current process performance.
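One simple way to test such a hypothesis, sketched below with invented data, is a permutation test (the article does not prescribe a specific test; this is one reasonable choice). Log changeover times from the stable region are split by a hypothesized factor, shift, and the observed difference in means is compared against differences obtained by randomly reshuffling the shift labels.

```python
# Sketch (illustrative data, not the article's): a permutation test on
# logged changeover times from the stable region, asking whether "shift"
# explains a real difference in the mean. The shift effect is simulated.
import random
import statistics

random.seed(11)
shift_a = [random.gauss(2.0, 0.5) for _ in range(80)]  # log-hours, shift A
shift_b = [random.gauss(2.3, 0.5) for _ in range(80)]  # shift B runs longer

observed = statistics.mean(shift_b) - statistics.mean(shift_a)
pooled = shift_a + shift_b
extreme = 0
for _ in range(2000):
    random.shuffle(pooled)  # reshuffle the shift labels
    diff = statistics.mean(pooled[80:]) - statistics.mean(pooled[:80])
    if abs(diff) >= abs(observed):
        extreme += 1
p_value = extreme / 2000
print(f"observed difference = {observed:.2f} log-hours, p = {p_value:.3f}")
```

A small p-value would point the team toward shift-to-shift practices as a place to focus its common-cause improvement work.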

For the example SBPC plot in Figure 3, the team discovered through hypotheses tests that there was a significant difference in the output as a function of the type of change, the shift that made the change, and the number of performed tests made during the change.

This type of information helps the team determine where to focus its efforts in deciding what should be done differently to improve the process. Improvement to the system would be demonstrated by a statistically significant shift of the SBPC report-out to a new, improved level of stability. This system of analysis and discovery applies both to processes that need data transformation and to those that don't; that is, the SBPC system highlights, and does not obscure, the signals of process change.

Detection of an Enterprise Process Change

Wheeler stated, "However, in practice, it is not the false-alarm rate that we are concerned with, but rather the ability to detect signals of process changes. And that is why I used real data in my article. There we saw that a nonlinear transformation may make the histogram look more bell-shaped, but in addition to distorting the original data, it also tends to hide all of the signals contained within those data."

Let's now examine how well a shift can be detected with our real-data example, using both non-transformed and transformed data control charts. A traditional test for a process control chart is the average run length until an out-of-control indication is detected, typically for a one standard deviation shift in the mean. This does not translate well to cycle-time-based data where there are already values near the natural limit of zero. For our analysis, we will assume a 30-percent reduction in the average cycle time to be somewhat analogous to a shift of one standard deviation in the mean of a normally distributed process.
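The comparison can be sketched with simulated data mimicking the article's setup (590 changeovers, a 30-percent reduction after point 300); the data and the resulting signal counts are illustrative, not the article's. Limits are frozen from the pre-change data. On the raw chart the lower limit is negative, so no single point can ever fall below it to register the improvement; on the log scale the lower limit is a usable value, so below-limit points can appear (how many depends on the simulated draw).

```python
# Sketch (simulated, mimicking the article's setup): 590 changeovers with
# a 30% cycle-time reduction after point 300. Limits are frozen from the
# first 300 points; below-LCL points after the change signal improvement.
import math
import random
import statistics

random.seed(5)
before = [math.exp(random.gauss(2.0, 0.6)) for _ in range(300)]
after = [0.7 * math.exp(random.gauss(2.0, 0.6)) for _ in range(290)]

def limits(values):
    """Individuals-chart limits from the average moving range."""
    mr_bar = statistics.mean(abs(b - a) for a, b in zip(values, values[1:]))
    center = statistics.mean(values)
    return center - 2.66 * mr_bar, center + 2.66 * mr_bar

# Raw chart: the LCL is negative, so the boundary at zero takes over and
# no point can ever fall below it to signal the reduction.
raw_lcl, _ = limits(before)
raw_signals = sum(t < max(raw_lcl, 0.0) for t in after)

# Log chart: the LCL is a real value on the log scale, so improvement can
# register as points below the limit.
log_lcl, _ = limits([math.log(t) for t in before])
log_signals = sum(math.log(t) < log_lcl for t in after)

print(f"raw LCL = {raw_lcl:.2f} (negative, so bounded at zero)")
print(f"below-LCL signals of improvement: raw = {raw_signals}, log = {log_signals}")
```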

In our sample data, there were 590 changeovers in the six-month period. The 30-percent reduction in cycle time was introduced after point 300. A comparison of the transformed and untransformed process behavior charting provides a clear example of the benefits of the transformation, noting that, for the non-transformed report-out, a lower control limit of zero, per Wheeler's suggestion, was included as a lower bound reference line.

In Figure 4a, the two charts show the untransformed-data analysis; in the upper chart, the process appears to have become stable after the simulated process change. When staging is added (lower chart in Figure 4a), the new stage identifies four special-cause incidents that the transformed-data analysis treats as common-cause events, as shown in the lower chart in Figure 4b.


 Figure 4a: Non-transformed Analysis of a Process Shift


Figure 4b: Transformed Analysis of a Process Shift


The two transformed charts in Figure 4b show a change in the process with the introduction of the special-cause indications after the simulated change was implemented.

When staging is introduced into this charting, the special-cause indications are eliminated; i.e., the process is considered stable after the change.

Using Wheeler’s guidelines, a simple reduction in process cycle time would appear to be the removal of special causes. This is fine in the short term, but as we collect more data on the new process and introduce staging into the charting, we would surely, for skewed process performance, return to a situation where special-cause signals recur, as shown in the lower chart in Figure 4a. We should highlight that these special-cause events appear as common-cause variability in the transformed-data analysis shown in the lower chart in Figure 4b.

If you follow a guideline of examining a behavioral chart that has an appropriate transformation, you would have noted a special cause occurring when the simulated process change was interjected. From this analysis, the success of the process improvement effort would have been recognized and the process behavior chart limits would be re-set to new limits that reflect the new process response level.

Since the transformed-data process is stable, we can report a best estimate of how the process is performing relative to the 12-hour cycle-time criterion. In Figure 5, the SBPC description provides an estimate of how the process performed before and after the change, where 100 - 63 = 37% and 100 - 87 = 13%. Since the process has a recent region of stability, the data from this region provide a predictive estimate of future performance unless something changes within the process and/or its inputs; i.e., our estimate is that 13 percent of the cycle times will exceed 12 hours.


The process has been predictable since observation 300 where now about 13% of the time it takes longer than 12 hours for a changeover. This 13% value is a reduction from about 37% before the new process change was made.


Figure 5: SBPC Report-out Describing Impact of Process Change.


I wonder whether much of the dissatisfaction with, and lack of use of, business control charting derives from the use of non-transformed data. Immediately after a change, the process looks good, and the improvement effort is recognized as a success. However, weeks or months later, after the control chart has been staged, the process will again appear to be out of control because the original data are no longer the primary driver of the control limits. The organization will assume the problem has returned and may consider the earlier effort a failure.

In addition, Wheeler’s four-step Shewhart process made no mention of how to assess how well a process is performing relative to customer needs, a very important aspect in the real business world.


The purpose of traditional individuals charting is to identify in a timely fashion when special-cause conditions occur so that corrective actions can be taken. The application of this technique is most beneficial when tracking the inputs to a process that has an output level which is capable of meeting customer needs.

False signals can occur when the process measurement is by nature not normally distributed, for example, in processes whose output cannot fall below zero. Investigating these false signals can be expensive and lead to much frustration when no reason is found for the out-of-control conditions.

The four-step Shewhart process that Wheeler suggests has its applications; however, a more pressing issue for businesses is high-level predictive performance measurement. SBPC provides an enhanced control charting system that addresses this need; e.g., an individuals chart in conjunction with an appropriate probability plot for continuous data. Appropriate data-transformation considerations need to be part of the overall SBPC implementation process.

With SBPC, we are not limited to identifying out-of-control conditions; we can also report the capability of the process in regions of stability in terms that everyone can understand. With this form of reporting, when there is a recent region of stability, we can consider data from this region to be a random sample of the future. With the statistical business performance charting approach, we might report that our process has been stable for the last 14 weeks, with a prediction that 10 percent of our incoming calls will take longer than the one-minute goal.

I expect that the one real issue behind this entire discussion is the idea of "what is good enough?" Wheeler appears to regard as "good enough" a control charting method that allows up to 2.5 percent of process measures to trigger a cause-and-corrective-action effort that will not find a true cause. Wheeler goes so far as to invoke the 95-percent confidence concept from hypothesis testing to imply that up to 5 percent false special-cause detections are acceptable. Using the above-described concept of transforming process data, where the transformation is appropriate for the process-data type, will lead to process behavior charting that matches the sensitivity and false-detection rate we have all learned to expect when tracking normally distributed data in a typical manufacturing environment.

Why would anyone want to have a process behavior chart that will be interpreted differently for each use in an organization? The answer should be clear: use transformations when they are appropriate and then your organization can interpret all control charts in the same manner. Why be "good enough" when you have the ability to be correct?

The "to transform or not transform" issue addressed in this paper led to SBPC reporting and its advantages over the classical control-charting approach described by Wheeler. However, the potential for SBPC predictive reporting has much larger implications than reporting a widget manufacturing process output.

Traditional organizational performance measurement reporting systems have a table of numbers, stacked bar charts, pie charts, and red-yellow-green goal-based scorecards that provide only historical data and make no predictive statements. Using this form of metric reporting to run a business is not unlike driving a car by only looking at the rear view mirror, a dangerous practice.

When predictive SBPC system reporting is used to track interconnected business process map functions, an alternative forward-looking dashboard performance reporting system becomes available. With this metric system, organizations can systematically evaluate future expected performance and make appropriate adjustments if they don't like what they see, not unlike looking out a car's windshield and turning the steering wheel or applying the brake if they don't like where they are headed.

How SBPC can be integrated within a business system that analytically/innovatively determines strategies with the alignment of improvement projects that positively impact the overall business will be described in a later article.


About The Author

Forrest Breyfogle—New Paradigms’s picture

Forrest Breyfogle—New Paradigms

CEO and president of Smarter Solutions Inc., Forrest W. Breyfogle III is the creator of the integrated enterprise excellence (IEE) management system, which takes lean Six Sigma and the balanced scorecard to the next level. A professional engineer, he’s an ASQ fellow who serves on the board of advisors for the University of Texas Center for Performing Excellence. He received the 2004 Crosby Medal for his book, Implementing Six Sigma. E-mail him at forrest@smartersolutions.com


Response to Dan, Frank, and David

Thanks for your comments. Dave, good to see we are on the same page relative to the points that I made in my articles.

Dan and Frank, I chose to use recent customer data without making adjustments. In 20-20 hindsight, it would have been better to use a data set where no one could have thought there were any trends. This data artifact seems to have distracted some readers from the fundamental messages and thought process conveyed in these articles.

One of the purposes of this real-data analysis was to illustrate how transforming the data could detect a process shift, since that was one of the concerns expressed in a previous article. The article showed how an improvement shift in the process was distinctly detected with the transformed data analysis.

Frank made the point that we should always be looking for process improvement in all our measurements. From a practical point of view at the business level, organizations do not have enough resources to achieve this seemingly worthy goal for every measurement they have. Improvements can be hard to achieve, especially with processes that have had much of their low-hanging fruit already picked. In addition, this form of improve-everything thinking can lead to silo benefits that do not benefit the enterprise as a whole.

From an enterprise point of view, we need to determine where our process improvement efforts should focus so that the resources spent on these efforts have the most impact on the business as a whole. To determine this, we need to compare process outputs to specification limits and/or process-improvement business goals. With the untransformed Shewhart four-step approach, voice-of-the-customer specification limits were not included; i.e., what happens with lean Six Sigma process capability assessments for a stable process?

Frank, I truly believe that the signals for the untransformed data are false because of the nature of the process; i.e., it has a lower boundary, and a log-normal distribution fits nicely. However, whether the untransformed signals are false or not in this example, we would still be looking at all the data in the transformed region of stability to determine where improvement efforts should focus, with the assistance of engineering and others. The reason for doing this is that the nonconformance rate is too high, which the Shewhart four-step process did not address. Deciding what is wrong with individual data points can lead to tampering with the process when a special-cause incident is suspected but did not actually occur.

Statistical hypothesis tests of various theories in the region of stability (log-normal transformed data in this case) can help with this understanding. Knowledge gained from determining a statistically significant difference between machines, operators, process temperatures, etc. can help engineers and process owners gain insight into what should be done to improve the process; i.e., just as is suggested in the analyze phase of the DMAIC project execution roadmap.

Frank, control charts are not truly hypothesis tests; hence, references to type I and type II errors are not really applicable, and these references should not be made for control-chart decisions. It is a fact that data can be bounded and fairly well modeled as log-normal; hopefully you will agree with that. You will get more false signals if you do not consider an appropriate transformation when the data truly follow a log-normal distribution. If you don't agree with this statement, try some log-normal data simulations for yourself.
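For readers who want to try such a simulation, here is a minimal stdlib-only sketch (my own illustration; the lognormal parameters are arbitrary). It compares the rule-1 false-signal rate on an in-control lognormal process before and after a log transformation:

```python
import math
import random
from statistics import fmean

random.seed(7)

def rule1_rate(data):
    """Fraction of points beyond I-chart limits (mean +/- 2.66 x mean moving range)."""
    mr = fmean([abs(a - b) for a, b in zip(data, data[1:])])
    center = fmean(data)
    lcl, ucl = center - 2.66 * mr, center + 2.66 * mr
    return sum(x < lcl or x > ucl for x in data) / len(data)

# An in-control lognormal process: every rule-1 signal counted here is false.
raw = [random.lognormvariate(2.0, 0.7) for _ in range(5000)]

print(f"untransformed: {rule1_rate(raw):.2%} false signals")
print(f"log scale:     {rule1_rate([math.log(x) for x in raw]):.2%} false signals")
```

With skewed data like this, the untransformed chart produces a false-signal rate several times higher than the log-transformed chart, which is the behavior being debated.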

Dan, I will not disagree with your points about staging; however, I do not understand your point about me not sharing the location/shape factors. This was real data to which a simulated improvement was applied, and the transformed analysis did a good job of detecting it, which was a point of contention in a previous article.

Dan and Frank, most business scorecards as they are currently presented do not assess the scorecard data statistically. I believe that statistics can help much in this area. It is not obvious to me that a control charting approach that focuses on the search for out-of-control signals as an improvement strategy can be beneficial in this organizational-scorecard effort. In a planned future article I will describe how the SBPC methodology is a means to achieve these business-scorecard objectives.

NOT Transforming the Data Can Be Fatal to Your Analysis

I agree with Breyfogle on:
1. Always assuming normality, and NOT fitting your data to a non-normal distribution or applying a Box-Cox or Johnson transformation WHEN NECESSARY, can lead you to SPC charts with false alarms that bring into question whether the process is stable. The rules and criteria have not been explored thoroughly in the quality community.
2. If data is moderately to extremely skewed, the probability of nonconformance (PNC) will be significantly inaccurate. Having produced over 1,500 SPC charts on datasets, I've found extremely skewed data does need to be fitted by a non-normal distribution to reduce false alarms and report an accurate performance (1.0 - PNC, in percent).
3. Control limits are significantly different computed by a normal distribution and by a non-normal distribution.

It makes no sense NOT to transform the data, or fit it with the best non-normal distribution, if the data is extremely skewed. In about 75-80 percent of the 1,500 datasets I've analyzed, the data was moderately to extremely skewed. However, in about 67 percent of cases the control limits computed by fitting a normal versus a non-normal distribution to moderately skewed data were not significantly different, and yield results were only slightly different. I've also found that most of the data was skewed because of moderate to extreme outliers. Once extreme outliers were eliminated, SPC charts indicated the process was stable.

I disagree with Wheeler, and with Florac and Carleton, authors of "Measuring the Software Process," who each say data never needs to be transformed in producing SPC charts. Statistics applied to manufacturing has improved since Shewhart introduced SPC charts.

Some observations on raw data

I fully agree with the comments from Dan. Interesting signals and information are already present in the raw data.

From the non-transformed data it is very clear that after the process change a major improvement was realized: a simple t-test shows that after the change there is a significant drop in the process average; no doubt one has evidence to believe one is on the right track!

However, the major improvement does not mean that the process capability can't be improved further; we are in the spirit of continuous improvement, right? That's why, in my opinion, it would be wrong to ignore the signals after the process change, because they may offer opportunities for further capability improvement. The transformed data clearly hide these potentially important signals. Why should we believe that the signals after the process change are false? Are we sure they are false? Should we not go and discuss with the process experts, the engineering department, and the people on the floor to assess these excessive-variation signals? Process control is, first and foremost, teamwork, requiring the commitment of everyone in the philosophy of continuous improvement. Also, working with transformed data will not make this team discussion easier.

Statistically, one can show that highly skewed distributions can yield a type I (false alarm) rate of 2-3 percent. No doubt it is a clearly proven fact that data transformation reduces this rate by a factor of ten. But, on the other hand, what is the type II error introduced by data transformation, i.e., the risk of wrongly accepting a real excessive-variation signal as random variation?

Some Observations on the Data

There were some basic questions that this generated.

In figure 1, the first 119 points appear stable, with a mean of about 8-9; I would have generated an I-chart on them. Over the next 50-60 points (roughly half the length of the first 119), the mean appears to have shifted up by 50 percent (from 8-9 to about 14-15). If I had set the limits based on the first points, the period between points 119 and 200 would have been highlighted for concern about a special cause: why was there an upward shift? Then, past point 530, we see a similar rise beginning. If I had indeed generated an I-chart on the first 100 points, a no-transform chart would have worked fine in catching (highlighting) this magnitude of shift, and action could have been taken at that point.
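A rough sketch of this thought experiment (the data here are simulated to mimic the pattern the commenter describes, not the article's actual data; the standard deviation of 1.5 is an invented, illustrative value):

```python
import random
from statistics import fmean

random.seed(3)

def i_chart_limits(baseline):
    """I-chart limits: mean +/- 2.66 x the average moving range."""
    mr = fmean([abs(a - b) for a, b in zip(baseline, baseline[1:])])
    center = fmean(baseline)
    return center - 2.66 * mr, center + 2.66 * mr

# Simulated stand-in for the commenter's reading of figure 1: a stable
# stretch with a mean near 8.5, then a roughly 50% upward shift.
baseline = [random.gauss(8.5, 1.5) for _ in range(119)]
shifted = [random.gauss(14.0, 1.5) for _ in range(60)]

lcl, ucl = i_chart_limits(baseline)
flagged = sum(x > ucl for x in shifted)
print(f"limits from first 119 points: [{lcl:.1f}, {ucl:.1f}]")
print(f"shifted points above the UCL: {flagged} of {len(shifted)}")
```

A shift of this magnitude lands well above limits set from the baseline, so most of the shifted points are flagged, which supports the commenter's point that no transformation is needed to catch a 50 percent shift.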

The second concern I had was the log probability plots. Breyfogle did not share the measured location/shape factors, so we cannot see how "skewed" the data were. The d2 constant (used in generating any I-chart) is claimed to be fairly robust to some departure from normality. In my own studies, I found that d2 held up fairly well up to a skewness of 1-1.5. So how "non-normal" were the data? (Remember, an ordinary exponential distribution only has a skewness of 2.)

Third, considering the 50-percent process shift from points 119 to ~200, should these data have been included in or excluded from the probability plot? Further, in the "30% reduction charts" (figure 4a), the basis for which points were included in generating the control limits was not clarified or justified. The 30-percent improvement set came after the "spike" in the chart, and hence the limits generated for the first points included that spike. Also, the upper control limit went from 20 to about 25 (figure 4a). In other words, were these special causes or not, and should they have been included or not? I have to assume that these points were random variation, since no mention of the 50-percent increase was made, but any spike of 50 percent should be a cause for concern and should be resolved before completing the analysis.

Sensitivity to shifts and other issues with transformations

I should tell you that I work for Forrest, but that does not mean I always agree with him.

I have enjoyed reading this series of articles between Breyfogle and Wheeler. It challenges me to think about what I do in my work. I am a Statistician and have done SPC for many years. I have always considered transformations of skewed data as OK if I wanted to use an I-chart. This discussion pushed me to test my beliefs.

To do this I ran two simulations using @RISK software. The first looked at the false special-cause rate when using skewed data in an I-chart. This concept is important to an SPC user, though less so to an academic. False special-cause events lead to unneeded activities by the process owner and end in tampering or the generation of false knowledge, as you try to explain a random event as if it were a special cause.

In this simulation, I adjusted the scale parameter of a lognormal distribution, lognormal(3.5, 0.4), that is typical of cycle-time data. The analysis demonstrated that false special-cause events (rule 1 only) begin to increase as the scale parameter exceeds about 0.13, which is also where the data begin to fail a normal goodness-of-fit test. Interestingly, adjusting the location parameter has no effect on the rate of false special-cause detections.

Documented in my blog http://www.smartersolutions.com/blog/wordpress/?p=326
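A stdlib-only approximation of that first simulation (my own sketch, not the @RISK model from the blog post) sweeps the scale parameter of an in-control lognormal process and reports the rule-1 false-alarm rate:

```python
import random
from statistics import fmean

random.seed(11)

def false_rate(scale, n=20000):
    """Rule-1 false-alarm rate on an I-chart for in-control lognormal(3.5, scale) data."""
    data = [random.lognormvariate(3.5, scale) for _ in range(n)]
    mr = fmean([abs(a - b) for a, b in zip(data, data[1:])])
    center = fmean(data)
    lcl, ucl = center - 2.66 * mr, center + 2.66 * mr
    return sum(x < lcl or x > ucl for x in data) / n

# Sweep the scale parameter; the rate climbs as the data grow more skewed.
for scale in (0.05, 0.13, 0.4):
    print(f"scale={scale:.2f}: {false_rate(scale):.2%} false alarms")
```

At a small scale parameter the data are nearly normal and the false-alarm rate stays low; at larger scale values the rate rises well above the normal-theory expectation, consistent with the simulation described above.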

The second simulation evaluated the sensitivity of the I-chart to a shift in the process mean. I chose to evaluate the average run length (ARL) to a rule-1 event after the shift. I simulated shifts from -2 to +2 standard deviations, but focused on the 1-sigma shift values, since they are the common reference for control charting. I compared the non-transformed lognormal data to the transformed data and included a normal data set as a reference. I was surprised to find that my hypothesis that the transformation should reduce the sensitivity to detecting change was wrong. The transformed data performed equal to or slightly better than expected for a shift in normal data. The I-chart was terrible with the original data: my ARL was near 400 for a downward shift in the mean, where it should be around 150. The upward-shift ARL was near 20, which is quite good, but when you consider that the ARL is around 56 for the data without a shift, I am not sure a practitioner would even notice.

Documented in my blog http://www.smartersolutions.com/blog/wordpress/?p=358
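The ARL comparison can likewise be approximated with a small simulation (again my own sketch, not the original @RISK model; here the shift is applied to the location parameter in log-scale sigma units, and the run length is capped to keep the simulation finite):

```python
import math
import random
from statistics import fmean

random.seed(5)

def arl_to_signal(use_log, shift_sigmas, trials=100, cap=1000):
    """Average run length to the first rule-1 signal on an I-chart after the
    lognormal(3.5, 0.4) process shifts by shift_sigmas log-scale sigmas."""
    run_lengths = []
    for _ in range(trials):
        base = [random.lognormvariate(3.5, 0.4) for _ in range(100)]
        chart = [math.log(x) for x in base] if use_log else base
        mr = fmean([abs(a - b) for a, b in zip(chart, chart[1:])])
        lcl = fmean(chart) - 2.66 * mr
        ucl = fmean(chart) + 2.66 * mr
        n = 0
        while n < cap:
            n += 1
            x = random.lognormvariate(3.5 + shift_sigmas * 0.4, 0.4)
            if use_log:
                x = math.log(x)
            if x < lcl or x > ucl:
                break
        run_lengths.append(n)
    return fmean(run_lengths)

# A downward shift is where the untransformed chart struggles: its lower
# limit often falls below zero, so the shift can go undetected for a long time.
print(f"downward 1-sigma shift, raw chart ARL: {arl_to_signal(False, -1.0):.0f}")
print(f"downward 1-sigma shift, log chart ARL: {arl_to_signal(True, -1.0):.0f}")
```

Under these assumptions the log-transformed chart detects the downward shift far sooner than the raw chart, which mirrors the "ARL near 400 for a downward shift" finding described above.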

My assessments tell me that transformations are not only a good thing but absolutely necessary if you intend to use the I-chart to distinguish between common- and special-cause events. Mr. Wheeler is right that the I-chart control limits bound about 98 percent of the data for skewed distributions. But that 2-percent false indication of a special cause will lead to such bad behavior when used in SPC that I would rather not even chart it. Ignorance might be better than the resulting tampering with the process.

Mr. Wheeler was right that transforming without thought can lead to problems, so transform when it makes sense and lead your SPC program to perform better. In the early days, transformations were shunned because of the difficulty of doing charts by hand. We should all move beyond that thinking now that we have PCs and software to do the math. Why work in the past? Use the tools we have been given to do our work better.


The purpose of control charts

It is good to see Breyfogle responding to Wheeler and using some real data. However, he makes a common error in the way he uses this data with his statement: "The purpose of traditional individuals charting is to identify in a timely fashion when special-cause conditions occur." The real purpose of a control chart is to gain insight into the process. "Special cause" points on a control chart draw attention to areas that should be investigated. If Breyfogle had appreciated this and taken a step back to observe, rather than getting bogged down in charting, he would have seen that his data has an obvious periodicity: it shows one and a half cycles. The question to ask is, what is causing this? Addressing the cause will lead to fewer special-cause points ... and an improved process.

Real, skewed, time-based data is easy enough to generate. A simple five-minute exercise demonstrates how effective standard XmR charts are for the type of process that Breyfogle describes.
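One way such a five-minute exercise might look (this is my own sketch, not Dr. Burns' exercise; exponentially distributed task times stand in for the skewed, time-based data):

```python
import random
from statistics import fmean

random.seed(2)

# Simulated skewed, time-based data: task durations with a mean of 5 minutes.
times = [random.expovariate(1 / 5.0) for _ in range(200)]

# Standard XmR (individuals) limits, with the lower limit floored at zero.
mr = fmean([abs(a - b) for a, b in zip(times, times[1:])])
center = fmean(times)
lcl, ucl = max(0.0, center - 2.66 * mr), center + 2.66 * mr

signals = sum(x > ucl for x in times)
print(f"limits: [{lcl:.1f}, {ucl:.1f}], points above UCL: {signals} of {len(times)}")
```

Readers can judge the debate for themselves from the output: the chart is simple to build and the limits bound the vast majority of the points, but a handful of in-control points do fall above the upper limit, which is exactly the false-alarm rate the two sides disagree about.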

Breyfogle is correct in that control charts are not used extensively in business. The reason is that people don't understand them. Adding unnecessary complexity does not help.

Dr Burns

Dr. Burns' Observations

Dr. Burns, thank you for your input. I agree completely with you that control charts should be used to help one gain insight into processes. However, we also need to structurally address, at the same time, the capability of the process relative to meeting process specifications (voice-of-the-customer needs) and the needs of the business as a whole. I too observed what looked like a cyclic pattern in this real data but did not mention it in the article. Using the approach described in the article, one could statistically test various hypotheses using historical data for the purpose of gaining insight into the process's apparent cyclic behavior, which could provide direction on what could be done to improve the process. I too do not want to make the life of a control-chart user more complex; however, it is important that oversimplification not lead to an inappropriate action from the interpretation of the chart. As the article illustrated, an appropriate transformation can be needed in this correct-decision-making process.


Mr Breyfogle, may I suggest that you read Dr Wheeler's wonderful little book, published in 2010, "Normality and the Process Behavior Chart."  Dr Wheeler tests 1143 different distributions and proves Dr Shewhart's assertion, that normality is not required.

- Dr Burns


The plot thickens! Both Breyfogle and Wheeler have written books and Wheeler admitted that he could almost write a book on this subject. It appears that Breyfogle has accomplished that in his most recent response. It took 13 pages on my printer, which certainly would have made a nice chapter. I would have preferred a simple point-by-point response to Wheeler's last article, but it looks like we are not going to get anything like that from either author.

I like the fact that Forrest has brought in some real data, as that seemed to be one of the biggest points of criticism he had taken as part of his last response. But again, both authors are fighting different battles. Forrest wants to fight the battle of whether or not you are making scrap and causing unnecessary firefighting, and Don Wheeler wants to strictly look at process behavior and signals vs. noise. Until this impasse is overcome, we will likely continue to see these authors doing battle over different things.

So has Forrest Breyfogle nailed it this time? My thought is that he has argued his case quite persuasively. He is right that control charting has been around for a very long time. And in that time, both the theory and application of the process of control charting have been improved beyond Shewhart's original approach. Some authors have made some pretty strong recommendations about control charting and both of these authors most certainly have done that in their books.

Let me point out that there is one thing that both Wheeler and Breyfogle seem to agree on (can you believe it?). And that is the universal use of the "individuals" control chart. Breyfogle seems to want to discard all other types of control charts and Wheeler, in his book, "Making Sense of Data" shows the X-MR chart as the "Swiss Army Knife" of control charts on page 233 of that book. While I don't necessarily agree that all other types of control charts should be eliminated, I can see from both authors how the use of the humble X-MR chart is often the way to go (or at least a way in which you are unlikely to go wrong).

That said, I will have to spend a little more time poring over the 13 pages of fodder provided in this most recent article to be able to speak intelligently to the many points made. But I would note one thing: This is not religion. This is just a process. Let's not make this bigger than it really is. If I use the techniques of either author for the purposes they suggest (emphasis on this last phrase), will I go wrong? I think not.
- Mike Harkins

Thoughtful and logical

This is a thorough and well-supported refutation of Wheeler's claims.

I agree that inappropriate transformation CAN be fatal, but when the context tells you that the normal model is clearly inappropriate, why would you insist on using it anyway? For some tools, like Xbar charts, there is little harm in using the normal chart. But for individual X charts, the false alarm rate varies wildly. Apparently Wheeler is ok with false alarms being somewhere between 0 and 2.5% (even up to 9% in extreme cases), but I believe today's managers are not happy with that. They expect statistical tools to be reliable and consistent in their error rates. When the tools are unpredictable, managers will simply drop them and miss the potential benefits.

Thanks to Forrest Breyfogle for the effort to tackle this important argument.

Leptokurtophiles unite!

Breyfogle makes too many assumptions, misses stating big picture

This is NOT well thought out. Breyfogle makes a number of assumptions at the beginning of his article about the use of charting, management attitudes, and Wheeler's attitude, without offering one bit of research to back them up. This should lead all readers to suspect whatever follows.

Regardless, what Breyfogle does not mention is the big-picture management of variation, which is an individual company's philosophical decision. There are two ways to manage variation at a company, and its management must decide between them. One way is to treat meeting *internal/customer specifications* as good enough. The other is continuous improvement through variability reduction to achieve process consistency. Control charting using well-proven methods, per Shewhart and Deming, is the most effective way to approach service-process consistency.

Breyfogle is basically suggesting you mess around with data points to *fit* his statistical paradigm, which is unwise. Once you start down the path of transforming data, you are merely playing games with numbers, which is an enormous waste of a company's time and resources. The two ways of managing variation have completely different objectives and completely different results. The objective of Breyfogle's method is that meeting internal/customer specifications is good enough; the result is services or products that vary as much as possible within specs, because anything within spec is considered *good enough.* The objective of Shewhart's/Deming's method is process consistency; the result is service or product processes that are as consistent as possible.

In order for control chart limits to work, they must be derived from untampered data points, the voice of the process. Specification limits are the voice of the company/customer. Once you start to mix the two concepts in your head, and then in your statistical practices, you are simply trying to justify your statistical practices with faulty logic. You end up working with control chart limits derived from tampered-with data points, which makes no logical sense to me. Over the years, I have met plenty of statistical shortcut artists who keep trying to introduce shortcuts to Shewhart's/Deming's methodologies. I am not suggesting Breyfogle is one of them. But I am stating the fact that making service- or product-process improvements TAKES TIME, and there are no shortcuts.

Response to AGAPEGUY77

AGAPEGUY77, thanks for sharing your thoughts. Here are my comments relative to your points. Yes, I presented only a discussion of whether to transform or not; I will present the "big-picture management of variation" in a later article. I am all for continuous improvement; however, you also have to evaluate the overall system to determine where process improvement efforts can best be focused to impact the big picture and its metrics; e.g., from a theory of constraints (TOC) point of view. I will address this in a follow-up article. I also agree with product consistency and am a big fan of Shewhart and Deming. What I am simply suggesting is that some processes have skewed distributions by nature, and that needs to be considered when making a process-consistency assessment; i.e., use distributions that make physical sense, with no game playing with the data. I am not suggesting shortcutting the system; instead, a future article will lead to an overall system for orchestrating activities in the business; i.e., an overall business system. A future article will also describe issues with red-yellow-green scorecards and how the metric reporting system described in this article can reduce the firefighting that comes from red-yellow-green goal-setting metrics.