NOT Transforming the Data Can Be Fatal to Your Analysis

A case study, with real data, describes the need for data transformation.

Not surprisingly, there was controversy over Forrest Breyfogle's article, "Non-normal Data: To Transform or Not to Transform," written in response to Donald Wheeler’s article "Do You Have Leptokurtophobia?" Wheeler continued the debate with "Transforming the Data Can Be Fatal to Your Analysis." This article is Breyfogle’s response to Wheeler’s latest column.

--Editor

Donald Wheeler stated in his second article "Transforming the Data Can Be Fatal to Your Analysis," "out of respect for those who are interested in learning how to better analyze data, I feel the need to further explain why the transformation of data can be fatal to your analysis."

…

Comments

Thoughtful and logical

This is a thorough and well-supported refutation of Wheeler's claims.

I agree that inappropriate transformation CAN be fatal, but when the context tells you that the normal model is clearly inappropriate, why would you insist on using it anyway? For some tools, like Xbar charts, there is little harm in using the normal chart. But for individual X charts, the false alarm rate varies wildly. Apparently Wheeler is ok with false alarms being somewhere between 0 and 2.5% (even up to 9% in extreme cases), but I believe today's managers are not happy with that. They expect statistical tools to be reliable and consistent in their error rates. When the tools are unpredictable, managers will simply drop them and miss the potential benefits.

Thanks to Forrest Breyfogle for the effort to tackle this important argument.

Leptokurtophiles unite!

Breyfogle makes too many assumptions, misses stating big picture

This is NOT well thought out. Breyfogle makes a bunch of assumptions at the beginning of his article about the use of charting, as well as about management attitudes and Wheeler's attitude, without offering one bit of research to back them up. This should lead all readers to believe that whatever follows may be suspect.

Regardless, what Breyfogle does not mention is the big-picture management of variation, which is an individual company's philosophy decision. Management must decide between two ways of managing variation. One way treats meeting *internal/customer specifications* as good enough. The other way is continuous improvement through variability reduction to achieve process consistency. Control charting using well-proven methods, per Shewhart and Deming, is the most effective way to approach service-process consistency. Breyfogle is basically suggesting you mess around with data points to *fit* his statistical paradigm. It is unwise to transform data, because once you start down that path you are merely game playing with numbers, which is an enormous waste of a company's time and resources.

The two ways of managing variation have two completely different objectives and two completely different results. The objective of Breyfogle's method is that meeting internal/customer specifications is good enough. The result is services or products that vary as much as possible within specs, because anything within spec is considered *good enough.* The objective of Shewhart's/Deming's method is process consistency. The result is service or product processes that are as consistent as possible. In order for control chart limits to work, they must be derived from untampered data points, the voice of the process. Specification limits are the voice of the company/customer.

Once you start to mix the two concepts in your head, and then in your statistical practices, you are simply trying to justify your statistical practices with faulty logic. You are working with control chart limits that are derived from tampered-with data points, which makes no logical sense to me. Over the years, I have met plenty of statistical shortcut artists who keep trying to introduce shortcuts to Shewhart's/Deming's methodologies. I am not suggesting Breyfogle is one of them. But I am stating the fact that making service- or product-process improvements TAKES TIME and there are no shortcuts.

Response to AGAPEGUY77

AGAPEGUY77, thanks for sharing your thoughts. Here are my comments relative to your points. Yes, I presented only a description of whether to transform or not. I will present a “big-picture management of variation” in a later article. I am all for continuous improvement; however, you also have to evaluate the overall system to determine where process improvement efforts can best be focused so that they impact the big picture and its metrics; e.g., a theory of constraints (TOC) point of view. I will address this in a follow-up article. I also agree with product consistency and am a big fan of Shewhart and Deming. What I am simply suggesting is that some processes have skewed distributions by nature and that this needs to be considered when making a process consistency assessment; i.e., using distributions that make physical sense and not playing games with the data. I am not suggesting shortcutting the system; instead, a future article will lead to an overall system for orchestrating activities in the business; i.e., an overall business system. A future article will also describe issues with red-yellow-green scorecards and how the metric reporting system described in this article can reduce the firefighting caused by red-yellow-green goal-setting metrics.

Religion

The plot thickens! Both Breyfogle and Wheeler have written books and Wheeler admitted that he could almost write a book on this subject. It appears that Breyfogle has accomplished that in his most recent response. It took 13 pages on my printer, which certainly would have made a nice chapter. I would have preferred a simple point-by-point response to Wheeler's last article, but it looks like we are not going to get anything like that from either author.

I like the fact that Forrest has brought in some real data, as that seemed to be one of the biggest points of criticism he had taken as part of his last response. But again, both authors are fighting different battles. Forrest wants to fight the battle of whether or not you are making scrap and causing unnecessary firefighting, and Don Wheeler wants to strictly look at process behavior and signals vs. noise. Until this impasse is overcome, we will likely continue to see these authors doing battle over different things.

So has Forrest Breyfogle nailed it this time? My thought is that he has argued his case quite persuasively. He is right that control charting has been around for a very long time. And in that time, both the theory and application of the process of control charting have been improved beyond Shewhart's original approach. Some authors have made some pretty strong recommendations about control charting and both of these authors most certainly have done that in their books.

Let me point out that there is one thing that both Wheeler and Breyfogle seem to agree on (can you believe it?). And that is the universal use of the "individuals" control chart. Breyfogle seems to want to discard all other types of control charts and Wheeler, in his book, "Making Sense of Data" shows the X-MR chart as the "Swiss Army Knife" of control charts on page 233 of that book. While I don't necessarily agree that all other types of control charts should be eliminated, I can see from both authors how the use of the humble X-MR chart is often the way to go (or at least a way in which you are unlikely to go wrong).

That said, I will have to spend a little more time poring over the 13 pages of fodder provided in this most recent article to be able to speak intelligently to the many points made. But I would make note of one thing. This is not religion. This is just a process. Let's not make this bigger than it really is. If I use the techniques of either author for the purposes they suggest (emphasis on this last phrase), will I go wrong? I think not.
- Mike Harkins

The purpose of control charts

It is good to see Breyfogle responding to Wheeler and using some real data. However, he makes a common error in the way he uses this data with his statement: "The purpose of traditional individuals charting is to identify in a timely fashion when special-cause conditions occur." The real purpose of a control chart is to gain an insight into the process. "Special cause" points on a control chart draw attention to areas that should be investigated. If Breyfogle had appreciated this and taken a step back to observe, rather than getting bogged down in charting, he would have seen that his data has an obvious periodicity. His data shows one and a half cycles. The question to ask is: what is causing this? Addressing the cause will lead to fewer special cause points ... and an improved process.

Real, skewed, time based data is easy enough to generate. This simple exercise takes 5 minutes, and demonstrates how effective standard XmR charts are for the type of process that Breyfogle describes:
http://q-skills.com/nm/nm.htm
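
For readers who want to try this kind of exercise, the following is a minimal sketch of how XmR limits are commonly computed. It is not the linked exercise. The lognormal parameters, the series length, and the function name `xmr_limits` are assumptions chosen only for illustration; the 2.66 scaling factor is the usual 3/d2 with d2 = 1.128.

```python
import random

def xmr_limits(values):
    """Return (lcl, center, ucl) for an individuals (X) chart."""
    center = sum(values) / len(values)
    # Moving ranges: absolute differences between consecutive points
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two points
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# ASSUMPTION: a hypothetical skewed "cycle time" series, not real data
series = [rng.lognormvariate(2.0, 0.5) for _ in range(200)]
lcl, center, ucl = xmr_limits(series)
signals = [i for i, x in enumerate(series) if x < lcl or x > ucl]
print(f"LCL={lcl:.2f} center={center:.2f} UCL={ucl:.2f} signals={len(signals)}")
```

Each index in `signals` is a point that rule 1 would flag for investigation. Whether such points are worth investigating is exactly what the commenters in this thread disagree about.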

Breyfogle is correct in that control charts are not used extensively in business. The reason is that people don't understand them. Adding unnecessary complexity does not help.

Dr Burns

Dr. Burns' Observations

Dr. Burns, thank you for your input. I agree completely with you that control charts should be used to help one gain insight into processes. However, we also need to structurally address, at the same time, the capability of the process relative to meeting process specifications (voice of the customer needs) and the needs of the business as a whole. I too observed what looked like a cyclic pattern in this real data but did not mention it in the article. Using the approach described in the article, one could statistically test various hypotheses using historical data for the purpose of gaining insight into the process’ apparent cyclic behavior, which could provide direction on what could be done to improve the process. I too do not want to make the life of a control-chart user more complex; however, it is important that oversimplification not lead to an inappropriate action from the interpretation of the chart. As the article illustrated, an appropriate transformation can be needed in this correct-decision-making process.

Normality

Mr Breyfogle, may I suggest that you read Dr Wheeler's wonderful little book, published in 2010, "Normality and the Process Behavior Chart." Dr Wheeler tests 1143 different distributions and proves Dr Shewhart's assertion, that normality is not required.

- Dr Burns

Sensitivity to shifts and other issues with transformations

I should tell you that I work for Forrest, but that does not mean I always agree with him.

I have enjoyed reading this series of articles between Breyfogle and Wheeler. It challenges me to think about what I do in my work. I am a Statistician and have done SPC for many years. I have always considered transformations of skewed data as OK if I wanted to use an I-chart. This discussion pushed me to test my beliefs.

To do this I ran two simulations using @RISK software. The first looked at the false special cause rate when using skewed data in an I-chart. This concept is important to an SPC user, though not as important to an academic. False special cause events lead to unneeded activities by the process owner and end up with tampering or the generation of false knowledge as you try to explain a random event as if it were a special cause.

In this simulation, I adjusted the scale parameter of a lognormal distribution, lognormal(3.5, 0.4), that was typical of cycle time data. In this analysis I demonstrated that as the skewness increases, the false special cause events (rule 1 only) begin to increase as the scale parameter exceeds about 0.13, which is also where the data will begin to fail a Normal goodness of fit test. Interestingly, adjusting the location parameter has no effect on the rate of false special cause detections.

Documented in my blog http://www.smartersolutions.com/blog/wordpress/?p=326
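
A simulation along these lines can be sketched with nothing more than a random number generator. What follows is not the @RISK model described above; it is a rough stand-in that assumes 300 stable series of 100 points each, limits recomputed from each series, and a handful of illustrative values for the second lognormal parameter. The function name `false_alarm_rate` and all defaults are hypothetical.

```python
import random

def false_alarm_rate(sigma, mu=3.5, n_points=100, n_series=300, seed=0):
    """Fraction of points outside I-chart limits for stable lognormal data."""
    rng = random.Random(seed)
    outside = total = 0
    for _ in range(n_series):
        xs = [rng.lognormvariate(mu, sigma) for _ in range(n_points)]
        center = sum(xs) / n_points
        mr_bar = sum(abs(b - a) for a, b in zip(xs, xs[1:])) / (n_points - 1)
        # Rule 1 limits: center +/- 2.66 * average moving range
        lcl, ucl = center - 2.66 * mr_bar, center + 2.66 * mr_bar
        outside += sum(1 for x in xs if x < lcl or x > ucl)
        total += n_points
    return outside / total

# ASSUMPTION: these sigma values are illustrative, not taken from the blog post
for sigma in (0.05, 0.13, 0.25, 0.40):
    print(f"sigma={sigma:.2f}  false-alarm rate={false_alarm_rate(sigma):.4f}")
```

Because no special cause is ever introduced, every flagged point is a false alarm by construction. Changing `mu` only rescales the data and the limits together, so in this sketch the rate should not depend on it.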

The second simulation evaluated the sensitivity of the I-chart to a shift in the process mean. I chose to evaluate the average run length (ARL) for a rule 1 event after the shift. I simulated shifts equivalent to shifts from -2 to +2 standard deviations, but focused on the 1 sigma shift values since they are the common reference for control charting. I compared the non-transformed lognormal data to the transformed data and included a normal data set for reference. I was surprised to find that my hypothesis that the transformation should reduce the sensitivity to detecting change was wrong. I found that the transformed data performed equal to or slightly better than expected for a shift in normal data. The I-chart was terrible with the original data. My ARL was near 400 for a downward shift in the mean where it should be around 150. The upward shift ARL was near 20, which is quite good, but when you consider that the ARL is around 56 for the data without a shift, I am not sure if a practitioner would even notice.

Documented in my blog http://www.smartersolutions.com/blog/wordpress/?p=358
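
The run-length comparison can be sketched in the same spirit. This is an illustration under stated assumptions, not a reconstruction of the simulation above. Limits are fixed from a 5,000-point in-control baseline, and a shift of k is applied by moving the log-scale mean by k times the log-scale standard deviation. Because that shift definition is an assumption and may differ from the one used above, the numbers should not be expected to match. The function names and defaults are hypothetical.

```python
import math
import random

def i_chart_limits(xs):
    """Rule 1 limits for an individuals chart: center +/- 2.66 * MR-bar."""
    center = sum(xs) / len(xs)
    mr_bar = sum(abs(b - a) for a, b in zip(xs, xs[1:])) / (len(xs) - 1)
    return center - 2.66 * mr_bar, center + 2.66 * mr_bar

def average_run_length(shift, transform, mu=3.5, sigma=0.4,
                       n_runs=200, max_len=20000, seed=0):
    """Mean number of points until the first rule 1 signal after a shift.

    ASSUMPTION: the shift moves the log-scale mean by shift * sigma.
    """
    rng = random.Random(seed)
    f = math.log if transform else (lambda x: x)
    # Limits come from in-control data and stay fixed afterwards
    baseline = [f(rng.lognormvariate(mu, sigma)) for _ in range(5000)]
    lcl, ucl = i_chart_limits(baseline)
    lengths = []
    for _ in range(n_runs):
        n = 1
        while n < max_len:
            x = f(rng.lognormvariate(mu + shift * sigma, sigma))
            if x < lcl or x > ucl:
                break
            n += 1
        lengths.append(n)
    return sum(lengths) / n_runs

for shift in (-1, 0, 1):
    raw = average_run_length(shift, transform=False)
    logged = average_run_length(shift, transform=True)
    print(f"shift={shift:+d}  ARL raw={raw:.0f}  ARL log-transformed={logged:.0f}")
```

On the raw scale the lower limit falls at or below zero for data this skewed, so downward shifts can only be caught by chance at the upper limit. On the log scale the chart treats upward and downward shifts about equally.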

My assessments tell me that transformations are not only a good thing, but absolutely necessary if you are intending to use the I-chart to identify the difference between common and special cause events. Mr. Wheeler is right that the I-chart control limits bound about 98% of the data for skewed distributions. But that 2% false indication of a special cause will lead to such bad behavior when used in SPC, that I would rather not even chart it. Ignorance might be better than the resulting tampering of the process.

Mr. Wheeler was right that transforming without thought can lead to problems, so transform when it makes sense and lead your SPC program to perform better. In the early days, transformations were shunned because of the difficulty in doing charts by hand. We should all move ahead of that thought now that we have PCs and software to do the math. Why work in the past? Use the tools we have been provided to do our work better.

Rick

Some Observations on the Data

There were some basic questions that this generated.

In figure 1, the first 119 points appear stable, with a mean of about 8-9, and I would have generated an I-chart from them. Over the next 50-60 points (about half as many as the first 119), the mean appears to have shifted up by 50% (from 8-9 to about 14-15). If I had set the limits based on the first points, the period between 119 and 200 would have been highlighted for concern about special cause: why was there an upward shift? Then, past 530, we see a similar "rise" beginning. Had I indeed generated an I-chart on the first 100 points, a "no transform" chart would have worked fine in catching (highlighting) a shift of this magnitude and in prompting action at that point.

The second concern I had was the log probability plots. Breyfogle did not share the measured location/shape factors so that we can see how "skewed" the data were. The "d2" constant (used in generating any I-chart) is claimed to be fairly robust to some degree of non-normality. In my own studies, I saw that "d2" was fairly good up to a skew of 1-1.5. So, how "nonnormal" were the data? (Remember, an ordinary exponential has a skew of only 2.)

Third, considering the 50% process shift from points 119 to ~200, should this data have been included in or excluded from the probability plot? Further, in the "30% reduction charts" (figure 4a), the basis for including the points in generating the control limits was not clarified or justified. Basically, the 30% improvement set came after the "spike" in the chart, and hence the limits generated for the first points included that "spike." Also, the upper control limits went from 20 to about 25 (figure 4a). In other words, were these special causes or not? And should they have been included or not? I have to assume that these points were "random variation," since no mention of the 50% increase was made, but any "spike" of 50% should be a cause for concern and resolved before completing the analysis.
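
For reference, the quantities in this question can be written out. The first two lines are the standard I-chart relationships. The third is the textbook skewness of a lognormal distribution with shape parameter σ, included on the assumption that the log probability plots imply a lognormal fit. The article's fitted values are not given, so no number is filled in here.

```latex
\hat{\sigma}_X = \frac{\overline{MR}}{d_2}, \qquad d_2 = \frac{2}{\sqrt{\pi}} \approx 1.128 \quad (n = 2)

\mathrm{UCL},\ \mathrm{LCL} = \bar{X} \pm 3\,\frac{\overline{MR}}{d_2} \approx \bar{X} \pm 2.66\,\overline{MR}

\text{skewness}_{\text{lognormal}} = \left(e^{\sigma^{2}} + 2\right)\sqrt{e^{\sigma^{2}} - 1},
\qquad
\text{skewness}_{\text{exponential}} = 2
```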

Some observations on raw data

I fully agree with the comments from Dan. Interesting signals and information are already present in the raw data.

From the non-transformed data it is very clear that after the process change a major improvement has been realized: a simple t-test shows that after the change there is a significant drop in the process average. No doubt one has evidence to believe in being on the right track!

However, the major improvement does not mean that the process capability can't be further improved; we are in the spirit of continuous improvement, right? That's why, in my opinion, it would be wrong to ignore the signals after the process change, because they may offer opportunities for further capability improvement. The transformed data clearly hide these potentially important signals. Why should we believe that the signals after the process change are false? Are we sure they are false? Should we not go and discuss with the process experts, the engineering department, and the people on the floor to assess these excessive variation signals? Process control is in the first place teamwork, requiring the commitment of everyone to the philosophy of continuous improvement. Also, working with transformed data will not make this team discussion easier.

Statistically one can prove that highly skewed distributions can yield a Type I false alarm rate of 2-3%. No doubt it is a clearly proven fact that data transformation reduces this rate by a factor of ten. But (!) on the other hand, what is the Type II error with data transformation, i.e., the risk of wrongly accepting a real excessive-variation signal as random variation?

NOT Transforming the Data Can Be Fatal to Your Analysis

I agree with Breyfogle on:
1. Always assuming normality and NOT fitting your data to a non-normal distribution or transforming it with a Box-Cox or Johnson transformation WHEN NECESSARY can lead you to SPC charts with false alarms that bring into question whether the process is now stable. Rules and criteria have not been explored thoroughly in the quality community.
2. If data is moderately to extremely skewed, the Probability of Nonconformance (PNC) will be significantly inaccurate. Having performed over 1500 SPC charts on datasets, I've found that extremely skewed data does need to be fitted by a non-normal distribution to reduce false alarms and report an accurate performance (1.0-PNC in %).
3. Control limits computed from a normal distribution differ significantly from those computed from a non-normal distribution.
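
Points 2 and 3 can be illustrated with a small sketch. The data set, the upper specification limit, and the choice of a lognormal as the non-normal fit are all assumptions made for illustration. The "normal" limits use mean plus or minus three overall standard deviations rather than a moving-range estimate.

```python
import math
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(7)
# ASSUMPTION: hypothetical skewed data and a hypothetical upper spec limit
data = [rng.lognormvariate(1.0, 0.6) for _ in range(500)]
upper_spec = 10.0

# Normal model: limits at mean +/- 3 standard deviations
m, s = mean(data), stdev(data)
normal_limits = (m - 3 * s, m + 3 * s)
pnc_normal = 1 - NormalDist(m, s).cdf(upper_spec)

# Lognormal model: fit a normal to log(data), then back-transform the limits
logs = [math.log(x) for x in data]
lm, ls = mean(logs), stdev(logs)
lognormal_limits = (math.exp(lm - 3 * ls), math.exp(lm + 3 * ls))
pnc_lognormal = 1 - NormalDist(lm, ls).cdf(math.log(upper_spec))

print(f"normal    limits=({normal_limits[0]:.2f}, {normal_limits[1]:.2f})  PNC={pnc_normal:.4%}")
print(f"lognormal limits=({lognormal_limits[0]:.2f}, {lognormal_limits[1]:.2f})  PNC={pnc_lognormal:.4%}")
```

With data this skewed, the normal model puts its lower limit below zero, which is impossible for a positive measurement, and it reports a smaller fraction beyond the upper specification than the lognormal fit does.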

It makes no sense NOT to transform the data or fit it with the best non-normal distribution if the data is extremely skewed. In about 75%-80% of the 1500 datasets I've analyzed, the data was moderately to extremely skewed. However, about 67% of the control limits computed by fitting a normal or non-normal distribution to moderately skewed data were not significantly different. Yield results were slightly different. I've also found that most of the data was skewed because of moderate to extreme outliers. Once extreme outliers were eliminated, SPC charts indicated the process was stable.

I disagree with Wheeler and with Florac and Carleton, who wrote "Measuring the Software Process," who each say that data never needs to be transformed in producing SPC charts. Statistics applied to manufacturing has improved since Shewhart introduced SPC charts.

Response to Dan, Frank, and David

Thanks for your comments. Dave, good to see we are on the same page relative to the points that I made in my articles.

Dan and Frank, I chose to use recent customer data without making adjustments. In 20-20 hindsight, it seems like it would have been better to use a data set where no one could have thought there were any trends. This data artifact seems to have distracted some from the fundamental messages and thought process that was conveyed in these articles.

One of the purposes for this real-data analysis was to illustrate how transforming the data could detect a process shift since that was one of the concerns expressed in a previous article. The article showed how an improvement shift in the process was distinctly detected with the transformed data analysis.

Frank made a point that we should always be looking for process improvement in all our measurements. From a practical point of view at the business level, organizations do not have enough resources to achieve this seemingly worthy goal for all the measurements that they have. Improvements can be hard to achieve, especially with processes that have much of their low-hanging fruit already picked. In addition, this form of improve-everything thinking can lead to silo benefits that do not benefit the enterprise as a whole.

From an enterprise point of view, we need to determine where our process improvement efforts should focus so that resources spent on these efforts have the most impact on the business as a whole. To determine this, we need to compare process outputs to specification limits and/or process improvement business goals. With the untransformed Shewhart 4-step approach, voice of the customer specification limits were not included; i.e., what happens with Lean Six Sigma process capability assessments for a stable process?

Frank, I truly believe that the signals for the untransformed data are false because of the nature of the process; i.e., it has a lower boundary and a log-normal distribution fits nicely. However, whether untransformed signals are false or not in this example, we would still be looking at all the data in the transformed region of stability to determine where improvement efforts should focus, with engineering and others’ assistance. The reason for doing this is that the non-conformance rate is too high, which the Shewhart 4-step process did not address. Decisions about what is wrong with individual data points can lead to tampering with the process when it is thought that there was a special cause incident but there wasn’t.

Statistical hypothesis tests of various theories in the region of stability (log-normal transformed data in this case) can help with this understanding. Knowledge gained from determining a statistically significant difference between machines, operators, process temperature, etc. can help engineers and process owners gain insight into what should be done to improve the process; i.e., just like what is suggested in the analyze phase of the DMAIC project execution roadmap.

Frank, control charts are not truly a hypothesis test; hence, references to type 1 and type 2 errors are not really applicable and should not be made to control-chart decisions. It is a fact that data can be bounded and be fairly well modeled as a log-normal. Hopefully you will agree to that. You will get more false signals if you do not consider an appropriate transformation when the data truly has a log-normal distribution. If you don’t agree with this statement, try some log-normal data simulations for yourself.
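
As a cross-check on any such simulation, the large-sample rule 1 false-signal rate for truly log-normal data can also be computed directly. This sketch assumes the limits are known without estimation error and uses the identity that the expected absolute difference of two independent log-normal values is 2 × mean × erf(σ/2). The function name is hypothetical.

```python
import math
from statistics import NormalDist

def limiting_false_alarm_rate(sigma, mu=0.0):
    """Large-sample rule 1 false-signal rate of an I-chart on lognormal data."""
    mean = math.exp(mu + sigma ** 2 / 2)
    # Expected moving range of two lognormal points: 2 * mean * erf(sigma / 2)
    mr_bar = 2 * mean * math.erf(sigma / 2)
    lcl, ucl = mean - 2.66 * mr_bar, mean + 2.66 * mr_bar
    z = NormalDist()
    # Tail areas are evaluated on the log scale, where the data are normal
    upper = 1 - z.cdf((math.log(ucl) - mu) / sigma)
    lower = z.cdf((math.log(lcl) - mu) / sigma) if lcl > 0 else 0.0
    return upper + lower

# ASSUMPTION: illustrative sigma values; mu drops out of the result
for sigma in (0.05, 0.13, 0.25, 0.40):
    print(f"sigma={sigma:.2f}  limiting rate={limiting_false_alarm_rate(sigma):.4f}")
```

For small σ the result approaches the familiar three-sigma rate for normal data. As σ grows, the lower limit drops below zero and all of the false signals come from the upper tail.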

Dan, I will not disagree with your points about staging; however, I do not understand your point about me not sharing the location/shape factors. This was real data where a simulated improvement was made, which the transformed analysis did a good job of detecting, and that was a point of contention in a previous article.

Dan and Frank, most business scorecards as they are currently presented do not assess the scorecard data statistically. I believe that statistics can help much in this area. It is not obvious to me that a control charting approach which focuses on the search for out-of-control signals as an improvement strategy can be beneficial in this organizational-scorecard effort. In a planned future article I will describe how the SBPC methodology is a means to achieve these business scorecard objectives.