On Alleviating Tortured Data

And a proposal for a ‘new’ run chart rule

Up until a few years ago, I wasn’t a big fan of run charts. Why not just go ahead and construct a process behavior chart and move on? Well, sometimes a run chart is more appropriate for certain data structures.

For example, some data are “chunky”—see Donald Wheeler’s treatment of chunky data in this QD column from 2011. Davis Balestracci has written about run charts in recent columns, including this one from August 2014. But my aim here is to present the progression from trying to make sense of chunky data with a process behavior chart to using a run chart to track a system improvement. I will present a new run chart rule for your consideration, and your feedback regarding it would be appreciated.

Everyone in industry is interested in improving safety. The key universal metric is the total incident rate (TIR), which the government uses to track overall safety performance throughout U.S. industry. You can find these data at www.OSHA.gov.

…

Comments

Skewed data with Injuries

You can calculate all sorts of different run rules based on a chosen alpha and a run of k points. They might already be tabulated in a nonparametric statistics book (I don't have one handy, but I remember doing such calculations in grad school).
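As a rough illustration of deriving a rule from an alpha and k points, here is a minimal Python sketch. It assumes independent points from a stable process, each equally likely to fall above or below the median, and it only considers one given window of k consecutive points rather than a run anywhere on the chart. The function names are hypothetical, and this is not the rule proposed in the article.

```python
def same_side_run_probability(k):
    """Chance that k given consecutive points all fall on one side of the median.

    Assumes independent points, each with a 50% chance of being above the median.
    The factor of 2 covers both directions (all above or all below).
    """
    return 2 * 0.5 ** k


def shortest_run_for_alpha(alpha):
    """Smallest run length k whose false-alarm chance is at most alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    k = 2
    while same_side_run_probability(k) > alpha:
        k += 1
    return k


for alpha in (0.05, 0.01):
    print(alpha, shortest_run_for_alpha(alpha))
```

A smaller alpha asks for a longer run before the chart signals, which trades slower detection for fewer false alarms.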

For this particular example with time between injuries, IF (big IF) your injury counts are truly Poisson and the injuries are independent, then the time between injuries is by definition exponentially distributed. Nelson (can't remember if it was Lloyd or Wayne) found that the ideal transform to get such data close to normal was X^0.27. He had an article either in Technometrics or something similar back in the 1990s. I had independently found with data from my company that taking the square root of the square root (X^0.25) worked pretty well, and that's what I used.
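To see what these power transforms do to exponential data, here is a small Python sketch using simulated values only. The 30-day mean, the seed, and the sample size are arbitrary assumptions for illustration, not data from the article or from my company.

```python
import random
import statistics


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Simulated days between injuries, exponential with an assumed mean of 30 days.
days_between = [rng.expovariate(1 / 30.0) for _ in range(5000)]

skew_by_power = {}
for power in (1.0, 0.27, 0.25):
    transformed = [d ** power for d in days_between]
    skew_by_power[power] = skewness(transformed)
    print(f"X^{power}: skewness = {skew_by_power[power]:.2f}")
```

The raw values (power 1.0) should come out strongly right-skewed, while both X^0.27 and X^0.25 should land near zero skewness, which is why either transform can be workable in practice.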

Before I get comments about not needing to transform: I'm generally one of those people myself. The rare exceptions are when the data are heavily skewed, and skewed because they are bounded. We plotted the time between injuries (transformed to X^0.25) on an IMR chart and it worked extremely well. You could easily use run rules to see whether safety was getting worse or getting better after a safety initiative was introduced. We plotted the current time since the last injury using a different symbol so that people knew it was still ongoing. If it crossed the UCL for the I chart, we would ask ourselves whether we had introduced a change that truly caused that point. We found it to be an effective tool to combat the managerial issue you brought up about feeling good about no injuries when in fact the process hadn't really changed.
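The IMR calculation on transformed times can be sketched in Python as follows. The day counts are made up for illustration, the function name is hypothetical, and the constants 2.66 and 3.268 are the usual scaling factors for individuals and moving range charts with a moving range of two points.

```python
def xmr_limits(values):
    """Individuals and moving range (IMR) chart limits from a list of values."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(values) / len(values)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,  # lower limit, I chart
        "ucl": center + 2.66 * mr_bar,  # upper limit, I chart
        "mr_ucl": 3.268 * mr_bar,       # upper limit, MR chart
    }


# Made-up days between successive injuries (illustration only).
days_between = [12, 45, 3, 30, 8, 60, 22, 5, 17, 90, 27, 11]
transformed = [d ** 0.25 for d in days_between]  # square root of the square root
limits = xmr_limits(transformed)

# Back-transform the upper limit so it can be quoted in days.
ucl_days = limits["ucl"] ** 4

# Hypothetical ongoing stretch since the last injury, plotted with its own symbol.
current_days = 150
signal = current_days ** 0.25 > limits["ucl"]
print(f"UCL in days: {ucl_days:.0f}, ongoing stretch beyond UCL: {signal}")
```

Quoting the limit in days after back-transforming is one way to keep the chart readable for people who prefer not to think in transformed units.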

The Hawthorne effect is pretty easy to pick up, but it is the sustained drive that makes a difference.

Our workforce didn't have much trouble understanding that time was transformed once we explained it. They understood the skewness of the data pretty easily and had enough comfort with SPC that it was easy to grasp, and we showed it in both manager meetings and production meetings. So, yes, you can use medians to develop your own set of run rules (and likely you can find what you need in a nonparametric book), but a simple transformation can work as well and is a lot less work in the end.