By: Donald J. Wheeler

The cumulative sum (or Cusum) technique is occasionally offered as an alternative to process behavior charts, even though they have completely different objectives. Process behavior charts characterize whether a process has been operated predictably. Cusums assume that the process is already being operated predictably and look for deviations from the target value. Thus, by replacing process characterization with parameter estimation, Cusums beg the very question process behavior charts were created to address.

To illustrate the Cusum approach and compare it with an average chart, I’ll use the example from page 20 of Shewhart’s first book, Economic Control of Quality of Manufactured Product (Martino Fine Books, 2015 reprint). These data consist of 204 measurements of electrical resistivity for an insulator. Shewhart organized them into 51 subgroups of size four, based upon the time order in which the measurements were obtained. Figure 1 gives the averages and ranges for these 51 subgroups.
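To make the contrast concrete, here is a minimal sketch of the classic Cusum computation applied to subgroup averages. The target and the data below are invented for illustration; they are not Shewhart’s resistivity values.

```python
# Tabular Cusum from subgroup averages: a running sum of deviations
# from the target. Data and target here are invented for illustration;
# they are not Shewhart's resistivity values.
target = 100.0
subgroup_averages = [99.2, 100.8, 101.5, 98.7, 100.1]

cusum = []
running = 0.0
for avg in subgroup_averages:
    running += avg - target       # deviation of this subgroup from target
    cusum.append(running)

print([round(c, 1) for c in cusum])  # [-0.8, 0.0, 1.5, 0.2, 0.3]
```

Note that the Cusum drifts away from zero only when the process mean moves off the target, which is exactly why the technique presupposes a predictable process: On an unpredictable process the drift has no single cause to estimate.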

By: Donald J. Wheeler

Many people have been taught that capability indexes only apply to “normally distributed data.” This article will consider the various components of this idea to shed some light on what has, all too often, been based on superstition.

Capability indexes are statistics

Capability and performance indexes are arithmetic functions of the data. They are no different from an average or a range, just slightly more complex. The four basic indexes are the following:

The capability ratio, Cp, is an index number that compares the [space available within the specifications] with the [generic space required for any predictable process].

The performance ratio, Pp, is another index number that compares the [space available within specifications] with the [estimated space used by the process in the past].
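As a minimal sketch of the arithmetic, with invented data and specification limits: Cp and Pp differ only in how the dispersion is estimated. Cp uses the within-subgroup dispersion (average range divided by the bias correction factor d2, which is 2.059 for subgroups of size four), while Pp uses the global standard deviation of all the data.

```python
import statistics

# Hypothetical example: specification limits and subgroup data are
# invented for illustration.
LSL, USL = 90.0, 110.0
subgroups = [[99, 101, 100, 98], [102, 100, 99, 101], [97, 100, 103, 100]]

# Within-subgroup dispersion: average range over the bias factor d2
# (d2 = 2.059 for subgroups of size four).
ranges = [max(s) - min(s) for s in subgroups]
sigma_within = (sum(ranges) / len(ranges)) / 2.059

# Global dispersion: standard deviation of all the data pooled together.
all_data = [x for s in subgroups for x in s]
sigma_total = statistics.stdev(all_data)

Cp = (USL - LSL) / (6 * sigma_within)
Pp = (USL - LSL) / (6 * sigma_total)
print(round(Cp, 2), round(Pp, 2))
```

Neither formula makes any appeal to a normal distribution; both are simply ratios of two widths.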

By: Danielle Underferth

As municipalities clamor for a slice of President Biden’s $1.2 trillion infrastructure spending bill, one Johns Hopkins scientist is re-examining one of the basic elements of road-building: determining the width of road lanes. But determining the width that provides the highest level of safety, access, and comfort for every road user—drivers, cyclists, and pedestrians—is complex, says Shima Hamidi, an assistant professor in Johns Hopkins’ Department of Environmental Health and Engineering, which is shared by the Whiting School of Engineering and the Bloomberg School of Public Health.

It’s a data problem, she says, and she wants to help cities solve it.

Hamidi is undertaking a massive collection of data on urban streets across the United States to answer one question: How low can cities go on street width to make room for bike lanes and wider sidewalks?

By: Tristan Mobbs

All too often the topic of fixing dirty data is neglected in the plethora of online media covering artificial intelligence (AI), data science, and analytics. This is wrong for many reasons.

To highlight just one, confidence in the quality of data is the vital foundation of all analysis. This topic remains relevant for all levels of complexity, from spreadsheets to complex machine-learning models.

So, I was delighted to review Susan Walsh’s book, Between the Spreadsheets: Classifying and Fixing Dirty Data (Facet Publishing, 2021). Here are some highlights from her book, and my own advice on who should read it.

By: Atul Minocha

Do you ever feel like you’re spending money like crazy on marketing and getting little or nothing in return? If so, you might be tempted to pull the plug on marketing altogether. That would be a big mistake.

An effective marketing strategy can mean the difference between your organization’s success and failure. To maximize your strategy, there are eight common marketing mistakes you should avoid at all costs.

#1 Focusing solely on data

Most marketers firmly believe the old saying, “What doesn’t get measured doesn’t get improved.” They track various metrics, hoping the data will show them how to improve customer engagement.

The problem is some of the most important elements of customer engagement—like emotional response—can’t be tracked easily. How do you measure whether or not you’re tugging at their heartstrings?

The real power of marketing comes from synergy of both the left brain (data) and the right brain (emotion). Focusing solely on the data will never lead to optimal results.

By: Donald J. Wheeler

Most of the world’s data are obtained as byproducts of operations. These observational data track what happens over time and have a structure that requires a different approach to analysis than that used for experimental data. An understanding of this approach will reveal how Shewhart’s generic, three-sigma limits are sufficient to define economic operation for all types of observational data.

Management requires prediction, yet all data are historical. To use historical data to make predictions, we will have to use some sort of extrapolation. We might extrapolate from the product we have measured to a product not measured, or we might even extrapolate from the product measured to a product not yet made. Either way, the problem of prediction requires that we know when these extrapolations are reasonable and when they are not.
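One way to judge when such extrapolation is reasonable is to place the historical data on a process behavior chart. The following sketch, using invented values, computes generic three-sigma natural process limits for an XmR chart from the average moving range; the 2.66 scaling factor is the standard conversion from an average moving range to three sigma.

```python
# Three-sigma natural process limits for an XmR (individuals and
# moving range) chart, computed from the average moving range.
# The values below are invented for illustration.
values = [10.2, 10.5, 9.8, 10.1, 10.4, 9.9, 10.3, 10.0]

moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
mean_x = sum(values) / len(values)
mean_mr = sum(moving_ranges) / len(moving_ranges)

# The scaling factor 2.66 converts the average moving range into
# three-sigma limits for the individual values.
ucl = mean_x + 2.66 * mean_mr
lcl = mean_x - 2.66 * mean_mr
print(round(lcl, 2), round(mean_x, 2), round(ucl, 2))
```

When the values stay within these limits and show no evident patterns, extrapolation beyond the historical data becomes reasonable; when they do not, prediction is unwarranted.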

The structure of observational data

Before we talk about prediction, we need to consider the structure of observational data. For any one product characteristic we can usually list dozens, or even hundreds, of cause-and-effect relationships that affect that characteristic. Some of these causes will have larger effects than others. So, if we had perfect knowledge, we could arrange the causes in order according to the size of their effects to obtain a Pareto diagram like Figure 1.

By: Anthony D. Burns

I’m a chemical engineer. The fundamentals of the chemical engineering profession were laid down 150 years ago by Osborne Reynolds. Although chemical engineering has seen many advances, such as digital process control and evolutionary process optimization, every engineer understands and uses Reynolds’ work. Most people have heard of the Reynolds number, which plays a key role in calculating air and liquid fluid flows. There are no fads. Engineers use the fundamentals of the profession.
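For readers unfamiliar with it, the Reynolds number is simply Re = ρvL/μ, the ratio of inertial to viscous forces in a flow. A toy calculation with illustrative values:

```python
# Reynolds number Re = rho * v * L / mu. The values are illustrative:
# water flowing at 2 m/s through a 50 mm pipe.
rho = 1000.0    # density, kg/m^3
v = 2.0         # mean velocity, m/s
L = 0.05        # characteristic length (pipe diameter), m
mu = 1.0e-3     # dynamic viscosity, Pa*s

Re = rho * v * L / mu
print(round(Re))  # 100000 -> well into the turbulent regime for pipe flow
```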

Fads, fads, fads

By contrast, in the past 70 years, “quality” has seen more than 20 fads. The fundamentals have been forgotten and corrupted. Quality has been lost. Quality managers engage in an endless pursuit of magic pudding that will fix all their problems.

Alarmingly, the latest “quality” fad, Agile, has nothing to do with quality. It’s a software development fad that evolved from James Martin’s rapid application development (RAD) fad of the 1980s. This in turn grew into the rapid iterative processing (RIP) fad. When it comes to quality today, anything will do, no matter how unrelated.

By: W. Edwards Deming

Editor’s note: The following is from a transcript of a forgotten speech given in Tokyo in 1978 by W. Edwards Deming for the Union of Japanese Scientists and Engineers (JUSE). Because the original was a poor photocopy, there are small portions of text that could not be transcribed. Transcript courtesy of Mike McLean.

The spectacular leap in quality of most Japanese manufactured products, from third-rate to top quality and dependability, with astounding economy in production, started off in 1950 with a meteoric flash, and still continues. The whole world knows about Japanese quality and the sudden surge upward that began in 1950, but few people have any idea how it happened.

It seems worthwhile to collect in one place the statistical principles of administration that made possible the revolution of quality in Japan, as even at this date, most of these principles are not generally understood or practiced in America. It is for this reason that the title speaks of new principles.

The relative importance of some of the principles explained here has of course changed over the years since 1950. Some principles stated here have emerged as corollaries of earlier principles. Other corollaries could be added, almost without end.

By: William A. Levinson

Part one of this article showed that it is possible, by means of a Visual Basic for Applications program in Microsoft Excel, to calculate the fraction of in-specification product that is rejected by a non-capable gage, as well as the fraction of nonconforming product that is accepted. This calculation requires only 1) the process performance metrics, including the parameters of the distribution of the critical to quality characteristic, which need not be normal; and 2) the gage variation as assessed by measurement systems analysis (MSA).

Part two of the series shows how to optimize the acceptance limits to either minimize the cost of wrong decisions or assure the customer that it will receive no more than a specified fraction of nonconforming work.
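The article’s implementation is a VBA program in Excel. The following Python sketch illustrates the same kind of calculation under the simplifying assumption that both the product characteristic and the gage error are normally distributed (the article’s method, as noted, does not require product normality). All numbers are hypothetical.

```python
import math

# Fractions of good product falsely rejected and nonconforming product
# falsely accepted by a gage with measurement error. Assumes a normal
# product distribution and normal gage error; all numbers are invented.
mu, sigma = 100.0, 2.0      # process mean and standard deviation
sigma_gage = 0.8            # gage repeatability (standard deviation)
LSL, USL = 95.0, 105.0      # specification limits

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_accept(x):
    """Probability that a part whose true value is x measures in spec."""
    return phi((USL - x) / sigma_gage) - phi((LSL - x) / sigma_gage)

def density(x):
    """Normal probability density of the product characteristic."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Midpoint-rule numerical integration over the true-value distribution.
n = 4000
lo, hi = mu - 8 * sigma, mu + 8 * sigma
dx = (hi - lo) / n

false_reject = 0.0   # in-spec product that the gage rejects
false_accept = 0.0   # out-of-spec product that the gage accepts
for i in range(n):
    x = lo + (i + 0.5) * dx
    w = density(x) * dx
    if LSL <= x <= USL:
        false_reject += w * (1.0 - p_accept(x))
    else:
        false_accept += w * p_accept(x)

print(round(false_reject, 4), round(false_accept, 4))
```

Tightening or loosening the acceptance limits relative to the specification limits shifts the balance between these two error fractions, which is precisely the trade-off the optimization in part two addresses.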

By: William A. Levinson

A clause of IATF 16949:2016 requires measurement systems analysis (MSA) to quantify gage and instrument variation. The deliverables of the generally accepted procedure are the repeatability or equipment variation, and the reproducibility or appraiser variation. The Automotive Industry Action Group adds an analytic process with which to quantify the equipment variation (repeatability) of go/no-go gages if these come in specified dimensions, or can be adjusted to selected dimensions.

The anvils of a snap gage can, for example, be set accurately to specified dimensions with Johansson gage blocks. Pin gages (also known as plug gages), on the other hand, come in small but discrete increments. If the precision to tolerance (P/T) ratio is greater than the generally accepted target, the gage cannot distinguish reliably between good and nonconforming product near the specification limits. This means nonconforming work will reach internal or external customers, while good items will be rejected, as shown in figure 1 below.
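As a sketch of the arithmetic, using the common convention of six gage standard deviations for the precision, and hypothetical numbers:

```python
# Precision-to-tolerance ratio with the common six-sigma convention.
# Numbers are hypothetical: a pin gage checking a nominal 1.000 dimension.
sigma_gage = 0.0003        # gage repeatability, standard deviation
LSL, USL = 0.995, 1.005    # specification limits

pt_ratio = 6 * sigma_gage / (USL - LSL)
print(round(pt_ratio, 2))  # 0.18 -> above the commonly cited 0.10 target
```

A ratio this large means the gage consumes a substantial share of the tolerance, so measurements near either specification limit cannot be trusted to sort good product from bad.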
