© 2022 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.

“Quality Digest” is a trademark owned by Quality Circle Institute, Inc.

Published on *Quality Digest* (https://www.qualitydigest.com)

**Published:** 05/18/2016

I’ve mentioned that design of experiments (DOE) is one of the few things worth salvaging from typical statistical training, and I thought I’d talk a bit more about DOE in the next couple of columns. The needed discipline for a good design is similar when using rapid-cycle plan-do-study-act (PDSA).

Doing a search on the current state of DOE in improvement education, I observed that curricula haven’t changed much in the last 10 years and still seem to favor factorial designs or orthogonal arrays as a panacea.

The main topics for many basic courses remain:

• Full and fractional factorial designs

• Screening designs

• Residual analysis and normal probability plots

• Hypothesis testing and analysis of variance (ANOVA)

The main topics for advanced DOE courses usually include:

• Taguchi signal-to-noise ratio

• Taguchi approach to experimental design

• Response-surface designs

• Hill climbing

• Mixture designs

No doubt these are all very interesting. But what is the 20 percent of this material that will solve 80 percent of people’s problems? Some of the topics above are very specialized, rarely used, and can only be understood when people have a practical working knowledge of the other material after actually using it.

Many trainers also fall into the trap of thinking that hypothesis testing and ANOVA should be taught as separate topics. A well-respected statistical colleague says it so well [my emphasis]: “I get [questions about degrees of freedom] all the time (ANOVA tables in particular seem to terrorize people) ... *but I wish people were asking better questions about the problem they’re trying to understand/solve, the quality of the data they’re collecting/crunching, and what on Earth they’re actually going to do with the results and their conclusions.* In a well-meaning attempt not to turn away *any* statistical questions, my own painful attempts to explain degrees of freedom have only served to distract the people who are asking from what they really should be thinking about.”

A basic knowledge of full and fractional factorial designs, screening designs, and their analysis and diagnostics is a good place to start. This knowledge, though necessary and useful when one is at a low state of knowledge about one’s process, is not sufficient. It usually needs to be supplemented by some basic, extremely useful designs from response surface methodology.

There is no finer reference for a process-oriented approach to DOE than *Quality Improvement Through Planned Experimentation* by Ronald Moen, Thomas Nolan, and Lloyd Provost (McGraw-Hill Education, 2012). It does not, however, cover response surface methodology.

In my industrial career, many of my clients found much more ultimate value in obtaining a process road map, called a contour plot, which is produced through response surface methodology. In its basic form it is hardly an advanced technique, but it does go a bit beyond factorial designs. Often, response surface methodology can build on factorial designs in an efficient sequential strategy as one evolves to a higher state of knowledge, which leads to much more effective optimization and process control.
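As an illustration of how a response-surface design can build on a factorial, here is a minimal Python sketch of a central composite design in coded units (−1 and +1 stand for a factor's low and high levels, 0 for its midpoint). The function name, the axial distance `alpha`, and the number of center points are assumptions chosen for illustration; this column does not say which specific design it has in mind.

```python
from itertools import product

def central_composite(k, alpha=1.0, n_center=1):
    """Return a central composite design for k factors in coded units.

    alpha and n_center are illustrative defaults, not values from the column.
    """
    # Stage 1: the two-level full factorial (the corners of the cube).
    cube = [list(pt) for pt in product((-1.0, 1.0), repeat=k)]
    # Stage 2: axial ("star") points, one pair per factor, others at center.
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            pt = [0.0] * k
            pt[i] = sign * alpha
            axial.append(pt)
    # Center points; repeating them estimates run-to-run variation directly.
    center = [[0.0] * k for _ in range(n_center)]
    return cube + axial + center

design = central_composite(3, alpha=1.0, n_center=1)
print(len(design))  # 8 cube + 6 axial + 1 center = 15 runs
```

The sequential idea is visible in the structure: the cube can be run first, and the axial and center points added later if the factorial results suggest curvature.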

A typical contour plot is shown in figure 1 (scenario explained shortly). It shows how the predictive model from the design analysis can be turned into a road map of the process studied. Temperature (x-axis) and an ingredient’s concentration (y-axis) were varied over the ranges on their respective axes. For any combination of those two variables, one can read the predicted value of the response being studied (the objective in this case is to minimize it).
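To make the idea of reading a road map concrete, the Python sketch below evaluates a second-order model over a grid of temperature and concentration and reports the lowest predicted point. Everything in it is an assumption for illustration: the coefficients in `predicted_tar` are invented (they are not the fitted model behind figure 1), and the grid limits simply reuse the safe operating ranges from the scenario.

```python
def predicted_tar(temp_c, conc_pct):
    """Hypothetical second-order model of percent tar (illustrative only)."""
    return 7.0 + 0.32 * (65.0 - temp_c) ** 2 + 0.5 * (conc_pct - 28.8) ** 2

def grid(lo, hi, steps):
    """Evenly spaced values from lo to hi, inclusive."""
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]

temps = grid(55.0, 65.0, 11)  # 1-degree steps
concs = grid(26.0, 31.0, 26)  # 0.2-percent steps

# Evaluate the model at every grid point. A contour plot draws lines of
# equal predicted response through exactly this kind of table.
surface = {(t, c): predicted_tar(t, c) for t in temps for c in concs}

# The grid point with the lowest predicted tar.
best_point = min(surface, key=surface.get)
print(best_point, round(surface[best_point], 2))
```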

However, this map *is never fully known* and can only be approximated. The question becomes: What is your best shot at doing this in as few experiments as possible? First, some background.

The contour plot in figure 1 maps a real production process where the desired product immediately decomposes into a pesky, tarry byproduct that is difficult and expensive to remove. The process is currently averaging approximately 15-percent tar, and each achievable percent reduction equates to $1 million (in 1970 dollars) in annual savings.

Process history has determined three variables to be crucial for process control: temperature, copper sulfate concentration, and excess nitrite. Any combination of these three variables within the ranges of temperature, 55°–65°C; copper sulfate concentration, 26–31 percent; and excess nitrite, 0–12 percent would represent a safe and economical operating condition. The current operating condition is the midpoint of these ranges.

For purposes of experimentation only, the equipment is capable of operating in the following ranges if necessary: temperature, 50°–70°C; copper sulfate concentration, 20–35 percent; and excess nitrite, 0–20 percent.

Suppose you had a budget of 25 (expensive) experiments that need to answer these questions:

• Where should the three variables be set to minimize tar production?

• What percent tar would be expected?

• What’s the best estimate of the process variation (i.e., tar ± x percent)?

This is the scenario I use to introduce my experimental design seminars. I divide the audience into groups of three to four people, and give each group a process simulator where they can enter any condition and get the resulting percent tar.

It almost never fails: I get as many different answers for optimum settings, resulting tar, and variation as there are groups in the room—and just as many strategies (and number of experiments run) for reaching their conclusions. Human variation!

I have each group present its results to me, and I act like the many mercilessly tough managers to whom I have made similar presentations.

General observations:

• Most try holding two of the variables constant while varying the third, and then try to further optimize by varying the other two around their best result.

• Each experiment seems to be run based only on the previous result.

• Some look at me smugly and run the cube of a three-variable factorial design (many times getting the worst answers in the room).

• Some run more than the allotted 25 experiments.

• Some go outside of the established variable safe ranges.

• Most find a good result, and then try a finer and finer grid to further optimize.

• There’s always one group that claims to have optimized in fewer than 10 experiments, and the group members (and everyone else) look at me like I’m nuts when I tell them they should repeat their alleged optimum, which will use up an experiment; and that repeating *any* condition uses up an experiment.
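For reference, the "cube" mentioned in the observations above is the set of eight corner runs of a two-level, three-factor full factorial. A minimal Python sketch, assuming the cube is placed at the limits of the safe operating ranges given in the scenario (the variable names are illustrative):

```python
from itertools import product

# Safe operating ranges from the scenario: (low, high) for each variable.
safe_ranges = {
    "temperature_C": (55.0, 65.0),
    "copper_sulfate_pct": (26.0, 31.0),
    "excess_nitrite_pct": (0.0, 12.0),
}

# Every combination of low and high levels: 2**3 = 8 runs.
names = list(safe_ranges)
cube_runs = [
    dict(zip(names, levels))
    for levels in product(*(safe_ranges[n] for n in names))
]

for run in cube_runs:
    print(run)
```

The cube uses 8 of the 25 budgeted experiments. Because every run sits at an extreme of each range, these eight runs alone cannot estimate curvature, and without repeats they give no direct estimate of run-to-run variation.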

I’m accused of horrible things when the repeated condition gets a different answer (sometimes differing by as much as 11–14 percentage points of tar). I simply ask, “If you run a process at the same conditions on two different days, do you get the same results?”

What usually happens as a result:

• I’m often told the “process is out of control,” so there’s no use experimenting.

• Most estimates of process variability are naively low.

• Groups have no idea how to present results in a way that would sell them to a tough manager.

• The suggested optimal excess nitrite settings are all over the range of 0–12, even though *it is modeled to have no effect* and should be set to zero.

My simulator generates the true number from the actual process map (in figure 1) plus random, normally distributed variation with a standard deviation of four. (The actual process had a standard deviation of eight.) On the contour plot, tar is minimized at 65°C and approximately 28.8 percent CuSO₄, resulting in 6–8 percent tar ± ~8–10 for any production run.
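A simulator of this kind can be sketched in a few lines of Python. The function `true_tar` below is a hypothetical stand-in for the process map in figure 1: its coefficients are invented so that tar is about 15 percent at the current operating condition and about 7 percent at 65°C and 28.8 percent CuSO4, with no nitrite effect. Only the noise standard deviation of four comes from the column.

```python
import random

NOISE_SD = 4.0  # simulator noise standard deviation stated in the column

def true_tar(temp_c, cuso4_pct, nitrite_pct):
    """Hypothetical process map; nitrite_pct is deliberately unused."""
    return 7.0 + 0.32 * (65.0 - temp_c) ** 2 + 0.5 * (cuso4_pct - 28.8) ** 2

def run_experiment(temp_c, cuso4_pct, nitrite_pct, rng=random):
    """One simulated run: the true value plus normally distributed noise."""
    return true_tar(temp_c, cuso4_pct, nitrite_pct) + rng.gauss(0.0, NOISE_SD)

rng = random.Random(2016)  # fixed seed so the example is repeatable
# Repeat the current operating condition (midpoint of the safe ranges).
repeats = [run_experiment(60.0, 28.5, 6.0, rng) for _ in range(5)]
print([round(y, 1) for y in repeats])
```

Because each run carries its own noise, the difference between two runs at identical settings has a standard deviation of √2 × 4 ≈ 5.7, so repeats that disagree by 11 or more are unusual but far from impossible.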

In 1983, I heard the wonderfully practical C.D. Hendrix say, “People tend to invest too many experiments in the wrong place!”

As it turns out, by the end of the class, human variation is minimized when every group independently agrees on the same 15-experiment strategy (a few choose an alternative, equally effective 20-experiment strategy). When they see quite different numerical results from each individual design, they are initially leery, but then pleasantly surprised when they all get pretty close to the real answer.

Reduced human variation = higher quality and more consistent results *in only 15–20 experiments.* They are now in “the right place,” and have 5–10 more experiments to refine their optimum.

More next time.
