Six Sigma and Beyond

Applying Virtual DOE in the Real World

A manufacturing example illustrates the benefits of data mining and artificial neural networks.

In last month's column I presented a method for conducting virtual design of experiments (VDOE) using artificial neural networks and data mining. This month I will present a manufacturing example that illustrates how this approach can be applied in the real world.

Table 1: Raw Data Used to Train and Validate a Neural Net Model

The data in Table 1 are from a solder process. Data weren't gathered for a designed experiment, but were merely collected during the operation of the process. Statisticians often refer to such data as "happenstance data." Because happenstance data aren't gathered under controlled conditions, they are subject to a wide variety of unknown influences that make their validity questionable. Therefore, happenstance data analysis should never be used to draw conclusions. However, because organizations usually invest substantial sums of money collecting these data, it would be a waste to ignore the information in the data warehouse just because it isn't "clean" or perfect.

The data mining approach I recommend is to use the data to ask questions rather than to reach firm conclusions. The questions posed by the data mining analysis can be studied further using pilot projects and statistical design of experiments (DOE) under controlled conditions. Some examples of such questions are as follows:

• Which variable or variables are likely to be important? (These variables can help design factorial or screening experiments.)

• Approximately what effect is caused by changing a given variable by a certain amount? (This effect can be used to determine high, intermediate and low settings for the experiment.)

• Is there a combination of variables that causes the process to break down? (A combination of this kind can be used to help determine which experimental region to study or which type of experimental design to use.)

These questions can be partially answered by data mining. Artificial neural networks are just one of many possible data mining techniques that can be used. The results of the data mining provide further input into the DOE process.

Figure 1: Neural Net Model

I used the data in Table 1 to train and validate a neural net. The variable PH Time is the preheat time in seconds, PH Distance is the distance from the board to the preheat element in centimeters, and Defects is the coded number of defects per unit. The model produced by the neural net is shown in Figure 1.
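To make the train-and-validate step concrete, here is a minimal sketch in plain Python. It is not the software or the model used for this column. The data are synthetic stand-ins for Table 1 (the response surface, the noise level, and the 60/20 split are all assumptions), and the network is a small hypothetical one with a single hidden layer trained by stochastic gradient descent.

```python
import math
import random

def make_synthetic_rows(n, rng):
    """Synthetic stand-in for Table 1 (assumed, not the real solder data)."""
    rows = []
    for _ in range(n):
        t = rng.uniform(30.0, 60.0)   # PH Time, seconds (assumed range)
        d = rng.uniform(15.0, 30.0)   # PH Distance, cm (assumed range)
        # Invented bowl-shaped defect response plus noise.
        y = 2.0 + 0.01 * (t - 45.0) ** 2 + 0.02 * (d - 22.5) ** 2 + rng.gauss(0.0, 0.2)
        # Store inputs scaled to roughly [-1, 1] so training behaves well.
        rows.append(((t - 45.0) / 15.0, (d - 22.5) / 7.5, y))
    return rows

class TinyNet:
    """One hidden tanh layer and a linear output, trained by plain SGD."""

    def __init__(self, hidden, rng):
        self.w1 = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x1, x2):
        h = [math.tanh(w[0] * x1 + w[1] * x2 + b) for w, b in zip(self.w1, self.b1)]
        out = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, out

    def train_step(self, x1, x2, y, lr):
        h, out = self.forward(x1, x2)
        err = out - y  # gradient of 0.5 * err**2 with respect to the output
        for j, hj in enumerate(h):
            # Backpropagate through tanh before updating the output weight.
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.w1[j][0] -= lr * grad_pre * x1
            self.w1[j][1] -= lr * grad_pre * x2
            self.b1[j] -= lr * grad_pre
        self.b2 -= lr * err

def mse(net, rows):
    """Mean squared prediction error over a set of rows."""
    return sum((net.forward(x1, x2)[1] - y) ** 2 for x1, x2, y in rows) / len(rows)

def run_demo(seed=0, epochs=300, lr=0.02):
    rng = random.Random(seed)
    rows = make_synthetic_rows(80, rng)
    train, validate = rows[:60], rows[60:]   # hold out 20 rows for validation
    net = TinyNet(hidden=6, rng=rng)
    before = mse(net, validate)
    for _ in range(epochs):
        rng.shuffle(train)
        for x1, x2, y in train:
            net.train_step(x1, x2, y, lr)
    return before, mse(net, validate)

before, after = run_demo()
print(f"validation MSE before training: {before:.3f}")
print(f"validation MSE after training:  {after:.3f}")
```

The point of the held-out rows is the same as in the column: the fit is judged on data the net did not see during training, which is only a partial safeguard when the data themselves are happenstance.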

It is interesting to compare the neural net model with the response surface model (RSM) produced by classical DOE methods. The RSM is shown in Figure 2. Variable PH Time represents the circuit board preheat time in seconds, and variable D represents the distance from the preheat element in centimeters. In this RSM, the zero point for variable PH Time is 45 sec, and zero for PH Distance is 22.5 cm. The surface described by the neural net is somewhat different; however, both models direct the PH Time and PH Distance settings to similar levels and both make similar predictions for the defect rate.

Figure 2: Response Surface Analysis
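The zero points quoted for the RSM suggest that the variables are centered (coded) before fitting. A minimal sketch of that conversion follows. The centers of 45 sec and 22.5 cm come from the text; the half-ranges used for scaling are assumptions chosen purely for illustration, since the column does not state them.

```python
# Centers (zero points) are taken from the column.
PH_TIME_CENTER = 45.0       # seconds
PH_DISTANCE_CENTER = 22.5   # centimeters

# Half-ranges are ASSUMED for illustration; the column does not give them.
PH_TIME_HALF_RANGE = 15.0
PH_DISTANCE_HALF_RANGE = 7.5

def to_coded(value, center, half_range):
    """Convert a natural-unit setting to a coded value (0 at the center)."""
    return (value - center) / half_range

def to_natural(coded, center, half_range):
    """Convert a coded value back to natural units."""
    return center + coded * half_range

# A preheat time of 45 sec sits at the coded zero point.
print(to_coded(45.0, PH_TIME_CENTER, PH_TIME_HALF_RANGE))
# Coded +1 for distance maps back to 30 cm under the assumed half-range.
print(to_natural(1.0, PH_DISTANCE_CENTER, PH_DISTANCE_HALF_RANGE))
```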

Neural net software allows "What if?" analysis, as shown in Figure 3. With What if?, you can conduct virtual DOE: design proper experiments just as you would design real-world experiments, but rather than using the inputs to conduct experiments on the actual process, feed them into the neural net What if? analysis. The outputs from the neural net are then analyzed just as you would analyze the results of real experiments: they are entered into your DOE software, and statistical tests are performed on the results.

Figure 3: What If? Analysis Using the Neural Net Model
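The virtual experiment described above can be sketched as a two-level full factorial run against a model instead of the process. In the sketch below, `what_if` is a hypothetical stand-in for the neural net's What if? prediction (its coefficients are invented), and the high and low levels are assumed; only the centers of 45 sec and 22.5 cm come from the column.

```python
from itertools import product

CENTER = {"ph_time": 45.0, "ph_distance": 22.5}     # zero points from the column
HALF_RANGE = {"ph_time": 5.0, "ph_distance": 2.5}   # ASSUMED high/low spacing

def what_if(ph_time, ph_distance):
    """Hypothetical stand-in for the trained net's prediction (invented surface)."""
    t = ph_time - 45.0
    d = ph_distance - 22.5
    return 3.0 + 0.04 * t - 0.10 * d + 0.002 * t * d

def run_virtual_factorial(model):
    """Run a 2x2 full factorial design through the model, not the real process."""
    runs = []
    for s_time, s_dist in product((-1, 1), repeat=2):
        ph_time = CENTER["ph_time"] + s_time * HALF_RANGE["ph_time"]
        ph_distance = CENTER["ph_distance"] + s_dist * HALF_RANGE["ph_distance"]
        runs.append((s_time, s_dist, model(ph_time, ph_distance)))
    return runs

def effects(runs):
    """Estimate main effects and the interaction from the signed contrasts."""
    half = len(runs) / 2
    return {
        "PH Time": sum(s_t * y for s_t, _, y in runs) / half,
        "PH Distance": sum(s_d * y for _, s_d, y in runs) / half,
        "Time x Distance": sum(s_t * s_d * y for s_t, s_d, y in runs) / half,
    }

runs = run_virtual_factorial(what_if)
for name, value in effects(runs).items():
    print(f"{name}: {value:+.3f}")
```

Each main effect is the average predicted response at the high level minus the average at the low level. Because the responses come from a deterministic model, there is no replication error here, which is one more reason the results still have to be confirmed on the actual process.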


Because I created the model using happenstance data, the results must be validated. DOE methods applied to the actual process can provide the experimental validation, with the model providing the starting point for the experiments. If the data are in fact valid, the neural network VDOE will start the experiment much closer to the optimum, thereby greatly reducing the amount of real-world experimenting required. The cost savings mean that you will be able to conduct many more experiments within your DOE budget and so make improvements more quickly.
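One simple way to pick such a starting point is to scan the model's predictions over a grid of settings and begin the real experiments near the lowest predicted defect rate. The sketch below assumes a hypothetical `predicted_defects` function standing in for the neural net, with an invented surface and assumed grid ranges; a grid scan is only one possible search method.

```python
def predicted_defects(ph_time, ph_distance):
    """Hypothetical stand-in for the neural net prediction (invented surface)."""
    return 2.0 + 0.01 * (ph_time - 48.0) ** 2 + 0.03 * (ph_distance - 21.0) ** 2

def best_starting_point(model, time_levels, distance_levels):
    """Return the (PH Time, PH Distance) pair with the lowest predicted defects."""
    candidates = [(t, d) for t in time_levels for d in distance_levels]
    return min(candidates, key=lambda td: model(*td))

# ASSUMED search ranges: 30 to 60 sec in 1 sec steps, 15 to 30 cm in 0.5 cm steps.
time_levels = [30.0 + 1.0 * i for i in range(31)]
distance_levels = [15.0 + 0.5 * j for j in range(31)]

start = best_starting_point(predicted_defects, time_levels, distance_levels)
print("suggested center for the real experiment:", start)
```

The settings returned are a suggestion for where to center the real designed experiment, not a conclusion about the process.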

This column is based on an excerpt from The Complete Guide to Six Sigma, Quality Publishing LLC, scheduled for publication in summer 1999. Copyright 1999 by Quality America LLC. The article first appeared at . Reprinted by permission.


About the author

Thomas Pyzdek is president and CEO of Pyzdek Consulting Inc. He has written hundreds of articles and papers on quality topics and has authored 13 books, including The Complete Guide to the CQM.

Pyzdek served on the first board of examiners for the Malcolm Baldrige National Quality Award. He is a fellow of the American Society for Quality, an ASQ-certified quality and reliability engineer, and a recipient of the ASQ Edwards Medal.

Comments about this column can be e-mailed to Pyzdek at Tom Pyzdek , or visit his Web site at .


Copyright 1999 QCI International. All rights reserved. Quality Digest can be reached by phone at (530) 893-4095.