Problems With Gauge R&R Studies

How to make sense of your repeatability and reproducibility (R&R) values

Measurement error is ubiquitous. As a result, over the past 250 years, different areas of science and engineering have come up with many different ways to deal with the problem. One approach was developed during the 1960s within General Motors. It was modified and revised over the years until, in 1989, it was turned over to the Automotive Industry Action Group (AIAG). Since then, the AIAG gauge repeatability and reproducibility (R&R) study has been promoted throughout many different industries. Unfortunately, the original procedure contained some fundamental problems that have not been corrected over the years. This column will address these historic problems and suggest solutions.

The AIAG gauge R&R study

The gauge R&R study starts out with a sound strategy for collecting data. A simple, fully crossed experiment is performed in which two or more operators measure each of three to 10 parts two or three times apiece. As an example, we will use the data shown in figure 1, where three operators measure each of five parts two times apiece.
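
The layout of such a fully crossed study can be sketched in a few lines. This is only an illustration of the design: the operator and part labels are hypothetical, and no measurement values from figure 1 are reproduced here.

```python
from itertools import product

def crossed_design(operators, parts, trials):
    """Return every (operator, part, trial) cell of a fully crossed study."""
    return [
        (op, part, trial)
        for op, part, trial in product(operators, parts, range(1, trials + 1))
    ]

# Hypothetical labels; the sizes match the example: 3 operators x 5 parts x 2 trials.
operators = ["A", "B", "C"]
parts = [1, 2, 3, 4, 5]
runs = crossed_design(operators, parts, trials=2)

print(len(runs))   # number of measurements to collect
print(runs[:4])    # first few cells of the plan
```

Every operator measures every part the same number of times, which is what makes the design "fully crossed."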

…

Comments

Comment on Problems with Gauge R&R Studies

This sure changes how I view the many results that I have had to deal with that were less than the current AIAG Gauge R&R requirements.

However, the real question is: How do you convince customers that are ingrained in the AIAG Gauge R&R process that they need to change how they require the results to be calculated? Or is it to convince the AIAG group to change their published documents?

Kim Howarter
Revcor, Inc.
Carpentersville, IL

Kim's Question

The only alternative to ignorance is education. You might start by showing them a copy of this paper.
Hope this will help.

Good question

I'm hoping Don will weigh in. I remember taking the EMP workshop years ago, and Don told us he had presented this paper at an AIAG or ASQ working group, and they just stared blankly at him...
On the hopeful side, I ran into some British and German engineers a few years ago, and they told me that they had abandoned AIAG GR&R and were all exclusively using Wheeler's EMP.

Don't get me started!

I recommend doing some simulation work, and then you will discover a few things.

1) The use of the d2* constant was developed for variances. When you use this constant to estimate a standard deviation, you get a biased result. It will show up especially in the attempt to determine reproducibility, where it is generally quite significant. The d2 number should actually be used instead. If you are trying to estimate the variance from a range, or a group of ranges, use d2*, but if you are trying to estimate a standard deviation, use d2. This depends on your research question, and what you wish to do with the numbers generated.

2) Any attempt to determine measurement error as a proportion of the observed total variation in a measurement study is bogus and has no external validity. The problems are the sampling method and the sample size. First, I contend that you should not be taking a random sample of items, but a stratified (non-random) sample that exercises the full range of the measurement system. There are other analyses that you need to do, such as determining measurement error as a function of item size. You certainly want to look at product near your specification limits, and at how the measurement system behaves in those regions. Taking a "random sample" could be quite misleading.

The second problem is the sample size used. Just look at the confidence intervals for variances and standard deviations with a sample size of 10 (a size often used in an "R&R" study)! The sampling error is so large that any attempt at estimation is futile.

3) And finally, as I looked at thousands of measurement system applications in industry, I found that in many cases "reproducibility" should really be treated as a "fixed" factor. That is a completely different analysis and study. In the case of a fixed factor, you are not interested in estimating the variance from operator/system to operator/system; you are interested in estimating the bias between operators/systems. Often, this is much more useful.
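
The simulation suggested in point 1 can be sketched as follows. This is a minimal illustration, not anyone's published procedure. It assumes normally distributed measurements and the simplest case of a single range of two values, where both constants have closed forms: d2 = 2/sqrt(pi) (about 1.128) and d2* = sqrt(2) (about 1.414). The function name and defaults are hypothetical.

```python
import math
import random

def simulate_range_estimators(sigma=1.0, trials=200_000, seed=1):
    """Average four range-based estimates over many single ranges of size 2."""
    rng = random.Random(seed)
    d2 = 2.0 / math.sqrt(math.pi)   # E[R] / sigma for n = 2
    d2_star = math.sqrt(2.0)        # sqrt(E[R^2]) / sigma for n = 2, one range

    sum_r = 0.0
    sum_r2 = 0.0
    for _ in range(trials):
        r = abs(rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma))
        sum_r += r
        sum_r2 += r * r
    mean_r = sum_r / trials
    mean_r2 = sum_r2 / trials

    return {
        "sd_using_d2": mean_r / d2,               # targets sigma
        "sd_using_d2_star": mean_r / d2_star,     # biased low for sigma
        "var_using_d2": mean_r2 / d2**2,          # biased high for sigma^2
        "var_using_d2_star": mean_r2 / d2_star**2,  # targets sigma^2
    }

results = simulate_range_estimators()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

With sigma = 1, the d2 estimate of the standard deviation and the d2* estimate of the variance should both average close to 1, while swapping the constants should give roughly 0.80 for the standard deviation and roughly 1.57 for the variance. The bias shrinks as more ranges are averaged, which is why it matters most for reproducibility, where only one range of operator averages is available.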

Michael Petrovich's comments

Michael Petrovich is right about the origin and use of d2*. The inconsistent use of d2 and d2* is part of the AIAG study. I did not recommend these formulas, but merely presented the formulas used by AIAG. In my EMP book I present a table of formulas for estimators of various quantities which lists 12 alternatives for the three basic variances involved. However, when using the adjustment terms shown for repeatability, we are estimating a variance, and so d2* is appropriate. We could also take exception with the way the AIAG study uses adjustment terms for Repeatability and yet fails to use adjustment terms for the product variation (where it inappropriately uses d2*).

On the second point, I will refrain from agreeing that the intraclass correlation is "bogus." It is the correlation between two measurements of the same item. It is theoretically sound, and it provides a way to characterize the relative utility of a measurement system for a given application. However, I did not address the issue of how the parts are selected. In my EMP book I discuss and illustrate systematic samples, grab samples, and other types of sampling with EMP Studies. The structure of the data in an R&R study will result in a decent estimate of measurement error, but it will give you a soft estimate of the product variation (9 d.f. or less). This means that there will be a lot of uncertainty in any of the ratios computed. In my book I also discuss how to solve the problem of a lack of degrees of freedom for the product variation estimate.
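
To give a feel for how soft an estimate with 9 d.f. is, the following sketch simulates the sample standard deviation of 10 normally distributed values many times and reports the middle 95 percent of the results relative to the true value. The normality assumption, the function name, and the trial count are illustrative choices, not part of the article.

```python
import math
import random

def sd_ratio_percentiles(n=10, trials=20_000, seed=1):
    """Simulate s / sigma for samples of size n; return the 2.5th and 97.5th percentiles."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        ratios.append(s)  # sigma is 1, so s is already the ratio s / sigma
    ratios.sort()
    lo = ratios[int(0.025 * trials)]
    hi = ratios[int(0.975 * trials)]
    return lo, hi

lo, hi = sd_ratio_percentiles()
print(f"middle 95% of s/sigma with 9 d.f.: {lo:.2f} to {hi:.2f}")
```

Under these assumptions the estimated standard deviation should range from roughly 55 percent to roughly 145 percent of the true value, so any ratio that uses it in the denominator inherits that uncertainty.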

I am aware of the differences in fixed and random effect analysis. I deal with these differences in my EMP book where I show how to detect and estimate operator biases. However, the AIAG gauge R&R study simply assumes that the operators are different and that they are a random effect. Since I had already shown the R&R study to be inappropriate in so many other ways, I did not go on to address this issue in this paper in the interest of simplicity.

Software

May I suggest also posing this to Minitab and JMP, to allow for alternate calculations in their software?

Gauge R and R

Thanks for the excellent article. It was timely, and I have already forwarded it to colleagues who are starting to get the message.

GRR Software

GagePack by PQ Systems has already incorporated Don's EMP III methods as part of their calibration software.

Rich DeRoeck

Semiconductor Industry white papers available

The member companies of ISMI (the manufacturing subsidiary of SEMATECH, the semiconductor industry consortium) have written a white paper to aid companies in our industry to deal with auditors who insist on using AIAG guidelines, even though the MSA Manual states they are only guidelines. We have found that often both internal and external auditors insist on the use of AIAG methods, even though Don Wheeler and others have shown that there are superior measures of the goodness of a measurement system. The white paper can be downloaded free of charge via http://www.sematech.org/docubase/abstracts/35939.htm.

A corresponding white paper for use with SPC system auditors can be found at http://www.sematech.org/docubase/abstracts/35938.htm.

Performing this in Minitab

The method Dr. Wheeler is describing is the "Xbar-R Method" in Gage R&R (Crossed) in Minitab and the "honest" ratios are presented as well as the AIAG ratios. The "honest" ones are in the %Contribution column presented first, and the AIAG ratios are presented as %Study Var in the next table.

For what it's worth, I believe most practitioners with stat software are currently using the ANOVA method and not Xbar-R, but the concept of additivity in %Contribution, as compared to %Study Var, holds true just the same. I believe the only attractive property of %Study Var is that it is in the original units of the data and, at least in that dimension, has an easier interpretation. Without a firm understanding of the underlying formulas and concepts, however, the correct interpretation of each is likely lost.
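
The additivity point can be checked with a little arithmetic. The sketch below uses made-up variance components, chosen only to give round numbers, and computes both kinds of percentages: ratios of variances (%Contribution) and ratios of standard deviations (%Study Var).

```python
import math

# Hypothetical variance components (squared measurement units), chosen for round numbers.
variances = {
    "repeatability": 4.0,
    "reproducibility": 5.0,
    "part_to_part": 16.0,
}
total_var = sum(variances.values())

# Ratios of variances: these are proportions of one total and must add to 100.
pct_contribution = {k: 100.0 * v / total_var for k, v in variances.items()}
# Ratios of standard deviations: square roots of the above proportions.
pct_study_var = {k: 100.0 * math.sqrt(v / total_var) for k, v in variances.items()}

for name in variances:
    print(f"{name:16s} %Contribution={pct_contribution[name]:5.1f}  "
          f"%StudyVar={pct_study_var[name]:5.1f}")
print(f"sum of %Contribution: {sum(pct_contribution.values()):.1f}")
print(f"sum of %Study Var:    {sum(pct_study_var.values()):.1f}")
```

With these numbers the %Contribution values are 16, 20, and 64, which sum to 100, while the %Study Var values are 40, about 44.7, and 80, which sum to about 164.7. Variances add; standard deviations do not.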

AIAG Gauge R&R Feedback

Interesting article. I am curious as to whether any of this has been communicated to the AIAG and what their response is.

Answer for George

I presented a paper with most of this information to the AIAG group at an ASQ world conference in 1992. They gave me an award and changed very little in the procedure.

Donald J. Wheeler, Ph.D.
Fellow American Statistical Association
Fellow American Society for Quality

d2*

Dr. Wheeler,

I've been reading through your paper (I am somewhat new to R&R), and I've been finding it extremely helpful in getting myself up to speed with what I need to know. For that, I am very thankful. My only issue, and I'm hoping it proves to be a misinterpretation or misunderstanding of the material on my part, is where you get the number 1.906 as the value for d2*. I'm not questioning its validity in the formula; rather, where can I find such a table of constants (to aid in practical application on other potential sets of data)? Again, I feel as though this may be some kind of misinterpretation of what I'm reading, so any bits that can help me over this hurdle would be greatly appreciated! Thanks!

Problem with Gauge Studies

Don, I respect your work especially your SPC text, but your continued rant about the AIAG method is dishonest. It is time for it to stop.

First, if you've done your homework you understand why Standard Deviation, not variance was used at the onset of Gauge Studies. It was 1962 and the error involved with doing calculations by hand was greater than the error with using r-bar to estimate s (or s^2). So your "honest" method just squared the result. Wow, you must be a genius!

And your talking down to us about a + b does not equal c is just plain arrogance.

The problem with your rant is you have no way to deal with the adequacy of a measure where there is a specification. You think that industry just cares about control charts. You can tell you are just an academic.

Your writings are used as an excuse to avoid the hard work of getting an adequate measure by a bunch of lazy MBBs and others. Anyone who has ever used the AIAG methods and found that they can solve unsolvable problems knows your writing is nonsense.

Just my experience.

Personal attacks

People making such personal and unprofessional attacks on Dr. Wheeler, with words such as "rant," should have the cojones to state their name.

Link this Article

Dr. Wheeler,

I admire the way you call out and fix flawed methodologies. Fairly recently, you wrote about, but didn't identify, a software package that had some fundamental errors. Do you know if they've been corrected?

May I link this article on LinkedIn?

Uncareful R&R use

Destructive testing offers further examples of careless R&R studies, especially in the automotive industry, where the AIAG manual is read literally but not always understood. Expensive though they are, destructive tests are used more and more often for safety components, for components made of plastic or rubber, and for components held together by adhesives. Only the latest AIAG manual warns against relying on R&R studies for these measurements but, once again, form takes over substance. Personally, I'm not in favor of using R&R studies to test the reliability of a measuring system; I would rather break the system down into its components (devices, people, methods, and environment), which are much easier to audit and, where necessary, to repair. But I don't make the AIAG rules, so I have to go along. Mr. Wheeler is right to raise doubts about the validity of R&R studies. It's high time; for too long their reliability has been taken for granted.

Gauge R&R studies

Thank you for your article. I have wondered why the measurement error sources never added up to 100% for years, and I have struggled to understand why a gauge that measures at 9% is 'acceptable' while '11%' is not. For that matter, I could never figure out how a gage would be unacceptable when part variation would be 90% plus of the observed variation. Thanks for the insight.
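
The puzzle in this comment can be worked through numerically. The sketch below assumes that the percentages quoted are ratios of standard deviations (R&R sigma divided by total sigma), and that total variance is the sum of R&R variance and product variance. The function name is hypothetical.

```python
def variance_shares(sd_ratio):
    """Split total variance given sd_ratio = sigma_RR / sigma_total."""
    rr_share = sd_ratio ** 2          # proportion of variance due to R&R
    return rr_share, 1.0 - rr_share   # the remainder is product variation

for sd_ratio in (0.09, 0.11, 0.30):
    rr_share, product_share = variance_shares(sd_ratio)
    print(f"SD ratio {sd_ratio:.0%}: R&R {rr_share:.2%} of variance, "
          f"product {product_share:.2%}")
```

Under these assumptions, ratios of 9 percent and 11 percent correspond to about 0.8 percent and 1.2 percent of the total variance, and even a 30 percent ratio leaves 91 percent of the observed variance attributable to the product.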

Quality

I'm a newbie here. I conducted a Type 1 gage study on a CMM and analyzed the data in Minitab, but the results are not so good. I am a little confused about the tolerance. I measured a master ring gage with a diameter of 200.001 mm 25 times and recorded the data. In Minitab, I entered the data in the measurements column, and the software also asks for a tolerance value. Tolerance of what: the CMM tolerance or the master ring gage tolerance? The CMM accuracy is ±0.05 mm, and the master ring gage tolerance is 0.001 mm.

Thanks in advance

Regards

Aslam abbas