Performing a Long-Term MSA Study

Testing through time stability

Ahh, measurement system analysis: the basis for all our jobs because, as Lord Kelvin said, “… When you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” How interesting it is, then, that we who thrive on data so frequently don't have any proof that the numbers we're using relate to the event we are measuring. Hence my past few articles: the basics of measurement system analysis in “Letting You In on a Little Secret,” how to do a potential study in “The Mystery Measurement Theatre,” and how to do a short-term study in “Performing a Short-Term MSA Study.” The only (and most important) topic remaining is how to perform a long-term study, which is the problem I left with you last month.

…

Comments

Part-to-part variation question

Very interesting article. I'm not sure that I would completely agree that part-to-part variation is of no interest to us in MSA, though...don't we desire a high value of the ratio of part-to-part variation to measurement-to-measurement variation? Aren't we testing to find out whether our instruments can detect part-to-part variation? True, in this example we are told to assume that each part produces exactly the same voltage in every reading (I'm glad you revisited that later...it's a great question to try to answer, before you throw out that gage!).

Hi Rip, and thanks for

Hi Rip, and thanks for commenting! This leads to an (I hope) interesting topic....
.
Hmm, let me rephrase this way and see if it makes sense: the component of the variation due to part-to-part variability is not needed to determine gauge capability (a.k.a. %R&R), except that I need to account for it so that I don't over-estimate the gauge variability.
.
So we don't need a "high value of the ratio of part-to-part variation to measurement-to-measurement variation" in order to determine if our gauge is capable of correctly categorizing parts as conforming or not. Let's take the extreme: the parts are identical in every possible way and we are only making one part to one spec, so we only measure these identical parts as part of our measurement process. Even though the parts are identical, there still is gauge variability, which we can still quantify using this approach - it is wholly independent of the real part-to-part variability. We just need to know if the gauge is too variable to do that conform/nonconform classification correctly. (This right here is, I believe, the source of MANY misunderstandings about gauge capability - people don't understand that gauge capability tests the capability to make the right decision (usually conformance) - it was that way from the beginning when my colleague Dr. Luftig worked on this at Ford with Dr. Deming.)
.
That said, if I measure multiple parts to multiple nominals, I want to include that entire span in my gauge study ("exercise the gauge"), otherwise I have no external validity beyond that one voltage or whatever. In this case, we only have one nominal voltage we are measuring and one part that makes it, so a random sample of the product is germane to our research question. If we add another product/nominal voltage to what we measure, we *must* re-do the MSA before qualifying it for production since the gauge might not be at all capable of measuring that. Also, be sure to note that, while we assume each part produces the same actual voltage each time, each different part produces a slightly different real voltage than the other parts, and that is the component we need to take out for our purpose of understanding the gauge variability.
.
"Aren't we testing to find out whether our instruments can detect part-to-part variation?" Actually, not really. Or rather, it is not necessary. We are testing to see if the gauge is capable of making the conform/nonconform decision based on the customer specification. Of course, if I choose as my spec something on the order of what I want to detect in part-to-part differences, then that is what I am testing, but it is not necessary for gauge capability *to the customer spec*. Consider parts that differ from each other by a nanometer, but are made to a customer spec that is +/- 1 inch - you really going to pay for a gauge to detect those nanometer differences, or just stick with a tape measure? :)
.
On the (third?) hand, if your work in the process requires you to be able to detect smaller differences than the spec to the customer, you would have two different capability numbers for the same gauge variability, right? A %R&R of 10% maybe for the customer, but a %R&R of 50% for process improvement work (dividing the same standard deviation by a smaller number). Still, one might argue that while you might need a gauge that measures more *precisely* to work on the process, it probably is not economically justifiable that it goes much beyond the equivalent of a 10% R&R of the customer spec - why are you working at that fine a detail of the spec in the first place - no one is paying you for it! (Unless, like what happened to me once, you knew the spec was going to be greatly decreased soon....)
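To make the "two capability numbers for one gauge" point concrete, here is a minimal sketch. It assumes %R&R is computed as 5.15 times the measurement standard deviation divided by the width of whatever window you are judging against (the 5.15 multiplier is the convention quoted later in this thread). The function name and all numeric values are hypothetical, picked only so the customer figure lands near 10%.

```python
def percent_rr(sigma_measurement, tolerance_width, k=5.15):
    """Percent of the tolerance width taken up by gauge variation.

    Assumption: %R&R = k * sigma / (USL - LSL), with k = 5.15.
    """
    return 100.0 * k * sigma_measurement / tolerance_width

# Hypothetical values, for illustration only.
sigma_ms = 0.0194     # gauge standard deviation, volts
customer_tol = 1.0    # customer spec width (USL - LSL), volts
process_tol = 0.2     # tighter window wanted for process improvement, volts

customer_rr = percent_rr(sigma_ms, customer_tol)
process_rr = percent_rr(sigma_ms, process_tol)

print(f"%R&R against customer spec:  {customer_rr:.1f}%")
print(f"%R&R against process window: {process_rr:.1f}%")
```

The gauge standard deviation is the same in both calls; only the denominator changes, which is why the same instrument can look fine for conformance decisions and poor for finer process work.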

Another question by

Another question by e-mail...name redacted until he comes along to claim it! :)
.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hi Steve,
.
I've been following your discussions on the subject of MSA and now have a question regarding the most recent article that appeared in this month's Quality Digest - Performing a Long-Term MSA Study. My question is connected with the following paragraph extracted from your article:
.
"But two out of 200 observations is well within what we would expect with Type I error rate of 0.0027 (the rate you get a point out of the ±3σ control limits due to chance and chance alone)..."
.
Not sure if I follow your logic here and would appreciate if you could provide me with more explanation on this part of the statement, please.
.
Best regards,
.
D-------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.
Thanks for taking the time to read and ask the question!
.
OK, let's say that we are only using the "outside the +/-3 sigma limits" rule. (In MSA I would use others, particularly the run rule, but it only makes a false signal more likely, so let's keep it simple.) The probability of an individual occurrence of that due to chance and chance alone is 0.0027. That is our Type I error rate, or the rate of a "false positive" signal.
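For anyone wondering where 0.0027 comes from, here is a small sketch. It assumes the plotted statistic is normally distributed and the process is in control, so the false-signal rate is just the two-sided normal tail area beyond 3 sigma. The function name is hypothetical.

```python
import math

def false_alarm_rate(k_sigma=3.0):
    """Two-sided normal tail probability beyond +/- k_sigma.

    Assumption: the plotted points are normal and in control.
    """
    return math.erfc(k_sigma / math.sqrt(2.0))

alpha = false_alarm_rate(3.0)
expected_in_200 = 200 * alpha  # expected false signals in 200 points

print(f"Type I error rate per point: {alpha:.4f}")
print(f"Expected false signals in 200 points: {expected_in_200:.2f}")
```

So on average you would expect about half a false signal in every 200 points even when nothing at all is wrong with the gauge.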
.
What is the likelihood of getting 2 signals in 200 due to chance and chance alone at that rate, then? Using MVPstats to calculate a one-sample exact proportion test (little dots are there to line things up):
One-Sample Proportion Test
.
....p = 0.0100............Po = 0.0027
..np = 2.0...............nPo = 0.5400
...n = 200
.
..95.00% Exact CI for P: 0.0012 to 0.0357
.
Exact Test for: P = 0.0027
Exact Binomial p-value (two-tailed) = 0.205
That means a result at least this far from the expected rate would turn up in around 20% of trials of 200 due to chance and chance alone. So I would conclude that the gauge is exhibiting reasonable stability - no reason to think we are seeing anything other than random variation.
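The quoted figures can be checked by hand with the binomial distribution. The sketch below uses only the Python standard library. Two things in it are assumptions rather than anything taken from MVPstats documentation: that the two-tailed p-value is the upper tail doubled, and that the "exact" interval is the Clopper-Pearson one. Both happen to reproduce the numbers above. The helper names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(k))

def solve_increasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection: find p in [lo, hi] with f(p) = target, f increasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n, x, p0 = 200, 2, 0.0027

# Chance of 2 or more false signals in 200 points at the 0.0027 rate.
p_at_least_2 = upper_tail(x, n, p0)
# Assumption: two-tailed value taken as twice the upper tail.
p_two_tailed = 2.0 * p_at_least_2

# Assumption: Clopper-Pearson 95% interval for the true signal rate.
ci_low = solve_increasing(lambda p: upper_tail(x, n, p), 0.025)
ci_high = solve_increasing(lambda p: upper_tail(x + 1, n, p), 0.975)

print(f"P(2 or more signals in 200) = {p_at_least_2:.4f}")
print(f"Two-tailed p-value          = {p_two_tailed:.3f}")
print(f"95% exact CI                = {ci_low:.4f} to {ci_high:.4f}")
```

Under these assumptions the one-sided chance of seeing two or more false signals is roughly 10%, and doubling it gives the two-tailed 0.205. Either way, the interval comfortably contains 0.0027, so there is no evidence the signal rate differs from what chance alone would produce.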
.
Now, that is if I am looking at the 200 points all at once, right? But what I am really aiming for is an ongoing test. If instead of setting this up I was using it, I am only getting one point a day, let's say. Now if I see one of those charts go out of control, I must react - I don't have the context of 200 points, all I have is a point that is out RIGHT NOW!!! So in that case, I stop, check the part, check the gauge, and then do another measurement to see if I still have a problem before going ahead and approving the gauge for use today.
.
Make sense?

Thanks

Yes! I see...I just never get this kind of a situation. I get parts that do vary, and an operating range, and then the concern is, how much of the variation comes from the parts and how much from the gage? Can we measure parts all the way across the range? Can I tell when something is near the edge of the spec or out of spec? Can I measure the results of a designed experiment, that usually tests a range outside the spec on either end, and several increments in between?
Of course, if it were, say a measurement of length, and I were to do the entire study using a standard (or to be more consistent with your example, eight copies of a standard), I would expect NO part-to-part variation, but I would still be able to characterize the measurement variation. Maybe it's just the voltage example that's throwing me...what if the voltage is varying from measurement to measurement? How could we know? In this case, it's something that produces voltage, and we are assuming that "these voltages are constant and the only variation we see in remeasuring a module is gauge variation." If I were being asked to justify scrapping this gage, what evidence do I have that the output voltage of the piece is not creating this variation?
As to the third hand, I almost always want to be able to detect smaller differences than the spec; Taguchi pretty much established that back in 1960. Most of the people I work with don't live in a "go/no-go" world anymore. The spec is the voice of the customer; for most uses I see, I need gages with the ability to measure the voice of the process.

Back to ya Rip!

Right - if you are using a gauge to measure a bunch of stuff, then you had better use the full range you expect to measure in your MSA. This is where the analysis of the correlation between magnitude and dispersion that I showed in the short- and long-term studies can protect you from making a very expensive bad decision. Additionally, I would test to make sure that the *bias* is constant with magnitude as well (a.k.a. linearity) - I have been bitten by that one before too, specifically on thermocouples.
.
We can answer all the questions in your first paragraph performing the ongoing long-term MSA I describe - all those sources are identified. Every day you re-measure those same 8 parts (kept in a vault no doubt), assess the state of the gauge, and if it is still in control and acceptable, then you can trust the measurements. So if I get a nonconforming part, I am willing to bet that it is the part and not the gauge. Note again that a gauge calibration sticker is *wholly inadequate* to answer that question!
.
I would NOT do a study on a standard, unless you manufacture standards! That would have no external validity as I talked about in the second MSA article: http://www.qualitydigest.com/inside/quality-insider-column/mystery-meas… Note again that we are not counting on the parts to be identical with each other, we are counting on each one to produce the same voltage indication each time. But, as you say, what if the voltage of each unit changes with time? "How," you ask, "could we know?" Well, the one thing we do know is that our process *as-measured* can't meet the customer spec. Even with perfect zero part-to-part differences right in the middle of the spec, we cannot reliably determine if a part is conforming or not! So it is not a question of justifying scrapping a gauge - you would have no way of determining if it was in or out of specification - the real question is justifying further investigation (does the voltage vary over time and we need to redesign the module) and/or justifying a new voltmeter. Whatever it is, it is happening across the board, so we are in no position to be guaranteeing anything to our customer. This is BAD NEWS! Even scarier if we just did the MSA and have been using it this way for years...we have probably scrapped a lot of perfectly fine modules and sent on some horribly out of spec modules, and the database of our measurements would show not one part shipped out of spec, and only out of spec parts scrapped!
.
Careful what I mean here: I am not proposing going back to the "in-spec good, out of spec bad" go/no go world - quite the contrary, as my earlier articles show (if you like Taguchi you might enjoy http://www.qualitydigest.com/inside/six-sigma-column/show-me-money). Rather, one purpose of MSA is to determine if we can make a good "conformance decision" for that individual part, as I have been careful to say. That is what the %R&R tells us - what proportion of the spec width is taken up by gauge variation. That "twin peaks" graph above and in the other MSA articles shows you why %R&R (or P/T) needs to be relatively small. If you have a %R&R of 10%, you can divide up the spec into ten real measurement divisions (far better than just determining go/no go). 5.15 times the measurement standard deviation is the "actual" resolution of the gauge. For most purposes, a %R&R of 10% is entirely sufficient to be able to detect changes of a practical magnitude as part of your improvement efforts (and remember the power of the Central Limit Theorem gives you further power with repeated measurements - *if* the gauge is in control). If you think you need 20 or 50 or 100 divisions, you might be right, but I am going to push back and say, "OK, so what magnitude of an effect do you want to detect?" and if you answer 1/100 of the spec, I am going to ask, "OK, so who cares? Not your customer or they would not have such a proportionally wide spec." And you are going to have to have a good answer to convince me to shell out $500,000 for that new nano-gauge! :) Or another way of asking it is, "Is your customer going to pay you more to improve your conformance *to target* by 1/100 of the spec?" If not, you probably don't need to measure at that resolution anyway and should be spending your time and money on that OTHER process over there.
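Here is a rough sketch of the "measurement divisions" idea and of what repeated measurements buy you. It assumes %R&R is 5.15 times the measurement standard deviation over the spec width, that the number of usable divisions is 100 divided by %R&R, and that repeat readings are independent with the gauge in control, so the standard deviation of an average of n readings is the single-reading value divided by the square root of n. Function names and numbers are hypothetical.

```python
import math

def percent_rr(sigma_measurement, spec_width, k=5.15):
    """Assumption: %R&R = k * sigma / spec width, with k = 5.15."""
    return 100.0 * k * sigma_measurement / spec_width

def sigma_of_mean(sigma_single, n_repeats):
    """Std. dev. of the average of n independent readings (gauge in control)."""
    return sigma_single / math.sqrt(n_repeats)

def divisions(pct_rr):
    """Number of real measurement divisions the spec can be split into."""
    return 100.0 / pct_rr

# Hypothetical gauge that takes up 20% of a 1.0 V spec on a single reading.
spec_width = 1.0
sigma_single = 0.20 * spec_width / 5.15

rr_single = percent_rr(sigma_single, spec_width)
rr_avg_of_4 = percent_rr(sigma_of_mean(sigma_single, 4), spec_width)

print(f"Single reading:       %R&R = {rr_single:.0f}%, divisions = {divisions(rr_single):.0f}")
print(f"Average of 4 repeats: %R&R = {rr_avg_of_4:.0f}%, divisions = {divisions(rr_avg_of_4):.0f}")
```

Under those assumptions, averaging four readings halves the effective gauge spread, which is often a far cheaper route to finer resolution than buying a new gauge.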