Statistics Article

By: William A. Levinson

The first part of this series introduced measurement systems analysis for attribute data, or attribute agreement analysis. AIAG¹ provides a comprehensive overview, and Jd Marhevko² has done an outstanding job of extending it to judgment inspections as well as go/no-go gages. Part two covers the analytical method, which allows more detailed quantification of the gage standard deviation and of bias, if any, with the aid of parts that can be measured in terms of real numbers.

Part one laid out the procedure for data collection as well as the signal detection approach, which identifies and quantifies the zone around the specification limits where inspectors and gages will not obtain consistent results. The signal detection approach can also deliver a rough estimate of the gage’s repeatability, or equipment variation. Go/no-go gages that can be purchased in specific dimensions, or set to specific dimensions (e.g., with gage blocks), do indeed have gage standard deviations even though they return pass/fail results.

By: James Bossert

When we talk about measurement system analysis (MSA), people tend to focus on attribute agreement analysis because it is usually quicker and easier to do than a gauge repeatability and reproducibility (gauge R&R) study. This article is a review of the fundamentals for gauge R&R to remind us why it is so critical. We will review the basic definitions, go through a process for preparing a study, and then review the output in Minitab to make sure we understand what is going on in the analysis.

Why do we do a gauge R&R study in the first place? We do it for two reasons. One is to validate that the measurement process is acceptable. The second is to feel comfortable about the data. We want to avoid the embarrassment of presenting data in a meeting and having it challenged. We want to be able to show what we did to validate it and to have confidence the data are good. Doing MSA also helps us understand what is in the collection process and helps convince the people collecting the data why it’s important to do so. If data collectors understand why the study is being done, they will be more likely to identify problems when they occur.

By: William A. Levinson

Measurement systems analysis (MSA) for attributes, or attribute agreement analysis, is a lot like eating broccoli or Brussels sprouts. We must often do things we don’t like because they are necessary or good for us. Although the IATF 16949:2016 clause titled “Measurement systems analysis” does not mention attribute agreement analysis explicitly, it does say that MSA shall be performed to assess “variation present in the results of each type of inspection, measurement, and test equipment system identified in the control plan.” It does not limit this requirement to the familiar real-number measurements with which we are comfortable.

Common sense says, meanwhile, that it is beneficial to understand the capabilities and limitations of inspections for attributes. The last thing we want to hear from a customer is, for example, “Your ANSI/ASQ Z1.4 sampling plan with an acceptable quality level of 0.1 percent just shipped us a lot with 2-percent nonconforming work.” Samuel Windsor describes how an attribute gage study saved a company $400,000 a year, which is a powerful incentive to learn about this and use it where applicable.¹ Jd Marhevko has done an outstanding job of extending attribute agreement analysis to judgment inspections as well as go/no-go gages.²

By: Ryan McKenna

To date, this series has focused on relatively simple data analyses, such as learning one summary statistic about our data at a time. In reality, we’re often interested in a slightly more sophisticated analysis, so we can learn multiple trends and takeaways at once and paint a richer picture of our data.

In this article, we will look at answering a collection of counting queries—which we call a workload—under differential privacy. This has been the subject of considerable research effort because it captures several interesting and important statistical tasks. By analyzing the specific workload queries carefully, we can design very effective mechanisms for this task that achieve low error.
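
As a rough illustration of what a workload looks like, the sketch below represents each counting query as a 0/1 row over a histogram and answers all of them with the basic Laplace mechanism. This is only a baseline, not the workload-aware mechanisms the article goes on to describe, and the histogram, the three queries, and the function names are all hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def answer_workload(workload, histogram):
    # Each query is a 0/1 row; its answer is the dot product with the histogram.
    return [sum(w * x for w, x in zip(row, histogram)) for row in workload]

def l1_sensitivity(workload):
    # Adding or removing one person changes one histogram cell by 1, so the
    # worst-case total change across all answers is the largest column sum.
    n_cells = len(workload[0])
    return max(sum(abs(row[j]) for row in workload) for j in range(n_cells))

def private_workload(workload, histogram, epsilon, rng):
    # Baseline: add independent Laplace noise scaled to the whole workload.
    scale = l1_sensitivity(workload) / epsilon
    return [a + laplace_noise(scale, rng) for a in answer_workload(workload, histogram)]

# Hypothetical histogram: number of people in four age brackets.
histogram = [120, 340, 275, 90]
# Hypothetical workload: everyone, the first two brackets, the last two brackets.
workload = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]

exact = answer_workload(workload, histogram)  # [825, 460, 365]
noisy = private_workload(workload, histogram, epsilon=1.0, rng=random.Random(0))
```

Because every histogram cell appears in two of the three queries, the sensitivity here is 2, so each answer receives noise with scale 2/epsilon. Reducing that error by exploiting the structure of the workload is the design problem the article refers to.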

By: David Darais, Joseph Near

In our last article, we discussed how to determine how many people drink pumpkin spice lattes in a given time period without learning their identifying information. But say, for example, you would like to know the total amount spent on pumpkin spice lattes this year, or the average price of a pumpkin spice latte since 2010. To protect customers’ privacy, you’d like to detect these trends without learning identifying information about any specific customer. To do this, you can use summation and average queries answered with differential privacy.

In this article, we will move beyond counting queries and dive into answering summation and average queries with differential privacy. Starting with the basics: In SQL, summation and average queries are specified using the SUM and AVG aggregation functions:

SELECT SUM(price) FROM PumpkinSpiceLatteSales WHERE year = 2020
SELECT AVG(price) FROM PumpkinSpiceLatteSales WHERE year > 2010

In Pandas, these queries can be expressed using the sum() and mean() functions, respectively. But how would we run these queries while also guaranteeing differential privacy?
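
One common way to approach that question, sketched here under stated assumptions, is to clip each value to a fixed range and then add Laplace noise scaled to that range. The clipping bounds, the even budget split for the average, the sample prices, and the function names are all hypothetical, and standard-library Python is used instead of Pandas so the example stays self-contained.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def clip(values, lower, upper):
    # Force every value into [lower, upper] so one record has bounded influence.
    return [min(max(v, lower), upper) for v in values]

def private_sum(values, lower, upper, epsilon, rng):
    # After clipping, adding or removing one record changes the sum by at
    # most max(|lower|, |upper|); that bound is the sensitivity.
    sensitivity = max(abs(lower), abs(upper))
    return sum(clip(values, lower, upper)) + laplace_noise(sensitivity / epsilon, rng)

def private_mean(values, lower, upper, epsilon, rng):
    # Assumption: split the budget evenly between a noisy sum and a noisy count.
    noisy_sum = private_sum(values, lower, upper, epsilon / 2, rng)
    noisy_count = len(values) + laplace_noise(1.0 / (epsilon / 2), rng)
    return noisy_sum / max(noisy_count, 1.0)

# Hypothetical latte prices in dollars.
prices = [4.25, 4.75, 5.10, 4.95, 5.45]
rng = random.Random(0)
total = private_sum(prices, lower=0.0, upper=10.0, epsilon=1.0, rng=rng)
average = private_mean(prices, lower=0.0, upper=10.0, epsilon=1.0, rng=rng)
```

The clipping bounds trade bias for noise: a tight range adds less noise but distorts values outside it, while a loose range preserves the data at the cost of a larger noise scale.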

By: David Darais, Joseph Near

How many people drink pumpkin spice lattes in October, and how would you calculate this without learning specifically who is drinking them, and who is not?

Although they seem simple or trivial, counting queries are used extremely often. Counting queries such as histograms can express many useful business metrics. How many transactions took place last week? How did this compare to the previous week? Which market has produced the most sales? In fact, one paper showed that more than half of queries written at Uber in 2016 were counting queries.

Counting queries are often the basis for more complicated analyses, too. For example, the U.S. Census releases data that are constructed essentially by issuing many counting queries over sensitive raw data collected from residents. Each of these queries belongs in the class of counting queries we will discuss below and computes the number of people living in the United States with a particular set of properties (e.g., living in a certain geographic area, having a particular income, belonging to a particular demographic).
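
A minimal sketch of one such counting query, assuming the standard Laplace mechanism is the tool of choice: count the rows that satisfy a property, then add noise with scale 1/epsilon, because adding or removing one person changes a count by at most 1. The sample records and the function names are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(rows, predicate, epsilon, rng):
    # A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical sales records.
sales = [
    {"month": "October", "drink": "pumpkin spice latte"},
    {"month": "October", "drink": "espresso"},
    {"month": "October", "drink": "pumpkin spice latte"},
    {"month": "November", "drink": "pumpkin spice latte"},
]

def october_latte(row):
    return row["month"] == "October" and row["drink"] == "pumpkin spice latte"

noisy_count = private_count(sales, october_latte, epsilon=1.0, rng=random.Random(0))
```

Smaller values of epsilon give stronger privacy and noisier counts. On a table this small the noise swamps the answer, which is why counting queries are most useful over large populations.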

By: Donald J. Wheeler

Inspection sounds simple. Screen out the bad stuff and ship the good stuff. However, measurement error will always create problems of misclassification, where good stuff is rejected and bad stuff gets shipped. While guard-bands and tightened inspection have been offered as a way to remedy the problem of shipping bad stuff, it turns out that they are often prohibitively expensive in practice. Here we look at how tightened inspection improves the quality of the product stream and compare those improvements with the associated excess costs.
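
The trade-off can be explored with a small simulation. The sketch below is only illustrative: it assumes each measurement is a normally distributed product value plus independent normal measurement error, and the specification limits, standard deviations, and guard-band width are hypothetical numbers rather than values from the article.

```python
import random

def simulate(n, rng, sd_product=1.0, sd_error=0.3):
    # Each item has a true product value; the observed measurement is that
    # value plus an independent measurement error.
    pairs = []
    for _ in range(n):
        value = rng.gauss(0.0, sd_product)
        pairs.append((value, value + rng.gauss(0.0, sd_error)))
    return pairs

def inspect(pairs, lower, upper, guard=0.0):
    # Conformance depends on the true value; acceptance depends on the
    # measurement compared with limits tightened by `guard` on each side.
    counts = {"good_shipped": 0, "good_rejected": 0,
              "bad_shipped": 0, "bad_rejected": 0}
    for value, measured in pairs:
        quality = "good" if lower <= value <= upper else "bad"
        accepted = lower + guard <= measured <= upper - guard
        counts[quality + ("_shipped" if accepted else "_rejected")] += 1
    return counts

pairs = simulate(20000, random.Random(0))
plain = inspect(pairs, lower=-2.0, upper=2.0)             # inspect at the spec limits
tight = inspect(pairs, lower=-2.0, upper=2.0, guard=0.3)  # tightened limits
```

Comparing `plain` with `tight` shows both sides of the ledger: tightening can only lower the `bad_shipped` count, and it can only raise the `good_rejected` count, which is the excess cost.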

The problem of inspection

A product measurement, X, may be thought of as consisting of the product value, Y, plus some measurement error, E, so that X = Y + E. With this model, the relationship between X and Y can be shown using a bivariate normal probability model where:

By: Scott A. Hindle

A quick Google search returns many instances of the saying, “A man with a watch knows what time it is. A man with two watches is never sure.” The doubt implied by this saying extends to manufacturing plants: If you measure a product on two (supposedly identical) devices, and one measurement is in specification and the other out of specification, which is right?

The aforementioned doubt also extends to healthcare, where measurement data abound. As part of the management of asthma, I measure my peak expiratory flow rate (discussed below), and I now have two handheld peak flow meter devices. Are the two devices similar or dissimilar? How would I know? To see how I investigated this, and to see the outcome, read on. A postscript is included for those wanting to dig a bit deeper.


In 2015, I was diagnosed with asthma, a chronic condition where the airways in the lungs can narrow and swell, making breathing more difficult. The worst of it occurred at my in-laws’ home, where I experienced wheezing and had difficulty breathing. The cause? The family cat!

By: William A. Levinson

Traditional statistical methods for computing the process performance index (Ppk) and control limits for process-control purposes assume that measurements are available for all items or parts. If, however, the critical-to-quality (CTQ) characteristic is something undesirable, such as a trace impurity, trace contaminant, or pollutant, the instrument or gauge may have a lower detection limit (LDL) below which it cannot measure the characteristic. When this is the case, a measurement of zero or “not detected” does not mean zero; it means that the measurement is somewhere between zero and the LDL.

If the statistical distribution is known, we can fit its parameters by means of maximum likelihood estimation (MLE) even though it is unlikely to be a normal (i.e., bell curve) distribution. This is how Statgraphics handles censored data, i.e., data sets for which not all of the measurements are available, and the process can even be done with Microsoft Excel’s Solver feature. Goodness-of-fit tests can be performed to test the distributional fit, whereupon we can calculate the process performance index Ppk and set up a control chart for the characteristic in question.
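
To make the idea of a censored likelihood concrete, here is a minimal sketch that assumes, purely for simplicity, an exponential distribution. The article does not name a distribution at this point, and the sample values, the LDL, and the function names are hypothetical. Each detected value contributes its density, and each non-detect contributes the probability of falling between zero and the LDL.

```python
import math

def log_likelihood(rate, detected, n_censored, ldl):
    # Detected values: log of the exponential density rate * exp(-rate * x).
    ll = sum(math.log(rate) - rate * x for x in detected)
    if n_censored:
        # Non-detects: log of P(0 < X < LDL) = 1 - exp(-rate * LDL).
        ll += n_censored * math.log(-math.expm1(-rate * ldl))
    return ll

def fit_rate(detected, n_censored, ldl, lo=1e-9, hi=1e3, iters=200):
    # This log-likelihood is concave in the rate, so a ternary search
    # converges on the maximum likelihood estimate.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, detected, n_censored, ldl) < log_likelihood(m2, detected, n_censored, ldl):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

# Hypothetical impurity measurements, plus two "not detected" results
# that fall below a lower detection limit of 0.5.
detected = [1.0, 2.0, 3.0]
rate_ignoring_nondetects = fit_rate(detected, n_censored=0, ldl=0.5)
rate_with_nondetects = fit_rate(detected, n_censored=2, ldl=0.5)
```

Ignoring the non-detects reduces the estimate to the familiar n divided by the sum of the values. Including them raises the fitted rate and so lowers the estimated mean impurity level, as one would expect when some parts are known to lie below the LDL.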

By: Adam Conner-Simons

This story was originally published by MIT Computer Science & Artificial Intelligence Lab (CSAIL).

Scatterplots. You may not know them by name, but if you spend more than 10 minutes online, you’ll find them everywhere.

They’re popular in news articles, in the data science community, and, perhaps most crucially, in internet memes about the digestive quality of pancakes.
