Eric Heckman

Six Sigma

Starting Out With Capability Analysis, Part One

Setting up your data

Published: Monday, January 7, 2013 - 11:31

It’s your first day at the Jedi Temple, working as a lightsaber manufacturer. Your first task on the job is to run a capability analysis on the length of lightsabers being produced. Your main concern is to see if the lightsabers fit within the required length specifications set forth by the Jedi Council. You aren’t quite sure where to start. Thankfully, Minitab Statistical Software is there to help you—even in a galaxy far, far away.

Capability analysis is used to assess the capability of an in-control process. A “capable” process is able to produce products or services that meet specifications.

Setting up the data for capability analysis

The first step in performing a capability analysis is to make sure your data are set up properly. Start by putting your measurement data in one column of a Minitab worksheet. In our case, the Jedi Council has provided us with 80 measurements of newly crafted lightsabers, collected over the course of 10 days. These measurements can be entered directly into the software or imported from Microsoft Excel or even an SQL database. (If you’re wondering how we can get measurements so precise, well... being able to measure using the Force helps!)

Here’s what our measurement column looks like:

Identifying subgroups for capability analysis

An important step in capability analysis is identifying your subgroups. A subgroup is simply a group of units produced under the same set of conditions. Ideally, it is a set of measurements taken close together in time, yet still independent of each other. In our case, the lightsabers were measured by a different Jedi each day for 10 days, with eight sabers being measured by each. In quality terms, we have 10 subgroups, with eight units in each subgroup.

In my experience working with tech support, many callers who are struggling with a capability analysis aren’t familiar with the concept of subgroup size, and thus gloss over it quickly. Don’t! The way you collected your data is extremely important to capability analysis.

Why data order matters in capability analysis

You should always make sure your data are entered in the order in which they were collected, and that each entry is assigned to the proper subgroup. Calculations essential to the analysis depend on the order of your data as well as the size of these groups. For this reason, you should not sort your data.

To see why this is an issue, consider a subgroup of five measurements. If the whole data column is sorted before the analysis, similar values end up next to each other, so the five rows assigned to that subgroup span a much narrower range than the five measurements that were actually collected together. The analysis will then appear to show less variation than that subgroup actually had. As a result, your Cpk and other values will be artificially inflated, making your process look much more capable than it actually is. It is important not to fall into this trap.
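The effect of sorting can be sketched in a few lines of Python. This is only an illustration, not Minitab's implementation: the eight measurements, the subgroup size of two, and the specification limits are all made up, and the within-subgroup standard deviation is estimated as a simple pooled standard deviation with no unbiasing constant.

```python
import math
import statistics


def pooled_within_sd(data, subgroup_size):
    """Pooled within-subgroup standard deviation for equal-size subgroups.

    Consecutive rows are treated as one subgroup, so row order matters.
    Simplification: no unbiasing constant is applied.
    """
    if len(data) % subgroup_size != 0:
        raise ValueError("data length must be a multiple of subgroup_size")
    subgroups = [
        data[i:i + subgroup_size] for i in range(0, len(data), subgroup_size)
    ]
    # With equal subgroup sizes, the pooled variance is the mean of the
    # individual subgroup variances.
    pooled_variance = statistics.mean(statistics.variance(g) for g in subgroups)
    return math.sqrt(pooled_variance)


def cpk(data, subgroup_size, lsl, usl):
    """Cpk based on the pooled within-subgroup standard deviation."""
    mean = statistics.mean(data)
    sigma_within = pooled_within_sd(data, subgroup_size)
    return min(usl - mean, mean - lsl) / (3 * sigma_within)


# Hypothetical measurements in the order collected, in subgroups of two.
collected = [1.0, 5.0, 2.0, 6.0, 3.0, 7.0, 4.0, 8.0]
# Hypothetical specification limits.
LSL, USL = 0.0, 9.0

if __name__ == "__main__":
    print("Cpk, collection order:", round(cpk(collected, 2, LSL, USL), 2))
    print("Cpk, sorted data:     ", round(cpk(sorted(collected), 2, LSL, USL), 2))
```

With these made-up numbers, the Cpk comes out at about 0.53 in collection order and about 2.12 after sorting. The measurements are identical in both cases. Only the assignment of rows to subgroups has changed.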

I’ve also seen people try a variety of subgroup sizes and choose the one that gives them the best Cpk or other statistic of interest. Needless to say, that’s not appropriate: it isn’t correct to analyze your data with subgroups of five if, in reality, the subgroup size was eight.

Creating a subgroup column

Once we are sure of the subgroup size and that our data are entered in Minitab in the correct order, we can create a subgroup column. This will tell the software which subgroup each measurement belongs to. Because our data are ordered, we know that the first eight observations in our worksheet are part of subgroup one, the next eight are from subgroup two, and so on.

Creating a subgroup column in Minitab is fairly simple. We can do this by going to Calc > Make Patterned Data > Simple Set of Numbers. In our case, the dialog box would get filled out like this:

When we press OK in this dialog box, Minitab will create a column of numbers from 1 to 10, which represent the days on which our data were collected. The way we’ve filled out the dialog box also tells Minitab to list each value eight times, matching the eight measurements taken each day that form our subgroups.
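For readers who think in code, the pattern this step produces can be sketched in Python. This is an equivalent of the resulting column, not a Minitab command, and the variable names are made up for illustration.

```python
# Values taken from the scenario: days 1 to 10, eight measurements per day.
FIRST_DAY = 1
LAST_DAY = 10
MEASUREMENTS_PER_DAY = 8

# List each day number eight times in a row: 1, 1, ..., 1, 2, 2, ..., 10.
subgroup_column = [
    day
    for day in range(FIRST_DAY, LAST_DAY + 1)
    for _ in range(MEASUREMENTS_PER_DAY)
]

if __name__ == "__main__":
    print(len(subgroup_column))   # number of rows in the subgroup column
    print(subgroup_column[:10])   # first ten entries
```

The column has one entry per measurement, so row 1 of the measurement column lines up with row 1 of the subgroup column, and so on down the worksheet.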

Once we have our subgroup column, our data set in the worksheet should look as follows:

Now that we have our data set up in Minitab correctly and our subgroups correctly identified, we’re nearly ready to do our analysis. My next column will go over the assumptions that must be met before we start if the capability analysis results are to be valid. Even the will of the Force can only be trusted when it’s backed up by sound statistical techniques. OK, so that quote didn’t quite make it to the final film, but it’s true nonetheless.


About The Author

Eric Heckman

Eric Heckman is a technical support specialist for Minitab Inc.


Interesting Introduction

This is an interesting introduction to capability analysis, and I couldn't agree more that you have to get your data right.

I have a question, though, about this scenario, where "the lightsabers were measured by a different Jedi each day for 10 days, with eight sabers being measured by each." Why would you do that? What you end up with is within-subgroup variation measured across each day in (probably) an s-chart, and between-subgroup variation tracked in the x-chart. That's good, but now we don't know if the between-subgroup variation is day-to-day variation or Jedi-to-Jedi variation. This scheme would only work if you have already established excellent reproducibility, so you could dismiss Jedi-to-Jedi variation as a confounding factor.

On an unrelated note, I thought it was a little ironic that the primary ad in the middle of this article about Minitab was an ad for JMP.