Content By Fred Schenkelberg

By: Fred Schenkelberg

When products were crafted one at a time, the design and manufacturing processes were often done by the same person. For example, a craftsman would design and build a chest of drawers or a carriage. Some trades would employ apprentices to learn the craft, which also included design.

Larger projects, like a railroad or a bridge, for example, might have included an architect or lead designer along with a team of engineers. The railroad engineer’s shop or the bridge site was not far away, allowing close communication between the ironsmith and design team.

With the rise of production systems came facilities that specialized in mass-producing an array of designs. Clothes, home appliances, and consumer products are examples of products that separated the designer from the day-to-day manufacturing experience. Mass production thus created the need for product design teams to learn the capabilities and limitations of a production system.

Early design for excellence (DFX) systems focused on design for production. As far as I know, the term “design for production” first appeared in a 1953 report by the British Productivity Council titled “Design for Production: Report of a Visit to the U.S.A.” 

By: Fred Schenkelberg

What if all failures occurred truly randomly? Well, for one thing the math would be easier.

The exponential distribution would be the only time-to-failure distribution—we wouldn’t need Weibull or other complex multi-parameter models. Knowing the failure rate for one hour would be all we would need to know, over any time frame.

Sample size and test planning would be simpler. Just run the samples at hand long enough to accumulate enough hours to provide a reasonable estimate for the failure rate.

Would the design process change?

Yes, I suppose it would. The effects of early life and wear-out would not exist. Once a product is placed into service, the chance of failing in the first hour would be the same as in any other hour of its operation. It would fail eventually, and the chance of failing within a year would depend solely on the chance of failure per hour.

A higher failure rate would simply mean a lower chance of surviving very long.
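
As a quick illustration of the constant-failure-rate world described above, here is a minimal Python sketch; the failure rate, run times, and test results are made-up numbers for illustration. It shows that the exponential reliability function depends only on the failure rate, and that the chance of surviving the next hour is the same no matter how long the unit has already run.

```python
import math

# Assumed, made-up constant failure rate: 2 failures per million hours
lam = 2e-6  # failures per hour

def reliability(t_hours, failure_rate=lam):
    """Probability of surviving t_hours when failures occur at a constant rate."""
    return math.exp(-failure_rate * t_hours)

# Chance of surviving one year of continuous operation (8,760 hours)
print(f"R(1 year) = {reliability(8760):.4f}")

# Memoryless property: P(survive the next hour | survived t hours) is constant
for t in (0, 1_000, 100_000):
    conditional = reliability(t + 1) / reliability(t)
    print(f"P(survive next hour | survived {t:>7,} h) = {conditional:.8f}")

# Point estimate of the failure rate from a test: failures / total unit-hours
failures, unit_hours = 3, 1.5e6  # assumed test results
print(f"Estimated failure rate = {failures / unit_hours:.2e} per hour")
```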

By: Fred Schenkelberg

Concurrent engineering is a common approach that develops the product design and its supporting manufacturing processes in parallel throughout the development cycle. There are several reasons why this is a good idea.

Design engineers may require the creation of new manufacturing processes to achieve specific material properties, component performance, or mechanical, electrical, or software tolerances. If they fully understand the manufacturing capabilities and the full range of impacts and risks to process yield, quality, and reliability performance, they can make informed decisions concerning their design requirements. Making good decisions during design creates value. You can estimate that value by the span of possible outcomes the decision could produce.

By the same token, manufacturing engineers learn earlier and firsthand what’s critical to the product’s performance. They develop intimate knowledge of the design and understand those critical nuances not included on drawings or specifications that affect a product’s functional and reliability performance.

By: Fred Schenkelberg

The planning of environmental or reliability testing becomes a question of sample size at some point. It’s probably the most common question I hear as a reliability engineer: How many samples do we need?

Also, when evaluating supplier-run test results, we need to understand the implications of the results, again based on the number of samples in the test. If the supplier runs 22 samples without failure over a test that replicates the shipping set of stresses, then we need a way to interpret those results.

We often use success testing (no expected or actual failures during the testing) to minimize the number of samples required for a test and still show some level of confidence for a specified reliability level. The basis for success testing is the binomial distribution: under the applied stress, the product either works or it doesn’t. Binary results.

Recently I received a request to explain where the success-testing sample size formula comes from, or its derivation. First, here’s the formula:

n = ln(1 − C) / ln(R)

where C is the confidence level and R is the lower limit of the reliability.
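
As a hedged sketch of the formula in use (the confidence and reliability targets below are assumed values, not from the article), this Python snippet computes the zero-failure sample size and, going the other direction, the reliability demonstrated by the 22-sample, zero-failure test mentioned above at 90-percent confidence.

```python
import math

def success_test_sample_size(confidence, reliability):
    """Zero-failure (success-run) sample size: n = ln(1 - C) / ln(R), rounded up."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

def demonstrated_reliability(confidence, n):
    """Lower reliability limit shown by n units with zero failures: R = (1 - C)^(1/n)."""
    return (1 - confidence) ** (1 / n)

# Assumed targets: 90% confidence that reliability is at least 95%
print(success_test_sample_size(0.90, 0.95))         # 45 samples
# The 22-sample, zero-failure test interpreted at 90% confidence
print(f"{demonstrated_reliability(0.90, 22):.3f}")  # about 0.90 reliability
```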

By: Fred Schenkelberg

Why do so many avoid confronting the reality of failure? In plant asset management, we are surrounded by people who steadfastly don’t want to know about, or talk about, failures. Yet failure does happen; let’s not ignore this simple fact.

The blame game

Unlike a murder mystery, failure analysis (FA) is not a game of whodunnit. The knee-jerk response to blame someone rarely solves the problem, nor does it create reliability in the workplace.

If the routine is to blame someone when a failure is revealed, fewer people will reveal failures. If it’s clear we don’t want to talk about failures in a civilized manner, well, we’ll just not talk about failures.

Of course, failures will still occur. In the blame-centric organization, the majority of people who have the ability to understand and solve problems simply turn away and avoid “seeing” failures. When friends and colleagues are vilified in their attempts to “solve problems,” it becomes clear that this is not a safe environment in which to point out failures.

By: Fred Schenkelberg

Just, please, plot the data if you have gathered some time-to-failure data, or you have the breakdown dates for a piece of equipment. Any data really. It could be your review of your car maintenance records and notes and dates of repairs. You may have some data from field returns. You have a group of numbers and you need to make some sense of it. Just, please, plot the data.

Take the average

Finding the average seems like a great first step. Let's summarize the data in some fashion. Let's say I have the number of hours each fan motor ran before failure. I can tally up the total hours, T, and divide by the number of failures, r. This is the mean time to failure.

Or, if the data was on my car and I have the days between failures, I can also tally up the time, T, and divide by the number of repairs, r. Same formula, and we call the result the mean time between failures (MTBF).

I have a number; say it's 34,860 hours MTBF. What does that mean (no pun intended), other than that on average my car operated 34,860 hours between failures? Sometimes more, sometimes less.
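
To make the “plot the data” point concrete, here is a minimal Python sketch using made-up fan-motor hours-to-failure, not real data. It computes the mean the same way as above, total time T divided by the number of failures r, and then plots the individual values so the spread hidden by the single average is visible.

```python
import matplotlib.pyplot as plt

# Assumed, made-up hours-to-failure for a set of fan motors
hours_to_failure = [12_400, 3_800, 51_200, 28_900, 7_600, 44_300, 19_700]

T = sum(hours_to_failure)   # total operating hours
r = len(hours_to_failure)   # number of failures
mttf = T / r
print(f"MTTF = {mttf:,.0f} hours")

# Plot the individual times to failure against the single mean value
plt.scatter(range(1, r + 1), hours_to_failure, label="time to failure")
plt.axhline(mttf, color="red", linestyle="--", label=f"mean = {mttf:,.0f} h")
plt.xlabel("Failure number")
plt.ylabel("Hours to failure")
plt.legend()
plt.show()
```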

By: Fred Schenkelberg

What happens when a product lasts too long? How long is good enough? Every product is different, and our ability to define what’s “long enough” is fraught with uncertainty. If it wears out prematurely, your customers will go elsewhere. If it lasts too long, they won’t need to come back.

In “The One-Hoss Shay,” a poem by Oliver Wendell Holmes, a deacon is confounded by the various parts of his carriage that fail, and he decides to do something about it:
“But the Deacon swore (as Deacons do,
With an ‘I dew vum,’ or an ‘I tell yeou,’)
He would build one shay to beat the taown
’n’ the keounty ’n’ all the kentry raoun’;
It should be so built that it couldn’ break daown:
‘Fer,’ said the Deacon, ‘t’s mighty plain
Thut the weakes’ place mus’ stan’ the strain;
’n’ the way t’ fix it, uz I maintain,
Is only jest
T’ make that place uz strong uz the rest.’”

Translating from the antiquated English, the deacon basically wanted to craft a carriage using the best materials and techniques. He built a very sound carriage where every part was just as strong as all the other parts. It’s a fine craft that works well, remains as new as the day it was built, and outlives the builder.

By: Fred Schenkelberg

Control charts provide an ongoing statistical test to determine if a recent reading or set of readings represents convincing evidence that a process has changed from an established stable average. The test also checks sample-to-sample variation to determine if the variation is within the established stable range. A stable process is predictable, and a control chart provides the evidence that a process is stable—or not.

Some control charts use a sample of items for each measurement. The sample average values tend to be normally distributed, allowing straightforward construction and interpretation of the control charts. The center line of a chart is the process average. The control limits are generally set at plus-or-minus three standard deviations from the mean.

When selecting a subgroup size, the intent is to collect samples that are as homogeneous as possible, so that shifts between samples stand out against the relatively small variation within each sample. One method is to select items from the process as close together in time as possible. Another is to select randomly from across a period of time or a batch. The latter option may have larger within-sample variation and may not be as sensitive to shifts in the mean as the first.
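
As a rough sketch of the limit calculation described above (the subgroup data and subgroup size of five are assumed for illustration), the snippet below builds X-bar chart limits from the grand average and the average subgroup range using the conventional A2 factor, which places the limits at approximately plus-or-minus three standard deviations of the subgroup means.

```python
# Assumed subgroups of 5 measurements each (made-up data)
subgroups = [
    [10.1, 9.8, 10.0, 10.2, 9.9],
    [10.0, 10.3, 9.7, 10.1, 10.0],
    [9.9, 10.1, 10.2, 9.8, 10.0],
    [10.2, 10.0, 9.9, 10.1, 9.8],
]

A2 = 0.577  # standard X-bar chart factor for subgroups of size 5

xbars = [sum(s) / len(s) for s in subgroups]    # subgroup averages
ranges = [max(s) - min(s) for s in subgroups]   # subgroup ranges

grand_mean = sum(xbars) / len(xbars)            # center line
r_bar = sum(ranges) / len(ranges)               # average range

ucl = grand_mean + A2 * r_bar                   # upper control limit
lcl = grand_mean - A2 * r_bar                   # lower control limit

print(f"CL  = {grand_mean:.3f}")
print(f"UCL = {ucl:.3f}")
print(f"LCL = {lcl:.3f}")

# Flag any subgroup average that falls outside the control limits
for i, xb in enumerate(xbars, start=1):
    if not (lcl <= xb <= ucl):
        print(f"Subgroup {i} average {xb:.3f} is outside the limits")
```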