The maintenance problem
Too many times, in lean manufacturing and other lean environments, 10- to 40-year-old equipment is re-deployed, moved and organized into lean cells without adequate concern or attention to maintenance reliability. In a lean cell, unscheduled equipment downtime usually costs 10 to 20 times what the same equipment downtime costs in old traditional batch processing or functional departments. For example, before “lean” we quoted CNC machine tool downtime at $250–$750 per hour for a single 3- to 5-axis CNC machine or robot. Today, automakers with well-configured lean manufacturing plants quote machine tool or robot downtime costs at $2,500 to $5,000 per hour. That is, until a painting robot misses doing its 7th or 8th car. Then the factory is backed up and downtime cost jumps to $3,350 per minute ($201,000 per hour).
As a maintenance engineer for John Deere Co. in the 1970s, I was highly motivated by downtime figures of $250–$750 per hour. By avoiding 4 to 6 hours of downtime, I saved the company my month’s salary. I was motivated to find ways to avoid, reduce or eliminate downtime wherever I could. How much more motivating is lean maintenance reliability today?
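The arithmetic behind these figures can be sketched in a few lines. The snippet uses only the rates quoted above; the helper name `downtime_cost` and the flat hourly rate are illustrative assumptions, not part of the original analysis.

```python
def downtime_cost(rate_per_hour, hours):
    """Cost of an outage at a flat hourly rate (a simplifying assumption)."""
    return rate_per_hour * hours

# Rates quoted in the text, in dollars.
BATCH_RATE_LOW, BATCH_RATE_HIGH = 250, 750    # pre-lean CNC machine or robot, per hour
LEAN_RATE_LOW, LEAN_RATE_HIGH = 2_500, 5_000  # well-configured lean plant, per hour
BACKED_UP_PER_MINUTE = 3_350                  # factory backed up, per minute

# Convert the per-minute figure to an hourly rate.
backed_up_per_hour = BACKED_UP_PER_MINUTE * 60

# Savings from avoiding 4 to 6 hours of downtime at the 1970s rates.
savings_low = downtime_cost(BATCH_RATE_LOW, 4)
savings_high = downtime_cost(BATCH_RATE_HIGH, 6)

# How many times more expensive a lean-cell hour is than a batch hour.
lean_multiplier = LEAN_RATE_LOW / BATCH_RATE_LOW

print(backed_up_per_hour)         # 201000
print(savings_low, savings_high)  # 1000 4500
```

At the lean-plant rates, the same 4 to 6 avoided hours would be worth $10,000 to $30,000.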
Discovering a solution
The answer to increasing reliability and uptime of computers, telecom equipment, machine tools, automation controls, hydraulic systems, electronics, etc., used in lean manufacturing and other lean environments can be derived from Six Sigma’s Y = f(x) and DMAIC. That is, provided you don’t go down the apparent but wrong path, as explained below. Back in the ’70s we didn’t have Six Sigma, so we started our analysis by gathering “cause,” “effect” and “result” information on each maintenance downtime situation. For example:
Cause: Bad CAU2 circuit board
Effect: X–Y axis cutting egg shapes rather than circles
Result: Scrap parts and downtime
Log books were placed at each machine with this format, and the maintenance situation was detailed by the electrician or mechanic as soon as the machine was repaired and the cause was known and corrected.
Soon our analysis database looked something like the following table:
| Cause | Effect | Result |
|---|---|---|
| CAU2 board | Egg-shaped cuts | Scrap, downtime |
| Bad memory board | Part ID growing | Rework, downtime |
| Axis drive board | Axis oscillation | Scrap, downtime |
| Spindle CMD board | RPM swings | Rework, downtime |
| Servo valve | Y run to limit | Downtime |
| Bad solenoid | No coolant | Downtime |
| Hydraulic pump | No chuck gripping | Scrap, downtime |
| Hydraulic 3W valve | Turret unclamping | Broken tool holder |
| SCR failed | Z-axis runaway | Downtime |
| CMD board | No X movement | Downtime |
| FE-2A board | Only rapid travel | Broken tool, scrap, downtime |
| Z-PQM drive | Axis not stopping | Scrap, downtime |
| Bad limit switch | X-axis crash | Rework, downtime |
| Bad encoder | Positioning errors | Scrap, rework, downtime |
| Loose FB connector | Y-axis runaway | Rework, downtime |
| Cap. on Y FB board | No Z-axis movement | Scrap, downtime |
As this table of malfunctions and failures is examined, there is little commonality in cause but great commonality in result. Even the effect is often similar from dissimilar causes.
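That observation can be checked with a quick tally. The sketch below transcribes the cause and result columns of the table and counts how often each value repeats; the variable names are illustrative, and splitting a result cell on commas is an assumption about how to read cells such as “Scrap, downtime.”

```python
from collections import Counter

# (cause, result) pairs transcribed from the log-book table.
LOG = [
    ("CAU2 board", "Scrap, downtime"),
    ("Bad memory board", "Rework, downtime"),
    ("Axis drive board", "Scrap, downtime"),
    ("Spindle CMD board", "Rework, downtime"),
    ("Servo valve", "Downtime"),
    ("Bad solenoid", "Downtime"),
    ("Hydraulic pump", "Scrap, downtime"),
    ("Hydraulic 3W valve", "Broken tool holder"),
    ("SCR failed", "Downtime"),
    ("CMD board", "Downtime"),
    ("FE-2A board", "Broken tool, scrap, downtime"),
    ("Z-PQM drive", "Scrap, downtime"),
    ("Bad limit switch", "Rework, downtime"),
    ("Bad encoder", "Scrap, rework, downtime"),
    ("Loose FB connector", "Rework, downtime"),
    ("Cap. on Y FB board", "Scrap, downtime"),
]

# How often does each cause repeat?
cause_counts = Counter(cause for cause, _ in LOG)

# A result cell may list several outcomes, so split it on commas.
outcome_counts = Counter(
    outcome.strip().lower()
    for _, result in LOG
    for outcome in result.split(",")
)

print(len(cause_counts), "distinct causes in", len(LOG), "entries")
print(outcome_counts.most_common(3))
```

Every cause appears exactly once, while downtime appears in 15 of the 16 entries.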
Today’s Six Sigma improvement methods would express these malfunctions and failures in terms of Y = f(x), where Y is the malfunction, error or defect and Y happens as a function of “x,” or f(x). The questions to be asked are:
Is Y the effect and (x) the cause?
Is Y the result and (x) the effect?
Is Y the result and (x) the cause?
It seemed important to us to focus on the result (Y) and the cause (x) to try and reduce Y = downtime, scrap and rework. More recent years of experience also show that eliminating or reducing Y also results in increased precision, repeatability and yield for semiconductor and nanotechnology fabrication and other process industries.
However, there doesn’t seem to be much commonality showing up in the “cause,” or (x) factors, as is expected by Six Sigma methodology. This would normally suggest the need for a more elaborate, expensive and time-consuming predictive maintenance program. With enough tracking, we should be able to calculate the mean time between failures, predict when these devices and components are about to fail and replace them before failure happens.
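For illustration, the core calculation of such a predictive program might look like the sketch below. The failure dates are hypothetical, not data from our log books, and the estimate ignores repair time and assumes failures are evenly spread, which real components rarely honor.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical failure dates for one component (days after a start date).
start = datetime(1975, 1, 6, 8, 0)
failures = [start + timedelta(days=d) for d in (0, 70, 126, 196)]

def mtbf_hours(failure_times):
    """Mean gap between consecutive failures, in hours (repair time ignored)."""
    ordered = sorted(failure_times)
    gaps = [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(ordered, ordered[1:])
    ]
    if not gaps:
        raise ValueError("need at least two failures to estimate MTBF")
    return mean(gaps)

mtbf = mtbf_hours(failures)

# Naive prediction: the next failure arrives one MTBF after the last one.
next_expected = max(failures) + timedelta(hours=mtbf)

print(f"MTBF: {mtbf:.0f} hours")
print(f"Replace before: {next_expected:%Y-%m-%d}")
```

Even this toy version shows the cost of the approach: each component needs its own failure history before any prediction is possible.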
This is the apparent path we were about to go down, when a July 23rd downtime situation, caused by a failed axis drive board, shocked me into a huge paradigm shift. It completely changed my focus and my career from that point forward. We wrote the following into the log book:
Cause: Bad axis drive board
Effect: X-axis oscillation
Results: Scrap and downtime
A simple observation was made: “It’s no wonder the board failed, it’s too hot in that cabinet!” In other words, there was a “cause of the cause.” It was instantly clear that heat stress was causing much of the higher downtime we experienced every summer with this vintage of CNC lathe. We should have been identifying the stress for each downtime situation in our log books like this:
Heat Caused: Bad axis drive board
Effect: X-axis oscillation
Results: Scrap and downtime
What other stresses cause electronic, hydraulic and automation equipment downtime? In this instance, the (x) factor was heat: Y, the scrap and downtime, was happening as a function, f(x), of heat. What are the other basic stresses that cause these seemingly random malfunctions, failures and downtime? That same day we brainstormed and came up with this list of basic stresses:
Stresses or (x) factors:
Heat
Vibration
Dirt build-up
Oxidation
Corrosion
Power surges, lightning storm transients, etc.
Hydraulic contamination
Our first efforts to eliminate heat by installing a cabinet air conditioner proved so effective that we pulled away from predictive maintenance and focused on stress elimination to prolong, rather than predict, MTBF. Eliminating stress, or hardening equipment against stress, resulted in such an increase in MTBF that there was little sense in predicting failure when we were finding ways to prevent it. We prolonged reliability and increased machine uptime and utilization.
Today our maintenance history table looks like this:
| Stress | Caused | Effect | Result |
|---|---|---|---|
| Heat | CAU2 board | Egg-shaped cuts | Scrap, downtime |
| Heat | Bad memory board | Part ID growing | Rework, downtime |
| Heat | Axis drive board | Axis oscillation | Scrap, downtime |
| Heat | Spindle CMD board | RPM swings | Rework, downtime |
| Contamination | Servo valve | Y run to limit | Downtime |
| Contamination | Bad solenoid | No coolant | Downtime |
| Contamination | Hydraulic pump | No chuck gripping | Scrap, downtime |
| Contamination | Hydraulic 3W valve | Turret unclamping | Broken tool holder |
| Surges | SCR failed | Z-axis runaway | Downtime |
| Surges | CMD board | No X movement | Downtime |
| Surges | FE-2A board | Only rapid travel | Broken tool, scrap, downtime |
| Surges | Z-PQM drive | Axis not stopping | Scrap, downtime |
| Vibration | Bad limit switch | X-axis crash | Rework, downtime |
| Vibration | Bad encoder | Positioning errors | Scrap, rework, downtime |
| Vibration | Loose FB connector | Y-axis runaway | Rework, downtime |
| Vibration | Cap. on Y FB board | No Z-axis movement | Scrap, downtime |
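Grouping the same entries by stress makes the commonality visible. The sketch below transcribes the stress and component columns of the table and groups them; the variable names are illustrative, not part of the original log-book format.

```python
from collections import defaultdict

# (stress, failed component) pairs transcribed from the maintenance history table.
HISTORY = [
    ("Heat", "CAU2 board"),
    ("Heat", "Bad memory board"),
    ("Heat", "Axis drive board"),
    ("Heat", "Spindle CMD board"),
    ("Contamination", "Servo valve"),
    ("Contamination", "Bad solenoid"),
    ("Contamination", "Hydraulic pump"),
    ("Contamination", "Hydraulic 3W valve"),
    ("Surges", "SCR failed"),
    ("Surges", "CMD board"),
    ("Surges", "FE-2A board"),
    ("Surges", "Z-PQM drive"),
    ("Vibration", "Bad limit switch"),
    ("Vibration", "Bad encoder"),
    ("Vibration", "Loose FB connector"),
    ("Vibration", "Cap. on Y FB board"),
]

# Group the failed components under the stress, the (x) factor, behind them.
by_stress = defaultdict(list)
for stress, component in HISTORY:
    by_stress[stress].append(component)

stress_counts = {stress: len(parts) for stress, parts in by_stress.items()}

for stress, count in stress_counts.items():
    print(f"{stress}: {count} failures")
```

Sixteen unrelated-looking component failures collapse into four stresses, each of which can be attacked once rather than component by component.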
In Six Sigma terms, we had identified seven (x) factors. Of course not all seven (x) factors are present and active on any given computer, machine or piece of equipment. But in the 25 years since this discovery, 21 of which have been spent consulting on maintenance and reliability nationwide, I’ve not been able to add to that list of basic (x) stresses. Sometimes there are other key issues we find, such as poor design, operator abuse or inadequate component ratings, but even these can frequently be endured and downtime (Y) avoided by eliminating the related (x) stress listed above.
So, then, what are the most cost-effective ways to eliminate these stresses? How do we protect or harden equipment against the unavoidable presence of these stresses? In today’s corporate environment, possibly the most effective way to answer and act upon these questions is to use Six Sigma and its DMAIC model of:
Define the problem
Measure the problem
Analyze how the problem can be eliminated
Improve by implementing the solution
Control the solution, to ensure it continues and is improved if practical (kaizen)
It’s been said that 80 percent of a secret’s value is simply knowing the secret exists. Now this “maintenance reliability secret” has been revealed. You can discover the rest on your own to eliminate all seven stresses. We have defined the seven (x) factor stresses that cause 70 to 92 percent of all your expensive unscheduled equipment downtime.
After the “secret” was discovered at John Deere’s Dubuque Works, it took the next two years to analyze and implement solutions to cut unscheduled maintenance downtime by 50 to 60 percent.
The next company I helped through the process took a little more than a year to effect 70 to 80 percent reduction of unscheduled downtime.
Over the past several years, consulting with manufacturers, telecom companies, hospitals, insurance data processing centers, high-security jails and prisons, offshore drilling rigs, petroleum processing plants and semiconductor fabricators, we have been able to reduce unscheduled downtime by 70 to 92 percent and implement the improvements within 30 to 60 days. With Six Sigma DMAIC and the right technical expertise, you can do the same. Search online for “Lean Maintenance via Six Sigma’s DMAIC” and other related articles by this author.