Understanding the physics of failure contributes to our success. It is all about moving from reaction to prediction. In sports science, top athletic coaches monitor every step their athletes take in order to predict how the body will behave under stress. In the past, damage sustained during extreme physical exertion was treated as a random occurrence and analyzed with statistical methods, with little knowledge of what was driving the statistics. Variable speed drives used to be monitored in the same way. Today, predicting failure and acting before it can have an adverse effect on productivity has become the number one priority.
Reliability engineers often use the familiar bathtub curve (Image 2) to describe how failure rates behave as a function of time. In this curve the failure rate decreases while the product is new, stays constant during the product’s useful life and increases as the product approaches the end of its design life.
Image 2: The bathtub curve describes how failure rates behave as a function of time.
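The shape of the curve can be reproduced with a small numerical sketch. The version below is an illustration only: it assumes the failure rate is the sum of a decreasing Weibull term (early failures), a constant term (useful life) and an increasing Weibull term (wear-out). The function names and every parameter value are hypothetical and are not taken from any real product.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Illustrative failure rate (failures per unit per year) at age t_years > 0.

    All numbers below are made-up assumptions chosen only to show the shape.
    """
    early = weibull_hazard(t_years, shape=0.5, scale=50.0)   # shape < 1: decreasing
    useful_life = 0.02                                       # constant rate
    wear_out = weibull_hazard(t_years, shape=5.0, scale=15.0)  # shape > 1: increasing
    return early + useful_life + wear_out


for age in (0.1, 1, 5, 10, 15, 20):
    print(f"age {age:>4} years: failure rate {bathtub_hazard(age):.3f} per year")
```

The printed rates fall during the first years, flatten out and then climb again, which is the bathtub shape. The sketch describes a population; it says nothing about what an individual unit in a particular application will do.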
The bathtub curve is useful when planning spare parts logistics and making warranty provisions. Unfortunately, this statistical model gives very little information about a new product. General statistics are also not very useful from an individual user’s point of view, because applications and conditions vary widely. The bathtub curve does not say what to expect when conditions change or what needs to be done to improve reliability.
A statistical approach is widely used by the insurance industry and by economists. Luckily, physicians don’t use this method when you ask their advice on how to live a healthy life. They understand that behind the statistics are individuals whose own stresses and genome influence what to expect. If the prediction doesn’t match your expectations, physicians try to change it by acting on the underlying variables.
There are only two reasons why products fail: overstress and wear-out. If the applied stress exceeds the product’s strength, the product fails and the result is an overstress failure. A wear-out failure happens when the cumulative stress exceeds the product’s durability.
There are two kinds of products: nominal and defective. A nominal product doesn’t fail at nominal stress and it will last the useful life it was designed for when exposed to the specified stresses. A defective product fails when nominal stress is applied to the defect site.
Note that a defective product does not necessarily fail. It fails only once sufficient stress is applied to the defect. In reality there are many more defective products than failures, because in most applications the applied stresses are lower than those the product was designed for, so the defects remain unnoticed.
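The two failure reasons and the two kinds of products can be put into a small sketch. It is a deliberate simplification: stress is a single number per time step, cumulative stress is a plain sum, and a defect is modeled as reduced strength. The class, the function and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Product:
    strength: float    # highest single stress the product survives
    durability: float  # total accumulated stress the product survives


def first_failure(product: Product,
                  stress_history: Sequence[float]) -> Optional[Tuple[int, str]]:
    """Return (step, failure mode) for the first failure, or None if none occurs."""
    cumulative = 0.0
    for step, stress in enumerate(stress_history):
        if stress > product.strength:
            return step, "overstress"
        cumulative += stress  # assumption: damage adds up linearly
        if cumulative > product.durability:
            return step, "wear-out"
    return None


# Hypothetical units: the defect lowers strength below the nominal stress of 80.
nominal = Product(strength=100.0, durability=1000.0)
defective = Product(strength=60.0, durability=1000.0)

mild_use = [40.0] * 10        # stresses well below the design level
nominal_use = [40.0, 80.0]    # one step at nominal stress
long_use = [40.0] * 30        # mild stress applied for a long time

print(first_failure(defective, mild_use))     # defect stays unnoticed
print(first_failure(defective, nominal_use))  # defect fails at nominal stress
print(first_failure(nominal, nominal_use))    # nominal unit survives
print(first_failure(nominal, long_use))       # nominal unit eventually wears out
```

Under mild use the defective unit never fails, which is why defects outnumber failures in the field.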
Physics of Failure reliability engineers go beyond the statistics and try to understand the product’s strength and the application stresses that influence its reliability. These stresses include temperature, temperature change, humidity, humidity change, voltage, corrosion, vibration, mechanical shock and radiation.
Every time a product is exposed to certain stresses, the stress causes some irreversible damage and the product wears out.
Wear-out can be tested in a laboratory with Accelerated Life Tests (ALT). During these tests, products are exposed to stress levels much higher than those of real use conditions, which makes them age faster. A lifetime of product stress can be applied in months instead of years. Based on these tests, various wear-out models can be created and then used to predict the expected life under various use conditions. The most widely used is the Arrhenius model, which predicts the speed of wear-out as a function of temperature.
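As an illustration, the Arrhenius model is commonly written as an acceleration factor, AF = exp((Ea / k) * (1 / T_use - 1 / T_stress)), with temperatures in kelvin, Ea the activation energy of the failure mechanism and k the Boltzmann constant. The sketch below applies that formula. The function name, the activation energy of 0.7 eV and the two temperatures are assumptions chosen for the example; real values depend on the failure mechanism and have to come from test data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(use_temp_c, stress_temp_c, activation_energy_ev):
    """Acceleration factor of a test at stress_temp_c relative to use at use_temp_c."""
    use_k = use_temp_c + 273.15        # convert Celsius to kelvin
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


# Hypothetical example: 40 degC in use, 85 degC in the test chamber, Ea = 0.7 eV.
af = arrhenius_acceleration(use_temp_c=40.0, stress_temp_c=85.0, activation_energy_ev=0.7)
print(f"acceleration factor: {af:.1f}")
print(f"one test month corresponds to about {af:.0f} months of use")
```

With these assumed numbers the factor comes out at roughly 26, so one month in the chamber stands for about two years in the field. The result is very sensitive to the activation energy, which is why the model is only as good as the test data behind it.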
ABB systematically uses accelerated life tests to demonstrate that its new variable speed drives will fulfil the reliability requirements. In a typical setup the drives are exposed to temperatures almost twice the specification limits. In this way 10 years of real application stresses can be simulated in four to six months.
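The acceleration these figures imply can be worked out with simple arithmetic. The helper below is a hypothetical illustration that only divides field time by test time; it makes no claim about which wear-out model is used in the actual tests.

```python
def implied_acceleration(field_years, test_months):
    """Acceleration factor needed to cover field_years of use in test_months of testing."""
    field_months = field_years * 12
    return field_months / test_months


# 10 years of field stress simulated in six and in four months.
print(implied_acceleration(field_years=10, test_months=6))  # 20.0
print(implied_acceleration(field_years=10, test_months=4))  # 30.0
```

In other words, the test has to age the drive 20 to 30 times faster than normal use does.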
Product specifications usually specify the maximum and minimum stresses during transportation, storage and use. Traditionally products are designed and tested so that they can handle these conditions.
Unfortunately, real life does not necessarily follow the documented conditions, and there is usually also variation in product strength. That is why it is not enough to know that the product complies with its specification. It is equally important to understand what the product’s weakest links are, how much margin they have and how much that margin varies.
In Highly Accelerated Life Tests (HALT) the stresses are increased until the product fails. The stress levels are recorded and each failure is analyzed carefully. Typical lower test temperatures are -60˚C and upper test temperatures are beyond 150˚C. At the upper temperatures plastics are already starting to melt.
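The step-stress idea can be sketched in a few lines: raise the stress step by step, record the level at which the unit fails and compare it with the specification limit to obtain the margin. Everything here is hypothetical, including the function names, the temperatures and the stand-in `survives` check that replaces a real test chamber.

```python
from typing import Callable, Optional


def find_failure_level(survives: Callable[[int], bool],
                       start: int, step: int, limit: int) -> Optional[int]:
    """Raise the stress from start in increments of step; return the first level
    at which the unit fails, or None if it survives up to limit."""
    level = start
    while level <= limit:
        if not survives(level):
            return level
        level += step
    return None


def margin(failure_level: int, spec_limit: int) -> int:
    """Distance between the observed failure level and the specification limit."""
    return failure_level - spec_limit


# Hypothetical unit that stops working at 125 degC or above, specified up to 50 degC.
def unit_survives(temp_c: int) -> bool:
    return temp_c < 125


failure_temp = find_failure_level(unit_survives, start=50, step=10, limit=200)
print(failure_temp)              # 130
print(margin(failure_temp, 50))  # 80
```

Repeating the procedure on several units shows how much the margin varies from one unit to the next, which matters as much as its average.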
The statistical bathtub curve shows a high failure rate when the product is new. This is typical even if the product is 100 percent tested at the factory before shipment. How can we explain this?
The Physics of Failure approach may give some hints. There are two reasons for failure, overstress and wear-out, and two kinds of products, nominal and defective. Analyzing the stresses during the first months or years after shipment readily shows that the stresses the product experiences before its normal use may exceed those applied during factory testing. During transportation and storage the product may be exposed to vibration, mechanical shock, humidity and corrosive gases. During commissioning, abnormal mechanical stress might be applied to connectors or printed circuit boards, and wrong connections or settings can cause excess voltages or temperature changes. Some of these stresses may exceed the product’s nominal strength, and they will certainly cause defects to result in failure.
Once the extra stresses of early life are over and the defective products have been repaired, we enter the phase where failures look random. In reality these failures also have causes: either an overstress, such as electrical overvoltage or loss of cooling, or a latent defect that became visible over time.
As the product gets old, failures become more common. In applications where the stresses are high, products will have a shorter life than in applications with more moderate stress. In mining, for example, huge shovels are used to fill enormous trucks in just two scoops. As the shovels are under immense strain in extreme conditions, the lifetime of the equipment may be reduced.
It is important to understand that stress has an impact on the expected reliability during the whole product life cycle, as shown in Image 3. All products, nominal and defective, follow the same laws of physics, and the same reliability models can be used to predict their future performance.
Image 3: Impact of stress on reliability.