Why do variable speed drives fail and how do we test them?

Freezing cold, intense heat, vibration, corrosive chemicals… drives can end up working in some really tough environments. It’s impossible to know what sort of dangers drives will face after they leave the factory.

However, what a manufacturer can do is design them for durability and then test them to ensure they will perform according to specification over their whole lifetime. At ABB we believe that reliability adds value for our customers, so we go to extraordinary lengths to ensure that every drive lives up to high standards of reliability. In this article, Kari Tikkanen, ABB Reliability Engineering Manager, describes how ABB Drives carries out reliability testing.

The physics behind product failure

Before looking at how drives are tested, we need to think about the factors that could make them fail. As we saw in a previous article, the physics of failure (PoF) approach divides products into two types and provides two reasons why products fail.

The two reasons why products fail are overstress and wear-out, and these are related to the product’s strength and durability, respectively. An overstress failure will occur if a product is subjected to stress that exceeds its strength. Wear-out is a longer term failure process: each time a product is exposed to stress it suffers some damage, and the cumulative effect builds up and eventually causes failure when it exceeds the product’s durability.

Image 1. Things fail when stress exceeds strength. Defective products have less than nominal strength and can therefore fail even at nominal stress.

Under the PoF approach, products are considered as either nominal or defective. Nominal products will withstand nominal stress, so they will last their entire design lifetime provided that they are not exposed to stresses exceeding the specified levels. Defective products, by contrast, will fail when nominal stress is applied to the defect site. This might lead you to expect that defective products will not last very long after they are taken into use, but many do in fact continue to function. This is because stress levels in most applications are below the nominal design limits, so defects don’t turn into failures and the products operate without any problems.

Stresses that can affect the reliability of drives are mechanical, thermal, electrical, radiation, and chemical. Image 2 shows some of the ways in which these stresses can lead to overstress and wear-out failures.

Image 2. Typical failure mechanisms for electronic equipment.

Deciding what to test and how to test it

Testing programs for ABB drives are devised according to the physics of failure approach. Testing methods are selected according to the type of product sample and the failure mechanism that is being investigated (see Image 3). Testing is carried out at all stages: in R&D the aim is to verify that the design and component selection meet both the specifications and customer expectations; in production the purpose is to verify the quality of the product and ensure it continues to perform as designed; and when the product eventually fails, failure analysis can be carried out to determine whether something went wrong or the failure was caused by natural wear-out.

Image 3. Testing methods are selected according to the type of sample and failure mechanism.

Type testing carried out as part of R&D focuses on nominal samples, so relatively small sample sizes can be used. Because the samples in the test are nominal products, they should all fail in the same way. An important step at the end of the testing process is to analyze each failure to confirm that the sample was indeed a nominal product and that the failure was not caused by a defective component or a production error. If a defect is found in such a small sample, it most probably indicates a severe quality problem in production.

In the case of testing for defective products, the first decision to be made concerns the sampling rate. In most cases only a small percentage of the products is defective, so it is necessary to test a large number to verify their true proportion. According to statistical theory, if 99% of products are good and the required confidence level is 99%, for example, then it will be necessary to test 459 units without a single failure to confirm that the proportion of defective products really is less than 1%.
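The 459-unit figure can be reproduced with the standard zero-failure (success-run) relation, in which n failure-free units demonstrate reliability R at confidence C when R^n ≤ 1 − C. The following is a minimal sketch of that calculation; the function name is hypothetical and is not taken from any ABB tool.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest number of units that must all pass to demonstrate
    `reliability` at the given `confidence` (zero failures allowed)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    # Require reliability**n <= 1 - confidence, i.e. n >= ln(1 - C) / ln(R).
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# The example from the text: 99% good products, 99% confidence.
print(zero_failure_sample_size(0.99, 0.99))  # 459
```

The sample size grows quickly as the target proportion of defective products shrinks, which is why screening for defects needs far more units than type testing of nominal samples.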

Testing methods

Highly Accelerated Life Testing (HALT)
The purpose of HALT testing is to probe the product’s weakest links and determine how much overstress it can withstand, i.e. to verify the overstress margins. HALT tests often focus on temperature and vibration, both separately and in combination, with typical test temperatures of -55°C up to 150°C and vibration levels up to 50 G. Other stresses commonly used in HALT testing are voltage, current, mechanical shock, over-torqueing of terminals, moisture, etc. Image 4 shows an example stress profile for HALT testing with temperature and vibration.

Image 4. Typical stress profile in Highly Accelerated Life Testing.

HALT testing is most commonly performed on components and sub-assemblies. When the product fails the root causes are analyzed to determine whether a similar failure could occur in a real life situation. In many cases it turns out that the same type of failure could occur in a real application if certain abnormal conditions arose – such as an accident during transportation or malfunctioning of a cooling system. If necessary the design is improved and HALT testing is repeated to verify that the improvements have had the desired effect.

Reliability Demonstration Testing (RDT)
The aim of RDT testing is to confirm that the product’s expected life meets or exceeds the target, and it is generally carried out as part of R&D.

Factors that must be known or determined in order to design ALT/RDT tests are the product’s expected reliability at end of life, the mission profile during its life, required confidence level, and the allowed stress levels identified in HALT tests.

The tests expose the product to the stress it will experience over its entire lifetime. The most common stresses for drives are temperature and temperature cycling. To reduce the time needed for testing, stress levels that exceed the specified operating conditions are used.

Depending on the confidence level required and the reliability levels expected, 7-20 samples are typically needed for RDT testing. The product can be launched on the market following successful RDT testing, but testing nevertheless continues after the launch in the form of ALT testing. This follow-on ALT testing will confirm the validity of the model and may provide opportunities for total life cycle cost reduction if the tested life turns out to be longer than required.

At ABB, complete drives can be RDT tested in a special Reliability Container where they are exposed to drastic stresses. Various reliability models are used to determine the testing time that corresponds to the stresses that the drive will see during its whole life. Typically, a 10-year life can be tested in a few months.
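The article does not say which reliability models are used. As an illustration only, the sketch below assumes the widely used Arrhenius model for steady-state temperature stress and shows how an acceleration factor converts years of service into months of testing. The activation energy (0.7 eV) and the two temperatures are hypothetical values chosen for the example, not ABB data, and temperature cycling would need a different model.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev: float, use_temp_c: float, test_temp_c: float) -> float:
    """Arrhenius acceleration factor of a test at test_temp_c
    relative to operation at use_temp_c (temperatures in Celsius)."""
    t_use = use_temp_c + 273.15    # convert to kelvin
    t_test = test_temp_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_test))


def equivalent_test_months(life_years: float, acceleration: float) -> float:
    """Test duration in months that represents life_years of service."""
    return life_years * 12.0 / acceleration


# Hypothetical inputs: 0.7 eV activation energy, 40 C in use, 85 C in test.
af = arrhenius_acceleration(ea_ev=0.7, use_temp_c=40.0, test_temp_c=85.0)
print(f"acceleration factor: {af:.1f}")
print(f"test time for 10 years of life: {equivalent_test_months(10.0, af):.1f} months")
```

With these assumed values the acceleration factor comes out at roughly 26, so ten years of service corresponds to a test of under five months. The result is very sensitive to the activation energy, which is one reason the model constants need to be confirmed by testing.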

Accelerated Life Testing (ALT)
ALT testing is done to determine the product’s expected lifetime. It is very similar to RDT except that various stress levels are used and the tests are continued until failure.

The difference between the RDT test and the ALT test is that after RDT we don’t know what the actual expected life is, because the units are not supposed to fail. We just know that the product survives x years or longer. ALT tests give us an estimate of the real life. RDT tests also assume certain failure models and material constants, whereas ALT testing provides us with the model and the material constants.

Depending on the expected confidence and reliability levels, ALT testing typically requires 7-60 samples. Large sample sizes and a significant test time are often needed, especially if the activation energy or other coefficients used by the model are not known.
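To illustrate how testing at several stress levels can yield a constant such as the activation energy, the sketch below fits an Arrhenius life model, ln(life) = ln(A) + Ea/(kT), by ordinary least squares. The choice of the Arrhenius model is an assumption, and the "measured" lives are synthetic values generated inside the example from a known activation energy, so they are not real test results.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def fit_arrhenius(temps_c, lives_h):
    """Least-squares fit of ln(life) = ln(A) + Ea / (k * T).
    Returns (Ea in eV, A in hours)."""
    if len(temps_c) != len(lives_h) or len(set(temps_c)) < 2:
        raise ValueError("need matching data from at least two distinct temperatures")
    x = [1.0 / (BOLTZMANN_EV_PER_K * (t + 273.15)) for t in temps_c]
    y = [math.log(life) for life in lives_h]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    ea = sxy / sxx                  # slope of the fitted line is Ea
    ln_a = y_mean - ea * x_mean     # intercept is ln(A)
    return ea, math.exp(ln_a)


# Synthetic data: lives generated from a hypothetical Ea of 0.7 eV.
TRUE_EA_EV = 0.7
TRUE_A_HOURS = 1e-6
test_temps_c = [85.0, 105.0, 125.0]
synthetic_lives_h = [
    TRUE_A_HOURS * math.exp(TRUE_EA_EV / (BOLTZMANN_EV_PER_K * (t + 273.15)))
    for t in test_temps_c
]

ea_fit, a_fit = fit_arrhenius(test_temps_c, synthetic_lives_h)
print(f"fitted activation energy: {ea_fit:.3f} eV")
```

Real failure times scatter around the model, so the fitted constants carry uncertainty that shrinks only with more samples per stress level, which is consistent with the larger sample sizes quoted for ALT.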

Highly Accelerated Stress Screening (HASS)
HASS screening is performed as an integral part of the production process. The idea is to expose products to increased stress levels in order to cause defective units to fail during screening rather than in the infant period of the life cycle. Image 5 is a product ‘bathtub curve’, which shows how failure rates vary over the lifetime. Machines are like humans: the higher the stress the higher the infant mortality, higher sickness rate and shorter life.

Image 5. HASS screening shifts the bathtub curve in order to induce infant mortality failure among defective units and therefore reduce failure rates in actual use.
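The three regions of a bathtub curve are often described with the Weibull hazard function, whose shape parameter determines whether the failure rate falls, stays constant, or rises over time. The sketch below assumes a Weibull model purely for illustration; the article does not state which distribution ABB uses, and the parameter values are arbitrary.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Instantaneous failure rate h(t) = (shape / scale) * (t / scale)**(shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


# Arbitrary illustrative parameters; time units are not tied to real data.
for label, shape in [("infant mortality", 0.5), ("useful life", 1.0), ("wear-out", 3.0)]:
    early = weibull_hazard(1.0, shape, scale=10.0)
    late = weibull_hazard(10.0, shape, scale=10.0)
    trend = "falling" if late < early else "rising" if late > early else "constant"
    print(f"{label:16s} shape={shape}: hazard is {trend}")
```

A shape parameter below 1 gives the falling failure rate of the infant period that HASS aims to move into the factory, while a shape above 1 gives the rising rate of wear-out.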

HASS involves a trade-off: while the aim is to reduce failures in actual use, the screening process itself shortens the product’s lifetime slightly because the application of stress contributes to wear-out. Therefore it is important to carefully consider the balance between benefits and drawbacks when planning HASS. Is it worth using up two months of a ten year lifetime, for example, if this will eliminate x% of the product’s infant mortality failures? HASS is naturally most beneficial for products with high infant mortality failure rates. Downsides of HASS are that the screening costs are high and production throughput times are increased.

As a practical example of how HASS is used in production processes at ABB Drives, test cabinets are used to screen main circuit boards and gate driver control boards. The boards are connected to a power supply which is cycled during screening. They are tested for several days at temperatures exceeding the maximum operational temperature. This stress is equivalent to the stress of a few weeks of normal operation.

Ongoing Reliability Testing (ORT)
The purpose of ORT is to ensure that no changes have occurred in components or production processes that will have a systematic impact on reliability. The methodology is similar to RDT/ALT testing, but the sample units are randomly selected from actual production. In the case of ABB’s drives, testing can be performed on drive modules, IGBT drive packages, printed circuit boards and even complete drives. Depending on the risk level that has been set, samples are taken from production each week or month.

If a failure occurs during testing, a thorough root cause analysis is carried out to check whether the failure was a random event or due to a change in the product or one of the components.

Testing drives together with motors

When OEMs are sourcing a new drive they want to know how it will perform with their motors. A facility at ABB Drives – the Drives Customer Laboratory – enables drives to be tested with the customer’s own motors or any ABB motors. The tests are set up so that the loading conditions simulate the actual application, and the purpose is to find the optimal drive system for the application. This testing is therefore concerned with performance of the motor/drive system rather than reliability.

The test equipment measures drive/motor dynamic performance, load capability and efficiency. This data is used to optimize the drive/motor combination, which can help to reduce costs, space requirements, and energy consumption. It can also avoid over-dimensioning, which often happens when drives are selected without access to accurate performance data.

The product stresses can also be measured accurately during the tests. This information can be used to predict the reliability of the product in the application, and the total life cycle costs can be optimized by sizing the product correctly and designing an optimal service program.

Failure analysis

When a failure occurs, the next step is naturally to ask what went wrong – and whether something could be done to avoid similar failures in future. The Drives Failure Analysis Laboratory undertakes root cause analysis using equipment and techniques such as 3D X-rays, acoustic scanners, SAM microscopes, and component cross sectioning. The aim is to determine whether the product was nominal or defective and what caused the stress that led to the failure. Modern failure analysis is an interesting topic that will be examined in detail in a future article.