The following method is used to calculate MTBF (Mean Time Between Failures) based on field failures for ACS 400. Only complete inverter module returns are considered in the calculation. The same method can be used for essentially all products where field failures can be measured by quantity and in time.
1. Basic Formulas.
The most common and simplest way of looking at reliability is to assume that an exponential
distribution applies. The reliability (R) or proportion of drives that remains functional after a specific time (t) can then be expressed as:
$$R = e^{-t/\mathrm{MTBF}}$$
This formula can be rewritten as:
$$\mathrm{MTBF} = \frac{-1}{\ln R}$$
MTBF can now be calculated when the reliability (R) is known. Reliability in this case is simply the fraction of units that are still functional after a period of time. For example, if we observe 1000 drives for 1 year and find that 50 of them failed during the year, R = (1000 – 50)/1000 = 0.95, and the formula above gives an MTBF of 19.5 years. The unit of measure is years in this case because t is taken as one observation period and we observed the drives for 1 year. If the observation period had been ½ year, the MTBF would be 19.5 × ½ years = 9.75 years.
Note that the crude method of simply dividing the number of drives observed (1000) by the number of failures (50) yields a similar result (20 years). The difference is that the exponential distribution takes into account the fact that the number of drives left in operation declines during the year.
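The worked example above can be reproduced with a short Python sketch. The function names and the period_years parameter are illustrative assumptions and do not appear in the original spreadsheets; the arithmetic follows the two calculations described in this section.

```python
import math


def mtbf_exponential(units_observed, units_failed, period_years=1.0):
    """MTBF from the exponential model, MTBF = -t / ln(R).

    period_years is the length of the observation period t (hypothetical
    parameter name); the result is expressed in years.
    """
    if not 0 < units_failed < units_observed:
        raise ValueError("need 0 < units_failed < units_observed")
    reliability = (units_observed - units_failed) / units_observed
    return -period_years / math.log(reliability)


def mtbf_crude(units_observed, units_failed, period_years=1.0):
    """Crude estimate: drives observed divided by failures, times the period."""
    if units_failed <= 0:
        raise ValueError("need at least one failure")
    return period_years * units_observed / units_failed


if __name__ == "__main__":
    # 1000 drives observed for 1 year, 50 failures (the example in the text).
    print(round(mtbf_exponential(1000, 50), 1))  # about 19.5 years
    print(mtbf_crude(1000, 50))                  # 20.0 years
```

The two estimates differ by only a few percent here because the failed fraction is small; the gap widens as the failed fraction grows.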
2. Specific method used by New Berlin
When MTBF is to be calculated, the period of time during which failures are counted must be determined. One major factor in selecting a time period is that the failure rate has seasonal variations due to variations in ambient temperature and in loading. For New Berlin this effectively means that the time during which failures are counted must be a multiple of 1 year. A period of exactly 1 year was chosen because a longer period would make the wait before an MTBF value can be calculated too long.
Another consideration is which batch of drives to use for each MTBF calculation. In New Berlin we choose the drives produced in each month, since the number of drives is sufficiently large (approx. 3000) to produce a reasonable number of failures during a year. The number of failures affects the error in the MTBF calculation (see below), so the batch size should be chosen accordingly.
The third decision to make is when to start the year during which failures are counted. A dwell time after the production month is obviously necessary because it takes time for the drives to get into operation. In addition, the initial start up of drives and the initial period in operation is likely to have a higher failure rate (infant mortality) and an MTBF value should not include these extraordinary events. :///.doc
Confidential for Internal Use Only Print Date: 03/28/13 4:08 AM
In New Berlin we chose a dwell time of 9 months. This decision was based on a study of how drive failures occurred in relation to the production date. The study was performed for ACS 400 drives produced in each month from July 2000 through June 2001 (see attached spreadsheet failtime400.xls). The age of each drive at failure was recorded and grouped into 1 month slots. The chart in the spreadsheet, which shows the density function for drive failures, indicates that some sort of stability is reached after 9 months.
As an example of timing, let’s consider drives produced in January, 2000. Failure recording would start on 9/1/00 and continue through 8/31/01. The waiting time before the MTBF for a particular production month can be calculated is therefore 1 year and 9 months.
There are some additional corrections that should be considered in the calculation of MTBF. The first is that some drives fail before the 1 year measuring period begins. These failures are not counted in the MTBF calculation, but they reduce the quantity of drives participating in the 1 year run. Therefore a correction needs to be made to the initial quantity of drives. From spreadsheet failtime.xls it can be calculated that the quantity of drives failing in the 0 – 9 month time frame is approx. 0.85 times the quantity that fail during the 1 year run. The initial quantity is therefore assumed to be Qp – 0.85 * Qf, where Qp is the quantity of drives produced and Qf is the quantity of drives failed in the 1 year run. The quantity still functional at the end of the run is then Qp – 1.85 * Qf. With this in mind the final formula for MTBF becomes:
$$\mathrm{MTBF} = \frac{-1}{\ln\left(\dfrac{Q_p - 1.85 \cdot Q_f}{Q_p - 0.85 \cdot Q_f}\right)}$$
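As a minimal sketch, the corrected formula can be written in Python as follows. The function and parameter names are assumptions made for illustration, and the figures in the usage example (3000 drives produced, 50 failures) are hypothetical rather than taken from MTBF400.xls.

```python
import math


def mtbf_with_dwell_correction(qp, qf, early_failure_ratio=0.85):
    """MTBF in years for one production month, 1 year recording period.

    qp: quantity of drives produced in the month
    qf: quantity of drives that failed during the 1 year run
    early_failure_ratio: failures during the dwell time as a multiple of qf
        (0.85 is the value quoted in the text; the parameter name is
        hypothetical)
    """
    initial = qp - early_failure_ratio * qf  # drives entering the 1 year run
    survivors = initial - qf                 # drives still working at its end
    if qf <= 0 or survivors <= 0:
        raise ValueError("need qf > 0 and fewer failures than drives")
    return -1.0 / math.log(survivors / initial)


if __name__ == "__main__":
    # Hypothetical production month: 3000 drives built, 50 failed in the run.
    print(mtbf_with_dwell_correction(3000, 50))
```

Setting early_failure_ratio to 0 reduces the function to the basic formula in section 1, which is a convenient sanity check.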
The 9 month dwell time and the 12 month failure recording time mean that currently the last production month for which MTBF can be calculated is July, 2001. A way to get preliminary values for a few more months is to use a shorter failure recording period and estimate the correction needed to extend it to the full 12 month period. Spreadsheet failtime.xls gives an indication of suitable correction factors (last column). MTBF values obtained in this way are of course less accurate than those based on a full 12 month period.
The calculated MTBF values for ACS 400 in the US are attached as an Excel spreadsheet
(MTBF400.xls).
3. Errors in MTBF
Although the MTBF is calculated on the entire population of drives from a production month and all failures are recorded during the 12 month recording period, there is a statistical error associated with the MTBF calculation. The error results from the fact that failures occur at random and the recording period is limited in time. In this case, where the failures are counted during a predetermined period, the data is considered time-censored (type 1) and the lower and upper 90 % confidence limits become:
$$\mathrm{MTBF}_l = \mathrm{MTBF} \cdot \frac{2 \cdot Q_f}{\chi^2_{0.05,\; 2 \cdot Q_f + 2}}$$
$$\mathrm{MTBF}_u = \mathrm{MTBF} \cdot \frac{2 \cdot Q_f}{\chi^2_{0.95,\; 2 \cdot Q_f}}$$
Here Qf is the number of drives failed and MTBF is the value calculated per section 2 above. The chi-squared function is readily available in Excel as CHIINV(probability, deg_freedom), which takes the right-tail probability as its first argument.
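For readers working outside Excel, the limits can be sketched in plain Python. This is an illustration under stated assumptions, not the spreadsheet's implementation: all function names are hypothetical, and the inverse chi-squared value is found by bisection on the closed-form right-tail probability that exists for an even number of degrees of freedom (both 2⋅Qf and 2⋅Qf + 2 are even). The first argument follows the CHIINV convention of a right-tail probability.

```python
import math


def chi2_right_tail(x, dof):
    """P(X > x) for a chi-squared variable with an even number of dof.

    For dof = 2m this equals the Poisson probability of at most m - 1
    events with mean x / 2; terms are evaluated in log space.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    if x <= 0:
        return 1.0
    lam = x / 2.0
    total = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(dof // 2)
    )
    return min(total, 1.0)


def chi2_inv_right_tail(probability, dof):
    """Value x with P(X > x) = probability, found by bisection."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be between 0 and 1")
    lo, hi = 0.0, float(dof)
    while chi2_right_tail(hi, dof) > probability:  # widen until bracketed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, dof) > probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mtbf_confidence_limits(mtbf, qf):
    """Lower and upper 90 % limits for time-censored data with qf failures."""
    if qf < 1:
        raise ValueError("need at least one failure")
    lower = mtbf * 2 * qf / chi2_inv_right_tail(0.05, 2 * qf + 2)
    upper = mtbf * 2 * qf / chi2_inv_right_tail(0.95, 2 * qf)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical month: MTBF of 58.6 years calculated from 50 failures.
    print(mtbf_confidence_limits(58.6, 50))
```

The interval narrows as the number of failures grows, which is why the batch size in section 2 is chosen to produce a reasonable number of failures.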
The error is very helpful in determining whether fluctuations seen in MTBF are just noise or a real change in the quality of the drives. With the chosen 90 % confidence, a good fit trend line should fall within the confidence limits 90 % of the time. If it does not, there is a high probability that there are real problems. An example of that is our own MTBF trend (attached below). For the spring of 2001 there are many data points that fall to the side of the trend line. We don’t know for sure what happened back then, but it was the time when manufacture of the main boards was shifted from Flextronics in Sweden to Flextronics in Finland.