the random failure events can have different probability distribution laws (e.g.,
exponential, normal, lognormal, Weibull, gamma, and Rayleigh [29, 31]). The
operating time is the duration for which the product performs its required function.
For a nonrepairable product, the mean operating time is also referred to as the mean
time to failure (MTTF). For a product that can be completely repaired, the mean
time of operation becomes the mean time between failures (MTBF). As most electronic and micromachined components are difficult to repair after failure, we will limit the discussion of lifetime to MTTF.
Knowledge of the probability distribution function, f(t), is necessary to compute
the probability of a unit failing as well as the failure rate and MTTF [28, 29]. The
probability of a failure by time t, defined as F(t), is the area under the distribution function up to t, mathematically given by the integral of f(t) from zero to t. It is mostly a
mathematical concept that is not widely used in specifying product reliability.
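In standard notation (with τ as a dummy integration variable), these quantities follow from f(t) as

F(t) = \int_0^{t} f(\tau)\, d\tau, \qquad \mathrm{MTTF} = \int_0^{\infty} t\, f(t)\, dt,

so that 1 − F(t) is the probability that a unit is still operating at time t; the MTTF expression is the standard definition for a nonrepairable product rather than a formula quoted from this page.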
Instead, failure rate and MTTF are the two key and practical specifications in the
assessment and prediction of reliability. The failure rate, also known as hazard rate,
Z(t), is a measure of the instantaneous speed of failure, effectively the number of
failures over a given period of time. Consequently, it has units of failures per unit time, most commonly one failure in one billion hours (10⁻⁹/hr), also known as a failure in time (FIT). Mathematically, it can be shown that Z(t) = f(t)/[1 − F(t)]. Experimentally, the failure rate is calculated as the ratio of the observed number of failures
occurring in a time interval to the number of functional devices at the beginning of
this time interval, normalized to the length of the time interval [28]. The larger the number of devices and the longer the observation time, the higher the statistical confidence in the measured failure rate. This confidence is mathematically reflected by multiplying the measured failure rate by the statistical chi-squared (χ²) parameter [31]. When the observation time required to achieve reasonable confidence is impractically long, temperature-based accelerated life testing (described later)
becomes an invaluable tool to extrapolate values for the failure rate and MTTF.
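As a numerical sketch of these definitions (Python with SciPy; the function names, sample counts, and the time-terminated chi-squared convention below are illustrative assumptions, not values or notation taken from this text), the failure rate in FIT and a chi-squared upper confidence limit can be computed as follows:

    from scipy.stats import chi2

    def failure_rate_fit(failures, device_hours):
        # Point estimate of the failure rate in FIT (failures per 1e9 device-hours).
        return failures / device_hours * 1e9

    def failure_rate_ucl_fit(failures, device_hours, confidence=0.60):
        # Upper confidence limit using the chi-squared distribution, here with the
        # common time-terminated convention lambda_UCL = chi2(conf; 2r + 2) / (2 T),
        # where r is the observed failure count and T the accumulated device-hours.
        dof = 2 * failures + 2
        return chi2.ppf(confidence, dof) / (2.0 * device_hours) * 1e9

    # Example: 2 failures in 1,000 devices tested for 10,000 hours each (1e7 device-hours).
    T = 1_000 * 10_000
    lam = failure_rate_fit(2, T)          # 200 FIT point estimate
    lam_ucl = failure_rate_ucl_fit(2, T)  # roughly 310 FIT at 60% confidence
    mttf_hours = 1e9 / lam                # MTTF = 1/lambda when the failure rate is constant

Note that the point estimate depends only on the total accumulated device-hours, so 5,000 devices tested for 2,000 hours each would give the same figure; this is the trade-off between device count and test time noted later for the constant-failure-rate case.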
The experimentally observed failure rate of many high-technology products,
including electronic, fiber-optical, and micromachined components, exhibits a
characteristic time-dependent behavior that is best described by the “bathtub”
curve (see Figure 8.17). This curve shows an early stage in the life of the product
with a rapidly decreasing failure rate resulting from better screening, improving reli-
ability, and lower infant mortality. A second stage characterized by a rather con-
stant failure rate defines the mean useful life of the component in the field. A rising
failure rate brought by an increase in wear signals the onset of the last stage and the
end of the useful life.
Reliability scientists model the bathtub curve as a superposition of three
different probability distribution functions, one for each stage in the curve. The
Weibull distribution function best models the early stage, whereas the lognormal dis-
tribution is used to model the third stage. The exponential distribution best describes the middle span because it models a constant failure rate, which we denote as
λ. The overall failure rate curve is the sum of all three contributions (see Figure 8.17).
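A minimal sketch of this superposition (Python with NumPy and SciPy; the distribution parameters below are arbitrary illustrative choices, not values from the text or from Figure 8.17) evaluates the hazard rate Z(t) = f(t)/[1 − F(t)] of each distribution and adds them:

    import numpy as np
    from scipy.stats import weibull_min, expon, lognorm

    def hazard(dist, t):
        # Hazard (failure) rate Z(t) = f(t) / [1 - F(t)] for a frozen SciPy distribution.
        return dist.pdf(t) / dist.sf(t)

    t = np.linspace(0.1, 20.0, 200)            # arbitrary time axis

    early   = weibull_min(c=0.5, scale=50.0)   # Weibull, shape < 1: decreasing hazard (early stage)
    useful  = expon(scale=1.0 / 0.02)          # exponential: constant hazard, lambda = 0.02 (useful life)
    wearout = lognorm(s=0.3, scale=25.0)       # lognormal: hazard climbs as t nears the wear-out region

    bathtub = hazard(early, t) + hazard(useful, t) + hazard(wearout, t)
    # The sum falls steeply at small t, stays near its minimum through midlife,
    # and rises again at large t, tracing the bathtub shape of Figure 8.17.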
The middle span is the one that attracts the most attention, as it describes the reliability of the product during its useful life. A key characteristic of the exponential
distribution function is its time-independent failure rate, which allows for varying
the combination of the number of devices under test and the hours of testing. For