Reliability Engineering

(Assignment I – Construction Methods and Management)





February, 2013

What is Reliability?
The reliability of an item or system is the probability that it performs a specified
function under specified operational and environmental conditions for a specified
period of time.

Quantitatively, reliability is the probability of success. In practice it is usually
expressed through the mean time between failures (MTBF).
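
As a rough illustration of how a survival probability and an MTBF relate, the sketch below
assumes a constant failure rate (an exponential life model, which the text above does not
state), so that R(t) = exp(-t / MTBF). The function name and the sample figures are
hypothetical.

```python
import math


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of running t_hours without failure, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-t_hours / mtbf_hours)


# Hypothetical item with an MTBF of 1,000 hours
r_100 = reliability(100, 1000)    # chance of completing a 100-hour mission
r_mtbf = reliability(1000, 1000)  # chance of surviving for one full MTBF
print(f"R(100 h) = {r_100:.3f}, R(1000 h) = {r_mtbf:.3f}")
```

Under this assumption only about 37 percent of items survive as long as the MTBF, so an
MTBF figure should not be read as a guaranteed life.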

Reliability can also be defined as a collection of planned activities (established through
formal and informal management systems) that work together effectively to prevent loss of
system function. This second definition describes a managed approach to maintaining the
reliability of system functions. Both definitions refer to the system and to maintaining its
functionality.

Reliability engineering has not developed as a unified discipline, but has grown out of the
integration of a number of activities which were previously the province of the engineer.

Since no human activity can enjoy zero risk, and no equipment a zero rate of failure, there
has grown a safety technology for optimizing risk. This attempts to balance the risk against
the benefits of the activities and the costs of further risk reduction.

Similarly, reliability engineering, beginning in the design phase, seeks to select the design
compromise which balances the cost of failure reduction against the value of enhancement.

Why is Reliability Important?

There are a number of reasons why reliability is an important product attribute, including:

- Reputation. A company's reputation is very closely related to the reliability of its
products. The more reliable a product is, the more likely the company is to have a
favorable reputation.
- Customer Satisfaction. While a reliable product may not dramatically affect customer
satisfaction in a positive manner, an unreliable product will severely damage it. Thus
high reliability is a mandatory requirement for customer satisfaction.
- Warranty Costs. If a product fails to perform its function within the warranty period,
the replacement and repair costs will negatively affect profits, as well as attract
unwanted negative attention. Introducing reliability analysis is an important step in
taking corrective action, ultimately leading to a product that is more reliable.
- Repeat Business. A concentrated effort towards improved reliability shows existing
customers that a manufacturer is serious about its product, and committed to
customer satisfaction. This type of attitude has a positive impact on future business.
- Cost Analysis. Manufacturers may take reliability data and combine it with other cost
information to illustrate the cost-effectiveness of their products. This life cycle cost
analysis can show that although the initial cost of a product might be higher, the
overall lifetime cost is lower than that of a competitor's product because it
requires fewer repairs or less maintenance.
- Customer Requirements. Many customers in today's market demand that their
suppliers have an effective reliability program. These customers have learned the
benefits of reliability analysis from experience.
- Competitive Advantage. Many companies will publish their predicted reliability
numbers to help gain an advantage over their competitors who either do not publish
their numbers or have lower numbers.
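
The life cycle cost argument in the Cost Analysis item can be made concrete with a small
calculation. The sketch below is a simplified model under stated assumptions: costs are
not discounted, failures occur at a steady average rate, and every figure and name is
hypothetical.

```python
def life_cycle_cost(purchase_price, annual_maintenance, failures_per_year,
                    cost_per_repair, years):
    """Undiscounted life cycle cost: purchase price plus yearly upkeep and repairs."""
    yearly_cost = annual_maintenance + failures_per_year * cost_per_repair
    return purchase_price + years * yearly_cost


# Hypothetical comparison over a 10-year service life
reliable_product = life_cycle_cost(10_000, 500, 0.5, 1_000, 10)
cheaper_product = life_cycle_cost(8_000, 800, 2.0, 1_000, 10)
print(f"reliable product: {reliable_product:,.0f}")
print(f"cheaper product:  {cheaper_product:,.0f}")
```

With these made-up inputs the product with the higher purchase price is the cheaper one to
own, which is the point the manufacturer would want to illustrate.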

What is the difference between Quality and Reliability?

Even though a product has a reliable design, when the product is manufactured and used in
the field, its reliability may be unsatisfactory. The reason for this low reliability may be that
the product was poorly manufactured. So, even though the product has a reliable design, it
is effectively unreliable when fielded, which is actually the result of a substandard
manufacturing process. As an example, cold solder joints could pass initial testing at the
manufacturer, but fail in the field as the result of thermal cycling or vibration. This type of
failure did not occur because of an improper design, but rather it is the result of an inferior
manufacturing process. So while this product may have a reliable design, its quality is
unacceptable because of the manufacturing process.

Just like a chain is only as strong as its weakest link, a highly reliable product is only as good
as the inherent reliability of the product and the quality of the manufacturing process.

Importance of Failure Data

Throughout the history of engineering, reliability improvement (also called reliability
growth) arising as a natural consequence of the analysis of failure has been a central
feature of development. This ‘test and correct’ principle was practiced long before the
development of formal procedures for data collection and analysis, because failure is usually
self-evident and thus leads inevitably to design modifications.

The design of safety-related systems (for example, railway signaling) has evolved partly in
response to the emergence of new technologies but largely as a result of lessons learnt from
failures. The application of technology to hazardous areas requires the formal application of
this feedback principle in order to maximize the rate of reliability improvement.
Nevertheless, all engineered products will exhibit some degree of reliability growth, as
mentioned above, even without formal improvement programs.

Nineteenth and early twentieth century designs were less severely constrained by cost
and schedule pressures than those of today. Thus, in many cases, high levels of reliability
were achieved as a result of over-design, and the need for quantified reliability-assessment
techniques during design and development was not identified. Failure rates of engineered
components were therefore not required, as they are now, for use in prediction techniques,
and consequently there was little incentive for the formal collection of failure data.

Another factor is that, until well into the twentieth century, component parts were
individually fabricated in a ‘craft’ environment. Mass production and the attendant need for
component standardization did not apply, and the concept of a valid, repeatable component
failure rate could not exist. The reliability of each product was, therefore, highly dependent
on the craftsman/manufacturer and less determined by the ‘combination’ of part reliabilities.

Nevertheless, mass production of standard mechanical parts has been the case since early in
the twentieth century. Under these circumstances defective items can be identified readily,
by means of inspection and test, during the manufacturing process, and it is possible to
control reliability by quality-control procedures.

The advent of the electronic age, accelerated by the Second World War, led to the need for
more complex mass-produced component parts with a higher degree of variability in the
parameters and dimensions involved. The experience of poor field reliability of military
equipment throughout the 1940s and 1950s focused attention on the need for more formal
methods of reliability engineering. This gave rise to the collection of failure information
both from the field and from the interpretation of test data. Failure rate data banks were
created in the mid-1960s as a result of work at organizations such as UKAEA (UK Atomic
Energy Authority), RRE (Royal Radar Establishment, UK) and RADC (Rome Air
Development Center, US).

The manipulation of the data was manual and involved the calculation of rates from the
incident data, inventories of component types and the records of elapsed hours. This
activity was stimulated by the appearance of reliability prediction modeling techniques
which require component failure rates as inputs to the prediction equations.
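
The rate calculation described above can be sketched as follows: the number of recorded
failures is divided by the cumulative operating hours, obtained from the inventory of each
component type and the elapsed hours. The component names and figures are hypothetical,
and the sketch assumes every unit of a type ran for the same number of hours.

```python
def failure_rate(failures: int, units_in_service: int, hours_per_unit: float) -> float:
    """Failures per unit-hour: failure count divided by cumulative operating hours."""
    cumulative_hours = units_in_service * hours_per_unit
    if cumulative_hours <= 0:
        raise ValueError("cumulative operating hours must be positive")
    return failures / cumulative_hours


# Hypothetical records: component type -> (failures, inventory, elapsed hours per unit)
records = {
    "relay": (12, 400, 5000.0),
    "pump": (3, 50, 8000.0),
}
rates = {name: failure_rate(*data) for name, data in records.items()}
for name, lam in rates.items():
    print(f"{name}: {lam * 1e6:.1f} failures per million hours")
```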

The availability and low cost of desktop personal computing (PC) facilities, together with
versatile and powerful software packages, have permitted the listing and manipulation of
incident data for an order of magnitude less expenditure of working hours. Fast automatic
sorting of the data encourages the analysis of failures into failure modes. This is no small
factor in contributing to more effective reliability assessment, since generic failure rates
permit only parts count reliability predictions. In order to address specific system failures
it is necessary to input component failure modes into the fault tree or failure mode analysis.
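
A parts count prediction of the kind mentioned above can be sketched in a few lines. The
sketch assumes constant failure rates and that any single part failure fails the system (a
series model), so the system failure rate is the sum of quantity times generic failure rate.
The bill of materials and its rates are hypothetical.

```python
def parts_count_failure_rate(parts):
    """Sum quantity x generic failure rate over all part types (series system assumed)."""
    return sum(quantity * rate for quantity, rate in parts)


# Hypothetical bill of materials: (quantity, failures per hour)
parts = [
    (20, 0.05e-6),  # resistors
    (10, 0.10e-6),  # capacitors
    (2, 1.00e-6),   # connectors
]
system_rate = parts_count_failure_rate(parts)
system_mtbf = 1.0 / system_rate
print(f"system failure rate = {system_rate * 1e6:.1f} per million hours")
print(f"predicted MTBF = {system_mtbf:,.0f} hours")
```

Because every part is treated alike, this method cannot say which failure mode causes which
system failure. That is the limitation the paragraph above points to.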

With the rapid growth of built-in test and diagnostic features in equipment, a future trend
may be the emergence of some limited automated fault reporting.

Some definitions:
Maintainability: The ability of an item, under stated conditions of use, to be retained in, or
restored to, a state in which it can perform its required function(s), when maintenance is
performed under stated conditions and using prescribed procedures and resources. It is
expressed as Mean Time To Repair (MTTR).

Availability: The probability that a system is available for use at a given time; it is a
function of reliability and maintainability. It can be measured as operating time divided by
load time, where load time is the available time per day minus the planned downtime.

Failure: The termination of the ability of an item to perform its required function.

Inherent Availability

Inherent availability considers only corrective maintenance in an ideal support environment
(with neither administrative nor logistic delays).

When equipment is in a failed state, it is no longer available for work, and its availability
decreases. The longer the equipment remains in a failed state (downtime), the lower its
measured maintainability.
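
The two availability measures in this section can be illustrated with a short calculation.
The first function follows the operating time over load time definition given earlier. It
assumes that operating time equals load time minus unplanned downtime, which the text does
not spell out. The second uses the usual textbook form of inherent availability,
MTBF / (MTBF + MTTR), which is also not stated in the text. All figures are hypothetical.

```python
def operational_availability(available_time, planned_downtime, unplanned_downtime):
    """Operating time divided by load time, all in the same time unit."""
    load_time = available_time - planned_downtime
    if load_time <= 0:
        raise ValueError("load time must be positive")
    operating_time = load_time - unplanned_downtime  # assumption, see lead-in
    return operating_time / load_time


def inherent_availability(mtbf, mttr):
    """Corrective maintenance only, with no administrative or logistic delays."""
    return mtbf / (mtbf + mttr)


# Hypothetical day: 24 h available, 2 h planned downtime, 1.1 h of breakdowns
a_operational = operational_availability(24.0, 2.0, 1.1)
# Hypothetical equipment: MTBF of 190 h, MTTR of 10 h
a_inherent = inherent_availability(190.0, 10.0)
print(f"operational availability = {a_operational:.2%}")
print(f"inherent availability    = {a_inherent:.2%}")
```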

What is Reliability Engineering?

Reliability engineering consists of the systematic application of time-honored engineering
principles and techniques throughout a product lifecycle and is thus an essential component
of a good Product Lifecycle Management (PLM) program. The goal of reliability engineering
is to evaluate the inherent reliability of a product or process and pinpoint potential areas for
reliability improvement. Realistically, not all failures can be eliminated from a design, so
another goal of reliability engineering is to identify the most likely failures and then identify
appropriate actions to mitigate the effects of those failures.

The reliability evaluation of a product or process can include a number of different reliability
analyses. Depending on the phase of the product lifecycle, certain types of analysis are
appropriate. As the reliability analysis is being performed, it is possible to anticipate the
reliability effects of design changes and corrections. The different reliability analyses are all
related, and examine the reliability of the product or system from different perspectives, in
order to determine possible problems and assist in analyzing corrections and improvements.

Reliability engineering can be done by a variety of engineers, including reliability engineers,
quality engineers, test engineers, systems engineers or design engineers. In highly evolved
teams, all key engineers are aware of their responsibilities with regard to reliability and work
together to help improve the product.

The reliability engineering activity should be an ongoing process starting at the conceptual
phase of a product design and continuing throughout all phases of a product lifecycle. The
goal always needs to be to identify potential reliability problems as early as possible in the
product lifecycle. While it may never be too late to improve the reliability of a product,
changes to a design are orders of magnitude less expensive in the early part of a design
phase than once the product is manufactured and in service.

Important aspects of Reliability Engineering

1. Current knowledge of predictive, analytical, and compliance technologies, and the
ability to apply these techniques to add value to the firm.
2. The adaptation and application of concepts such as Total Productive Maintenance (TPM)
and Reliability-Centered Maintenance (RCM).
3. The development and implementation of proactive M&R plan(s) to eliminate
maintenance requirements, minimize the use and costs of reactive maintenance,
maximize the benefits of preventive maintenance (PM) and predictive maintenance (PdM),
and achieve increasing levels of integrated asset management.
4. The ability to lead or technically support multidisciplinary teams.
5. During design, advising other engineers on reliability prediction for their systems and
on tactics to improve reliability, such as redundancy, parts derating, and failure mode
and effects analysis.
6. During design, participating in trade-off studies among performance, cost, and
reliability. Reliability estimates are a key input to Life Cycle Costing (LCC).
7. During development, updating reliability predictions and preparing reliability test
plans.
8. During pre-production, verifying the reliability of subsystems and of the entire system
through various types of testing.
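
Redundancy, one of the design tactics named in the list above, lends itself to a short
worked example. The sketch assumes that units fail independently and that a redundant group
works as long as at least one of its units works. The reliability values are hypothetical.

```python
from math import prod


def series(reliabilities):
    """All units must work: the reliabilities multiply."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one unit must work: one minus the chance that all fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


single_pump = 0.90                                   # hypothetical mission reliability
duplicated_pumps = parallel([single_pump, single_pump])
with_controller = series([duplicated_pumps, 0.95])   # redundant pumps feeding one controller
print(f"one pump: {single_pump:.3f}, two in parallel: {duplicated_pumps:.3f}")
print(f"pumps plus controller in series: {with_controller:.4f}")
```

The example also shows the limit of the tactic: once the pumps are duplicated, the
non-redundant controller dominates the result.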

What’s the role of the reliability engineer?

The primary role of the reliability engineer is to identify and manage asset reliability risks
that could adversely affect plant or business operations. This broad primary role can be
divided into three smaller, more manageable roles: loss elimination, risk management and
life cycle asset management (LCAM).

Loss Elimination
One of the fundamental roles of the reliability engineer is to track production losses and
assets with abnormally high maintenance costs, then find ways to reduce those losses or
costs. The losses are prioritized to focus efforts on the largest or most critical opportunities.
The reliability engineer (in full partnership with the operations team) develops a plan to
eliminate or reduce the losses through root cause analysis, obtains approval of the plan and
facilitates its implementation.
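
The prioritization step can be sketched as a simple Pareto ranking: losses are sorted from
largest to smallest and a running share of the total is kept, so the few assets responsible
for most of the loss stand out. The asset names and cost figures are hypothetical, and
ranking by annual cost alone is an assumption. In practice criticality may also be weighed.

```python
def rank_losses(losses):
    """Return (asset, cost, cumulative share of total) sorted by cost, largest first."""
    total = sum(losses.values())
    ranked = sorted(losses.items(), key=lambda item: item[1], reverse=True)
    result = []
    running = 0.0
    for asset, cost in ranked:
        running += cost
        result.append((asset, cost, running / total))
    return result


# Hypothetical annual losses per asset (production loss plus maintenance cost)
annual_losses = {"conveyor": 50_000, "compressor": 120_000, "mixer": 30_000}
ranked = rank_losses(annual_losses)
for asset, cost, share in ranked:
    print(f"{asset:<12}{cost:>10,}  cumulative {share:.0%}")
```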

Risk Management
Another role of the reliability engineer is to manage risk to the achievement of an
organization’s strategic objectives in the areas of environmental health and safety, asset
capability, quality and production. Some tools used by a reliability engineer to identify and
reduce risk include:

- PHA – Preliminary hazards analysis
- FMEA – Failure modes and effects analysis
- CA – Criticality analysis
- SFMEA – Simplified failure modes and effects analysis
- MI – Maintainability information
- FTA – Fault tree analysis
- ETA – Event tree analysis
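
To give a feel for one of these tools, the sketch below evaluates a very small fault tree.
It assumes independent basic events and only AND and OR gates. The tree structure, the
representation as nested tuples and the probabilities are all hypothetical.

```python
from math import prod


def top_event_probability(node):
    """Evaluate a fault tree of independent basic events.

    A node is either a number (basic-event probability) or a tuple
    (gate, children) where gate is "AND" or "OR".
    """
    if isinstance(node, (int, float)):
        return float(node)
    gate, children = node
    probabilities = [top_event_probability(child) for child in children]
    if gate == "AND":
        return prod(probabilities)                         # all inputs must occur
    if gate == "OR":
        return 1.0 - prod(1.0 - p for p in probabilities)  # any input suffices
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical tree: cooling is lost if the power supply fails OR both pumps fail
tree = ("OR", [0.001, ("AND", [0.02, 0.02])])
p_top = top_event_probability(tree)
print(f"top event probability = {p_top:.6f}")
```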

Life Cycle Asset Management

Studies show that as much as 95 percent of the total cost of ownership (TCO) or life cycle
cost (LCC) of an asset is determined before it is put into use. This reveals the need for the
reliability engineer to be involved in the design and installation stages of projects for new
assets and modification of existing assets.

Maintenance Prevention
The goal of maintenance prevention (MP) is to reduce maintenance costs and deterioration
losses in new equipment by considering past maintenance data and the latest technology
when designing for higher reliability, maintainability, operability, flexibility, safety, and other

Objectives of Maintenance Prevention

1. Reduce the time taken from design to stable operation.
2. Accomplish the transition efficiently with minimum labor and a balanced workload.
3. Ensure that equipment is designed to be highly reliable, maintainable, economical,
operable, and safe.
4. Minimize future maintenance costs and deterioration losses of new equipment.
5. Improve equipment reliability by investigating weaknesses in existing equipment and
feeding the information back to the designers.

Ideally, MP-designed equipment should not break down or produce out-of-spec
products. It should be easy and safe to maintain.

MP design activities are subject to the following constraints:

- Technology (production and equipment technology)
- Quantitative and qualitative equipment capacity
- Basic equipment specifications
- Capital budget
- Operating costs (operator labor, raw material yields, maintenance costs, energy
costs, etc.)

At each stage of MP design, possible problems with respect to the following issues need
to be examined:

- Quality
- Productivity
- Operability
- Energy-saving
- Cost
- Maintainability
- Safety and environment

A standard checklist must be used at each stage.

MP should be performed for capital projects, redesign or modification of current assets,
equipment installation and commissioning, and replacement parts and components.


