09 RTS Redundancy


Reliability of Technical Systems

Main Topics
1. Short Introduction, Reliability Parameters: Failure Rate, Failure
Probability, etc.
2. Some Important Reliability Distributions
3. Component Reliability
4. Introduction, Key Terms, Framing the Problem
5. System Reliability I: Reliability Block Diagram, Structure Analysis (Fault
Trees), State Model.
6. System Reliability II: State Analysis (Markovian chains)
7. System Reliability III: Dependent Failure Analysis
8. Data Collection, Bayes Theorem, Static and Dynamic Redundancy
9. Advanced Methods for Systems Modeling and Simulation (Petri Nets,
network theory, object-oriented modeling)
10. Software Reliability, Fault Tolerance
11. Human Reliability Analysis
12. Case study: Building a Reliable System

HS 10 / ETH Zürich Reliability of technical Systems 2


Data Collection

• Specific data
Data available for the very unit that is the subject of the
analysis; its validity is therefore given.
This kind of data is ideal for a reliability analysis. In practice,
however, it is often scarce.
• Generic data
Data published for "similar" units; its validity for the unit under
analysis is not given per se.
Applying it to other units is questionable, but it is a convenient way
to enlarge the data basis.
• Expert judgement
Subjective judgement of an expert regarding the behaviour of the unit.
Rather inappropriate for a reliability analysis, but often the only
available data source.
Data Collection
Assumptions
Characterizing a unit / component
• Ensure statistical similarity between database and analysis
- Construction
- Conditions, e.g. process parameters (pressure, temperature), medium,
environment, etc.
- Operational conditions, e.g. active versus stand-by
• Definition of a failure
• Definition of an observation period
• Definition of the boundary elements.
Boundaries
[Figure: boundary elements of the unit: motor, pump, pipelines; connections: flange, weld, etc.]
Data Collection
Plant specific data sources
The usual primary sources are plant operating documents (BU), e.g. damage
reports, repair orders, etc.

• Failure modes, causes and impacts are rarely recorded.

• BU are usually not designed for collecting reliability data; to serve that
purpose they must cover at least 90% of all failures (events).



Bayes Theorem
Conditional Probability

• It is often important to compute the probability of an event A given that
another event B has occurred; this is called the conditional probability of
A given B:

$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$

where P(A|B) gives the probability of the event A not on the entire
sample space Ω, but on the sample space restricted to the occurrence of B.

• Event A is said to be statistically independent of event B if

P(A|B)=P(A)

• Statistical independence should not be confused with mutual exclusivity
(X_A X_B = 0, i.e. A ∩ B = ∅), which represents a logical dependence: knowing
that A has occurred guarantees that B cannot occur.
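
The difference between independence and mutual exclusivity can be checked on a small example. The sketch below assumes a single fair die as sample space; the events A, B and C and the helper names `prob` and `cond_prob` are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Assumed sample space: the six equally likely outcomes of one fair die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Probability of an event (a set of outcomes) under equally likely outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B); only defined for P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)


A = frozenset({2, 4, 6})     # "even number"
B = frozenset({1, 2, 3, 4})  # "at most four"
C = frozenset({1, 3, 5})     # "odd number", mutually exclusive with A

# Independent: knowing B does not change the probability of A.
independent = cond_prob(A, B) == prob(A)

# Mutually exclusive: knowing C rules A out, so A and C are dependent.
p_a_given_c = cond_prob(A, C)

print(independent, p_a_given_c, prob(A))
```

Here P(A|B) equals P(A), while P(A|C) drops to zero although P(A) is 1/2, which is exactly the logical dependence described above.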



Bayes Theorem
Conditional Probability : Exercise Example

There are two streams flowing past an industrial plant. The dissolved
oxygen (DO) level in the water downstream is an indication of the
degree of pollution caused by the waste dumped from the plant. Let A
denote the event that stream a is polluted, and B the event that
stream b is polluted. From measurements of the DO level of each
stream taken over the last year, it was determined that on a given day

P(A) = 2/5 and P(B) = 3/4

and the probability that at least one stream is polluted on any given
day is P(A ∪ B) = 4/5

Q1: Determine the probability that stream a is also polluted given that
stream b is polluted.
Q2: Determine the probability that stream b is also polluted given that
stream a is polluted.



Bayes Theorem
Conditional Probability : Exercise - Solution

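We have

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

The probability that both streams are polluted is therefore

P(A ∩ B) = P(A) + P(B) - P(A ∪ B) = (2/5)+(3/4)-(4/5)=7/20

For Q1:

P(A|B) = P(A ∩ B) / P(B) = (7/20)/(3/4) = 7/15

For Q2:

P(B|A) = P(A ∩ B) / P(A) = (7/20)/(2/5) = 7/8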
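
The exercise can be re-computed with exact fractions. This is only a minimal check of the arithmetic; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_a = Fraction(2, 5)       # P(A): stream a is polluted
p_b = Fraction(3, 4)       # P(B): stream b is polluted
p_a_or_b = Fraction(4, 5)  # P(A ∪ B): at least one stream is polluted

# Addition rule, solved for the probability that both streams are polluted.
p_a_and_b = p_a + p_b - p_a_or_b

p_a_given_b = p_a_and_b / p_b  # Q1
p_b_given_a = p_a_and_b / p_a  # Q2

print(p_a_and_b, p_a_given_b, p_b_given_a)  # 7/20 7/15 7/8
```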



Bayes Theorem
Theorem of Total Probability

• Consider a partition of the sample space Ω into n mutually exclusive
and exhaustive events Ej, j = 1, 2, …, n.

• Given any event A in Ω, its probability can be computed in terms of the
partitioning events Ej (j = 1, 2, …, n) and the conditional probabilities of A
on these events:

$$P(A) = \sum_{j=1}^{n} P(A|E_j)\, P(E_j)$$



Bayes Theorem
Bayes Theorem

• What is the probability that event Ej has occurred, given the evidence
that event A has occurred?

$$P(E_j|A) = \frac{P(A|E_j)\, P(E_j)}{P(A)} = \frac{P(A|E_j)\, P(E_j)}{\sum_{i=1}^{n} P(A|E_i)\, P(E_i)}$$

• This equation updates the prior probability P(Ej) of event Ej to
the posterior probability P(Ej|A), where P(A) can be computed by
applying the theorem of total probability.



Bayes Theorem
Bayes Theorem : Exercise Example

Identical components are purchased from 3 suppliers (S1, S2, S3) in
quantities of 1000, 600 and 400 pieces, respectively. The probabilities for one
component to be defective are 0.006 for S1, 0.02 for S2, and 0.03 for S3.
All the components are stored in a common container regardless of their
source.

Q1. What is the probability that one component randomly selected from the
stock is defective?

Q2. Suppose the component selected in the previous question is defective.
What is the probability that it is from S1?



Bayes Theorem
Bayes Theorem : Exercise Solution

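With 2000 components in stock, the supplier shares are
P(S1) = 1000/2000 = 0.5, P(S2) = 600/2000 = 0.3, P(S3) = 400/2000 = 0.2.

Q1: Pr(the selected component is defective)
= 0.006 · 0.5 + 0.02 · 0.3 + 0.03 · 0.2 = 0.003 + 0.006 + 0.006 = 0.015

Q2: Pr(component from S1 | component is defective)

Using the Bayes theorem equation:

Pr(component from S1 | component is defective)
= (0.006 · 0.5) / 0.015
= 0.2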
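
The supplier exercise can be sketched in a few lines: the priors follow from the purchased quantities, the theorem of total probability gives the overall defect probability, and Bayes' theorem gives the posterior for each supplier. The dictionary names are illustrative.

```python
from fractions import Fraction

# Data from the exercise: purchased quantities and defect probabilities.
quantities = {"S1": 1000, "S2": 600, "S3": 400}
p_defect = {
    "S1": Fraction(6, 1000),
    "S2": Fraction(2, 100),
    "S3": Fraction(3, 100),
}

# Prior P(Sj): share of each supplier in the common container.
total = sum(quantities.values())
prior = {s: Fraction(q, total) for s, q in quantities.items()}

# Q1: theorem of total probability, P(D) = sum_j P(D|Sj) P(Sj).
p_d = sum(p_defect[s] * prior[s] for s in prior)

# Q2: Bayes' theorem, P(Sj|D) = P(D|Sj) P(Sj) / P(D).
posterior = {s: p_defect[s] * prior[s] / p_d for s in prior}

print(float(p_d))              # 0.015
print(float(posterior["S1"]))  # 0.2
```

Although S1 delivers half of all components, only one defective component in five comes from S1, because its defect probability is the lowest.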



Redundancy
Redundancy

Existence of more than one means for performing a required function in
an item.

• For hardware, a distinction is made between active (hot, parallel), warm
(lightly loaded), and standby (cold) redundancy.

• Redundancy does not necessarily imply a duplication of hardware; it can
be implemented, for example, by coding or by software.

• To avoid common mode failures, redundant elements should be realized
independently of each other.



Redundancy
The following properties characterize different aspects of redundancy
rather than distinguishing different types of redundancy:

• The extension by extra components in the structure and functions model.

• The extension by extra functions in the structure and functions model.
These extra functions can be different from the already existing ones
(additional functions) or satisfy the same specification by a different
implementation (diversity).

• The additional information to be stored, transferred and processed.

• The additional time requirements.

Redundancy is either used from the beginning of system operation
(active / hot), activated on fault occurrence (standby / cold), or used in a
combination of the two (lightly loaded).


Redundancy

Suppose a system consists of components connected in series, each of which
survives with a probability of 99% (p = 0.99). Assuming independent failures,
the probability that the entire system does not fail is p^n, which changes with
the number of components n as follows:

10 components lead to a survival probability of 90.44 %
20 components lead to a survival probability of 81.79 %
30 components lead to a survival probability of 73.97 %
40 components lead to a survival probability of 66.90 %
50 components lead to a survival probability of 60.50 %
100 components lead to a survival probability of 36.60 %
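
The series figures can be recomputed directly from p^n. The sketch assumes independent component failures; the function name `series_survival` is chosen here for illustration.

```python
def series_survival(p, n):
    """Survival probability of n independent components in series,
    each surviving with probability p."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("need 0 <= p <= 1 and n >= 0")
    return p ** n


for n in (10, 20, 30, 40, 50, 100):
    print(f"{n:3d} components: {100 * series_survival(0.99, n):.2f} %")
```

Even with 99% per component, a chain of 100 components survives in only about one case out of three, which motivates adding redundancy.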



Static Redundancy: n-out-of-m system
• The system is faultless if at least n out of the m existing components are
faultless.

• If n = 1, complete redundancy occurs (parallel system); if n = m, the m
components are, in effect, in series.

• The reliability may be obtained from the binomial probability distribution.

• If R is the reliability of each component (independent trial), then the
probability of n or more successes among the m components may be represented as:

$$R_S = \sum_{k=n}^{m} \binom{m}{k} R^k (1-R)^{m-k}$$
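
A minimal sketch of the binomial sum for an n-out-of-m system, assuming identical and independent components with reliability R. The function name is illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def n_out_of_m_reliability(n, m, r):
    """Probability that at least n of m independent, identical components
    (each with reliability r) are faultless."""
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    if not 0.0 <= r <= 1.0:
        raise ValueError("need 0 <= r <= 1")
    return sum(comb(m, k) * r**k * (1 - r) ** (m - k) for k in range(n, m + 1))


r = 0.9
print(n_out_of_m_reliability(1, 3, r))  # parallel limit: 1 - (1 - r)^3
print(n_out_of_m_reliability(2, 3, r))  # 2-out-of-3 voting
print(n_out_of_m_reliability(3, 3, r))  # series limit: r^3
```

The two limiting cases reproduce the parallel system (n = 1) and the series system (n = m).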



Static Redundancy: n-out-of-m system
Parallel System: 1-out-of-m system

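[Block diagram: components K1, …, Ki, …, Km connected in parallel between input E and output A]

Reliability of the parallel system, with R_i the reliability of component Ki:

$$R_S = 1 - \prod_{i=1}^{m} (1 - R_i)$$

Compare with the reliability of the series system (no redundancy):

$$R_S = \prod_{i=1}^{m} R_i$$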
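
The comparison between the parallel and the series arrangement can be sketched for components with different reliabilities, assuming independent failures. The component values are hypothetical and the function names are chosen for illustration.

```python
from math import prod


def parallel_system(reliabilities):
    """1-out-of-m system: fails only if all components fail."""
    return 1 - prod(1 - r for r in reliabilities)


def series_system(reliabilities):
    """m-out-of-m system: works only if all components work."""
    return prod(reliabilities)


components = [0.90, 0.95, 0.99]  # hypothetical component reliabilities

r_parallel = parallel_system(components)
r_series = series_system(components)
print(r_parallel, r_series)
```

The parallel system is at least as reliable as its best component, whereas the series system is at most as reliable as its worst one.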



Case Study
Learning from Deficits: Gulf Oil Spill and Breach of Basic Principles

March – April 2010
• Oil rig in preparation to move to another job
• Well to be temporarily plugged and capped with cement
• Rise in pressure from the well suggested that the cement was not holding
• First test showed a large abnormality; second test was misread and declared safe

April 20
• Jump in pressure from oil and gas rising in the well
• Methane spread across the rig without any warning being given
• All equipment remained in operation, including devices that could ignite the methane
• Explosion on the rig, chaotic evacuation, lack of clear directives
• Closing of the blowout preventer failed

Consequences
• 11 dead, 17 injured
• ~780 × 10^3 m^3 of oil spilled into the ocean (2 supertankers)

InTech, August 10



Learning from Deficits: Gulf Oil Spill and Breach of Basic Principles

• A dead battery in the BOP's "brain", which gives pressure readings and
controls other functions in the giant stack of valves.

• A leak in the hydraulic system that sends emergency power to rams, valves
that are supposed to close off the space around the pipe.

• The shear ram, the BOP's valve of last resort, wasn't strong enough to cut
through joints in the pipe. Those joints account for about 10 percent of
the pipe's length.

• Several "unexpected" modifications to the BOP, including a test ram in place
of a real one. Schematics didn't match the actual device.

• The cement seal around the casing pipes in the well failed pressure tests
before the explosion - gas may have been building up in the well.

Washington Post, May 13



References:

• Zio, Enrico (2007). An Introduction to the Basics of Reliability and Risk Analysis.
World Scientific Publishing Co.
• Birolini, Alessandro (2007). Reliability Engineering: Theory and Practice (5th edition).
Springer-Verlag, Berlin.

Malfunctions of a unit (failure modes)

Function         Types of failure
Closing          Fails open; only partly closed
Opening          Fails closed; only partly opened
Remain closed    Opens completely; partly opens
Remain open      Closes completely; partly closes



Techniques to speed up the experiment

• Sequential test

• Accelerated test

• Extrapolation



Sequential Test (I)

If, with high probability w1, the actual failure rate λ does not exceed the limit λ1 (λ < λ1),
the components are to be accepted.
If, on the other hand, λ does not fall below the limit λ2 (λ > λ2) with high probability
w2, the components are to be rejected.
Given: λ1, sample size n, acceptance threshold k, time of experiment t.
Let X be the number of failures (Poisson distributed) within the time interval [0, t].



Sequential Test (II)

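For given probabilities α and β the following holds:

                             For λ = λ1    For λ = λ2
Probability of acceptance    1 - α         β
Probability of rejection     α             1 - β

With i the number of failures observed up to the cumulative testing time n · t:

Accept decision if:

$$i \le \frac{\ln\frac{\beta}{1-\alpha}}{\ln\frac{\lambda_2}{\lambda_1}} + \frac{\lambda_2 - \lambda_1}{\ln\frac{\lambda_2}{\lambda_1}} \cdot n \cdot t$$

Reject decision if:

$$i \ge \frac{\ln\frac{1-\beta}{\alpha}}{\ln\frac{\lambda_2}{\lambda_1}} + \frac{\lambda_2 - \lambda_1}{\ln\frac{\lambda_2}{\lambda_1}} \cdot n \cdot t$$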
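
The two decision boundaries can be evaluated with a short function. This is a sketch under the assumptions of the slide (λ1 < λ2, failures counted against the cumulative testing time n · t); the function name and the numerical values in the usage lines are hypothetical.

```python
from math import log


def sequential_decision(i, n, t, lam1, lam2, alpha, beta):
    """Return 'accept', 'reject' or 'continue' for i observed failures
    after the cumulative testing time n * t."""
    if not 0 < lam1 < lam2:
        raise ValueError("need 0 < lam1 < lam2")
    if not (0 < alpha < 1 and 0 < beta < 1):
        raise ValueError("alpha and beta must lie in (0, 1)")
    log_ratio = log(lam2 / lam1)
    slope = (lam2 - lam1) / log_ratio  # common slope of both boundary lines
    accept_limit = log(beta / (1 - alpha)) / log_ratio + slope * n * t
    reject_limit = log((1 - beta) / alpha) / log_ratio + slope * n * t
    if i <= accept_limit:
        return "accept"
    if i >= reject_limit:
        return "reject"
    return "continue"


# Hypothetical test plan: 10 units, failure rates per hour, alpha = beta = 0.1.
plan = dict(n=10, lam1=1e-4, lam2=5e-4, alpha=0.1, beta=0.1)
print(sequential_decision(i=0, t=100, **plan))   # too early to decide
print(sequential_decision(i=2, t=100, **plan))   # many failures early on
print(sequential_decision(i=1, t=1000, **plan))  # few failures after a long test
```

Both boundaries are straight lines with the same slope in the plane of failures versus cumulative testing time, so the region between them is a band in which testing continues.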



Sequential Test (III): Illustration

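[Figure: number of failures i versus cumulative testing time n · t; the reject decision region lies above the upper boundary line, the accept decision region below the lower boundary line.]

The minimum testing time (nt)min (accept boundary at i = 0) and the minimum number of failures imin (reject boundary at n · t = 0) amount to:

$$(n t)_{\min} = \frac{\ln\frac{1-\alpha}{\beta}}{\lambda_2 - \lambda_1}, \qquad i_{\min} = \frac{\ln\frac{1-\beta}{\alpha}}{\ln\frac{\lambda_2}{\lambda_1}}$$

where α and β are the probabilities of rejection at λ = λ1 and of acceptance at λ = λ2, respectively.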



Sequential Test (IV): Algorithm

1. Test until the first failure.
2. If the failure rate is high → reject. If the failure rate is low → accept.
3. If the failure rate is neither high nor low → test until the next failure
and repeat step 2.



Extrapolation (I)

Failure prediction by extrapolation:
• Shorter testing time
• Test is non-destructive

Conditions:
• Only drift failures
• Failure criteria are known



Extrapolation (II): Illustration

[Figure: parameter y versus time t; the drift observed up to t_exp is extrapolated until it reaches the failure criterion at t_pred.]

t_exp: testing time
t_pred: predicted lifetime (forecast through extrapolation)
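
A minimal sketch of such a forecast, assuming the parameter drifts linearly with time. The slides do not state the drift model, so the straight-line fit, the function name `predict_lifetime` and the measurement values are assumptions made for illustration.

```python
def predict_lifetime(times, values, failure_criterion):
    """Fit y = a + b * t by least squares to the measurements taken up to
    t_exp and return the time t_pred at which the fitted line reaches the
    failure criterion. Assumes linear drift."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (t, y) pairs of equal length")
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    if sxx == 0 or sxy == 0:
        raise ValueError("no measurable drift, extrapolation not possible")
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return (failure_criterion - intercept) / slope


# Hypothetical drift measurements over a testing time t_exp = 300 h.
times = [0.0, 100.0, 200.0, 300.0]
values = [10.0, 10.5, 11.0, 11.5]
t_pred = predict_lifetime(times, values, failure_criterion=15.0)
print(t_pred)
```

The forecast is only as good as the drift model: it presupposes that failures are drift failures and that the failure criterion is known, as stated in the conditions.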



Accelerated Test (I)



Accelerated Test (II): Algorithm

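Acceleration factor:

$$F_{1,2} = \frac{L_1}{L_2} = e^{\frac{E}{K}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)}$$

Measure the mean lifetime L2 of the component at T2 > T1, where T1 is a low temperature.

If the quotient E/K is unknown, repeat the test at T3 > T1 (T3 ≠ T2) and find it from

$$\frac{L_2}{L_3} = e^{\frac{E}{K}\left(\frac{1}{T_2} - \frac{1}{T_3}\right)}$$

Then calculate

$$L_1 = L_2 \cdot e^{\frac{E}{K}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)}$$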
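
The procedure can be sketched in two steps: estimate E/K from two accelerated tests, then scale the measured lifetime back to the low temperature T1. Temperatures are assumed to be absolute (kelvin); the function names and the numbers in the usage lines are hypothetical.

```python
from math import exp, log


def estimate_e_over_k(l2, t2, l3, t3):
    """Solve L2 / L3 = exp((E/K) * (1/T2 - 1/T3)) for E/K.
    Temperatures in kelvin, lifetimes in any common unit."""
    if t2 == t3:
        raise ValueError("the two test temperatures must differ")
    return log(l2 / l3) / (1 / t2 - 1 / t3)


def lifetime_at(t1, l2, t2, e_over_k):
    """L1 = L2 * exp((E/K) * (1/T1 - 1/T2))."""
    return l2 * exp(e_over_k * (1 / t1 - 1 / t2))


# Hypothetical measurements: mean lifetimes in hours at two elevated temperatures.
l2, t2 = 1000.0, 350.0
l3, t3 = 170.0, 400.0

e_over_k = estimate_e_over_k(l2, t2, l3, t3)
l1 = lifetime_at(300.0, l2, t2, e_over_k)
print(e_over_k, l1, l1 / l2)  # E/K in kelvin, L1 in hours, acceleration factor F1,2
```

Because T2 > T1, the exponent is positive and the predicted lifetime L1 at the low temperature is longer than the measured L2.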



Accelerated Test (III): Illustration
