FMEA Overview


SOFTWARE FAILURE MODES EFFECTS ANALYSIS
OVERVIEW

Copyright, Ann Marie Neufelder, SoftRel, LLC, 2010
[email protected]
www.softrel.com

This presentation may not be copied in part or in whole without written permission from Ann Marie Neufelder, SoftRel, LLC.
Softrel, LLC Software Failure Modes Effects Analysis

Table of Contents

• Definitions
• The cost benefit of doing a SFMEA
• SoftRel, LLC SFMEA capabilities
• Technical aspects of the SFMEA
• References

Copyright SoftRel, LLC 2010 This material may not be reprinted in part or in whole without written permission from Ann Marie Neufelder.

Software Failure Modes Effects Analyses Defined


• Analysis is adapted from Mil-STD 1629A, 1984 and Mil-HDBK-338B, 1988
• Can be applied to firmware or high level software
• Software development and testing often focus on the success scenarios while SFMEA focuses on what can go wrong
• More effective than traditional design and code reviews because
  - Reviews often focus on style instead of failure modes
  - Reviews often identify issues but not the system wide effects of the issues
  - Reviews are often not targeted to high risk areas


Software failure modes…

• Software failure modes are generally either
  - Data related
  - Event related
• Many of these are repeatable
• Many of these cannot be corrected once the failure event is encountered
  - So hardware redundancy is often not a corrective action
  - Failure modes that might be corrected or avoided with hardware redundancy are indicated with an “&” in class
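
The point about repeatable failure modes and hardware redundancy can be illustrated with a minimal sketch. Everything here is hypothetical: `scale_reading` stands in for any routine with a data related defect, and `redundant_read` models redundant hardware channels that all run the same software.

```python
def scale_reading(raw):
    """Hypothetical conversion routine with a data related defect.

    It is assumed to have been designed only for raw values below 1000,
    so larger values silently wrap around.
    """
    return (raw * 10) % 10000  # defect: wraps for raw >= 1000


def redundant_read(raw, channels=2):
    """Run the same software on every redundant channel and compare results.

    Returns the first channel's value and whether all channels agree.
    """
    results = [scale_reading(raw) for _ in range(channels)]
    return results[0], len(set(results)) == 1


value, agree = redundant_read(1500)
print(value, agree)  # 5000 True: every channel agrees on the same wrong value
```

Because the defect is repeatable, each channel reproduces it, so a comparison between channels reports agreement and the fault goes undetected. Redundancy would only help here if the channels ran independently developed software.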

Software FMEAs can be conducted from 6 different viewpoints

FMEA viewpoint | Product level viewpoint | Identifies failures related to | Life cycle timing
Functional | Requirements | Timing, sequence, faulty data, erroneous error messages for a component | SRS completion
Interface | Interface between 2 components | Timing, sequence, faulty data, erroneous error messages between 2 components | Interface Design Spec completion
Detailed | At class or module level | All of the above plus memory management, algorithms, I/O, DB issues | Detailed design or code is complete
Production | Process related failures during development | Problems with many defects and/or ability to meet a schedule, execution and tools | Any time
Maintenance | Changes to the software | Problems when software is modified, installed, updated | During maintenance
Usage | User friendliness and consistency, documentation | Software/documentation is too difficult or inconsistent to be used properly | As early as possible as these issues will influence design

The cost of doing a SFMEA


• What are the technical benefits?
• Who will do the SFMEA?
• How much time will it take?
• Are the benefits worth the cost?
• Common SFMEA mistakes that can cost money

When properly implemented at the right point in the lifecycle Software FMEAs can…

• Make requirements, design and code reviews more effective
• Identify single point failures due to software
• Identify defects that cannot be addressed by redundancy or other hardware controls
• Identify abnormal behavior that might be missing from the requirements or design specifications
• Identify unwritten assumptions
• Identify features that need fault handling design
• Potentially eliminate several failures by addressing a single failure mode


What personnel are required for a SFMEA?

Personnel | Strengths
Facilitator | Understands the SFMEA process
Software management | Responsible for the software project
Software engineers | Key engineers with subject matter expertise for the product being analyzed. Depends on viewpoint: • Functional SFMEA: someone who is familiar with the SRS. • Interface SFMEA: the person(s) who designed the interfaces. • Detailed SFMEA: the person responsible for design and coding.
Domain experts | People who are knowledgeable about how the system will be used and what kinds of events are most critical to an end user or system
What is the typical effort required for each part of the SFMEA?

Task | Functional, interface or detailed SFMEA | Personnel involved with this task
Planning | Can usually be done in a half day | All
Collect actual software failure data to identify likely failure modes | Usually 1 day | Facilitator
Construct left side of SFMEA table | Depends on viewpoint: • Functional: 30-60 mins for each SRS statement • Interface: 30-90 mins for each interface variable • Detailed: 30-90 mins for each module | Facilitator does initial work. Software engineers review for completeness.
Effects on system, likelihood, severity | Can take up to 15 minutes per failure mode | All. Facilitator keeps discussion moving
Mitigate risks/make corrective action | Entirely dependent on the corrective action | Software management
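
The per-task figures above can be combined into a rough effort estimate. The sketch below is only an illustration under stated assumptions: an 8-hour working day (the table does not define one), a hypothetical function name `estimate_hours`, and corrective action time excluded because the table says it depends entirely on the action.

```python
# Minutes per unit for constructing the left side, taken from the table above.
LEFT_SIDE_MINUTES = {
    "functional": (30, 60),  # per SRS statement
    "interface": (30, 90),   # per interface variable
    "detailed": (30, 90),    # per module
}

HOURS_PER_DAY = 8  # assumption: the table does not define the length of a day


def estimate_hours(viewpoint, units, failure_modes):
    """Return a (low, high) estimate in hours, excluding corrective actions."""
    low_min, high_min = LEFT_SIDE_MINUTES[viewpoint]
    planning = 0.5 * HOURS_PER_DAY       # "half day"
    data_collection = 1 * HOURS_PER_DAY  # "usually 1 day"
    fixed = planning + data_collection
    low = fixed + units * low_min / 60
    # "Up to 15 minutes per failure mode" is an upper bound, so it is only
    # added to the high estimate.
    high = fixed + units * high_min / 60 + failure_modes * 15 / 60
    return low, high


print(estimate_hours("functional", units=40, failure_modes=100))  # (32.0, 77.0)
```

For a functional SFMEA over 40 SRS statements with 100 failure modes discussed, this gives between 32 and 77 hours of analysis time. The low figure omits the right-side discussion entirely, so it should be read as a floor rather than a plan.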
These are some of the benefits that my customers have experienced from the SFMEA

The SFMEA is particularly cost effective at finding a small number of defects that have catastrophic consequences and/or will result in many failures by many end users
• Project X – Safety/monetarily critical equipment. A small number of very serious defects were uncovered that would have been difficult if not impossible to find in testing. The cost of these defects being discovered even once in the field would have been several million. The cost of the analysis was 28K.
• Project Y – A web based system sometimes (under certain conditions) allowed non-paying customers to retrieve a product without paying first. The testing had been directed to the positive case (paying customers get their product) and not the negative case, because the SRS never stated what the system should not do. This defect would have resulted in significant loss of revenue if deployed.
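
The Project Y finding comes down to checking the negative case as well as the positive one. The sketch below is not the customer's system: `Store` and its methods are hypothetical names used only to show what a "shall not" check might look like next to the usual success check.

```python
class Store:
    """Hypothetical web store used only for illustration."""

    def __init__(self):
        self._paid_orders = set()

    def pay(self, order_id):
        self._paid_orders.add(order_id)

    def retrieve_product(self, order_id):
        # The "shall not" behavior: refuse retrieval for unpaid orders.
        if order_id not in self._paid_orders:
            raise PermissionError(f"order {order_id} has not been paid")
        return f"product for order {order_id}"


def check_positive_case():
    """Paying customers get their product."""
    store = Store()
    store.pay("A1")
    assert store.retrieve_product("A1") == "product for order A1"


def check_negative_case():
    """Non-paying customers must not get a product."""
    store = Store()
    try:
        store.retrieve_product("B2")
    except PermissionError:
        return True
    return False


check_positive_case()
print(check_negative_case())  # True
```

A test suite that contains only `check_positive_case` would pass even if `retrieve_product` had no payment check at all, which is the gap described above.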
Common SFMEA mistakes that cost money and reduce benefit

• Starting at the wrong place
  - Usually you do not start the analysis at individual lines of code
• Doing the analysis too late in the life cycle
• Assuming that certain failure modes won’t happen before analyzing them
• Neglecting to tailor the list of failure modes to your application type
• Neglecting to filter/rank the code by risk and impact
• Assuming that hardware redundancy will prevent all software failure modes
• Neglecting to decide on the best viewpoint before doing the analysis

What is the typical effort required for the entire SFMEA?

• A typical project has the following SFMEA expenditures

Personnel | Effort
Facilitator | 150-200 hours
Software management | 20-30 hours not including time required to correct issues
Software engineers | 36-60 hours not including time required to correct issues
Domain experts | 20-40 hours

• Things that make SFMEA analysis go faster and better
  - More detailed product documentation such as SRS, IDS, design docs, etc.
  - Software engineers who are willing to think about how the software can fail instead of trying to prove that it can’t
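
As a quick check on the scale of the whole effort, the per-role ranges in the table above can be summed. This sketch assumes each row is the total for that role rather than a per-person figure, which the slide does not state.

```python
# Hour ranges per role, copied from the table above.
EFFORT_HOURS = {
    "Facilitator": (150, 200),
    "Software management": (20, 30),
    "Software engineers": (36, 60),
    "Domain experts": (20, 40),
}

total_low = sum(low for low, _ in EFFORT_HOURS.values())
total_high = sum(high for _, high in EFFORT_HOURS.values())

print(total_low, total_high)  # 226 330
```

Under that assumption a typical SFMEA takes roughly 226 to 330 hours before any corrective work, with the facilitator accounting for most of it.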

About Ann Marie Neufelder, SoftRel, LLC


• Has been in software engineering since 1983
• Authored the NASA webinar on software FMEA
• Has been doing SFMEA for 25+ years
• Has completed software/firmware FMEAs in these industries and applications
  - Commercial and defense vehicles
  - Drilling equipment
  - Electronic warfare
  - Ground based satellite systems
  - Lighting systems
  - Commercial appliances/electronics
  - Space systems


What you will get from the 1 day SFMEA class


• Hands on step by step process for doing the SFMEA within schedule and cost constraints
• Templates to facilitate
  - Completion of each step of the SFMEA process
  - Brainstorming process (the most difficult step)
• 300 failure mode/root cause pairs to pick from
• Examples of completed SFMEAs from the real world
  - Seeing a real example in various stages of construction is the most valuable step towards constructing a useful software FMEA
• Examples of how NOT to do a SFMEA

Software FMEA services provided by Ann Marie Neufelder


• The hardest part of the SFMEA is getting it started
• The second hardest part is knowing how to keep it under budget
• Ann Marie Neufelder can help with that
  - Facilitating an effective SFMEA to ensure minimum $ spent
  - Performing a RCA to identify most likely failure modes/root causes
  - Laying out the left side of the SFMEA
  - Working with the software engineers and domain experts to complete the right side of the SFMEA
  - Keeping the analysis on schedule by leading the discussions down the most productive path

What the other consultants don’t have


• Ann Marie Neufelder has identified more than 300 failure mode/root cause pairs
• She has applied SFMEAs on real world software for all 6 viewpoints
• Since she has 25+ years of software engineering experience, she knows how to integrate this analysis on real world versus academic software projects
• Since she has 25+ years of experience, she knows the common mistakes that will kill the effectiveness of a SFMEA


Technical aspects of the SFMEA


• What does a SFMEA look like?
• What are the steps?
• What are some of the failure modes and root causes?


What does a SFMEA look like?


• Similar to the table used for hardware FMEA
• Software engineers have the most trouble getting the left side of the SFMEA started

Left side columns: Function | Description | Failure mode | Root cause
Right side columns: Effect on subsystem | Effect on system | Severity | Likelihood | Detection monitors | Compensating provisions | Corrective action | RPN

Left side is completed first by reviewing the product and failure modes/root causes
Right side is completed next by brainstorming with subject matter expertise
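
One way to picture a single row of the table is as a record with the left side and right side fields described above. The sketch below is an assumption-laden illustration: the class name `SfmeaRow`, the numeric rating scales, the separate `detection` rating and the RPN formula (severity × likelihood × detection, the convention commonly used in hardware FMEA) are not defined in this presentation.

```python
from dataclasses import dataclass


@dataclass
class SfmeaRow:
    # Left side: completed first from the product documentation.
    function: str
    description: str
    failure_mode: str
    root_cause: str
    # Right side: completed next by brainstorming.
    effect_on_subsystem: str = ""
    effect_on_system: str = ""
    severity: int = 0     # assumed numeric rating, e.g. 1 (minor) to 10
    likelihood: int = 0   # assumed numeric rating, e.g. 1 (rare) to 10
    detection: int = 0    # assumed rating of how hard the failure is to detect
    detection_monitors: str = ""
    compensating_provisions: str = ""
    corrective_action: str = ""

    @property
    def rpn(self):
        """Risk priority number under the assumed conventional formula."""
        return self.severity * self.likelihood * self.detection


# Hypothetical example row; the content is invented for illustration.
row = SfmeaRow(
    function="Order module",
    description="Order total, decimal, min 0",
    failure_mode="Faulty data",
    root_cause="Module does not work for lower bounds on input variables",
    severity=9,
    likelihood=3,
    detection=4,
)
print(row.rpn)  # 108
```

Keeping the left side fields mandatory and the right side fields optional mirrors the order in which the table is filled in.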


The process for doing a Software Failure Modes Effects Analysis

[Process flow diagram. Boxes: Plan resources for software FMEA; For each FMEA viewpoint; Analyze applicable product or process; Brainstorm failure modes; For each failure mode; Identify root cause; Identify equivalent failure modes; Identify failure consequences; Identify failure effects; Identify severity; Identify detection monitors; Identify compensating provisions; Assess likelihood; Identify corrective actions; Mitigate]

The class covers many root cause/failure mode pairs

*Applicable to most if not all application types

Failure mode | Description | Number of associated root causes (under Functional / Interface / Detailed)
*Functionality | Software does not behave as stated in the requirement | 6, 3
*Timing | Events happen too late or too early | 2, 4
*Sequence | Events happen in the wrong order | 5, 1, 5
*Faulty Data | Data is corrupt, invalid, incomplete or incorrect | 5, 11, 11
Faulty Error Handling; *Erroneous or missing error messages; *False alarms | • Wrong message, wrong response when an error is detected • Software fails to detect an error when it should • Software detects an error when there is none | 5, 9, 11
Web based | Failure modes specific to HTML, ASP, .Net, etc. | 24

The class covers many root cause/failure mode pairs

*Applicable to most if not all application types

Failure mode | Description | Number of associated root causes (under Functional / Interface / Detailed)
Database related | Storing, retrieving data from a database file | 29
Network communications | Stale data, no communications | 6
Faulty or incompatible I/O | Incomplete or incorrect I/O | 15, 6
Faulty logic and ranges | Incomplete or overlapping logic | 23
*Incorrect algorithms | Formula implemented incorrectly for some or all inputs | 8
*Memory management | Out of memory errors | 7


The class covers many root cause/failure mode pairs

Failure mode | Description | Number of associated root causes (under Production / Maintenance / Usage)
Execution | Poorly executed project | 36
Tools | Inadequate tools/training/people | 15
Schedule | Inadequate scheduling | 23
Faulty C/A | Change to a correction causes a new defect | See detailed viewpoint
Unsupportable | Software can’t be easily maintained | 10
Unserviceable | Software can’t be easily serviced after install | 8
Installation | SW doesn’t install/update | 23
Human | Human error, misuse or abuse | 12
Security | Security violations, overly secure | 9
User instructions | Inadequate or conflicting instructions for operating the software | 13

General Steps for laying out each SFMEA viewpoint


1. Create one worksheet for each unit that applies for this viewpoint
   - CSCI (functional), module (detailed) or interface pair (interface)
2. Review the product documentation or code associated with the first step
   - SRS (functional), code (detailed), IDS (interface)
3. Create one row for each requirement or data element
4. Review all failure modes related to that viewpoint
5. List all of the above failure modes and root causes that are applicable for each row
6. Each row can/will have more than 1 failure mode and/or root cause
7. Once the 4 columns on the left hand side of the table are complete, proceed to the columns on the right side
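
The layout steps above can be sketched as a small routine that expands one worksheet into left side rows. This is a minimal illustration, not the class's template: `build_left_side`, the example requirements, the root cause wording and the applicability rule are all hypothetical.

```python
def build_left_side(unit, elements, catalog, is_applicable):
    """Expand one worksheet (unit) into left side rows.

    elements: requirements or data elements, one row group each (step 3)
    catalog: failure mode -> list of root causes for this viewpoint (step 4)
    is_applicable: callable(element, failure_mode) -> bool (step 5)
    """
    rows = []
    for element in elements:
        for mode, causes in catalog.items():
            if not is_applicable(element, mode):
                continue
            # Step 6: an element may get several failure modes and root causes.
            for cause in causes:
                rows.append((unit, element, mode, cause))
    return rows


# Hypothetical functional viewpoint example.
catalog = {
    "Timing": ["Event happens too late"],
    "Faulty data": ["Data is incomplete", "Data is invalid"],
}
requirements = [
    "REQ-1: report position every 100 ms",
    "REQ-2: store the last 10 positions",
]


def is_applicable(requirement, mode):
    # Illustrative rule: timing only applies to requirements that state a period.
    return mode != "Timing" or " ms" in requirement


rows = build_left_side("Navigation CSCI", requirements, catalog, is_applicable)
for row in rows:
    print(row)
```

Here REQ-1 yields three rows and REQ-2 yields two, which shows how quickly the worksheet grows and why tailoring the failure mode list to the application matters before the right side is started.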

Example template for the detailed SFMEA


Function | Description | Failure mode | Root causes
Module name | Name of variable, type, size, min, max and default value | Faulty data | List all root causes that apply to this data element
Module name | List each algorithm | Faulty algorithm | List all root causes that apply to this algorithm
Module name | Required logic | Faulty logic | List all root causes that apply to this logic
Module name | Required ranges | Faulty ranges | List all root causes that apply to the ranges defined by this logic

Example: Some Root Causes of Faulty Range Data Failure Mode for Detailed FMEA viewpoint
1. Module does not work for upper bounds on input variables
2. Module does not work for lower bounds on input variables
3. Module does not work for intersections of input ranges
4. Module defines a > b when there should be a >= b
5. Module defines a < b when there should be a <= b
6. Module defines a >= b when there should be a > b
7. Module defines a <= b when there should be a < b
8. Overflow ignored
9. Improper comparison of variables with 2 different formats
10. Equality Comparison between floating point value and zero
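
Two of the root causes listed above are easy to reproduce in a few lines. The sketch below assumes a requirement with inclusive bounds; the function names are hypothetical.

```python
import math


def in_range_faulty(x, low, high):
    # Root causes 4 and 5: strict comparisons where the (assumed)
    # requirement calls for inclusive bounds.
    return low < x < high


def in_range(x, low, high):
    # Inclusive bounds, as the assumed requirement states.
    return low <= x <= high


def is_zero(value, tolerance=1e-9):
    # Root cause 10: compare against zero with a tolerance instead of ==.
    return math.isclose(value, 0.0, abs_tol=tolerance)


residual = 0.1 + 0.2 - 0.3  # mathematically zero, but not in binary floating point

print(in_range_faulty(10, 0, 10), in_range(10, 0, 10))  # False True
print(residual == 0.0, is_zero(residual))               # False True
```

Both defects only show up at specific values (the exact bound, or a result that should be exactly zero), which is why they tend to survive testing that concentrates on typical inputs.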


References
[1] “SAE ARP 5580 Recommended Failure Modes and Effects Analysis (FMEA) Practices for Non-Automobile Applications”, July, 2001, Society of Automotive Engineers.
[2] “Software Systems Testing and Quality Assurance”, Boris Beizer, 1984, Van Nostrand Reinhold, New York, NY.
[3] “A Taxonomy of E-commerce Risk and Failures”, Giridharan Vilangadu Vijayaraghaven, A Thesis Submitted to the Department of Computer Science at Florida Institute of Technology, Melbourne, Florida, May 2003.
