Ebook LSS Green Belt PDF - Mai 2018 PDF

Certified
Lean
Six Sigma
Green Belt
eBook
LEAN SIX SIGMA BELT SERIES
Fourth Edition - Minitab

Legal Notice
INDIVIDUAL COPY

This Book is an Open Source Six Sigma™ copyrighted
publication and is for individual use only. This publication
may not be republished, electronically or physically
reproduced, distributed, changed, posted to a website, an
intranet or a file sharing system or otherwise distributed in
any form or manner without advance written permission
from Open Source Six Sigma, LLC. Minitab is a Registered
Trademark of Minitab Inc.

FBI Anti Piracy Warning: The unauthorized reproduction or
distribution of this copyrighted work is illegal. Criminal
copyright infringement, including infringement without
monetary gain, is investigated by the FBI and is punishable by
up to 5 years in federal prison and a fine of $250,000.

For reprint permission, to request additional copies or to
request customized versions of this publication contact Open
Source Six Sigma, LLC.

Open Source Six Sigma, LLC
6200 East Thomas Road Suite 203
Scottsdale, Arizona, United States of America 85251
Email: [email protected]
Website: www.OpenSourceSixSigma.com
Table of Contents
Define Phase
  Understanding Six Sigma
  Six Sigma Fundamentals
  Selecting Projects
  Elements of Waste
  Wrap Up and Action Items
Measure Phase
  Welcome to Measure
  Process Discovery
  Six Sigma Statistics
  Measurement System Analysis
  Process Capability
  Wrap Up and Action Items
Analyze Phase
  Welcome to Analyze
  “X” Sifting
  Inferential Statistics
  Introduction to Hypothesis Testing
  Hypothesis Testing Normal Data Part 1
  Hypothesis Testing Normal Data Part 2
  Hypothesis Testing Non-Normal Data Part 1
  Hypothesis Testing Non-Normal Data Part 2
  Wrap Up and Action Items
Improve Phase
  Welcome to Improve
  Process Modeling Regression
  Advanced Process Modeling
  Designing Experiments
  Wrap Up and Action Items
Control Phase
  Welcome to Control
  Lean Controls
  Defect Controls
  Statistical Process Control
  Six Sigma Control Plans
  Wrap Up and Action Items
Glossary
LSS Green Belt eBook v12 MT © Open Source Six Sigma, LLC
Lean Six Sigma

Green Belt Training
Define Phase
Understanding Six Sigma
Welcome to the Lean Six Sigma Green Belt Training Course.
This course has been designed to build your knowledge and capability to improve the
performance of processes and subsequently the performance of the business of which you are a
part. The focus of the course is process centric. Your role in process performance improvement
is carried out through the methodologies of Six Sigma, Lean and Process Management.
By taking this course you will gain a well-rounded and firm grasp of many of the tools of these
methodologies. We firmly believe this is one of the most effective classes you will ever take and
we are committed to ensuring that this is the case.
We begin in the Define Phase with “Understanding Six Sigma”.

Overview
The fundamentals of this phase are Definitions, History, Strategy, Problem Solving and Roles and
Responsibilities. We will examine the meaning of each of these and show you how to apply them.
Understanding Six Sigma
–  Definitions
–  History
–  Strategy
–  Problem Solving
–  Roles & Responsibilities
Six Sigma Fundamentals
Selecting Projects
Elements of Waste
Wrap Up & Action Items
What is Six Sigma…as a Symbol?
σ, sigma, is a letter of the Greek alphabet.
–  Mathematicians use this symbol to signify Standard Deviation, an important measure of variation.
–  Variation designates the distribution or spread about the average of any process.
[Figure: two distribution curves, one labeled Narrow Variation and one labeled Wide Variation]
The variation in a process refers to how tightly the various outcomes are clustered around the
average. No process will produce the EXACT same output each time.
Variation is our enemy. Our customers, both internal and external, have expectations relative to the
deliverables from our processes. Variation from those expectations is likely to dissatisfy them. Much
of this course is devoted to identifying, analyzing and eliminating variation. So let’s begin to
understand it.
The Blue Line designates narrow variation while the Orange Line designates wide variation.
The less variation within a process the more predictable the process is, assuming the Mean is not
moving all over the place. If you took the height of everyone in the class would you expect a large
variation or narrow variation?
What if you had a few professional basketball players in the room, would that widen or narrow the
variation?
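One way to explore the basketball question is to compute the Standard Deviation with and without the players. The sketch below uses invented heights in centimetres; the values and the names `class_heights` and `with_players` are assumptions for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical heights in centimetres for a small class (made-up values).
class_heights = [158, 162, 165, 168, 170, 172, 175, 178, 180, 183]

# The same class with three hypothetical professional basketball players added.
with_players = class_heights + [206, 211, 216]


def summarize(heights):
    """Return the average and the population Standard Deviation."""
    return mean(heights), pstdev(heights)


class_mean, class_sigma = summarize(class_heights)
mixed_mean, mixed_sigma = summarize(with_players)

print(f"class only:   mean={class_mean:.1f} cm, sigma={class_sigma:.1f} cm")
print(f"with players: mean={mixed_mean:.1f} cm, sigma={mixed_sigma:.1f} cm")
```

With these made-up numbers the players pull the Mean upward and widen the spread, which is the effect the question is pointing at.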
What is Six Sigma…as a Value?

Sigma is a measure of deviation. The mathematical calculation for the Standard Deviation of a
population is:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
where $\mu$ is the population Mean and $N$ is the number of data points.
§  Sigma can be used interchangeably with the statistical term Standard Deviation.
§  Standard Deviation is, loosely, the average distance of data points away from the Mean in a
distribution.
By definition, the Standard Deviation is the distance between the mean and the point of inflection on
the normal curve.
[Figure: normal curve with the Point of Inflection marked]
When measuring the sigma value of a process we want to obtain the distance from the Mean to the
closest specification limit in order to determine how many Standard Deviations we are from the
Mean: our Sigma Level! The Mean is our optimal or desired level of performance.
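The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming a small set of invented measurements and invented specification limits (`lsl`, `usl`); the function names are hypothetical and are not part of the course material or of Minitab.

```python
from math import sqrt


def population_std_dev(data):
    """Population Standard Deviation: root of the mean squared distance from the Mean."""
    mu = sum(data) / len(data)
    return sqrt(sum((x - mu) ** 2 for x in data) / len(data))


def sigma_level(data, lsl, usl):
    """Number of Standard Deviations between the Mean and the closest specification limit."""
    mu = sum(data) / len(data)
    sigma = population_std_dev(data)
    return min(usl - mu, mu - lsl) / sigma


# Hypothetical measurements and specification limits (made-up values).
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1]
lsl, usl = 9.4, 10.6

sigma = population_std_dev(measurements)
level = sigma_level(measurements, lsl, usl)
print(f"sigma = {sigma:.4f}, Sigma Level = {level:.2f}")
```

The Sigma Level here is just the distance to the nearer limit expressed in units of sigma, so halving the Standard Deviation would double it.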
What is Six Sigma…as a Measure?
The probability of creating a defect can be estimated and translated into a “Sigma” level.
[Figure: Normal Distribution curve with the horizontal axis marked in Standard Deviations from -6 to +6]
The higher the sigma level the better the performance. Six Sigma refers to a process having six
Standard Deviations between the process center (the average) and the closest specification limit or
service level.
This pictorial depicts the percentage of data which falls between Standard Deviations within a Normal
Distribution. Those data points at the outer edge of the bell curve represent the greatest variation in
our process. They are the ones causing customer dissatisfaction and we want to eliminate them.
Measure
Sigma Level is:
–  A statistic used to describe the performance of a process relative to the specification limits
–  The number of Standard Deviations from the Mean to the closest specification limit of the process
[Figure: Normal Distribution curves for 1 Sigma through 6 Sigma processes, each shown against the USL]
The likelihood of a defect decreases as the number of Standard Deviations that can be fit between the
Mean and the nearest spec limit increases.
Each gray dot represents one Standard Deviation. As you can see the Normal Distribution is
tight.
Said differently, if six Standard Deviations fit between the Mean and the nearest specification limit
we will have satisfied our customers nearly all the time. In fact, out of one million customer
experiences only 3.4 will have experienced a defect.
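The 3.4 figure can be reproduced from the Normal Distribution. The sketch below assumes the conventional 1.5 Standard Deviation allowance for a shift in the process Mean (the course returns to this allowance later) and counts defects beyond the nearest specification limit only; the function name `dpmo_from_sigma_level` is hypothetical.

```python
from statistics import NormalDist

# Conventional allowance for the process Mean to shift, in Standard Deviations.
SHIFT = 1.5


def dpmo_from_sigma_level(sigma_level, shift=SHIFT):
    """Defects per million opportunities beyond the nearest spec limit."""
    tail = 1.0 - NormalDist().cdf(sigma_level - shift)
    return tail * 1_000_000


for level in (3, 4, 5, 6):
    print(f"{level} Sigma -> {dpmo_from_sigma_level(level):,.1f} DPMO")
```

Under these assumptions a 6 Sigma process works out to roughly 3.4 defects per million, and each lower Sigma Level raises the defect rate sharply.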
What is Six Sigma…as a Metric?

Each of these metrics serves a different purpose and may be used at different levels in the
organization to express the performance of a process in meeting the organization’s (or
customer’s) requirements.
§  Defects
§  Defects per unit (DPU)
§  Parts per million (PPM)
§  Defects per million opportunities (DPMO)
§  Rolled Throughput Yield (RTY)
§  First Time Yield (FTY)
§  Sigma(s)
Above are some key metrics used in Six Sigma. We will discuss each in detail as we go through
the course.
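As a preview, several of these metrics reduce to simple ratios. The sketch below uses commonly cited definitions, which may differ in detail from the ones the course gives later; all counts and function names are invented for illustration.

```python
from math import prod


def dpu(defects, units):
    """Defects per unit."""
    return defects / units


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def first_time_yield(units_passed_first_time, units_entered):
    """Fraction of units that pass a step without rework."""
    return units_passed_first_time / units_entered


def rolled_throughput_yield(step_yields):
    """Probability a unit passes every step defect free: the product of step yields."""
    return prod(step_yields)


# Hypothetical process: 500 units, 4 defect opportunities each, 20 defects found.
print(f"DPU  = {dpu(20, 500):.3f}")
print(f"DPMO = {dpmo(20, 500, 4):,.0f}")
print(f"FTY  = {first_time_yield(485, 500):.3f}")
print(f"RTY  = {rolled_throughput_yield([0.98, 0.95, 0.99]):.4f}")
```

Note that RTY is always at or below the lowest single step yield, which is why multi-step processes can look healthy step by step and still perform poorly end to end.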
What is Six Sigma…as a Benchmark?
This data represents the sigma level of companies. As you can see less than 10% of companies
are at a 6 sigma level!
Yield      PPMO      COPQ     Sigma   Class
99.9997%   3.4       <10%     6       World Class Benchmarks
99.976%    233       10-15%   5       (10% GAP)
99.4%      6,210     15-20%   4       Industry Average
93%        66,807    20-30%   3       (10% GAP)
65%        308,537   30-40%   2       Non Competitive
50%        500,000   >40%     1
Source: Journal for Quality and Participation, Strategy and Planning Analysis
What does 20 - 40% of Sales represent to your Organization?
What is Six Sigma…as a Method?
The Six Sigma Methodology is made up of five stages: Define, Measure, Analyze, Improve and
Control.
Each has highly defined steps to assure a level of discipline in seeking a solution to any variation or
defect present in a process.
DMAIC provides the method for applying the Six Sigma philosophy in order to improve processes.
§  Define - the business opportunity
§  Measure - the process current state
§  Analyze - determine Root Cause or Y = f(x)
§  Improve - eliminate waste and variation
§  Control - sustain the results
What is Six Sigma…as a Tool?
Six Sigma contains a broad set of tools interwoven in a business problem-solving methodology. Six
Sigma tools are used to scope and choose projects, design new products and processes, improve
current processes, decrease downtime and improve customer response time.
Six Sigma has not created new tools; it has simply organized a variety of existing tools to create flow.
[Figure: Customer Value (Responsiveness, Cost, Quality, Delivery) = EBIT, Management (Enabler),
Product Design, Process Yield, Process Speed, System Uptime, Functional Support]
Six Sigma has not created new tools. It is the use and flow of the tools that is important. How they
are applied makes all the difference.
Six Sigma is also a business strategy that provides new knowledge and capability to employees so
they can better organize the process activity of the business, solve business problems and make
better decisions. Using Six Sigma is now a common way to solve business problems and remove
waste resulting in significant profitability improvements. In addition to improving profitability,
customer and employee satisfaction are also improved.
Six Sigma is a process measurement and management system that enables employees and
companies to take a process oriented view of the entire business. Using the various concepts
embedded in Six Sigma, key processes are identified, the outputs of these processes are
prioritized, the Capability is determined, improvements are made, if necessary, and a management
structure is put in place to assure the ongoing success of the business.
People interested in truly learning Six Sigma should be mentored and supported by seasoned Belts
who truly understand how Six Sigma works.
What is Six Sigma…as a Goal?

To illustrate, the concept of the sigma level can be related to hanging fruit. The higher the fruit the
more challenging it is to obtain, and the more sophisticated the tools necessary to obtain it.
Fruit               Sigma Level   Tools
Sweet Fruit         5+ Sigma      Design for Six Sigma
Bulk of Fruit       3 - 5 Sigma   Process Characterization and Optimization
Low Hanging Fruit   3 Sigma       Basic Tools of Problem Solving
Ground Fruit        1 - 2 Sigma   Simplify and Standardize
What is Six Sigma…as a Philosophy?
General Electric: First, what it is not. It is not a secret society, a slogan or a cliché. Six Sigma is
a highly disciplined process that helps us focus on developing and delivering near-perfect
products and services. The central idea behind Six Sigma is that if you can measure how many
"defects" you have in a process you can systematically figure out how to eliminate them and get as
close to "zero defects" as possible. Six Sigma has changed the DNA of GE — it is now the way we
work — in everything we do and in every product we design.
Honeywell: Six Sigma refers to our overall strategy to improve growth and productivity as well as
a measurement of quality. As a strategy, Six Sigma is a way for us to achieve performance
breakthroughs. It applies to every function in our company, not just those on the factory floor. That
means Marketing, Finance, Product Development, Business Services, Engineering and all the other
functions in our businesses are included.
Lockheed Martin: We have just begun to scratch the surface with the cost-saving initiative called
Six Sigma and already we have generated $64 million in savings with just the first 40 projects. Six
Sigma uses data gathering and statistical analysis to pinpoint sources of error in the organization or
products and determines precise ways to reduce the error.
History of Six Sigma
•  1984 Bob Galvin of Motorola decreed the first objectives of Six Sigma
–  10x levels of improvement in service and quality by 1989
–  100x improvement by 1991
–  Six Sigma capability by 1992
–  Bill Smith, an engineer from Motorola, is the person credited as the father of Six Sigma
•  1984 Texas Instruments and ABB work closely with Motorola to further develop Six Sigma
•  1994 Application experts leave Motorola
•  1995 AlliedSignal begins Six Sigma initiative as directed by Larry Bossidy
–  Captured the interest of Wall Street
•  1995 General Electric, led by Jack Welch, began the most widespread undertaking of Six Sigma
ever attempted
•  1997 to present: Six Sigma spans industries worldwide
Simplistically, Six Sigma was a program that was generated around targeting a process Mean
(average) six Standard Deviations away from the closest specification limit.
By using the process Standard Deviation to determine the location of the Mean the results could be
predicted at 3.4 defects per million by the use of statistics.
There is an allowance for the process Mean to shift 1.5 Standard Deviations. This number is an
academic and somewhat controversial point that is not worth debating here. We will get into a
discussion of this number later in the course.
The Phase Approach of Six Sigma

Six Sigma created a realistic and quantifiable goal in terms of its target of 3.4 defects per million
opportunities. It was also accompanied by a methodology to attain that goal. That methodology
was a problem solving strategy made up of four steps: measure, analyze, improve and control.
When GE launched Six Sigma they improved the methodology to include the Define Phase.
[Figure: Define (General Electric), followed by Measure, Analyze, Improve and Control (Motorola)]
Today the Define Phase is an important aspect to the methodology. Motorola was a mature culture
from a process perspective and did not necessarily have a need for the Define Phase.
Most organizations today DEFINITELY need it to properly approach improvement projects.
As you will learn, properly defining a problem or an opportunity is key to putting you on the right
track to solve it or take advantage of it.
DMAIC Phases Roadmap

Champion/Process Owner
–  Identify Problem Area
Define
–  Determine Appropriate Project Focus
–  Estimate COPQ
–  Charter Project
Measure
–  Assess Stability, Capability and Measurement Systems
Analyze
–  Identify and Prioritize All X’s
–  Prove/Disprove Impact X’s Have on Problem
Improve
–  Identify, Prioritize, Select Solutions Control or Eliminate X’s Causing Problems
–  Implement Solutions to Control or Eliminate X’s Causing Problems
Control
–  Implement Control Plan to Ensure Problem Does Not Return
–  Verify Financial Impact
This roadmap provides an overview of the DMAIC approach.
Define Phase Deployment
Here is a more granular look at the Define Phase. This is what you will later learn to be a Level
2 Process Map.
[Figure: flowchart of the Define Phase, in order:]
–  Business Case Selected
–  Notify Belts and Stakeholders
–  Create High-Level Process Map
–  Determine Appropriate Project Focus (Pareto, Project Desirability)
–  Define & Charter Project (Problem Statement, Objective, Primary Metric, Secondary Metric)
–  Estimate COPQ
–  Recommend Project Focus
–  Approved Project Focus? (Y/N)
–  Create Team
–  Charter Team
–  Ready for Measure
Define Phase Deliverables
Listed here are the types of Define Phase deliverables that will be reviewed in this course.
By the end of this course you should understand what would be necessary to provide these
deliverables in a presentation.
§  Charter Benefits Analysis

§  Team Members (Team Meeting Attendance)
§  Process Map – high level
§  Primary Metric
§  Secondary Metric(s)
§  Lean Opportunities
§  Stakeholder Analysis
§  Project Plan
§  Issues and Barriers
Six Sigma Strategy

Six Sigma places the emphasis on the Process
–  Using a structured, data driven approach centered on the customer Six Sigma can resolve
business problems where they are rooted, for example:
§  Month end reports
§  Capital expenditure approval
§  New hire recruiting
Six Sigma is a Breakthrough Strategy
–  Widened the scope of the definition of quality

§  includes the value and the utility of
the product/service to both the
company and the customer.
Success of Six Sigma depends on the extent of transformation achieved in each of these levels.
Six Sigma is a breakthrough strategy for process improvement. Many people mistakenly assume
that Six Sigma only works in manufacturing type operations. That is categorically untrue. It
applies to all aspects of either a product or service based business.
Wherever there are processes Six Sigma can improve their performance.

Conventional Strategy
Conventional definitions of quality focused on conformance to standards.
[Figure: a Target centered between a Requirement or LSL and a Requirement or USL; output between
the two limits is Good, output outside either limit is Bad]
Conventional strategy was to create a product or service that met certain specifications.
§  Assumed if products and services were of good quality then their
performance standards were correct.
§  Rework was required to ensure final quality.
§  Efforts were overlooked and unquantified (time, money, equipment
usage, etc).
The conventional strategy was to create a product or service that met certain specifications. It was
assumed if products and services were of good quality their performance standards were correct
irrespective of how they were met.
Using this strategy often required rework to ensure final quality, or the rejection and trashing of some
products, and the efforts to accomplish this “inspect in quality” were largely overlooked and
unquantified.
You will see more about these issues when we investigate the Hidden Factory.
Problem Solving Strategy
The Problem Solving Methodology focuses on:
•  Understanding the relationship between independent variables and the dependent variable.
•  Identifying the vital few independent variables that affect the dependent variable.
•  Optimizing the independent variables so as to control our dependent variable(s).
•  Monitoring the optimized independent variable(s).
There are many examples to describe dependent and independent relationships.
•  We describe this concept in terms of the equation: Y = f(Xi)
•  Often referred to as a transfer function
This simply states that Y is a function of the X’s. In other words Y is dictated by the X’s.
Problem Solving Strategy (cont.)
Y = f(x) is a key concept you must fully understand and remember. It is a fundamental principle to
the Six Sigma methodology. In its simplest form it is called “cause and effect”. In its more robust
mathematical form it is called “Y is equal to a function of X”. In the mathematical sense it is data
driven and precise as you would expect in a Six Sigma approach. Six Sigma will always refer to an
output or the result as a Y and will always refer to an input that is associated with or creates the
output as an X.
Another way of saying this is the output is dependent on the inputs that create it through the
blending that occurs from the activities in the process. Since the output is dependent on the inputs
we cannot directly control it; we can only monitor it.
Example
Y = f(Xi)
Which process variables (causes) have critical impact on the output (effect)?
Crusher Yield = f(Feed, Speed, Material Type, Tool Wear, Lubricant)
Time to Close = f(Trial Balance, Correct Accounts Applied, Sub Accounts, Credit Memos, Entry Mistakes, Xn)
If we are so good at the X’s why are we constantly testing and inspecting the Y?
Y = f(x) is a transfer function tool to determine what input variables (X’s) affect the output
responses (Y’s). The observed output is a function of the inputs. The difficulty lies in determining
which X’s are critical to describe the behavior of the Y’s.
The X’s determine how the Y performs.
In the Measure Phase we will introduce a tool to manage the long list of input variables and their
relationship to the output responses. It is the X-Y Matrix or Input-Output Matrix.
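To make the idea of a transfer function concrete, the sketch below defines an invented Y = f(X) for the crusher example and checks how much Y moves when each X is changed on its own. The coefficients, baseline values and function names are assumptions made up for illustration; a real transfer function would come from process data.

```python
def crusher_yield(feed, speed, tool_wear):
    """Hypothetical transfer function Y = f(X); the coefficients are invented."""
    return 90.0 - 0.2 * feed + 0.05 * speed - 8.0 * tool_wear


# Hypothetical baseline settings for each input (X).
baseline = {"feed": 10.0, "speed": 100.0, "tool_wear": 0.5}


def effect_of(name, change_pct=10.0):
    """Change in Y when one X moves by change_pct percent and the others stay at baseline."""
    moved = dict(baseline)
    moved[name] = baseline[name] * (1 + change_pct / 100)
    return crusher_yield(**moved) - crusher_yield(**baseline)


effects = {name: effect_of(name) for name in baseline}

# Rank the X's by the size of their effect on Y, largest first.
ranked = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)

for name in ranked:
    print(f"{name:>10}: {effects[name]:+.2f} change in Y")
```

Changing one X at a time is the simplest possible screening approach; it ignores interactions between X's, which later phases of the course address with more capable tools.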
Y = f(X) Exercise
Exercise:
Consider establishing a Y = f(x) equation for a simple everyday activity such as producing a
cup of espresso. In this case our output or Y is espresso.
Espresso = f(X1, X2, X3, X4, Xn)
Six Sigma Strategy
We use a variety of Six Sigma tools to help separate the vital few variables affecting our Y from
the trivial many.
Some processes contain many, many variables. However, our Y is not affected equally by all of them.
By focusing on the vital few we instantly gain leverage.
Archimedes said: Give me a lever big enough and a fulcrum on which to place it and I shall move
the world.
[Figure: variables (X1) through (X10); caption: Archimedes not shown actual size!]
As you go through the application of DMAIC you will have a goal to find the Root Causes to the
problem you are solving. Remember a vital component of problem solving is cause and effect
thinking or Y = f(X). To aid you in doing so you should create a visual model of this goal as a funnel
- a funnel that takes in a large number of the “trivial many contributors” and narrows them to the
“vital few contributors” by the time they leave the bottom.
At the top of the funnel you are faced with all possible causes - the “vital few” mixed in with the
“trivial many.” When you work an improvement effort or project you must start with this type of
thinking. You will use various tools and techniques to brainstorm possible causes of performance
problems and operational issues based on data from the process. In summary, you will be applying
an appropriate set of “analytical methods” and the “Y is a function of X” thinking to transform data
into the useful knowledge needed to find the solution to the problem.
As a rule of thumb, 80 percent of a problem is related to six or fewer causes: the X’s. In most
cases it is between one and three. The goal is to find the one to three Critical X’s from the many
potential causes when we start an improvement project. In a nutshell this is how the Six Sigma
methodology works.
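The narrowing from the trivial many to the vital few can be sketched as a simple Pareto calculation: rank the causes by how often they occur and keep adding them until they account for about 80 percent of the problem. The defect counts, cause names and the function `vital_few` below are invented for illustration.

```python
def vital_few(cause_counts, threshold=0.80):
    """Return the largest causes that together reach the threshold share of all defects."""
    total = sum(cause_counts.values())
    ranked = sorted(cause_counts.items(), key=lambda item: item[1], reverse=True)

    selected = []
    cumulative = 0
    for cause, count in ranked:
        selected.append(cause)
        cumulative += count
        if cumulative / total >= threshold:
            break
    return selected


# Hypothetical defect counts by cause for a month-end close process.
defect_counts = {
    "wrong account": 120,
    "late entry": 90,
    "missing memo": 45,
    "typo": 20,
    "system outage": 15,
    "other": 10,
}

print(vital_few(defect_counts))
```

In this made-up data set three of the six causes account for 85 percent of the defects, so those three would be the first candidates for Critical X’s.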
Breakthrough Strategy
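[Figure: Performance (Bad to Good) plotted over Time. An Old Standard band between its UCL and LCL
is followed by a 6-Sigma Breakthrough to a New Standard band between its own UCL and LCL.
Source: Juran’s Quality Handbook by Joseph Juran]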
By utilizing the DMAIC problem solving methodology to identify and optimize the vital few variables we
will realize sustainable breakthrough performance as opposed to incremental improvements or, even
worse, temporary and non-sustainable improvement.
The image above shows how after applying the Six Sigma tools variation stays within the specification
limits.
VOC, VOB, VOE
The foundation of Six Sigma requires Focus on the voices of the Customer, the Business and the
Employee:
–  VOC is Customer Driven
–  VOB is Profit Driven
–  VOE is Process Driven
This focus provides:
§  Awareness of the needs that are critical to the quality (CTQ) of our products and
services
§  Identification of the gaps between “what is” and “what should be”
§  Identification of the process defects that contribute to the “gap”
§  Knowledge of which processes are “most broken”
§  Enlightenment as to the unacceptable Costs of Poor Quality (COPQ)
Six Sigma puts a strong emphasis on the customer because they are the ones assessing our
performance and they respond by either continuing to purchase our products and services or….by NOT!
So, while the customer is the primary concern we must keep in mind the Voice of the Business – how do
we meet the business’s needs so we stay in business? And we must keep in mind the Voice of the
Employee - how do we meet employees’ needs such that they remain employed by our firm and remain
inspired and productive?
Six Sigma Roles and Responsibilities

There are many roles and responsibilities for successful implementation of Six Sigma.
§  Executive Leadership
§  Champion/Process Owner
§  Master Black Belt
§  Black Belt
§  Green Belt
§  Yellow Belt
[Figure: MBB, Black Belts, Green Belts, Yellow Belts]
Just as on a winning sports team, people in specific positions or roles have defined
responsibilities. Six Sigma is similar - each person is trained to be able to understand and perform the
responsibilities of their role. The end result is a knowledgeable and well coordinated winning business
team.
Training and skills are divided across the organization so that each role is a specialist. The structure
is based on levels of assistance, much as you would find in the medical field between a
Doctor, 1st year Intern, Nurse, etc. The following pages discuss these roles in more detail.
In addition to the roles described herein all other employees are expected to have essential Six Sigma
skills for process improvement and to provide assistance and support for the goals of Six Sigma and
the company.
Six Sigma has been designed to provide a structure with various skill levels and knowledge for all
members of the organization. Each group has well defined roles and responsibilities and
communication links. When all participants are actively applying Six Sigma principles the company
operates and performs at a higher level. This leads to increased profitability and greater employee and
customer satisfaction.
Executive Leadership
Not all Six Sigma deployments are driven from the top by executive leadership. The data is clear,
however, that those deployments that are driven by executive management are much more successful
than those that are not.
§  Makes the decision to implement the Six Sigma initiative and develops an accountability
method
§  Sets meaningful goals and objectives for the corporation
§  Sets performance expectations for the corporation
§  Ensures continuous improvement in the process
§  Eliminates barriers
The executive leadership owns the vision for the business; they provide sponsorship and set
expectations for the results from Six Sigma. They enable the organization to apply Six Sigma and then
monitor the progress against expectations.
Champion/Process Owner
Champions identify and select the most meaningful projects to work on. They provide guidance to
the Six Sigma Belts and open the doors for the Belts to apply the process improvement technologies.
§  Own project selection, execution control, implementation and realization of gains
§  Obtain needed project resources and eliminate roadblocks
§  Participate in all project reviews
§  Ask good questions…
§  One to three hours per week commitment
Champions are responsible for functional business activities and for providing business deliverables to
either internal or external customers. They are in a position to be able to recognize problem areas of
the business, define improvement projects, assign projects to appropriate individuals, review projects
and support their completion. They are also responsible for a business roadmap and employee
training plan to achieve the goals and objectives of Six Sigma within their area of accountability.
Master Black Belt
MBBs should be well versed in all aspects of Six Sigma, from technical applications to Project
Management. MBBs need to have the ability to influence change and motivate others.
§  Provide advice and counsel to Executive Staff
§  Provide training and support
- In class training
- On site mentoring
§  Develop sustainability for the business
§  Facilitate cultural change
A Master Black Belt is a technical expert, a “go to” person for the Six Sigma methodology. Master
Black Belts mentor Black Belts and Green Belts through their projects and support Champions. In
addition to applying Six Sigma, Master Black Belts are capable of teaching others in the practices
and tools.
Being a Master Black Belt is a full time position.
Black Belt
Black Belts are application experts and work projects within the business. They should be well
versed in the Six Sigma technologies and have the ability to drive results.
§  Project team leader
§  Facilitates DMAIC teams in applying Six Sigma methods to solve problems
§  Works cross-functionally
§  Contributes to the accomplishment of organizational goals
§  Provides technical support to improvement efforts
A Black Belt is a project team leader, working full time to solve problems under the direction of a
Champion, and with technical support from the Master Black Belt. Black Belts work on projects
that are relatively complex and require significant focus to resolve. Most Black Belts conduct an
average of 4 to 6 projects a year -- projects that usually have a high financial return for the
company.
Green Belt
Green Belts are practitioners of Six Sigma Methodology and typically work within their
functional areas or support larger Black Belt Projects.
§  Well versed in the definition & measurement of critical processes
- Creating Process Control Systems
§  Typically works projects in existing functional area
§  Involved in identifying improvement opportunities
§  Involved in continuous improvement efforts
- Applying basic tools and PDCA
§  Team members on DMAIC teams
- Supporting projects with process knowledge & data collection
Green Belts are capable of solving problems within their local span of control. Green Belts remain in
their current positions, but apply the concepts and principles of Six Sigma to their job environment.
Green Belts usually address less complex problems than Black Belts and perform at least two projects
per year. They may also be a part of a Black Belt’s team, helping to complete the Black Belt project.
Yellow Belt
§  Provide support to Black Belts and Green Belts as needed
§  May be team members on DMAIC teams
- Supporting projects with process knowledge and data collection
Yellow Belts participate in process management activities. They fully understand the principles of Six
Sigma and are capable of characterizing processes, solving problems associated with their work
responsibilities and implementing and maintaining the gains from improvements. They apply Six
Sigma concepts to their work assignments. They may also participate on Green and Black Belt
projects.
The Life of a Six Sigma Belt
Training as a Six Sigma Belt can be one of the most rewarding undertakings of your career and
one of the most difficult.
You can expect to experience:
§  Hard work (becoming a Six Sigma Belt is not easy)
§  Long hours of training
§  Be a change agent for your organization
§  Work effectively as a team leader
§  Prepare and present reports on progress
§  Receive mentoring from your Master Black Belt
§  Perform mentoring for your team members
§  ACHIEVE RESULTS!
You’re going places!
Black & Green Belt Certification
To achieve certification, Belts typically must:
§  Complete all course work:
- Be familiar with tools and their application
- Practice using tools in theoretical situations
- Discuss how tools will apply to actual projects
§  Demonstrate application of learning to training project:
- Use the tools to effect a financially measurable and significant business impact through their
projects
- Show ability to use tools beyond the training environment
§  Must complete two projects within one year from beginning of training
§  Achieve results and make a difference
§  Submit a final report which documents tool understanding and application as well as process
changes and financial impact for each project
We’ll be watching!
Organizational Behaviors
All players in the Six Sigma process must be willing to step up and act according to the Six Sigma
set of behaviors.
§  Leadership by example: “walk the talk”
§  Encourage and reward individual initiative
§  Align incentive systems to support desired behaviors
§  Eliminate functional barriers
§  Embrace “systems” thinking
§  Balance standardization with flexibility
Six Sigma is a system of improvement. It develops people skills and capability for the participants. It
consists of a proven set of analytical tools, project-management techniques, reporting methods and
management methods combined to form a powerful problem-solving and business-improvement
methodology. It solves problems, resulting in increased revenue, profit and business growth.
The strategy of Six Sigma is a data-driven, structured approach to managing processes, quantifying
problems, and removing waste by reducing variation and eliminating defects.
The tactics of Six Sigma are the use of process exploration and analysis tools to solve the equation
of Y = f(X) and to translate this into a controllable practical solution.
As a performance goal a Six Sigma process produces no more than 3.4 defects per million
opportunities. As a business goal Six Sigma can achieve 40% or more improvement in the
profitability of a company. It is a philosophy that every process can be improved at breakthrough
levels.
At this point you should be able to:
§  Describe the objectives of Six Sigma
§  Describe the relationship between variation and sigma
§  Recognize some Six Sigma concepts
§  Recognize the Six Sigma implementation model
§  Describe the general roles and responsibilities in Six Sigma
You have now completed Define Phase – Understanding Six Sigma.
Lean Six Sigma

Green Belt Training
Define Phase
Now we will continue in the Define Phase with “Six Sigma Fundamentals”.
The output of the Define Phase is a well developed and articulated project. It has been correctly
stated that 50% of the success of a project is dependent on how well the effort has been defined.
There’s that Y = f(X) thinking again.

Overview
The fundamentals of this phase are Process Maps, Voice of the Customer, Cost of Poor Quality and
Process Metrics. We will examine the meaning of each of these and show you how to apply them.
Define Phase modules:
§  Understanding Six Sigma
§  Six Sigma Fundamentals
- Process Maps
- Voice of the Customer
- Cost of Poor Quality
- Process Metrics
§  Selecting Projects
§  Elements of Waste
§  Wrap Up & Action Items
What is a Process?
Why have a process focus?

–  So we can understand how and why work gets done
–  To characterize customer & supplier relationships
–  To manage for maximum customer satisfaction while utilizing
minimum resources
–  To see the process from start to finish as it is currently being
performed
–  Defects: Blame the process, not the people
proc•ess (pros′es) n. – A repetitive and systematic series of steps or activities where inputs are
modified to achieve a value-added output
What is a Process? Many people carry out a process every day, but do they really think of it as a
process? Our definition of a process is a repetitive and systematic series of steps or activities where
inputs are modified to achieve a value-added output.
A successful process usually needs to be well defined and developed.
24
Examples of Processes
We go through processes every day. Below are some examples of processes. Can you
think of other processes within your daily environment?
§  Injection molding §  Recruiting staff
§  Decanting solutions §  Processing invoices
§  Filling vial/bottles §  Conducting research
§  Crushing ore §  Opening accounts
§  Refining oil §  Reconciling accounts
§  Turning screws §  Filling out a timesheet
§  Building custom homes §  Distributing mail
§  Paving roads §  Backing up files
§  Changing a tire §  Issuing purchase orders
Process Maps
The purpose of a Process Map is to:
–  Identify the complexity of the process
–  Communicate the focus of problem solving
Process Maps are living documents and must be changed as the process is changed:
–  They represent what is currently happening not what you think is happening
–  They should be created by the people who are closest to the process
[Figure: Process Map showing Start, Step A, Step B, Step C, Step D and Finish, with an Inspect step]
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and
steps necessary to produce a product or a service. The preferred method for describing a process is
to identify it with a generic name, show the workflow with a Process Map and describe its purpose
with an operational description.
Remember a process is a blending of inputs to produce some desired output. The intent of each task,
activity and step is to add value, as perceived by the customer, to the product or service we are
producing. You cannot discover if this is the case until you have adequately mapped the process.
There are many reasons for creating a Process Map:

- It helps all process members understand their part in the process and how their process fits into the
bigger picture.
- It describes how activities are performed and how the work effort flows; it is a visual way of standing
above the process and watching how work is done. Process Maps can also be loaded into modeling
and simulation software, allowing you to simulate the process and see how it works.
- It can be used as an aid in training new people.
- It will show you where you can take measurements that will help you to run the process better.
- It will help you understand where problems occur and what some of the causes may be.
- It leverages other analytical tools by providing a source of data and inputs into these tools.
- It identifies many important characteristics you will need as you strive to make improvements.
The individual processes are linked together to see the total effort and flow for meeting business and
customer needs. In order to improve or to correctly manage a process, you must be able to describe it
in a way that can be easily understood. Process Mapping is the most important and powerful tool you
will use to improve the effectiveness and efficiency of a process.
25
Process Map Symbols
Standard symbols for Process Mapping:

(available in Microsoft Office™, Visio™, iGrafx™ , SigmaFlow™ and other products)
A RECTANGLE indicates an activity. Statements within the rectangle should begin with a verb.
A PARALLELOGRAM shows that there are data.
A DIAMOND signifies a decision point. Only two paths emerge from a decision point: No and Yes.
An ELLIPSE shows the start and end of the process.
An ARROW shows the connection and direction of flow.
A CIRCLE WITH A LETTER OR NUMBER INSIDE symbolizes the continuation of a flowchart to another page.
There may be several interpretations of some of the Process Mapping symbols; however, just
about everyone uses these primary symbols to document processes. As you become more
practiced you will find additional symbols useful, e.g. for reports and data storage. For now we will
start with just these symbols.
High Level Process Map
One of the deliverables from the Define Phase is a high level Process Map which at a minimum must
include:
–  Start and stop points
–  All process steps
–  All decision points
–  Directional flow
–  Value categories as defined here:
•  Value Added:
–  Physically transforms the thing going through the process
–  Must be done right the first time
–  Meaningful from the customer’s perspective (is the customer willing to pay for it?)
•  Value Enabling:
–  Satisfies requirements of non-paying external stakeholders (government regulations)
•  Non-Value Added:
–  Everything else
At a minimum a high level Process Map must include: start and stop points, all process steps, all
decision points and directional flow. Also be sure to include Value Categories such as Value Added
(Customer Focus) and Value Enabling (External Stakeholder focus).
Process Map Example
[Figure: Process Map for a Call Center. A detailed flowchart that runs from START (logon to PC and
applications, logon to phone) through handling a call or walk-in (access case tool, determine who is
inquiring, determine the nature of the call, create or update a case, provide a response, transfer or
route the call, do research, close the case) to logoff, with lettered connectors (A to F and Z)
linking the sections of the map.]
Cross Functional Process Map

When multiple departments or functional groups are involved in a complex process it is often useful
to use cross functional Process Maps.
–  Draw in either vertical or horizontal Swim Lanes, label the functional groups and draw the
Process Map
[Figure: cross functional Process Map, “Sending Wire Transfers”, with Swim Lanes for Department,
Vendor, Financial Accounting, Bank and General Accounting. ACH – Automated Clearing House.]
These are best used in transactional processes or where the process involves several departments.
The lines drawn horizontally across the map represent different departments in the company and are
usually referred to as Swim Lanes. By mapping in this manner one can see how the various departments
are interdependent in this process.
Process Map Exercise
Exercise objective: Using your favorite Process Mapping tool create a Process Map of your project
or functional area.
1.  Create a high level Process Map; use enough detail to make it useful.
•  It is helpful to use rectangular post-its for process steps and square ones turned to a diamond
for decision points.
2.  Color code the value added (green) and non-value added (red) steps.
3.  Be prepared to discuss this with your mentor.
Do you know your Customer?
Knowing your customer is more than just a handshake. It is necessary to clearly understand their
needs. In Six Sigma we call this understanding the CTQ’s, or Critical to Quality characteristics
(also called Critical to Customer Characteristics).
[Figure: Voice Of the Customer leading to Critical to Customer Characteristics]
An important element of Six Sigma is understanding your customer. This is called VOC or Voice of
the Customer. Capturing the VOC allows you to find all of the necessary information that is relevant
between your product/process and customer, better known as CTQ’s (Critical to Quality). The CTQ’s
are the customer requirements for satisfaction with your product or service.
Voice of the Customer

Voice of the Customer or VOC seems obvious; after all, we all know what the customer wants. Or do we?
The customer’s perspective has to be foremost in the mind of the Six Sigma Belt throughout the
project cycle.
1. Features
•  Does the process provide what the customers expect and need?
•  How do you know?
2. Integrity
•  Is the relationship with the customer centered on trust?
•  How do you know?
3. Delivery
•  Does the process meet the customer’s time frame?
•  How do you know?
4. Expense
•  Does the customer perceive value for cost?
•  How do you know?
Do you feel confident you know what your customer wants? There are four steps that can help you in
understanding your customer. These steps focus on the customer’s perspective of features, your
company’s integrity, delivery mechanisms and perceived value versus cost.
What is a Customer?
Different types of customers dictate how we interact with them in the process. In order to identify
customer and supplier requirements we must first define who the customers are:
External
–  Direct: those who receive the output of your services; they generally are the source of your
revenue
–  Indirect: those who do not receive or pay for the output of your services but have a vested
interest in what you do (government agencies)
Internal
–  Those within your organization who receive the output of your work
Every process has a deliverable. The person or entity who receives this deliverable is a customer.
There are two types of customers: External and Internal. People generally forget about the Internal
customer, yet they are just as important as the customers who are buying your product.
Value Chain
The relationship from one process to the next in an organization creates a Value Chain of suppliers
and receivers of process outputs.
Each process has a contribution and accountability to the next to satisfy the external customer.
External customers’ needs and requirements are best met when all process owners work cooperatively
in the Value Chain.
Careful – each move has many impacts!
The disconnect between Design and Production in some organizations is a good example. If
Production is not fed the proper information from Design how can Production properly build a
product?
Every activity (process) must be linked to move from raw materials to a finished product on a store
shelf.
What is a CTQ?
Critical to Quality characteristics (CTQ’s) are measures we use to capture VOC properly (also
referred to in some literature as CTC’s – Critical to Customer).
CTQ’s can be vague and difficult to define.
–  The customer may identify a requirement that is difficult to measure directly so it will be
necessary to break down what is meant by the customer into identifiable and measurable terms
Product:               Service:
•  Performance         •  Competence
•  Features            •  Reliability
•  Conformance         •  Accuracy
•  Timeliness          •  Timeliness
•  Reliability         •  Responsiveness
•  Serviceability      •  Access
•  Durability          •  Courtesy
•  Aesthetics          •  Communication
•  Reputation          •  Credibility
•  Completeness        •  Security
                       •  Understanding
Example: Making an Online Purchase
Reliability – Is the correct amount of money taken from the account?
Responsiveness – How long do you wait for the product after the Merchant receives their money?
Security – Is your sensitive banking information stored in a secure place?
Developing CTQ’s
Step 1: Identify Customers
•  Listing
•  Segmentation
•  Prioritization
Step 2: Capture VOC
•  Review existing performance
•  Determine gaps in what you need to know
•  Select tools that provide data on gaps
•  Collect data on the gaps
Step 3: Validate CTQ’s
•  Translate VOC to CTQ’s
•  Prioritize the CTQ’s
•  Set Specified Requirements
•  Confirm CTQ’s with customer
The steps in developing CTQ’s are identifying the customer, capturing the Voice of the Customer and
finally validating the CTQ’s.
Cost of Poor Quality (COPQ)
•  COPQ stands for Cost of Poor Quality
•  As a Six Sigma Belt one of your tasks will be to estimate COPQ for your process
•  Through your process exploration and project definition work you will develop a refined estimate
of the COPQ in your project
•  This project COPQ represents the financial opportunity of your team’s improvement effort (VOB)
•  Calculating COPQ is iterative and will change as you learn more about the process
No, not that kind of cop queue!
Another important tool from this phase is COPQ, Cost of Poor Quality. COPQ represents the financial
opportunity of your team’s improvement efforts. Those opportunities are tied to either hard or soft
savings.
COPQ is a symptom measured in loss of profit (financial quantification) that results from errors
(defects) and other inefficiencies in our processes. This is what we are seeking to eliminate!
You will use the concept of COPQ to quantify the benefits of an improvement effort and also to
determine where you might want to investigate improvement opportunities.
The Essence of COPQ

•  The concepts of traditional Quality Cost are the foundation for COPQ.
–  External, Internal, Prevention, Appraisal
•  A significant portion of COPQ from any defect comes from effects that are difficult to quantify
and must be estimated.
•  COPQ helps us understand the financial impact of problems created by defects.
•  COPQ is a symptom, not a defect
–  Projects fix defects with the intent of improving symptoms.
There are four elements that make up COPQ: External Costs, Internal Costs, Prevention Costs and
Appraisal Costs. Internal Costs are opportunities for error found in a process that is within your
organization, whereas External Costs are costs associated with the finished product and with the
internal and external customer.
Prevention Costs are typically costs associated with product quality; this is viewed as an investment
companies make to ensure product quality. The final element is Appraisal Costs; these are tied to
product inspection and auditing.
This idea of COPQ was defined by Joseph Juran and is a great point of reference to gain a further
understanding.
Over time with Six Sigma, COPQ has migrated towards the reduction of waste. Waste is a better term
because it includes poor quality and all other costs that are not integral to the product or service your
company provides. Waste does not add value in the eyes of customers, employees or investors.
COPQ - Categories
Internal COPQ
•  Quality Control Department
•  Inspection
•  Quarantined Inventory
•  Etc…
External COPQ
•  Warranty
•  Customer Complaint Related Travel
•  Customer Charge Back Costs
•  Etc…
Prevention
•  Error Proofing Devices
•  Supplier Certification
•  Design for Six Sigma
•  Etc…
Detection
•  Supplier Audits
•  Sorting Incoming Parts
•  Repaired Material
•  Etc…
COPQ - Iceberg
[Figure: COPQ iceberg. Visible Costs (Hard Costs) above the waterline: Inspection, Warranty, Recode,
Rework, Rejects. Hidden Costs (Soft Costs, less obvious) below the waterline: Engineering change
orders, Lost sales, Time value of money, Late delivery, Expediting costs, More set-ups, Excess
inventory, Working Capital allocations, Long cycle times, Excessive Material Orders/Planning, Lost
Customer Loyalty.]
Generally speaking COPQ can be classified as tangible (easy to see) and intangible (hard to see).
Visually you can think of COPQ as an iceberg. Most of the iceberg is below the water where you
cannot see it.
Similarly the tangible quality costs are costs the organization is rather conscious of, may be
measuring already or could easily be measured. The COPQ metric is reported as a percent of sales
revenue. For example tangible costs like inspection, rework, warranty, etc. can cost an organization
in the range of 4 percent to 10 percent of every sales dollar it receives. If a company makes a
billion dollars in revenue this means there are tangible wastes between 40 and 100 million dollars.
Even worse are the intangible Costs of Poor Quality. These are typically 20 to 35% of sales. If you
average the intangible and tangible costs together it is not uncommon for a company to be spending
25% of their revenue on COPQ or waste.
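As a rough illustration of the percentages above, the short Python sketch below converts a COPQ share of sales into dollars. The function name `copq_range` and the choice of Python are assumptions made for this example; the revenue and percentage figures are the ones quoted in the text.

```python
def copq_range(revenue, low_pct, high_pct):
    """Return the (low, high) dollar range for a COPQ share of revenue."""
    return revenue * low_pct / 100, revenue * high_pct / 100


revenue = 1_000_000_000  # the billion-dollar company from the text

tangible = copq_range(revenue, 4, 10)      # visible (hard) costs
intangible = copq_range(revenue, 20, 35)   # hidden (soft) costs

print(f"Tangible COPQ:   ${tangible[0]:,.0f} to ${tangible[1]:,.0f}")
print(f"Intangible COPQ: ${intangible[0]:,.0f} to ${intangible[1]:,.0f}")
```

The same helper can be reused with your own revenue figure once you have an estimate of the COPQ percentages for your process.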
COPQ and Lean
Waste does not add, subtract or otherwise modify the throughput in a way that is perceived by the
customer to add value.
•  In some cases, waste may be necessary but should be recognized and explored:
–  Inspection, Correction, Waiting in suspense
–  Decision diamonds, by definition, are non-value added
•  Often waste can provide opportunities for additional defects to occur.
•  We will discuss Lean in more detail later in the course.
Lean Enterprise: Seven Elements of Waste
•  Correction
•  Processing
•  Conveyance
•  Motion
•  Waiting
•  Overproduction
•  Inventory
Implementing Lean fundamentals can also help identify areas of COPQ. Lean will be discussed later.
COPQ and Lean
While Hard Savings are always more desirable because they are easier to quantify it is also
necessary to think about Soft Savings.
COPQ – Hard Savings
•  Labor Savings
•  Cycle Time Improvements
•  Scrap Reductions
•  Hidden Factory Costs
•  Inventory Carrying Cost
COPQ – Soft Savings
•  Gaining Lost Sales
•  Missed Opportunities
•  Customer Loyalty
•  Strategic Savings
•  Preventing Regulatory Fines
Here are examples of COPQ’s Hard and Soft Savings.
COPQ Exercise
Exercise objective: Identify current COPQ opportunities in your direct area.
1.  Brainstorm a list of COPQ opportunities.
2.  Categorize the top 3 sources of COPQ for the four classifications:
•  Internal
•  External
•  Prevention
•  Detection
The Basic Six Sigma Metrics
In any process improvement endeavor the ultimate objective is to make the process:
•  Better: DPU, DPMO, RTY (there are others, but they derive from
these basic three)
•  Faster: Cycle Time
•  Cheaper: COPQ
If you make the process better by eliminating defects you will make it faster.
If you choose to make the process faster you will have to eliminate defects to
be as fast as you can be.
If you make the process better or faster you will necessarily make it cheaper.
The metrics for all Six Sigma projects fall into one of
these three categories.
Previously we have been discussing process management and the concepts behind a process
perspective. Now we begin to discuss process improvement and the metrics used.
Some of these metrics are:

DPU: defects per unit produced.
DPMO: defects per million opportunities, assuming there is more than one
opportunity to fail in a given unit of output.
RTY: rolled throughput yield, the probability that any unit will go through a process
defect-free.
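A minimal Python sketch of the first two metrics is shown below. The function names and the sample counts are hypothetical; DPMO is computed here with the commonly used formula defects / (units x opportunities per unit) x 1,000,000.

```python
def dpu(defects, units):
    """Defects per unit produced."""
    return defects / units


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects * 1_000_000 / (units * opportunities_per_unit)


# Hypothetical sample: 2 defects found in 100 units, 5 opportunities per unit
print(dpu(2, 100))      # defects per unit
print(dpmo(2, 100, 5))  # defects per million opportunities
```

Keeping the two functions separate mirrors the definitions: DPU ignores how many ways a unit can fail, while DPMO normalizes by the number of opportunities.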
Cycle Time Defined
Think of Cycle Time in terms of your product or transaction in the eyes of the customer of the
process:
–  It is the time required for the product or transaction to go through the entire process from
beginning to end
–  It is not simply the touch time of the value-added portion of the process
Cycle time includes any wait or queue time for either people or products.
What is the cycle time of the process you mapped?
Is there any variation in the cycle time?
Why?
Defects Per Unit (DPU)
Six Sigma methods quantify individual defects and not just defectives
–  Defects account for all errors on a unit
•  A unit may have multiple defects
•  An incorrect invoice may have the wrong amount due and the wrong due date
–  Defectives simply classifies the unit as bad
•  Does not matter how many defects there are
•  The invoice is wrong, causes are unknown
–  A unit:
•  Is the measure of volume of output from your area.
•  Is observable and countable. It has a discrete start and stop point.
•  Is an individual measurement and not an average of measurements.
[Figure: Two Defects, One Defective]
DPU or Defects per Unit quantifies individual defects on a unit and not just defective units. A
returned unit or transaction can be defective and have more than one defect.
Defect: A physical count of all errors on a unit, regardless of the disposition of the unit.
EXAMPLES: An online transaction has two errors (typed wrong card number, internet failed). In this
case one online transaction had 2 defects (DPU=2).
A Mobile Computer that has 1 broken video screen, 2 broken keyboard keys and 1 dead battery has a
total of 4 defects (DPU=4).
Is a process that produces 1 DPU better or worse than a process that generates 4 DPU? If you
assume equal weight on the defects, obviously a process that generates 1 DPU is better; however,
cost and severity should be considered. Even so, the only way you can model or predict a process
is to count all the defects.
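The difference between defects and defectives can be sketched in a few lines of Python. The batch below is hypothetical: each number is the count of defects found on one unit, and the variable names are chosen only for this illustration.

```python
# Hypothetical batch: number of defects counted on each of five units
defects_per_unit = [2, 0, 0, 4, 0]

total_defects = sum(defects_per_unit)                   # every error counts
defectives = sum(1 for d in defects_per_unit if d > 0)  # units with any error
dpu = total_defects / len(defects_per_unit)

print(f"Defects: {total_defects}, Defectives: {defectives}, DPU: {dpu}")
```

Counting only defectives would report two bad units and hide the fact that six separate errors had to be dealt with.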
First Time Yield
FTY is the traditional quality metric for yield
–  Unfortunately it does not account for any necessary rework
FTY = Total Units Passed / Total Units Tested
Step                     Units In   Units Out   Defects Repaired
Process A (Grips)        100        100         40
Process B (Shafts)       100        100         30
Process C (Club Heads)   100        100         20
Final Product (Set of Irons): Units Tested = 100, Units Passed = 100, FTY = 100%
Traditional metrics when chosen poorly can lead the team in a direction not consistent with the
focus of the business. A metric we must be concerned about is FTY - First Time Yield. It is very
possible to have 100% FTY and spend tremendous amounts in excess repairs and rework.
*None of the data used herein is associated with the products shown herein. Pictures are no more than illustration to make a point to teach the concept.
Rolled Throughput Yield
RTY is a more appropriate metric for problem solving
–  It accounts for losses due to rework steps
RTY = X1 * X2 * X3
Step                     Units In   Units W/O Rework   Defects Repaired   RTY
Process A (Grips)        100        60                 40                 0.6
Process B (Shafts)       100        70                 30                 0.7
Process C (Club Heads)   100        80                 20                 0.8
Final Product (Set of Irons): Units Tested = 100, Units Passed = 34, RTY = 33.6%
Instead of relying on FTY, First Time Yield, a more efficient metric to use is RTY-Rolled Throughput
Yield. RTY has a direct correlation (relationship) to Cost of Poor Quality.
In the few organizations where data is readily available the RTY can be calculated using actual defect
data. The data provided by this calculation would be a binomial distribution since the lowest yield
possible would be zero.
As depicted here RTY is the multiplied yield of each subsequent operation throughout a process (X1 *
X2 * X3…)
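To make the contrast with FTY concrete, here is a small Python sketch using the golf club figures from the example. The variable names are assumptions made for illustration.

```python
from math import prod

units_in = 100
# Units that passed each step without rework (figures from the example)
units_without_rework = {"Grips": 60, "Shafts": 70, "Club Heads": 80}

step_yields = [n / units_in for n in units_without_rework.values()]
rty = prod(step_yields)   # multiply the yield of each step

units_tested, units_passed_after_rework = 100, 100
fty = units_passed_after_rework / units_tested  # rework is invisible to FTY

print(f"FTY = {fty:.1%}, RTY = {rty:.1%}")
```

The two numbers describe the same process: FTY only sees the units that leave final test, while RTY exposes the rework performed along the way.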
RTY Estimate
•  In many organizations the long term data required to calculate RTY is not available. We can
however estimate RTY using a known DPU as long as certain conditions are met.
•  The Poisson distribution generally holds true for the random distribution of defects in a unit of
product and is the basis for the estimation.
–  The best estimate of the proportion of units containing no defects, or RTY, is:
$RTY = e^{-dpu}$
The mathematical constant e is the base of the natural logarithm.
e ≈ 2.71828 18284 59045 23536 02874 7135
Sadly, in most companies there is not enough data to calculate RTY in the long term. Installing data
collection practices required to provide such data would not be cost effective. In those instances
it is necessary to utilize a prediction of RTY in the form of $e^{-dpu}$ (e to the negative dpu).
When using the $e^{-dpu}$ equation to calculate the probability of a product or service moving
through the entire process without a defect, there are several things that must be held for
consideration. While this would seem to be a constraint, it is appropriate to note that if a process
has in excess of 10% defects there is little need to concern yourself with the RTY.
In such extreme cases it would be much more prudent to correct the problem at hand before worrying
about how to calculate yield.
Deriving RTY from DPU
The Binomial distribution is the true model for defect data but the Poisson is the convenient model
for defect data. The Poisson does a good job of predicting when the defect rates are low.
Poisson vs. Binomial (r = 0, n = 1)
Probability    Yield        Yield       % Over
of a defect    (Binomial)   (Poisson)   Estimated
0.0            100%         100%        0%
0.1            90%          90%         0%
0.2            80%          82%         2%
0.3            70%          74%         4%
0.4            60%          67%         7%
0.5            50%          61%         11%
0.6            40%          55%         15%
0.7            30%          50%         20%
0.8            20%          45%         25%
0.9            10%          41%         31%
1.0            0%           37%         37%
[Figure: Yield (RTY) versus Probability of a defect, comparing Yield (Binomial) with Yield (Poisson)]
Symbols used in the Binomial and Poisson models:
n = number of units
r = number of predicted defects
p = probability of a defect occurrence
q = 1 - p
For low defect rates (p < 0.1) the Poisson approximates the Binomial fairly well.
Our goal is to predict yield. For process improvement the “yield” of interest is the ability of a process
to produce zero defects (r = 0). Question: What happens to the Poisson equation when r = 0?
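The comparison in the table can be reproduced with a short Python sketch. It assumes the same case as the table (one opportunity, n = 1, and zero defects, r = 0), so the Binomial yield is 1 - p and the Poisson yield is e to the negative p. The function names are hypothetical.

```python
import math


def binomial_yield(p):
    """Probability of zero defects for a single opportunity (n = 1, r = 0)."""
    return 1 - p


def poisson_yield(p):
    """Poisson estimate of the zero-defect probability, e^(-p)."""
    return math.exp(-p)


for i in range(11):
    p = i / 10
    exact, estimate = binomial_yield(p), poisson_yield(p)
    print(f"p={p:.1f}  binomial={exact:.0%}  poisson={estimate:.0%}  "
          f"over={estimate - exact:.0%}")
```

The gap between the two columns grows with p, which is why the Poisson shortcut is recommended only for low defect rates.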
Deriving RTY from DPU - Modeling
Basic Question: What is the likelihood of producing a unit with zero defects?
•  For the unit shown (10 opportunities per unit) the following data was gathered:
–  60 defects observed
–  60 units processed
•  What is the DPU?
•  What is the probability that any given opportunity will be a defect?
•  What is the probability that any given opportunity will NOT be a defect?
•  The probability that all 10 opportunities on a single unit will be defect-free is:
Probability an opportunity is a defect = 0.1
Probability an opportunity is not a defect = 1 - 0.1 = 0.9
Probability all 10 opportunities are defect-free = $0.9^{10}$ = 0.34867844
RTY for DPU = 1
Opportunities   P(defect)   P(no defect)   RTY (Prob defect free unit)
10              0.1         0.9            0.34867844
100             0.01        0.99           0.366032341
1000            0.001       0.999          0.367695425
10000           0.0001      0.9999         0.367861046
100000          0.00001     0.99999        0.367877602
1000000         0.000001    0.999999       0.367879257
[Figure: Yield versus Chances Per Unit (10 to 1,000,000) for DPU = 1]
If we extend the concept to an infinite number of opportunities, all at a DPU of 1.0, we will
approach the value of 0.368.
Given a probability that any opportunity is a defect = # defects / (# units x # opps per unit):
To what value is the P(0) converging?
Note: Ultimately this means you need the ability to track all the individual defects which occur per
unit via your data collection system.
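The convergence shown in the table can be checked numerically. The sketch below, in Python with hypothetical function names, spreads a DPU of 1.0 over more and more opportunities and compares the result with e to the negative 1.

```python
import math


def defect_free_probability(opportunities, dpu=1.0):
    """Probability that every opportunity on a unit is defect-free."""
    p_defect = dpu / opportunities
    return (1 - p_defect) ** opportunities


for n in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} opportunities: {defect_free_probability(n):.9f}")

print(f"Limit e^-1:              {math.exp(-1):.9f}")
```

This is the reason the Poisson estimate can stand in for RTY: as the number of opportunities grows, the exact product settles on the exponential.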
RTY Prediction — Poisson Model
•  Use the binomial to estimate the probability of a discrete event (good/bad) when sampling from a
relatively large population, n > 16, & p < 0.1.
•  When r = 0 we compute the probability of finding zero defects per unit (called rolled throughput
yield).
$Y = \frac{(dpu)^{r} \, e^{-dpu}}{r!}$
•  The table below shows the proportion of product which will have
–  0 defects (r = 0)
–  1 defect (r = 1)
–  2 defects (r = 2), etc…
When DPU = 1:
r    p[r]
0    0.3679
1    0.3679
2    0.1839
3    0.0613
4    0.0153
5    0.0031
6    0.0005
7    0.0001
8    0.0000
•  When on average we have a process with 1 defect per unit then we say there is a 36.79% chance of
finding a unit with zero defects. There is only a 1.53% chance of finding a unit with 4 defects.
•  When r = 0 this equation simplifies to $e^{-dpu}$; when r = 1 it simplifies to
$(dpu) \cdot e^{-dpu}$.
•  To predict the % of units with zero defect (i.e., RTY):
–  count the number of defects found
–  count the number of units produced
–  compute the dpu and enter it in the dpu equation
The point of this slide is to demonstrate the mathematical model used to predict the probability of an
outcome of interest. It has little practical purpose other than to acquaint the Six Sigma Belt with the
math behind the tool they are learning and let them understand there is a logical basis for the equation.
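The table of p[r] values can be regenerated with a few lines of Python. This is only a sketch of the Poisson formula used on this slide; the function name is an assumption.

```python
import math


def poisson_probability(r, dpu):
    """Probability of finding exactly r defects on a unit at a given dpu."""
    return dpu ** r * math.exp(-dpu) / math.factorial(r)


for r in range(9):
    print(f"r={r}  p[r]={poisson_probability(r, 1.0):.4f}")
```

Setting r to 0 leaves only the exponential term, which is the RTY estimate used throughout this module.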
Six Sigma Metrics – Calculating DPU

The DPU for a given operation can be calculated by dividing the number of
defects found in the operation by the number of units entering the operational
step.
100 parts built
2 defects identified and corrected
dpu = 0.02
So RTY for this step would be e-.02 (.980199) or 98.02%.
RTY1=0.98 RTY2=0.98 RTY3=0.98 RTY4=0.98 RTY5=0.98 RTYTOT=0.904

dpu = .02 dpu = .02 dpu = .02 dpu = .02 dpu = .02 dpuTOT = .1
If the process had only 5 process steps with the same yield the process.
RTY would be: 0.98 * 0.98 * 0.98 * 0.98 * 0.98 = 0.903921 or 90.39%. Since our metric of
primary concern is the COPQ of this process we can say less than 9% of the time we will
be spending dollars in excess of the pre-determined standard or value added amount to
which this process is entitled.
Note: RTY’s must be multiplied across a process,

DPU’s are added across a process.
As the number of steps in a process increases we continue to multiply the yield from each step to
find the overall process yield. For the sake of simplicity let’s say we are calculating the RTY for a
process with 8 steps. Each step in our process has a yield of .98, so the RTY is 0.98 multiplied by
itself 8 times, or about 0.8508 (85.08%). Again, there will be a direct correlation between the RTY
and the dollars spent to correct errors in our process.
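The rule "multiply RTYs, add DPUs" can be checked with a short script. This is a minimal sketch using the numbers from this slide; the function names are illustrative assumptions rather than anything defined in the course.

```python
import math


def rty_from_dpu(dpu: float) -> float:
    """Estimate the yield of a step from its defects per unit: RTY = e^(-dpu)."""
    return math.exp(-dpu)


def rolled_throughput_yield(step_yields: list[float]) -> float:
    """RTYs are multiplied across a process."""
    return math.prod(step_yields)


def total_dpu(step_dpus: list[float]) -> float:
    """DPUs are added across a process."""
    return sum(step_dpus)


if __name__ == "__main__":
    print(round(rty_from_dpu(0.02), 6))                    # single step, dpu = 0.02
    print(round(rolled_throughput_yield([0.98] * 5), 6))   # 5 steps at 0.98
    print(round(rolled_throughput_yield([0.98] * 8), 4))   # 8 steps at 0.98
    print(round(rty_from_dpu(total_dpu([0.02] * 5)), 4))   # yield from the summed dpu
```

The two routes to the 5 step yield agree closely but not exactly (0.9039 versus about 0.9048) because the step yield was rounded to 0.98 before multiplying.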
Focusing our Effort – FTY vs. RTY
Assume we are creating two products in our organization that use similar processes.
Product A
FTY = 80%
Product B
FTY = 80%
How do you know what to work on?
If we chose only to examine the FTY in our decision making process it would be difficult to determine
the process and product on which our resources should be focused.
As you have seen there are many factors behind the final number for FTY. That is where we need to
look for process improvements.
Focusing our Effort – FTY vs. RTY
Let’s look at the DPU of each product assuming equal opportunities and margin…
Product A: dpu = 100 / 100 = 1 dpu
Product B: dpu = 200 / 100 = 2 dpu
Now, can you tell which to work on? The product with the highest DPU? …think again!
•  How much more time and/or raw material are required?
•  How much extra floor space do we need?
•  How much extra staff or hours are required to perform the rework?
•  How many extra shipments are we paying for from our suppliers?
•  How much testing have we built in to capture our defects?
Now we have a better idea of the questions to ask: “What does a defect cost?” and “What product
should get the focus?”
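To see why the highest DPU is not automatically the right target, here is a small sketch that ranks the two products first by DPU and then by what their defects cost. The defect and unit counts come from the slide; the cost per defect figures are purely hypothetical and are there only to show how the ranking can flip.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    defects: int
    units: int
    cost_per_defect: float  # hypothetical figure, not from the course material

    @property
    def dpu(self) -> float:
        """Defects per unit."""
        return self.defects / self.units

    @property
    def defect_cost(self) -> float:
        """Total cost of the defects found, under the assumed cost per defect."""
        return self.defects * self.cost_per_defect


def rank_by(products, key):
    """Sort products from largest to smallest on the chosen measure."""
    return sorted(products, key=key, reverse=True)


products = [
    Product("A", defects=100, units=100, cost_per_defect=50.0),
    Product("B", defects=200, units=100, cost_per_defect=10.0),
]

if __name__ == "__main__":
    print([p.name for p in rank_by(products, key=lambda p: p.dpu)])
    print([p.name for p in rank_by(products, key=lambda p: p.defect_cost)])
```

Under these assumed costs Product B has the higher DPU but Product A carries the higher defect cost, which is the point of asking “What does a defect cost?”.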
§  Describe what is meant by “Process Focus”
§  Generate a Process Map
§  Describe the importance of VOC, VOB and VOE, and CTQ’s
§  Explain COPQ
§  Describe the Basic Six Sigma metrics
§  Explain the difference between FTY and RTY
§  Explain how to calculate “Defects per Unit” (DPU)
You have now completed Define Phase – Six Sigma Fundamentals.
Lean Six Sigma

Green Belt Training
Define Phase
Selecting Projects
Now we will continue in the Define Phase with “Selecting Projects”.
Selecting Projects
Overview
The fundamentals of this phase are Selecting Projects, Refining and Defining and Financial
Evaluation.
The output of the Define Phase is a well developed and articulated project. It has been correctly
stated that 50% of the success of a project is dependent on how well the effort has been defined.
[Define Phase roadmap: Understanding Six Sigma, Six Sigma Fundamentals, Selecting Projects
(Selecting Projects, Refining & Defining, Financial Evaluation)]
Approaches to Project Selection

There are three basic approaches to Project Selection…
•  Blatantly Obvious: Things that clearly occur on a repetitive basis and present problems in
   delivering our service(s) or product(s).
•  Brainstorming Approach: Identifies projects based on individuals’ experience and tribal
   knowledge of areas that may be creating problems in delivering our service(s) / product(s) and
   hopefully tie to bottom-line business impact.
•  Structured Approach: Identifies projects based on organizational data providing a direct plan to
   affect core business metrics that have bottom-line impact.
All three ways work…the Structured Approach is the most desirable.
Here are three approaches for identifying projects. Do you know what the best approach is?
The most popular process for generating and selecting projects is by holding “brainstorming”
sessions. In brainstorming sessions a group of people get together, sometimes after polling process
owners for what “blatantly obvious” problems are occurring, and as a team try to identify and refine
a list of problems that MAY be causing issues in the organization. Furthermore in an organization
that does not have an intelligent problem-solving methodology in place, such as Six Sigma, Lean or
even TQM, what follows the project selection brainstorm is ANOTHER brainstorming session
focused on coming up with ideas on how to SOLVE these problems.
Although brainstorming itself can be very structured it falls far short of being a systematic means of
identifying projects that will reduce Cost of Poor Quality throughout the organization. Why? For
several reasons. One, it does not ensure we are dealing with the most important high-impact
problems but rather with whatever happens to be the most recent fire fight. Two, brainstorming
usually does not utilize a data based approach; it relies on tribal knowledge, experience and what
people THINK is happening. As we know, what people THINK is happening and what is ACTUALLY
happening can be two very different things.
Project Selection – Core Components
Business Case – The Business Case is a high level articulation of the area of concern. This case
answers two primary questions: one, what is the business motivation for considering the project
and two, what is our general area of focus for the improvement effort?
Project Charter – The Project Charter is a more detailed version of the Business Case. This
document further focuses the improvement effort. It can be characterized by two primary sections:
one, basic project information and, two, simple project performance metrics.
Benefits Analysis – The Benefits Analysis is a comprehensive financial evaluation of the project.
This analysis is concerned with the detail of the benefits in regard to cost & revenue impact that we
are expecting to realize as a result of the project.
With every project there must be a minimum of 3 deliverables:
•  Business Case
•  Project Charter
•  Benefits Analysis
Project Selection - Governance
                    Responsible Party           Resources                       Frequency of Update
Business Case       Champion (Process Owner)    Business Unit Members           N/A
Project Charter     Six Sigma Belt              Champion (Process Owner) &      Ongoing
                                                Master Black Belt
Benefits Analysis   Benefits Capture Manager    Champion (Process Owner) &      Ongoing / D,M,A,I,C
                    or Unit Financial Rep       Six Sigma Belt
A Structured Approach – A Starting Point
The Starting Point is defined by the Champion or Process Owner with the Business Case as the
output.
–  These are some examples of business metrics or Key Performance Indicators, commonly
   referred to as KPIs.
–  The tree diagram is used to facilitate the process of breaking down the metric of interest.
•  EBIT
•  Cycle time
•  Defects
•  Cost
•  Revenue
•  Complaints
•  Compliance
•  Safety
[Tree diagram: Cost shown at Level 1, broken down into Level 2 branches]
These are some examples of Business Metrics or Key Performance Indicators.
What metric should you focus on? It depends. What is the project focus? What are your
organization’s strategic goals? Is Cost of Sales preventing growth? Are customer complaints
resulting in lost earnings? Are excess cycle times and yield issues eroding market share? Is the
fastest growing division of the business the refurbishing department?
It depends because the motivations of organizations vary so much and all projects should be
directly aligned with the organization’s objectives. Answer the questions: What metrics is my
department not meeting? What is causing us pain?
A Structured Approach - Snapshot
The KPIs need to be broken down into actionable levels.
[Tree diagram: Business Measures / Key Performance Indicators (KPIs) at Level 1, Level 2 and
Level 3, leading to the Actionable Level of Activities and Processes]
Once a metric point has been determined another important question needs to be asked: What is
my metric a function of? In other words, what are all of the things that affect this metric?
We utilize the Tree Diagram to facilitate the process of breaking down the metric of interest.
When creating the tree diagram you will eventually run into activities which are made up of
processes. This is where projects will be focused because this is where defects, errors and
waste occur.
Business Case Components – Level 1
Primary Business Measure or Key Performance Indicator (KPI): Level 1
–  Focus on one primary business measure or KPI.
–  Primary business measure should bear a direct line of sight with the organization’s strategic
   objectives.
–  As the Champion narrows in on the greatest opportunity for improvement this provides a clear
   focus for how the success will be measured.
Be sure to start with higher level metrics. Whether they are measured at the Corporate Level,
Division Level or Department Level, projects should track to the Metrics of interest within a given
area. Primary Business Measures or Key Performance Indicators (KPIs) serve as indicators of the
success of a critical objective.
Business Case Components – Business Measures
Post business measures (product/service) of the primary business measure are lower level metrics
and must focus on the end product to avoid internal optimization at the expense of total
optimization.
[Diagram: a Primary Business Measure branching into four Business Measures, followed by
Processes and Activities]
Post business measures, be they a product or a service, are lower level metrics and must focus
on the end product.
Business Case Components – Processes
[Diagram: a Primary Business Measure branching into four Business Measures]
Y = f(x1, x2, x3, …, xn)
First Call Resolution = f(Calls, Operators, Resolutions, …, xn)
Black Box Testing = f(Specifications, Simulation, Engineering, …, xn)
Business measures are a function of processes. These processes are usually created or enforced
by direct supervision of functional managers. Processes are usually made up of a series of activities
or automated steps.
Business Case Components - Activities
[Diagram: a Primary Business Measure branching into four Business Measures]
Y = f(x1, x2, x3, …, xn)
Resolutions = f(New Customers, Existing Customers, Defective Products, …, xn)
Simulation = f(Design, Data, Modeling, …, xn)
The Activities represent the final stage of the matrix where multiple steps result in the delivery of
some output for the customer. These deliverables are set by the business and customer and
are captured within the Voice of the Customer, Voice of the Business or Voice of the Employee.
These activities are the X’s that determine the performance of the Y which is where the actual
breakthrough projects should be focused.
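The breakdown from a business measure to its processes and activities can be written down as a simple tree. The sketch below uses the First Call Resolution example from the two slides above; representing the tree as nested dictionaries, and the helper name `actionable_paths`, are our own assumptions for illustration.

```python
# Each key is a Y; its value holds the X's it is a function of.
# An empty dict means the item has not been broken down any further.
kpi_tree = {
    "First Call Resolution": {
        "Calls": {},
        "Operators": {},
        "Resolutions": {
            "New Customers": {},
            "Existing Customers": {},
            "Defective Products": {},
        },
    }
}


def actionable_paths(tree, path=()):
    """Yield the full path to every item with no further breakdown."""
    for name, children in tree.items():
        here = path + (name,)
        if children:
            yield from actionable_paths(children, here)
        else:
            yield here


if __name__ == "__main__":
    for p in actionable_paths(kpi_tree):
        print(" -> ".join(p))
```

Each printed path is a line of sight from the business measure down to a candidate area of focus; which of them deserves a project still has to be decided from data.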
What is a Business Case?
The Business Case communicates the need for the project in terms of meeting business objectives.
The components are:
–  Output unit (product/service) for external customer
–  Primary business measure of output unit for project
–  Baseline performance of primary business measure
–  Gap in baseline performance of primary business measure from business objective
The Business Case is created to ensure the strategic need for your project. It is the first step in
project description development.
Let’s get down to business!
Business Case Example
During FY 2005, the First Time Call Resolution Efficiency for New Customer Hardware Setup was
89%. This represents a gap of 8% from the industry standard of 97% that amounts to a potential of
$2,000,000 US of annualized cost impact.
Here is an example of a Business Case. This defines the problem and provides evidence of the
problem.
As you review this statement remember the following format of what needs to be in a Business
Case: WHAT is wrong, WHERE and WHEN is it occurring, what is the BASELINE magnitude at
which it is occurring and what is it COSTING me?
You must take caution to avoid under-writing a Business Case. Your natural tendency is to write too
simplistically because you are already familiar with the problem. You must remember if you are to
enlist support and resources to solve your problem others will have to understand the context and
the significance in order to support you.
The Business Case cannot include any speculation about the cause of the problem or what actions
will be taken to solve the problem. It is important you do not attempt to solve the problem or bias the
solution at this stage. The data and the Six Sigma methodology will find the true causes and
solutions to the problem.
The next step is getting project approval.
The Business Case Template
Fill in the Blanks for Your Project:
During ____________ (period of time for baseline performance), the ____________ (primary
business measure) for ____________ (a key business process) was ____________ (baseline
performance).
This gap of ____________ (business objective target vs. baseline) from ____________ (business
objective) represents ____________ (cost impact of gap) of cost impact.
You need to make sure your own Business Case captures the units of pain, the business measures, the
performance and the gaps. If this template does not seem to be clicking use your own or just free form
your Business Case ensuring it is well articulated and quantified.
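If you keep your project information in a spreadsheet or script, the template can be filled in programmatically. This is only a sketch: the function name and parameter names are our own, and the check for blank fields is an assumption about what “well articulated and quantified” should mean in practice.

```python
def business_case(period, measure, process, baseline, gap, objective, cost_impact):
    """Fill in the Business Case template; every blank must be supplied."""
    fields = {
        "period": period, "measure": measure, "process": process,
        "baseline": baseline, "gap": gap, "objective": objective,
        "cost_impact": cost_impact,
    }
    missing = [name for name, value in fields.items() if not str(value).strip()]
    if missing:
        raise ValueError(f"Business Case is not fully quantified, missing: {missing}")
    return (
        f"During {period}, the {measure} for {process} was {baseline}. "
        f"This gap of {gap} from {objective} represents {cost_impact} of cost impact."
    )


if __name__ == "__main__":
    # Values taken from the Business Case Example earlier in this module.
    print(business_case(
        period="FY 2005",
        measure="First Time Call Resolution Efficiency",
        process="New Customer Hardware Setup",
        baseline="89%",
        gap="8%",
        objective="the industry standard of 97%",
        cost_impact="$2,000,000 US annualized",
    ))
```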
Business Case Exercise
Exercise objective: To understand how to create a strong Business Case.
1.  Complete the Business Case template below to the best of your ability.
During ____________ (period of time for baseline performance), the ____________ (primary
business measure) for ____________ (a key business process) was ____________ (baseline
performance).
This gap of ____________ (business objective target vs. baseline) from ____________ (business
objective) represents ____________ (cost impact of gap) of cost impact.
Using the Excel file ‘Define Templates.xls’, Business Case, perform this exercise.
What is a Project Charter?

The Charter expands on the Business Case; it clarifies the project’s focus and measures of
project performance and is completed by the Six Sigma Belt.
Components:
•  The Problem
•  Project Scope
•  Project Metrics
   –  Primary & Secondary
•  Graphical Display of Project Metrics
   –  Primary & Secondary
•  Standard project information
   –  Project, Belt & Process Owner names
   –  Start date & desired End date
   –  Division or Business Unit
   –  Supporting Master Black Belt (Mentor)
   –  Team Members
The Project Charter is an important document – it is the initial communication of the project. The first
phases of the Six Sigma methodology are Define and Measure. These are known as
“Characterization” phases that focus primarily on understanding and measuring the problem at hand.
Therefore some of the information in the Project Charter, such as primary and secondary metrics, can
change several times. By the time the Measure Phase is wrapping up the Project Charter should be
in its final form meaning defects and the metrics for measuring them are clear and agreed upon.
As you can see some of the information in the Project Charter is self explanatory, especially the
standard project information. We are going to focus on establishing the Problem Statement and
determining the Objective Statement, the scope and the primary and secondary metrics.
Project Charter - Definitions
•  Problem Statement – Articulates the pain of the defect or error in the process.
•  Objective Statement – States how much of an improvement is desired from the project.
•  Scope – Articulates the boundaries of the project.
•  Primary Metric – The actual measure of the defect or error in the process.
•  Secondary Metric(s) – Measures of potential consequences (+ / -) as a result of changes in the
   process.
•  Charts – Graphical displays of the Primary and Secondary Metrics over a period of time.
Project Charter - Problem Statement
Migrate the Business Case into a Problem Statement…
At first the Business Case will serve as the Problem Statement; it is then refined as the Belt
learns more about the process and the defects that are occurring.
Project Charter – Objective & Scope
Consider the following for constructing your Objective & Scope:
What represents a significant improvement?
§  X amount of an increase in yield
§  X amount of defect reduction
§  Use Framing Tools to establish the initial scope
A project’s main objective is to solve a problem! The area highlighted is for articulating how much
of a reduction or improvement will yield a significant impact to the process and business.
This is the starting point for creating your project’s Objective Statement.
Pareto Analysis
The Pareto Analysis approach helps you determine which inputs are having the greatest impact on
your process.
Pareto Analysis:
•  A bar graph used to arrange information in such a manner that priorities for process
   improvement can be established.
•  The 80-20 theory was first developed in 1906 by Italian economist Vilfredo Pareto, who
   observed that wealth and power were concentrated in a relatively small proportion of the total
   population. Joseph M. Juran is credited with adapting Pareto's economic observations to
   business applications.
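The arithmetic behind a Pareto chart is simple: sort the categories from largest to smallest and keep a running cumulative percentage. The sketch below shows that calculation outside of MINITAB; the function names, category names and counts are hypothetical and chosen only for illustration.

```python
def pareto_table(counts: dict[str, int]) -> list[tuple[str, int, float]]:
    """Return (category, count, cumulative %) rows sorted from largest to smallest."""
    total = sum(counts.values())
    rows, running = [], 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows


def vital_few(counts: dict[str, int], threshold: float = 80.0) -> list[str]:
    """Categories needed, largest first, to reach the cumulative threshold."""
    selected = []
    for category, _, cumulative in pareto_table(counts):
        selected.append(category)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical defect counts by category.
defects = {"Type A": 45, "Type B": 35, "Type C": 10, "Type D": 6, "Type E": 4}

if __name__ == "__main__":
    for category, count, cumulative in pareto_table(defects):
        print(f"{category:8} {count:4} {cumulative:6.1f}%")
    print("Focus on:", vital_few(defects))
```

In this made-up data two of the five categories account for 80% of the defects, which is the kind of pattern the 80:20 rule describes.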
The 80:20 Rule Examples
•  20% of the time expended produced 80% of the results
•  80% of your phone calls go to 20% of the names on your list
•  20% of the streets handle 80% of the traffic
•  80% of the meals in a restaurant come from 20% of the menu
•  20% of the paper has 80% of the news
•  80% of the news is in the first 20% of the article

•  20% of the people cause 80% of the problems
•  20% of the features of an application are used 80% of the time
Here are some examples of the 80:20 Rule. Can you think of any other examples?
Pareto Chart - Tool

Multi level Pareto Charts are used in a drill down fashion to get to Root Cause of the tallest bar.
[Figure: Pareto Charts at Level 1, Level 2 and Level 3]
The Pareto Charts are often referred to as levels. For instance the first graph is called the first level,
the next the second level and so on. Start high and drill down. Let’s look at how we interpret this and
what it means.
Let’s look at the following example.
By drilling down from the first level we see that Department J makes up approximately 60% of the
scrap and part Z101 makes up 80% of Dept. J’s scrap.
See how we are creating focus and establishing a line of sight?
You may be eager to jump into trying to fix the problem once you have identified it. BE CAREFUL.
This is what causes rework and defects in the first place.
Follow the methodology, be patient and you will eventually be led to a solution.
[Figure: Level 2 and Level 3 Pareto Charts for the scrap example]
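It can help to translate a drill-down back into a share of the original total, so you know how much of the overall problem the tallest bar at the lowest level really represents. A minimal sketch using the percentages from the scrap example; the function name is our own.

```python
import math


def share_of_total(level_shares: list[float]) -> float:
    """Multiply the share at each drill-down level to get the share of the overall total."""
    return math.prod(level_shares)


if __name__ == "__main__":
    # Department J is about 60% of all scrap; part Z101 is 80% of Dept. J's scrap.
    z101_share = share_of_total([0.60, 0.80])
    print(f"Part Z101 is roughly {z101_share:.0%} of total scrap")
```

So a project scoped to part Z101 in Department J addresses roughly 48% of the total scrap, assuming the two percentages are measured in the same units.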
Pareto Chart - Example
§  Open MINITAB™ and select Pareto Analysis as shown above
•  Use the “Call Center.mtw” worksheet to create a Pareto Chart
What would you do with this Pareto?
When your Pareto shows up like this your focus is on the 80-20, which is across the “incorrectly
routed and dropped calls” totaling to about 80%.
Pareto Chart – Example (cont.)

Let’s look at the problem a little differently…
- Using a higher level scope for the first Pareto may help in providing focus.
- Create another Pareto as shown below.
This gives a deeper picture of which product category contributes the highest defect count.
Now we have something to work with. Notice the 80% area…. draw a line from the 80% mark
across to the cumulative percent line (Red Line) in the graph as shown here. Which cards create
the highest Defect Rates?
Now you are beginning to see what needs work to improve the performance of your project.
Pareto Chart – Example (cont.)
Now that we have an area of focus drill down one more level.
–  This chart will only use the classifications within the first bar on the
previous chart.
–  Create another Pareto which will drill down to the categories within
the Card Type from the previous Pareto.
Remember to keep focused on finding the biggest bang for the buck.
Now what? We have got ourselves another Pareto …
Essentially this tells us there is clear direction to the major defects within the Platinum Business
Accounts.
Project Charter – Primary Metric
Establishing the Primary Metric:
- Quantified measure of the defect
- Serves as the indicator of project success
- Links to the KPI or Primary Business measure
- Only one Primary Metric per project
The Primary Metric is a very important measure in the Six Sigma project; this metric is a
quantified measure of the defect or primary issue of the project.
Moving on to the next element of the Project Charter… Using the Excel file ‘Define
Templates.xls’, Project Charter, perform the following exercise:
Since we will be narrowing in on the defect through the Measure Phase it is common for the
Primary Metric to change several times while we struggle to understand what is happening in our
process of interest.
The Primary Metric also serves as the gauge for when we can claim victory with the project.
Project Charter – Secondary Metrics

Establishing Secondary Metric(s):
- Measures positive & negative consequences as a result of changes in the process
- Can have multiple Secondary Metrics
Secondary Metrics are put in place to measure potential changes that may occur as a result of
making changes to our Primary Metric. They will measure ancillary changes in the process, both
positive and negative.
Consider a project focused on improving duration of call times (cycle time) in a call center. If we
realize a reduction in call time you would want to know if anything else was affected. Think about
it…did overtime increase or decrease, did labor increase or decrease, what happened to customer
satisfaction ratings? These are all things that should be measured in order to accurately capture
the true effect of the improvement.
Project Charter – Metric Charts

Generating Charts:
–  Displays Primary and Secondary Metrics over time
–  Should be updated regularly throughout the life of the project
–  One for Primary Metric and one for each of the Secondary Metrics
–  Typically utilize Time Series Plots
Primary and Secondary Metrics should be continually measured and frequently updated during the
project’s lifecycle. Use them as your gauge of Project Success and Status. This is where your
Project’s progress will be apparent.
The Project Charter template includes the graphing capabilities shown here. It is acceptable to not
use this template but in any case ensure you are regularly measuring the critical metrics.
Project Charter Exercise
Exercise objective: To begin planning the Project Charter deliverable.
1.  Complete the Project Charter template to the best of your ability.
2.  Be prepared to explain all aspects of this charter to your mentor.
Using the Excel file ‘Define Templates.xls’, Project Charter, perform this exercise.
Project Charter Template.xls
What is the Financial Evaluation?
The Financial Evaluation establishes the value of the project.
The components are:
–  Impact
   •  Sustainable
   •  One-off
–  Allocations
   •  Cost Codes / Accounting System
–  Forecast
   •  Cash flow
   •  Realization schedule
OK, let’s add it up!
Typically a financial representative is responsible for evaluating the financial impact of the
project. The Belt works in coordination with the financial representative to supply the proper
information.
Standard financial principles should be followed at the beginning and end of the project to provide a
true measure of the improvement’s effect on the organization.
A financial representative of the firm should establish guidelines on how savings will be calculated
throughout the Lean Six Sigma deployment.
Benefits Capture - Calculation “Template”
Whatever your organization’s protocol may be these aspects should be accounted for within any
improvement project.
IMPACT: Sustainable Impact, One-Off Impact
   There are two types of Impact: One-Off & Sustainable.
COST CODES: Reduced Costs, Increased Revenue, Costs, Implementation, Capital
   Cost Codes allocate the impact to the appropriate area in the “Books”.
FORECAST: Realization Schedule (Cash Flow), By Period (i.e. Q1, Q2, Q3, Q4)
   Forecasts allow for proper management of projects and resources.
Benefits Capture - Basic Guidelines
•  Benefits should be calculated on the baseline of key business process performance that relate
   to a business measure or KPI(s).
•  The Project Measure (Primary Metric) has to have a direct link between the process and its KPIs.
•  Goals have to be defined realistically to avoid under or over setting.
•  Benefits should be annualized.
•  Benefits should be measured in accordance with Generally Accepted Accounting Principles
   (GAAP).
When calculating project benefits you should follow these steps.
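As one illustration of “Benefits should be annualized”, the sketch below scales a benefit observed over part of a year up to 12 months and spreads it evenly across four quarters. This is a simplifying assumption for illustration only: the function names and figures are hypothetical, and your financial representative’s guidelines determine how savings are actually calculated and phased.

```python
def annualized_benefit(observed_benefit: float, months_observed: int) -> float:
    """Scale a benefit observed over some months to a 12 month figure."""
    if months_observed <= 0:
        raise ValueError("months_observed must be positive")
    return observed_benefit * 12 / months_observed


def quarterly_schedule(annual_benefit: float) -> dict[str, float]:
    """Spread an annual benefit evenly by period (an assumed, flat realization schedule)."""
    return {f"Q{q}": annual_benefit / 4 for q in range(1, 5)}


if __name__ == "__main__":
    # Hypothetical: 150,000 of savings measured over a 3 month baseline comparison.
    annual = annualized_benefit(150_000, 3)
    print(annual)
    print(quarterly_schedule(annual))
```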
Benefits Capture - Categorization

Here is an example of how to categorize your project’s impact.
A – Projects directly impacting the Income Statement or Cash Flow Statement.
B – Projects impact the Balance Sheet (working capital).
C – Projects avoid expense or investment due to known or expected events in the future (cost
    avoidance).
D – Projects are risk management, insurance, Safety, Health, Environment and Community related
    projects which prevent or reduce severity of unpredictable events.
You do not want to take this one home!
Benefits Calculation Involvement & Responsibility
Stage                Involved & Responsible
Project Selection    Financial Representative; Champion & Process Owner
D-M-A-I-C            Financial Representative; Black Belt
Implementation       Financial Representative; Champion & Process Owner
6 Month Audit        Financial Representative; Process Owner
It is highly recommended that you follow the involvement governance shown here.
Benefits Capture - Summary
•  Performance tracking for Six Sigma Projects should use the same discipline that would be used
   for tracking any other high-profile projects.
•  The A-B-C-D categories can be used to illustrate the impact of your project or a portfolio of
   projects.
•  Establish the Governance Grid for Responsibility & Involvement.
This is the one we want!
Just some recommendations to consider when running your projects or program.
Benefits Calculation Template
The Benefits Calculation Template facilitates and aligns with the aspects discussed for Project
Accounting.
See the Excel file ‘Define Templates.xls’, Benefits Calculation Template.
Selecting Projects
§  Understand the various approaches to project selection
§  Articulate the benefits of a “Structured Approach”
§  Refine and Define the business problem into a Project Charter to display critical aspects of an
   improvement project
§  Make initial financial impact estimate
You have now completed Define Phase – Selecting Projects.
Lean Six Sigma

Green Belt Training
Define Phase
Elements of Waste
Now we will continue in the Define Phase with “Elements of Waste”.
Elements of Waste
Overview
The fundamentals of this phase are the 7 components of waste and 5S.
We will examine the meaning of each of these and show you how to apply them.
[Define Phase roadmap: Understanding Six Sigma, Six Sigma Fundamentals, Selecting Projects,
then this module’s topics: 7 Components of Waste and 5S]
Definition of Lean
Lean Enterprise is based on the premise that anywhere work is being done, waste is being
generated.
The Lean Enterprise seeks to organize its processes to the optimum level, through the continual
focus on the identification and elimination of waste.
-- Barbara Wheat, Lean Six Sigma Master Black Belt
Lean – History
1885: Craft Production
- Machine then harden
- Fit on assembly
- Customization
- Highly skilled workforce
- Low production rates
- High cost

1913: Mass Production
- Part inter-changeability
- Moving production line
- Production engineering
- "Workers don't like to think"
- Unskilled labor
- High production rates
- Low cost
- Persistent quality problems
- Inflexible models

1955 - 1990: Toyota Production System
- Worker as problem solver
- Worker as process owner enabled by:
  -- Training
  -- Upstream quality
  -- Minimal inventory
  -- Just-in-time
- Eliminate waste
- Responsive to change
- Low cost
- Improving productivity
- High quality product

1993 - : Lean Enterprise
- "Lean" applied to all functions in enterprise value stream
- Optimization of value delivered to all stakeholders and enterprises in value chain
- Low cost
- Improving productivity
- High quality product
- Greater value for stakeholders
Lean Manufacturing has been going on for a very long time; however, the phrase is credited to
James Womack in 1990. The small list of accomplishments noted above is primarily focused on
higher volume manufacturing.
Lean Six Sigma

Lean Six Sigma combines the strengths of each system:
•  Lean
   –  Guiding principles based operating system
   –  Relentless elimination of all waste
   –  Creation of process flow and demand pull
   –  Resource optimization
   –  Simple and visual
   Strength: Efficiency
•  Six Sigma
   –  Focus on voice of the customer
   –  Data and fact based decision making
   –  Variation reduction to near perfection levels
   –  Analytical and statistical rigor
   Strength: Effectiveness
An Extremely Powerful Combination!
The essence of Lean is to concentrate effort on removing waste while improving process flow to
achieve speed and agility at lower cost. The focus of Lean is to increase the percentage of
value-added work performed by a company. Lean recognizes most businesses spend a relatively
small portion of their energies on the true delivery of value to a customer. While all companies are
busy it is estimated for some companies that as little as 10% of their time is spent on value-added
work, meaning as much as 90% of time is allocated to non value-added activities, or waste.
Forms of waste include: Wasted capital (inventory), wasted material (scrap), wasted time (cycle time),
wasted human effort (inefficiency, rework) and wasted energy (energy inefficiency). Lean is a
prescriptive methodology for relatively fast improvements across a variety of processes, from
administrative to manufacturing applications. Lean enables your company to identify waste where it
exists. It also provides the tools to make improvements on the spot.
Lean Six Sigma (cont.)

Lean focuses on what it calls the Value Stream, the sequence of activities and work required to
produce a product or to provide a service. It is similar to a Linear Process Flow Map but it
contains its own unique symbols and data. The Lean method is based on understanding how the
Value Stream is organized, how work is performed, which work is value added versus non-value
added and what happens to products and services and information as they flow through the Value
Stream. Lean identifies and eliminates the barriers to efficient flow through simple, effective tools.
Lean removes many forms of waste so Six Sigma can focus on eliminating variability. Variation
leads to defects, which is a major source of waste. Six Sigma is a method to make processes
more capable through the reduction of variation. Thus the symbiotic relationship between the two
methodologies.
Project Requirements for Lean
•  Perhaps one of the most criminal employee performance issues in today’s organizations is
   generated not by a desire to cheat one’s employer but rather by a lack of regard to waste.
•  In every work environment there are multiple opportunities for reducing the non-value added
   activities that have (over time) become an ingrained part of the standard operating procedure.
•  These non-value added activities have become so ingrained in our process they are no longer
   recognized for what they are, WASTE.
•  waste (n.) Anything other than the minimum amount of time, material, people, space, energy,
   etc. needed to add value to the product or service you are providing.
•  The Japanese word for waste is muda.
Get that stuff outta here!
Employees at some level have been de-sensitized to waste: “That is what we have always
done.”
Lean brings these opportunities for savings back into focus with specific approaches to finding
and eliminating waste.
Seven Components of Waste
Muda is classified into seven components:

–  Overproduction
–  Correction (defects)
–  Inventory
–  Motion
–  Overprocessing
–  Conveyance
–  Waiting
Sometimes additional forms of muda are added:

–  Under use of talent
–  Lack of safety
Being Lean means eliminating waste.
Overproduction
Overproduction is producing more than the next step needs or more than the customer buys.
–  It may be the worst form of waste because it contributes to all the others.
Examples are:
•  Preparing extra reports
•  Reports not acted upon or even read
•  Multiple copies in data storage
•  Over-ordering materials
•  Duplication of effort/reports
Waste of Overproduction relates to the excessive accumulation of work-in-process (WIP) or
finished goods inventory.
It includes producing more parts than necessary to satisfy the customer’s quantity demand, which
leads to idle capital invested in inventory, and producing parts at a rate faster than required such
that a work-in-process queue is created: again, idle capital.
Correction
Correction of defects is as obvious as it sounds.
Examples are:
•  Incorrect data entry
•  Paying the wrong vendor
•  Misspelled words in communications
•  Making bad product
•  Materials or labor discarded during production
Eliminate erors!!
Waste of Correction includes the waste of handling
and fixing mistakes. This is common in both
manufacturing and transactional settings.
Correcting or repairing a defect in materials or parts adds unnecessary costs because of additional
equipment and labor expenses. An example is the labor cost of scheduling employees to work
overtime to rework defects.
Inventory
Inventory is the liability of materials that are bought, invested in and not immediately sold or used.
Examples are:
•  Transactions not processed
•  Bigger in-box than out-box
•  Over-ordering materials consumed in-house
•  Over-ordering raw materials – just in case
Waste of Inventory is identical to overproduction except it refers to the waste of acquiring raw
material before the exact moment it is needed.
Inventory is a drain on an organization’s overhead. The greater the inventory, the higher the
overhead costs become. If quality issues arise and inventory is not minimized, defective material is
hidden in finished goods.
To remain flexible to customer requirements and to control product variation we must minimize
inventory. Excess inventory masks unacceptable change-over times, excessive downtime, operator
inefficiency and a lack of organizational sense of urgency to produce product.
Motion
Motion is the unnecessary movement of people and equipment.
–  This includes looking for things like documents or parts as well as movement that is straining.
Examples are:
•  Extra steps
•  Extra data entry
•  Having to look for something
Waste of Motion examines how people move to ensure that value is added.
It covers any movement of people or machinery that does not contribute added value to the product,
e.g. programming delay times and excessive walking distance between operations.
Overprocessing
Overprocessing is tasks, activities and materials that do not add value.
–  Can be caused by poor product or tool design as well as from not understanding what the
customer wants.
Examples are:
•  Sign-offs
•  Reports containing more information than the customer wants or needs
•  Communications, reports, emails, contracts, etc. containing more than the necessary points
(briefer is better)
•  Voice mails that are too long
Waste of Overprocessing relates to over-processing anything that may not be adding value in the
eyes of the customer.
It is processing work that has no connection to advancing the line or improving the quality of the
product. Examples include typing memos that could be handwritten or painting components or
fixtures internal to the equipment.
Conveyance
Conveyance is the unnecessary movement of material and goods.
–  Steps in a process should be located close to each other so movement is minimized.
Examples are:
•  Extra steps in the process
•  Distance traveled
•  Moving paper from place to place
Waste of Conveyance is the movement of material.
Conveyance is an incidental, required action that does not directly contribute value to the product.
Perhaps the material must be moved; however, the time and expense incurred do not produce product
or service characteristics that customers see.
It is vital to avoid conveyance unless it is supplying items when and where they are needed (i.e.
just-in-time delivery).
Waiting
Waiting is nonproductive time due to lack of material, people or equipment.
–  Can be due to slow or broken machines, material not arriving on time, etc.
Examples are:
•  Processing once each month instead of as the work comes in
•  Showing up on time for a meeting that starts late
•  Delayed work due to lack of communication from another internal group
Waste of Waiting is the cost of an idle resource.
It is idle time between operations or events, e.g. an employee waiting for a machine cycle to finish
or a machine waiting for the operator to load new parts.
Waste Identification Exercise
Exercise objective: To identify waste that occurs in your processes.
Write an example of each type of Muda below:
–  Overproduction ___________________
–  Correction ___________________
–  Inventory ___________________
–  Motion ___________________
–  Overprocessing ___________________
–  Conveyance ___________________
–  Waiting ___________________
5S – The Basics
5S is a process designed to organize the workplace, keep it neat and clean, maintain standardized
conditions and instill the discipline required to enable each person to achieve and maintain a
world class work environment.
•  Seiri – Put things in order
•  Seiton – Proper Arrangement
•  Seiso – Clean
•  Seiketsu – Purity
•  Shitsuke – Commitment
The term “5S” derives from the Japanese words for five practices leading to a clean and
manageable work area. The five “S” are:
‘Seiri' means to separate needed tools, parts and instructions from unneeded materials and to
remove the latter.
'Seiton' means to neatly arrange and identify parts and tools for ease of use.
'Seiso' means to conduct a cleanup campaign.
'Seiketsu' means to conduct seiri, seiton and seiso at frequent, indeed daily, intervals to maintain a
workplace in perfect condition.
'Shitsuke' means to form the habit of always following the first four S’s.
Simply put, 5S means the workplace is clean, there is a place for everything and everything is in its
place. The 5S will create a work place that is suitable for and will stimulate high quality and high
productivity work. Additionally it will make the workplace more comfortable and a place of which you
can be proud.
Developed in Japan, this method assumes no effective, quality job can be done without a clean and
safe environment and without behavioral rules.
The 5S approach allows you to set up a well adapted and functional work environment, ruled by
simple yet effective rules. 5S deployment is done in a logical and progressive way. The first three S’s
are workplace actions, while the last two are sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
English Translation
There have been many attempts to force 5 English “S” words to maintain the original intent of 5S
from Japanese. Listed below are typical English words used to translate:
1. Sort (Seiri)
2. Straighten or Systematically Arrange (Seiton)
3. Shine or Spic and Span (Seiso)
4. Standardize (Seiketsu)
5. Sustain or Self-Discipline (Shitsuke)
•  Sort: Identify necessary items and remove unnecessary ones, use time management.
•  Straighten: Place things in such a way they can be easily reached whenever they are needed.
•  Shine: Visual sweep of areas, eliminate dirt, dust and scrap. Make workplace shine.
•  Standardize: Work to standards, maintain standards, wear safety equipment.
•  Self-Discipline: Make 5S strong in habit. Make problems appear and solve them.
Regardless of which “S” words you use the intent is clear: Organize the workplace, keep it neat
and clean, maintain standardized conditions and instill the discipline required to enable each
individual to achieve and maintain a world class work environment.
5S Exercise
Exercise objective: To identify elements of 5S in your workplace.
Write an example for each of the 5S’s below:
•  Sort ____________________
•  Straighten ____________________
•  Shine ____________________
•  Standardize ____________________
•  Self-Discipline ____________________
Elements of Waste
§  Describe 5S
§  Identify and describe the 7 Elements of Waste
§  Provide examples of how Lean Principles can affect your area
You have now completed Define Phase – Elements of Waste.
Lean Six Sigma

Green Belt Training
Define Phase
Wrap Up and Action Items
Now we will conclude the Define Phase with “Wrap Up and Action Items”.
Define Phase Overview—The Goal
The goal of the Define Phase is to:
•  Identify a process to improve and develop a specific Lean Six Sigma project.
–  Lean Six Sigma Belts define critical processes, sub-processes and identify the decision points
in those processes.
•  Define is the contract phase of the project. We are determining exactly what we intend to work
on and estimating the impact to the business.
•  At the completion of the Define Phase you should have a description of the process defect that
is creating waste for the business.
Define Action Items
At this point you should all understand what is necessary to complete these action items
associated with Define.
–  Charter Benefits Analysis
–  Team Members
–  Process Map – high level
–  Primary Metric
–  Secondary Metric(s)
–  Lean Opportunities
–  Stakeholder Analysis
–  Project Plan
–  Issues and Barriers
Deliver the Goods!
Six Sigma Behaviors
•  Being tenacious, courageous
•  Being rigorous, disciplined
•  Making data-based decisions
•  Embracing change & continuous learning
•  Sharing best practices
Walk the Walk!
Each player in the Six Sigma process must be A ROLE MODEL for the Six Sigma culture.
Define Phase — The Roadblocks
Look for the potential roadblocks and plan to address them before they become problems:
–  No historical data exists to support the project.
–  Team members do not have the time to collect data.
–  Data presented is the best guess by functional managers.
–  Data is communicated from poor systems.
–  The project is scoped too broadly.
–  The team creates the ideal Process Map rather than the as-is Process Map.
Clear the road – I’m comin’ through!
DMAIC Roadmap
[Figure: DMAIC Roadmap for the Champion/Process Owner. Phases: Define, Measure, Analyze, Improve, Control. Recoverable step labels: Estimate COPQ, Establish Team, Prove/Disprove Impact X’s Have On Problem.]
Define Phase Deployment

The importance of the Define Phase is to begin to understand the problem and formulate it into a
project. Notice that if the Recommended Project Focus is approved the next step would be team
selection.
Business Case Selected
Notify Belts and Stakeholders
Create High-Level Process Map
(Pareto, Project Desirability)
Define & Charter Project (Problem Statement, Objective, Primary Metric, Secondary Metric)
Estimate COPQ
Recommend Project Focus
Approved Project Focus? (Y / N)
Create Team
Charter Team
Ready for Measure
Action Items Support List
Define Questions
Step One: Project Selection, Project Definition And Stakeholder Identification
Project Charter
• What is the Problem Statement? Objective?
• Is the Business Case developed?
• What is the primary metric?
• What are the secondary metrics?
• Why did you choose these?
• What are the benefits?
• Have the benefits been quantified? If not, when will this be done?
Date:____________________________
• Who is the customer (internal/external)?
• Has the COPQ been identified?
• Has the controller’s office been involved in these calculations?
• Who are the members on your team?
• Does anyone require additional training to be fully effective on the team?
Voice of the Customer (VOC) and SIPOC defined
• Voice of the Customer identified?
• Key issues with stakeholders identified?
• VOC requirements identified?
• Business Case data gathered, verified and displayed?
Step Two: Process Exploration
Processes Defined and High Level Process Map
• Are the critical processes defined and decision points identified?
• Are all the key attributes of the process defined?
• Do you have a high level process map?
• Who was involved in its development?
General Questions
• Are there any issues/barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
• Have you completed your initial Define report out presentation?
These are some additional questions to ensure all the deliverables are achieved.
At this point you should:
§  Have a clear understanding of the specific action items
§  Have started to develop a project plan to complete the action items
§  Have identified ways to deal with potential roadblocks
§  Be ready to apply the Six Sigma method within your business
You have now completed Define Phase.
Lean Six Sigma

Green Belt Training
Measure Phase
Welcome to Measure
Now that we have completed Define we are going to jump into the Measure Phase.
Here you enter the world of measurement where you can discover the ultimate source of
problem-solving power: data. Process improvement is all about narrowing down to the vital few
factors that influence the behavior of a system or a process. The only way to do this is to
measure and observe your process characteristics and your critical-to-quality characteristics.
Measurement is generally the most difficult and time-consuming phase in the DMAIC
methodology. But if you do it well, and right the first time, you will save yourself a lot of trouble
later and maximize your chance of improvement.
“Welcome to Measure” will give you a brief look at the topics we are going to cover.
Overview
These are the modules we will cover in the Measure Phase.
Welcome to Measure
Process Discovery
Six Sigma Statistics
Measurement System Analysis
Process Capability
Wrap Up & Action Items
DMAIC Roadmap
[Figure: DMAIC Roadmap for the Champion/Process Owner. Phases: Define, Measure, Analyze, Improve, Control. Recoverable step labels: Estimate COPQ, Establish Team.]
Here is the overview of the DMAIC process. Within measure we are going to start getting into
details about process performance, measurement systems and variable prioritization.
Measure Phase Deployment
Detailed Problem Statement Determined
Detailed Process Mapping
Identify All Process X’s Causing Problems (Fishbone, Process Map)
Select the Vital Few X’s Causing Problems (X-Y Matrix, FMEA)
Assess Measurement System
Repeatable & Reproducible? (Y / N)
If N: Implement Changes to Make System Acceptable
Assess Stability (Statistical Control)
Assess Capability (Problem with Centering/Spread)
Estimate Process Sigma Level
Review Progress with Champion
Ready for Analyze
This provides a process look at putting “Measure” to work. By the time we complete this phase you
will have a thorough understanding of the various Measure Phase concepts.
Lean Six Sigma

Green Belt Training
Measure Phase
Process Discovery
Now we will continue in the Measure Phase with “Process Discovery”.
Overview
Welcome to Measure
Process Discovery
Cause and Effect Diagrams
Detailed Process Mapping
FMEA
Six Sigma Statistics
Measurement System Analysis
Process Capability
The purpose of this module is highlighted above. We will review tools to help facilitate Process
Discovery.
This will be a lengthy step as it requires a full characterization of your selected process.
There are four key deliverables from the Measure Phase:

(1) A robust description of the process and its workflow
(2) A quantitative assessment of how well the process is actually working
(3) An assessment of any measurement systems used to gather data for making decisions or to
describe the performance of the process
(4) A “short” list of the potential causes of our problem; these are the X’s that are most likely
related to the problem.
On the next lesson page we will help you develop a visual and mental model that will give you
leverage in finding the causes to any problem.
Overview of Brainstorming Techniques
We utilize Brainstorming techniques to populate a Cause and Effect Diagram seeking ALL possible
causes for our issue of concern.
[Figure: Cause and Effect Diagram. The X’s (Causes) are grouped into the Categories People, Machine, Method, Material, Measurement and Environment and lead to the Y (the Problem or Condition).]
You will need to use brainstorming techniques to identify all possible problems and their causes.
Brainstorming techniques work because the knowledge and ideas of two or more persons are always
greater than those of any one person.
Brainstorming will generate a large number of ideas or possibilities in a relatively short time.
Brainstorming tools are meant for teams, but can be used at the individual level also. Brainstorming
will be a primary input for other improvement and analytical tools you will use.
You will learn two excellent brainstorming techniques, Cause and Effect Diagrams and affinity
diagrams. Cause and Effect Diagrams are also called Fishbone Diagrams because of their
appearance and are sometimes called Ishikawa diagrams after their inventor.
In a brainstorming session ideas are expressed by those in the session and written down without
debate or challenge. The general steps of a brainstorming session are:
1.  Agree on the category or condition to be considered.
2.  Encourage each team member to contribute.
3.  Discourage debates or criticism; the intent is to generate ideas and
not to qualify them. That will come later.
4.  Contribute in rotation (take turns) or free flow, and ensure every member
has an equal opportunity.
5.  Listen to and respect the ideas of others.
6.  Record all ideas generated about the subject.
7.  Continue until no more ideas are offered.
8.  Edit the list for clarity and duplicates.
Cause and Effect Diagram
[Figure: Cause and Effect Diagram. The X’s (Causes) are grouped into the Categories People, Machine, Method, Material, Measurement and Environment and lead to the Y (the Problem or Condition).]
Categories for the legs of the diagram can use templates for products or transactional symptoms. Or
you can select the categories by process step or what you deem appropriate for the situation.
Products
–  Measurement
–  People
–  Method
–  Materials
–  Equipment
–  Environment
Transactional
–  People
–  Policy
–  Procedure
–  Place
–  Measurement
–  Environment
A Cause and Effect Diagram is a composition of lines and words representing a meaningful
relationship between an effect, or condition, and its causes. To focus the effort and facilitate thought,
the legs of the diagram are given categorical headings. Two common templates for the headings
are for product related and transactional related efforts. Transactional is meant for processes where
there is no traditional or physical product; rather it is more like an administrative process.
Transactional processes are characterized as processes dealing with forms, ideas, people,
decisions and services. You would most likely use the product template for determining the cause of
burnt pizza and use the transactional template if you were trying to reduce order defects from the
order taking process. A third approach is to identify all categories as you best perceive them.
When performing a Cause and Effect Diagram keep drilling down, always asking why, until you find
the Root Causes of the problem. Start with one category and stay with it until you have exhausted
all possible inputs then move to the next category. The next step is to rank each potential cause by
its likelihood of being the Root Cause. Rank it by the most likely as a 1, second most likely as a 2
and so on. This may take some time; you may even have to create sub-sections like 2a, 2b, 2c,
etc. Then come back to reorder the sub-section into the larger ranking. This is your first attempt at
really finding the Y = f(X); remember the funnel? The top X’s have the potential to be the Critical X’s,
those X’s which exert the most influence on the output Y.
Finally you will need to determine if each cause is a Control Factor or a Noise Factor. This, as you
know, is a requirement for the characterization of the process. Next we will explain the meaning and
methods of using some of the common categories.
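The ranking step described above can be sketched in a few lines of Python. This is a minimal sketch and not part of the course material: the cause names and their ranks are hypothetical values for the burnt pizza example, and the helper name `rank_key` is chosen only for illustration. It shows how sub-section labels such as 2a and 2b slot back into the larger ranking.

```python
import re


def rank_key(rank: str) -> tuple:
    """Split a rank label such as '2b' into (2, 'b') so labels sort naturally."""
    match = re.fullmatch(r"(\d+)([a-z]?)", rank)
    if match is None:
        raise ValueError(f"unrecognized rank label: {rank!r}")
    return int(match.group(1)), match.group(2)


# Hypothetical potential causes of burnt pizza, keyed by the team's rank label.
ranked_causes = {
    "2b": "Timer not set",
    "1": "Oven temperature too high",
    "3": "Dough too thin",
    "2a": "Pizza left in oven too long",
}

# Reorder the sub-sections into the larger ranking: 1, 2a, 2b, 3.
ordered = sorted(ranked_causes.items(), key=lambda item: rank_key(item[0]))

for rank, cause in ordered:
    print(rank, cause)
```

The causes at the top of the ordered list are the candidates for the Critical X’s.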
The Measurement category groups Root Causes related to the measurement and
measuring of a process activity or output:
Examples of questions to ask:
•  Is there a metric issue?
•  Is there a valid measurement system? Is the data good enough?
•  Is data readily available?
The People category groups Root Causes related to people, staffing and
organizational structure:
Examples of questions to ask:
•  Are people trained, do they have the right skills?
•  Is there person to person variation?
•  Are people over-worked, under-worked?
The Method category groups Root Causes related to how the work is done, the
way the process is actually conducted:
Examples of questions to ask:
•  How is this performed?
•  Are procedures correct?
•  What might be unusual?
The Materials category groups Root Causes related to parts, supplies, forms or
information needed to execute a process:
•  Are bills of material current?
•  Are parts or supplies obsolete?
•  Are there defects in the materials?
The Equipment category groups Root Causes related to tools used in the
process:
•  Have machines been serviced recently, what is the uptime?
•  Have tools been properly maintained?
•  Is there variation?
The Environment (a.k.a. Mother Nature) category groups Root Causes related to
our work environment, market conditions and regulatory issues.
•  Is the workplace safe and comfortable?
•  Are outside regulations impacting the business?
•  Does the company culture aid the process?
Classifying the X’s
The Cause & Effect Diagram is a tool to generate opinions about possible causes for defects.
For each of the X’s identified in the diagram classify them as follows:
–  Controllable: C (Knowledge)
–  Procedural: P (People, Systems)
–  Noise: N (External or Uncontrollable)
Think of procedural as a subset of controllable. Unfortunately many procedures within a company
are not well controlled and can cause the defect level to increase. The classification methodology
is used to separate the X’s so they can be used in the X-Y Matrix and the FMEA taught later in this
module.
WHICH X’s CAUSE DEFECTS?
The Cause and Effect Diagram is an organized way to approach brainstorming. This approach allows
us to further organize ourselves by classifying the X’s into Controllable, Procedural or Noise types.
Chemical Purity Example
Effect (Y): Chemical Purity
Measurement
–  Incoming QC (P)
–  Measurement Method (P)
–  Measurement Capability (C)
Manpower
–  Training on method (P)
–  Insufficient staff (C)
–  Skill Level (P)
–  Adherence to procedure (P)
–  Work order variability (N)
Materials
–  Raw Materials (C)
–  Multiple Vendors (C)
–  Specifications (C)
Methods
–  Startup inspection (P)
–  Handling (P)
–  Purification Method (P)
–  Data collection/feedback (P)
Mother Nature
–  Room Humidity (N)
–  RM Supply in Market (N)
–  Shipping Methods (C)
Equipment
–  Column Capability (C)
–  Nozzle type (C)
–  Temp controller (C)
This example of the Cause and Effect Diagram is of chemical purity. Notice how the input variables for
each branch are classified as Controllable, Procedural and Noise.
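The classification in this example can also be held as data, which makes it easy to separate the X’s by type before they go into later tools. The following is a minimal Python sketch, not part of the course material. The causes and their C/P/N codes are taken from the chemical purity diagram; the assignment of each cause to a branch is a reading of the flattened figure and should be treated as an assumption, as should the variable names.

```python
from collections import defaultdict

# (branch, cause, type) where C = Controllable, P = Procedural, N = Noise.
# Branch assignments are an assumption based on the layout of the diagram.
causes = [
    ("Measurement", "Incoming QC", "P"),
    ("Measurement", "Measurement Method", "P"),
    ("Measurement", "Measurement Capability", "C"),
    ("Manpower", "Training on method", "P"),
    ("Manpower", "Insufficient staff", "C"),
    ("Manpower", "Skill Level", "P"),
    ("Manpower", "Adherence to procedure", "P"),
    ("Manpower", "Work order variability", "N"),
    ("Materials", "Raw Materials", "C"),
    ("Materials", "Multiple Vendors", "C"),
    ("Materials", "Specifications", "C"),
    ("Methods", "Startup inspection", "P"),
    ("Methods", "Handling", "P"),
    ("Methods", "Purification Method", "P"),
    ("Methods", "Data collection/feedback", "P"),
    ("Mother Nature", "Room Humidity", "N"),
    ("Mother Nature", "RM Supply in Market", "N"),
    ("Mother Nature", "Shipping Methods", "C"),
    ("Equipment", "Column Capability", "C"),
    ("Equipment", "Nozzle type", "C"),
    ("Equipment", "Temp controller", "C"),
]

# Group the X's by classification type.
by_type = defaultdict(list)
for branch, cause, kind in causes:
    by_type[kind].append(cause)

counts = {kind: len(items) for kind, items in by_type.items()}
print(counts)
```

Grouping by type keeps the Noise factors, which the team cannot control, visibly separate from the Controllable and Procedural X’s.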
Cause and Effect Diagram - MINITAB™
The Fishbone Diagram shown here for surface flaws was generated in MINITAB™. We will now
review the various steps for creating a Cause and Effect Diagram using the MINITAB™
statistical software package.
Open the MINITAB™ project “Measure Data Sets.mpj” and select the worksheet
“Surfaceflaws.mtw”.
Take a few moments to study the worksheet. Notice the first 6 columns are the classic bones for a
Fishbone. Each subsequent column is labeled for one of the X’s listed in one of the first six columns
and holds the secondary bones.
After you have entered the Labels click on the first field under the “Causes” column to bring up the
list of branches on the left hand side. Next double-click the first branch name on the left hand side to
move “C1 Man” underneath “Causes”.
To continue identifying the secondary branches select the button, “Sub…” to the right of the
“Label” column.
Click on the third field under “Causes” to bring up the list of branches on the left hand side.
Next double-click the seventh branch name on the left hand side to move “C7 Training” underneath
“Causes” then select “OK” and repeat for each remaining sub branch.
To adjust the Cause and Effect Diagram so the main cause titles are not rolled, grab the line
with your mouse and slide the entire bone.
Cause & Effect Diagram Exercise
Exercise objective: Create a Fishbone Diagram.
1.  Retrieve the high level Process Map for your project and use it to complete a Fishbone. If
possible include your project team.
Don’t let the big one get away!
Overview of Process Mapping
In order to correctly manage a process you must be able to describe it in an easily understood
manner.
–  The preferred method for describing a process is to
identify it with a generic name, show the workflow with
a Process Map and describe its purpose with an
operational description.
–  The first activity of the Measure Phase is to adequately
describe the process under investigation.
[Figure: linear Process Map: Start, Step A, Step B, Step C, Step D, Finish, with an Inspect step.]
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and steps
necessary to produce a product or a service. The preferred method for describing a process is to
identify it with a generic name, show the workflow with a Process Map and describe its purpose with
an operational description.
Remember a process is a blending of inputs to produce some desired output. The intent of each task,
activity and step is to add value, as perceived by the customer, to the product or service we are
producing. You cannot discover if this is the case until you have adequately mapped the process.
There are many reasons for creating a Process Map:

- It helps all process members understand their part in the process and how their process fits into the
bigger picture.
- It describes how activities are performed and how the work effort flows, it is a visual way of standing
above the process and watching how work is done. In fact, Process Maps can be easily uploaded
into model and simulation software where computers allow you to simulate the process and visually
see how it works.
- It can be used as an aid in training new people.
- It will show you where you can take measurements that will help you to run the process better.
- It will help you understand where problems occur and what some of the causes may be.
- It leverages other analytical tools by providing a source of data and inputs into these tools.
- It identifies and leads you to many important characteristics you will need as you strive to make
improvements.
Individual maps developed by Process Members form the basis of Process Management. The
individual processes are linked together to see the total effort and flow for meeting business and
customer needs.
In order to improve or to correctly manage a process, you must be able to describe it in a way that
can be easily understood. That is why the first activity of the Measure Phase is to adequately
describe the process under investigation. Process Mapping is the most important and powerful tool
you will use to improve the effectiveness and efficiency of a process.
Information from Process Mapping

These are more reasons why Process Mapping is the most important and powerful tool you will need
to solve a problem. It has been said Six Sigma is the most efficient problem solving methodology
available. This is because work done with one tool sets up another tool, so very little information and
work is wasted. Later you will learn how to further use the information and knowledge you gather
from Process Mapping.
Mapping processes identifies many important characteristics and develops information for other
analytical tools:
1.  Process inputs (X’s)
2.  Supplier requirements
3.  Process outputs (Y’s)
4.  Actual customer needs
5.  All value-added and non-value added process tasks and steps
6.  Data collection points
•  Cycle times
•  Defects
•  Inventory levels
•  Cost of poor quality, etc.
7.  Decision points
8.  Problems that have immediate fixes
9.  Process control needs
Process Mapping
There are usually three views of a process:
1.  What you THINK it is
2.  What it ACTUALLY is
3.  What it SHOULD be
The first view is “what you think the process is” in terms of its size, how work flows and how well the
process works. In virtually all cases the extent and difficulty of performing the process is
understated.
It is not until someone Process Maps the process that the full extent and difficulty is known. It is
virtually always larger than we thought, more difficult and more costly to operate than we
realize. It is here that we discover the hidden operations also. This is the second view: “what the
process actually is”.
Then there is the third view: “what it should be”. This is the result of process improvement activities.
It is precisely what you will be doing to the key process you have selected during the weeks between
classes. As a result of your project you will either have created the “what it should be” or will be well
on your way to getting there. In order to find the “what it should be” process, you have to learn
process mapping and literally “walk” the process via a team method to document how it works. This
is a much easier task than you might suspect, as you will learn over the next several lessons.
We will start by reviewing the standard Process Mapping symbols.
Standard Process Mapping Symbols
Standard symbols for Process Mapping (available in Microsoft Office™, Visio™, iGrafx™, SigmaFlow™
and other products):
•  A RECTANGLE indicates an activity. Statements within the rectangle should begin with a verb.
•  A PARALLELOGRAM shows that there are data.
•  A DIAMOND signifies a decision point. Only two paths emerge from a decision point: No and Yes.
•  An ELLIPSE shows the start and end of the process.
•  An ARROW shows the connection and direction of flow.
•  A CIRCLE WITH A LETTER OR NUMBER INSIDE symbolizes the continuation of a flowchart to
another page.
There may be several interpretations of some of the Process Mapping symbols; however, just
about everyone uses these primary symbols to document processes. As you become more
practiced you will find additional symbols useful, e.g. reports, data storage etc. For now we will
start with just these symbols.
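The rule that only two paths, No and Yes, emerge from a decision point can be checked mechanically once a Process Map is written down as data. The following is a minimal Python sketch under stated assumptions: the map is a small illustrative fragment loosely based on the pizza example, the shape names stand in for the symbols listed above, and `decision_errors` is a hypothetical helper name, not a function of any mapping product.

```python
# Each node is tagged with the symbol used to draw it.
nodes = {
    "Start": "ellipse",
    "Place in Oven": "rectangle",
    "Observe Frequently": "rectangle",
    "Check if Done": "diamond",
    "Remove from Oven": "rectangle",
    "End": "ellipse",
}

# Each arrow is (from, to, label); only arrows leaving a decision carry a label.
edges = [
    ("Start", "Place in Oven", None),
    ("Place in Oven", "Observe Frequently", None),
    ("Observe Frequently", "Check if Done", None),
    ("Check if Done", "Remove from Oven", "Yes"),
    ("Check if Done", "Observe Frequently", "No"),
    ("Remove from Oven", "End", None),
]


def decision_errors(nodes, edges):
    """Return decision points that do not have exactly one Yes and one No path."""
    errors = []
    for name, shape in nodes.items():
        if shape != "diamond":
            continue
        labels = sorted(str(label) for src, _, label in edges if src == name)
        if labels != ["No", "Yes"]:
            errors.append(name)
    return errors


print(decision_errors(nodes, edges))
```

A map drawn with a decision that has only one exit, or an unlabeled exit, would be reported by the check so the team can correct it before analysis.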
Process Mapping Levels
Level 1 – The Macro Process Map, sometimes called a Management level or viewpoint.
[Figure: Level 1 Process Map. Customer Hungry, Calls for Order, Take Order, Make Pizza, Cook Pizza, Pizza Correct, Box Pizza, Deliver Pizza, Customer Eats.]
Level 2 – The Process Map, sometimes called the Worker level or viewpoint. This example is from
the perspective of the pizza chef.
[Figure: Level 2 Process Map. Steps: Take Order from Cashier, Add Ingredients, Place in Oven, Observe Frequently, Check if Done (No/Yes), Remove from Oven, then via connector 1: Pizza Correct (No/Yes), Place in Box, Tape Order on Box, Put on Delivery Rack. Other labels: Pizza Dough, Start New Pizza, Scrap.]
Level 3 – The Micro Process Map, sometimes called the Improvement level or viewpoint. Similar to
a Level 2, it will show more steps and tasks and will present various performance data: yields,
cycle time, value and non-value added time, defects, etc.
Before Process Mapping starts you have to learn about the different levels of detail on a Process
Map and the different types of Process Maps. Fortunately these have been well categorized and
are easy to understand.
There are three different levels of Process Maps. You will need to use all three levels and you
most likely will use them in order from the macro map to the micro map. The macro map contains
the least level of detail with increasing detail as you get to the micro map. You should think of and
use the level of Process Maps in a way similar to the way you would use road maps. For example,
if you want to find a country you look at the world map. If you want to find a city in that country you
look at the country map. If you want to find a street address in the city you use a city map. This is
the general rule or approach for using Process Maps.
The Macro Process Map, what is called the Level 1 Map, shows the big picture. You will use this to
orient yourself to the way a product or service is created. It will also help you to better see which
major step of the process is most likely related to the problem you have and it will put the various
processes you are associated with in the context of the larger whole. A Level 1 PFM, sometimes
called the “management” level, is a high-level process map having the following characteristics:
§  Combines related activities into one major processing step

§  Illustrates where/how the process fits into the big picture
§  Has minimal detail
§  Illustrates only major process steps
§  Can be completed with an understanding of general process steps and the
purpose/objective of the process
The next level is generically called the Process Map. You will refer to it as a Level 2 Map and it
identifies the major process steps from the worker's point of view. In the pizza example above these
are the steps the pizza chef takes to make, cook and box the pizza for delivery. It gives you a good
idea of what is going on in this process, but can you fully understand why the process performs the
way it does in terms of efficiency and effectiveness? Could you improve the process with the level
of knowledge from this map?
Probably not. You are going to need a Level 3 Map called the Micro Process Map. It is also known
as the improvement view of a process. There is however a lot of value in the Level 2 Map because
it is helping you to “see” and understand how work gets done, who does it, etc. It is a necessary
stepping stone to arriving at improved performance.
Next we will introduce the four different types of Process Maps. You will want to use different types
of Process Maps to help you better see, understand and communicate the way processes behave.
Types of Process Maps
There are four types of Process Maps that you will use. They are the Linear Flow Map, the
Deployment or Swim Lane Flow Map, the S-I-P-O-C Map (pronounced sigh-pock) and the Value
Stream Map.
The Linear Flow Process Map
Customer Hungry -> Calls for Order -> Take Order -> Make Pizza -> Cook Pizza -> Pizza Correct ->
Box Pizza -> Deliver Pizza -> Customer Eats
As the name states this diagram shows the process steps in a sequential flow, generally ordered
from an upper left corner of the map towards the right side.
The Deployment-Flow or Swim Lane Process Map
Lane        Steps in the lane
Customer    Customer Hungry, Calls for Order, Customer Eats
Cashier     Take Order
Cook        Make Pizza, Cook Pizza, Pizza Correct, Box Pizza
Deliverer   Deliver Pizza
The value of the Swim Lane Map is that it shows you who or which department is responsible for
the steps in a process. A timeline can be added to show how long it takes each group to perform
their work. Also each time work moves across a Swim Lane there is a Supplier – Customer
interaction. This is usually where bottlenecks and queues form.
While they all show how work gets done, they emphasize different aspects of process flow and
provide you with alternative ways to understand the behavior of the process so you can do
something about it. The Linear Flow Map is the most traditional and is usually where most start the
mapping effort.
The Swim Lane Map adds another dimension of knowledge to the picture of the process: Now you
can see which department area or person is responsible. You can use the various types of maps in
the form of any of the three levels of a Process Map.
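The point that every lane crossing is a Supplier – Customer interaction can be checked mechanically once a map is written down. The following is a minimal Python sketch, not part of the course material: it encodes the pizza Swim Lane Map above as a list of (step, lane) pairs and lists the handoffs. The names `steps` and `find_handoffs` are illustrative assumptions.

```python
# Pizza Swim Lane Map from the example above, as (step, lane) pairs in flow order.
# The data layout and function name are illustrative, not a standard notation.
steps = [
    ("Customer Hungry", "Customer"),
    ("Calls for Order", "Customer"),
    ("Take Order", "Cashier"),
    ("Make Pizza", "Cook"),
    ("Cook Pizza", "Cook"),
    ("Pizza Correct", "Cook"),
    ("Box Pizza", "Cook"),
    ("Deliver Pizza", "Deliverer"),
    ("Customer Eats", "Customer"),
]


def find_handoffs(flow):
    """Return (from_step, to_step) pairs where consecutive steps sit in different lanes."""
    return [
        (a_step, b_step)
        for (a_step, a_lane), (b_step, b_lane) in zip(flow, flow[1:])
        if a_lane != b_lane
    ]


handoffs = find_handoffs(steps)
for src, dst in handoffs:
    print(f"{src} -> {dst}")
```

Each pair printed is a place where work changes hands, so these are the first places to look for queues and waiting time.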
Process Maps – Examples for Different Processes
Linear Process Map for Door Manufacturing
[Figure: linear flowchart drawn in four rows joined by the connectors A, B and C. Steps shown:
Begin, Prep doors, Inspect, Pre-cleaning, Install into work jig, Inspect finish, Light sanding,
Mark for door handle drilling, Drill holes, De-burr and smooth hole, Apply part number, Move to
finishing, Scratch repair, Inspect, Final cleaning, Inspect, Apply stain and dry, End. Side paths
shown: Return for rework, Rework, Scrap.]
Swim Lane Process Map for Capital Equip
[Figure: swim lane flowchart. Lanes: Business Unit, I.T., Top Mgt/Finance, Corporate, Procurement,
Supplier. Steps shown: Define Needs; Prepare paperwork (CAAR & installation request); Review &
approve CAAR; Review & approve standard; Configure & install; Receive & use; Issue payment;
Acquire equipment; Supplier Ships; Supplier Paid. Timeline values shown: 21 days, 6 days, 15 days,
5 days, 17 days, 7 days, 71 days, 50 days.]
The SIPOC diagram is especially useful after you have been able to construct either a Level 1 or
Level 2 Map because it facilitates your gathering of other pertinent data that is affecting the
process in a systematic way. It will help you to better see and understand all of the influences
affecting the behavior and performance of the process.
SIPOC diagram for customer-order process:
Suppliers:    ATT Phones; Office Depot; TI Calculators; NEC Cash Register
Inputs:       Pizza type; Size; Quantity; Extra Toppings; Special orders; Drink types & quantities;
              Other products; Phone number; Address; Name; Time, day and date; Volume
Process:      See below
Outputs:      Price; Order confirmation; Bake order; Data on cycle time; Order rate data;
              Order transaction; Delivery info
Customers:    Cook; Accounting
Requirements: Complete call < 3 min; Order to Cook < 1 minute; Complete bake order;
              Correct bake order; Correct address; Correct Price
Customer Order, Level 1 process flow diagram:
Call for an Order -> Answer Phone -> Write Order -> Confirm Order -> Sets Price ->
Address & Phone -> Order to Cook
You may also add a requirements section to both the supplier side and the customer side to capture
the expectations for the inputs and the outputs of the process. Doing a SIPOC is a great building
block to creating the Level 3 Micro Process Map. The two really complement each other and give you
the power to make improvements to the process.
The Value Stream Map
The Value Stream Map is a specialized map that helps you to understand numerous performance
metrics associated primarily with the speed of the process but has many other important data.
While this Process Map level is at the macro level, the Value Stream Map provides you a lot of
detailed performance data for the major steps of the process. It is great for finding bottlenecks
in the process.
Process steps and resources:
Log (Computer, 1 Person) -> Route (Department Assignments, 1 Person) -> Disposition (Guidelines,
1 Person) -> Cut Check (Computer, Printer, 1 Person) -> Mail Delivery (Envelopes, Postage, 1 Person)
Metric                   Log      Route    Disposition  Cut Check  Mail Delivery
Size of work queue       4,300    7,000    1,700        2,450      1,840
C/T (sec)                15       75       255          15         100
Uptime                   0.90     0.95     0.95         0.85       0.90
Hours                    8        8        8            8          8
Breaks                   0.5      0.5      0.5          0.5        0.5
Hours Available          6.75     7.13     7.13         6.38       6.75
Sec. Avail.              24,300   25,650   25,650       22,950     24,300
Step Processing Time     15 sec   75 sec   255 sec      15 sec     100 sec
Days of Work in queue    2.65     20.47    16.9         1.60       7.57
IPY                      0.92     .94      .59          .96        .96
Defects                  0.08     .06      .41          .04        .04
RTY                      .92      .86      .51          .49        .47
Rework                   4.0%     0.0      10%          0.0        0.0
Material Yield           .96
Scrap                    0.0%     0.0%     0.0%         0.0%       4.0%
Aggregate Performance Metrics:
Cum Material Yield = .96 X .94 X .69 X .96 X .96 = .57
RTY = .92 X .94 X .59 X .96 X .96 = .47
The Value Stream Map is a very powerful technique to understand the velocity of process
transactions, queue levels and value added ratios in both manufacturing and non-manufacturing
processes.
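The time and yield figures on the Value Stream Map follow from a few inputs per step, and reproducing them is a useful check when you build your own map. The Python sketch below is illustrative only. It assumes, based on the numbers shown in the figure, that Hours Available = (Hours - Breaks) x Uptime, that Days of Work in queue = queue size x C/T / seconds available, and that RTY is the running product of the IPY values. The variable names are made up for this example.

```python
# Inputs read from the Value Stream Map example; the tuple layout is illustrative.
steps = [
    # (name, queue size, cycle time in sec, uptime, in-process yield)
    ("Log",           4300,  15, 0.90, 0.92),
    ("Route",         7000,  75, 0.95, 0.94),
    ("Disposition",   1700, 255, 0.95, 0.59),
    ("Cut Check",     2450,  15, 0.85, 0.96),
    ("Mail Delivery", 1840, 100, 0.90, 0.96),
]
HOURS, BREAKS = 8.0, 0.5  # same shift length and break time for every step

rty = 1.0
results = []
for name, queue, cycle_sec, uptime, ipy in steps:
    # Assumed relationship: available time is the shift minus breaks, scaled by uptime.
    sec_available = (HOURS - BREAKS) * uptime * 3600
    # Work sitting in the queue, expressed in working days at this step's pace.
    days_in_queue = queue * cycle_sec / sec_available
    # Rolled throughput yield is the running product of the in-process yields.
    rty *= ipy
    results.append((name, sec_available, days_in_queue, rty))
    print(f"{name:14s} sec avail = {sec_available:8.0f}  "
          f"days in queue = {days_in_queue:5.2f}  RTY = {rty:.2f}")

# The step holding the most days of queued work is a natural bottleneck candidate.
longest_queue_step = max(results, key=lambda r: r[2])[0]
print("Most days of work in queue:", longest_queue_step)
```

Sorting the steps by days of work in queue rather than by cycle time is a deliberate choice here: a fast step with a very large queue can delay a transaction more than a slow step with a short one.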
Process Mapping Exercise – Going to Work
The purpose of this exercise is to develop a Level 1 Macro, Linear

Process Flow Map and then convert this map to a Swim Lane Map.
Read the following background for the exercise: You have been concerned
about your ability to arrive at work on time and also the amount of time it takes
from when your alarm goes off until you arrive at work. To help you better
understand both the variation in arrival times and the total time, you decide to
create a Level 1 Macro Process Map. For purposes of this exercise, the start is
when your alarm goes off the first time and the end is when you arrive at your
work station.
Task 1 – Think about the various tasks and activities you routinely do from the
defined start to the end point of this exercise.
Task 2 – Using a pencil and paper create a Linear Process Map at the macro
level but with enough detail so you can see all the major steps of your process.
Task 3 – From the Linear Process Map, create a Swim Lane Map. For the
lanes you may use the different phases of your process, such as the wake up
phase, getting prepared, driving, etc.
A Process Map of Process Mapping
Process Mapping follows a general order, but sometimes you may find it necessary, even advisable,
to deviate somewhat. However, you will find this a good path to follow as it has proven itself to
generate significant results.
The sequence of tasks is:
Select the process -> Determine approach to map the process -> Complete Level 1 PFM worksheet ->
Create Level 1 PFM -> Define the scope for the Level 2 PFM -> Create the Level 2 PFM ->
Perform SIPOC -> Identify all X’s and Y’s -> Identify customer requirements ->
Identify supplier requirements -> Create a Level 3 PFM -> Add Performance data ->
Identify VA/NVA steps
In the lessons ahead we will always show you where you are in this sequence of tasks for Process
Mapping. Before we begin our Process Mapping we will first start you off with how to determine
the approach to mapping the process. Basically there are two approaches: the individual and the
team approach.
Process Mapping Approach
If you decide to do the individual approach, here are a few key factors: You must pretend that you
are the product or service flowing through the process and you are trying to “experience” all of
the tasks that happen through the various steps.
You must start by talking to the manager of the area and/or the process owner. This is where you
will develop the Level 1 Macro Process Map. While you are talking to them, you will need to receive
permission to talk to various members of the process in order to get the detailed information you
need.
Using the Individual Approach
1.  Start with the Level 1 Macro Process Map.
2.  Meet with process owner(s) / manager(s). Create a Level 1 Map and obtain approval to interview
process members.
3.  Starting with the beginning of the process, pretend you are the product or service flowing
through the process, interview to gather information.
4.  As the interviews progress, assemble the data into a Level 2 PFM.
5.  Verify the accuracy of the Level 2 PFM with the people who provided input.
6.  Update the Level 2 PFM as needed.
Using the Team Approach
1.  Follow the Team Approach to Process Mapping.
Process Mapping works best with a team approach. The logistics of performing the mapping are
somewhat different but overall it takes less time, the quality of the output is higher and you will
have more “buy-in” into the results. Input should come from people familiar with all stages of the
process. Where appropriate the team should include line individuals, supervisors, design engineers,
process engineers, process technicians, maintenance, etc. The team process mapping workshop is
where it all comes together.
Using the Team Approach
1.  Start with the Level 1 Macro Process Map.
2.  Meet with process owner(s) / manager(s). Create a Level 1 Map and obtain approval to call a
process mapping meeting with process members (see team workshop instructions for details on
running the meeting).
3.  Bring key members of the process into the process flow workshop. If the process is large in
scope hold individual workshops for each subsection of the total process. Start with the beginning
steps. Organize the meeting to use the post-it note approach to gather individual tasks and
activities, based on the macro map, that comprise the process.
4.  Immediately assemble the information provided into a Process Map.
5.  Verify the PFM by discussing it with process owners and by observing the actual process from
beginning to end.
The Team Process Mapping Workshop
1.  Add to and agree on Macro Process Map.
2.  Using 8.5 X 11 paper for each macro process step, tape the process to the wall in a linear
style.
3.  Process Members then list all known process tasks they do on a post-it note, one process task
per note.
• Include the actual time spent to perform each activity, do not include any wait time or queue
time.
• List any known performance data that describe the quality of the task.
4.  Place the post-it notes on the wall under the appropriate macro step in the order of the work
flow.
5.  Review process with the group, add additional information and close meeting.
6.  Immediately consolidate information into a Level 2 Process Map.
7.  You still have to verify the map by walking the process.
In summary, after adding to and agreeing to the Macro Process Map, the team process mapping
approach is performed using multiple post-it notes where each person writes one task per note and,
when finished, places them onto a wall which contains a large scale Macro Process Map.
This is a very fast way to get a lot of information including how long it takes to do a particular task.
Using the Value Stream Analysis techniques which you will study later, you will use this data to
improve the process. We will now discuss the development of the various levels of Process Mapping.
Steps in Generating a Level 1 PFM
You may recall the preferred method for describing a process is to identify it with a generic name,
describe its purpose with an operational description and show the workflow with a process map. When
developing a Macro Process Map, always add one process step in front of and behind the area you
believe contains your problem as a minimum. To aid you in your start we have provided you with a
checklist or worksheet. You may acquire this data from your own knowledge and/or with the
interviews you do with the managers / process owners. Once you have this data, and you should do
this before drawing maps, you will be well positioned to communicate with others and you will be
much more confident as you proceed.
Creating a Level 1 PFM
1. Identify a generic name for the process:
For instance: Customer Order Process
2. Identify the beginning and ending steps of the process:
Beginning - customer calls in. Ending – baked pizza given to operations
3. Describe the primary purpose and objective of the process (operational definition):
The purpose of the process is to obtain telephone orders for pizzas, sell additional products if
possible, let the customer know the price and approximate delivery time, provide an accurate cook
order, log the time and immediately give it to the pizza cooker.
4. Mentally walk through the major steps of the process and write them down:
Receive the order via phone call from the customer, calculate the price, create a build order and
provide the order to operations
5. Use standard flowcharting symbols to order and to illustrate the flow of the major process
steps.
A Macro Process Map can be useful when reporting project status to management. A macro-map
can show the scope of the project so management can adjust their expectations accordingly.
Remember, only major process steps are included. For example, a step listed as “Plating” in a
manufacturing Macro Process Map might actually consist of many steps: pre-clean, anodic
cleaning, cathodic activation, pre-plate, electro-deposition, reverse-plate, rinse and spin-dry, etc.
The plating step in the macro-map will then be detailed in the Level 2 Process Map.
Exercise – Generate a Level 1 PFM
The purpose of this exercise is to develop a Level 1 Linear Process Flow Map for the key process
you have selected as your project.
Read the following background for the exercise: You will use your selected key process for this
exercise (if more than one person in the class is part of the same process you may do it as a
small group). You may not have all the pertinent detail to correctly put together the Process Map,
that is ok, do the best you can. This will give you a starting template when you go back to do your
project. In this exercise you may use the Level 1 PFM worksheet on the next page as an example.
Task 1 – Identify a generic name for the process.
Task 2 - Identify the beginning and ending steps of the process.
Task 3 - Describe the primary purpose and objective of the process (operational definition).
Task 4 - Mentally walk through the major steps of the process and write them down.
Task 5 - Use standard flowcharting symbols to order and to illustrate the flow of the major
process steps.
Exercise – Generate a Level 1 PFM (cont.)
If necessary, you may look at the example for the Pizza order entry process.
1. Identify a generic name for the process:
2. Identify the beginning and ending steps of the process:
3. Describe the primary purpose and objective of the process (operational definition):
4. Mentally walk through the major steps of the process and write them down:
5. Use standard flowcharting symbols to order and to illustrate the flow of the major process
steps on a separate sheet of paper.
Exercise – Generate a Level 1 PFM Solution
1.  Identify a generic name for the process:
(e.g. customer order process).
2.  Identify the beginning and ending steps of the process:
(beginning - customer calls in, ending – pizza order given to the chef).
3.  Describe the primary purpose and objective of the process (operational
definition): (The purpose of the process is to obtain telephone orders for
pizzas, sell additional products if possible, let the customer know the
price and approximate delivery time, provide an accurate cook order, log
the time and immediately give it to the pizza cooker).
4.  Mentally walk through the major steps of the process and write them
down:
(Receive the order via phone call from the customer, calculate the price,
create a build order and provide the order to the pizza cooker).
5.  Use standard flowcharting symbols to order and to illustrate the flow of
the major process steps on a separate sheet of paper.
Defining the Scope of Level 2 PFM
With a completed Level 1 PFM you can now “see” where you have to go to get more detailed
information. You will have the basis for a Level 2 Process Map.
[Figure: the Level 1 Customer Order Process (Customer Hungry -> Calls for Order -> Take Order ->
Make Pizza -> Cook Pizza -> Box Pizza -> Deliver Pizza -> Customer Eats) with the pizza chef's
Level 2 Process Map drawn beneath the steps it expands.]
The improvements are in the details. If the efficiency or effectiveness of the process could be
significantly improved by a broad summary analysis the improvement would be done already. If you
map the process at an actionable level you can identify the source of inefficiencies and defects.
But you need to be careful about mapping too little an area and missing your problem cause, or
mapping too large an area in detail thereby wasting your valuable time.
The rules for determining the scope of the Level 2 Process Map:
a) Look at your Macro Process Map to select the area that represents your problem.
b) Map this area at a Level 2.
c) Start and end at natural starting and stopping points for a process, in other words you have
the complete associated process.
When you perform the process mapping workshop or do the individual interviews, you will determine
how the various tasks and activities form a complete step. Do not worry about precisely defining
the steps, it is not an exact science, common sense will prevail. If you have done a process
mapping workshop, which you will remember we highly recommended, you will actually have a lot of
the data for the Level 3 Micro Process Map. You will now perform a SIPOC and, with the other data
you already have, it will position you for about 70 percent to 80 percent of the details you will
need for the Level 3 Process Map.
Building a SIPOC
SIPOC diagram for customer-order process:
Suppliers:    ATT Phones; Office Depot; TI Calculators; NEC Cash Register
Inputs:       Pizza type; Size; Quantity; Extra Toppings; Special orders; Drink types & quantities;
              Other products; Phone number; Address; Name; Time, day and date; Volume
Process:      See below
Outputs:      Price; Order confirmation; Bake order; Data on cycle time; Order rate data;
              Order transaction; Delivery info
Customers:    Cook; Accounting
Requirements: Complete call < 3 min; Order to Cook < 1 minute; Complete bake order;
              Correct bake order; Correct address; Correct Price
Customer Order, Level 1 process flow diagram:
Call for an Order -> Answer Phone -> Write Order -> Confirm Order -> Sets Price ->
Address & Phone -> Order to Cook
The tool name prompts the team to consider the suppliers (the 'S' in SIPOC) of your process, the
inputs (the 'I') to the process, the process (the 'P') your team is improving, the outputs (the 'O') of
the process and the customers (the 'C') that receive the process outputs.
Requirements of the customers can be appended to the end of the SIPOC for further detail and
requirements are easily added for the suppliers as well.
The SIPOC tool is particularly useful in identifying:
Who supplies inputs to the process?
What are all of the inputs to the process we are aware of? (Later in the DMAIC
methodology you will use other tools which will find still more inputs. Remember Y = f(X):
if we are going to improve Y, we are going to have to find all the X’s.)
What specifications are placed on the inputs?
What are all of the outputs of the process?
Who are the true customers of the process?
What are the requirements of the customers?
You can actually begin with the Level 1 PFM that has 4 to 8 high-level steps, but a Level 2 PFM is
of even more value. When creating a SIPOC with a process mapping team, the recommended method
is again a wall exercise similar to your other process mapping workshop. Create an area that will
allow the team to place post-it note additions to the 8.5 X 11 sheets with the letters S, I, P, O and C
on them, with a copy of the Process Map below the sheet with the letter P on it.
Hold a process flow workshop with key members. (Note: If the process is large in scope, hold an
individual workshop for each subsection of the total process, starting with the beginning steps).
The preferred order of the steps is as follows:
1. Identify the outputs of this overall process.
2. Identify the customers who will receive the outputs of the process.
3. Identify customers’ preliminary requirements
4. Identify the inputs required for the process.
5. Identify suppliers of the required inputs that are necessary for the process to function.
6. Identify the preliminary requirements of the inputs for the process to function properly.
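If you keep the SIPOC electronically rather than on a wall, the six steps above map naturally onto a small record that is filled in from the output side backward. The Python sketch below is one possible layout, offered only as an illustration. The class name `Sipoc` and its field names are assumptions made for this example, and the entries are taken from the customer-order SIPOC shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Sipoc:
    """Illustrative SIPOC record; fields are listed in the preferred fill-in order."""
    process: str
    outputs: list = field(default_factory=list)
    customers: list = field(default_factory=list)
    customer_requirements: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    suppliers: list = field(default_factory=list)
    input_requirements: list = field(default_factory=list)

    def missing_sections(self):
        """Return the names of sections that are still empty, in fill-in order."""
        order = ["outputs", "customers", "customer_requirements",
                 "inputs", "suppliers", "input_requirements"]
        return [name for name in order if not getattr(self, name)]


# Steps 1 to 3 completed for the customer-order process; steps 4 to 6 still open.
sipoc = Sipoc(process="Customer Order")
sipoc.outputs += ["Price", "Order confirmation", "Bake order"]
sipoc.customers += ["Cook", "Accounting"]
sipoc.customer_requirements += ["Complete call < 3 min", "Order to Cook < 1 minute"]

print(sipoc.missing_sections())
```

Listing the empty sections in the preferred order gives the team a simple agenda for the rest of the workshop.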
Identifying Customer Requirements
You are now ready to identify the customer requirements for the outputs you have defined. Customer
requirements, called VOC, determine what is and is not acceptable for each of the outputs. You may
find some of the outputs do not have requirements or specifications. For a well managed process
this is not acceptable. If this is the case you must ask/negotiate with the customer as to what is
acceptable.
PROCESS OUTPUT IDENTIFICATION AND ANALYSIS (form)
Header fields: Process Name; Operational Definition
Output Data:              1 Process Output - Name (Y); 3 Customer (Name) Internal;
                          4 Customer (Name) External
Requirements Data:        5 Metric; 6 LSL; 7 Target; 8 USL
Measurement Data:         9 Measurement System (How is it Measured); 10 Frequency of Measurement;
                          11 Performance Level Data
Value Data:               12 VA or NVA
General Data/Information: 13 Comments
There is a technique for determining the validity of customer and supplier requirements. It is
called “RUMBA” standing for: Reasonable, Understandable, Measurable, Believable and Achievable. If
a requirement cannot meet all of these characteristics then it is not a valid requirement, hence
the word negotiation. We have included the process for validating customer requirements at the end
of this lesson.
The Excel spreadsheet is somewhat self explanatory. You will use a similar form for identifying the
supplier requirements. Start by writing in the process name followed by the process operational
definition. The operational definition is a short paragraph which states why the process exists, what it
does and what its value proposition is. Always take sufficient time to write this such that anyone who
reads it will be able to understand the process. Then list each of the outputs, the Y’s, and write in the
customer’s name who receives this output, categorized as an internal or external customer.
Next are the requirements data. To specify and measure something, it must have a unit of measure;
called a metric. As an example, the metric for the speed of your car is miles per hour, for your weight
it is pounds, for time it is hours or minutes and so on. You may know what the LSL and USL are but
you may not have a target value. A target is the value the customer prefers all the output to be
centered at; essentially, the average of the distribution. Sometimes it is stated as “1 hour +/- 5
minutes”. One hour is the target, the LSL is 55 minutes and the USL is 65 minutes. A target may not
be specified by the customer; if not, put in what the average would be. You will want to minimize the
variation from this value.
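The “1 hour +/- 5 minutes” example can be turned into a quick check of observed data against the specification limits. The Python sketch below is illustrative only. The function names and the list of delivery times are made up for this example, and it assumes a symmetric tolerance around the target.

```python
def spec_limits(target, tolerance):
    """Return (LSL, USL) for a symmetric 'target +/- tolerance' requirement."""
    return target - tolerance, target + tolerance


def within_spec(value, lsl, usl):
    """True when the value lies between the limits, limits included."""
    return lsl <= value <= usl


# "1 hour +/- 5 minutes", expressed in minutes.
lsl, usl = spec_limits(target=60, tolerance=5)

# Hypothetical observations, invented purely to exercise the check.
delivery_times = [58, 61, 66, 54, 60]
out_of_spec = [t for t in delivery_times if not within_spec(t, lsl, usl)]

print(lsl, usl, out_of_spec)
```

Keeping the metric in a single unit (minutes here) before comparing against LSL and USL avoids mixing hours and minutes in the same column of the form.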
You will learn more about measurement but for now you must know that if something is required you
must have a way to measure it as specified in column 9. Column 10 is how often the measurement
is made and column 11 is the current value for the measurement data. Column 12 is for identifying if
this is a value or non-value added activity; more on that later. And finally column 13 is for any
comments you want to make about the output.
You will come back to this form and rank the significance of the outputs in terms of importance to
identify the CTQ’s.
Identifying Supplier Requirements
The supplier input or process input identification and analysis form is nearly identical to the
output form just covered. Now you are the customer, you will specify what is required of your
suppliers for your process to work correctly; remember RUMBA – the same rules apply.
PROCESS INPUT IDENTIFICATION AND ANALYSIS (form)
Header fields: Process Name; Operational Definition
Columns, in order: Process Input - Name (X); Controlled (C) or Noise (N); Supplier (Name) Internal;
Supplier (Name) External; Metric; LSL; Target; USL; Measurement System (How is it Measured);
Frequency of Measurement; Performance Level Data; VA or NVA; Comments
You will notice a new parameter introduced in column 2. It asks if the input is a controlled input
or an uncontrolled input (noise). The next topic will discuss the meaning of these terms.
Later you will come back to this form and rank the importance of the inputs to the success of your
process and eventually you will have found the Critical X’s.
Controllable versus Noise Inputs
For any process or process step input there are two primary types of inputs:
Controllable - we can exert influence over them.
Uncontrollable - they behave as they want to within some reasonable boundaries.
A further category is Procedural - a standardized set of activities leading to readiness of a
step, for example compliance to GAAP (Generally Accepted Accounting Principles).
[Figure: Make Pizza Process. Procedural and Controllable Key Process Inputs shown: Oven Temp, Bake
Time, Ingredients, Recipe. Noise Inputs shown: Room Temp, Moisture Content, Ingredient Variation.
Process Outputs shown: Correct Ingredients, Properly Cooked, Hot Pizza >140 deg.]
Every input can be either:
Controllable (C) - Inputs can be adjusted or controlled while the process is running (e.g., speed,
feed rate, temperature and pressure)
Procedural (P) - Inputs that are affected through a standardized set of activities established to
create a process step completion (e.g., material queues, rigging setup, fixed data-entry forms)
Noise (N) - Things we do not think we can control, that we are unaware of or do not see, or that
are too expensive or too difficult to control (e.g., ambient temperature, humidity, individual)
However even with the inputs we define as controllable, we never exert complete control. We can
control an input within the limits of its natural variation but it will vary on its own based on
its distributional shape - as you have previously learned. You choose to control certain inputs
because you either know or believe they have an effect on the outcome of the process. It is
inexpensive to do, so controlling it “makes us feel better” or there once was a problem and the
solution (right or wrong) was to exert control over some input.
You choose to not control some inputs because you think you cannot control them, you either know or believe they do not have much effect on the output, you think it is not cost justified or you just do not know these inputs even exist. Yes, that is right, you do not know they are having an effect on the output. For example, what effect does ambient noise or temperature have on your ability to be attentive or productive?
It is important to distinguish which category an input falls into. You know through Y = f(X) that if it is a
Critical X, by definition, you must control it. Also if you believe an input is or needs to be controlled
then you have automatically implied there are requirements placed on it and it must be measured.
You must always think and ask whether an input is or should be controlled or if it is uncontrolled.
Exercise – Supplier Requirements
The purpose of this exercise is to identify the requirements for the suppliers to the key process you have selected as your project.
Read the following background for the exercise: You will use your selected key process for this exercise (if more than one person in the class is part of the same process you may do it as a small group). You may not have all the pertinent detail to correctly identify all supplier requirements; that is OK, do the best you can. This will give you a starting template when you go back to do your project work. Use the process input identification and analysis form for this exercise.
Task 1 – Identify a generic name for the process.
Task 2 – Write an operational description for the process.
Task 3 – Complete the remainder of the form except the Value – Non value added column.
Task 4 – Report to the class when called upon.
Roadmap steps: Create the Level 2 PFM; Perform SIPOC; Identify all X’s and Y’s; Identify customer requirements; Identify supplier requirements.
The Level 3 Process Flow Diagram
[Figure: Level 3 process flow diagram for the pizza process, running from Start and Take Order through placing the boxed order on the delivery rack, with decision points that send a pizza to scrap and start a new one. Below the flow are the Process Step Output and Input Identification and Analysis forms.]
Output form columns: Process Output - Name (Y); Customer (Name), Internal or External; Metric; LSL; Target; USL; Measurement System (How is it Measured); Frequency of Measurement; Performance Level Data; VA or NVA; Comments.
Input form columns: Process Input - Name (X); Controlled (C) or Noise (N); Supplier (Name), Internal or External; Metric; LSL; Target; USL; Measurement System (How is it Measured); Frequency of Measurement; Performance Level Data; VA or NVA; Comments.
You have a decision at this point to continue with a complete characterization of the process you have
documented at a Level 2 in order to fully build the process management system or to narrow the effort
by focusing on those steps that are contributing to the problem you want solved.
In reality, usually just a few of the process steps are the Root Cause areas for any given higher level
process output problem. If your desire is the latter there are some other Measure Phase actions and
tools you will use to narrow the number of potential X’s and subsequently the number of process steps.
To narrow the scope so it is relevant to your problem consider the following: Remember using the pizza restaurant as our example for selecting a key process? They were having a problem with overall delivery time and burnt pizzas. Which steps in this process would contribute to burnt pizzas, and how might a pizza burnt so badly it had to be scrapped and restarted affect delivery time? It would most likely be the steps from “place in oven” to “remove from oven”, but it might also include “add ingredients” because certain ingredients may burn more quickly than others. This is how, based on the Problem Statement you have made, you would narrow the scope for doing a Level 3 PFM.
For your project the priority will be to do your best to find the problematic steps associated with your
Problem Statement. We will teach you some new tools in a later lesson to aid you in doing this. You may
have to characterize a number of steps until you get more experience at narrowing the steps that cause
problems; this is to be expected. If you have the time you should characterize the whole process.
Each step you select as a causal process step must be fully characterized just as you have
previously done for the whole process. In essence you will do a “mini SIPOC” on each step of the
process as defined in the Level 2 Process Map. This can be done using a Level 3 Micro Process Map
and placing all the information on it or it can be consolidated onto an Excel spreadsheet format or a
combination of both. If all the data and information is put onto an actual Process Map expect the map to
be rather large physically. Depending on the scope of the process, some people dedicate a wall space
for doing this; say a 12 to 14 foot long wall. An effective approach for this is to use a roll of industrial
grade brown package wrapping paper, which is generally 4 feet wide. Just roll out the length you want,
cut it, place this on the wall then build your Level 3 Process Map by taping and writing various elements
onto the paper. The value of this approach is you can take it off the wall, roll it up, take it with you then
put it back on any wall; great for team efforts.
A Level 3 Process Map contains all of the process details needed to meet your objective: all of the
flows, set points, standard operating procedures (SOPs), inputs and outputs; their specifications and if
they are classified as being controllable or non-controllable (noise). The Level 3 PFM usually contains
estimates of defects per unit (DPU), yield and rolled throughput yield (RTY) and value/non-value add. If
processing cycle times and inventory levels (materials or work queues) are important, value stream
parameters are also included.
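As a rough illustration of how these estimates relate, the sketch below converts DPU into a step yield and multiplies the step yields into an RTY. It assumes defects follow a Poisson distribution, so yield is approximated by e^(-DPU); the DPU values are hypothetical and are not data from the pizza example.

```python
import math


def yield_from_dpu(dpu):
    """Approximate first-pass yield from defects per unit (Poisson assumption)."""
    return math.exp(-dpu)


def rolled_throughput_yield(step_yields):
    """RTY is the product of the first-pass yields of every step."""
    rty = 1.0
    for step_yield in step_yields:
        rty *= step_yield
    return rty


# Hypothetical DPU estimates for three process steps (illustrative only)
step_dpu = {"Take Order": 0.02, "Make Pizza": 0.05, "Cook Pizza": 0.10}

step_yields = [yield_from_dpu(dpu) for dpu in step_dpu.values()]
rty = rolled_throughput_yield(step_yields)
print(f"RTY = {rty:.3f}")
```

Because the yields multiply, a process with many steps can have a low RTY even when every individual step looks acceptable on its own.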
This can be a lot of detail to manage and appropriate tracking sheets are required. We have supplied
these sheets in a paper and Excel spreadsheet format for your use. The good news is the approach
and forms for the steps are essentially the same as the format for identifying supplier and customer
requirements at the process level. A spreadsheet is a very convenient tool, and its output can be fed directly into a C&E matrix and an FMEA (to be described later), which are also built using spreadsheets.
You will find the work you have done up to this point, the Level 1 and Level 2 Process Maps and the SIPOC, will be of use both for knowledge of the process and for actual data.
An important reminder of a previous lesson: when you were taught about project definition it was stated you should try to improve the performance of only one process output at any one time. Because of the amount of detail you can get into for just one Y, trying to optimize more than one Y at a time can become overwhelming. The good news is that by focusing on one Y in your initial project you will have laid all the groundwork to focus on a second and a third Y for the process.
Process Inputs (X’s) and Outputs (Y’s)
You are now down at the step level of the process; this is what we call the improvement view of a process. Now you do exactly the same thing as you did for the overall process: you list all of the input and output information for the steps of the process you have selected for analysis and characterization to solve your problem. To help you comprehend what we are trying to accomplish we have provided you with a visualization of the inputs and outputs of the Pizza restaurant.
Roadmap steps: Create a Level 3 PFM; Add Performance data; Identify VA/NVA steps.
[Figure: the Process Step Output Identification and Analysis form and the Process Step Input Identification and Analysis form, with the same column groups used at the process level: Output or Input Data, Requirements Data, Measurement Data, Value Data and General Data/Information.]
Any process, even a pizza restaurant process, can be characterized. This visualization shows many of the inputs and outputs and their requirements. By using the process and the process step input and output sheets you get a very detailed picture of how your process works. Now you have enough data to start making informed decisions about the process performance.

Process: Take Order
Y: Order (all fields complete)
C/N     Requirements or Specs.             Input (X)
N/C     7”, 12”, 16”                       Size of Pizza
N/C     12 meats, 2 veggies, 3 cheese      Toppings
N       N/A                                Name
N       Within 10 miles                    Address
N       Within area code                   Phone
N       11 AM to 1 AM                      Time
N       5 X 52                             Day
N       MM/DD,YY                           Date

Process: Make Pizza
Y: Raw Pizza (size, weight, ingredients correct)
C/N     Requirements or Specs.             Input (X)
C       All fields complete                Order
C       Per Spec Sheets                    Ingredients
S.O.P   Per Rev 7.0                        Recipe
C       As per recipe chart 3-1 in Oz.     Amounts

Process: Cook Pizza
Y: Cooked Pizza (>140F, ingredients correct, no burns)
C/N     Requirements or Specs.             Input (X)
C       All fields complete                Order
C       Ingredients per order              Raw Pizza
C       350F +/- 5F                        Oven Temp
C       10 Min                             Time
N       60 per hour max                    Volume

The next lesson pages will describe how you determine if a process task, activity or step is a value added step or not.
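One way to hold this kind of input information outside the paper form is sketched below. It is only an illustration: the class name, field names and readings are hypothetical, and only the Oven Temp (350F +/- 5F) and Volume (60 per hour max) requirements come from the Cook Pizza visualization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessInput:
    name: str
    input_class: str              # "C" = controlled, "N" = noise
    lsl: Optional[float] = None   # lower spec limit, if any
    usl: Optional[float] = None   # upper spec limit, if any

    def in_spec(self, value: float) -> bool:
        """True when the value is inside whichever limits are defined."""
        if self.lsl is not None and value < self.lsl:
            return False
        if self.usl is not None and value > self.usl:
            return False
        return True


# Requirements taken from the Cook Pizza rows of the visualization
oven_temp = ProcessInput("Oven Temp (F)", "C", lsl=345.0, usl=355.0)  # 350F +/- 5F
volume = ProcessInput("Volume (pizzas/hour)", "N", usl=60.0)          # 60 per hour max

# Hypothetical readings, for illustration only
readings = [(oven_temp, 352.0), (oven_temp, 361.0), (volume, 48.0)]
results = [(inp.name, value, inp.in_spec(value)) for inp, value in readings]
for name, value, ok in results:
    print(f"{name}: {value} -> {'in spec' if ok else 'OUT of spec'}")
```

Keeping the class (C or N) next to the requirement mirrors the form: an input you decide to control has a requirement placed on it and must be measured.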
Identifying Waste
When we produce products or services, we engage process-based activities to transform physical materials, ideas and information into something valued by customers. Some activities in the process generate true value, others do not. The expenditure of resources, capital and other energies that do not generate value is considered waste. Value generation is any activity that changes the form, fit or function of what we are working on in a way the customer is willing to pay for. The goal of testing for VA vs. NVA is to remove unnecessary activity (waste) from a process.
Roadmap steps: Create a Level 3 PFM; Add Performance data; Identify VA/NVA steps.
Each process activity can be tested for its value-add contribution. Ask the following two questions to identify non-value added activity:
•  Is the form, fit or function of the work item changed as a result of this activity?
•  Is the customer willing to pay for this activity?
[Figure: Level 3 process flow for taking a phone order, from the call for an order through greeting the customer, writing the order on a scratch pad, confirming the order, calculating the price, asking the cook for a time estimate, getting the address and phone number and giving the order to the cook. Individual steps, such as rewriting the order, are marked VA or NVA.]
Hint: If an action starts with the two letters “re” there is a good chance it is a form of waste, e.g., rework, replace, review, etc.
Some non-value added activities cannot be removed; e.g., data collection is required to understand and plan production activity levels, data must be collected to comply with governmental regulations, etc. (even though the data have no effect on the actual product or service).
On the process flow diagram we place a red X through the steps or we write NVA or VA by each step.
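The two-question test can be written down as a small check. This is a minimal sketch: the function names are hypothetical and the yes/no answers for each step are illustrative assumptions, not the labels from the course diagram.

```python
def is_value_added(changes_form_fit_function, customer_will_pay):
    """A step adds value only if both test questions are answered yes."""
    return changes_form_fit_function and customer_will_pay


def looks_like_rework(step_name):
    """Hint from the text: names starting with 're' are often waste."""
    return step_name.strip().lower().startswith("re")


# Hypothetical answers to the two questions for a few steps
steps = [
    ("Add ingredients", True, True),
    ("Rewrite order", False, False),
    ("Verify order with notes", False, False),
]

labels = {}
for name, changes, pays in steps:
    labels[name] = "VA" if is_value_added(changes, pays) else "NVA"
    hint = " (starts with 're')" if looks_like_rework(name) else ""
    print(f"{name}: {labels[name]}{hint}")
```

The "re" check is only a prompt for discussion; the team still has to answer the two questions for every step.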
Definition of X-Y Diagram
The X-Y Diagram is a great tool to help us focus; again it is based on team experience and “Tribal” knowledge. At this point in the project that is great, although it should be recognized this is NOT hard data. As you progress through the methodology do not be surprised if you find out through data analysis that what the team thought might be critical turns out to be insignificant.
The great thing about the X-Y Diagram is that it is sort of an unbiased way to approach the definition around the process and WILL give you focus.
•  The X-Y Matrix is:
–  A tool used to identify/collate potential X’s and assess their relative impact on multiple Y’s (include all Y’s that are customer focused)
–  Based on the team’s collective opinions
–  Created for every project
–  Never completed
–  Updated whenever a parameter is changed
•  To summarize, the X-Y Matrix is a team-based prioritization tool for the potential X’s.
•  WARNING! This is not real data, this is organized brainstorming!! At the conclusion of the project you may realize the things you thought were critical are in fact not as important as was believed.
The Vital Few
A Belt does not just discover which X’s are important in a process (the vital few).
–  The team considers all possible X’s that can contribute or
cause the problem observed.
–  The team uses 3 primary sources of X identification:
•  Process Mapping
•  Fishbone Analysis
•  Basic Data Analysis – Graphical and Statistical
–  A List of X’s is established and compiled.
–  The team then prioritizes which X’s it will explore first then
eliminates the obvious low impact X’s from further
consideration.
The X-Y Matrix is this Prioritization Tool!
This is an important tool for the many reasons we have already stated. Use it to your benefit and leverage the team; this will help you progress through the methodology to accomplish your ultimate project goal.
The “XY Diagram”
This is the X-Y Diagram. You should have a copy of this template. If possible open it and get
familiar with it as we progress through this section.
Using the Classified X’s
•  Breakthrough requires dealing primarily with controllable X’s impacting the Y.
•  Use the controllable X’s from the Fishbone analysis to include in the
X-Y Matrix.
•  The goal is to isolate the vital few X’s from the trivial many X’s.
•  Procedures and Noise X’s will be used in the FMEA at the end of
this module. However:
–  All procedures must be in total compliance.
•  This may require some type of effectiveness measure.
•  This could reduce or eliminate some of the defects currently seen in
the process (allowing focus on controllable X’s).
–  Noise type inputs increase risk of defects under current
technology of operation and therefore:
•  Increase RPN on the FMEA document from an input.
•  Help identify areas needing investment for a justified ROI.
X-Y Diagram: Steps
List X’s from Fishbone Diagram in horizontal rows:
Use your Fishbone Diagram as the source and type in the Inputs in this section. Use common sense,
some of the info from the Fishbone may not justify going into the X-Y inputs.
List Y’s in columns (including Primary and Secondary metrics).
Weight the Y’s on a scale of 1-10 (10 - highest and 1 - lowest).
Enter your primary metric and any other secondary metrics across into this area. Weight these output variables (Y’s) on a scale of 1-10; you may find some have the same weight, which is just fine. If, at this time, additional metrics come to the surface, which is totally common, you may realize you need to add secondary metrics to your project or even refine your primary metric.
For each X listed rank its effect on each metric based on a scale of 1, 3 or 9.
9 = Strong (highest)
3 = Moderate (marginal)
1 = Weak (none)
For each X listed along the left, rank its effect on each corresponding metric based on a scale of 0, 1, 3 or 9. You can use any scale you choose; however, we recommend this one. If you use a scale of 1 to 10 this can cause uncertainty within the team: is it a 6 or a 7, what is the difference, etc.?
The template we have provided automatically calculates and sorts the ranking shown here.
Ranking multiplies the rank of each X by the Weight of each Metric. The products are added together to become the Ranking.
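The ranking calculation described above can be sketched in a few lines. The metrics, weights and ratings below are hypothetical values chosen for illustration; the course template performs the same multiply-and-add in its spreadsheet.

```python
# Hypothetical output metrics (Y's) with weights on the 1-10 scale
y_weights = {"Delivery time": 10, "Burnt pizzas": 8}

# Hypothetical team ratings of each X against each Y (0, 1, 3 or 9)
x_ratings = {
    "Oven Temp": {"Delivery time": 3, "Burnt pizzas": 9},
    "Bake Time": {"Delivery time": 9, "Burnt pizzas": 9},
    "Ingredient Amounts": {"Delivery time": 1, "Burnt pizzas": 3},
}


def ranking(ratings, weights):
    """Sum of (rating of the X against a Y) times (weight of that Y)."""
    return sum(ratings[y] * weights[y] for y in weights)


rankings = {x: ranking(r, y_weights) for x, r in x_ratings.items()}

# Sort so the X's the team believes matter most come first
ordered = sorted(rankings.items(), key=lambda item: item[1], reverse=True)
for x, score in ordered:
    print(f"{x}: {score}")
```

For Bake Time the ranking is 9 x 10 + 9 x 8 = 162, which puts it at the top of this hypothetical list.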
Example
Shown here is a basic example of a completed X-Y Diagram. You can click “Demo” on your
template to view this anytime.
Click the Demo button to see an example.
Example
Click the Summary Worksheet
This is the summary worksheet. If you click on the “Summary” tab you will see this output. Take some time to review the worksheet.

YX Diagram Summary
Process: laminating
Date: 5/2/2006

Output Variables
Description               Weight
broken                    10
unbonded area             9
smears                    8
thickness                 7
foreign material          6

Input Variables
Description               Ranking   Rank %
temperature               162       14.90%
human handling            159       14.63%
material properties       130       11.96%
washer                    126       11.59%
pressure                  120       11.04%
robot handling            120       11.04%
time                      102       9.38%
clean room practices      90        8.28%
clean room cleanliness    78        7.18%

[Figure: Input Matrix Results bar chart titled Input Summary, with Output (Y's) on the vertical axis from 0.00% to 100.00% and Input (X's) on the horizontal axis.]
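The Rank % column in the summary appears to be each input's Ranking divided by the total of all Rankings. The sketch below applies that assumption to the Ranking values listed in the summary worksheet for the laminating example.

```python
# Ranking values as listed in the summary worksheet
rankings = {
    "temperature": 162,
    "human handling": 159,
    "material properties": 130,
    "washer": 126,
    "pressure": 120,
    "robot handling": 120,
    "time": 102,
    "clean room practices": 90,
    "clean room cleanliness": 78,
}

total = sum(rankings.values())

# Share of the total ranking carried by each input, in percent
rank_pct = {name: round(100 * score / total, 2) for name, score in rankings.items()}

for name, score in rankings.items():
    print(f"{name:<24}{score:>5}{rank_pct[name]:>8.2f}%")
```

Under that assumption the computed shares agree with the percentages shown in the worksheet, for example 162 / 1087 = 14.90% for temperature.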
Fishbone Diagram Exercise
Exercise objective: Create an X-Y Matrix using the information from the Fishbone analysis.
1.  Using the Fishbone Diagram created earlier create an X-Y Matrix.
2.  Present results to your mentor.
Definition of FMEA
Failure Modes Effect Analysis, or FMEA [usually pronounced F-M-E-A (individual letters)], is at this point developed from tribal knowledge with a cross-functional team. Later, using process data, the FMEA can be updated and better estimates of detection and occurrence can be obtained. The FMEA is not a tool to eliminate X’s but rather to control the X’s. It is only a tool to identify potential X’s and prioritize the order in which the X’s should be evaluated.
Failure Modes Effect Analysis (FMEA) is a structured approach to:
•  Predict failures and prevent their occurrence in manufacturing and
other functional areas that generate defects.
•  Identify the ways in which a process can fail to meet critical
customer requirements (Y).
•  Estimate the Severity, Occurrence and Detection (SOD) of defects
•  Evaluate the current Control Plan for preventing these failures from
occurring and escaping to the customer.
•  Prioritize the actions that should be taken to improve and control
the process using a Risk Priority Number (RPN).
Give me an F , give me an M ……
History of FMEA
History of FMEA:
•  First used in the 1960’s in the Aerospace industry during the
Apollo missions
•  In 1974 the Navy developed MIL-STD-1629 regarding the use
of FMEA
•  In the late 1970’s automotive applications driven by liability
costs began to incorporate FMEA into the management of their
processes
•  Automotive Industry Action Group (AIAG) now maintains the
FMEA standard for both Design and Process FMEA’s
The “edge of your seat” info on the history of the FMEA! You will all be sharing this with
everyone tonight at the dinner table!
Types of FMEA’s
There are many different types of FMEA’s. The basic premise is the same.
•  System FMEA: Performed on a product or service at the early concept/design level when various modules all tie together. All the module level FMEA’s tie together to form a system. As you go lower into a system more failure modes are considered.
–  Example: Electrical system of a car, consists of the following modules: battery, wiring harness, lighting control module and alternator/regulator.
–  System FMEA focuses on potential failure modes associated with the modules of a system caused by design.
•  Design DFMEA: Performed early in the design phase to analyze product fail modes before they are released to production. The purpose is to analyze how fail modes affect the system and minimize them. The severity rating of a fail mode MUST be carried into the Process PFMEA.
•  Process PFMEA: Performed in the early quality planning phase of manufacturing to analyze fail modes in manufacturing and transactional processes that may escape to the customer. The failure modes and the potential sources of defects are rated and corrective action taken based on a Pareto analysis ranking.
•  Equipment FMEA: Used to analyze failure modes in the equipment used in a process to detect or make the part.
–  Example: Test Equipment fail modes to detect open and short circuits.
Purpose of FMEA
FMEA’s:
•  Improve the quality, reliability and safety of products.
•  Increase customer satisfaction.
•  Reduce product development time and cost.
•  Document and track actions taken to reduce risk and improve the process.
•  Focus on continuous problem prevention not problem solving.
Who Creates FMEA’s and When?
FMEA’s are a team tool like most in this phase of the methodology. They are applicable in most every project, manufacturing or service based.
For all intents and purposes they will be used in conjunction with your problem solving project to characterize and measure process variables. In some cases the FMEA will manifest itself as a management tool when the project concludes and in some cases it will not be appropriate to be used in that nature.
Who
•  The focused team working on a breakthrough project.
•  ANYONE who had or has a role in defining, executing, or changing the process.
•  This includes:
–  Associates
–  Technical Experts
–  Supervisors
–  Managers
–  Etc.
When
•  Process FMEA’s should be started:
–  At the conceptual design phase.
•  Process FMEA’s should be updated:
–  When an existing design or process is being changed.
–  When carry-over designs or processes will be used in new applications and environments.
–  When a problem solving study is completed and needs to be documented.
•  System FMEA’s should be created after system functions are defined but before specific hardware is selected.
•  Design FMEA’s should be created when new systems, products and processes are being designed.
Why Create an FMEA?
As a means to manage… RISK!!!
FMEA’s help you manage RISK by classifying your process inputs and monitoring their effects. This is extremely important during the course of your project work.
We want to avoid causing failures in the Process as well as the Primary & Secondary Metrics.
The FMEA…
This is an FMEA. We have provided a template for you to use.
FMEA template columns, left to right: # | Process Function (Step) | Potential Failure Modes (process defects) | Potential Failure Effects (Y's) | SEV | Class | Potential Causes of Failure (X's) | OCC | Current Process Controls | DET | RPN | Recommend Actions | Responsible Person & Target Date | Taken Actions | SEV | OCC | DET | RPN
The template provides numbered rows (1 through 9), one per process step entry.
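The RPN column of the template is conventionally computed as Severity x Occurrence x Detection. The text has not spelled the formula out at this point, so treat the sketch below as the standard definition rather than a description of the supplied template; the rows and rankings are hypothetical.

```python
def rpn(sev, occ, det):
    """Risk Priority Number: product of the three 1-10 rankings."""
    return sev * occ * det


# Hypothetical FMEA rows: (process step, failure mode, SEV, OCC, DET)
rows = [
    ("Cook Pizza", "Burnt pizza", 7, 4, 3),
    ("Take Order", "Wrong address recorded", 5, 3, 6),
    ("Make Pizza", "Wrong ingredients", 6, 2, 2),
]

scored = [(step, mode, rpn(s, o, d)) for step, mode, s, o, d in rows]

# Highest RPN first: these are the entries to act on first
scored.sort(key=lambda row: row[2], reverse=True)
for step, mode, value in scored:
    print(f"{value:>4}  {step}: {mode}")
```

The second SEV, OCC, DET and RPN columns on the right of the template hold the same calculation repeated after the recommended actions have been taken.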
FMEA Components…#
The first column, highlighted here, is the Process Step Number: 1, 2, 3, 4, 5, etc.
FMEA Components…Process Step
The second column is the Name of the Process Step (Process Function). The FMEA should sequentially follow the steps documented in your Process Map. For example:
•  Phone
•  Dial Number
•  Listen for Ring
•  Say Hello
•  Introduce Yourself
•  Etc.
FMEA Components…Potential Failure Modes

The third column, Potential Failure Modes, refers to the mode in which the process could potentially fail. These are the defects caused by a C, P or N factor that could occur in the Process.
This information is obtained from Historical Defect Data.
FYI: A failure mode is a fancy name for a defect.
At a crossroads?
FMEA Components…Potential Failure Effects

The fourth column, Potential Failure Effects, is simply the effect of realizing the potential failure mode on the overall process. It focuses on the outputs of each step.
This information is usually obtained from your Process Map.
FMEA Components…Severity (SEV)

The fifth column, highlighted here, is the Severity ranking. This ranking should be developed based on the team’s knowledge of the process in conjunction with the predetermined scale.
The measure of Severity is a financial measure of the impact to the business of realizing a failure in the output.
Ranking Severity
The Automotive Industry Action Group, a consortium of the “Big Three” (Ford, GM and Chrysler), developed these criteria. If you do not like them, develop a set that fits your organization; just make sure it is standardized so everyone uses the same scale.
Ranking - Effect. Criteria: Severity of Effect Defined
10 - Hazardous: Without Warning. May endanger the operator. Failure mode affects safe vehicle operation and/or involves non-compliance with government regulation. Failure will occur WITHOUT warning.
9 - Hazardous: With Warning. May endanger the operator. Failure mode affects safe vehicle operation and/or involves non-compliance with government regulation. Failure will occur WITH warning.
8 - Very High. Major disruption to the production line. 100% of the product may have to be scrapped. Vehicle/item inoperable, loss of primary function. Customers will be very dissatisfied.
7 - High. Minor disruption to the production line. The product may have to be sorted and a portion (less than 100%) scrapped. Vehicle operable, but at a reduced level of performance. Customers will be dissatisfied.
6 - Moderate. Minor disruption to the production line. A portion (less than 100%) may have to be scrapped (no sorting). Vehicle/item operable, but some comfort/convenience item(s) inoperable. Customers will experience discomfort.
5 - Low. Minor disruption to the production line. 100% of product may have to be re-worked. Vehicle/item operable, but some comfort/convenience item(s) operable at a reduced level of performance. Customers will experience some dissatisfaction.
4 - Very Low. Minor disruption to the production line. The product may have to be sorted and a portion (less than 100%) re-worked. Fit/finish/squeak/rattle item does not conform. Most customers will notice the defect.
3 - Minor. Minor disruption to the production line. A portion (less than 100%) of the product may have to be re-worked online but out-of-station. Fit/finish/squeak/rattle item does not conform. Average customers will notice the defect.
2 - Very Minor. Minor disruption to the production line. A portion (less than 100%) of the product may have to be re-worked online but in-station. Fit/finish/squeak/rattle item does not conform. Discriminating customers will notice the defect.
1 - None. No effect.
* Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pgs 29-45. Chrysler Corporation, Ford Motor Company, General Motors Corporation.
Applying Severity Ratings to Your Process
The actual definitions of the severity are not so important as the fact that the team remains consistent in its use of the definitions. Next we show a sample of transactional severities.
•  The guidelines presented on the previous slide were developed for the auto industry.
•  This was included only as a guideline.... actual results may vary for your project.
•  Your severity may be linked to impact on the business or impact on the next customer, etc.
You will need to define your own criteria… and be consistent throughout your FMEA.
Let’s brainstorm how we might define the following SEVERITY levels in our own projects: 1, 5, 10
Sample Transactional Severities
Ranking - Effect. Criteria: Impact of Effect Defined
10 - Critical Business Unit-wide. May endanger company’s ability to do business. Failure mode affects process operation and/or involves noncompliance with government regulation.
9 - Critical Loss - Customer Specific. May endanger relationship with customer. Failure mode affects product delivered and/or customer relationship due to process failure and/or noncompliance with government regulation.
7 - High. Major disruption to process/production down situation. Results in near 100% rework or an inability to process. Customer very dissatisfied.
5 - Moderate. Moderate disruption to process. Results in some rework or an inability to process. Process is operable, but some workarounds are required. Customers experience dissatisfaction.
3 - Low. Minor disruption to process. Process can be completed with workarounds or rework at the back end. Results in reduced level of performance. Defect is noticed and commented upon by customers.
2 - Minor. Minor disruption to process. Process can be completed with workarounds or rework at the back end. Results in reduced level of performance. Defect noticed internally but not externally.
1 - None. No effect.
Shown here is an example for severity guidelines developed for a financial services company.
FMEA Components…Classification “Class”
Class should categorize each step as:
•  Controllable (C)
•  Procedural (P)
•  Noise (N)
This information can be obtained in the Process Map.
Recall the classifications of Procedural, Controllable and Noise developed when constructing your Process Map and Fishbone Diagram. Use those classifications from the Fishbone in the “Class” column, highlighted here, in the FMEA.
Potential Causes of Failure (X’s)

The column “Potential Causes of the Failure”, highlighted here, refers to how the failure could occur.
This information should be obtained from the Fishbone Diagram.
FMEA Components…Occurrence “OCC”
The column “Occurrence”, highlighted here, refers to how frequently the specified failure is projected to occur.
This information should be obtained from Capability Studies or Historical Defect Data in conjunction with the predetermined scale.
Ranking Occurrence
The Automotive Industry Action Group developed these Occurrence rankings.
Probability of Failure                                    Possible Failure Rates   Cpk      Ranking
Very High: Failure is almost inevitable.                  ≥ 1 in 2                 < 0.33   10
                                                          1 in 3                   ≥ 0.33   9
High: Generally associated with processes similar to      1 in 8                   ≥ 0.51   8
previous processes that have often failed.                1 in 20                  ≥ 0.67   7
Moderate: Generally associated with processes similar     1 in 80                  ≥ 0.83   6
to previous processes that have experienced occasional    1 in 400                 ≥ 1.00   5
failures but not in major proportions.                    1 in 2,000               ≥ 1.17   4
Low: Isolated failures associated with similar            1 in 15,000              ≥ 1.33   3
processes.
Very Low: Only isolated failures associated with          1 in 150,000             ≥ 1.5    2
almost identical processes.
Remote: Failure is unlikely. No failures ever             ≤ 1 in 1,500,000         ≥ 1.67   1
associated with almost identical processes.
Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pg. 35.. Chrysler Corporation, Ford Motor Company, General Motors Corporation.
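As a rough illustration of using the predetermined scale, the Cpk thresholds in the Occurrence table can be turned into a simple lookup. This is only a sketch: the function name `occurrence_ranking` is hypothetical and the thresholds are copied from the table above.

```python
# Thresholds copied from the Occurrence table: (minimum Cpk, ranking).
# Listed from best process (ranking 1) to worst (ranking 9).
CPK_TO_RANKING = [
    (1.67, 1), (1.50, 2), (1.33, 3), (1.17, 4), (1.00, 5),
    (0.83, 6), (0.67, 7), (0.51, 8), (0.33, 9),
]


def occurrence_ranking(cpk: float) -> int:
    """Return the Occurrence ranking (1 to 10) for a given Cpk."""
    for min_cpk, ranking in CPK_TO_RANKING:
        if cpk >= min_cpk:
            return ranking
    return 10  # Cpk below 0.33: failure is almost inevitable


print(occurrence_ranking(1.4))  # falls in the ">= 1.33" row
```

A Capability Study that reports a Cpk of 1.4 would therefore be entered in the OCC column as a 3.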
FMEA Components…Current Process Controls

Current Process Controls refers to the three types of controls that are in place to prevent a failure associated with the X's. The 3 types of controls are:
•  SPC (Statistical Process Control)
•  Poka-Yoke (Mistake Proofing)
•  Detection after Failure (Inspection)
Ask yourself: How do we control this defect?
The column “Current Process Controls” highlighted here refers to the three types of controls that are in place to prevent a failure.
FMEA Components…Detection (DET)

Detection is an assessment of the probability that the proposed type of control will detect a subsequent Failure Mode.
This information should be obtained from your Measurement System Analysis Studies and the Process Map. A rating should be assigned in conjunction with the predetermined scale.
The “Detection” column highlighted here is an assessment of the probability that the proposed type of control will detect a subsequent failure mode.
Ranking Detection
The automotive manufacturers cited below developed these Detection criteria.
Criteria: The likelihood that the existence of a defect will be detected by the test content before the product advances to the next or subsequent process.
Detection | Criteria | Ranking
Almost Impossible | Test content must detect < 80% of failures | 10
Very Remote | Test content must detect 80% of failures | 9
Remote | Test content must detect 82.5% of failures | 8
Very Low | Test content must detect 85% of failures | 7
Low | Test content must detect 87.5% of failures | 6
Moderate | Test content must detect 90% of failures | 5
Moderately High | Test content must detect 92.5% of failures | 4
High | Test content must detect 95% of failures | 3
Very High | Test content must detect 97.5% of failures | 2
Almost Certain | Test content must detect 99.5% of failures | 1
Potential Failure Mode and Effects Analysis (FMEA), AIAG Reference Manual, 2002. Pg. 35. Chrysler Corporation, Ford Motor Company, General Motors Corporation.
Risk Priority Number “RPN”
The Risk Priority Number is a value that will be used to rank order the concerns from the process.
The RPN is the product of Severity, Occurrence and Detectability as represented here:
RPN = (SEV)*(OCC)*(DET)
We provided you with a template which will automatically calculate this for you based on your inputs for Severity, Occurrence and Detection.
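The RPN calculation and the rank ordering it supports can be sketched in a few lines. This is a minimal illustration, not the template referred to in the course: the `FmeaRow` class, the process steps and the SEV/OCC/DET scores are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One FMEA line with its three 1-to-10 ratings (hypothetical structure)."""
    step: str
    failure_mode: str
    sev: int
    occ: int
    det: int

    @property
    def rpn(self) -> int:
        # RPN = (SEV) * (OCC) * (DET)
        return self.sev * self.occ * self.det


# Invented example rows, for illustration only.
rows = [
    FmeaRow("Pick items", "Wrong item picked", sev=8, occ=3, det=3),
    FmeaRow("Enter order", "Wrong quantity keyed", sev=7, occ=4, det=6),
    FmeaRow("Ship order", "Late shipment", sev=5, occ=6, det=2),
]

# Rank order the concerns: highest RPN first.
ranked = sorted(rows, key=lambda r: r.rpn, reverse=True)
for r in ranked:
    print(f"{r.rpn:4d}  {r.step}: {r.failure_mode}")
```

Sorting by RPN puts the "Enter order" row first even though it does not have the highest Severity, which is the point of multiplying the three ratings.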
FMEA Components…Actions
Recommended Actions refers to the activity for the prevention of a

defect.
Responsible Person & Date refers to the name of the group or

person responsible for completing the activity and when they will
complete it.
Taken Action refers to the action and effective date after it has been
completed.
The columns highlighted here are a type of post FMEA. Remember to update the FMEA throughout your project. This is what we call a “Living Document” because it changes throughout your project.
FMEA Components…Adjust RPN

Once the Recommended Actions, Responsible Person & Date and Taken Action have been completed, the Severity, Occurrence and Detection should be adjusted. This will result in a new RPN rating.
The columns highlighted here are the adjusted levels based on the actions you have taken within the
process.
FMEA Exercise
Exercise objective: Assemble your team to create an FMEA using the information generated from the Process Map, Fishbone Diagram and X-Y Matrix.
1.  Be prepared to present results to your mentor.
OK Team, let’s get

that FMEA!
Process Discovery
§  Create a high-level Process Map
§  Create a Fishbone Diagram
§  Create an X-Y Diagram
§  Create an FMEA
§  Describe the purpose of each tool and when it should be used
You have now completed Measure Phase – Process Discovery.
Notes
Lean Six Sigma

Green Belt Training
Measure Phase
Now we will continue in the Measure Phase with “Six Sigma Statistics”.
Overview
In this module you will learn how your processes speak to you in the form of data. If you are to understand the behaviors of your processes you must learn to communicate with the process in the language of data.
The field of statistics provides tools and techniques to act on data, to turn data into information and knowledge which you will then use to make decisions and to manage your processes.
The statistical tools and methods you will need to understand and optimize your processes are not difficult. Use of Excel spreadsheets or specific statistical analytical software has made this a relatively easy task.
Measure Phase roadmap:
Welcome to Measure
Process Discovery
Six Sigma Statistics
–  Basic Statistics
–  Descriptive Statistics
–  Normal Distribution
–  Assessing Normality
–  Special Cause / Common Cause
–  Graphing Techniques
Process Capability
Wrap Up & Action Items
In this module you will learn basic, yet powerful, analytical approaches and tools to increase your
ability to solve problems and manage process behavior.
Purpose of Basic Statistics
The purpose of Basic Statistics is to:

•  Provide a numerical summary of the data being analyzed.
–  Data (n)
•  Factual information organized for analysis
•  Numerical or other information represented in a form suitable for processing by
computer
•  Values from scientific experiments
•  Provide the basis for making inferences about the future.

•  Provide the foundation for assessing process capability.
•  Provide a common language to be used throughout an
organization to describe processes.
Relax… it won't be that bad!
Statistics is the basic language of Six Sigma. A solid understanding of Basic Statistics is the
foundation upon which many of the subsequent tools will be based.
Having an understanding of Basic Statistics can be quite valuable. Statistics, however, like anything, can be taken to the extreme. It is not the need or the intent of this course to do that, nor is it the intent of Six Sigma. Six Sigma does not make people into statisticians; rather, it makes people into excellent problem solvers by using appropriate statistical techniques.
Data is like crude oil that comes out of the ground. Crude oil by itself is not of much use. However, if the crude oil is refined many useful products result, such as medicines, fuel, food products, lubricants, etc. In a similar sense statistics can refine data into usable “products” that aid decision making and help us see and understand what is happening.
Statistics is broadly used by just about everyone today. Sometimes we just do not realize it. Something as simple as using a graph to better understand a situation is a form of statistics, as are the many opinion and political polls used today. With easy to use software tools reducing the difficulty and time needed for statistical analyses, knowledge of statistics is becoming a common capability.
An understanding of Basic Statistics is also one of the differentiating features of Six Sigma and it
would not be possible without the use of computers and programs like MINITAB™. It has been
observed the laptop is one of the primary reasons Six Sigma has become both popular and
effective.
Statistical Notation – Cheat Sheet

Use this as a cheat sheet; do not bother memorizing all of this. Most of the notation in Greek is for population data.
Σ : Summation
s : The Standard Deviation of sample data
σ : The Standard Deviation of population data
s² : The variance of sample data
σ² : The variance of population data
R : The range of data
R-bar : The average range of data
k : Multi-purpose notation, i.e. # of subgroups, # of classes
|x| : The absolute value of some term
>, < : Greater than, less than
≥, ≤ : Greater than or equal to, less than or equal to
X_i : An individual value, an observation
X_1 : A particular (1st) individual value
∀ : For each, all, individual values
X-bar : The Mean, average of sample data
X-double-bar : The grand Mean, grand average
µ : The Mean of population data
p : A proportion of sample data
P : A proportion of population data
n : Sample size
N : Population size
Parameters versus Statistics
Population: All the items that have the property of interest under study.
Frame: An identifiable subset of the population.
Sample: A significantly smaller subset of the population used to make an inference.
[Figure: a Population, the Frame within it, and Samples drawn from the Frame]
Population Parameters:
–  Arithmetic descriptions of a population
–  µ, σ, P, σ², N
Sample Statistics:
–  Arithmetic descriptions of a sample
–  X-bar, s, p, s², n
The purpose of sampling is:

To get a “sufficiently accurate” inference for considerably less time, money and other resources.
Also to provide a basis for statistical inference; if sampling is done well, and sufficiently, then the
inference is “what we see in the sample is representative of the population”
A population parameter is a numerical value that summarizes the data for an entire population
while a sample has a corresponding numerical value called a statistic.
The population is a collection of all the individual data of interest. It must be defined carefully, for example “all the trades completed in 2001”. If for some reason there are unique subsets of trades it may be appropriate to define those as a unique population, such as “all sub custodial market trades completed in 2001” or “emerging market trades”.
Sampling frames are complete lists and should be identical to a population with every element
listed only once. It sounds very similar to population and it is. The difference is how it is used. A
sampling frame, such as the list of registered voters, could be used to represent the population of
adult general public. Maybe there are reasons why this would not be a good sampling frame.
Perhaps a sampling frame of licensed drivers would be a better frame to represent the general
public.
The sampling frame is the source for a sample to be drawn.
It is important to recognize the difference between a sample and a population because we are typically dealing with a sample of what the potential population could be in order to make an inference. The formulas for describing samples and populations are slightly different. In most cases we will be dealing with the formulas for samples.
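The "slightly different" formulas can be seen directly in Python's standard `statistics` module, which offers a sample version (dividing by n - 1) and a population version (dividing by N) of the Standard Deviation. The ten measurements below are invented for illustration, and treating them as an entire population is an assumption made only to show both calculations.

```python
import statistics

# Invented measurements; pretend these ten values are the whole population.
population = [4.98, 5.01, 5.00, 4.99, 5.02, 5.00, 4.97, 5.01, 5.00, 5.02]
sample = population[:5]  # a (non-random) sample, for illustration only

# Population parameters: mu and sigma (divide by N).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Sample statistics: X-bar and s (divide by n - 1).
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")
print(f"X-bar = {x_bar:.4f}, s = {s:.4f}")
```

For the same set of values the sample formula always gives a slightly larger Standard Deviation than the population formula, because it divides by n - 1 rather than N.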
Types of Data
Attribute Data (Qualitative)

–  Is always binary, there are only two possible values (0, 1)
•  Yes, No
•  Go, No Go
•  Pass/Fail
Variable Data (Quantitative)
–  Discrete (Count) Data
•  Can be categorized in a classification and is based on counts.
–  Number of defects
–  Number of defective units
–  Number of customer returns
–  Continuous Data
•  Can be measured on a continuum, it has decimal subdivisions that are
meaningful
–  Time
–  Money
–  Pressure
–  Conveyor Speed
–  Material feed rate
The nature of data is important to understand. The type of data you have determines which analyses you can use.
Data, or numbers, are usually abundant and available to virtually everyone in the organization.
Using data to measure, analyze, improve and control processes forms the foundation of the Six
Sigma methodology. Data turned into information, then transformed into knowledge, lowers the risks
of improper decision making. Your goal is to make more decisions based on data versus the typical
practices of “I think”, “I feel” and “In my opinion”.
One of your first steps in refining data into information is to recognize what type of data you are using. There are two primary types of data: Attribute and Variable.
Attribute Data is also called qualitative data. Attribute Data is the lowest level of data. It is purely
binary in nature. Good or bad, yes or no type data. No analysis can be performed on Attribute Data.
Attribute Data must be converted to a form of variable data called discrete data in order to be
counted or be useful.
Discrete Data is information that can be categorized into a classification. Discrete Data is based on counts. It is typically things counted in whole numbers. Discrete Data is data that cannot be broken down into a smaller unit to provide additional meaning. Only a finite number of values is possible and the values cannot be subdivided meaningfully. For example, there is no such thing as half of a defect or half of a system lockup.
Continuous Data is information that can be measured on a continuum or scale. Continuous Data,
also called quantitative data can have almost any numeric value and can be meaningfully
subdivided into finer and finer increments, depending upon the precision of the measurement
system. Decimal sub-divisions are meaningful with Continuous Data. As opposed to Attribute Data
like good or bad, off or on, etc., Continuous Data can be recorded at many different points (length,
size, width, time, temperature, cost, etc.). For example 2.543 inches is a meaningful number,
whereas 2.543 defects does not make sense.
Later in the course we will study many different statistical tests but it is first important to understand
what kind of data you have.
Discrete Variables
Discrete Variable | Possible Values for the Variable
The number of defective needles in boxes of 100 diabetic syringes | 0,1,2, …, 100
The number of individuals in groups of 30 with a Type A personality | 0,1,2, …, 30
The number of surveys returned out of 300 mailed in a customer satisfaction study | 0,1,2, …, 300
The number of employees in 100 having finished high school or obtained a GED | 0,1,2, …, 100
The number of times you need to flip a coin before a head appears for the first time | 1,2,3, … (note, there is no upper limit because you might need to flip forever before the first head appears)
Shown here are additional Discrete Variables. Can you think of others within your business?
Continuous Variables
Continuous Variable | Possible Values for the Variable
The length of prison time served for individuals convicted of first degree murder | All the real numbers between a and b, where a is the smallest amount of time served and b is the largest.
The household income for households with incomes less than or equal to $30,000 | All the real numbers between a and $30,000, where a is the smallest household income in the population
The blood glucose reading for those people having glucose readings equal to or greater than 200 | All real numbers between 200 and b, where b is the largest glucose reading in all such people
Shown here are additional Continuous Variables. Can you think of others within your business?
Definitions of Scaled Data
Understanding the nature of data and how to represent it can affect the types of statistical tests possible.
•  Nominal Scale – data consists of names, labels or categories. Cannot be arranged in an ordering scheme. No arithmetic operations are performed for nominal data.
•  Ordinal Scale – data is arranged in some order but differences between data values either cannot be determined or are meaningless.
•  Interval Scale – data can be arranged in some order and for which differences in data values are meaningful. The data can be arranged in an ordering scheme and differences can be interpreted.
•  Ratio Scale – data that can be ranked and for which all arithmetic operations including division can be performed (division by zero is of course excluded). Ratio level data has an absolute zero and a value of zero indicates a complete absence of the characteristic of interest.
Shown here are the four types of scales. It is important to understand these scales as they will
dictate the type of statistical analysis that can be performed on your data.
Nominal Scale
Listed are some examples of Nominal Data. The only analysis is whether they are different or not.
Qualitative Variable | Possible nominal level data values for the variable
Blood Types | A, B, AB, O
State of Residence | Alabama, …, Wyoming
Country of Birth | United States, China, other
Time to weigh in!
Ordinal Scale
These are examples of Ordinal Data.
Qualitative Variable | Possible Ordinal level data values
Automobile Sizes | Subcompact, compact, intermediate, full size, luxury
Product rating | Poor, good, excellent
Baseball team classification | Class A, Class AA, Class AAA, Major League
Interval Scale
This is an example of Interval Data.
Interval Variable | Possible Scores
IQ scores of students in Black Belt Training | 100… (the difference between scores is measurable and has meaning but a difference of 20 points between 100 and 120 does not indicate that one student is 1.2 times more intelligent)
Ratio Scale
Shown here is an example of Ratio Data.
Ratio Variable | Possible Scores
Grams of fat consumed per adult in the United States | 0… (If person A consumes 25 grams of fat and person B consumes 50 grams, we can say that person B consumes twice as much fat as person A. If a person C consumes zero grams of fat per day, we can say there is a complete absence of fat consumed on that day. Note that a ratio is interpretable and an absolute zero exists.)
Converting Attribute Data to Continuous Data
Continuous Data provides us more opportunity for statistical analyses. Attribute Data can often
be converted to Continuous by converting it to a rate.
Continuous Data is always more desirable.
In many cases Attribute Data can be converted to Continuous

Data.
Which is more useful?

–  15 scratches or total scratch length of 9.25
–  22 foreign materials or 2.5 fm/square inch
–  200 defects or 25 defects/hour
Is this data continuous?
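Converting a count to a rate is a single division by the amount of opportunity (time, area, units) over which the count was collected. The sketch below is illustrative only: the function name `to_rate` is hypothetical, and the 8 hours and 8.8 square inches are assumed exposures chosen so the results match the rates quoted above.

```python
def to_rate(count: int, exposure: float) -> float:
    """Convert a raw count into a rate per unit of exposure."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return count / exposure


# Assumed exposures (not stated in the text): 8 hours and 8.8 square inches.
defects_per_hour = to_rate(200, 8.0)
fm_per_square_inch = to_rate(22, 8.8)

print(f"{defects_per_hour:.1f} defects/hour")
print(f"{fm_per_square_inch:.1f} fm/square inch")
```

The rate is more useful than the raw count because it can be compared across shifts, parts or time periods of different sizes.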
Descriptive Statistics
We will review the Descriptive Statistics shown here that are the most commonly used.
Measures of Location (central tendency)
–  Mean
–  Median
–  Mode
Measures of Variation (dispersion)
–  Range
–  Interquartile Range
–  Standard deviation
–  Variance
1) For each of the measures of location, how alike or different are they?
2) For each measure of variation, how alike or different are they?
3) What do these similarities or differences tell us?
We are going to use the MINITAB™ worksheet shown here to create graphs and statistics. Open the MINITAB™ Project “Measure Data Sets.mpj” and select the worksheet “basicstatistics.mtw”.
Measures of Location
The Mean is the most common measure of location. The term “Mean” implies you are talking about the population or inferring something about the population. Conversely, “average” implies something about sample data.
Mean is:
•  Commonly referred to as the average.
•  The arithmetic balance point of a distribution of data.
Stat>Basic Statistics>Display Descriptive Statistics…>Graphs…
>Histogram of data, with normal curve
Sample: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Population: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
Descriptive Statistics: Data
Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
Although the symbol is different there is no mathematical difference between the Mean of a sample
and Mean of a population.
The physical center of a data set is the Median, and it is unaffected by large data values. This is why people use the Median when discussing the average salary of an American worker: people like Bill Gates and Warren Buffett skew the Mean.
Median is:
•  The mid-point, or 50th percentile, of a distribution of data.
•  Arrange the data from low to high or high to low.
–  It is the single middle value in the ordered list if there is an odd
number of observations
–  It is the average of the two middle values in the ordered list if there
are an even number of observations
Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
Measures of Location (cont.)
Trimmed Mean is a:
Compromise between the Mean and Median.
•  The Trimmed Mean is calculated by eliminating a specified percentage of
the smallest and largest observations from the data set and then
calculating the average of the remaining observations
•  Useful for data with potential extreme values.
Stat>Basic Statistics>Display Descriptive Statistics…>Statistics…> Trimmed Mean
Variable  N    N*  Mean    SE Mean   TrMean  StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  4.9999  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
The trimmed Mean (highlighted above) is less susceptible to the effects of extreme scores.
Mode is:
The most frequently occurring value in a distribution of data.
Mode = 5
It is possible to have multiple Modes. A distribution with two Modes is called Bi-Modal. Here we only have one; Mode = 5.
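The four measures of location can be compared side by side on a small data set with one extreme value. This is a sketch using Python's standard `statistics` module rather than MINITAB™; the salary figures are invented, and `trimmed_mean` is a hypothetical helper whose trimming proportion is a parameter, not a statement of how any particular package trims.

```python
import statistics


def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(data)
    k = int(len(ordered) * proportion)  # values trimmed from each tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(kept)


# Invented salaries in $1000s, with one extreme value.
salaries = [42, 45, 45, 48, 50, 52, 55, 58, 60, 900]

print("Mean:        ", statistics.mean(salaries))
print("Median:      ", statistics.median(salaries))
print("Mode(s):     ", statistics.multimode(salaries))
print("Trimmed Mean:", trimmed_mean(salaries, 0.10))
```

The single value of 900 pulls the Mean far above the bulk of the data, while the Median and the Trimmed Mean stay close to the typical values.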
Measures of Variation
Range is the:
Difference between the largest observation and the smallest observation
in the data set.
•  A small range would indicate a small amount of variability and a large range
a large amount of variability.

Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
Interquartile Range is the:

Difference between the 75th percentile and the 25th percentile.
Use Range or Interquartile Range when the data distribution is Skewed.
The Range is typically used for small data sets; it is completely efficient in estimating variation for a sample of 2. As the amount of data increases, the Standard Deviation becomes a more appropriate measure of variation.
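The Range and Interquartile Range can be computed with the standard library as shown below. This is an illustrative sketch on invented data; note that statistical packages use slightly different quartile interpolation rules, so quartiles from `statistics.quantiles` may not match another tool exactly on every data set.

```python
import statistics

# Invented data with one extreme value at the high end.
data = [35, 41, 44, 47, 49, 50, 51, 53, 56, 59, 120]

# Range: largest observation minus smallest observation.
data_range = max(data) - min(data)

# Quartiles: the 25th, 50th and 75th percentiles.
q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # Interquartile Range

print("Range:", data_range)
print("Q1:", q1, "Median:", median, "Q3:", q3)
print("Interquartile Range:", iqr)
```

The extreme value of 120 inflates the Range but has no effect on the Interquartile Range, which is why the latter is preferred for Skewed data.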
Variance is the:
Average squared deviation of each individual data point from the Mean.
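Sample: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$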
The Variance is the square of the Standard Deviation. It is common in statistical tests where it is
necessary to add up sources of variation to estimate the total.
Standard Deviations cannot be added, variances can.
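A short numeric check makes the point about adding variances concrete. The sketch assumes the sources of variation are independent, and the two Standard Deviations (3.0 and 4.0) are invented; `combined_sd` is a hypothetical helper name.

```python
import math


def combined_sd(*sds: float) -> float:
    """Total Standard Deviation of independent sources of variation.

    Variances (squared Standard Deviations) are summed, then the
    square root converts the total back to a Standard Deviation.
    """
    total_variance = sum(sd ** 2 for sd in sds)
    return math.sqrt(total_variance)


sd_machine = 3.0   # invented value
sd_material = 4.0  # invented value

print("Adding variances:", combined_sd(sd_machine, sd_material))
print("Adding Standard Deviations (wrong):", sd_machine + sd_material)
```

Adding the variances gives 9 + 16 = 25, so the total Standard Deviation is 5.0, not the 7.0 obtained by adding the Standard Deviations directly.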
Measures of Variation (cont.)
Standard Deviation is:

•  Equivalent of the average deviation of values from the Mean for a
distribution of data.
•  A unit of measure for distances from the Mean.
•  Use when data are symmetrical.
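Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200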
Cannot calculate population Standard Deviation because this is sample data.
The Standard Deviation for a sample and for a population can be equated with short-term and long-term variation. Usually a sample is taken over a short period of time, which leaves it free from the types of variation that accumulate over time, so be aware of this when interpreting it. We will explore this further at a later point in the methodology.
Normal Distribution
The Normal Distribution is the most recognized distribution in

statistics.
What are the characteristics of a Normal Distribution?

–  Only random error is present
–  Process free of assignable cause
–  Process free of drifts and shifts
So what is present when the data is Non-normal?
We can begin to discuss the Normal Curve and its properties once we understand the basic
concepts of central tendency and dispersion.
As we begin to assess our distributions know that sometimes it is actually more difficult to determine what is affecting a process if it is Normally Distributed. When we have a Non-normal Distribution there are usually special or more obvious causes of variation that become readily apparent upon process investigation.
The Normal Curve
The Normal Curve is a smooth, symmetrical, bell-shaped curve generated by the density function.
It is the most useful continuous probability model as many naturally occurring measurements such as heights, weights, etc. are approximately Normally Distributed.
The Normal Distribution is the most commonly used and abused distribution in statistics and serves as the foundation of many statistical tools which will be taught later in the methodology.
Normal Distribution
Each combination of Mean and Standard Deviation generates a unique Normal curve.
“Standard” Normal Distribution:
–  Has a µ = 0, and σ = 1
–  Data from any Normal Distribution can be made to fit the standard Normal by converting raw scores to standard scores.
–  Z-scores measure how many Standard Deviations from the mean a particular data-value lies.
The shape of the Normal Distribution is a function of two parameters (the Mean and the Standard Deviation).
We will convert the Normal Distribution to the standard Normal in order to compare various Normal Distributions and to estimate tail area proportions.
Standardizing a Normal Distribution converts the raw scores into Z-scores with a Mean of 0 and Standard Deviation of 1; this practice allows us to use the Z-table.
Normal Distribution (cont.)
The area under the curve between any two points represents the
proportion of the distribution between those points.
The area between the Mean and any other point depends upon the Standard Deviation.
[Figure: Normal curve with the Mean µ and a point x marked]
Convert any raw score to a Z-score using the formula:
$Z = \frac{x - \mu}{\sigma}$
Refer to a set of Standard Normal Tables to find the proportion between µ and x.
The area under the curve between any two points represents a proportion of the distribution. The
concept of determining the proportion between 2 points under the standard Normal curve is a critical
component to estimating Process Capability and will be covered in detail in that module.
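The Z-score conversion and the table lookup can be reproduced with `statistics.NormalDist` from the Python standard library, which plays the role of the Standard Normal Table here. The Mean of 50, Standard Deviation of 5 and raw score of 57.5 are invented values for illustration.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of Standard Deviations that x lies from the Mean."""
    return (x - mu) / sigma


# Invented process values.
mu, sigma = 50.0, 5.0
x = 57.5

z = z_score(x, mu, sigma)

# Proportion of the distribution between the Mean and x,
# read from the standard Normal curve (mu = 0, sigma = 1).
standard_normal = NormalDist()
area_between = standard_normal.cdf(z) - standard_normal.cdf(0.0)

print(f"Z = {z:.2f}")
print(f"Proportion between the Mean and x = {area_between:.4f}")
```

A raw score 1.5 Standard Deviations above the Mean has roughly 43% of the distribution lying between it and the Mean, the same value a printed Z-table gives for Z = 1.50.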
Empirical Rule
The Empirical Rule allows us to predict, or more appropriately, make an estimate of how our process is performing. You will gain a great deal of understanding within the Process Capability module. Notice the difference between +/- 1 SD and +/- 6 SD.
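The proportions behind the Empirical Rule can be computed for a Normal Distribution rather than memorized. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `proportion_within` is hypothetical.

```python
from statistics import NormalDist


def proportion_within(k: float) -> float:
    """Proportion of a Normal Distribution within +/- k Standard Deviations."""
    standard_normal = NormalDist()  # mu = 0, sigma = 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in range(1, 7):
    print(f"+/- {k} SD: {proportion_within(k):.7%}")
```

About 68% of a Normal Distribution lies within +/- 1 SD, about 95% within +/- 2 SD and about 99.7% within +/- 3 SD; by +/- 6 SD virtually the entire distribution is covered.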
The Empirical Rule (cont.)
No matter what the shape of your distribution, as you travel 3 Standard Deviations from the Mean the probability of occurrence beyond that point begins to converge to a very low number.
Why Assess Normality?
While many processes in nature behave according to the Normal Distribution, many processes in business, particularly in the areas of service and transactions, do not.
There are many types of distributions.
There are many statistical tools that assume Normal Distribution properties in their calculations.
So understanding just how Normal the data are will impact how we look at the data.
There is no good and bad. It is not always better to have “Normal” data; look at it in respect to the intent of your project. Again, there is much informational content in non-Normal Distributions; for this reason it is useful to know how Normal our data are.
Go back to your project: what do you want to do with your distribution, Normal or Non-normal? Many distributions simply by nature can NOT be Normal. Assume that you are dealing with a time metric: how do you get negative time without having a flux capacitor as in the movie “Back to the Future”? If your metric is by nature bounded in this way, its distribution cannot be truly Normal.
Tools for Assessing Normality
The shape of any Normal curve can be calculated based on the Normal Probability density function.
Tests for Normality basically compare the shape of the calculated curve to the actual distribution of your data points.
For the purposes of this training we will focus on two ways in MINITAB™ to assess Normality:
–  The Anderson-Darling test
–  Normal probability test
The Anderson-Darling test yields a statistical assessment (called a goodness-of-fit test) of Normality and the MINITAB™ version of the Normal probability test produces a graph to visually demonstrate just how good that fit is.
Watch that curve!
Goodness-of-Fit
The Anderson-Darling test uses an empirical cumulative distribution function.
[Figure: Cumulative Percent versus Raw Data Scale, comparing the Actual Data with the curve Expected for Normal Distribution]
The gaps between the two curves are the departure of the actual data from the expected Normal Distribution. The Anderson-Darling Goodness-of-Fit test assesses the magnitude of these departures using an Observed minus Expected formula.
The Anderson-Darling test assesses how closely the actual cumulative frequency at a given value corresponds to the theoretical cumulative frequency for a Normal Distribution with the same Mean and Standard Deviation.
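For readers who want to see the calculation outside MINITAB™, the sketch below computes the Anderson-Darling statistic (A-squared) for a Normal Distribution fitted with the sample Mean and Standard Deviation, then converts it to an approximate P-value. The P-value uses the commonly published D'Agostino and Stephens approximation; that choice, and the function names, are assumptions of this example rather than a description of MINITAB™ internals.

```python
import math
from statistics import NormalDist, mean, stdev


def ad_p_value(a2: float, n: int) -> float:
    """Approximate P-value for A-squared when Mean and StDev are estimated."""
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    if a >= 0.6:
        return math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    if a > 0.34:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    if a > 0.2:
        return 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    return 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a ** 2)


def anderson_darling(data):
    """Return (A-squared, approximate P-value) for a test of Normality."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # Normal with the sample Mean and StDev
    cdf = [fitted.cdf(v) for v in x]        # expected cumulative proportions
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    return a2, ad_p_value(a2, n)


# Check the approximation against an AD of 0.177 with N = 500.
print(round(ad_p_value(0.177, 500), 3))
```

Feeding in the AD value of 0.177 and N of 500 from the MINITAB™ output in this module gives a P-value close to the reported 0.921.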
The Normal Probability Plot
From the MINITAB™ .mpj file look up the worksheet “Descriptive Statistics.MTW” and use column C5 titled “Anderson Darling” to perform the Normality test as shown: Stat>Basic Statistics>Normality Test. Choose “Anderson Darling” and click “OK”.
P-value 0.921
The Anderson-Darling test is a good litmus test for normality: if the P-value is more than .05 your data are Normal enough for most purposes.
The graph shows the cumulative probability of your data plotted against what a Normal curve would predict. Notice the y-axis (probability) does not increase linearly; it is scaled so that Normally Distributed data fall on a straight line. When the data fit a Normal Distribution the points (closed red circles) will be on or very close to the Gaussian model (the blue line) in this analysis. A “P-value” of 0.921 (which is > 0.05) gives us no evidence against a “Normal Distribution” for the 500 points plotted in this example. There are a few values on the higher side that tend to deviate away from the model. This means there are a few Outliers on the higher side. However, there are not enough to disrupt the “Normal Distribution” pattern as we have a large set of 500 data points.
The Anderson-Darling test also appears in this output. Again, if the P-value is greater than .05 assume the data are Normal.
P-value = 0.921
The reasoning behind the decision to assume Normality based on the P-value will be covered in the Analyze Phase. For now just accept this as a general guideline.
Anderson-Darling Caveat
Use the Anderson Darling column to generate these graphs.

[Figure: Probability Plot of Anderson Darling (Normal) and graphical Summary for Anderson Darling, with histogram and 95% Confidence Intervals for the Mean and Median]
Probability Plot: Mean 50.03, StDev 4.951, N 500, AD 0.177, P-Value 0.921
Anderson-Darling Normality Test: A-Squared 0.18, P-Value 0.921
Mean 50.031, StDev 4.951, Variance 24.511, Skewness -0.061788, Kurtosis -0.180064, N 500
Minimum 35.727, 1st Quartile 46.800, Median 50.006, 3rd Quartile 53.218, Maximum 62.823
95% Confidence Interval for Mean: 49.596 to 50.466
95% Confidence Interval for Median: 49.663 to 50.500
95% Confidence Interval for StDev: 4.662 to 5.278
In this case both the Histogram and the Normality Plot look very Normal. However, with a sample size this large the Anderson-Darling test is very sensitive, and any slight deviation from Normal can cause the P-value to be very low. Again, the topic of sensitivity will be covered in greater detail in the Analyze Phase.
For now, just assume that if N > 100 and the data look Normal, then they probably are.
If the Data Are Not Normal, Do Not Panic!
•  Normal Data are not common in the transactional world.
•  There are lots of meaningful statistical tools you can use to analyze your data (more on that later).
•  It just means you may have to think about your data in a slightly different way.
Once again, Non-normal Data is NOT a bad thing depending on the type of process / metrics you are working with. Sometimes it can even be exciting to have Non-normal Data because in some ways it represents opportunities for improvements.
Don't touch that button!

Normality Exercise
Exercise objective: To demonstrate how to test for Normality.
1.  Generate Normal Probability Plots and the graphical summary using the Descriptive Statistics.MTW file.
2.  Use only the columns Dist A and Dist D.
3.  Answer the following quiz questions based on your analysis of this data set.
Answers:
1) Is Distribution A Normal? Answer > No
2) Is Distribution D Normal? Answer > No
Isolating Special Causes from Common Causes
Special Cause: Variation caused by known factors resulting in a non-random distribution of output. Also referred to as Assignable Cause.
Common Cause: Variation caused by unknown factors resulting in a steady but random distribution of output around the average of the data. It is the variation left over after Special Cause variation has been removed and typically (not always) follows a Normal Distribution.
If we know the basic structure of the data should follow a Normal Distribution but plots from our data show otherwise, the data most likely contain Special Causes.
Special Causes = Opportunity
Do not get too worried about killing all variation; get the biggest bang for your buck and start making improvements by following the methodology. Many companies today can realize BIG gains and reductions in variation by simply measuring, describing the performance and then making common sense adjustments within the process…recall the “ground fruit”?
Think about your data in terms of what it should look like, then compare it to what it does look like. See some deviation, maybe some Special Causes at work?
156
Introduction to Graphing
The purpose of Graphing is to:
•  Identify potential relationships between variables.
•  Identify risk in meeting the critical needs of the Customer, Business and People.
•  Provide insight into the nature of the X’s that may or may not control Y.
•  Show the results of passive data collection.
In this section we will cover…
1.  Box Plots
2.  Scatter Plots
3.  Dot Plots
4.  Time Series Plots
5.  Histograms
Passive data collection means do not mess with the process! We are gathering data and looking for patterns in a graphical tool. If the data is questionable, so is the graph we create from it. For now utilize the data available; we will learn a tool called Measurement System Analysis later in this phase.
Data Sources
Data sources are suggested by many of the tools that have been covered so far:
–  Process Map
–  X-Y Matrix
–  FMEA
–  Fishbone Diagrams
Examples are:
1. Time: Shift, Day of the week, Week of the month, Season of the year
2. Location/position: Facility, Region, Office
3. Operator: Training, Experience, Skill, Adherence to procedures
4. Any other sources?
Data demographics will come out of the basic Measure Phase tools such as Process Maps, X-Y Diagrams, FMEAs and Fishbones. Put your focus on the top X’s from the X-Y Diagram to focus your activities.
157
Graphical Concepts
The characteristics of a good graph include:
•  Variety of data
•  Selection of
–  Variables
–  Graph
–  Range
Information to interpret relationships
Explore quantitative relationships
The characteristics of a graph are critical to the graphing process. The validity of data allows us to understand the extent of error in the data. The selection of variables impacts how we can control a specific output of a process. The type of graph will depend on the data demographics while the range will be related to the needs of the customer. The visual analysis of the graph will qualify further investigation of the quantitative relationship between the variables.
The Histogram
A Histogram displays data that have been summarized into intervals. It can be used to assess the symmetry or Skewness of the data.
To construct a Histogram the horizontal axis is divided into equal intervals with a vertical bar drawn at each interval to represent its frequency (the number of values that fall within the interval).
A Histogram is a basic graphing tool that displays the relative frequency or the number of times a measured item falls within a certain cell size. The values for the measurements are shown on the horizontal axis (in cells) and the frequency of each size is shown on the vertical axis as a bar graph. The graph illustrates the distribution of the data by showing which values occur most and least frequently. A Histogram illustrates the shape, centering and spread of the data you have. It is very easy to construct and an easy to use tool that you will find useful in many situations. This graph represents the data for the 20 days of arrival times at work from the previous lesson page.
In many situations the data will form specific shaped distributions. One very common distribution
you will encounter is called the Normal Distribution, also called the bell shaped curve for its
appearance. You will learn more about distributions and what they mean throughout this course.
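The construction rule for a Histogram (equal intervals, then count the values in each interval) can be sketched in a few lines of Python. This is only an illustration: the data values and the function name are made up, and MINITAB™ chooses its own interval boundaries.

```python
def histogram_counts(data, n_bins):
    """Divide the range of the data into equal intervals and count values in each."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all values are identical; no intervals can be formed")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in data:
        idx = int((value - lo) / width)
        if idx == n_bins:  # the maximum value belongs in the last interval
            idx -= 1
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Made-up measurements, for illustration only
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts = histogram_counts(values, 4)

# Print a simple text Histogram: one row per interval
for i, count in enumerate(counts):
    print(f"{edges[i]:5.1f} to {edges[i + 1]:5.1f} | " + "#" * count)
```

The long gap before the value 9 shows up as a nearly empty interval, which is the kind of shape information a Histogram is meant to reveal.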
158
Histogram Cont’d.
Choose the worksheet titled “Graphing Data.MTW” from the project file. Now perform the Histogram based on the following steps: Graph>Histogram>Simple as shown on the three screenshots here. The next step to plotting the four Histograms on the same slide (as 4 in 1) is shown next.
Once you select the data columns H1_20, H2_20, H3_20, H4_20 and click “Select” you would see those variables displayed on the inside insert window of this screen as “H1_20-H4_20”. Now prior to clicking “Ok” you can select the option of “Multiple Graphs” and choose the option “In separate panels of the same graph”. Click “Ok” on this window and then the final window to arrive at the 4-in-1 Histogram as shown on the next slide.
159
Histogram Caveat
All the Histograms below were generated using random samples of the data from the worksheet “Graphing Data.mtw”.
As you can see in the MINITAB™ file the columns used to generate the Histograms here only have 20 data points. It is easy to generate your own samples to create a Histogram simply by using the MINITAB™ menu path: “Calc>Random Data>Sample from columns…”
Be careful not to determine Normality simply from a Histogram plot since if the sample size is low the data may not look very Normal.
Variation on a Histogram
Using the worksheet “Graphing Data.mtw” create a simple Histogram for the data column called granular.
The Histogram shown here looks to be very Normal.
160
Dot Plot
Using the worksheet “Graphing Data.mtw”, create a Dot Plot.
The Dot Plot can be a useful alternative to the Histogram especially if you want to see individual values or you want to brush the data.
The Histogram for the granular distribution obscures the granularity whereas the Dot Plot reveals it. Also Dot Plots allow the user to brush data points. The Histogram does not.
Points could have Special Causes associated with them. These occurrences should also be identified in the Logbook in order to assess the potential for a special cause related to them. You should look for potential Special Cause situations by examining the Dot Plot for both high frequencies and location.
If in fact there are Special Causes (Uncontrollable Noise or Procedural non-compliance) they should
be addressed separately then excluded from this analysis. Take a few minutes and create other Dot
Plots using the columns in this data set.
Box Plot
Box Plots summarize data about the shape, dispersion and center of the data and also help spot Outliers.
Box Plots require one of the variables, X or Y, be categorical or Discrete and the other be Continuous.
A minimum of 10 observations should be included in generating the Box Plot.
[Figure: Box Plot with labels, top to bottom: Maximum Value; 75th Percentile; 50th Percentile (Median); Mean; 25th Percentile; min(1.5 x Interquartile Range or minimum value); Outliers. The box contains the middle 50% of the data.]
A Box Plot (sometimes called a Whisker Plot) is made up of a box representing the central mass of the variation and thin lines, called whiskers, extending out on either side representing the thinning tails of the distribution. Box Plots summarize information about the shape, dispersion and center of your data. Because of their concise nature it is easy to compare multiple distributions side by side.
These may be “before” and “after” views of a process or a variable. Or they may be several alternative ways of conducting an operation.
Essentially when you want to quickly find out if two or more distributions are different (or the same) you create a Box Plot. They can also help you spot Outliers quickly which show up as asterisks on the chart.
161
Box Plot Anatomy
[Figure: Box Plot anatomy with labels, top to bottom: Outlier (*); Upper Limit: Q3+1.5(Q3-Q1); Upper Whisker; Q3: 75th Percentile; Q2: Median, 50th Percentile; Q1: 25th Percentile; Lower Whisker; Lower Limit: Q1-1.5(Q3-Q1)]
A Box Plot is based on quartiles and represents a distribution as shown on the left of the graphic. The lines extending from the box are called whiskers. The whiskers extend outward to indicate the lowest and highest values in the data set (excluding outliers). The lower whisker represents the first 25% of the data in the Histogram (the light grey area). The second and third quartiles form the box that represents 50% of the data and finally the whisker on the right represents the fourth quartile. The line drawn through the box represents the Median of the data. Extreme values, or Outliers, are represented by asterisks. A value is considered an Outlier if it is outside of the box (greater than Q3 or less than Q1) by more than 1.5 times (Q3-Q1).
You can use the Box Plot to assess the symmetry of the data: If the data are fairly symmetric, the
Median line will be roughly in the middle of the box and the whiskers will be similar in length. If the
data are skewed the Median may not fall in the middle of the box and one whisker will likely be
noticeably longer than the other.
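The Box Plot numbers described above (quartiles, the 1.5 x (Q3-Q1) limits, whisker ends and Outliers) can be worked out by hand or with a short Python sketch like the one below. The data are made up, the function name is illustrative, and the quartile method used here ("exclusive") is an assumption; other packages may interpolate quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_summary(data):
    """Quartiles, outlier limits, whisker ends and Outliers for one Box Plot."""
    q1, q2, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1                      # Interquartile Range (Q3 - Q1)
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [v for v in data if lower_limit <= v <= upper_limit]
    outliers = sorted(v for v in data if v < lower_limit or v > upper_limit)
    return {
        "Q1": q1,
        "Median": q2,
        "Q3": q3,
        "Lower Limit": lower_limit,
        "Upper Limit": upper_limit,
        "Lower Whisker": min(inside),  # lowest value that is not an Outlier
        "Upper Whisker": max(inside),  # highest value that is not an Outlier
        "Outliers": outliers,
    }


# Made-up data with one extreme value
cycle_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]
summary = box_plot_summary(cycle_times)
for name, value in summary.items():
    print(name, value)
```

With these values the box runs from 3 to 9, the limits are -6 and 18, and the value 30 is the only point that would be drawn as an asterisk.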
Box Plot Examples
What can you tell about the data expressed in these Box Plots?
The first Box Plot shows the differences in glucose level for nine different people.
The second Box Plot shows the effects of cholesterol medication over time for a group of patients.
Eat this – then check the Box Plot!
162
Box Plot Examples
Using the MINITAB™ worksheet “Graphing Data.mtw”.
The data shows the setup cycle time to complete “Lockout – Tagout” for three people in the department.
Looking only at the Box Plots it appears that Brian should be the benchmark for the department since he has the lowest Median setup cycle time with the smallest variation. On the other hand Shree’s data has 3 Outlier points that are well beyond what would be expected for the rest of the data and his variation is larger.
Be cautious drawing conclusions solely from a Box Plot. Shree may be the expert who is brought in
for special setups because no one else can complete the job.
163
Individual Value Plot Enhancement
Open the MINITAB™ Project “Measure Data Sets.mpj” and select the worksheet “Graphing Data.mtw”.
The individual value plot shows the individual data points represented in the Box Plot. There are many options available within MINITAB™; take a few minutes to explore the options within the dialog box found by following the menu path “Graph> Individual Value Plot> Multiple Y’s, Simple…”.
Attribute Y Box Plot
Using the Box Plot with an Attribute Y (pass/fail) and a Continuous X
Use the MINITAB™ worksheet “Graphing Data.mtw”. To create this Box Plot follow the MINITAB™ menu path “Graph> Box Plot…One Y, With Groups…Scale…Transpose value and category scales”.
If the output is pass/fail it must be plotted on the y axis. Use the data shown to create the transposed
Box Plot. The reason we do this is for consistency and accuracy.
164
Attribute Y Box Plot
The dialog box shown here can be found by selecting the “Scale” button in the “One Y, With Groups” dialog box.
The output Y is Pass/Fail; the Box Plot shows the spread of hydrogen content that created the results.
Individual Value Plot
Using the MINITAB™ worksheet “Graphing Data.mtw”, follow the MINITAB™ menu path
“Stat>ANOVA> One-Way (Unstacked )>Graphs…Individual value plot, Boxplots of data”, make both
graphs using the columns indicated and tile them.
The Individual Value Plot when used with a Categorical X or Y enhances the information provided in the Box Plot:
–  Recall the inherent problem with the Box Plot when a bimodal distribution exists (Box Plot looks perfectly symmetrical)
–  The Individual Value Plot will highlight the problem
165
Individual Value Plot
On the Individual Value Plot data points, click your mouse once and it selects all data points as shown on the screenshot above. Then click on “Editor” under your main menu and choose “Edit Individual Symbols…” to arrive at the window “Edit Individual Symbols.” At this window select “Identical Points” then move to the next slide.
Jitter Example
Once your graph is created click once on any of the data points (that action should select all the data points).
Then go to MINITAB™ menu path: Editor> Edit Individual Symbols>Identical Points>Jitter…
Increase the Jitter in the x-direction to .075, click OK, then click anywhere on the graph except on the data points to see the results of the change.
By using the Jitter function we will spread the data apart making it easier to see how many data points there are. This gives us relevance so we do not have points plotted on top of each other.
[Figure: Individual Value Plot of Weibull, Normal, Bi Modal (vertical axis: Data)]
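The idea behind Jitter can be sketched in Python: each point keeps its data value but receives a small random offset in the x-direction so identical points no longer overlap. How MINITAB™ draws its offsets is not stated here, so the uniform random offset, the function name and the sample values below are assumptions for illustration.

```python
import random


def jitter_x(x_positions, amount=0.075, seed=0):
    """Shift each x position by a small random offset within +/- amount."""
    rng = random.Random(seed)  # fixed seed so the picture is repeatable
    return [x + rng.uniform(-amount, amount) for x in x_positions]


# Five identical readings in category 1 would plot as a single dot
x_before = [1, 1, 1, 1, 1]
y_values = [20, 20, 20, 20, 20]
x_after = jitter_x(x_before)

for x, y in zip(x_after, y_values):
    print(round(x, 3), y)
```

Only the plotting position changes; the data values on the vertical axis are left exactly as measured.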
166
Time Series Plot
Time Series Plots allow you to examine data over time. Depending on the shape and frequency of patterns in the plot several X’s can be found as critical…… or eliminated.
Using the MINITAB™ worksheet “Graphing Data.mtw”. A Time Series is created by following the MINITAB™ menu path “Graph> Time Series Plot> Simple...”
Time Series Plots are very useful in most projects. Every project should provide time series data to look for frequency, magnitude and patterns. What X would cause these issues?
Time Series Example
Looking at the Time Series Plot the response appears to be very dynamic.
The benefit of this approach to charting is you can see every data point as it is gathered over time. Some interesting occurrences can be revealed.
What other characteristic is present?
167

Time Series Example
Using the MINITAB™ worksheet “Graphing Data.mtw”.
Let’s look at some other Time Series Plots. Now let’s lay two Time Series on top of each other. This can be done by following the MINITAB™ menu path “Graph> Time Series Plot> Multiple...” (use variables Time 2 and Time 3).
What is happening within each plot?
What is the difference between the two plots? Time 3 appears to have a wave pattern.
Curve Fitting Time Series
Using the MINITAB™ worksheet “Graphing Data.mtw”. MINITAB™ allows you to add a smoothed line to your time series based on a smoothing technique called Lowess.
Lowess means Locally Weighted Scatterplot Smoother.
Graph> Time Series Plot> Simple…(select variable Time 3)…Data View…Smoother…Lowess
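To give a feel for what a Lowess smoother does, here is a simplified Python sketch: for each time point it fits a straight line to the nearest points, weighting close points more heavily (tricube weights), and takes the fitted value. This is a single-pass version without the robustness steps, and the smoothing fraction of 0.5 is an assumption, so the result will not match MINITAB™ exactly. The wave-shaped data are made up.

```python
import math


def lowess(y, frac=0.5):
    """Smooth equally spaced time series values with locally weighted line fits."""
    n = len(y)
    x = list(range(n))                          # time index 0, 1, 2, ...
    k = min(n, max(2, int(round(frac * n))))    # points considered in each local fit
    smoothed = []
    for i in range(n):
        distances = sorted(abs(xj - x[i]) for xj in x)
        h = distances[k - 1]                    # distance to the k-th nearest point
        weights = []
        for xj in x:
            u = abs(xj - x[i]) / h
            weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)  # tricube weight
        sw = sum(weights)
        swx = sum(w * xj for w, xj in zip(weights, x))
        swy = sum(w * yj for w, yj in zip(weights, y))
        swxx = sum(w * xj * xj for w, xj in zip(weights, x))
        swxy = sum(w * xj * yj for w, xj, yj in zip(weights, x, y))
        denom = sw * swxx - swx ** 2
        if denom == 0:                          # too few weighted points for a line
            smoothed.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            smoothed.append(intercept + slope * x[i])
    return smoothed


# Made-up series: a slow wave plus a small alternating disturbance
wave = [10 + 3 * math.sin(t / 3) + 0.5 * (-1) ** t for t in range(30)]
smooth_wave = lowess(wave)

# A perfectly straight series should come back unchanged
line = [2 * t + 1 for t in range(10)]
smooth_line = lowess(line)
print([round(v, 2) for v in smooth_wave[:5]])
```

The smoothed line follows the slow wave while damping the point-to-point disturbance, which is why a smoother helps the wave pattern in a series like Time 3 stand out.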
168
§  Explain the various statistics used to express location and spread of data
§  Describe characteristics of a Normal Distribution
§  Explain Special Cause variation
§  Use data to generate various graphs and make interpretations based on their output
You have now completed Measure Phase – Six Sigma Statistics.
Notes
169
Lean Six Sigma

Green Belt Training
Measure Phase
Now we will continue in the Measure Phase with “Measurements System Analysis”.
170
Overview
Welcome to Measure
Process Discovery
Six Sigma Statistics
Measurement System Analysis
   Basics of MSA
   Variables MSA
   Attribute MSA
Process Capability
Measurement System Analysis is one of those non-negotiable items! MSA is applicable in 98% of projects and it alone can have a massive effect on the success of your project and improvements within the company. In other words, LEARN IT & DO IT. It is very important.
Introduction to MSA
We have learned the heart and soul of Six Sigma is data.
–  How do you know the data you are using is accurate and precise?
–  How do you know if a measurement is repeatable and reproducible?
How good are these?
In order to improve your processes it is necessary to collect data on the "critical to" characteristics. When there is variation in this data it can be attributed either to the characteristic that is being measured or to the way measurements are being taken; the latter is known as measurement error. When there is a large measurement error it affects the data and may lead to inaccurate decision-making.
Measurement error is defined as the effect of all sources of measurement variability that cause an
observed value (measured value) to deviate from the true value.
171
Introduction to MSA (Cont.)

The measurement system is the complete process used to obtain measurements, such as the
procedures, gages and personnel employed to obtain measurements. Each component of this
system represents a potential source of error. It is important to identify the amount of error and, if
necessary, the sources of error. This can only be done by evaluating the measurement system with
statistical tools.
There are several types of measurement error which affect the location and the spread of the
distribution. Accuracy, linearity and stability affect location (the average). Measurement accuracy
describes the difference between the observed average and the true average based on a master
reference value for the measurements. A linearity problem describes a change in accuracy through
the expected operating range of the measuring instrument. A stability problem suggests that there is
a lack of consistency in the measurement over time. Precision is the variability in the measured
value and is quantified like all variation by using the standard deviation of the distribution of
measurements. For estimating accuracy and precision, multiple measurements of one single
characteristic must be taken.
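As a small worked illustration of the paragraph above, the Python sketch below takes repeated measurements of one characteristic and reports accuracy as the observed average minus the master reference value, and precision as the standard deviation of the measurements. The readings, the reference value and the function name are made up for illustration.

```python
from statistics import mean, stdev


def accuracy_and_precision(measurements, reference):
    """Bias (observed average minus reference) and spread of repeated measurements."""
    observed_average = mean(measurements)
    return {
        "observed average": observed_average,
        "bias": observed_average - reference,   # accuracy: location error
        "standard deviation": stdev(measurements),  # precision: spread
    }


# Five repeated readings of a master part whose reference value is 10.0
readings = [10.2, 10.1, 10.3, 10.2, 10.2]
result = accuracy_and_precision(readings, reference=10.0)
for name, value in result.items():
    print(name, round(value, 4))
```

Here the readings cluster tightly (small standard deviation) but sit about 0.2 above the reference, so this made-up gage would be precise but not accurate.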
The primary contributors to measurement system error are repeatability and reproducibility.
Repeatability is the variation in measurements obtained by one individual measuring the same
characteristic on the same item with the same measuring instrument. Reproducibility refers to the
variation in the average of measurements of an identical characteristic taken by different individuals
using the same instrument.
Given that Reproducibility and Repeatability are important types of error they are the object of a
specific study called a Gage Repeatability & Reproducibility study (Gage R&R). This study can be
performed on either attribute-based or variable-based measurement systems. It enables an
evaluation of the consistency in measurements among individuals after having at least two
individuals measure several parts at random on a few trials. If there are inconsistencies, then the
measurement system must be improved.

MSA is a mathematical procedure to quantify variation introduced to a process or product by the act of measuring.
[Figure: sources of measurement variation feeding the Measurement Process: Item to be Measured, Reference, Operator, Equipment, Procedure, Environment]
The item to be measured can be a physical part, document or a scenario for customer service.
Operator can refer to a person or can be different instruments measuring the same products.
Reference is a standard that is used to calibrate the equipment.
Procedure is the method used to perform the test.
Equipment is the device used to measure the product.
Environment is the surroundings where the measures are performed.
It is the entire system, NOT just calibration or how good the measurement instrument is. We must evaluate the entire environment and Measurement System Analysis gives us a way to evaluate the measurement environment mathematically.
All these sources of variation combine to yield a measurement that is different than the true value.
It is also referred to as “Gage R&R” studies where R&R is: Repeatability & Reproducibility.
172
Measurement Purpose
In order to be worth collecting measurements must provide value - that is, they must provide us with information and, ultimately, knowledge.
The question…
What do I need to know?
…must be answered before we begin to consider issues of measurements, metrics, statistics or data collection systems.
Too often organizations build complex data collection and information management systems without truly understanding how the data collected and metrics calculated actually benefit the organization.
Measurement is a process within itself. In order to measure something you must go through a series of tasks and activities in sequence. Usually there is some form of set-up, there is an instrument that makes the measurement, there is a way of recording the value and it may be done by multiple people. Even when you are making a judgment call about something there is some form of setup. You become the instrument and the result of a decision is recorded in some way, even if it is verbal or it is a set of actions that you take.
The types and sophistication of measurement vary almost infinitely. It is becoming increasingly
popular or cost effective to have computerized measurement systems. The quality of
measurements also varies significantly - with those taken by computer tending to be the best. In
some cases the quality of measurement is so bad you would be just as well off to guess at what the
outcome should be. You will be primarily concerned with the accuracy, precision and reproducibility
of measurements to determine the usability of the data.
Purpose
The purpose of MSA is to assess any error due to inaccuracy of our measurement systems.
The error can be partitioned into specific sources:
–  Precision
•  Repeatability - within an operator or piece of equipment
•  Reproducibility - operator to operator or attribute gage to attribute gage
–  Accuracy
•  Stability - accuracy over time
•  Linearity - accuracy throughout the measurement range
•  Resolution – how detailed is the information
•  Bias – Off-set from true value
–  Constant Bias
–  Variable Bias – typically seen with electronic equipment, amount of Bias changes with setting levels
The purpose of conducting an MSA is to mathematically partition sources of variation within the measurement system itself. This allows us to create an action plan to reduce the biggest contributors of measurement error.
173
Accuracy and Precision
Accurate but not precise - On average these shots are in the center of the target but there is a lot of variability
Precise but not accurate - The average is not on the center but the variability is small
Measurement systems, like all things, generate some amount of variation in the results/data they output. In measuring we are primarily concerned with three characteristics:
1. How accurate is the measurement? For a repeated measurement where is the average compared to some known standard? Think of the target as the measurement system, the known standard is the bulls eye in the center of the target. In the first example you can see the “measurements” are very dispersed, there is a lot of variability as indicated by the Histogram curve at the bottom. But on average the “measurements” are on target. When the average is on target we say the measurement is accurate. However in this example they are not very precise.
2. How precise is the measurement? For a repeated measurement how much variability exists? As
seen in the first target example the “measurements” are not very precise but on the second target
they have much less dispersion. There is less variability as seen in the Histogram curve. However
we notice the tight cluster of “measurements” are off target, they are not very accurate.
3. The third characteristic is how reproducible is the measurement from one individual to another? What is the accuracy and precision from person to person? Here you would expect each person that performs the measurement to be able to reproduce the same amount of accuracy and precision as that of any other person performing the same measurement.
Ultimately we make decisions based on data collected from measurement systems. If the measurement system does not generate accurate or precise enough data we will make decisions that generate errors, waste and cost. When solving a problem or optimizing a process we must know how good our data are and the only way to do this is to perform a Measurement System Analysis.
174
MSA Uses
MSA can be used to:
•  Compare internal inspection standards with the standards of your customer.
•  Highlight areas where calibration training is required.
•  Provide a method to evaluate inspector training effectiveness as well as serve as an excellent training tool.
•  Provide a great way to:
–  Compare existing measurement equipment.
–  Qualify new inspection equipment.
The measurement system always has some amount of variation and that variation is additive to the
actual amount of true variation that exists in what we are measuring. The only exception is when
the discrimination of the measurement system is so poor it virtually sees everything the same.
This means you may actually be producing a better product or service than you think you are,
providing the measurement system is accurate; meaning it does not have a bias, linearity or stability
problem. It may also mean your customer may be making the wrong interpretations about your
product or service.
The components of variation are statistically additive. The primary contributors to measurement
system error are Repeatability and Reproducibility. Repeatability is the variation in measurements
obtained by one individual measuring the same characteristic on the same item with the same
measuring instrument. Reproducibility refers to the variation in the average of measurements of an
identical characteristic taken by different people using the same instrument.
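“Statistically additive” means the variances add (assuming the sources are independent), not the standard deviations. A minimal Python sketch of that arithmetic follows. The numbers and function names are made up for illustration.

```python
import math


def combine_sd(true_sd, measurement_sd):
    """Observed standard deviation when independent variances add."""
    return math.sqrt(true_sd ** 2 + measurement_sd ** 2)


def remove_measurement_sd(observed_sd, measurement_sd):
    """Estimate the true (unit-to-unit) standard deviation from the observed one."""
    return math.sqrt(observed_sd ** 2 - measurement_sd ** 2)


def pct_variance_from_measurement(observed_sd, measurement_sd):
    """Share of the observed variance contributed by the measurement system."""
    return 100 * measurement_sd ** 2 / observed_sd ** 2


# Made-up example: true variation of 4 units, measurement variation of 3 units
observed = combine_sd(4, 3)
print(observed)                                   # what the data will show
print(remove_measurement_sd(observed, 3))         # what the process really does
print(round(pct_variance_from_measurement(observed, 3), 1))
```

In this made-up case the data show a standard deviation of 5 even though the process itself only varies by 4, which is why you may be producing a better product or service than the measurements suggest.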
Why MSA?
Measurement System Analysis is important to:
•  Study the % of variation in our process caused by our measurement system.
•  Compare measurements between operators.
•  Compare measurements between two (or more) measurement devices.
•  Provide criteria to accept new measurement systems (consider new equipment).
•  Evaluate a suspect gage.
•  Evaluate a gage before and after repair.
•  Determine true process variation.
•  Evaluate effectiveness of training program.
Why is MSA so important? MSA allows us to trust the data generated from our processes. When you charter a project you are taking on a significant burden which will require Statistical Analysis. What happens if you have a great project with lots of data from measurement systems that produce data with no integrity?
175
Appropriate Measures
Appropriate Measures are:
•  Sufficient – available to be measured regularly
•  Relevant – help to understand/isolate the problems
•  Representative - of the process across shifts and people
•  Contextual – collected with other relevant information that might explain process variability.
Sufficient means the measures are available to be measured regularly; if not, it would take too long to gather data.
Relevant means they will help to understand and isolate the problems.
Representative measures mean we can detect variation across shifts and people.
Contextual means they are gathered together with other relevant information that would help to explain sources of variation.
Wadda ya wanna measure!?!
Poor Measures
Poor Measures can result from:
•  Poor or non-existent operational definitions
•  Difficult measures
•  Poor sampling
•  Lack of understanding of the definitions
•  Inaccurate, insufficient or non-calibrated measurement devices
Measurement Error compromises decisions affecting:
•  Customers
•  Producers
•  Suppliers
It is very common while working projects to discover the current measurement systems are poor. Have you ever come across a situation where the data from your customer or supplier does not match yours? It happens often. It is likely a problem with one of the measurement systems. We have worked MSA projects across critical measurement points in various companies. It is not uncommon for more than 80% of the measurements to fail in one way or another.
176
Examples of What to Measure
Examples of what and when to measure:
•  Primary and secondary metrics
•  Decision points in Process Maps
•  Any and all gauges, measurement devices, instruments, etc.
•  X’s in the process
•  Prior to Hypothesis Testing
•  Prior to modeling
•  Prior to planning designed experiments
•  Before and after process changes
•  To qualify operators
At this point you should have a fairly good idea of what to measure… listed here are some ideas to get you thinking…
MSA is a Show Stopper!!!
Components of Variation
Whenever you measure anything the variation you observe can be segmented into the following components…
Observed Variation
   Unit-to-unit (true) Variation
   Measurement System Error
      Precision
         Repeatability
         Reproducibility
      Accuracy
         Stability
         Bias
         Linearity
All measurement systems have error. If you do not know how much of the variation you observe is contributed by measurement system error you cannot make confident decisions.
If you were one speeding ticket away from losing your license how fast would you be willing to drive on your local freeway?
We are going to strive to have the measured variation be as close as possible to the true variation. In any case we want the variation from the measurement system to be as small as possible. We are now going to investigate the various components of variation of measurements.
177
Precision
A precise metric is one that returns the same value of a given attribute every time an estimate is made.
Precise data are independent of who measures them or when the measurement is made.
Precision can be partitioned into two components:
–  Repeatability
–  Reproducibility
Repeatability and Reproducibility = Gage R+R
The spread of the data is measured by Precision. This tells us how well a measure can be repeated and reproduced.
Repeatability
Repeatability is the variation in measurements obtained with one measurement instrument used several times by one appraiser while measuring the identical characteristic on the same part.
For example:
–  Manufacturing: One person measures the purity of multiple samples of the same vial and gets different purity measures.
–  Transactional: One person evaluates a contract multiple times (over a period of time) and makes different determinations of errors.
Measurements will be different…expect it! If measurements are always exactly the same this is a flag; sometimes it is because the gauge does not have the proper resolution, meaning the scale does not go down far enough to get any variation in the measurement.
For example, would you use a football field to measure the gap in a spark plug?
178
Reproducibility
Reproducibility is the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part.
[Figure: Reproducibility, showing the distributions of Y for Operator A and Operator B]
For example:
–  Manufacturing: Different people perform purity test on samples from the same vial and get different results.
–  Transactional: Different people evaluate the same contract and make different determinations.
Reproducibility will be present when it is possible to have more than one operator or more than one instrument measure the same part.
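The two definitions can be illustrated with a rough Python sketch using the purity example: two operators each measure samples from the same vial several times. Repeatability is summarized here as the pooled within-operator standard deviation and Reproducibility as the standard deviation of the operator averages. This is a descriptive simplification with made-up readings and illustrative function names; it is not the full Gage R&R calculation, which uses several parts and a formal variance breakdown.

```python
import math
from statistics import mean, stdev, variance


def repeatability_sd(readings_by_operator):
    """Pooled within-operator standard deviation (same item, same instrument)."""
    groups = list(readings_by_operator.values())
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return math.sqrt(numerator / denominator)


def reproducibility_sd(readings_by_operator):
    """Standard deviation of the operator averages."""
    averages = [mean(g) for g in readings_by_operator.values()]
    return stdev(averages)


# Made-up purity readings on samples from the same vial
purity = {
    "Operator A": [10.0, 10.2, 10.1],
    "Operator B": [10.4, 10.6, 10.5],
}
rpt = repeatability_sd(purity)
rpd = reproducibility_sd(purity)
print(round(rpt, 3), round(rpd, 3))
```

In this made-up case each operator repeats well (about 0.1) but the two operators disagree with each other by more than that, so the bigger contributor would be Reproducibility.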
Time Estimate Exercise
Exercise objective: Demonstrate how well you can estimate a 10 second time interval.
1.  Pair up with an associate.
2.  One person will say “start” and “stop” to indicate how long they think the 10 seconds last. Do this six times.
3.  The other person will have a watch with a second hand to actually measure the duration of the estimate. Record the value where your partner cannot see it.
4.  Switch tasks with partner and do it six times also.
5.  Record all estimates. What do you notice?
179
Accuracy
Accuracy is the difference between the observed average of the measurement and a reference value.
–  When a metric or measurement system consistently over- or underestimates the value of an attribute it is said to be inaccurate.
Accuracy can be assessed in several ways:
–  Measurement of a known standard
–  Comparison with another known measurement method
–  Prediction of a theoretical value
What happens if we do not have standards, comparisons or theories?
[Figure: measurement distribution whose Average is offset from the True value; the offset is labeled Accuracy]
Warning: on a cross-country trip do not assume your gasoline gage is gospel.
Accuracy and the average are related. Recall in the Basic Statistics module we talked about the Mean and the variance of a distribution. Think of it this way: if the Measurement System is the distribution then accuracy is the Mean and the precision is the variance.
Accuracy Against a Known Standard
In transactional processes the measurement system can consist of a database query.
–  For example, you may be interested in measuring product
returns where you will want to analyze the details of the
returns over some time period.
–  The query will provide you all the transaction details.
However, before you invest a lot of time analyzing the data you
must ensure the data has integrity.
–  The analysis should include a comparison with known
reference points.
–  For the example of product returns the transaction details
should add up to the same number that appears on financial
reports, such as the income statement.
180
Accuracy versus Precision
[Figure: four targets labeled ACCURATE, PRECISE, BOTH and NEITHER]
Accuracy relates to how close the average of the shots is to the Master or bull's-eye.
Precision relates to the spread of the shots or Variance.
Most Measurement Systems are accurate but not at all precise.
Bias
Bias is defined as the deviation of the measured value from the actual
value.
Calibration procedures can minimize and control Bias within acceptable limits. In practice Bias can never be completely eliminated because of material wear and tear!
Bias is a component of Accuracy. Constant Bias is when the measurement is off by a constant value. A scale is a perfect example; if the scale reads 3 lbs when there is no weight on it then there is a 3 lb Bias. Make sense?
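The scale example can be worked through numerically. The sketch below is only illustrative: the five readings are hypothetical, and the function name `bias_and_precision` is an assumption introduced here, not something from the course material. It separates the average offset from the reference (Bias, an accuracy issue) from the spread of the readings (a precision issue).

```python
from statistics import mean, stdev


def bias_and_precision(readings, reference):
    """Return (bias, spread) for repeated readings of one known standard."""
    bias = mean(readings) - reference   # accuracy: average offset from the reference
    spread = stdev(readings)            # precision: sample standard deviation
    return bias, spread


# Hypothetical data: an empty scale (reference value 0 lb) read five times.
readings = [3.1, 2.9, 3.0, 3.2, 2.8]
bias, spread = bias_and_precision(readings, reference=0.0)
print(f"bias = {bias:.2f} lb, spread = {spread:.2f} lb")
```

A large bias with a small spread suggests a calibration problem; a small bias with a large spread points toward Repeatability or Reproducibility instead.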
181
Stability
Stability just looks for changes in the accuracy or Bias over time.
Stability of a gage is defined as error (measured in terms of Standard Deviation) as a function of time. Environmental conditions such as cleanliness, noise, vibration, lighting, chemical, wear and tear or other factors usually influence gage instability. Ideally gages can be maintained to give a high degree of Stability but the issue can never be eliminated… unlike Reproducibility. Gage Stability studies should be the first exercise after calibration procedures.
Control Charts are commonly used to track the Stability of a measurement system over time.
[Figure: measurement distribution shifting over time, labeled Drift]
Stability is Bias characterized as a function of time!
Linearity
Linearity is defined as the difference in Bias values throughout the measurement range in which the gauge is intended to be used. This tells you how accurate your measurements are through the expected range of the measurements. It answers the question "Does my gage have the same accuracy for all sizes of objects being measured?"
Linearity = |Slope| * Process Variation
% Linearity = |Slope| * 100
[Figure: Bias (y) plotted against Reference Value (x) at Low, Nominal and High reference values, with a fitted line]
y = a + b*x
y: Bias, x: Ref. Value
a: Intercept, b: Slope
Linearity just evaluates if any Bias is consistent throughout the measurement range of the
instrument. Many times Linearity indicates a need to replace or perform maintenance on the
measurement equipment.
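As a rough illustration of the Linearity formulas, the sketch below fits y = a + b*x by ordinary least squares and then applies Linearity = |Slope| * Process Variation and % Linearity = |Slope| * 100. The reference values, Bias values and process variation are hypothetical numbers chosen for the example, and `fit_line` is a helper name assumed here.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx             # slope
    a = y_bar - b * x_bar     # intercept
    return a, b


# Hypothetical study: average Bias observed at five reference values.
reference = [2.0, 4.0, 6.0, 8.0, 10.0]
bias = [0.4, 0.2, 0.0, -0.2, -0.4]
process_variation = 6.0       # hypothetical process variation

intercept, slope = fit_line(reference, bias)
linearity = abs(slope) * process_variation
pct_linearity = abs(slope) * 100

print(f"intercept a = {intercept:.3f}, slope b = {slope:.3f}")
print(f"Linearity = {linearity:.3f}, % Linearity = {pct_linearity:.1f}")
```

A slope near zero means the Bias is about the same across the range; a steep slope means the gage reads differently for small and large parts.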
182
Types of MSA’s
MSA’s fall into two categories:
Attribute                        Variable
–  Pass/Fail                     –  Continuous scale
–  Go/No Go                      –  Discrete scale
–  Document preparation          –  Critical dimensions
–  Surface imperfections         –  Pull strength
–  Customer Service response     –  Warp
Transactional projects typically have Attribute based measurement systems.
Manufacturing projects generally use Variable studies more often but do use Attribute studies to a lesser degree.
Variable Data is always preferred over Attribute because it gives us more to work with.
Now we are going to review Variable MSA testing.
Variable MSA’s
MINITAB™ calculates a column of variance components (VarComp) that are used to calculate % Gage R&R using the ANOVA Method.
[Figure: Measured Value versus True Value]
MSA’s use a random effects model, meaning that the levels for the variance components are not fixed or assigned; they are assumed to be random.
Estimates for a Gage R&R study are obtained by calculating the variance components
for each term and for error. Repeatability, Operator and Operator*Part components are
summed to obtain a total Variability due to the measuring system.
We use variance components to assess the Variation contributed by each source of
measurement error relative to the total Variation.
183
Session Window Cheat Sheet
Contribution of Variation to the total Variation of the study.
% Contribution, based on variance components, is calculated by dividing each value in VarComp by the Total Variation then multiplying the result by 100.
Use % Study Var when you are interested in comparing the measurement system Variation to the total Variation.
% Study Var is calculated by dividing each value in Study Var by Total Variation and multiplying by 100.
Study Var is calculated as 5.15 times the Standard Deviation for each source. (5.15 is used because when data are Normally distributed, 99% of the data fall within 5.15 Standard Deviations.)
Refer to this when analyzing your Session Window output.
Session Window explanations:
When the process tolerance is entered in the system MINITAB™ calculates % Tolerance, which compares measurement system Variation to customer specification. This allows us to determine the proportion of the process tolerance that is used by the Variation in the measurement system.
Notice the calculation method explained here for Distinct Categories. Always round down to the nearest whole number.
184
Number of Distinct Categories
The number of distinct categories tells you how many separate groups of parts the system is able to distinguish.
Here is a rule of thumb for distinct categories.
–  1 Data Category: Unacceptable for estimating process parameters and indices. Only indicates whether the process is producing conforming or nonconforming parts.
–  2 - 4 Categories: Generally unacceptable for estimating process parameters and indices. Only provides coarse estimates.
–  5 or more Categories: Recommended.
AIAG Standards for Gage Acceptance
Here are the Automotive Industry Action Group’s definitions for Gage acceptance.
% Tolerance or % Study Variance    % Contribution     System is…
10% or less                        1% or less         Ideal
10% - 20%                          1% - 4%            Acceptable
20% - 30%                          5% - 9%            Marginal
30% or greater                     10% or greater     Poor
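The acceptance bands can be expressed as a small lookup. This is a sketch only: the function names are assumptions, and because the published bands share boundary values (10%, 20%) and leave small gaps (between 4% and 5%, and between 9% and 10%), the handling of those edge cases below is an assumption, not an AIAG rule.

```python
def rate_by_study_var(pct):
    """Rate a gage from % Study Variance (or % Tolerance)."""
    if pct <= 10:
        return "Ideal"
    if pct < 20:
        return "Acceptable"
    if pct < 30:
        return "Marginal"
    return "Poor"               # 30% or greater


def rate_by_contribution(pct):
    """Rate a gage from % Contribution."""
    if pct <= 1:
        return "Ideal"
    if pct < 5:                 # assumption: the 4%-5% gap is treated as Acceptable
        return "Acceptable"
    if pct < 10:                # assumption: the 9%-10% gap is treated as Marginal
        return "Marginal"
    return "Poor"               # 10% or greater


# Hypothetical study results.
print(rate_by_study_var(12.5))
print(rate_by_contribution(0.8))
```

When the two metrics disagree, it is safer to report both ratings than to pick the more favorable one.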
185
MINITAB™ Graphic Output Cheat Sheet
[Figure: Gage R&R (ANOVA) for Data. Gage name: Sample Study - Caliper. Date of study: 2-10-01. Reported by: B Wheat. Panels: Components of Variation; By Part; R Chart by Operator (UCL=0.005936, R=0.001817, LCL=0); By Operator; Xbar Chart by Operator (UCL=0.6316, Mean=0.6282, LCL=0.6248); Operator*Part Interaction]
MINITAB™ breaks down the Variation in the Measurement System into specific sources. Each cluster of bars represents a source of variation. By default each cluster will have two bars corresponding to %Contribution and %StudyVar. If you add a tolerance and/or historical sigma, bars for %Tolerance and/or %Process are added.
In a good Measurement System the largest component of Variation is Part-to-Part variation. If instead you have large amounts of Variation attributed to Gage R&R then corrective action is needed.
MINITAB™ provides an R Chart and Xbar Chart by Operator. The R chart consists of:
- The plotted points are the difference between the largest and smallest measurements on each part for each operator. If the measurements are the same the range = 0.
- The Center Line is the grand average for the process.
- The Control Limits represent the amount of variation expected for the subgroup ranges. These limits are calculated using the variation within subgroups.
If any of the points on the graph go above the upper Control Limit (UCL) that operator is having problems consistently measuring parts. The Upper Control Limit value takes into account the number of measurements by an operator on a part and the variability between parts. If the operators are measuring consistently these ranges should be small relative to the data and the points should stay in control.
186
MINITAB™ Graphic Output Cheat Sheet (cont.)
[Figure: Gage R&R (ANOVA) graph, Xbar Chart by Operator panel (UCL=0.6316, Mean=0.6282, LCL=0.6248) shown with the R Chart by Operator panel (UCL=0.005936, R=0.001817, LCL=0)]
MINITAB™ provides an R Chart and Xbar Chart by Operator. The Xbar Chart compares the part-to-part variation to repeatability. The Xbar chart consists of the following:
- The plotted points are the average measurement on each part for each operator.
- The Center Line is the overall average for all part measurements by all operators.
- The Control Limits (UCL and LCL) are based on the variability between parts and the number of measurements in each average.
Because the parts chosen for a Gage R&R study should represent the entire range of possible parts this graph should ideally show lack-of-control. Lack-of-control exists when many points are above the Upper Control Limit and/or below the Lower Control Limit.
In this case there are only a few points out of control indicating the measurement system is inadequate.
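The limits printed on the example charts appear consistent with the usual average-range control chart formulas for subgroups of two measurements (two trials per operator per part). The sketch below assumes the standard Shewhart constants for subgroup size 2 (D3 = 0, D4 = 3.267, A2 = 1.880); these constants and the function names are assumptions brought in for illustration and are not stated in the course material. The average range and grand mean are the values shown on the charts.

```python
# Standard control chart constants for subgroups of size 2 (assumed here).
D3, D4, A2 = 0.0, 3.267, 1.880


def r_chart_limits(r_bar):
    """Return (LCL, center line, UCL) for the range chart."""
    return D3 * r_bar, r_bar, D4 * r_bar


def xbar_chart_limits(grand_mean, r_bar):
    """Return (LCL, center line, UCL) for the Xbar chart."""
    half_width = A2 * r_bar
    return grand_mean - half_width, grand_mean, grand_mean + half_width


r_bar = 0.001817        # average range shown on the R Chart by Operator
grand_mean = 0.6282     # overall mean shown on the Xbar Chart by Operator

r_lcl, r_cl, r_ucl = r_chart_limits(r_bar)
x_lcl, x_cl, x_ucl = xbar_chart_limits(grand_mean, r_bar)
print(f"R chart:    LCL={r_lcl:.6f}  CL={r_cl:.6f}  UCL={r_ucl:.6f}")
print(f"Xbar chart: LCL={x_lcl:.4f}  CL={x_cl:.4f}  UCL={x_ucl:.4f}")
```

Because both sets of limits are driven by the average range, a gage with good Repeatability produces narrow Xbar limits, which is why most part averages should fall outside them.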

[Figure: Gage R&R (ANOVA) graph, Operator*Part Interaction panel highlighted]
MINITAB™ provides an interaction chart showing the average measurements taken by each operator on each part in the study, arranged by part. Each line connects the averages for a single operator. Ideally the lines will follow the same pattern and the part averages will vary enough such that differences between parts are clear.
Pattern / Means…
–  Lines are virtually identical: Operators are measuring the parts the same.
–  One line is consistently higher or lower than others: That operator is measuring the parts consistently higher or lower than the others.
–  Lines are not parallel or they cross: The operator’s ability to measure a part depends on which part is being measured (an interaction between operator and part).
187
MINITAB™ Graphic Output Cheat Sheet (cont.)
[Figure: Gage R&R (ANOVA) graph, By Operator panel highlighted]
MINITAB™ generates a “by operator” chart that helps us determine whether the measurements of variability are consistent across operators. The “by operator” graph shows all the study measurements arranged by operator. Dots represent the measurements; the circle-cross symbols represent the Means. The red line connects the average measurements for each operator. You can also assess whether the overall Variability in part measurement is the same using this graph. Is the spread in the measurements similar? Or is one operator more Variable than the others?
If the red line is… / Then…
–  Parallel to the x-axis: The operators are measuring the parts similarly.
–  Not parallel to the x-axis: The operators are measuring the parts differently.
[Figure: Gage R&R (ANOVA) graph, By Part panel highlighted]
MINITAB™ allows us to analyze all of the measurements taken in the study arranged by part. The measurements are represented by dots; the Means by the circle-cross symbol. The red line connects the average measurements for each part.
Ideally multiple measurements for each individual part have little variation (the dots for one part will be close together) and averages will vary enough so differences between parts are clear.
188
Practical Conclusions
The Variation due to the measurement system, expressed as a percent of study Variation, accounts for 92.21% of the Variation seen in the process.
By AIAG Standards this gage should not be used. By all standards the data being produced by this gage is not valid for analysis.
% Tolerance or % Study Variance    % Contribution     System is…
10% or less                        1% or less         Ideal
10% - 20%                          1% - 4%            Acceptable
20% - 30%                          5% - 9%            Marginal
30% or greater                     10% or greater     Poor
Repeatability and Reproducibility Problems
Repeatability Problems:
•  Calibrate or replace gage.
•  If only occurring with one operator, re-train.
Reproducibility Problems:
•  Measurement machines
–  Similar machines
•  Ensure all have been calibrated and the standard measurement method is being utilized.
–  Dissimilar machines
•  One machine is superior.
•  Operators
–  Training and skill level of the operators must be assessed.
–  Operators should be observed to ensure standard procedures are followed.
•  Operator/machine by part interactions
–  Understand why the operator/machine had problems measuring some parts and not others.
•  Re-measure the problem parts
•  Problem could be a result of gage linearity
•  Problem could be fixture problem
•  Problem could be poor gage design
For Repeatability Problems:
If all operators have the same Repeatability and it is too big the gage needs to be repaired or replaced. If only one operator is showing Repeatability problems (or, where there are no operators but several gages, only one gage), re-train the one operator or replace the one gage.
For Reproducibility Problems:
In the case where only machines are used and the multiple machines are all similar in design, check the calibration to ensure the standard measurement method is being used. One of the gages may be performing differently than the rest; the graphs will show which one is performing differently. It may need to go in for repair or it may simply be a setup or calibration issue. If dissimilar machines are used it typically means one machine is superior. In the case where multiple operators are working the graphs will show who will need additional training to perform at the same level as the rest. The most common operator/machine interaction errors are that someone misread a value, recorded the value incorrectly or the fixture holding the part is poor.
189
Design Types
Crossed Design
A Crossed Design is used only in non-destructive testing and assumes all the parts can be measured multiple times by either operators or multiple machines.
•  Gives the ability to separate part-to-part Variation from measurement system Variation.
•  Assesses Repeatability and Reproducibility.
•  Assesses the interaction between the operator and the part.
Nested Design
A Nested Design is used for destructive testing and also situations where it is not possible to have all operators or machines measure all the parts multiple times.
•  Destructive testing assumes all the parts within a single batch are identical enough to claim they are the same.
•  Nested designs are used to test measurement systems where it is not possible (or desirable) to send operators with parts to different locations.
•  Do not include all possible combinations of factors.
•  Uses slightly different mathematical model than the Crossed Design.
Crossed Designs are the workhorse of MSA. They are the most commonly used design in industries where it is possible to measure something more than once. Chemical and biological systems can use Crossed Designs also as long as you can assume the samples used come from a homogeneous solution and there is no reason to expect them to differ.
Nested Designs must be used for destructive testing. In a Nested Design each part is measured by
only one operator. This is due to the fact that after destructive testing the measured characteristic is
different after the measurement process than it was at the beginning. Crash testing is an example of
destructive testing.
If you need to use destructive testing you must be able to assume all parts within a single batch are
identical enough to claim they are the same part. If you are unable to make that assumption then
part-to-part variation within a batch will mask the measurement system variation.
If you can make that assumption then choosing between a Crossed or Nested Gage R&R Study for
destructive testing depends on how your measurement process is set up. If all operators measure
parts from each batch then use Gage R&R Study (Crossed). If each batch is only measured by a
single operator you must use Gage R&R Study (Nested). In fact whenever operators measure
unique parts you have a Nested Design. Your Master Black Belt can assist you with the set-up of
your design.
190
Gage R & R Study
Gage R&R Study
–  Is a set of trials conducted to assess the Repeatability and Reproducibility of the measurement system
–  Multiple people measure the same characteristic of the same set of multiple units multiple times (a crossed study)
–  Example: 10 units are measured by three people. These units are then randomized and a second measure on each unit is taken
A Blind Study is extremely desirable.
–  Best scenario: operator does not know the measurement is a part of a test
–  At minimum: operators should not know which of the test parts they are currently measuring
NO, not that kind of R&R!
A Gage R&R, like any study, requires careful planning. The common way of doing an Attribute Gage R&R consists of having at least two people measure 20 parts at random, twice each. This will enable you to determine how consistently these people evaluate a set of samples against a known standard. If there is no consistency among the people the measurement system must be improved either by defining a new measurement method, training, etc. You use an Excel spreadsheet template to record your study then to perform the calculations for the result of the study.
Variable Gage R & R Steps
Step 1: Call a team meeting to introduce the concepts of the Gage R&R
Step 2: Select parts for the study across the range of interest
–  If the intent is to evaluate the measurement system throughout the process range select parts throughout the range
–  If only a small improvement is being made to the process the range of interest is now the improvement range
Step 3: Identify the inspectors or equipment you plan to use for the analysis
–  In the case of inspectors explain the purpose of the analysis and that the inspection system is being evaluated not the people
Step 4: Calibrate the gage or gages for the study
–  Remember Linearity, Stability and Bias
Step 5: Have the first inspector measure all the samples once in random order
Step 6: Have the second inspector measure all the samples in random order
–  Continue this process until all the operators have measured all the parts one time
–  This completes the first replicate
Step 7: Repeat steps 5 and 6 for the required number of replicates
–  Ensure there is always a delay between the first and second inspection
Step 8: Enter the data into MINITAB™ to analyze your results
Step 9: Draw conclusions to make changes if necessary
The parts selected for the MSA are not random samples. We want to be sure the parts selected represent the overall spread of parts that would normally be seen in manufacturing. Do not include parts that are obviously grossly defective; they could actually skew your mathematical results and lead you to conclude that the MSA is just fine. For example, an engine manufacturer was using a pressure tester to check for leaks in engine blocks. All the usual ports were sealed with plugs and the tester was attached and pressure was applied. Obviously they were looking for pin hole leaks that would cause problems later down the line. The team performing the MSA decided to include an engine block that had a hole in the casting so large you could insert your entire fist. That was an obvious gross defect and should never have been included in the MSA. Do not be silly saying that once in a while you get a part like that and it should be tested. NO IT SHOULD NOT - you should never have received it in the first place and you have got much bigger problems to take care of before you do an MSA.
191
Gage R & R Study
Part Allocation From Any Population
10 x 3 x 2 Crossed Design is shown
A minimum of two measurements/part/operator is required. Three is better!
[Figure: Parts 1 through 10, each measured by Operator 1, Operator 2 and Operator 3 in Trial 1 and Trial 2]
This is the most commonly used Crossed Design. 10 parts are each measured by 3 different operators 2 different times. To get the total number of data points in the study simply multiply these numbers together. In this study we have 60 measurements.
Data Collection Sheet
Create a data collection sheet for:
–  10 parts
–  3 operators
–  2 trials
The next few slides show how to create a data collection table in MINITAB™. You can use
Excel also.
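If neither tool is at hand, the same sheet can be laid out with a short script. The sketch below is an illustration under assumptions: the function name, the column names and the choice to give every operator a fresh random part order in each trial are not taken from the course material.

```python
import random


def build_collection_sheet(n_parts=10, n_operators=3, n_trials=2, seed=None):
    """Return one row per measurement for a crossed Gage R&R study."""
    rng = random.Random(seed)
    rows = []
    for trial in range(1, n_trials + 1):
        for operator in range(1, n_operators + 1):
            parts = list(range(1, n_parts + 1))
            rng.shuffle(parts)  # fresh random order for each operator in each trial
            for run_order, part in enumerate(parts, start=1):
                rows.append({
                    "trial": trial,
                    "operator": operator,
                    "run_order": run_order,
                    "part": part,
                    "response": None,  # filled in as measurements are taken
                })
    return rows


sheet = build_collection_sheet(seed=1)
print(len(sheet), "rows")  # 10 parts x 3 operators x 2 trials
for row in sheet[:3]:
    print(row)
```

Every operator still measures every part in every trial, so the design stays crossed; only the order of presentation is randomized.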
192

Here is the
completed table.
The trial column will
not be used for the
analysis and can
actually be deleted.
Open the file Gageaiag2.MTW to view the worksheet.
Variables:
–  Part
–  Operator
–  Response
193
Gage R & R
Use the MINITAB™ menu path “Stat>Quality Tools>Gage Study>Gage R&R Study (Crossed)…”. Within the dialog box Gage R&R Study (Crossed), the “Options…” button shown in the dialog box here allows you to calculate variation as a percent of study variation, process tolerance or a historical Standard Deviation.
Use 1.0 for the tolerance.
In this example a Tolerance Range of 1 was used.
Graphical Output
[Figure callouts: Part to Part Variation needs to be larger than Gage Variation; Operator Error]
Looking at the “Components of Variation” chart, the Part to Part Variation needs to be larger than Gage Variation. If the “Gage R&R” bars are larger than the “Part-to-Part” bars then most of your measurement Variation is in the measuring tool, i.e. maybe the gage needs to be replaced.
The same concept applies to the “Response by Operator” chart. If there is extreme Variation within operators then the training of the operators is suspect.
194
Session Window
The Session Window output from Gage R & R has many values. The ANOVA table values are utilized to calculate % Contribution and Standard Deviation. To calculate % study variation and % tolerance, you will need to know values for the Standard Deviation and tolerance ranges. MINITAB™ defaults to a value of 6 (the number of Standard Deviations within which about 99.7% of your values should fall). Tolerance ranges are based on process tolerance and are business values specific to each process.
Two-Way ANOVA Table With Interaction
Source             DF       SS         MS         F        P
Part                9    1.89586    0.210651    193.752   0.000
Operator            2    0.00706    0.003532      3.248   0.062
Part * Operator    18    0.01957    0.001087      1.431   0.188
Repeatability      30    0.02280    0.000760
Total              59    1.94529
Gage R&R
                                  %Contribution
Source               VarComp      (of VarComp)
Total Gage R&R      0.0010458          2.91
  Repeatability     0.0007600          2.11
  Reproducibility   0.0002858          0.79
    Operator        0.0001222          0.34
    Operator*Part   0.0001636          0.45
Part-To-Part        0.0349273         97.09
Total Variation     0.0359731        100.00
Number of Distinct Categories = 8
I can see clearly now!
If the Variation due to Gage R&R is high consider:
•  Procedures revision?
•  Gage update?
•  Operator issue?
•  Tolerance validation?
•  20 % < % Tol GRR < 30 %: Gage Unacceptable
•  10 % < % Tol GRR < 20 %: Gage Acceptable
•  1 % < % Tol GRR < 10 %: Gage Preferable
                                  Study Var   %Study Var   %Tolerance
Source              StdDev (SD)   (6 * SD)      (%SV)      (SV/Toler)
Total Gage R&R       0.032339      0.19404      17.05        19.40
  Repeatability      0.027568      0.16541      14.54        16.54
  Reproducibility    0.016907      0.10144       8.91        10.14
    Operator         0.011055      0.06633       5.83         6.63
    Operator*Part    0.012791      0.07675       6.74         7.67
Part-To-Part         0.186889      1.12133      98.54       112.13
Total Variation      0.189666      1.13800     100.00       113.80
Number of Distinct Categories = 8
This output tells us that the part to part variation exceeds the allowable tolerance. This gage is
acceptable.
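The Session Window figures can be reproduced from the mean squares in the ANOVA table. The sketch below uses the standard expected-mean-square relationships for a two-way crossed random effects model and the common rule for distinct categories (1.41 times the Part-To-Part standard deviation divided by the Gage R&R standard deviation, rounded down). Those formulas are not spelled out in the text above, so treat them, and the function name, as assumptions; the input numbers are the ones from the example output.

```python
import math


def gage_rr_from_anova(ms_part, ms_operator, ms_interaction, ms_repeat,
                       n_parts, n_operators, n_trials):
    """Variance components for a crossed Gage R&R study (with interaction)."""
    repeat = ms_repeat
    # Negative estimates are set to zero, a common convention.
    interaction = max((ms_interaction - ms_repeat) / n_trials, 0.0)
    operator = max((ms_operator - ms_interaction) / (n_parts * n_trials), 0.0)
    part = max((ms_part - ms_interaction) / (n_operators * n_trials), 0.0)
    gage_rr = repeat + operator + interaction
    return {
        "Total Gage R&R": gage_rr,
        "Repeatability": repeat,
        "Reproducibility": operator + interaction,
        "Operator": operator,
        "Operator*Part": interaction,
        "Part-To-Part": part,
        "Total Variation": gage_rr + part,
    }


# Mean squares from the example: 10 parts, 3 operators, 2 trials.
vc = gage_rr_from_anova(0.210651, 0.003532, 0.001087, 0.000760, 10, 3, 2)
total = vc["Total Variation"]
tolerance = 1.0  # tolerance range used in the example

for source, var in vc.items():
    sd = math.sqrt(var)
    print(f"{source:16s} VarComp={var:.7f}  %Contribution={100 * var / total:6.2f}  "
          f"%StudyVar={100 * sd / math.sqrt(total):6.2f}  "
          f"%Tolerance={100 * 6 * sd / tolerance:6.2f}")

ndc = math.floor(1.41 * math.sqrt(vc["Part-To-Part"] / vc["Total Gage R&R"]))
print("Number of Distinct Categories =", ndc)
```

Note that the variance components add up to the total, but the standard deviations (and therefore the % Study Var column) do not.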
195
Signal Averaging
Signal Averaging can be used to reduce Repeatability error when a better gage is not available.
–  Uses average of repeat measurements.
–  Uses Central Limit Theorem to estimate how many repeat measures are necessary.
Signal Averaging is a method to reduce Repeatability error in a poor gage when a better gage is not available or when a better gage is not possible.
Signal Averaging Example
Suppose SV/Tolerance is 35%.
SV/Tolerance must be 15% or less to use gage.
Suppose the Standard Deviation for one part measured by one person
many times is 9.5.
Determine what the new reduced Standard Deviation should be.
Here we have a problem with Repeatability, not Reproducibility, so we calculate what the Standard
Deviation should be in order to meet our desire of a 15% gage.
The 35% represents the biggest problem, Repeatability.
We are assuming 15% will be acceptable for the short term until an appropriate fix can be
implemented. The 9.5 represents our estimate for Standard Deviation of population of Repeatability.
196
Signal Averaging Example (cont.)
Determine sample size:
Using the average of 6 repeated measures will reduce the Repeatability component of measurement error to the desired 15% level.
This method should be considered temporary!
We now use it in the Central Limit Theorem equation to estimate the needed number of repeated measures. To do this we will use the Standard Deviation estimated previously.
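The sample size of 6 can be checked with a few lines of arithmetic. The sketch below assumes that SV/Tolerance scales in direct proportion to the Repeatability Standard Deviation, and uses the Central Limit Theorem result that the Standard Deviation of an average of n readings is the single-reading Standard Deviation divided by the square root of n. The function name is hypothetical.

```python
import math


def repeats_needed(current_pct, target_pct, sd_single):
    """Return (target SD, number of repeat measures to average)."""
    # Assumption: %SV/Tolerance is proportional to the repeatability SD.
    sd_target = sd_single * target_pct / current_pct
    # Central Limit Theorem: sd_of_average = sd_single / sqrt(n)
    n = math.ceil((sd_single / sd_target) ** 2)
    return sd_target, n


sd_target, n = repeats_needed(current_pct=35.0, target_pct=15.0, sd_single=9.5)
print(f"target SD = {sd_target:.2f}, repeat measures needed = {n}")
```

Rounding up is deliberate: five repeats would leave the averaged Standard Deviation slightly above the target.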
Paper Cutting Exercise
Exercise objective: Perform and Analyze a variable MSA Study.
1. Cut a piece of paper into 12 different lengths all fairly close to one another but not too uniform. Label the back of the piece of paper to designate its part number.
2. Perform a variable Gage R&R study as outlined in this module. Use the following guidelines:
–  Number of parts: 12
–  Number of inspectors: 3
–  Number of trials: 5
3. Create a MINITAB™ data sheet to enter the data into as each inspector performs a length measurement. If possible assign one person to data collection.
4. Analyze the results and discuss with your mentor.
197
Attribute MSA
A methodology used to assess Attribute Measurement Systems.
[Figure: Attribute Gage Error broken down into Repeatability, Reproducibility and Calibration]
–  They are used in situations where a continuous measure cannot be obtained.
–  It requires a minimum of 5 times as many samples as a continuous study.
–  Disagreements should be used to clarify operational definitions for the categories.
•  Attribute data are usually the result of human judgment (which category does this item belong in).
•  When categorizing items (good/bad; type of call; reason for leaving) you need a high degree of agreement on which way an item should be categorized.
The Discrete Measurement Study is a set of trials conducted to assess the ability of operators to use an operational definition or categorize samples. An Attribute MSA has:
1. Multiple operators measure (categorize) multiple samples a multiple number of times. For example: 3 operators each categorize the same 50 samples, then repeat the measures at least once.
2. The test should be blind. It is difficult to run this without the operator knowing it is a calibration test, but the samples should be randomized and their true categorization unknown to each operator.
The test is analyzed based on correct (vs. incorrect) answers to determine the goodness of the measuring system.
Attribute MSA Purpose
The purpose of an Attribute MSA is:
–  To determine if all inspectors use the same criteria to determine pass from fail.
–  To assess your inspection standards against your customer’s requirements.
–  To determine how well inspectors are conforming to themselves.
–  To identify how inspectors are conforming to a known master that includes:
•  How often operators ship defective product
•  How often operators dispose of acceptable product
–  Discover areas where:
•  Training is required
•  Procedures must be developed
•  Standards are not available
An Attribute MSA is similar in many ways to the continuous MSA, including the purposes. Do you have any visual inspections in your
processes? In your experience how effective have they been?
When a Continuous MSA is not possible an Attribute MSA can be performed to evaluate the quality
of the data being reported from the process.
198

Visual Inspection Test
Take 60 seconds and count the number of times “F” appears in this paragraph.
The Necessity of Training Farm Hands for First Class

Farms in the Fatherly Handling of Farm Live Stock is
Foremost in the Eyes of Farm Owners. Since the
Forefathers of the Farm Owners Trained the Farm Hands
for First Class Farms in the Fatherly Handling of Farm
Live Stock, the Farm Owners Feel they should carry on
with the Family Tradition of Training Farm Hands of First
Class Farmers in the Fatherly Handling of Farm Live
Stock Because they Believe it is the Basis of Good
Fundamental Farm Management.
Tally the answers. Did everyone get the same answer? Did anyone get 36? That’s the right answer!
Why not? Does everyone know what an “F” (defect) looks like? Was the lighting good in the room? Was it quiet so you could concentrate? Was the writing clear? Was 60 seconds long enough?
This is the nature of visual inspections! How many places in your process do you have visual inspection? How good do you expect them to be?
How can we Improve Visual Inspection?
Visual Inspection can be improved by:

•  Operator Training & Certification
•  Develop Visual Aids/Boundary Samples
•  Establish Standards
•  Establish Set-Up Procedures
•  Establish Evaluation Procedures
–  Evaluation of the same location on each part.
–  Each evaluation performed under the same lighting.
–  Ensure all evaluations are made with the same standard.
Look closely now!
Attribute Agreement Analysis
Attribute Column: responses by the operators in the study.

Samples: I.D. for the individual pieces.
Appraisers: name or I.D. for each operator in the study.
Stat > Quality Tools > Attribute Agreement Analysis…
If there is a known true answer the column containing that answer goes into the “Known
standard/attribute” field.
This graph shows how each appraiser compared to the right answer, accuracy. The blue dot is the
actual percentage for each operator. The red line with the X on each end is the confidence
interval. Duncan agreed with the standard 53% of the time. We are 95% confident based on this
study that Duncan will agree with the standard between 27% and 79% of the time. To decrease
the interval, add more parts to the study.
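The confidence interval quoted for Duncan can be reproduced outside MINITAB™. The sketch below assumes the interval is the exact (Clopper-Pearson) binomial interval, which is consistent with the numbers shown here; the function names are illustrative and are not part of MINITAB™.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iterations=100):
    """Bisection on [0, 1] for an increasing function f, solving f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_agreement_ci(matched, inspected, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for the agreement proportion."""
    if matched == 0:
        lower = 0.0
    else:
        # P(X >= matched) rises with p; the lower limit is where it equals alpha/2.
        lower = solve_increasing(
            lambda p: 1 - binom_cdf(matched - 1, inspected, p), alpha / 2)
    if matched == inspected:
        upper = 1.0
    else:
        # P(X > matched) rises with p; the upper limit is where P(X <= matched) = alpha/2.
        upper = solve_increasing(
            lambda p: 1 - binom_cdf(matched, inspected, p), 1 - alpha / 2)
    return lower, upper


low, high = exact_agreement_ci(8, 15)  # Duncan: 8 of 15 matched
print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

Rerunning it with a larger number of parts and the same percent matched shows the interval narrowing, which is the point made above. For an appraiser who matched 15 of 15, the lower limit of 81.90 reported by MINITAB™ equals 0.05^(1/15), a one-sided bound, so this two-sided sketch will not reproduce that row.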
Attribute Agreement Analysis (cont.)
This part of the Session Window, Each Appraiser versus Standard, is the same information shown in
the previous graph. NOTE: Left off from this analysis for now will be the Kappa Statistics, which
will be discussed later.
Attribute Agreement Analysis for Rating
Each Appraiser versus Standard
Assessment Agreement
Appraiser    # Inspected  # Matched  Percent  95 % CI
Duncan                15          8    53.33  (26.59, 78.73)
Hayes                 15         13    86.67  (59.54, 98.34)
Holmes                15         15   100.00  (81.90, 100.00)
Montgomery            15         15   100.00  (81.90, 100.00)
Simpson               15         14    93.33  (68.05, 99.83)
# Matched: Appraiser's assessment across trials agrees with the known standard.
Between Appraisers
Assessment Agreement
# Inspected # Matched Percent 95 % CI

15 6 40.00 (16.34, 67.71)
# Matched: All appraisers' assessments agree with each other.
All Appraisers vs Standard
Assessment Agreement
# Inspected # Matched Percent 95 % CI

15 6 40.00 (16.34, 67.71)
# Matched: All appraisers' assessments agree with the known standard.
This information can be used to determine what corrective actions, if any, need to take place. The
“all appraisers versus the standard” should be above 75% for the assessment to be considered
acceptable. The information contained in this Session Window can then be used to help decide on
corrective actions; i.e., if the operators agree with themselves but not each other or the standard
then perhaps training in the standard is in order. If some of the operators do not agree with the
standard but others do then perhaps only some training is required. BE CAREFUL – if you have
chosen someone to be the standard and they are wrong it will make it look as though everyone
else is wrong!
Kappa Statistics
Fleiss' Kappa Statistics
Appraiser  Response  Kappa    SE Kappa  Z        P(vs > 0)
Duncan     -2        0.58333  0.258199  2.25924  0.0119
           -1        0.16667  0.258199  0.64550  0.2593
           0         0.44099  0.258199  1.70796  0.0438
           1         0.44099  0.258199  1.70796  0.0438
           2         0.42308  0.258199  1.63857  0.0507
           Overall   0.41176  0.130924  3.14508  0.0008
Simpson    -2        1.00000  0.258199  3.87298  0.0001
           -1        1.00000  0.258199  3.87298  0.0001
           0         0.81366  0.258199  3.15131  0.0008
           1         0.81366  0.258199  3.15131  0.0008
           2         1.00000  0.258199  3.87298  0.0001
           Overall   0.91597  0.130924  6.99619  0.0000
This is a slice from a much larger Session Window output.
This indicates the degree of agreement of the nominal or ordinal assessments made by multiple
appraisers when evaluating the same samples. Kappa statistics are commonly used in cross
tabulation (table) applications and in attribute agreement analysis (Attribute Gage R&R).
For example, 45 patients are examined by two doctors for a particular disease. How often will the
doctors’ diagnosis of the condition (positive or negative) agree? Another example of nominal
assessments is inspectors rating defects on TV screens. Do they consistently agree on their
classifications of bubbles, divots or dirt?
Kappa values range from -1 to +1. The higher the value of kappa the stronger the agreement.
When:
· Kappa = 1, perfect agreement exists.
· Kappa = 0, agreement is the same as would be expected by chance.
· Kappa < 0, agreement is weaker than expected by chance; this rarely happens.
Typically a kappa value of at least 0.70 is required, but kappa values close to 0.90 are preferred.
In Duncan’s case he had the option of answering with -2; -1; 0; 1; 2. His agreement with the
standard on -2 was .58333. If a value of .70 is required then Duncan needs help in his
assessment of the -2 value.
Simpson, on the other hand, was excellent with the -2 assessment but was lower with the
assessments of 0 and 1. That being said, all Simpson’s values are greater than 0.70 so Simpson did well.
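To make the kappa scale concrete, here is a minimal sketch of the two-rater (Cohen's) kappa. It is not the Fleiss' kappa MINITAB™ reports in the Session Window; it is a simpler relative chosen only to show how agreement is corrected for chance. The ratings below are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who judged the same samples.

    Undefined (division by zero) when both raters use one identical category
    for every sample, because chance agreement is then already 100%.
    """
    n = len(ratings_a)
    # Observed agreement: share of samples where the two ratings match.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's category proportions, summed.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail calls by one inspector against a known standard.
standard = ["Pass", "Pass", "Fail", "Fail"]
inspector = ["Pass", "Fail", "Pass", "Fail"]
print(cohen_kappa(inspector, standard))
```

In this made-up case the inspector matches the standard on half the samples, which is exactly what chance alone would give with a 50/50 split, so kappa comes out at 0 even though the raw agreement is 50%.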
M&M Exercise
Exercise objective: Perform and analyze an Attribute MSA Study.
•  You will need the following to complete the study:
–  A bag of M&Ms containing 50 or more pieces.
–  The attribute value for each piece.
–  Three or more inspectors.
•  Judge each M&M as pass or fail.
–  The customer has indicated they want a bright, shiny, uncracked M&M.
•  Pick 30 M&Ms out of a package.
•  Enter results into either the Excel template or MINITAB™ to draw conclusions.
•  The instructor will represent the customer for the Attribute score.
Number  Part  Attribute
1       M&M   Pass
2       M&M   Fail
3       M&M   Pass
To complete this study you will need a bag of M&Ms containing 50 or more “pieces”. The
attribute value for each piece means the “True” value for each piece.
In addition to being the facilitator of this study you will also serve as the customer so you will have
the say as to if the piece is actually a Pass or Fail piece. Determine this before the inspectors
review the pieces. You will need to construct a sheet as shown here to keep track of the “pieces”
or “parts”. Then the inspectors will individually judge each piece based on the customer
specifications of bright, shiny, uncracked M&Ms. The objective is to assess the accuracy of an
“inspection” approach to quality.
§  Understand Precision & Accuracy
§  Understand Bias, Linearity and Stability
§  Understand Repeatability & Reproducibility
§  Understand the impact of poor gage capability on product quality
§  Identify the various components of variation
§  Perform the step by step methodology in variable and attribute MSAs
You have now completed Measure Phase – Measurement System Analysis.
Lean Six Sigma

Green Belt Training
Measure Phase
Process Capability
Now we will continue in the Measure Phase with “Process Capability”.
Process Capability
Overview
Within this module we are going to go through Stability and its effect on a process as well as
how to measure the Capability of a process. We will examine the meaning of each of these and show
you how to apply them.
Welcome to Measure
Process Discovery
Measurement System Analysis
Process Capability
–  Continuous Capability
–  Concept of Stability
–  Attribute Capability
Understanding Process Capability
Process Capability:
•  The inherent ability of a process to meet the expectations of the customer without any
additional efforts*.
•  Provides insight as to whether the process has a:
–  Centering Issue (relative to specification limits)
–  Variation Issue
–  A combination of Centering and Variation
–  Inappropriate specification limits
•  Allows for a baseline metric for improvement.
*Efforts: Time, Money, Manpower, Technology and Manipulation
This is the Definition of Process Capability. We will now begin to learn how to assess it.
Capability as a Statistical Problem
Our Statistical Problem: What is the probability of our process producing a defect?
1.  Define a Practical Problem
2.  Create a Statistical Problem
3.  Correct the Statistical Problem
4.  Apply the Correction to the Practical Problem
Simply put Six Sigma always starts with a practical problem, translates it into a statistical
problem, corrects the statistical problem and then validates the practical problem.
We will re-visit this concept over and over, especially in the Analyze Phase when determining
sample size.
Capability Analysis
Capability Analysis provides a quantitative assessment of your process’s ability to meet the
requirements placed on it. Capability Analysis is traditionally used for assessing the outputs of a
process, in other words comparing the Voice of the Process to the Voice of the Customer.
However, you can use the same technique to assess the capability of the inputs going into the
process. They are after all outputs from some previous process and you have expectations,
specifications or requirements for their performance. Capability Analysis will give you a metric you
can use to describe how well it performs and you can convert this metric to a sigma score if you so
desire.
[Figure: The X’s (Inputs) feed the process function Y = f(X), which produces The Y’s (Outputs). The
output data form the Variation, the “Voice of the Process” (VOP). The Requirements, the “Voice of
the Customer” (VOC), are shown as LSL = 9.96 and USL = 10.44, with output beyond those limits marked
as Defects. Capability Analysis numerically compares the VOP to the VOC.]
Critical X(s): Any variable(s) which exerts an undue influence on the important outputs (CTQ’s) of
a process.
You will learn how the output variation width of a given process output compares with the
specification width established for that output. This ratio, the specification width divided by the
output variation width, is what is known as capability.
Since the specification is an essential part of this assessment a rigorous understanding of the
validity of the specification is vitally important; it also has to be accurate. This is why it is important
to perform a RUMBA type analysis on process inputs and outputs.
Process Output Categories

Two output behaviors determine how well we meet our customer or process output expectations. The
first is the amount of variation present in the output and the second is how well the output is
centered relative to the requirements. If the amount of variation is larger than the difference
between the upper spec limit and the lower spec limit, our product or service output will always
produce defects; it will not be capable of meeting the customer or process output requirements.
[Figure: three distributions plotted against LSL, USL and Target. “Incapable” is improved by
“Reduce spread”; “Off target” is improved by “Center process”; both lead to “Capable and on
target”.]
As you have learned variation exists in everything. There will always be variability in every process
output. You cannot eliminate it completely but you can minimize it and control it. You can tolerate
variability if the variability is relatively small compared to the requirements and the process
demonstrates long-term stability. In other words the variability is predictable and the process
performance is on target meaning the average value is near the middle value of the requirements.
The output from a process is either: capable or not capable, centered or not centered. The degree of
capability and/or centering determines the number of defects generated. If the process is not capable
you must find a way to reduce the variation.
And if it is not centered it is obvious you must find a way to shift the performance. But what do you
do if it is both incapable and not centered? It depends, but most of the time you must minimize and
get control of the variation first. This is because high variation creates high uncertainty; you
cannot be sure if your efforts to move the average are valid or not. Of course if it is just a simple
adjustment to shift the average to where you want it you would do that before addressing the variation.
Problem Solving Options – Shift the Mean
Our effort in a Six Sigma project examining a process that is performing at a level less than
desired is to Shift the Mean of performance such that all outputs are within an acceptable range.
Our ability to Shift the Mean involves finding the variables that will shift the process over to
the target. This is usually the easiest option.
[Figure: a distribution moved to the center between LSL and USL, labeled “Shift”.]
Problem Solving Options – Reduce Variation

Reducing the variation means fewer of our outputs fall further away from the target. Our objective
then is to reduce variation of the inputs to stabilize the output. Reducing Variation is typically
not so easy to accomplish and occurs often in Six Sigma projects.
Problem Solving Options – Shift Mean & Reduce Variation
Combination of shifting the Mean and reducing variation: this is the primary objective of Six Sigma
projects and occurs often.
[Figure: a distribution between LSL and USL, labeled “Shift & Reduce”.]
Problem Solving Options
Move the specification limits: obviously this implies making them wider, not narrower. Customers
usually do not go for this option but if they do… it is the easiest!
[Figure: the USL moved outward, labeled “Move Spec”.]
Capability Studies
Capability Studies:
•  Are intended to be regular, periodic, estimations of a process’s ability to meet its requirements.
•  Can be conducted on both Discrete and Continuous Data.
•  Are most meaningful when conducted on stable, predictable processes.
•  Are commonly reported as Sigma Level which is optimal (short term) performance.
•  Require a thorough understanding of the following:
–  Customer’s or business’s specification limits
–  Nature of long-term versus short-term data
–  Mean and Standard Deviation of the process
–  Assessment of the Normality of the data (Continuous Data only)
–  Procedure for determining Sigma level
A stable process is consistent with time. Time Series Plots are one way to check for stability;
Control Charts are another. Your process may not be stable at this time. One of the purposes of the
Measure Phase is to identify the many possible X’s for the defects seen, gather data and plot it to
see if there are any patterns to identify what to work on first.
When performing Capability Analysis, try to get as much data as possible, back as far in time as
possible, over a reference frame that is generally representative of your process.
Steps to Capability
#1 Select Output for Improvement
#2 Verify Customer Requirements
#3 Validate Specification Limits
#4 Collect Sample Data
#5 Determine Data Type (LT or ST)
#6 Check data for Normality
#7 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
Verifying the Specifications
Specifications must be verified before completing the Capability Analysis. It does not mean you
will be able to change them, but on occasion some internal specifications have been made much
tighter than the customer wants.
Questions to consider:
•  What is the source of the specifications?
–  Customer requirements (VOC)
–  Business requirements (target, benchmark)
–  Compliance requirements (regulations)
–  Design requirements (blueprint, system)
•  Are they current? Likely to change?
•  Are they understood and agreed upon?
–  Operational definitions
–  Deployed to the work force
Data Collection
Capability Studies should include all observations (100% sampling) for a specified period.
You must know if the data collected from process outputs is a short-term or a long-term
representation of how well the process performs. There are several reasons for this but for now we
will focus on it from the perspective of assessing the capability of the process.
Short-term data:
•  Collected across a narrow inference space.
•  Daily, weekly; for one shift, machine, operator, etc.
•  Is potentially free of Special Cause variation.
•  Often reflects the optimal performance level.
•  Typically consists of 30 – 50 data points.
Long-term data:
•  Is collected across a broader inference space.
•  Monthly, quarterly; across multiple shifts, machines, operators, etc.
•  Subject to both Common and Special Causes of variation.
•  More representative of process performance over a period of time.
•  Typically consists of at least 100 – 200 data points.
[Figure: Fill Quantity for Lot 1 through Lot 5. Each lot is a short-term study; all lots together
form the long-term study.]
To help you understand short-term vs. long-term data we will start by looking at a manufacturing
example. In this scenario the manufacturer is filling bottles with a certain amount of fluid.
Assume the product is built in lots. Each lot is built using a particular vendor of the bottle, by a
particular shift and set of employees and by one of many manufacturing lines. The next lot could be
from a different vendor, employees, line, shift, etc.
Each lot is sampled as it leaves the manufacturing facility on its way to the warehouse. The results
are represented by the graphic where you see the performance data on a lot by lot basis for the
amount of fill based on the samples that were taken. Each lot has its own variability and average as
shown. The variability actually looks reasonable and we notice the average from lot to lot is varying
as well.
What the customer eventually experiences in the amount of fluid in each bottle is the value across
the full variability of all the lots. It can now be seen and stated that the long-term variability will always
be greater than the short-term variability.
Baseline Performance
Process Baseline: The average, long-term performance level of a process when all input variables
are unconstrained.
[Figure: a “road” with TARGET as the center line and the spec limits as the edges. Four short-term
performance distributions, numbered 1 to 4, sit at different positions along the road and together
make up the wider long-term baseline.]
Here is another way to look at long-term and short-term performance. The “road” appearing graphic
actually represents the target (center line) and the upper and lower spec limits. Here again you
see the representative performance in short-term snapshots which result in the larger long-term
performance.
Process Baseline is a term you will use frequently as a way to describe the output performance of a
process. Whenever you hear the word “Baseline” it implies long-term performance. To not use
long-term data to describe the Baseline Performance would be dangerous.
As an example, imagine you reported the process performance Baseline was based on distribution 3
in the graphic. If so you would mislead yourself and others that the process had excellent on target
performance. If you used distribution 2 you would be led to believe the average performance was near
the USL and most of the output of the process was above the spec limit. To resolve these potential
problems it is important to always use long-term data to report the Baseline.
How do you know if the data you have is short or long-term data? Here are some guidelines. A
somewhat technical interpretation of long-term data is that the process has had the opportunity to
experience most of the sources of variation impacting it. Remember the outputs are a function of
the inputs, so what we are saying is that most of the combinations of the inputs, each with their
full range of variation, have been experienced by the process. You may use these situations as guidelines.
Short-term data is a “snapshot” of process performance and is characterized by these types of

conditions:
One shift One line
One batch One employee
One type of service One or only a few suppliers
Long-term data is a “video” of process performance and is characterized by these types of conditions:
Many shifts Many batches
Many employees Many services and lines
Many suppliers
Long-term variation is larger than short-term variation because of: material differences, fluctuations in
temperature and humidity, different people performing the work, multiple suppliers providing materials,
equipment wear, etc.
As a general rule, short-term data consist of 20 to 30 data points over a relatively short period of
time and long-term data consist of 100 to 200 data points over an extended period of time. Do not be
misled by the volume of product or service produced as an indicator of long and short-term
performance. Data that represents the performance of a process producing 100,000 widgets a day for
that day will be short-term performance. Data that represents the performance of a process producing
20 widgets a day over a 3 month period will be long-term performance.
While we have used a manufacturing example to explain all this it is exactly the same for a service or
administrative type of process. In these types of processes there are still different people, different
shifts, different workloads, differences in the way inputs come into the process, different software,
computers, temperatures, etc. The same exact concepts and rules apply.
You should now appreciate why when we report process performance we need to know what the data
is representative of. Using such data we will now demonstrate how to calculate process capability and
then we will show how it is used.
Even stable processes will drift and shift over time by as much as 1.5 Standard Deviations on the
average.
[Figure: Long-term: Overall Variation. Short-term: Between Group Variation. Short-term: Within
Group Variation.]
There are many ways to look at the difference between short-term and long-term data. First keep in
mind you never have purely short-term or purely long-term data. It is always something in between.
Short-term data basically represent your “entitlement” situation: you are controlling all the
controllable sources of variation.
Long-term data includes (in theory) all the variation one can expect to see in the process. Usually
what we have is something in between. It is a judgment call to decide which type of data you have; it
varies depending on what you are trying to do with it and what you want to learn from it.
In general one or more months of data are probably more long-term than short-term; two weeks or
less is probably more like short-term data.
Sum of the Squares Formulas
These are the equations describing the sum of squares which are the basis for the calculations
used in capability.
SS total = SS between + SS within
No, you do not need to memorize them or even really understand them. They are built into MINITAB™
for the processing of data.
[Figure: Output Y plotted against Time in subgroups. The spread within a subgroup is labeled
“Precision (short-term capability)” and the movement between subgroups is labeled “Shift”.]
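For readers who do want to see the decomposition at work, the following sketch checks SS total = SS between + SS within on a small set of subgroups. The lot values are hypothetical and the function name is illustrative; MINITAB™ performs the equivalent arithmetic internally.

```python
def sum_of_squares(subgroups):
    """Return (SS total, SS between, SS within) for a list of subgroups."""
    all_values = [x for group in subgroups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Total: every point measured against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    ss_between = 0.0
    ss_within = 0.0
    for group in subgroups:
        group_mean = sum(group) / len(group)
        # Within: points against their own subgroup mean (short-term spread).
        ss_within += sum((x - group_mean) ** 2 for x in group)
        # Between: subgroup mean against the grand mean (shift over time).
        ss_between += len(group) * (group_mean - grand_mean) ** 2
    return ss_total, ss_between, ss_within


# Hypothetical fill quantities for three lots of three bottles each.
lots = [[10.1, 10.2, 10.0], [10.4, 10.3, 10.5], [9.9, 10.0, 9.8]]
ss_total, ss_between, ss_within = sum_of_squares(lots)
print(ss_total, ss_between, ss_within)
```

In this made-up data most of the total comes from the between-lot term, which is the situation the lot-by-lot fill example describes: each lot is tight, but the lot averages move.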
Stability
A Stable Process is consistent over time. Time Series Plots and Control Charts are the typical
graphs used to determine Stability.
[Figure: Time Series Plot of PC Data; y-axis PC Data (30 to 70), x-axis Index (1 to 480).]
Stability is established by plotting data in a Time Series Plot or in a Control Chart. If the data
used in the Control Chart goes out of control the data is not stable.
At this point in the Measure Phase there is no reason to assume the process is stable. Performing a
Capability Study at this point effectively draws a line in the sand.
If however the process is stable, short-term data provides a more reliable estimate of true
Process Capability.
Looking at the Time Series Plot shown here where would you look to determine the entitlement of
this process? As you can see the circled region has a much tighter variation. We would consider
this the process entitlement; meaning if we could find the X’s causing the instability this is the best
the process can perform in the short term. The idea is we have done it for some time, we should be
able to do it again. This does not mean this is the best this process will ever be able to do.
Measures of Capability
Hope Cp and Pp
•  What is Possible if your process is perfectly
Centered
•  The Best your process can be
•  Process Potential (Entitlement)
Reality Cpk and Ppk

•  The Reality of your process performance
•  How the process is actually running
•  Process Capability relative to specification
limits
Capability Formulas
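\[ C_p = \frac{USL - LSL}{6s} \]
\[ C_{pk} = \min\left( \frac{USL - \bar{x}}{3s}, \; \frac{\bar{x} - LSL}{3s} \right) \]
6s – Six times the sample Standard Deviation
3s – Three times the sample Standard Deviation
\bar{x} – Sample Mean
LSL – Lower specification limit
USL – Upper specification limit
s – long-term Standard Deviation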
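As a rough illustration of how the terms above combine, the sketch below assumes the usual definitions Cp = (USL − LSL) / 6s and Cpk = the smaller of (USL − Mean) / 3s and (Mean − LSL) / 3s. The readings are hypothetical, and a single overall sample Standard Deviation is used, so the result is closer to what MINITAB™ labels Pp and Ppk than to its within-subgroup Cp and Cpk.

```python
from statistics import mean, stdev


def capability_indices(mean_value, std_dev, lsl, usl):
    """Return (potential, actual) capability from summary statistics."""
    # Potential: spec width against six Standard Deviations, ignoring centering.
    potential = (usl - lsl) / (6 * std_dev)
    # Actual: distance from the Mean to the nearer spec limit, in units of 3s.
    actual = min((usl - mean_value) / (3 * std_dev),
                 (mean_value - lsl) / (3 * std_dev))
    return potential, actual


# Hypothetical camshaft lengths in mm, with the LSL and USL used in this module.
readings = [599.2, 600.1, 599.8, 600.4, 599.5, 600.0]
potential, actual = capability_indices(mean(readings), stdev(readings), 598, 602)
print(round(potential, 2), round(actual, 2))
```

When the Mean sits exactly on the midpoint of the spec limits the two numbers are equal; any off-center Mean makes the second one smaller, which is the “Hope” versus “Reality” distinction.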
MINITAB™ Example
Open the worksheet “Camshaft.mtw”. There are two columns of data that show the length of camshafts
from two different suppliers. Check the Normality of each supplier: Stat > Basic Statistics >
Normality.
A P-value greater than .05 means the test gives no reason to reject Normality, so we treat the
data as Normal.
In order to use process capability as a predictive statistic the data must be Normal for the tool
we are using in MINITAB™.
At this point in time we are only attempting to get a Baseline number we can compare to at the end
of problem solving. We are not using it to predict a quality, we want to get a snapshot. DO NOT try
to make your process STABLE BEFORE working on it! Your process is a project because there is
something wrong with it so go figure it out, do not bother playing around with stability yet.
Create a Capability Analysis for both suppliers; assume long-term data.

Note the subgroup size for this example is 5. LSL=598 USL=602
Stat > Quality Tools > Capability Analysis (Normal)
MINITAB™ Example (cont.)
599.548 is the process Mean for Supplier 1, which falls short of the target (600), and the left tail
of the distribution falls outside the lower specification limit. From a practical standpoint what
does this mean? You will have camshafts that do not meet the lower specification of 598 mm.
Next we look at the Cp index. This tells us if we will produce units within the tolerance limits.
Supplier 1 Cp index is .66 which tells us they need to reduce the process variation and work on
centering.
Look at the PPM levels. What does this tell us?
600.06 is the process Mean for Supplier 2 and is very close to the target although both tails of the
distribution fall outside of the specification limits. The Cpk index is very similar to Supplier 1
but this implies that we need to work on reducing variation. When making a comparison between
Supplier 1 and 2 relative to Cpk vs Ppk we see Supplier 2 process is more prone to shifting over
time. That could be a risk to be concerned about.
Again, compare the PPM levels. What does this tell us? Hint: look at PPM < LSL.
So what do we do? Looking only at the Means you may claim Supplier 2 is the best. However,
Supplier 1 has greater potential as depicted by the Cp measure and it will likely be easier to move
their Mean than deal with the variation issues of Supplier 2. Therefore we will work with Supplier 1.
Generate the new capability graphs for both suppliers and compare Z values or sigma levels.
MINITAB™ has a selection to calculate Benchmark Z’s or Sigma levels along with the Cp and Pp
statistics. By selecting these the graph will display the Sigma Level of your process!
Stat>Quality Tools>Capability Analysis>Normal…>Options…Benchmark Z’s (sigma level)
The overall long-term sigma level is 1.85 for Supplier 1. You should also note that it has the
potential to be 1.99 sigma as the process stands in its current state.
The overall long-term sigma level is 1.39 for Supplier 2. You should also note it has the potential
to be 1.39 sigma as the process stands in its current state.
Example Short Term
With short-term data do one of the following:
Option 1: Enter Subgroup size: = total number of samples
Option 2: Go to Options & turn off Within subgroup analysis
Using data from Column “Bi modal” in the Minitab worksheet “GraphingData.mtw”
The default of MINITAB™ assumes long-term data. Many times you will have short-term data so
adjust MINITAB™ based on Option 1 or 2 as shown here to get a proper analysis.
For Option 1 you will enter the subgroup size as the total number of data points you have in your
short-term study.
For Option 2 you will turn off the “within subgroup analysis” found inside the options selection.
Continuous Variable Caveats

Capability indices assume Normally Distributed data. Always perform a Normality test before
assessing Capability.
Well this is one way to lie with statistics… When used as a predictive model Capability makes
assumptions about the shape of the data. When data is Non-normal, the model’s assumptions do not
work and would be inappropriate to predict.
It is actually good news to have data that looks like this because your project work will be
easy!!! Why? Clearly there is something occurring in the process that should be fairly obvious and
is causing these two very distinct distributions to occur. Take a look at each of the
distributions individually to determine what is causing this. DO NOT fuss or worry about Normality at
this point, hop out to the process and see what is going on.
Here in the Measure Phase stick with observed performance unless your data are Normal. There are
ways to deal with Non-normal data for predictive capability but we will look at that once you have
removed some of the Special Causes from the process. Remember here in the Measure Phase we
get a snapshot of what we are dealing with; at this point do not worry about predictability, we will
eventually get there.
Capability Steps
We can follow the steps for calculating capability for Continuous Data until we reach the question
about data Normality…
#1 Select Output for Improvement
#2 Verify Customer Requirements
#3 Validate Specification Limits
#4 Collect Sample Data
#5 Determine Data Type (LT or ST)
#6 Check data for Normality
#7 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
When we follow the steps in performing a capability study on Attribute Data we hit a wall at step 6.
Attribute Data is not considered Normal so we will use a different mathematical method to estimate
capability.
Attribute Capability Steps
Notice the difference when we come to step 5…
#1 Select Output for Improvement
#2 Verify Customer Requirements
#3 Validate Specification Limits
#4 Collect Sample Data
#5 Calculate DPU
#6 Find Z-Score
#7 Convert Z-Score to Cp & Cpk
Z Scores
Z Score is a measure of the distance in Standard Deviations of a sample from the Mean.
–  Given an average of 50 with a Standard Deviation of 3 what is the proportion beyond the upper
spec limit of 54?
Z = (54 − 50) / 3 = 1.33
[Figure: Normal curve marked at the Mean of 50 and the upper spec limit of 54.]
Z Table
In our case we have to look up the proportion for the Z score of 1.33. This means that
approximately 9.2% of our data falls beyond the upper spec limit of 54. If we are interested in
determining parts per million defective we would simply multiply the proportion .09176 by one
million. In this case there are 91,760 parts per million defective.
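The table lookup can be checked with a few lines of code. This sketch uses the standard Normal upper-tail area computed from the complementary error function in place of a printed Z Table; the variable names are illustrative.

```python
from math import erfc, sqrt


def upper_tail(z):
    """Area of the standard Normal distribution above z."""
    return 0.5 * erfc(z / sqrt(2))


mean_value, std_dev, usl = 50, 3, 54

z = (usl - mean_value) / std_dev   # distance to the USL in Standard Deviations
z_table = round(z, 2)              # a printed Z Table is read at two decimals
proportion = upper_tail(z_table)   # proportion beyond the upper spec limit
ppm = proportion * 1_000_000       # parts per million defective

print(z_table, round(proportion, 5), round(ppm))
```

Rounding Z to two decimals mirrors how the printed table is read. Using the unrounded value of 4/3 gives a slightly smaller proportion, about .0912.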
Attribute Capability
Attribute data is always long-term in the shifted condition since it requires so many samples to
get a good estimate with reasonable confidence.
Short-term Capability is typically reported so a shifting method will be employed to estimate
short-term Capability.
Your Data Is Short Term (ZST) and You Want to Estimate Long Term Capability (ZLT): Subtract 1.5
Your Data Is Long Term (ZLT) and You Want to Estimate Short Term Capability (ZST): Add 1.5
Sigma Level  Short-Term DPMO  Long-Term DPMO
1                   158655.3        691462.5
2                    22750.1        308537.5
3                     1350.0         66807.2
4                       31.7          6209.7
5                        0.3           232.7
6                        0.0             3.4
Stable process can shift and drift by as much as 1.5 Standard Deviations. Want the theory behind
the 1.5…Google it! It is not important.
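The DPMO table can be regenerated from the Normal distribution and the 1.5 shift. The sketch below is an illustration only; the function names are assumptions, and the last decimal may differ slightly from the printed table because of rounding.

```python
from statistics import NormalDist

SHIFT = 1.5  # the shift between short-term and long-term Z used in this module

def short_term_dpmo(sigma_level):
    """Defects per million opportunities with no shift applied."""
    return NormalDist().cdf(-sigma_level) * 1_000_000

def long_term_dpmo(sigma_level):
    """Defects per million opportunities after subtracting the 1.5 shift."""
    return NormalDist().cdf(-(sigma_level - SHIFT)) * 1_000_000

def to_short_term(z_lt):
    """Long-term data, short-term estimate wanted: add 1.5."""
    return z_lt + SHIFT

def to_long_term(z_st):
    """Short-term data, long-term estimate wanted: subtract 1.5."""
    return z_st - SHIFT

for level in range(1, 7):
    print(f"{level}  {short_term_dpmo(level):10.1f}  {long_term_dpmo(level):10.1f}")
```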
Attribute Capability (cont.)
Some people like to use sigma level (MINITAB™ reports this as “Z-bench”), others like to use Cpk, Ppk. If you are using Cpk and Ppk you can easily translate that into a Z score or sigma level by multiplying by 3.
By viewing these formulas you can see there is a relationship between them.
If we divide our Z short-term by 3 we can determine our Cpk and if we divide our Z long-term by 3 we can determine our Ppk.
Attribute Capability Example
A customer service group is interested in estimating the Capability of their Call Center.
A total of 20,000 calls came in during the month but 2,666 of them
dropped before they were answered (the caller hung up).
Results of the Call Center data set:

Samples = 20,000
Defects = 2,666
They hung up….!
We will use this example to demonstrate the capability of a customer service call group.
Attribute Capability Example (cont.)
Follow these steps to determine your process capability.
1.  Calculate DPU
2.  Look up DPU value on the Z-Table
3.  Find Z-Score
4.  Convert Z Score to Cpk, Ppk
Example:
DPU = 2,666 / 20,000 = 0.1333
Look up ZLT
ZLT = 1.11
Convert ZLT to ZST = 1.11 + 1.5 = 2.61
Cpk = ZST / 3 = 2.61 / 3 = 0.87
Ppk = ZLT / 3 = 1.11 / 3 = 0.37
Remember DPU is Defects per Unit. DPU is calculated by dividing the total number of defects by the number of units or products.
“Cpk” is an index (a simple number) which measures how close a process is running to its specification limits relative to the natural variability of the process.
A Cpk of at least 1.33 is desired and is about 4 sigma with a yield of 99.3790%.
The above Cpk of 0.87 is about 2.6 sigma short-term, with an observed yield of about 86.7%.
If you want to know how that variation will affect the ability of your process to meet customer
requirements (CTQ's) you should use Cpk.
If you just want to know how much variation the process exhibits a Ppk measurement is fine.
Remember Cpk represents the short-term capability of the process and Ppk represents the long-
term capability of the process.
With the 1.5 shift the above Ppk process capability will be worse than the Cpk short-term capability.
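The Call Center example can be checked numerically. This is a minimal sketch that assumes each call can have at most one defect (a dropped call), so DPU is treated as the proportion defective; the inverse Normal function stands in for the Z-Table lookup.

```python
from statistics import NormalDist

units = 20_000    # calls received during the month
defects = 2_666   # calls dropped before they were answered

dpu = defects / units                    # step 1: Calculate DPU
z_lt = NormalDist().inv_cdf(1 - dpu)     # steps 2 and 3: Z score whose upper tail equals DPU
z_st = z_lt + 1.5                        # attribute data is long-term, so add the 1.5 shift
cpk = z_st / 3                           # step 4: short-term index
ppk = z_lt / 3                           # step 4: long-term index

print(f"DPU={dpu:.4f}  ZLT={z_lt:.2f}  ZST={z_st:.2f}  Cpk={cpk:.2f}  Ppk={ppk:.2f}")
```

As the text notes, the long-term Ppk comes out lower than the short-term Cpk because of the 1.5 shift.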
§  Estimate capability for Continuous Data
§  Estimate capability for Attribute Data
§  Describe the impact of Non-normal Data on the analysis presented in this module for continuous capability
You have now completed Measure Phase – Process Capability.
Lean Six Sigma

Green Belt Training
Measure Phase
The Measure Phase is now complete. Get ready to apply it. This module will help you create a
plan to implement the Measure Phase for your project.
Measure Phase Overview - The Goal
The goal of the Measure Phase is to:
•  Define, explore and classify X variables using a variety of tools.

–  Detailed Process Mapping
–  Fishbone Diagrams
–  X-Y Matrixes
–  FMEA
•  Acquire a working knowledge of Basic Statistics to use as a communication tool and a basis for inference.
•  Perform Measurement Capability studies on output variables.
•  Evaluate stability of process and estimate starting point Capability.
Six Sigma Behaviors

the
•  Sharing best practices Talk!
Each player in the Six Sigma process must be A ROLE MODEL for the Lean Six Sigma culture
Measure Phase Deliverables
Listed here are the Measure Deliverables each candidate should present in a PowerPoint presentation to their mentor and project Champion.
At this point you should understand what is necessary to provide these deliverables in your presentation.
–  Team Members (Team Meeting Attendance)
–  Process Map – detailed
–  FMEA
–  X-Y Matrix
–  Basic Statistics on Y
–  MSA
–  Stability graphs
–  Capability Analysis
–  Project Plan
–  Issues and Barriers
Measure Phase - The Roadblocks
Look for the potential roadblocks and plan to address them before they become problems:
–  Team members do not have the time to collect data.
–  Data presented is the best guess by functional managers.
–  Process participants do not participate in the creation of the X-Y
Matrix, FMEA and Process Map.
It won't all be smooth sailing…
You will run into roadblocks throughout your project. Listed here are some common ones that
Belts have to deal with in the Measure Phase.
DMAIC Roadmap
[Figure: DMAIC Roadmap flow chart with rows for Champion/Process Owner, Define (including Estimate COPQ and Establish Team), Measure, Analyze, Improve and Control.]
The DMAIC Phases Roadmap is a flow chart of what goals should be reached during each phase of
DMAIC. Please take a moment to review.
Measure Phase
This map of the Measure Phase rollout is more of a guideline than a rule. The way you apply the Six Sigma problem-solving methods to a project depends on the type of project you are working with and the environment you are working in.
For example in some cases it may make sense to jump directly into Measurement System Analysis studies while you collect data to characterize other aspects of the process in parallel. In other cases it may be necessary to get a better understanding of the process first. Let common sense and data dictate your path.
–  Detailed Problem Statement Determined
–  Detailed Process Mapping
–  Identify All Process X’s Causing Problems (Fishbone, Process Map)
–  Select the Vital Few X’s Causing Problems (X-Y Matrix, FMEA)
–  Assess Measurement System
–  Repeatable & Reproducible? If N: Implement Changes to Make System Acceptable. If Y: continue.
–  Assess Stability (Statistical Control)
–  Assess Capability (Problem with Centering/Spread)
–  Estimate Process Sigma Level
–  Review Progress with Champion
–  Ready for Analyze
Measure Phase Checklist
These are questions that you should be able to answer in clear, understandable language at the end of this phase.
Measure Questions
Identify Critical X’s and potential failure modes
•  Is the “as is” Process Map created?
•  Are the decision points identified?
•  Where are the data collection points?
•  Is there an analysis of the measurement system?
•  Where did you get the data?
Identify Critical X’s and potential failure modes
•  Is there a completed X-Y Matrix?
•  Who participated in these activities?
•  Is there a completed FMEA?
•  Has the Problem Statement changed?
•  Have you identified more COPQ?
Stability Assessment
•  Is the Voice of the Process stable?
•  If not, have the Special Causes been acknowledged?
•  Can the good signals be incorporated into the process?
•  Can the bad signals be removed from the process?
•  How stable can you make the process?
Capability Assessment
•  What is the short-term and long-term Capability of the process?
•  What is the problem; one of centering, spread or some combination?
General Questions
•  Are there any issues or barriers preventing you from completing this phase?
•  Do you have adequate resources to complete the project?
Planning for Action
WHAT WHO WHEN WHY WHY NOT HOW

Identify the complexity of the process
Focus on the problem solving process
Define Characteristics of Data
Validate Financial Benefits
Balance and Focus Resources
Establish potential relationships between variables
Quantify risk of meeting critical needs of Customer,
Business and People
Predict the Risk of sustainability
Chart a plan to accomplish the desired state of the culture
What is your defect?
When does your defect occur?
How is your defect measured?
What is your project financial goal (target & time) to reach it?
What is your Primary metric?
What are your Secondary metrics?
Define the appropriate elements of waste
Over the last decade of deploying Six Sigma it has been found the parallel application of the tools
and techniques in a real project yields the maximum success for the rapid transfer of knowledge.
For maximum benefit you should apply what has been learned in the Measure Phase to a Six
Sigma project. Use this checklist to assist.
§  Have started to develop a Project Plan to complete the action items
You have now completed the Measure Phase. Congratulations!
Lean Six Sigma

Green Belt Training
Analyze Phase
Welcome to Analyze
Now that we have completed the Measure Phase we are going to jump into the Analyze Phase.
Welcome to Analyze will give you a brief look at the topics we are going to cover.
Overview
These are the deliverables for the Analyze Phase.
Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
Analyze Phase Roadmap

[Figure: Analyze Phase Roadmap flow chart with rows for Champion/Process Owner, Define (including Estimate COPQ and Establish Team), Measure, Analyze, Improve and Control.]
Analyze Phase Process Map
–  Vital Few X’s Identified
–  State Practical Theories of Vital Few X’s Impact on Problem
–  Translate Practical Theories into Scientific Hypothesis
–  Select Analysis Tools to Prove/Disprove Hypothesis
–  Collect Data
–  Perform Statistical Tests
–  State Practical Conclusion
–  Statistically Significant? (Y/N)
–  Update FMEA
–  Practically Significant? (Y/N)
–  Root Cause? (Y/N). If Y: Identify Root Cause
–  Ready for Improve and Control
This provides a process look at putting “Analyze” to work. By the time we complete this phase you
will have a thorough understanding of the various Analyze Phase concepts.
We will build upon the foundational work of the Define and Measure Phases by introducing
techniques to find root causes, then using experimentation and Lean Principles to find solutions to
process problems. Next you will learn techniques for sustaining and maintaining process
performance using control tools and finally placing your process knowledge into a high level process
management tool for controlling and monitoring process performance.
Lean Six Sigma

Green Belt Training
Analyze Phase
“X” Sifting
Now we will continue in the Analyze Phase with “X Sifting” – determining what the impact of the inputs to our process is.
Overview
The core fundamentals of this phase are Multi-Vari Analysis and Classes and Causes. We will examine the meaning of each of these and show you how to apply them.
Welcome to Analyze
X Sifting
–  Multi-Vari Analysis
–  Classes and Causes
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Multi-Vari Studies
In the Define Phase we used Process Mapping to identify all the possible X’s on the horizon. In the Measure Phase we used the X-Y Matrix, FMEA and Process Map to narrow our investigation to the probable X’s.
[Figure: funnel of X’s for Y = f(X) + e. The many X’s when we first start (the trivial many); the quantity of X’s keeps reducing as you work the project; the quantity of X’s remaining after DMAIC (the vital few).]
In the Define Phase you use tools like Process Mapping to identify all possible “X’s”. In the Measure
Phase you use tools to help refine all possible “X’s” like the X-Y Diagram and FMEA.
In the Analyze Phase we start to “dis-assemble” the data to determine what it tells us. This is the fun
part.
Multi-Vari Definition
Multi-Vari Studies are a tool that graphically displays patterns of variation. Multi-Vari Studies are used to identify possible X’s or families of variation. These families of variation can hide within a subgroup, between subgroups or over time.
The Multi-Vari Chart helps in screening factors by using graphical techniques to logically subgroup discrete X’s (Independent Variables) plotted against a continuous Y (Dependent Variable). By looking at the pattern of the graphed points conclusions are drawn about the largest family of variation.
Multi-Vari Charts can also be used to assess capability, stability and graphical relationships between X’s and Y’s.
The use of a Multi-Vari Chart is to illustrate analysis of variance data graphically.
A picture can be worth a thousand words… or numbers.
- Multi-Vari Charts are useful in visualizing two-way interactions.
Multi-Vari Charts reveal information such as:
- Effect of work shift on Y’s.
- Impact of specific machinery, or material on Y’s.
- Effect of noise factors on Y’s, etc.
At this point in DMAIC Multi-Vari Charts are intended to be used as a passive study but later in the process they can be used as a graphical representation where factors were intentionally changed.
The only caveat with using MINITAB™ to graph the data is that the data must be balanced. Each source of variation must have the same number of data points across time.
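Before charting, it can help to confirm the data are balanced. The sketch below is one possible check outside MINITAB™, not a MINITAB™ feature; the function name, column names and sample rows are hypothetical.

```python
from collections import Counter
from itertools import product

def is_balanced(rows, factors):
    """Return True when every combination of factor levels has the same number of rows."""
    levels = [{row[f] for row in rows} for f in factors]
    counts = Counter(tuple(row[f] for f in factors) for row in rows)
    # A combination that never occurs counts as 0, which makes the data unbalanced.
    per_cell = {counts[combo] for combo in product(*levels)}
    return len(per_cell) <= 1

# Hypothetical rows: 2 days x 2 die cycles x 2 cavities, one measurement each.
rows = [{"day": d, "cycle": c, "cavity": k}
        for d in ("Mon", "Wed") for c in (1, 2) for k in (1, 2)]

print(is_balanced(rows, ["day", "cycle", "cavity"]))       # complete grid
print(is_balanced(rows[:-1], ["day", "cycle", "cavity"]))  # one measurement missing
```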
Multi-Vari Example
To put Multi-Vari studies in practice follow an example of an injection molding process.
You are probably asking yourself what is Injection Molding? Well basically an injection molding
machine takes hard plastic pellets and melts them into a fluid. This fluid is then injected into a
mold or die, under pressure, to create products, such as piping and computer cases.
Method
Sampling Plans should encompass all three types of variation: Within, Between and Temporal.
1. Create Sampling Plan
2. Gather Passive Data
3. Graph Data
4. Check to see if Variation is Exposed
5. Interpret Results
[Figure: flow chart. Create Sampling Plan, Gather Passive Data, Graph Data, Is Variation Exposed? If No, return to sampling; if Yes, Interpret Results.]
Typically we start with a data collection sheet that makes sense based on our knowledge of the process. Then follow the steps.
If we see only minor variation in the sample it is time to go back to collect additional data. When your data collection represents at least 80% of the variation within the process you should have enough information to evaluate the graph.
Remember for a Multi-Vari Analysis to work the output must be continuous and the sources
of variation discrete.
Sources of Variation
Within unit, between unit and temporal are the classic causes of variation. A unit can be a single piece or a grouping of pieces depending on whether they were created at unique times.
Multi-Vari Analysis can be performed on other processes, simply identify the categorical sources of variation you are interested in.
Within Unit or Positional
–  Within piece variation related to the geometry of the part.
–  Variation across a single unit containing many individual parts; such as a wafer containing many computer processors.
–  Location in a batch process such as plating.
Between Unit or Cyclical
–  Variation among consecutive pieces.
–  Variation among groups of pieces.
–  Variation among consecutive batches.
Temporal or over time
–  Shift-to-Shift
–  Day-to-Day
–  Week-to-Week
Machine Layout & Variables

In this example there are four widgets created with each die cycle. Therefore a unit is four widgets
created at that unique time.
[Figure: injection molding machine layout with four die cavities (#1 to #4). Labeled variables: Master Injection Pressure, % Oxygen, Distance to Tank, Injection Pressure Per Cavity, Fluid Level, Die Temp, Ambient Temp, Die Release.]
An example of Within Unit Variation is measured by differences in the four widgets from a single die
cycle. For example we could measure the wall thickness for each of the four widgets.
Between Unit Variation is measured by differences from sequential die cycles. An example of
Between Unit Variation is comparing the average of wall thickness from die cycle to die cycle.
Temporal Variation is measured over some meaningful time period. For example, we would
compare the average of all the data collected in a time period say the 8 o’clock hour to the 10
o’clock hour.
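The three families of variation can be compared numerically as well as graphically. The sketch below uses made-up wall thickness values (two time periods, two die cycles, four cavities); the numbers and names are hypothetical and serve only to show how each spread is calculated.

```python
from statistics import mean

# Hypothetical wall thickness data: (time period, die cycle) -> four cavity measurements.
data = {
    ("8:00", 1): [1.00, 1.02, 1.01, 1.03],
    ("8:00", 2): [1.10, 1.12, 1.11, 1.13],
    ("10:00", 1): [1.02, 1.04, 1.03, 1.05],
    ("10:00", 2): [1.12, 1.14, 1.13, 1.15],
}

def spread(values):
    """Range of a collection of values (max minus min)."""
    values = list(values)
    return max(values) - min(values)

periods = sorted({period for period, _ in data})

# Within Unit: largest spread among the four widgets of a single die cycle.
within = max(spread(v) for v in data.values())

# Between Unit: largest spread of die cycle averages inside one time period.
between = max(
    spread(mean(v) for (p, _), v in data.items() if p == period)
    for period in periods
)

# Temporal: spread of the averages of all data in each time period.
temporal = spread(
    mean(x for (p, _), v in data.items() if p == period for x in v)
    for period in periods
)

families = {"within": within, "between": between, "temporal": temporal}
largest = max(families, key=families.get)
print(families, "largest:", largest)
```

With these made-up numbers the die cycle to die cycle (Between Unit) family is the largest.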
Sampling Plan
To continue with this example the Multi-Vari sampling plan will be to gather data for 3 die cycles on 3 different days for 4 widgets inside the mold.
If you find this initial sampling plan does not show the variation of interest it will be necessary to continue sampling or make changes to the sampling plan.
[Table: sampling plan grid. Columns: Monday, Wednesday and Friday, each with Die Cycle #1, #2 and #3. Rows: Cavity #1, Cavity #2, Cavity #3 and Cavity #4. The same grid is repeated for each encoding below.]
Within-Unit Encoding
Comparing individual data points within a die cycle is Within Unit Variation. Examples of measurement could be wall thickness, diameter or uniformity of thickness to name a few.
Between-Unit Encoding
Comparing the averages from each die cycle is called Between Unit Variation.
Temporal Encoding
Comparing the average of all the data within a day and plotting three time periods is known as Temporal Variation.
Using Multi-Vari to Narrow X’s
List potential X’s and assign them to one of the families of variation.
–  This information can be pulled from the X-Y Matrix of the Measure Phase.
If an X spans one or more families assign %’s to the supposed split.
Now let’s use the same information from the X-Y Matrix created in the Measure Phase. The following exercise will help you assign each of the variables to a family of variation. If you find yourself with a variable or X that spans more than one family then assign percentages to the split. Use your best judgment for the splits. Do not assume the true X’s causing variation have to come from the list.
Graph the data from the process in Multi-Vari form.
Identify the largest family of variation.
Establish statistical significance through the appropriate statistical testing.
Focus further effort on the X’s associated with the family of largest variation.
Remember the goal is not only to figure out what it is but also what it is not!
Data Worksheet
Now create the Multi-Vari Chart in MINITAB™.
Open the MINITAB™ Project “Analyze Data Sets.mpj” and select the worksheet “MVInjectionMold.mtw”. Take a few minutes to look through the worksheet to see the balanced structure.
After you create the graph as indicated, take a few minutes to create graphs using a different order. Always use the graph that shows the variation in the easiest manner to interpret.
Run Multi-Vari
Here is the graph that should have been generated.
Identify The Largest Family of Variation
To determine within unit variation find the unit with the greatest variation, like Unit 1 in the second time period. Notice the spread of data is 0.07.
Now let’s try to find between unit variation. Compare the averages of the units within a time period. All three time periods appear similar so looking at the first time period it appears the spread of the data is 0.18 units.
To determine temporal variation compare the averages between time periods. It appears time period 3 and 2 have a difference of 0.06.
Notice the shifting from unit to unit is not consistent but it certainly jumps up and down. The question at this point should be: Does this graph represent the problem of concern? Do I see at least 80% of the variation? Read the units off the Y axis or look in the worksheet. Notice the spread of the data is 0.22 units. If the usual spread of the data is 0.25 units this data set represents 88% of the usual variation telling us our sampling plan was sufficient to detect the problem.
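The 80% check is a simple ratio of the spread seen in the sample to the usual spread of the process. A short sketch using the figures quoted above (variable names are illustrative):

```python
observed_spread = 0.22   # spread read off the Y axis of the Multi-Vari Chart
usual_spread = 0.25      # usual spread of the process

coverage = observed_spread / usual_spread   # fraction of the usual variation captured
sufficient = coverage >= 0.80               # rule of thumb: at least 80%

print(f"Coverage = {coverage:.0%}, sampling plan sufficient: {sufficient}")
```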
Root Cause Analysis

Focus further effort on the X’s associated with the family of greatest variation.
After the analysis we now know the largest source of variation is occurring die cycle to die cycle, so we can focus our effort on those X’s we suspect have the greatest impact. In this case the pattern of variation is not consistent within the small scope of data we have gathered. Additional data may be required or this process may be ready for experimentation.
Die Cycle to Die Cycle – Something is Changing!
Call Center Example
Let’s try another example; open the MINITAB™ worksheet “CallCenter.mtw”. This example is a transactional application of the tool.
A company with two call centers wants to compare two methods of handling calls at each location at different times of the day.
One method involves a team to resolve customer issues, and the other method requires a single subject-matter expert to handle the call alone.
•  Output (Y)
–  Call Time
•  Input (X)
–  Call Center (GA, NV)
–  Time of Day (10:00, 13:00, 17:00)
–  Method (Expert, Team)
What is the largest source of variation…

§  Time?
§  Method?
§  Location?
Time
Call Center Example (cont.)

Is the largest source of variation more or less obvious? Notice the Multi-Vari graph plotted is dependent on the order in which the variable column names are entered into MINITAB™.
Method
It is not as easy to draw conclusions from this example because of the source of the data. With the injection molding process we know we are making the same parts over and over. However in this example of a call center there is no control over the nature of calls coming in so a single outlier could affect your judgment.
Location
Call Center Example (cont.)

To display individual data points click the “Options…” button. This helps to see the quantity of data
and to identify unusually long or short calls.
It is not necessary to force fit any one tool to your project. For transactional projects Multi-Vari may
be difficult to interpret purely graphically. We will re-visit this data set later when working through
Hypothesis Testing.
Multi-Vari Exercise
Exercise objective: To practice Six Sigma techniques learned to date in your teams.
1.  Open file named MVA Cell Media.MTW.
2.  Perform Capability Analysis; use the column labeled volume. There is only an upper specification limit of 500 ml.
–  Are the data Normal? _______
–  Is the process Capable? _______
3.  What is the issue that needs work in terms of Six Sigma
terminology?
–  Shift Mean? _______
–  Reduce variation? _______
–  Combination of Mean and variation? _______
–  Change specifications? _______
MVA Solution
Check for Normality…
Is that normal?
Do you recall the reason why Normality is an issue? Normality is required if you intend to use the information as a predictive tool. Early in the Six Sigma process there is no reason to assume your data will be Normal. Remember if it is not Normal it usually makes finding potential causes easier. Let’s work the problem now.
First check the data for Normality. Since the P-value is greater than 0.05 the data are considered Normal.
Another method to check Normality is…
Having a graphical summary is quite nice since it provides a picture of the data as well as the summary statistics. The graphical summary command in MINITAB™ is an alternative method to check for Normality. Notice the P-value in this window is the same as the previous.
Notice that even though the data are Normal, the distribution is quite wide. If you had a process where you were filling bottles would you not expect the process to be Normal?
MVA Solution (cont.)

Now it is time to perform the process capability. For subgroup size enter 12 since all 12 bottles are filled at the same time. Also, use 500 milliliters as the upper spec limit in order to see how bad the capability was from a manufacturer’s perspective.
Under the “Options” tab you can select the “Benchmark Z’s (sigma level)” of the process, or you can leave the default as “Capability stats”. Just for fun you can run MINITAB™ to generate the Capability Analysis using 500 as the upper spec limit then run it again as the lower spec limit and see what happens to the statistics.
Is this process in trouble? The answer is yes since the Z bench value is negative! That is very bad. To correct this problem the process has to be set in such a manner that none of the bottles are ever under-filled while trying to minimize the amount of overfill.
To answer step three of this exercise it is a combination of reducing variation and shifting the Mean. The Mean cannot be shifted however until the variation is reduced dramatically.
REDUCE VARIATION!! - then shift Mean
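A negative Z value simply means the process Mean sits on the wrong side of the spec limit. The sketch below uses hypothetical numbers (a Mean of 505 ml and a Standard Deviation of 10 ml are assumptions, not values from the worksheet) and a single upper spec limit, for which the benchmark Z reduces to the Z score against that limit.

```python
from statistics import NormalDist

def z_upper(mean, sd, usl):
    """Z score of the upper spec limit relative to the process Mean."""
    return (usl - mean) / sd

# Hypothetical fill volumes: the Mean is above the 500 ml upper spec limit.
mean_ml, sd_ml, usl_ml = 505.0, 10.0, 500.0

z = z_upper(mean_ml, sd_ml, usl_ml)       # negative because Mean > USL
fraction_above = NormalDist().cdf(-z)     # share of bottles beyond the limit

print(f"Z = {z:.2f}, fraction beyond USL = {fraction_above:.1%}")
```

Whenever Z is negative more than half of the output falls beyond the limit.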
MVA Solution (cont.)

The order in which you enter the factors will produce different graphs. The “classical” method is to use Within, Between and over-time (Temporal) order.
The graph shows the variation within a unit is consistent across all the data. The variation between
units also looks consistent across all the data. What seems to stand out is the machine may be set
up differently from first shift to second. That should be easy to fix! What is the largest source of
variation? Within Unit Variation is the largest, Temporal is the next largest (and probably easiest to
fix) and Between Unit Variation comes in last.
What is the largest source of variation?
So to fix this process your game plan should be based on the information in the Excel file and involve additional information you have about the process.
This example was based on a real process where the nasty culprit was actually the location of the in-line scale. No one wanted to believe a high price scale could be generating significant variation.
The in-line scale weighed the bottles and either sent them forward to ship or rejected them to be
topped off. The wind generated by the positive pressure in the room blew across the scale
making the weights recorded fluctuate unacceptably. The filling machine was actually quite good,
there were a few adjustments made once the variation from the scale was fixed. Once the
variation in the data was reduced, they were able to shift the Mean closer to the specification of
500 ml.
Remember the data used in the Multi-Vari Analysis must be balanced for MINITAB™ to generate the graphic properly.
The injection molding data collection sheet was created as follows:
–  3 time periods
–  4 widgets per die cycle
–  3 units per time period
for a total of 36 rows of data (3 times 4 times 3).
The data sheet is now balanced meaning there is an equal number of data points for each condition in the data table and ready for data to be entered.
If you were to label the units 1 – 9 instead of 1 – 3 per time period MINITAB™ would generate an error message and would not be able to create the graphic. Think in terms of generic units instead of being specific in labeling.
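The balanced sheet can be laid out programmatically before any data are collected. This is a sketch only; the column names are hypothetical and do not come from the MINITAB™ worksheet.

```python
from itertools import product

time_periods = [1, 2, 3]
units = [1, 2, 3]          # generic unit labels, reused in every time period
widgets = [1, 2, 3, 4]     # four widgets per die cycle

# One empty row per combination: 3 x 3 x 4 = 36 rows, ready for data entry.
sheet = [{"time": t, "unit": u, "widget": w, "y": None}
         for t, u, w in product(time_periods, units, widgets)]

# Labeling units 1 - 9 instead: each label appears in only one time period.
specific = [dict(row, unit=(row["time"] - 1) * 3 + row["unit"]) for row in sheet]
cells_needed = len(time_periods) * 9 * len(widgets)   # a crossed layout would need 108
cells_filled = len({(r["time"], r["unit"], r["widget"]) for r in specific})

print(len(sheet), cells_needed, cells_filled)
```

With specific labels only 36 of the 108 time, unit and widget combinations contain data, which is why the generic 1 – 3 labeling is needed for a balanced layout.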
Classes of Distributions
By now you are convinced Multi-Vari is a tool that helps screen X’s by visualizing three primary sources of variation. Later we will perform Hypothesis Tests based on our findings.
At this point we will review classes and causes of distributions that can also help us screen X’s to perform Hypothesis Tests.
–  Normal Distribution
–  Non-normality – 4 Primary Classifications
1.  Skewness
2.  Multiple Modes
3.  Kurtosis
4.  Granularity
The Normal (Z) Distribution
Please review the characteristics of the Gaussian curve shown here…
Characteristics of Normal Distribution (Gaussian curve) are:

–  It is considered to be the most important distribution in statistics.
–  The total area under the curve is equal to 1.
–  The distribution is mounded and symmetric; it extends indefinitely in
both directions approaching but never touching the horizontal axis.
–  All processes will exhibit a Normal curve shape if you have pure
random variation (white noise).
–  The Z distribution has a Mean of 0 and a Standard Deviation of 1.
–  The Mean divides the area in half, 50% on one side and 50% on the other side.
–  The Mean, Median and Mode are at the same data point.
[Figure: Normal curve with the horizontal axis marked from -6 to +6 Standard Deviations.]
Normal Distribution
Why do we care?
–  ONLY IF we need accurate estimates of Mean and Standard Deviation.
•  Our theoretical distribution should MOST accurately represent our sample distribution in order to make accurate inferences about our population.
This Normal Curve is NOT a plot of our observed data!!! This theoretical curve is estimated based on our data’s Mean and Standard Deviation. Many Hypothesis Tests that are available assume a Normal Distribution. If the assumption is not satisfied we cannot use them to infer anything about the future.
However just because a distribution of sample data looks Normal does not mean the variation cannot be reduced and a new Normal Distribution created.
Non-Normal Distributions
Data may follow Non-normal Distributions for a variety of reasons, or there may be multiple sources of variation causing data that would otherwise be Normal to appear not Normal.
1 Skewed
2 Kurtosis
3 Multi-Modal
4 Granularity
Skewness Classification
When a distribution is not symmetrical it is Skewed. Generally the longest tail of a Skewed distribution points in the direction of the Skew.
[Figure: two Frequency histograms, one labeled Left Skew and one labeled Right Skew.]
Potential Causes of Skewness
1-1 Natural Limits
1-2 Artificial Limits (Sorting)
1-3 Mixtures
1-4 Non-Linear Relationships
1-5 Interactions
1-6 Non-Random Patterns Across Time
1-3 Mixed Distributions
Mixed Distributions occur when data comes from multiple sources that are supposed to be the same yet are not.
Examples: Machine A and Machine B; Operator A and Operator B; Payment Method A and Payment Method B; Interviewer A and Interviewer B.
[Figure: Sample A + Sample B = Combined.]
Note both distributions that formed the combined Skewed Distribution started out as Normal Distributions.
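A small simulation shows how two Normal sources can combine into a Skewed distribution. The means, sample sizes and seed below are arbitrary assumptions chosen for illustration, and the skewness function is the simple moment coefficient, which may differ slightly from the value MINITAB™ reports.

```python
import random

def skewness(values):
    """Moment coefficient of Skewness: third central moment over variance ** 1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical sources that are "supposed to be the same" but are not.
machine_a = [rng.gauss(10, 1) for _ in range(800)]
machine_b = [rng.gauss(14, 1) for _ in range(200)]
combined = machine_a + machine_b

print(f"Machine A skewness: {skewness(machine_a):.2f}")
print(f"Machine B skewness: {skewness(machine_b):.2f}")
print(f"Combined skewness:  {skewness(combined):.2f}")
```

Each source on its own is roughly symmetric while the combined data are clearly right Skewed.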
1-4 Non-Linear Relationships
Non-Linear Relationships occur when the X and Y scales are different for a given change in X.
Even if your Input (X) is Normally Distributed about a Mean the Output (Y) may not be Normally Distributed.
[Figure: plot of Y (0 to 10) against X (0 to 100) showing the Marginal Distribution of Y and the distribution of X.]
1-5 Interactions
Interactions occur when two inputs interact with each other to have a larger impact on Y than either would by themselves.
[Figure: Interaction Plot for Process Output, Aerosol Hairspray. Y axis: Room Temperature (25 to 35). X axis: No Fire, With Fire. Lines: Spray (On) and No Spray (Off).]
If you find two inputs have a large impact on Y together but would not affect Y by themselves this is called an Interaction.
For instance if you spray an aerosol can in the direction of a flame what would happen to room temperature? What do you see regarding these distributions?
1-6 Time Relationships / Patterns
The distribution is dependent on time.
[Figure: Y (20 to 30) plotted against Time (10 to 50), with the distribution of Y]
Time relationships occur when the distribution is dependent on time. Some examples are tool wear, chemical bath depletion, stock prices, etc.
Often seen when tooling requires warming up, tool wear, chemical bath depletions, ambient temperature effect on tooling.
Non-Normal Right (Positive) Skewed
Moment coefficient of Skewness will be close to zero for symmetric distributions, negative for left Skewed and positive for right Skewed.
Find the worksheet named “Distrib1.MTW” and you will see the column named Pos Skew to chart this graphical summary in MINITAB™.
To measure Skewness we use Descriptive Statistics. When looking at a symmetrical distribution Skewness will be close to zero. If the distribution is skewed to the left it will have a negative number; if skewed to the right it will be positive.
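The sign rule above can be checked numerically. The following is a minimal Python sketch, assuming the moment coefficient of Skewness is the third central moment divided by the cube of the Standard Deviation (population form); MINITAB™ may apply a small-sample adjustment, so its reported value can differ slightly. The three data sets are made up for illustration and are not the worksheet data.

```python
from statistics import fmean

def moment_skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)  # second central moment
    m3 = fmean((x - mean) ** 3 for x in data)  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, chosen only to show the sign of the statistic.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 12]       # long tail to the right
left_skewed = [-x for x in right_skewed]    # mirror image, long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right skew", right_skewed),
                   ("left skew", left_skewed)]:
    print(f"{name:10s} skewness = {moment_skewness(data):+.3f}")
```

A value near zero points to a symmetric shape, while the sign shows the direction of the long tail.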
Kurtosis 2
Kurtosis refers to the shape of the tails.
–  Leptokurtic
–  Platykurtic
•  Different combinations of distributions cause the resulting overall shapes.
[Figure: Leptokurtic (Peaked with Long-Tails) and Platykurtic (Flat with Short-Tails)]
The next classification of Non-normal Data is Kurtosis. There are two types of Kurtosis: Leptokurtic and Platykurtic. Leptokurtic distributions are generally peaked with long tails while Platykurtic distributions are flat with short tails.
Platykurtic
Multiple Means shifting over time produce a plateau in the data.
Causes:
2-1 Mixtures: (Combined Data from Multiple Processes)
–  Multiple Set-Ups
–  Multiple Batches
–  Multiple Machines
–  Tool Wear (over time)
2-2 Sorting or Selecting:
–  Scrapping product that falls outside the spec limits
2-3 Trends or Patterns:
–  Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non-Linear Relationships
–  Chemical Systems
Negative coefficient of Kurtosis indicates Platykurtic distribution. The data set for this distribution is in the worksheet “Distrib1.MTW” and under column “Flat.”
Leptokurtic
Positive Kurtosis value indicates Leptokurtic distribution. The data set for this distribution is in the worksheet “Distrib1.MTW” and under column “LongTail.”
Distributions overlaying each other that have very different variance can cause a Leptokurtic distribution.
Causes:
2-1 Mixtures: (Combined Data from Multiple Processes)
–  Multiple Set-Ups
–  Multiple Batches
–  Multiple Machines
–  Tool Wear (over time)
2-2 Sorting or Selecting:
–  Scrapping product that falls outside the spec limits
2-3 Trends or Patterns:
–  Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non-Linear Relationships
–  Chemical Systems
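The sign of the Kurtosis coefficient can be illustrated with simulated data. The sketch below assumes the coefficient is reported as excess Kurtosis (fourth central moment divided by the squared variance, minus 3), so a Normal Distribution scores about zero. The three data sets are generated here for illustration and are not the “Flat” or “LongTail” worksheet columns; the random seed and sample size are arbitrary choices.

```python
import random
from statistics import fmean

def excess_kurtosis(data):
    """Excess kurtosis: m4 / m2**2 - 3 (about 0 for Normal data)."""
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)
    m4 = fmean((x - mean) ** 4 for x in data)
    return m4 / m2 ** 2 - 3.0

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 20000

# Normal data: the reference case.
normal = [rng.gauss(0, 1) for _ in range(n)]
# Two overlaid distributions with the same Mean but very different variance.
mixture = [rng.gauss(0, 1) if i % 2 == 0 else rng.gauss(0, 5) for i in range(n)]
# A flat (uniform) distribution with short tails.
flat = [rng.uniform(-1, 1) for _ in range(n)]

for name, data in [("normal", normal), ("mixture", mixture), ("flat", flat)]:
    print(f"{name:8s} excess kurtosis = {excess_kurtosis(data):+.2f}")
```

The mixture comes out positive (Leptokurtic) and the flat data negative (Platykurtic), which matches the classification described above.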
Multiple Modes 3
Reasons for Multiple Modes:
3-1 Mixtures of distributions (most likely)
3-2 Lack of independence – trends or patterns
3-3 Catastrophic failures (example: testing voltage on a motor and the motor shorts out so we get a zero reading)
Now that’s my kind of mode!!
Multiple Modes have such dramatic combinations of underlying sources that they show distinct
modes. They may have shown as Platykurtic but were far enough apart to see separation.
Celebrate! These are usually the easiest to identify causes.
Bimodal Distributions
2 Different Distributions
-  2 different machines
-  2 different operators
-  2 different administrators
This is an example of a Bi-Modal Distribution. Interestingly each peak is actually a Normal Distribution but when the data is viewed as a group it is obviously not Normal.
Extreme Bi-Modal (Outliers)
If you see an extreme Outlier it usually has its own cause or own source of variation. It is relatively easy to isolate the cause by looking on the X axis of the Histogram.
Bi-Modal – Multiple Outliers
Having multiple Outliers is more difficult to correct. This situation typically means multiple inputs are involved.
Granular 4
Granular data is easy to see in a Dot Plot.

–  Use Caution!
•  It looks Normal but it is only symmetric and not Continuous.
–  Causes:
•  4-1 Measurement system resolution (Gage R&R)
•  4-2 Categorical (step-type function) data
Now let’s take a moment to notice the P-value in the Normal Probability Plot; it is definitely smaller than 0.05! There simply is not enough resolution in the data.
Normal Example
Notice the contrast to the previous page!
Conclusions Regarding Distributions
Non-normal Distributions are not BAD!!!
Non-normal Distributions can give more Root Cause information than Normal data (the nature of why…)
Understanding what the data is telling us is KEY!!!
What do you want to know ???
Find the key….
Here is what to conclude regarding distributions.
§  Perform a Multi-Vari Analysis
§  Interpret a Multi-Vari Graph
§  Identify when a Multi-Vari Analysis is applicable
§  Interpret what Skewed Data looks like
§  Explain how data distributions become Non-normal when they are really Normal
You have now completed Analyze Phase – ”X” Sifting.
Lean Six Sigma

Green Belt Training
Analyze Phase
Inferential Statistics
Now we will continue in the Analyze Phase with Inferential Statistics.
Overview
The core fundamentals of this phase are Inferential Statistics, Nature of Sampling and Central Limit Theorem. We will examine the meaning of each of these and show you how to apply them.
[Roadmap: Welcome to Analyze, “X” Sifting, Inferential Statistics (Inferential Statistics, Nature of Sampling, Central Limit Theorem), Intro to Hypothesis Testing, Hypothesis Testing ND P1, Hypothesis Testing ND P2, Hypothesis Testing NND P1]
Nature of Inference
in·fer·ence (n.) The act or process of deriving logical conclusions from premises known or assumed to be true. The act of reasoning from factual knowledge or evidence. (1)
Inferential Statistics – To draw inferences about the process or population being studied by modeling patterns of data in a way that accounts for randomness and uncertainty in the observations. (2)
1. Dictionary.com
2. Wikipedia.com
Putting the pieces of the puzzle together….
One objective of Six Sigma is to move from only describing the nature of the data (Descriptive Statistics) to inferring what will happen in the future with our data (Inferential Statistics).
5 Step Approach to Inferential Statistics
1. What do you want to know?
2. What tool will give you that information?
3. What kind of data does that tool require?
4. How will you collect the data?
5. How confident are you with your data summaries?
So many
questions….?
As with most things you have learned associated with Six Sigma – there are defined steps to be
taken.
Types of Error
1. Error in sampling
–  Error due to differences among samples drawn at random from the population (luck of the draw).
–  This is the only source of error that statistics can accommodate.
2. Bias in sampling
–  Error due to lack of independence among random samples or due to systematic sampling procedures (height of horse jockeys only).
3. Error in measurement
–  Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
–  Error because the measurement does not actually measure what it is intended to measure (placing a probe in the wrong slot; measuring temperature with a thermometer that is just next to a furnace).
Types of error contribute to uncertainty when trying to infer with data. There are four types of error that are explained above.
Population, Sample, Observation
Population
–  EVERY data point that has ever been or ever will be generated from a
given characteristic.
Sample
–  A portion (or subset) of the population, either at one time or over time.
Observation
–  An individual measurement.
Let’s review a few definitions: A population is EVERY data point that has ever been or ever will be
generated from a given characteristic. A sample is a portion (or subset) of the population either at
one time or over time. An observation is an individual measurement.
Significance
Significance is all about differences…

Practical difference and significance is:
–  The amount of difference, change or improvement that will be of
practical, economic or technical value to you.
–  The amount of improvement required to pay for the cost of making
the improvement.
Statistical difference and significance is:

–  The magnitude of difference or change required to distinguish
between a true difference, change or improvement and one that
could have occurred by chance.
Twins: Sure there are differences…

but do they matter?
The Mission
[Figure: three panels showing Mean Shift, Variation Reduction, and Both]
Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate,
reduce costs, reduce investment, improve service level, improve throughput, reduce lead time,
increase productivity… change the output metric of some process, etc…
In statistical terms this translates to the need to move the process Mean and/or reduce the process Standard Deviation.
You will be making decisions about how to adjust key process input variables based on sample
data, not population data - that means you are taking some risks.
How will you know your key process output variable really changed and is not just an unlikely
sample? The Central Limit Theorem helps us understand the risk we are taking and is the basis for
using sampling to estimate population parameters.
A Distribution of Sample Means
Imagine you have some population. The individual values of this population form some distribution.
Take a sample of some of the individual values and calculate the sample Mean.
Keep taking samples and calculating sample Means.
Plot a new distribution of these sample Means.
The Central Limit Theorem says as the sample size becomes large this new distribution (the sample
Mean distribution) will form a Normal Distribution no matter what the shape of the population
distribution of individuals.
Sampling Distributions—The Foundation of Statistics
•  Samples from the population, each with five observations:

Population (values shown): 3, 5, 2, 12, 10, 1, 6, 12, 5, 6, 12, 14, 3, 6, 11, 9, 10, 10, 12

          Sample 1   Sample 2   Sample 3
              1          9          2
             12          8          3
              9          5          6
              7         14         11
              8         10         10
Mean        7.4        9.2        6.4

•  In this example we have taken three samples out of the population each with five observations in it. We computed a Mean for each sample. Note the Means are not the same!
•  Why not?
•  What would happen if we kept taking more samples?
Every statistic derives from a sampling distribution. For instance, if you were to keep taking samples from the population over and over a distribution could be formed for the calculated Means, Medians, Modes, Standard Deviations, etc. As you can see the above samples each have a different statistic. The goal here is to successfully make inferences regarding the statistical data.
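The three sample Means on the slide can be reproduced directly. This short Python sketch reads the five observations of each sample from the slide and recomputes the Mean; the variable names are illustrative only.

```python
from statistics import fmean

# The three samples of five observations shown on the slide.
samples = {
    "Sample 1": [1, 12, 9, 7, 8],
    "Sample 2": [9, 8, 5, 14, 10],
    "Sample 3": [2, 3, 6, 11, 10],
}

# Each sample gives a different estimate of the same population Mean.
means = {name: fmean(values) for name, values in samples.items()}

for name, mean in means.items():
    print(f"{name}: Mean = {mean:.1f}")
```

The spread between these three estimates is the sampling error that the rest of this module quantifies.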
Constructing Sampling Distributions
Open Minitab Worksheet “Die Example”.
To demonstrate how sampling distributions work we will create some random data for die rolls.
Create a sample of 1,000 individual rolls of a die that we will store in a variable named “Population”.
From the population we will draw five random samples.
Roll ’em!
Sampling Distributions
To draw random samples from the population follow the command shown below and repeat four
times for the other columns.
Calc> Random Data> Sample from Columns…
Sampling Error
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to the population.
Stat > Basic Statistics > Display Descriptive Statistics…
Descriptive Statistics: Population, Sample1, Sample2, Sample3, Sample4, Sample5

Variable     N     N*  Mean    SE Mean  StDev   Minimum  Q1      Median  Q3      Maximum
Population   1000  0   3.5510  0.0528   1.6692  1.0000   2.0000  4.0000  5.0000  6.0000
Sample1      5     0   3.400   0.927    2.074   1.000    1.500   3.000   5.500   6.000
Sample2      5     0   4.600   0.678    1.517   2.000    3.500   5.000   5.500   6.000
Sample3      5     0   4.200   0.663    1.483   2.000    3.000   4.000   5.500   6.000
Sample4      5     0   3.800   0.917    2.049   2.000    2.000   3.000   6.000   6.000
Sample5      5     0   3.600   0.872    1.949   1.000    2.000   3.000   5.500   6.000

Range in Mean 1.2 (4.600 – 3.400)    Range in StDev 0.591 (2.074 – 1.483)
Now compare the Mean and Standard Deviation of the samples of 5 observations to the population. What do you see?
Sampling Error
Create 5 more columns of data sampling 10 observations from the population.
Calc> Random Data> Sample from Columns…
Sampling Error - Reduced
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to the population.
Stat > Basic Statistics > Display Descriptive Statistics…

Variable   N   N*  Mean   SE Mean  StDev  Minimum  Q1     Median  Q3     Maximum
Sample6    10  0   3.600  0.653    2.066  1.000    1.750  3.500   6.000  6.000
Sample7    10  0   4.100  0.567    1.792  1.000    2.750  4.500   6.000  6.000
Sample8    10  0   3.200  0.442    1.398  1.000    2.000  3.500   4.250  5.000
Sample9    10  0   3.500  0.563    1.780  1.000    2.000  3.500   5.250  6.000
Sample10   10  0   3.300  0.616    1.947  1.000    1.750  3.000   5.250  6.000

Range in Mean 0.9 (4.100 – 3.200)    Range in StDev 0.668 (2.066 – 1.398)
With 10 observations the differences between the sample Means are now smaller.
Can you tell what is happening to the Mean and Standard Deviation? When the sample size increases the sample Means vary less from sample to sample and move closer to the population Mean. The sample Standard Deviations also tend to settle toward the population value, although in this particular draw their range (0.668) is not yet smaller than it was with 5 observations (0.591).
What do you think would happen if the sample size increased further? Let’s try 30 for a sample size.
Sampling Error - Reduced
Calc> Random Data> Sample from Columns…
Stat> Basic Statistics> Display Descriptive Statistics…

Variable    N   Mean   StDev
Sample 11   30  3.733  1.818
Sample 12   30  3.800  1.562
Sample 13   30  3.400  1.868
Sample 14   30  3.667  1.768
Sample 15   30  3.167  1.487

Range in Mean 0.63    Range in StDev 0.381
Do you notice anything different? Look how much smaller the range of the Means and Standard Deviations is. Did the sampling error get reduced?
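Five samples per sample size is a small experiment, so the ranges above can vary from one run to the next. The Python sketch below repeats the same idea many times to show the underlying trend. It is an illustration only, not the MINITAB™ procedure: it assumes sampling without replacement from a simulated population of 1,000 die rolls, and the seed, the number of repeats and the function name are arbitrary choices.

```python
import random
from statistics import fmean, stdev

rng = random.Random(2018)  # fixed seed so the run is repeatable

# Simulated stand-in for the "Population" column: 1,000 rolls of a fair die.
population = [rng.randint(1, 6) for _ in range(1000)]

def spread_of_sample_means(n, repeats=2000):
    """Standard deviation of the Means of many samples of size n."""
    means = [fmean(rng.sample(population, n)) for _ in range(repeats)]
    return stdev(means)

# Sample sizes used in the exercise above.
spread = {n: spread_of_sample_means(n) for n in (5, 10, 30)}

for n, value in spread.items():
    print(f"n = {n:2d}: spread of sample Means = {value:.3f}")
```

The spread of the sample Means shrinks as n grows, which is the reduction in sampling error the tables above suggest.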
In theory if we kept taking samples of size n = 5 and n = 10 and calculated the sample Means we could see how the sample Means are distributed.
Simulate this in MINITAB™ by creating ten columns of 1000 rolls of a die.
Calc> Random Data> Integer…
Feeling lucky…?
Now instead of looking at the effect of sample size on error we will create a sampling distribution of averages. Follow along to generate your own random data.
For each row calculate the Mean of five columns (C1-C5) and store the result in Mean5.
Calc> Row Statistics…
Repeat this command to calculate the Mean of C1-C10 and store the result in Mean10.
The commands shown above will create new columns that are now averages from the columns of
random population data. We have 1000 averages of sample size 5 and 1000 averages of sample
size 10.
Create a Histogram of C1, Mean5 and Mean10.
Graph> Histogram> Simple…
Multiple Graph…On separate graphs…Same X, including same bins
Select Same X, including same bins to facilitate comparison.
In MINITAB™ follow the above commands. The Histogram being generated makes it easy to see what happened when the sample size was increased.
Different Distributions
[Figure: Histograms of Individuals (C1) and Sample Means (Mean5, Mean10)]
What is different about the three distributions?
What happens as the number of die throws increases?
Let’s examine how the number of throws impacts our analysis results. The fundamental differences between the three Histograms are:
•  Mean5 and Mean10 behave like continuous data types.
•  C1 belongs to a discrete type of data set (no decimals in the raw data, which is obvious in the low granularity on the Histogram even though the x axis shows decimals; do not be misled).
•  When data is of continuous type, the possibilities are numerous and so the distribution tends to be closer to a Normal distribution, especially when the sample size is large. The Mean of 10 rolls, for example, can take 51 distinct values (totals from 10 to 60 divided by 10).
•  The C1 data, no matter how large the sample size is, tends to be a flat distribution with the fewest possibilities (only 6 discrete possibilities in a die throw).
Observations
As the sample size (number of die rolls) increases from 1 to 5 to 10, there are three points to note:
1.  The Center remains the same.
2.  The variation decreases.
3.  The shape of the distribution changes - it tends to become Normal.
The Mean of the sample Mean distribution: $\mu_{\bar{x}} = \mu$
The Standard Deviation of the sample Mean distribution, also known as the Standard Error: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Good news: the Mean of the sample Mean distribution is the Mean of the population.
Better news: I can reduce my uncertainty about the population Mean by increasing my sample size n.
Central Limit Theorem

If all possible random samples, each of size n, are taken from any population with a Mean µ and Standard Deviation σ the distribution of sample Means will:
–  have a Mean $\mu_{\bar{x}} = \mu$
–  have a Std Dev $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
–  be Normally Distributed when the parent population is Normally Distributed, or be approximately Normal for samples of size 30 or more when the parent population is not Normally Distributed. This improves with samples of larger size.
Everything we have gone through with sampling error and sampling distributions was leading up to the Central Limit Theorem.
Bigger is Better!
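The theorem can be illustrated with a parent population that is clearly not Normal. The Python sketch below is an illustration under stated assumptions: the parent is an exponential distribution with Mean 1 and Standard Deviation 1 (a choice made here, not taken from the course material), and the seed, repeat count and helper names are arbitrary.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(7)  # fixed seed so the run is repeatable

def skewness(data):
    """Moment coefficient of skewness (0 for a symmetric distribution)."""
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)
    m3 = fmean((x - mean) ** 3 for x in data)
    return m3 / m2 ** 1.5

def sample_means(n, repeats=5000):
    """Means of `repeats` samples of size n from an exponential parent."""
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(repeats)]

results = {}
for n in (1, 5, 30):
    means = sample_means(n)
    # (Mean of the sample Means, their Standard Deviation, their skewness)
    results[n] = (fmean(means), stdev(means), skewness(means))
    print(f"n = {n:2d}: mean = {results[n][0]:.3f}, "
          f"std dev = {results[n][1]:.3f} (theory {1 / sqrt(n):.3f}), "
          f"skewness = {results[n][2]:.2f}")
```

The Mean of the sample Means stays near the population Mean, their Standard Deviation tracks σ/√n, and the skewness moves toward zero as n grows, which is the shape becoming more Normal.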
So What?
So how does this theorem help me

understand the risk I am taking when I use
sample data instead of population data?
Recall that approximately 95% of Normally Distributed data is within ± 2 Standard Deviations from the Mean. Therefore the probability is approximately 95% my sample Mean is within 2 standard errors of the true population Mean.
A Practical Example
Let’s say your project is to reduce the setup time for

a large casting:
–  Based on a sample of 20 setups you learn your baseline
average is 45 minutes with a Standard Deviation of 10
minutes.
–  Because this is just a sample the 45 minute average is an
estimate of the true average.
–  Using the Central Limit Theorem there is 95% probability the
true average is somewhere between 40.5 and 49.5 minutes.
–  Therefore do not get too excited if you made a process
change resulting in a reduction of only 2 minutes.
What is the likelihood of getting a sample with a 2 minute difference? This could be caused either by implementing changes or could be a result of random sampling variation (sampling error). The 95% confidence interval, which extends about 4.5 minutes either side of the average, exceeds the 2 minute difference (delta) seen as a result. What is the delta caused from? This could be a true difference in performance or random sampling error. This is why you look further than only relying on point estimators.
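The interval in this example follows from the standard error. The Python sketch below reproduces the arithmetic using the ± 2 standard error rule stated in the text; a t-based interval for 20 setups would be slightly wider. The variable names are illustrative.

```python
from math import sqrt

# Values from the setup-time example.
n = 20          # number of setups sampled
mean = 45.0     # baseline average, minutes
s = 10.0        # Standard Deviation, minutes

standard_error = s / sqrt(n)
low = mean - 2 * standard_error
high = mean + 2 * standard_error

observed_change = 2.0  # minutes saved after the process change

print(f"Standard error: {standard_error:.2f} minutes")
print(f"Approximate 95% interval: {low:.1f} to {high:.1f} minutes")
print("Change larger than 2 standard errors?",
      observed_change > 2 * standard_error)
```

Because the 2 minute change is well inside the interval, it cannot be distinguished from sampling error on this evidence alone.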
Sample Size and the Mean

When taking a sample we have only estimated the true Mean. All we know is the true Mean lies somewhere within the theoretical distribution of sample Means, or the t-distribution, which is analyzed using t-tests. T-tests measure the significance of differences between Means.
[Figure: Distribution of individuals in the population, with the theoretical distributions of sample Means for n = 2 and n = 10]
Standard Error of the Mean
The Standard Deviation for the distribution of Means is called the standard error of the Mean and is defined as:
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
where σ is the population Standard Deviation and n is the sample size.
Standard Error
The rate of change in the Standard Error approaches zero at about 30 samples.
[Figure: Standard Error plotted against Sample Size (0 to 30)]
This is why 30 samples is often recommended when generating summary statistics such as the Mean and Standard Deviation.
This is also the point at which the t and Z distributions become nearly equivalent.
When comparing Standard Error with sample size the rate of change in the Standard Error
approaches zero at about 30 samples. This is why a sample size of 30 comes up often in
discussions on sample size.
This is the point at which the t and the Z distributions become nearly equivalent. If you look at a Z table and a t table and compare Z = 1.96 to t at 0.975, the two values become equal as the degrees of freedom approach infinity.
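The flattening of the Standard Error curve can be tabulated. The Python sketch below assumes a population Standard Deviation of 1 purely for illustration, and uses the standard library to recover the Z value of 1.96 quoted above; the helper name `drop_per_extra_observation` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

sigma = 1.0  # assumed population Standard Deviation, for illustration only
sizes = [1, 2, 5, 10, 20, 30, 50, 100]
standard_errors = [sigma / sqrt(n) for n in sizes]

def drop_per_extra_observation(n):
    """Reduction in Standard Error from adding one observation to n."""
    return sigma / sqrt(n) - sigma / sqrt(n + 1)

for n, se in zip(sizes, standard_errors):
    print(f"n = {n:3d}: SE = {se:.3f}, "
          f"gain from one more observation = {drop_per_extra_observation(n):.4f}")

# Z value with 97.5% of the Normal Distribution below it.
z_975 = NormalDist().inv_cdf(0.975)
print(f"Z at 0.975 = {z_975:.2f}")
```

Going from 2 to 3 observations cuts the Standard Error noticeably, while going from 30 to 31 changes it very little, which is the diminishing return the chart shows.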
§  Explain the term “Inferential Statistics”
§  Explain the Central Limit Theorem
§  Describe what impact sample size has on your estimates of population parameters
§  Explain Standard Error
You have now completed Analyze Phase – Inferential Statistics.
Lean Six Sigma

Green Belt Training
Analyze Phase
Introduction to Hypothesis Testing
Now we will continue in the Analyze Phase with “Introduction to Hypothesis Testing”.

Overview
The core fundamentals of this phase are Hypothesis Testing, Tests for Central Tendency, Tests for Variance and ANOVA. We will examine the meaning of each of these and show you how to apply them.
[Roadmap: Welcome to Analyze, “X” Sifting, Inferential Statistics, Intro to Hypothesis Testing (Hypothesis Testing Purpose, Tests for Central Tendency, Tests for Variance, ANOVA), Hypothesis Testing ND P1, Hypothesis Testing ND P2]
Six Sigma Goals and Hypothesis Testing
Our goal is to improve our Process Capability. This translates to the need to move the process Mean
(or proportion) and reduce the Standard Deviation.
§  Because it is too expensive or too impractical (not to mention theoretically impossible) to
collect population data we will make decisions based on sample data.
§  Because we are dealing with sample data there is some uncertainty about the true
population parameters.
Hypothesis Testing helps us make fact-based decisions about whether there are different population
parameters or that the differences are just due to expected sample variation.
[Minitab output: Process Capability of Process Before and Process Capability of Process After]

Process Data        Before       After
LSL                 100.00000    100.00000
Target              *            *
USL                 120.00000    120.00000
Sample Mean         108.65832    109.86078
Sample N            150          100
StDev (Within)      2.35158      1.55861
StDev (Overall)     5.41996      1.54407

Potential (Within) Capability
                    Before       After
Cp                  1.42         2.14
CPL                 1.23         2.11
CPU                 1.61         2.17
Cpk                 1.23         2.11
CCpk                1.42         2.14

Overall Capability
                    Before       After
Pp                  0.62         2.16
PPL                 0.53         2.13
PPU                 0.70         2.19
Ppk                 0.53         2.13
Cpm                 *            *

Performance (PPM)   Before: Observed / Exp. Within / Exp. Overall    After: Observed / Exp. Within / Exp. Overall
PPM < LSL           6666.67 / 115.74 / 55078.48                      0.00 / 0.00 / 0.00
PPM > USL           0.00 / 0.71 / 18193.49                           0.00 / 0.00 / 0.00
PPM Total           6666.67 / 116.45 / 73271.97                      0.00 / 0.00 / 0.00
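The capability indices in the output above can be recomputed from the specification limits, the sample Mean and the two Standard Deviations. The Python sketch below uses the usual definitions (specification width over six Standard Deviations for Cp and Pp, distance from the Mean to the nearer limit over three Standard Deviations for Cpk and Ppk); the function name and dictionary keys are illustrative choices, not MINITAB™ names.

```python
def capability(lsl, usl, mean, sd):
    """Capability indices for one Standard Deviation estimate."""
    lower = (mean - lsl) / (3 * sd)   # CPL or PPL
    upper = (usl - mean) / (3 * sd)   # CPU or PPU
    return {
        "overall": (usl - lsl) / (6 * sd),  # Cp or Pp
        "lower": lower,
        "upper": upper,
        "k": min(lower, upper),             # Cpk or Ppk
    }

LSL, USL = 100.0, 120.0

# "Before" process: within and overall Standard Deviations from the output.
before_within = capability(LSL, USL, 108.65832, 2.35158)
before_overall = capability(LSL, USL, 108.65832, 5.41996)

# "After" process.
after_within = capability(LSL, USL, 109.86078, 1.55861)
after_overall = capability(LSL, USL, 109.86078, 1.54407)

print(f"Before: Cp = {before_within['overall']:.2f}, Cpk = {before_within['k']:.2f}, "
      f"Pp = {before_overall['overall']:.2f}, Ppk = {before_overall['k']:.2f}")
print(f"After:  Cp = {after_within['overall']:.2f}, Cpk = {after_within['k']:.2f}, "
      f"Pp = {after_overall['overall']:.2f}, Ppk = {after_overall['k']:.2f}")
```

The Mean moved by only about 1.2 units, so most of the improvement comes from the drop in the overall Standard Deviation.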
Purpose of Hypothesis Testing
The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with the
Voice of the Business to make data-based decisions to resolve problems.
Hypothesis Testing can help avoid high costs of experimental efforts by using existing data. This
can be likened to:
Local store costs versus mini bar expenses.
There may be a need to eventually use experimentation but careful data analysis can
indicate a direction for experimentation if necessary.
The probability of occurrence is based on a pre-determined statistical confidence.

§  Decisions are based on:
§  Beliefs (past experience)
§  Preferences (current needs)
§  Evidence (statistical data)
§  Risk (acceptable level of failure)
The Basic Concept for Hypothesis Tests
Recall from the discussion on classes and cause of distributions that a data set may seem Normal
yet still be made up of multiple distributions. Hypothesis Testing can help establish a statistical
difference between factors from different distributions.
[Figure: several overlapping distributions, freq (0.0 to 0.8) plotted against x (-3 to 3). Did my sample come from this population? Or this? Or this?]
Because we typically cannot test an entire population we must use samples from the population to make inferences. Since we are using sample data, not the entire population, we need methods to assure the sample is a fair representation of the population.
When we use a proper sample size Hypothesis Testing gives us a way to detect the likelihood a
sample came from a particular distribution. Sometimes the questions can be: Did our sample come
from a population with a Mean of 100? Is our sample variance significantly different than the
variance of the population? Is it different from a target?
Significant Difference
Are the two distributions significantly different from each other?
How sure are we of our decision?
How does the number of observations affect our confidence in detecting the population Mean?
[Figure: Sample 1 with Mean µ1 and Sample 2 with Mean µ2]
Do you see a difference between Sample 1 and Sample 2? There may be a real difference
between the samples shown; however, we may not be able to determine a statistical difference. Our
confidence is established statistically which has an effect on the necessary sample size. Our ability
to detect a difference is directly linked to sample size and in turn whether we practically care about
such a small difference.
Detecting Significance
Statistics provide a methodology to detect differences.

–  Examples might include differences in suppliers, shifts or
equipment.
–  Two types of significant differences occur and must be well
understood…. practical and statistical.
–  Failure to tie these two differences together is one of the most
common errors in statistics.
HO: The sky is not falling.
HA: The sky is falling.
We will discuss the difference between practical and statistical throughout this session. We can
affect the outcome of a statistical test simply by changing the sample size.
Practical versus Statistical
Practical Difference: The difference resulting in an improvement of

practical or economic value to the company.
–  Example, an improvement in yield from 96 to 99 percent.
Statistical Difference: A difference or change to the process that

probably (with some defined degree of confidence) did not happen
by chance.
–  Examples might include differences in suppliers, markets or servers.
We will see it is possible to realize a statistically

significant difference without realizing a
practically significant difference.
Let’s take a moment to explore the concept of Practical Differences versus Statistical Differences.
Detecting Significance
During the Measure Phase it is important the nature of the problem be well understood.
In understanding the problem the practical difference to be achieved must match the statistical difference.
The difference can be either a change in the Mean or in the variance.
Detection of a difference is then accomplished using statistical Hypothesis Testing.
[Figure: Mean Shift and Variation Reduction]
An important concept to understand is the process of detecting a significant change. How much of a shift in the Mean will offset the cost in making a change to the process?
This is not necessarily the full shift from the Business Case of your project. Realistically, how small or how large a delta is required? The larger the delta the smaller the necessary sample will be because there will be a very small overlap of the distributions. The smaller the delta is the larger the sample size has to be to be able to detect a statistical difference.
Hypothesis Testing
A Hypothesis Test is an a priori theory relating to differences between variables.
A statistical test or Hypothesis Test is performed to prove or disprove the theory.
A Hypothesis Test converts the practical problem into a statistical problem.

§  Since relatively small sample sizes are used to
estimate population parameters there is always a
chance of collecting a non-representative sample.
§  Inferential Statistics allows us to estimate the
probability of getting a non-representative sample
DICE Example
You have rolled dice before, have you not? You know, dice that you would find in a board game or in Las Vegas.
Well, assume we suspect a single die is “Fixed,” meaning it has been altered in some form or fashion to make a certain number appear more often than it rightfully should.
Consider how we would go about determining if in fact a die was loaded.
We could throw a die a number of times and track how many times each face occurred. With a standard die we would expect each face to occur 1/6 or 16.67% of the time.
If we threw the die 5 times and got 5 ones what would you conclude? How sure can you be?
–  Pr (1 one) = 0.1667    Pr (5 ones) = (0.1667)^5 = 0.00013
There are approximately 1.3 chances out of 10,000 we could have gotten 5 ones with a standard die.
Therefore we would say we are willing to take a 0.013% chance of being wrong about our hypothesis that the die was loaded since the results do not come close to our predicted outcome for a standard die.
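The probability in the dice example is a single multiplication, shown here as a short Python sketch. Exact fractions are used so the result is not affected by rounding 1/6 to 0.1667; the variable names are illustrative.

```python
from fractions import Fraction

# Probability of rolling a one with a standard (fair) die.
p_one = Fraction(1, 6)

# Five independent throws all showing a one.
p_five_ones = p_one ** 5

print(f"Pr(1 one)  = {float(p_one):.4f}")
print(f"Pr(5 ones) = {p_five_ones} = {float(p_five_ones):.5f}")
print(f"That is about {float(p_five_ones) * 10000:.1f} chances in 10,000")
```

A result this unlikely under the "standard die" assumption is what leads us to reject that assumption.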
Hypothesis Testing
[Figure: DECISIONS depend on α, β and n]
When it comes to Hypothesis Testing, you must look at three focus points to help validate your claim. These points are Type I (α), Type II (β) and Sample Size (n).
Statistical Hypothesis
A hypothesis is a predetermined theory about the nature of, or relationships between, variables.
Statistical tests can prove (with a certain degree of confidence) a relationship exists. With
Hypothesis Testing the primary assumption is the null hypothesis is true. Therefore statistically you
can only reject or fail to reject the null hypothesis.
If the null is rejected this means you have data that supports the alternative hypothesis.
We have two alternatives for hypothesis:
–  The null hypothesis Ho assumes there are no differences or relationships. This is the default assumption of all statistical tests.
–  The alternative hypothesis Ha states there is a difference or relationship.
P-value > 0.05: fail to reject Ho (no difference or relationship detected)
P-value < 0.05: reject Ho in favor of Ha (there is a difference or relationship)
Making a decision does not FIX a problem, taking action does.
Steps to Statistical Hypothesis Test

There are six steps to Hypothesis Testing. With Step 3 your alpha may change depending on the problem at hand. An alpha of 0.05 is common in most manufacturing. In transactional projects an alpha of 0.10 is common when dealing with human behavior. Being 90% confident a change to a sales procedure will produce results is most likely a good approach. A not-so-common alpha is 0.01. This is only used when it is necessary to make the null hypothesis very difficult to reject.
1.  State the Practical Problem.
2.  State the Statistical Problem.
    a)  HO: ___ = ___
    b)  HA: ___ ≠ ,>,< ___
3.  Select the appropriate statistical test and risk levels.
    a)  α = .05
    b)  β = .10
4.  Establish the sample size required to detect the difference.
5.  State the Statistical Solution.
6.  State the Practical Solution.
Noooot THAT practical solution!
Any differences between observed data and claims made under H0 may be real or due to chance. Hypothesis Tests determine the probabilities of these differences occurring solely due to chance and call them P-values.
The α level of a test (level of significance) represents the yardstick against which P-values are measured and H0 is rejected if the P-value is less than the alpha level.
The most commonly used levels are 5%, 10% and 1%.
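The decision rule in these steps (reject H0 when the P-value is below alpha) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MINITAB™ procedure: it uses a two-sided Z test that treats the Standard Deviation as known, and it reuses the setup-time numbers (baseline 45 minutes, Standard Deviation 10, 20 setups) with a hypothetical new average of 43 minutes. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(sample_mean, h0_mean, sigma, n, alpha=0.05):
    """Return (z, p_value, reject_h0) for H0: mean = h0_mean."""
    z = (sample_mean - h0_mean) / (sigma / sqrt(n))
    # Probability of a result at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical question: did the average setup time really move from 45 to 43?
z, p_value, reject_h0 = two_sided_z_test(
    sample_mean=43.0, h0_mean=45.0, sigma=10.0, n=20, alpha=0.05)

print(f"z = {z:.3f}, P-value = {p_value:.3f}")
print("Statistical solution:",
      "reject H0" if reject_h0 else "fail to reject H0")
```

With a P-value well above 0.05 the statistical solution is to fail to reject H0; the practical solution is that a 2 minute change has not been demonstrated with this sample size.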
Hypothesis Testing Risk

The alpha risk or Type 1 Error (generally called the “Producer’s Risk”) is the probability we could
be wrong in saying something is “different.” It is an assessment of the likelihood the observed
difference could have occurred by random chance. Alpha is the primary decision-making tool of
most statistical tests.
                                     Actual Conditions
Statistical Conclusions              Not Different (Ho is True)    Different (Ho is False)
Not Different (Fail to Reject Ho)    Correct Decision              Type II Error
Different (Reject Ho)                Type 1 Error                  Correct Decision

Alpha risk can also be explained as: The risk with implementing a change when you should not.
Alpha risk is typically lower than beta risk because you are more hesitant to make a mistake about claiming the significance of an X (and therefore spending money) as compared to overlooking an X (which is never revealed).
There are two types of error: Type I with an associated risk equal to alpha (the first letter in the Greek alphabet), and Type II with an associated risk equal to beta.
$\alpha = \Pr(\text{Type I error}) = \Pr(\text{reject } H_0 \mid H_0 \text{ is true})$
The formula reads: alpha is equal to the probability of making a Type 1 error or alpha is equal to the probability of rejecting the null hypothesis when the null hypothesis is true.
284
Alpha Risk
Alpha (α) risks are expressed relative to a reference distribution.
Distributions include:
–  t-distribution
–  z-distribution
–  χ²-distribution
–  F-distribution
The α-level is represented by the clouded areas in the tails of the distribution (the regions of DOUBT). Sample results in these areas lead to rejection of H0. Results between them are accepted as chance differences.
Hypothesis Testing Risk
The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the probability we could be
wrong in saying two or more things are the same when, in fact, they are different.

Statistical Conclusions              Actual Conditions
                                     Not Different (Ho is True)    Different (Ho is False)
Not Different (Fail to Reject Ho)    Correct Decision              Type II Error
Different (Reject Ho)                Type I Error                  Correct Decision

Another way to describe beta risk is failing to recognize an improvement. Chances are the sample
size was inappropriate or the data was imprecise and/or inaccurate.
Reading the formula: Beta is equal to the probability of making a Type 2 error.
Or: Beta is equal to the probability of failing to reject the null hypothesis given that the null
hypothesis is false.
Beta Risk
Beta Risk is the probability of failing to reject the null hypothesis when a difference exists.
Beta and sample size are very closely related. When calculating sample size in MINITAB™ we always enter the “power” of the test, which is one minus beta. In doing so we are establishing a sample size that will allow the proper overlap of distributions.
[Figure: the distribution if H0 is true (centered on the H0 value) and the distribution if Ha is true (centered on µ), divided by the critical value of the test statistic. Reject H0 region: α = Pr(Type I error), α = 0.05. Accept H0 region: β = Pr(Type II error).]
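To make the link between beta, power and sample size concrete, here is a minimal sketch. It approximates beta for a two-sided test with the Normal (z) distribution instead of the t-distribution MINITAB™ uses, so its numbers will differ somewhat from MINITAB™ output. The function name `beta_risk` and the example inputs are illustrative assumptions, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def beta_risk(delta, sigma, n, alpha=0.05):
    """Approximate beta (Type II risk) for a two-sided z-test.

    delta: true shift in the Mean, sigma: Standard Deviation,
    n: sample size, alpha: Type I risk. Normal approximation only.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value of the test statistic
    shift = delta * sqrt(n) / sigma     # shift measured in Standard Errors
    # Probability the test statistic stays between the two critical values
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


# Hypothetical case: a shift of 1 Standard Deviation at several sample sizes
for n in (2, 10, 30):
    beta = beta_risk(delta=1.0, sigma=1.0, n=n)
    print(f"n={n:2d}  beta={beta:.3f}  power={1 - beta:.3f}")
```

As n grows the two distributions overlap less, beta falls and power (one minus beta) rises.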
Distinguishing between Two Samples
Recall from the Central Limit Theorem: as the number of individual observations increases, the Standard Error decreases.
In this example (δ = 5, S = 1), when n = 2 we cannot distinguish the difference between the Means (> 5% overlap, P-value > 0.05).
When n = 30 we can distinguish between the Means (< 5% overlap, P-value < 0.05). There is a significant difference.
[Figure: theoretical distributions of Means separated by δ, shown once for n = 2 and once for n = 30.]
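The effect of sample size on the Standard Error can be checked with a few lines of arithmetic. This is only a sketch of the relationship SE Mean = S / √n; the function name `standard_error` is an assumption made for illustration.

```python
from math import sqrt


def standard_error(s, n):
    """Standard Error of the Mean: S divided by the square root of n."""
    return s / sqrt(n)


# With S = 1 the distribution of Means narrows as n grows
for n in (2, 30):
    print(f"n={n:2d}  SE Mean={standard_error(1.0, n):.3f}")
```

The narrower the distribution of Means, the less two such distributions overlap for the same δ.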
Delta Sigma: The Ratio between δ and S
Delta (δ) is the size of the difference between two Means or one Mean and a target value.
Sigma (S) is the sample Standard Deviation of the distribution of individuals of one or both of the samples under question.
When the ratio δ/S is large we do not need statistics because the differences are so large.
If the variance of the data is large it is difficult to establish differences. We need larger sample sizes to reduce uncertainty.
[Figure: two pairs of distributions, one pair with a large Delta (δ) and one pair with a large S.]
We want to be 95% confident in all of our estimates!

All samples are estimates of the population. All statistics based on samples are estimates of the
equivalent population parameters. All estimates could be wrong!
These are typical questions you will experience or hear during sampling. The most common answer
is “It depends,” primarily because someone could say a sample of 30 is perfect where that may
actually be too many. The point is you do not know what the right sample size is without the test.
Question: How many samples should we take?
Answer: Well, that depends on the size of your delta and Standard Deviation.
Question: How should we conduct the sampling?
Answer: Well, that depends on what you want to know.
Question: Was the sample we took large enough?
Answer: Well, that depends on the size of your delta and Standard Deviation.
Question: Should we take some more samples just to be sure?
Answer: No, not if you took the correct number of samples the first time!
The Perfect Sample Size
The minimum sample size required to provide exactly 5% overlap (risk) in order to distinguish the Delta.
Note: If you are working with non-Normal Data multiply your calculated sample size by 1.1.
[Figure: the population distribution and the distribution of sample Means on a common axis running from 40 to 70.]
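A rough feel for how delta and Standard Deviation drive sample size can be had from the common Normal-approximation formula n ≈ ((z for α/2 + z for β) × S / δ)². This is a sketch, not the calculation MINITAB™ performs: MINITAB™ works with the t-distribution and will usually return a slightly larger n. The function name `approx_sample_size` and its arguments are assumptions for illustration; the 1.1 multiplier is the non-Normal adjustment stated above.

```python
from math import ceil
from statistics import NormalDist


def approx_sample_size(delta, sigma, alpha=0.05, beta=0.10, non_normal=False):
    """Approximate n needed to detect a shift of delta (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    if non_normal:
        n *= 1.1  # rule of thumb from the note above
    return ceil(n)


# A shift of 1.5 "generic" Standard Deviations, and the same ratio in real units
print(approx_sample_size(delta=1.5, sigma=1.0))
print(approx_sample_size(delta=1.2, sigma=0.8))
# A smaller shift needs more samples
print(approx_sample_size(delta=1.0, sigma=1.0))
print(approx_sample_size(delta=1.0, sigma=1.0, non_normal=True))
```

Only the ratio δ/S matters in this approximation, which is why the first two calls agree.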
Hypothesis Testing Roadmap – Continuous Data
Here is a Hypothesis Testing roadmap for Continuous Data. This is a great reference tool while you
are conducting Hypothesis Tests.
Continuous Data, Normal:
–  1 Sample Variance
–  1 Sample t-test
–  Test of Equal Variance
•  Variance Equal: 2 Sample T (Two Samples) or One Way ANOVA
•  Variance Not Equal: 2 Sample T (Two Samples) or One Way ANOVA
Continuous Data, Non Normal:
–  Test of Equal Variance
–  Median Test
–  Mann-Whitney
–  Several Median Tests
Hypothesis Testing Roadmap – Attribute Data
Attribute Data, One Factor:
–  One Sample: One Sample Proportion
–  Two Samples: Two Sample Proportion
•  Minitab: Stat - Basic Stats - 2 Proportions
•  If P-value < 0.05 the proportions are different
–  Two or More Samples: Chi Square Test (Contingency Table)
•  Minitab: Stat - Tables - Chi-Square Test
•  If P-value < 0.05 at least one proportion is different
Attribute Data, Two Factors:
–  Chi Square Test (Contingency Table)
•  Minitab: Stat - Tables - Chi-Square Test
•  If P-value < 0.05 the factors are not independent
Common Pitfalls to Avoid
While using Hypothesis Testing the following facts should be borne in mind at the conclusion stage:
–  The decision is about Ho and NOT Ha.
–  The conclusion statement is whether the contention of Ha was upheld.
–  The null hypothesis (Ho) is on trial.
–  When a decision has been made:
•  Nothing has been proved.
•  It is just a decision.
•  All decisions can lead to errors (Types I and II).
–  If the decision is to Reject Ho then the conclusion should read: “There is sufficient evidence at the α level of significance to show that [state the alternative hypothesis Ha].”
–  If the decision is to Fail to Reject Ho then the conclusion should read: “There is not sufficient evidence at the α level of significance to show that [state the alternative hypothesis Ha].”
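The wording rules above are mechanical enough to sketch in code. The helper below is hypothetical (the name `conclusion` and its arguments are not from the course); it simply compares a P-value with alpha and fills in the two sentence templates.

```python
def conclusion(p_value, alpha, alternative):
    """Return the decision about Ho and the matching conclusion sentence."""
    if p_value < alpha:
        decision = "Reject Ho"
        evidence = "There is sufficient evidence"
    else:
        decision = "Fail to Reject Ho"
        evidence = "There is not sufficient evidence"
    sentence = (f"{evidence} at the {alpha} level of significance "
                f"to show that {alternative}.")
    return decision, sentence


# Hypothetical usage with an alternative hypothesis written out in words
print(conclusion(0.034, 0.05, "the Mean is not equal to 5"))
print(conclusion(0.201, 0.05, "the Mean is not equal to 32"))
```

Note that neither branch says anything has been proved; the function only reports a decision about Ho.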
§  Articulate the purpose of Hypothesis Testing

§  Explain the concepts of the Central Tendency
§  Be familiar with the types of Hypothesis Tests
You have now completed Analyze Phase – Introduction to Hypothesis Testing.
Lean Six Sigma

Green Belt Training
Analyze Phase
Hypothesis Testing Normal Data Part 1
Now we will continue in the Analyze Phase with “Hypothesis Testing Normal Data Part 1”.
Overview
The core topics of this phase are Hypothesis Testing, Tests for Central Tendency, Tests for Variance and ANOVA. We will examine the meaning of each of these and show you how to apply them.
Analyze Phase modules:
–  X Sifting
–  Inferential Statistics
–  Intro to Hypothesis Testing
–  Hypothesis Testing ND P1: Sample Size, Testing Means, Analyzing Results
Test of Means (t-tests)
T-tests are used to compare a Mean against a target and to compare Means from two different
samples and to compare paired data. When comparing multiple Means it is inappropriate to use a t-
test. Analysis of variance or ANOVA is used when it is necessary to compare more than 2 Means.
t-tests are used:
–  To compare a Mean against a target.
•  e.g., the team made improvements and wants to compare the Mean against a target to see if they met the target.
–  To compare Means from two different samples.
•  e.g., machine one to machine two.
•  e.g., supplier one quality to supplier two quality.
–  To compare paired data.
•  e.g., comparing the same part before and after a given process.
1 Sample t
Here we are looking for the region in which we can be 95% certain our true population Mean will lie.
This is based on a calculated average, Standard Deviation, number of trials and a given alpha risk
of .05.
A 1-sample t-test is used to compare an expected population Mean to a target.
In order for the Mean of the sample to be considered not significantly different than the target, the target must fall within the confidence interval of the sample Mean.
MINITAB™ performs a one sample t-test or t-confidence interval for the Mean.
Use 1-sample t to compute a confidence interval and perform a Hypothesis Test of the Mean when the population Standard Deviation, σ, is unknown.
For a one or two-tailed 1-sample t:
–  H0: µsample = µtarget          If P-value > 0.05 fail to reject Ho
–  Ha: µsample ≠, <, > µtarget    If P-value < 0.05 reject Ho
1 Sample t-test Sample Size
[Figure: population distribution with target T. With n = 2 we cannot tell the difference between the sample and the target. With n = 30 we can tell the difference between the sample and the target.]
$$SE\ Mean = \frac{S}{\sqrt{n}}$$
One common pitfall in statistics is not understanding what the proper sample size should be. If you look at the graphic, the question is: Is there a difference between my process Mean and the desired target? If we had population data it would be very easy: no, they are not the same, but they may be within an acceptable tolerance (or specification window). If we took a sample of 2 can we tell a difference? No, because the spread of the distribution of averages from samples of 2 will create too much uncertainty, making it very difficult to statistically say there is a difference.
If you remember from earlier, 95% of the area under the curve of a Normal Distribution falls within
plus or minus 2 Standard Deviations. Confidence intervals are based on your selected alpha level so
if you selected an alpha of 5% the confidence interval would be 95% which is roughly plus or minus 2
Standard Deviations. Using your eye to guesstimate you can see the target value falls within plus or
minus 2 Standard Deviations of the sampling distribution of sample size 2.
If you used a sample of 30 could you tell if the target was different? Just using your eye it appears
the target is outside the 95% confidence interval of the Mean. Luckily MINITABTM makes this very
easy…
Sample Size
Instead of going through the dreadful hand calculations of sample size we will use MINITAB™. Three fields must be filled in and one left blank in the sample size window. MINITAB™ will solve for the field left blank. If you want to know the sample size you must enter the difference, which is the shift that must be detected. It is common to state the difference in terms of “generic” Standard Deviations when you do not have an estimate for the Standard Deviation of the process. For example, if you want to detect a shift of 1.5 Standard Deviations enter that in difference and enter 1 for Standard Deviation. If you knew the Standard Deviation was 0.8 enter it for Standard Deviation and 1.2 for the difference (which is a 1.5 Standard Deviation shift in terms of real values).
If you are unsure of the desired difference or in many cases simply get stuck with a sample size that
you did not have a lot of control over, MINITAB™ will tell you how much of a difference can be
detected. You, as a practitioner, must be careful when drawing Practical Conclusions because it is
possible to have statistical significance without practical significance. In other words - do a reality
check. MINITAB™ has made it easy to see an assortment of sample sizes and differences.
Try the example shown.

Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05  Assumed Standard Deviation = 1

Sample Size  Power  Difference
10           0.9    1.15456
15           0.9    0.90087
20           0.9    0.76446
25           0.9    0.67590
30           0.9    0.61245
35           0.9    0.56408
40           0.9    0.52564

The various sample sizes show how much of a difference can be detected assuming a Standard Deviation = 1.
Notice as the sample size increases there is not as big an effect on the difference. If it was only necessary to see a difference of 0.9 why bother taking any more samples than 15?
The Standard Deviation entered has an effect on the difference calculated. Take a few moments to explore different Standard Deviation sizes in MINITAB™ to see their effect on difference.
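The shape of the table above (the detectable difference shrinking roughly with the square root of the sample size) can be reproduced with a Normal approximation. This is a sketch under that assumption, not MINITAB™'s algorithm: MINITAB™ uses the t-distribution, so the approximate values come out somewhat smaller, most noticeably at small n. The function name is hypothetical; the reference figures are copied from the Session Window output above.

```python
from math import sqrt
from statistics import NormalDist


def approx_detectable_difference(n, sigma=1.0, alpha=0.05, power=0.9):
    """Normal-approximation difference detectable with a sample of size n."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sigma / sqrt(n)


# Differences reported by MINITAB for Power = 0.9 and Standard Deviation = 1
minitab_difference = {
    10: 1.15456, 15: 0.90087, 20: 0.76446, 25: 0.67590,
    30: 0.61245, 35: 0.56408, 40: 0.52564,
}

for n, reported in minitab_difference.items():
    approx = approx_detectable_difference(n)
    print(f"n={n:2d}  MINITAB={reported:.5f}  approximation={approx:.5f}")
```

Changing `sigma` in the call scales every difference proportionally, which is the effect the text invites you to explore.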
1-Sample t Example
1. Practical Problem:
•  We are considering changing suppliers for a part we currently purchase
from a supplier that charges us a premium for the hardening process.
•  The proposed new supplier has provided us with a sample of their
product. They have stated they can maintain a given characteristic of 5
on their product.
•  We want to test the samples to determine if their claim is accurate.
2. Statistical Problem:
Ho: µN.S. = 5
Ha: µN.S. ≠ 5
3. 1-sample t-test (population Standard Deviation unknown,

comparing to target).
α = 0.05 β = 0.10
Let’s now try a 1-sample t example.
Step 1: Take a moment to review the practical problem

Step 2: The Statistical Problem is: The null hypothesis is the Mean of the new supplier is equal to
5. The alternative hypothesis is the Mean of the new supplier is not equal to 5. This is considered a
2-tailed test if you have heard that terminology before.
Step 3: Our selected alpha level is 0.05 and beta is 0.10.
4. Sample Size:
•  Open the MINITABTM worksheet: Exh_Stat.MTW .
•  Use the C1 column: Values
–  In this case, the new supplier sent 9
samples for evaluation.
–  How much of a difference can be
detected with this sample?
Hypothesis Testing
Follow along in MINITAB™ and as you can see we will be able to detect a difference of 1.24 with the sample of 9. This means we will be able to detect a difference of only 1.24 if the population has a Standard Deviation of 1 unit.
If this was not good enough you would need to request additional samples.
MINITAB™ Session Window
Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05  Assumed Standard Deviation = 1

Sample Size  Power  Difference
9            0.9    1.23748
Example: Follow the Road Map
5. State Statistical Solution
Stat > Basic Statistics > Normality Test…
Now refer to the road map for Hypothesis Testing to first check for Normality. In MINITAB™ select “Stat>Basic Statistics>Normality Test”. For the “Variable” field double-click on “Values” in the left-hand box. Once this is complete select “OK”.
Are the data in the Values column Normal?
Since the P-value is greater than 0.05 we fail to reject the null hypothesis that the data are Normal.
1-Sample t Example
Perform the one sample t-test. In MINITAB™ select “Stat>Basic Statistics>1-Sample t”. From the left-hand box double-click on “Values”.
Click Graphs and select all 3. Click Options… and in CI enter 95.
In the “Options” button there is a selection for the alternative hypothesis; the default is not equal, which corresponds to our hypothesis. If your alternative hypothesis was a greater than or less than you would have to change the default.
Histogram of Values
A Histogram is not especially interesting when there are so few data points, but it does show the 95% confidence interval of the Mean along with the hypothesized value of 5, noted as Ho (the null hypothesis).
Based on the graph we can say there is a statistical difference, that is, we reject the null hypothesis, for the following reason: our target Mean (represented by the red Ho) is outside the confidence boundaries for the population Mean, which tells us there is a significant difference between the population Mean and the target.
Box Plot of Values
The Box Plot shows a different representation of the data but the conclusion is the same.
Individual Value Plot (Dot Plot)
As you will see the conclusion is the same but the Dot Plot is just another representation of data.
Session Window
One-Sample T: Values
Test of mu = 5 vs not = 5    (Ho: mu = 5, Ha: mu not = 5)

Variable  N  Mean     StDev    SE Mean  95% CI              T      P
Values    9  4.78889  0.24721  0.08240  (4.59887, 4.97891)  -2.56  0.034

$$s = \sqrt{\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n-1}}$$
$$SE\ Mean = \frac{S}{\sqrt{n}}$$
T-Calc = (Observed – Expected) / SE Mean
T-Calc = (X-bar – Target) / Standard Error
T-Calc = (4.7889 – 5) / 0.0824 = -2.56
N – sample size
Mean – calculated mathematical average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
95% CI – confidence interval within which our population average will fall, between 4.5989 and 4.9789
Shown here is the MINITAB™ Session Window output for the 1-Sample t-test.
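The SE Mean and T-Calc lines of the Session Window can be re-derived from the summary statistics alone. The sketch below uses the N, Mean and StDev values shown in the output; the function name `one_sample_t` is an assumption for illustration, and the P-value is not computed here because it requires the t-distribution.

```python
from math import sqrt


def one_sample_t(mean, stdev, n, target):
    """Return (SE Mean, T-Calc) for a 1-sample t-test from summary statistics."""
    se_mean = stdev / sqrt(n)           # Standard Deviation of the Means
    t_calc = (mean - target) / se_mean  # (Observed - Expected) / SE Mean
    return se_mean, t_calc


# Summary statistics taken from the Session Window output
se_mean, t_calc = one_sample_t(mean=4.78889, stdev=0.24721, n=9, target=5.0)
print(f"SE Mean = {se_mean:.5f}")
print(f"T-Calc  = {t_calc:.2f}")
```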
Evaluating the Results
Since the P-value of 0.034 is less than 0.05 reject the null hypothesis.
Based on the samples given there is a difference between the average of the sample and the desired target.
6. State Practical Conclusions
The new supplier’s claim they can meet the target of 5 for the hardness is not correct.
Manual Calculation of 1-Sample t
Let’s compare the manual calculations to what the computer calculates.
–  Calculate t-statistic from data:
$$t = \frac{\bar{X} - \text{Target}}{s/\sqrt{n}} = \frac{4.79 - 5.00}{0.247/\sqrt{9}} = -2.56$$
–  Determine critical t-value from t-table in reference section.
•  When the alternative hypothesis has a not equal sign it is a two-sided test.
•  Split the α in half and read from the 0.975 column in the t-table for n - 1 (9 - 1 = 8) degrees of freedom.
Here are the manual calculations of the 1-sample t; verify that MINITAB™ is correct.
Manual Calculation of 1-Sample t
T-Distribution
degrees of freedom  .600 .700 .800 .900 .950 .975 .990 .995
1 0.325 0.727 1.376 3.078 6.314 12.706 31.821 63.657
2 0.289 0.617 1.061 1.886 2.920 4.303 6.965 9.925
3 0.277 0.584 0.978 1.638 2.353 3.182 4.541 5.841
4 0.271 0.569 0.941 1.533 2.132 2.776 3.747 4.604
5 0.267 0.559 0.920 1.476 2.015 2.571 3.365 4.032
6 0.265 0.553 0.906 1.440 1.943 2.447 3.143 3.707

7 0.263 0.549 0.896 1.415 1.895 2.365 2.998 3.499
8 0.262 0.546 0.889 1.397 1.860 2.306 2.896 3.355
9 0.261 0.543 0.883 1.383 1.833 2.262 2.821 3.250
10 0.260 0.542 0.879 1.372 1.812 2.228 2.764 3.169
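[Figure: t-distribution centered at 0 with critical regions (α/2 = .025 in each tail) beyond -2.306 and 2.306; the calculated t of -2.56 falls in the lower critical region.]
The data supports the alternative hypothesis that the estimate for the Mean of the population is not 5.0.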
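Confidence Intervals for Two-Sided t-test
Here is the formula for the confidence interval. Notice we get the same results as MINITAB™.
The formula for a two-sided t-test is:
$$\bar{X} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
or
$$\bar{X} \pm t_{crit} \times SE_{mean} = 4.7889 \pm 2.306 \times 0.0824$$
4.5989 to 4.9789
[Figure: confidence interval from 4.5989 to 4.9789 around X-bar = 4.7889, with Ho outside the interval.]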
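The interval can be verified by hand or with a short calculation. This sketch takes the critical value 2.306 from the t-table (0.975 column, 8 degrees of freedom) rather than computing it; the function name `t_confidence_interval` is an assumption for illustration.

```python
from math import sqrt


def t_confidence_interval(mean, stdev, n, t_crit):
    """Two-sided confidence interval: mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * stdev / sqrt(n)
    return mean - half_width, mean + half_width


# Summary statistics from the Session Window; t_crit read from the t-table
low, high = t_confidence_interval(mean=4.78889, stdev=0.24721, n=9, t_crit=2.306)
target = 5.0
print(f"95% CI: {low:.4f} to {high:.4f}")
print("Target inside interval:", low <= target <= high)
```

Because the target of 5 lies outside the interval, the interval and the P-value lead to the same decision.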
1-Sample t Exercise
Exercise objective: Utilize what you have learned to

conduct and analyze a one sample t-test using
MINITABTM.
1.  The last engineering estimation said we would achieve a product with average results of 32 parts per million (ppm).
2.  We want to test if we are achieving this performance level. We want to know if we are on target with 95% confidence in our answer. Use worksheet HYPOTTESTSTUD with data in column ppm VOC.
3.  Are we on Target?
1-Sample t Exercise: Solution

Since we do not know the population Standard Deviation we will use the 1 sample t-test to
determine if we are at target.

After selecting column C1 and
setting “Hypothesis Mean” to
32.0, click “Graphs” and
select “Histogram of data” to
get a good visualization of the
analysis.
Depending on the test you are

running you may need to
select “Options” to set your
desired confidence Interval
and hypothesis. In this case
the MINITABTM Defaults are
what we want.
Because we used the option of “Graphs” we get a nice visualization of the data in a Histogram AND a plot of the null hypothesis relative to the confidence interval of the population Mean.
Because the null hypothesis is within the confidence interval we “fail to reject” the null hypothesis: there is not sufficient evidence to show the equipment is running off the target of 32.0.
In MINITAB™’s Session Window (ctrl – M) you can see the P-value of 0.201. Because it is above 0.05 we “fail to reject” the null hypothesis, so we continue to treat the equipment as giving product at a target of 32.0 ppm VOC.
Hypothesis Testing Roadmap
[Figure: roadmap for Continuous Data, Normal, highlighting the Two Samples branches.]
2 Sample t-test
A 2-sample t-test is used to compare two Means.
Stat > Basic Statistics > 2-Sample t
MINITAB™ performs an independent two-sample t-test to generate a confidence interval.
Use 2-Sample t to perform a Hypothesis Test and compute a confidence interval of the difference between two population Means when the population Standard Deviations, σ’s, are unknown.
Two tailed test:
–  H0: µ1 = µ2    If P-value > 0.05 fail to reject Ho
–  Ha: µ1 ≠ µ2    If P-value < 0.05 reject Ho
One tailed test:
–  H0: µ1 = µ2
–  Ha: µ1 > or < µ2
Notice the difference in the hypothesis for two-tailed vs. one-tailed test. This terminology is only used to know which column to look down in the t-table.
Sample Size
Instead of going through the dreadful hand calculations of sample size we will use MINITABTM;
select “Stat>Power and Sample Size>2-Sample t”. Three fields must be filled in and one left blank
in the sample size window. MINITABTM will solve for the third. If you want to know the sample size
you must enter the difference which is the shift that must be detected. It is common to state the
difference in terms of “generic” Standard Deviations when you do not have an estimate for the
Standard Deviation of the process. For example if you want to detect a shift of 1.5 Standard
Deviations enter that in difference and enter 1 for Standard Deviation. If you knew the Standard
Deviation and it was 0.8 enter it for Standard Deviation and 1.2 for the difference (which is a 1.5
Standard Deviation shift in terms of real values).
If you are unsure of the desired difference or in many cases simply get stuck with a sample size
you did not have a lot of control over, MINITAB™ will tell you how much of a difference can be
detected. You as a practitioner must be careful when drawing Practical Conclusions because it is
possible to have statistical significance without practical significance. In other words - do a
reality check. MINITAB™ has made it easy to see an assortment of sample sizes and differences. Try
the example shown.

Power and Sample Size
2-Sample t Test
Testing Mean 1 = Mean 2 (versus not equal)
Calculating power for Mean 1 = Mean 2 + difference
Alpha = 0.05  Assumed Standard Deviation = 1

Sample Size  Power  Difference
10           0.9    1.53369
15           0.9    1.22644
20           0.9    1.05199
25           0.9    0.93576
30           0.9    0.85117
35           0.9    0.78605
40           0.9    0.73392
The sample size is for each group.

The various sample sizes show how much of a difference can be detected assuming the Standard Deviation = 1.
As you can see we used the same command here just as in the 1-sample t. Do you think the results are different?
Correct, the results are different.
2-Sample t Example
Over the next several lesson pages we will explore an example for a 2-Sample t-test.
1. Practical Problem:
•  We have conducted a study in order to determine the effectiveness of a new heating system. We have installed two different types of dampers in homes (Damper = 1 and Damper = 2).
•  We want to compare the BTU.In data from the two types of dampers to determine if there is any difference between the two products.
2. Statistical Problem:
Ho: µ1 = µ2
Ha: µ1 ≠ µ2
3. 2-Sample t-test (population Standard Deviations unknown).
α = 0.05  β = 0.10
Step 1. Read the Practical Problem.
Step 2. The null hypothesis is the Mean of BTU.In for damper 1 is equal to the Mean of BTU.In for damper 2. The alternative hypothesis is the Means are not equal.
Step 3. We will use the 2-Sample t-test since the population Standard Deviations are unknown.
Step 4. Open the worksheet in MINITAB™ called “Furnace.MTW”. How is the data coded? The only way we can work with the data in BTU.In is by unstacking the data by damper type.
4. Sample Size:
•  Open the MINITAB™ worksheet: Furnace.MTW
•  Scroll through the data to see how the data is coded.
•  In order to work with the data in the BTU.In column we will need to unstack the data by damper type.
2-Sample t Example
We will unstack the data in BTU.In using the subscripts in Damper. Store the unstacked data after
the last column in use. Check the “Name the columns containing the unstacked data” box. Then
click “OK”.
Data > Unstack Columns…
Notice the “unstacked” data for each damper. We now have two columns.
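For readers who want to see what unstacking does to the data, here is a minimal sketch outside MINITAB™. It splits one column of values into one list per subscript, which is what Data > Unstack Columns produces. The function name `unstack` and the six BTU.In values are hypothetical and are not taken from Furnace.MTW.

```python
from collections import defaultdict


def unstack(values, subscripts):
    """Split a stacked column into one list per subscript value."""
    groups = defaultdict(list)
    for value, subscript in zip(values, subscripts):
        groups[subscript].append(value)
    return dict(groups)


# Hypothetical stacked data: one response column and one subscript column
btu_in = [7.87, 9.43, 7.16, 8.67, 12.31, 9.84]
damper = [1, 1, 2, 2, 1, 2]

columns = unstack(btu_in, damper)
for damper_type, data in columns.items():
    print(f"BTU.In_{damper_type}: {data}")
```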
2-Sample t Example
Now let’s perform a 2 Sample t example. In MINITABTM select “Stat>Power and Sample size>2-
Sample t”.
For the field “Sample Sizes:” enter ‘40 space 50’ because our data set has unequal sample sizes
which is not uncommon. The smallest difference that can be detected is based on the smallest
sample size, so in this case it is: 0.734.
MINITABTM Session Window
Example: Follow the Roadmap…
5. State Statistical Solution
Normality Test – Is the Data Normal?
The data is considered Normal since the P-value is greater than 0.05.
This is the Normality Plot for damper 2. Is the data Normal? It is Normal, continuing down the
roadmap…
Test of Equal Variance (Bartlett’s Test)
In MINITABTM select “Stat>ANOVA>Test for Equal Variance”. This will allow us to perform a
Bartlett’s Test.
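[Figure: Test for Equal Variances output for Sample 1 and Sample 2.]
The P-value of 0.586 indicates there is no statistically significant difference in variance.
Note: Bartlett’s Test is reported for more than 2 samples; the F-test is reported for 2 samples.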
2 Sample t-test Equal Variance
Let’s continue along the roadmap… Perform the 2-Sample t-test; be sure to check the box “Assume
equal variances”.
Box Plot
5. State Statistical Conclusions: Fail to reject the null hypothesis.

6. State Practical Conclusions: There is no difference between the
dampers for BTU’s in.
The Box Plots do not show much of a difference between the dampers.
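Minitab Session Window
Take a moment to review the MINITAB™ Session Window.
[Figure: Session Window output for the Two-Sample T-Test (Variances Equal), annotated with the number of samples and the calculated average. Values legible in the output: -1.450, 0.980 and -0.38.]
$$s = \sqrt{\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n-1}}$$
$$SE\ Mean = \frac{S}{\sqrt{n}}$$
Ho: µ1 = µ2
Ha: µ1 ≠ or < or > µ2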
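When equal variances are assumed, the 2-sample t statistic is built on a pooled Standard Deviation. The sketch below shows the textbook pooled calculation; it is offered as an illustration of the arithmetic, not as a reproduction of MINITAB™'s internals. The function name `pooled_t` and the two small data sets are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se_diff, df


# Hypothetical data for two groups
group_1 = [9.1, 10.4, 9.8, 10.9, 9.5]
group_2 = [10.2, 9.7, 10.8, 10.1, 11.0]
t_stat, df = pooled_t(group_1, group_2)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the t-table value for the stated degrees of freedom, just as in the 1-sample case.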
Exercise
Exercise objective: Utilize what you have learned to conduct and analyze a 2 sample t-test using MINITAB™.
1.  Billy Bob’s Pool Care has conducted a study on the

effectiveness of two chlorination distributors in a swimming
pool. (Distributor 1 & Distributor 2).
2.  The up and coming Billy Bob Jr., looking to prove himself,
wants a comparison done on the Clor.Lev_Post data from
the two types of distributors in order to determine if there is
any difference between the two products.
3.  With 95% confidence is there a significant difference

between the two distributors?
4.  Use data within MINITABTM Worksheet “Billy Bobs Pool.mtw”
2 Sample t-test: Solution
1. What do we want to know: With 95% confidence is there a

significant difference between the two distributors?
H o: µ 1 = µ 2
H a: µ 1 ≠ µ 2
3. 2-Sample t-test (population

Standard Deviations unknown).
α = 0.05 β = 0.10
4. Now we need to look at the data to

determine the Sample Size but let’s
see how the data is formatted first.
Data > Unstack Columns…
•  Unstack the data in: Select Clor.Levl_Post

•  Using subscripts in: Select Distributor
To unstack the data follow the steps here. This will generate two new columns of data shown on the
next page…
By unstacking the data we now have the Clor.Lev data separated by the distributor it came from.
•  Clor.Lev_Post_1 = Distributor 1
•  Clor.Lev_Post_2 = Distributor 2
Now let’s move on to trying to determine correct sample size.
Follow path in MINITAB™: “Stat > Power and Sample Size > 2-Sample t…”
We want to determine what is the

smallest difference that can be
detected based on our data.
Fill in the three areas and leave

“Differences:” blank so that
MINITABTM will tell us the differences
we need.
The smallest difference that can be detected is based on the smallest sample size.
In this case: .7339, rounded to .734.
Follow the path: “Stat > Basic Statistics > Normality Test…”
Check Normality for Clor.Lev_Post_1. The result shows us a P-value of 0.304 so our data is
Normal. Recall if the P-value is greater than .05 then we will consider our data Normal.
Check Normality for Clor.Lev_Post_2. The result shows us a P-value of 0.941 so our data is also
Normal.
Test for Equal Variances
MINITABTM Path: “Stat > ANOVA > Test for Equal Variances…”
For the “Response:” we select our stacked column “Clor.Lev_Post”
For our “Factors:” we select our stacked column “Distributor”
318
Look at the P-value of 0.113.
This tells us there is no statistically significant difference in the variance in these two data sets.
What does this mean? We can finally run a 2 sample t-test with equal variances.
For “Samples:” enter “Clor.Lev_Post”. For “Subscripts:” enter “Distributor”.

Look at the Box Plot and Session Window. There is NO significant difference between the Distributors.
The Box Plots show VERY little difference between the Distributors. Also note the P-value in the Session Window: there is no difference between the two Distributors.
Unequal Variance Example
Open MINITAB™ worksheet: “2 sample unequal variance data”
Normality Test
Run a Normality Test…
Let’s compare the data in Sample one and Sample three columns.
Our data sets are Normally Distributed.
Test for Equal Variance
Stat>ANOVA>Test of Equal Variance
We use the F-Test statistic because our data is Normally Distributed. The P-value is less than 0.05 so our variances are not equal.
[Figure: Test for Equal Variances output showing the Standard Deviations of the samples and the Medians of the samples.]
This is the output from MINITAB™. Notice even though the names of the columns in MINITAB™
were Sample 1 and Sample 3, MINITAB™ used Factor levels 1 and 2 to differentiate the outcome.
We have to interpret the meaning for factor levels properly; it is simply the difference between the
samples labeled one and three in our worksheet.
2-Sample t-test Unequal Variance
UNCHECK
Assume equal
variances box.
You can see there is very little difference in the 2-Sample t-tests.
[Figure: Boxplot of Stacked by C4, with markers indicating the sample Means.]
The Box Plot shows no difference between the Means. The overall box is smaller for the sample on the
left, an indication of the difference in variance.
[Figure: Individual Value Plot of Stacked vs C4, with markers indicating the sample Means.]
By looking at this Individual Value Plot you can notice a big spread or variance of the data.
Two-Sample T-Test
(Variances Not Equal)
Ho: µ1 = µ2 (P-value > 0.05)

Ha: µ1 ≠ or < or > µ2 (P-value < 0.05)
Stat>Basic Stats> 2 sample T (Deselect Assume Equal Variance)
What does the P-value of 0.996 mean? After conducting a 2-sample t-test there is no significant
difference between the Means.
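When equal variances are not assumed, the usual approach is Welch's form of the t statistic, where each sample keeps its own variance and the degrees of freedom are estimated with the Welch-Satterthwaite formula. The sketch below illustrates that arithmetic under the assumption that this is the unpooled test being run; the function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t statistic, approximate degrees of freedom), variances not pooled."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # squared Standard Error of Mean 1
    v2 = variance(sample2) / n2  # squared Standard Error of Mean 2
    t_stat = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical samples with similar Means but very different spread
tight = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0]
wide = [1.0, 9.5, 4.0, 7.5, 2.5, 5.5]
t_stat, df = welch_t(tight, wide)
print(f"t = {t_stat:.3f}, degrees of freedom = {df:.1f}")
```

With unequal spread the estimated degrees of freedom fall well below n1 + n2 - 2, which makes the test more conservative.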
Paired t-test
•  A Paired t-test is used to compare the Means of two measurements from the same samples; it is generally used as a before and after test.
•  MINITAB™ performs a paired t-test. This is appropriate for testing the difference between two Means when the data are paired and the paired differences follow a Normal Distribution.
•  Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner.
•  This matching allows you to account for variability between the pairs, usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
–  Ho: µδ = µo
–  Ha: µδ ≠ µo
[Figure: before and after distributions with Means µbefore and µafter separated by delta (δ).]
•  Where µδ is the population Mean of the differences and µo is the hypothesized Mean of the differences, typically zero.
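A paired t-test is a 1-sample t-test run on the column of differences. The sketch below shows that reduction; the function name `paired_t` and the before/after readings are hypothetical and are not taken from any course worksheet.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after, hypothesized_diff=0.0):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for b, a in zip(before, after)]  # one delta per pair
    n = len(diffs)
    se_mean = stdev(diffs) / sqrt(n)
    return (mean(diffs) - hypothesized_diff) / se_mean, n - 1


# Hypothetical before and after measurements on the same four units
before = [1.0, 2.0, 3.0, 4.0]
after = [2.0, 4.0, 4.0, 6.0]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

Because unit-to-unit variation cancels out in each difference, the Standard Error is based only on the spread of the deltas.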
Example
1.  Practical Problem:
•  We are interested in changing the sole material for a popular brand of shoes for children.
•  In order to account for variation in the activity of the children wearing the shoes, each child will wear one shoe of each type of sole material. The sole material will be randomly assigned to either the left or right shoe.
2.  Statistical Problem:
Ho: µδ = 0
Ha: µδ ≠ 0
3.  Paired t-test (comparing data that must remain paired).
α = 0.05 β = 0.10
Just checking your

souls, er…soles!
Example (cont.)
4. Sample Size:
•  How much of a difference can be detected with 10 samples?
Now open the MINITAB™ worksheet “EXH_STAT DELTA.MTW” for analysis. Use the columns labeled Mat-A and Mat-B.
Paired t-test Example
In MINITABTM open “Stat>Power and Sample size>1-Sample t”. Enter in the appropriate Sample
Size, Power Value and Standard Deviation.
Now that’s a tee test!
This means we will be able to detect a difference of only 1.15 if the Standard Deviation is equal to 1.
Given the sample size of 10 we will be able to detect a difference of 1.15. If this was your process
you would need to decide if this was good enough. In this case, is a difference of 1.15 enough to
practically want to change the material used for the soles of the children’s shoes?
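As a rough cross-check of the detectable difference, the sketch below uses the textbook Normal approximation, delta ≈ (z for α/2 + z for power) × σ / √n. This is an assumption-laden simplification: MINITAB's Power and Sample Size routine works from the t distribution, so it reports a somewhat larger value (the 1.15 quoted above) than this approximation gives.

```python
import math
from statistics import NormalDist


def detectable_difference(n, sigma, alpha=0.05, power=0.90):
    """Normal-approximation estimate of the smallest detectable shift."""
    z = NormalDist()  # standard Normal
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided alpha risk
    z_beta = z.inv_cdf(power)           # power = 1 - beta
    return (z_alpha + z_beta) * sigma / math.sqrt(n)


delta = detectable_difference(n=10, sigma=1.0)
print(f"Approximate detectable difference: {delta:.2f} Standard Deviations")
```

The approximation is optimistic for small samples because it ignores the extra uncertainty from estimating the Standard Deviation, which is why it falls below the MINITAB figure.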
For the next test we must first calculate the difference between the two columns. In MINITABTM
open “Calc>Calculator”. We placed Mat-B first in the equation shown because it was generally
higher than the values for Mat-A.

Calc > Calculator
We need to calculate the difference between the two distributions. We are concerned with the delta: does the value hypothesized under Ho (zero) fall outside the confidence interval?
Check this box so MINITAB™ will recalculate as new data is entered.
Following the Hypothesis Test roadmap we first test the AB-Delta distribution for Normality.
1-Sample t
Stat > Basic Statistics > 1-Sample t-test…

Since there is only one column, AB Delta, we do not test for Equal Variance per the Hypothesis Testing roadmap.
Check this data for statistical significance in its departure from our expected value of zero.
Box Plot
5. State Statistical Conclusions: Reject the null hypothesis
6. State Practical Conclusions: We are 95% confident there is a difference in wear rates between the two materials.
Analyzing the Box Plot we see the hypothesized Mean difference of zero (the null hypothesis) falls outside the confidence interval so we reject the null hypothesis. The P-value is also less than 0.05. Given this we are 95% confident there is a difference in the wear between the two materials used for the soles of children’s shoes.
Paired T-Test
Another way to analyze this data is to use the paired t-test command.
Stat>Basic Statistics>Paired T-test
Click on Graphs… and select the graphs you would like to generate.
Distinguishing between Two Samples
The P-value from this Paired T-Test tells us the difference in materials is statistically significant.
As you will see the conclusions are the same but simply presented differently.
If you analyze this as a 2-sample t-test it simply compares the Means of Material A to Material B. The strength of the paired test is that it increases the sensitivity of the test by removing the variation between pairs, without having to look at a series of other factors.
The wrong way to analyze this data is to use a 2-sample t-test:
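To see why the 2-sample approach loses sensitivity on paired data, the sketch below computes both t statistics on the same hypothetical measurements (the values are illustrative, not the Mat-A and Mat-B worksheet columns). The unit-to-unit variation inflates the 2-sample error term, while the paired calculation removes it.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical paired data: each position is the same unit measured twice
material_a = [10, 12, 9, 11, 13]
material_b = [11, 14, 10, 13, 14]


def paired_t_statistic(x, y):
    """t statistic computed on the within-pair differences."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def two_sample_t_statistic(x, y):
    """Unequal-variance 2-sample t statistic that ignores the pairing."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


t_paired = paired_t_statistic(material_a, material_b)
t_two_sample = two_sample_t_statistic(material_a, material_b)
print(f"paired t = {t_paired:.2f}, 2-sample t = {t_two_sample:.2f}")
```

With these numbers the paired statistic is several times larger than the 2-sample statistic even though the Mean difference is identical in both calculations.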
Paired t-test Exercise
Exercise objective: Utilize what you have learned to conduct and analyze a paired t-test using MINITAB™.
1. A corrugated packaging company produces material that uses creases to make boxes easier to fold. It is a Critical to Quality characteristic to have a predictable Relative Crease Strength. The quality manager is having her lab test some samples labeled 1-11. Then those same samples are being sent to her colleague at another facility who will report their measurements on those same 1-11 samples.
2. The US quality manager wants to know with 95% confidence what the average difference is between the lab located in Texas and the lab located in Mexico when measuring Relative Crease Strength.
3. Use the data in columns Texas & Mexico in HypoTestStud.mtw to determine the answer to the quality manager’s question.
Paired t-test Exercise: Solution
Calc > Calculator…
Because the two labs made sure to report measurement results for exactly the same parts, and the results were put in the correct corresponding rows, we are able to do a paired t-test.
The first thing we must do is create a new column with the difference between the two test results.
We must confirm the differences (now in a new calculated column) are from a Normal Distribution.
This was confirmed with the Anderson-Darling Normality Test by doing a graphical summary under
Basic Statistics.
As we have seen before this 1-Sample t analysis is found with:
Stat>Basic Stat>1-sample T
Even though the Mean difference is 0.23, the 95% confidence interval includes zero so we fail to reject the null hypothesis of the 1-sample t-test. We cannot conclude the two labs have a difference in lab results.
The P-value is greater than 0.05 so we do not have the 95% confidence we wanted to confirm a difference in the lab Means. This confidence interval could be narrowed if more samples are taken next time and analyzed by both labs.
[Figure: Hypothesis Testing roadmap for Normal Continuous Data]
LSS Green Belt eBook v12 MT

Continuous Data Roadmap

© Open Source Six Sigma, LLC

§  Determine appropriate sample sizes for testing Means

§  Conduct various Hypothesis Tests for Means
§  Properly Analyze Results
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 1.
Lean Six Sigma

Green Belt Training
Analyze Phase
Now we will continue in the Analyze Phase with “Hypothesis Testing Normal Data Part 2”.
Overview
We are now moving into Hypothesis Testing Normal Data Part 2 where we will address Calculating Sample Size, Variance Testing and Analyzing Results. We will examine the meaning of each of these and show you how to apply them.
[Figure: Analyze Phase module outline: Welcome to Analyze; X Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2 (Calculate Sample Size, Variance Testing, Analyze Results)]
Tests of Variance
Tests of Variance are used for both Normal and Non-normal Data.
Normal Data
–  1 Sample to a target
–  2 Samples: F-Test
–  3 or More Samples: Bartlett’s Test
Non-Normal Data
–  2 or more samples: Levene’s Test
The null hypothesis states there is no difference between the Standard Deviations or variances.
–  Ho: σ1 = σ2 = σ3 …
–  Ha: at least one is different
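The selection rules listed above can be written as a small lookup. The function name and return strings are hypothetical helpers for illustration only; the logic simply mirrors the list, and the case the list does not cover (a single Non-normal sample) is rejected rather than guessed.

```python
def choose_variance_test(is_normal, n_samples):
    """Pick a test of variance following the rules listed in the text."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    if is_normal:
        if n_samples == 1:
            return "1-Sample variance to a target"
        if n_samples == 2:
            return "F-Test"
        return "Bartlett's Test"
    if n_samples >= 2:
        return "Levene's Test"
    # The text lists no variance test for a single Non-normal sample
    raise ValueError("case not covered by the roadmap in the text")


for normal, count in [(True, 1), (True, 2), (True, 3), (False, 2)]:
    print(normal, count, "->", choose_variance_test(normal, count))
```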
1-Sample Variance
A 1-sample variance test is used to compare an expected population variance to a target.
Stat > Basic Statistics > Graphical Summary
If the target variance lies inside the confidence interval then we fail to reject the null hypothesis.
–  Ho: σ²Sample = σ²Target
–  Ha: σ²Sample ≠ σ²Target
Use the sample size calculations for a 1-sample t-test.
1.  Practical Problem:
•  We are considering changing suppliers for a part we currently purchase from a supplier that charges a premium for the hardening process and has a large variance in their process.
•  The proposed new supplier has provided us with a sample of their product. They have stated they can maintain a variance of 0.10.
2.  Statistical Problem:
Ho: σ² = 0.10 or Ho: σ = 0.31
Ha: σ² ≠ 0.10 or Ha: σ ≠ 0.31
3.  1-sample variance:
α = 0.05 β = 0.10
The Statistical Problem can be stated two ways:

The null hypothesis: The variance is equal to 0.10 and the alternative hypothesis: The variance is
not equal to 0.10
OR
The null hypothesis: The Standard Deviation is equal to 0.31 and the alternative hypothesis: The
Standard Deviation is not equal to 0.31
1-Sample Variance
4. Sample Size:
•  Open the MINITAB™ worksheet: Exh_Stat.MTW
•  This is the same file used for the 1 Sample t example.
–  We will assume the sample size is adequate.
Stat > Basic Statistics > Graphical Summary
Take time to notice the sample Standard Deviation of 0.2472 and check whether the target Standard Deviation of 0.31 falls within the 95% confidence interval for the Standard Deviation. Based on this data the Statistical Solution is “fail to reject the null”.
What does this mean from a practical standpoint? The supplier’s claim that they can maintain a variance of 0.10 is valid. Typically shifting a Mean is easier to accomplish in a process than reducing variance. The new supplier would be worth continuing the relationship with to see if they can increase the Mean slightly while maintaining the reduced variance.
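The confidence interval check can be sketched with the chi-square interval for a Standard Deviation. The sample Standard Deviation 0.2472 and the target variance 0.10 come from the text; the sample size of 9 is an assumption (the text does not state it), and the two chi-square quantiles are the standard table values for 8 degrees of freedom.

```python
import math


def stdev_confidence_interval(s, n, chi2_lower, chi2_upper):
    """Chi-square confidence interval for a population Standard Deviation."""
    sum_of_squares = (n - 1) * s ** 2
    return (math.sqrt(sum_of_squares / chi2_upper),
            math.sqrt(sum_of_squares / chi2_lower))


SAMPLE_STDEV = 0.2472   # from the Graphical Summary in the text
SAMPLE_SIZE = 9         # ASSUMPTION: not stated in the text
CHI2_LOWER = 2.180      # chi-square 0.025 quantile, df = 8
CHI2_UPPER = 17.535     # chi-square 0.975 quantile, df = 8

target_sigma = math.sqrt(0.10)  # about 0.316, shown as 0.31 in the text
low, high = stdev_confidence_interval(
    SAMPLE_STDEV, SAMPLE_SIZE, CHI2_LOWER, CHI2_UPPER)
target_inside = low <= target_sigma <= high
print(f"95% CI for StDev: ({low:.3f}, {high:.3f}); target inside: {target_inside}")
```

Under these assumptions the target lies inside the interval, which matches the “fail to reject the null” conclusion.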
Test of Variance Example

We want to determine the effect of two different storage methods on
the rotting of potatoes. You study conditions conducive to potato rot
by injecting potatoes with bacteria that cause rotting and subjecting
them to different temperature and oxygen regimes. We can test the
data to determine if there is a difference in the Standard Deviation
of the rot time between the two different methods.
Ho: σ1 = σ2
Ha: σ1 ≠ σ2
3. Equal Variance test (F-test since there are only 2 factors.)
The Statistical problem is:

The null hypothesis: The Standard Deviation of the first method is equal to the Standard Deviation
of the second method.
The alternative hypothesis: The Standard Deviation of the first method is not equal to the Standard
Deviation of the second method.
These hypotheses can also be stated in terms of variance.
4. Sample Size:
•  Open the MINITAB™ worksheet Exh_aov.MTW and follow along in MINITAB™.
Another method for testing for Equal Variance will allow more than one factor. Columns “Temp” and “Oxygen” are the factors. “Rot” is the Output.
Normality Test – Follow the Roadmap
5. Statistical Solution: Check Normality.
Stat>Basic Statistics>Normality Test (use Anderson-Darling)
Ho: Data is Normal
Ha: Data is NOT Normal
According to the graph we have Normal data. Based on the P-value we fail to reject the null hypothesis, so we can treat the data as following a Normal Distribution.
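For readers curious about what the Anderson-Darling test measures, the sketch below computes the A-Squared statistic against a Normal Distribution fitted with the sample Mean and Standard Deviation. It deliberately stops at the statistic: converting A-Squared to a P-value requires further approximations that are left to MINITAB. The two data sets are hypothetical, one built from evenly spaced Normal quantiles and one made skewed by exponentiating them.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_a2(data):
    """A-Squared statistic versus a Normal fitted to the data."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    cdf = [fitted.cdf(x) for x in xs]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Hypothetical data: a bell-shaped set and a right-skewed set
bell_shaped = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [math.exp(x) for x in bell_shaped]

a2_bell = anderson_darling_a2(bell_shaped)
a2_skewed = anderson_darling_a2(skewed)
print(f"A-Squared bell-shaped: {a2_bell:.3f}, skewed: {a2_skewed:.3f}")
```

A larger A-Squared means a poorer fit to the Normal Distribution and therefore a smaller P-value.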
Test of Equal Variance

Now conduct the test for Equal Variance. This time we have Rot as the response and Temp and
Oxygen as factors.
Stat>ANOVA>Test for Equal Variance
This graph shows a test of Equal Variance displaying Bonferroni 95% confidence intervals for the response Standard Deviation at each level. As you will see the Bartlett’s and Levene’s tests are displayed in the same Session Window. The asymmetry of the intervals is due to the Skewness of the chi-square distribution.
For the potato rot example you fail to reject the null hypothesis of the variances being equal for both factors, Temperature as well as Oxygen.
A P-value > 0.05 shows an insignificant difference between the variances.
Test for Equal Variance Statistical Analysis
Use this (Bartlett’s Test in this example) if the data is Normal and for Factors ≥ 2.
Use this (Levene’s Test) if the data is Non-normal and for Factors ≥ 2.
Does the Session Window show the same P-values as the Graphical Analysis? Keep in mind that, given the change in variance we need to detect, the sample size may not be adequate. A key aspect of ANOVA is that a minimum sample size is needed to detect significant differences in either the Mean or the variance, so it is critical to perform the sample size calculation before collecting the data for an ANOVA.
Tests for Variance Exercise
Exercise objective: Utilize what you have learned to conduct and analyze a test for Equal Variance using MINITAB™.
1.  The quality manager was challenged by the plant director as to why the VOC levels in the product varied so much. After using a Process Map some potential sources of variation were identified. These sources included operating shifts and the raw material supplier. Of course the quality manager has already clarified the Gage R&R results were less than 17% study variation so the gage was acceptable.
2.  The quality manager decided to investigate the effect of the raw material supplier. He wants to see if the variation of the product quality is different when using supplier A or supplier B. He wants to be at least 95% confident the variances are similar when using the two suppliers.
3.  Use data ppm VOC and RM Supplier to determine if there is a difference between suppliers.
Tests for Variance Exercise: Solution
First we want to do a graphical summary of the two samples from the two suppliers.
In “Variables:” enter ‘ppm VOC’
In “By variables:” enter ‘RM Supplier’
We want to see if the two samples are from Normal populations.
The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the
samples are from Normally Distributed populations because we “failed to reject” the null hypothesis
that the data sets are from Normal Distributions.
Are both Data Sets Normal?
Continue to determine if they are of Equal Variance.
For “Response:” enter ‘ppm VOC’
For “Factors:” enter ‘RM Supplier’
Note MINITAB™ defaults to a 95% confidence level, which is exactly the level we want to test for this problem.
Because the two populations were considered to be Normally Distributed the F-test is used to
evaluate whether the variances (Standard Deviation squared) are equal.
The P-value of the F-test was greater than 0.05 so we “fail to reject” the null hypothesis.
So once again in English: we found no statistically significant difference between the variances of the results from the two suppliers on our product’s ppm VOC level.
[Figure: Hypothesis Testing roadmap for Normal Continuous Data, Two Samples branch]
Purpose of ANOVA
Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables.
Analysis of Variance extends the two sample t-test for testing the equality of two population Means to a more general null hypothesis: that more than two Means are all equal, versus the alternative that they are not all equal.
–  The classification variable, or factor, usually has three or more levels (if there are only two levels, a t-test can be used).
–  Allows you to examine differences among Means using multiple comparisons.
–  The ANOVA test statistic is:
$$F = \frac{\text{Avg SS}_{between}}{\text{Avg SS}_{within}} = \frac{S^2_{between}}{S^2_{within}}$$
What do we want to know?
Is the between group variation large enough to be distinguished from the within group variation?
[Figure: Two groups of X values centered on µ1 and µ2, showing the Between Group Variation (delta, δ), the Within Group Variation (level of supplier 1) and the Total (Overall) Variation]
Calculating ANOVA
Take a moment to review the formulas for an ANOVA.
Total (Overall) Variation = Between Group Variation (delta, δ) + Within Group Variation
Between Group Variation:
$$SS_{between} = \sum_{j=1}^{g} n_j (\bar{X}_j - \bar{\bar{X}})^2$$
Within Group Variation:
$$SS_{within} = \sum_{j=1}^{g} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$
Total Variation:
$$SS_{total} = \sum_{j=1}^{g} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$
Where:
$g$ = the number of groups (levels in the study)
$X_{ij}$ = the ith individual in the jth group
$n_j$ = the number of individuals in the jth group or level
$\bar{\bar{X}}$ = the grand Mean
$\bar{X}_j$ = the Mean of the jth group or level
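The sums of squares above translate directly into code. The sketch below is a minimal one-way ANOVA calculation on hypothetical data for three suppliers with five samples each (not the ANOVA.MTW worksheet); it also confirms that the between and within sums of squares add up to the total.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom and F statistic."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_total, "df_between": df_between,
        "df_within": df_within, "f": f_stat,
    }


# Hypothetical measurements for three suppliers
supplier_a = [3, 4, 5, 4, 4]
supplier_b = [5, 6, 5, 7, 7]
supplier_c = [4, 5, 6, 5, 5]

result = one_way_anova([supplier_a, supplier_b, supplier_c])
print(result)
```

For this made-up data the F statistic is compared against the critical F value for 2 and 12 degrees of freedom in the same way as the supplier example in the text.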
Calculating ANOVA
The alpha risk increases as the number of Means increases with a pair-wise t-test scheme. The formula for the overall alpha risk when testing more than one pair of Means using a t-test is:
$$1 - (1 - \alpha)^k$$
where k = number of pairs of Means
so, for 7 pairs of Means and α = 0.05:
$$1 - (1 - 0.05)^7 = 0.30$$
or 30% alpha risk
The reason we do not use a t-test to evaluate series of Means is because the alpha risk increases as
the number of Means increases. If we had 7 pairs of Means and an alpha of 0.05 our actual alpha risk
could be as high as 30%. Notice we did not say it was 30% only that it could be as high as 30% which
is quite unacceptable.
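The growth in alpha risk is easy to tabulate. This short sketch evaluates the formula 1 − (1 − α)^k for a few values of k; the function name is a hypothetical helper.

```python
def familywise_alpha(alpha, k):
    """Upper bound on the overall alpha risk for k pair-wise t-tests."""
    return 1 - (1 - alpha) ** k


for pairs in (1, 3, 7, 10):
    risk = familywise_alpha(0.05, pairs)
    print(f"{pairs:2d} pairs of Means -> alpha risk up to {risk:.1%}")
```

For 7 pairs this reproduces the roughly 30% figure quoted above.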
Three Samples
We have three potential suppliers claiming to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier but we must ensure we do not affect the quality of our raw material.
File>Open Worksheet > ANOVA.MTW
We would like to test the data to determine if there is a difference between the three suppliers.
Follow the Roadmap…Test for Normality
Compare P-values. The samples of all three suppliers are Normally Distributed.
Supplier A P-value = 0.568
Supplier B P-value = 0.385
Supplier C P-value = 0.910
Test for Equal Variance…
Test for Equal Variance (must stack data to create “Response” & “Factors”):
Before testing for Equal Variance you must first stack the worksheet.
According to the data there is no significant difference in the variance of the three suppliers.
ANOVA in MINITABTM
Follow along in MINITABTM.
Stat>ANOVA>One-Way Unstacked
Enter Stacked Supplier data in Responses:
Click on Graphs… and check Boxplots of data
ANOVA
What does this graph tell us? There does not seem to be a huge difference here.
ANOVA Session Window
Stat>ANOVA>One Way (unstacked)
Looking at the P-value the conclusion is we fail to reject the null hypothesis. According to the data there is no significant difference between the Means of the 3 suppliers.
P-value > 0.05: no difference between suppliers
ANOVA
Before looking up the F critical value you must first know what the degrees of freedom are. The ANOVA test statistic is the variance between the Means divided by the variance within the groups. Therefore the numerator degrees of freedom are 3 suppliers minus 1, or 2 degrees of freedom. The denominator degrees of freedom are 5 samples minus 1 (for each supplier) multiplied by 3 suppliers, or 12 degrees of freedom. As you can see the critical F value is 3.89, and since the calculated F of 1.40 is well below the critical value we fail to reject the null hypothesis.
F-Critical values (α = 0.05). Rows are the denominator degrees of freedom (D); columns are the numerator degrees of freedom (N).
D/N        1        2        3        4
1     161.40   199.50   215.70   224.60
2      18.51    19.00    19.16    19.25
3      10.13     9.55     9.28     9.12
4       7.71     6.94     6.59     6.39
5       6.61     5.79     5.41     5.19
6       5.99     5.14     4.76     4.53
7       5.59     4.74     4.35     4.12
8       5.32     4.46     4.07     3.84
9       5.12     4.26     3.86     3.63
10      4.96     4.10     3.71     3.48
11      4.84     3.98     3.59     3.36
12      4.75     3.89     3.49     3.26
13      4.67     3.81     3.41     3.18
14      4.60     3.74     3.34     3.11
15      4.54     3.68     3.29     3.06
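The degrees of freedom and table lookup for the supplier example can be sketched as follows. The dictionary holds a few entries copied from the numerator = 2 column of the table above; the helper name is hypothetical.

```python
# F-Critical values (alpha = 0.05) for 2 numerator degrees of freedom,
# keyed by denominator degrees of freedom (copied from the table above)
F_CRITICAL_NUM_DF_2 = {10: 4.10, 11: 3.98, 12: 3.89, 13: 3.81, 14: 3.74, 15: 3.68}


def anova_degrees_of_freedom(n_groups, n_per_group):
    """Return (numerator df, denominator df) for a balanced one-way ANOVA."""
    return n_groups - 1, n_groups * (n_per_group - 1)


df_num, df_den = anova_degrees_of_freedom(n_groups=3, n_per_group=5)
f_critical = F_CRITICAL_NUM_DF_2[df_den]
f_calc = 1.40  # calculated F from the supplier example in the text

decision = "reject Ho" if f_calc > f_critical else "fail to reject Ho"
print(f"df = ({df_num}, {df_den}), F-Critical = {f_critical}, decision: {decision}")
```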
Sample Size
Let’s check how much difference we can see with a sample of 5. Will having a sample of 5 show a difference?
After crunching the numbers a sample of 5 can only detect a difference of 2.56 Standard Deviations. This means the Means would have to differ by at least 2.56 Standard Deviations before we could see a difference. To help alleviate this problem a larger sample should be used. With a larger sample you would have a more sensitive reading for the Means and the variance.
ANOVA Assumptions
1.  Observations are adequately described by the model.
2.  Errors are Normally and independently distributed.
3.  Homogeneity of variance among factor levels.
In one-way ANOVA model adequacy can be checked by either of the following:
1.  Check the data for Normality at each level and for homogeneity of variance across all levels.
2.  Examine the residuals (a residual is the difference between what the model predicts and the true observation).
–  Normal plot of the residuals
–  Residuals versus fits
–  Residuals versus order
If the model is adequate the residual plots will be structureless.
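In a one-way ANOVA the model's prediction (the fit) for every observation is simply its group Mean, so the residuals can be computed by hand. The sketch below uses hypothetical supplier data and returns (fit, residual) pairs, which are the raw material for the residual plots listed above.

```python
from statistics import mean


def anova_residuals(groups):
    """Return (fit, residual) pairs; the fit is each observation's group Mean."""
    pairs = []
    for values in groups.values():
        fit = mean(values)
        pairs.extend((fit, x - fit) for x in values)
    return pairs


# Hypothetical measurements for three suppliers
data = {
    "Supplier A": [3, 4, 5, 4, 4],
    "Supplier B": [5, 6, 5, 7, 7],
    "Supplier C": [4, 5, 6, 5, 5],
}

residual_pairs = anova_residuals(data)
for fit, residual in residual_pairs:
    print(f"fit = {fit:.1f}, residual = {residual:+.1f}")
```

Plotting the residuals against the fits, and against the run order, gives the structureless patterns the text asks you to look for.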
Residual Plots
To generate the residual plots in MINITABTM select “Stat>ANOVA>One-way Unstacked>Graphs”
then select “Individual value plot” and check all three types of plots.
Histogram of Residuals
The Histogram of Residuals should show a bell-shaped curve.
Normal Probability Plot of Residuals
The Normality plot of the residuals should follow a straight line on the probability plot. (Does a pencil cover all the dots?)
Results of our example look good. The Normality assumption is satisfied.
Residuals versus Fitted Values

The plot of Residuals versus fits examines constant variance. The plot should be structureless with no Outliers present.
Our example does not indicate a problem.
ANOVA Exercise

Exercise objective: Utilize what you have learned to conduct an analysis of a one-way ANOVA using MINITAB™.
1.  The quality manager was challenged by the plant director as to why the VOC levels in the product varied so much. The quality manager now wants to find if the product quality is different because of how the shifts work with the product.
2.  The quality manager wants to know if the average is different for the ppm VOC of the product among the production shifts.
3.  Use Data in columns ppm VOC and Shift in hypoteststud.mtw to determine the answer for the quality manager at a 95% confidence level.
ANOVA Exercise: Solution

First we need to do a graphical summary of the samples from the 3 shifts.
Stat>Basic Stat>Graphical Summary
We want to see if the 3 samples are from Normal populations.
In “Variables:” enter ‘ppm VOC’
In “By Variables:” enter ‘Shift’
The P-values of the three Anderson-Darling Normality Tests (0.446, 0.334 and 0.658) are all greater than 0.05 so we conclude the samples are from Normally Distributed populations because we failed to reject the null hypothesis that the data sets are from Normal Distributions.
Next we need to determine if our data has Equal Variances.
Stat > ANOVA > Test for Equal Variances…
For Response: enter ppm VOC
For Factors: enter Shift
The P-value of the test for Equal Variances was greater than 0.05 so we fail to reject the null hypothesis.
Are the variances equal? Yes!
We need to use the One-Way ANOVA to determine if the Means of product quality are equal when being produced by the 3 shifts. Again we want to put 95.0 for the confidence level.
Stat > ANOVA > One-Way…
For Response: enter ppm VOC
For Factor: enter Shift
Also be sure to click Graphs… to select Four in one under residual plots.
Also, remember to click Assume equal variances because we determined the variances were equal among the 3 shifts.

We must look at the Residual Plots to be sure our ANOVA analysis is valid. Since our residuals look
Normally Distributed and randomly patterned, we will assume our analysis is correct.
Since the P-value of the ANOVA test is less than 0.05 we “reject” the null hypothesis that the Mean product quality as measured in ppm VOC is the same from all shifts.
We “accept” the alternate hypothesis that the Mean product quality differs for at least one shift.
Don’t miss that shift!
Since the confidence intervals of the Means do not overlap between Shift 1 and Shift 3 we see one of the shifts is delivering a product quality with a higher level of ppm VOC.
§  Be able to conduct Hypothesis Testing of Variances
§  Understand how to Analyze Hypothesis Testing Results
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 2.
Lean Six Sigma

Green Belt Training
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 1
Now we will continue in the Analyze Phase with “Hypothesis Testing Non-Normal Data Part 1”.
Hypothesis Testing Non-Normal Data Part 1
Overview
The core topics of this phase are Equal Variance Tests and Tests for Medians. We will examine the meaning of each of these and show you how to apply them.
[Figure: Analyze Phase module outline: X Sifting; Intro to Hypothesis Testing; Equal Variance Tests; Tests for Medians]
Non-Normal Hypothesis Tests
At this point we have covered the tests for determining significance for Normal Data. We will
continue to follow the roadmap to complete the test for Non-Normal Data with Continuous Data.
Later in the module we will use another roadmap that was designed for Discrete data.
Recall that Discrete data does not follow a Normal Distribution; because it is not Continuous Data, there is a separate set of tests to properly analyze it.
We can test for anything!!
1 Sample t
Why do we care if a data set is Normally Distributed?
§  When it is necessary to make inferences about the true nature of the
population based on random samples drawn from the population.
§  When the two indices of interest (X-Bar and s) depend on the data
being Normally Distributed.
§  For problem solving purposes, because we don’t want to make a bad
decision – having Normal Data is so critical that with EVERY statistical
test, the first thing we do is check for Normality of the data.
Recall the four primary causes for Non-normal data:
§  Skewness – Natural and Artificial Limits
§  Mixed Distributions - Multiple Modes
§  Kurtosis
§  Granularity
We will focus on skewness for the remaining tests for Continuous Data.
Skewness is a natural state for much data. Any data that has natural or artificial limits typically
exhibits a Skewed Distribution when it is operating near the limit. The other three causes for Non-
normality are usually a symptom of a problem and should be identified, separated and corrected.
A common reaction to Non-normal Data is to simply transform it. Please see your Master Black Belt to determine if a transform is
appropriate. Often data is beaten into submission only to find out there was an underlying cause for
Non-normality that was ignored. Remember we want you to predict whether the data should be
Normal or not. If you believe your data should be Normal but it is not there is most likely an
underlying cause that can be removed which will then allow the data to show its true nature and be
Normal.
Now we will continue down the Non-Normal side of the roadmap. Notice this slide is primarily for tests of Medians.
[Figure: Continuous Data roadmap, Non Normal branch: Test of Equal Variance; Median Test; Mann-Whitney; Several Median Tests]
Sample Size
Levene’s Test of Equal Variance is used to compare the estimated population Standard Deviations from two or more samples with Non-normal Distributions.
–  Ho: σ1 = σ2 = σ3 …
–  Ha: At least one is different.
You have already seen this command in the last module. This is simply the application for Non-
normal Data. The question is: Are any of the Standard Deviations or variances statistically
different?
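The idea behind Levene's Test can be sketched as a one-way ANOVA run on the absolute deviations of each observation from its group centre. The version below assumes the median-centred form (often called the Brown-Forsythe modification), and the two samples are hypothetical; treat it as an illustration of the statistic rather than a reproduction of MINITAB's output.

```python
from statistics import mean, median


def levene_statistic(groups):
    """Median-centred Levene statistic for two or more samples."""
    # Absolute deviation of every observation from its own group Median
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = mean([v for d in deviations for v in d])
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((v - mean(d)) ** 2 for d in deviations for v in d)
    return ((n_total - k) / (k - 1)) * between / within


# Hypothetical samples with the same centre but different spread
narrow = [10, 11, 12, 13, 14]
wide = [2, 7, 12, 17, 22]

w_stat = levene_statistic([narrow, wide])
print(f"Levene statistic = {w_stat:.2f}")
```

The statistic is compared with an F critical value with k − 1 and N − k degrees of freedom (1 and 8 here); a statistic above that critical value, equivalently a P-value below 0.05, indicates at least one Standard Deviation is different.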
Follow the Roadmap…
Open the MINITABTM worksheet EXH_AOV.MTW
In MINITAB™ select “Stat>Basic Stats>Normality Test”. As you can see the P-value for the Normality test is less than 0.05 (it is reported as 0.00), therefore we reject the null hypothesis that the data are Normal. Assume the data is not Normal.
Test of Equal Variance Non-Normal Distribution
Use Levene’s Statistics for Non-Normal Data.
P-value > 0.05 (0.860): assume the variances are equal.
Ho: σ1 = σ2 = σ3 …
Ha: At least one is different.
Next we test for Equal Variance. In MINITAB™ select: “Stat>ANOVA>Test for Equal Variance”. Since the data was not Normal the only correct test statistic is Levene’s test and not the F-test. Had more than two variances been tested, Bartlett’s and Levene’s tests would have appeared.
Test of Equal Variance Non-Normal Distribution
When testing 2 samples with Normal Distribution use F-test:
–  To determine whether two Normal Distributions have Equal Variance.
When testing >2 samples with Normal Distribution use Bartlett’s test:
–  To determine whether multiple Normal Distributions have Equal Variance.
When testing two or more samples with Non-normal Distributions use Levene’s Test:
–  To determine whether two or more distributions have Equal Variance.
Our focus for this module is working with Non-normal Distributions.
Hypothesis Test Exercise
Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.
A credit card company wants to understand the need for customer service personnel. The company thinks there is variability impacting the efficiency of its customer service staff. The credit card company has two types of cards. The company wants to see if there is more variability in one type of customer card than another. The Black Belt was selected and told to determine, with 95% confidence, whether the variability is similar between the two card types.
1.  Analyze the problem using the Hypothesis Testing roadmap.
2.  Use the columns named CallsperWk1 and CallsperWk2 in the MINITAB™ worksheet “Hypoteststud.mtw”.
3.  With a confidence level of 95%, is there a difference in variance?
Test for Equal Variance Example: Solution
First test to see if the data is Normal or Non-Normal.
Since there are two variables we need to perform a Normality Test on CallsperWk1 and CallsperWk2.
First select the variable ‘CallsperWk1’ and press “OK”.
Follow the same steps for ‘CallsperWk2’.
Based on the P-value the variable being analyzed is Non-normal Data.
For the Data to be Normal the P-value must be greater than 0.05.
Since we know the variables are Non-normal Data continue to follow the Roadmap.
The next step is to test Calls/Week for Equal Variance.
Before performing a Levene’s Test we have to stack the columns for CallsperWk1 and CallsperWk2 because currently the data is in separate columns.
Stat>ANOVA>Test for Equal Variances
After stacking the Calls/Week columns the next step in the Roadmap is performing a Levene’s Test.
As you can see the data illustrates a P-value of 0.247 which is more than 0.05. Therefore, at a 95% confidence level, we fail to reject the null hypothesis: there is no statistically significant difference in variance between CallsperWk1 and CallsperWk2.
Nonparametric Tests
A non-parametric test makes no assumptions about Normality.
For a Skewed distribution:
- The appropriate statistic to describe the central tendency is the Median rather than the Mean.
- If just one distribution is not Normal a non-parametric test should be used.
Non-parametric Hypothesis Testing works the same way as parametric testing. Evaluate the P-value in the same manner.
[Figure: Medians compared in place of Means: Target versus the sample Median for one sample; Median 1 versus Median 2 for two samples]
Mean and Median
In general nonparametric tests do the following: rank order the data, sum the data by ranks, sign
the data above or below the target, and calculate, compare and test the Median. Comparisons
and tests about the Median make nonparametric tests useful with very Non-normal Data.
This Graphical Summary provides the confidence interval for the Median.
With Normal Data notice the symmetrical shape of the distribution and how the Mean and the Median are centered.
With skewed data the Mean is influenced by the Outliers. Notice the Median is still centered.
Statistic                       Normal Data          Skewed Data
Anderson-Darling A-Squared      0.30                 3.72
P-Value                         0.574                < 0.005
Mean                            350.51               4.8454
StDev                           5.01                 3.1865
Variance                        25.12                10.1536
Skewness                        -0.079532            1.11209
Kurtosis                        -0.635029            1.26752
N                               75                   200
Minimum                         339.09               0.1454
1st Quartile                    347.48               2.4862
Median                          350.48               4.1533
3rd Quartile                    353.99               6.5424
Maximum                         359.53               16.4629
95% Confidence Interval, Mean   349.35 to 351.66     4.4011 to 5.2898
95% Confidence Interval, Median 349.30 to 351.85     3.6296 to 4.7174
95% Confidence Interval, StDev  4.32 to 5.97         2.9018 to 3.5336
MINITABTM’s Nonparametrics
1-Sample Sign: performs a one-sample sign test of the Median and calculates the corresponding
point estimate and confidence interval. Use this test as an alternative to one-sample Z and one-
sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and
calculates the corresponding point estimate and confidence interval (more discriminating or efficient
than the sign test). Use this test as a nonparametric alternative to one-sample Z and one-sample t-
tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and
calculates the corresponding point estimate and confidence interval. Use this test as a
nonparametric alternative to the two-sample t-test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a one-way
design. This test is more powerful than Mood’s Median (the confidence interval is narrower, on
average) for analyzing data from many populations but is less robust to Outliers. Use this test as an
alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in a one-
way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the Median test or sign
scores test. Use as an alternative to the one-way ANOVA.
There are 5 basic nonparametric tests that MINITAB™ calculates. Each one has a counterpart in normal Hypothesis Testing.
1-Sample Sign Test
This test is used to compare the Median of one distribution to a target value.
–  Must have at least one column of numeric data. If there is more than one column of data MINITAB™ performs the test separately for each column.
The hypotheses:
–  H0: M = Mtarget
–  Ha: M ≠ Mtarget
Interpretation of the resulting P-value is the same.
Note: For the purpose of calculating sample size for a non-parametric (Median) test use:
$n_{\text{non-parametric}} = \dfrac{n_{t\text{-test}}}{0.864}$
Here is a little trick! Dividing the sample size from a t-test estimate by 0.864 should give you a large enough sample regardless of the underlying distribution…most of the time.
For instance, starting from a sample size of 23 using the t-test method, the sample size would increase to about 27 (23 / 0.864 ≈ 26.6). If the data is assumed to follow a Normal Distribution this number would increase by only 1. In truth it is even possible to decrease the sample size depending on the distribution selected for the alternative.
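The sample size rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name is hypothetical and rounding up to the next whole sample is an assumption made here to stay on the conservative side.

```python
import math

def nonparametric_sample_size(n_t_test, efficiency=0.864):
    """Inflate a t-test sample size for a nonparametric (Median) test.

    efficiency=0.864 is the divisor quoted in the text; rounding up
    is an assumption of this sketch, not something the text specifies.
    """
    if n_t_test <= 0:
        raise ValueError("n_t_test must be positive")
    return math.ceil(n_t_test / efficiency)

# 23 samples planned for a t-test -> 23 / 0.864 = 26.6, rounded up
print(nonparametric_sample_size(23))
```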
1-Sample Example
1.  Practical Problem:
Our facility requires a cycle time from an improved process of 63 minutes. This process supports the customer service division and has become a bottleneck to completion of order processing. To alleviate the bottleneck the improved process must perform at least at the expected 63 minutes.
2.  Statistical Problem:
Ho: M = 63
Ha: M ≠ 63
3.  1-Sample Sign or 1-Sample Wilcoxon
Open the MINITAB™ worksheet: DISTRIB1.MTW
Stat > Nonparametrics > 1-Sample Sign…
or
Stat > Nonparametrics > 1-Sample Wilcoxon…
4.  Sample Size:
This data set has 500 samples (well in excess of necessary sample size).
The Statistical Problem is: The null hypothesis is that the Median is equal to 63 and the alternative
hypothesis is the Median is not equal to 63.
Open the MINITABTM Data File: “DISTRIB1.MTW”. Next you have a choice of either performing a
1-Sample Sign Test or 1-Sample Wilcoxon Test because both will test the Median against a target.
For this example we will perform a 1-Sample Sign Test.
1-Sample Example
Stat > Nonparametrics > 1-Sample Sign…
For a two-tailed test choose “not equal” for the alternative hypothesis.
As you can see the P-value is less than 0.05 so we must reject the null hypothesis, which means the data supports the alternative hypothesis that the Median is different from 63. The actual Median of 65.70 is shown in the Session Window. Since the Median is greater than the target value it seems the new process is not as good as we may have hoped.
Stat > Nonparametrics > 1-Sample Wilcoxon…
Perform the same steps as the 1-Sample Sign to use the 1-Sample Wilcoxon.
1-Sample Example
Stat > Nonparametrics > 1-Sample Sign…
For a confidence interval enter the desired level.
For the 1-Sample Sign test select a confidence level of 95%. As you can see this yields an interval of 65.26 to 66.50. The NLI means a non-linear interpolation method was used to estimate the confidence interval. As you can see the confidence interval is very narrow. Since the target of 63 is not within the confidence interval reject the null hypothesis.
As you will see the confidence interval is even tighter for the Wilcoxon test. Therefore we reject the null hypothesis: the Median is higher than the target of 63, which is not the desired direction.
Hypothesis Test Exercise
Exercise objective: To practice solving a problem
A mining company is falling behind profit targets. The mine manager wants to determine if his mine is achieving the target production of 2.1 tons/day with some limited data to analyze. The mine manager asks the Black Belt to determine if the mine is achieving 2.1 tons/day and the Black Belt says she will answer with 95% confidence.
1.  Analyze the problem using the Hypothesis Testing roadmap.
2.  Use the column Tons hauled within the MINITAB™ worksheet “Hypoteststud.mtw”.
3.  Does the Median equal the target value?
1 Sample Example: Solution
According to the hypothesis the Mine Manager feels he is achieving his target of 2.1 tons/day.
H0: M = 2.1 tons/day    Ha: M ≠ 2.1 tons/day
Since we are using one sample we have a choice of either a 1-Sample Sign or a 1-Sample Wilcoxon. For this example we will use a 1-Sample Sign.
Sign Test for Median: Tons hauled
Sign Test of Median = 2.100 versus ≠ 2.100
               N   Below   Equal   Above        P   Median
Tons hauled   17      14       0       3   0.0127    1.800
The results show a P-value of 0.0127 and a Median of 1.800. Since the P-value is less than 0.05 the Black Belt does not agree; based on this data the Mine Manager is not achieving his target of 2.1 tons/day.
We disagree!
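The P-value in the Session Window can be checked by hand because the sign test is just a binomial test on the counts above and below the target. The sketch below assumes a two-sided test with equal chance (0.5) of falling on either side of the hypothesized Median; the function name is hypothetical and only the counts reported above (14 below, 3 above) are used.

```python
from math import comb

def sign_test_p_value(below, above):
    """Two-sided sign test P-value from counts below/above the target Median.

    Observations equal to the target are assumed to be dropped already.
    """
    n = below + above
    k = min(below, above)
    # Probability of k or fewer observations on the smaller side when p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p = sign_test_p_value(below=14, above=3)
print(f"P-value = {p:.4f}")  # prints "P-value = 0.0127"
```

The result agrees with the 0.0127 shown in the Sign Test output for Tons hauled.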
Mann-Whitney Example
The Mann-Whitney test is used to test if the Medians for 2 samples are different.
1.  Determine if different machines have different Median cycle times.
2.  Ho: M1 = M2
    Ha: M1 ≠ M2
3.  Perform the Mann-Whitney test. Use the data provided in the MINITAB™ worksheet: Nonparametric.mtw
4.  There are 200 data points for each machine, well over the minimum number of samples necessary.
Mann-Whitney Example
First run a Normality Test…of course!
When looking at the Probability Plot for the first machine, the Normality Test yields a P-value less than 0.05. Now look at the plot for the second machine. You now have one data set that is Non-normal and one that is Normal. When comparing 2 samples, only one of them has to be Non-normal for a Nonparametric Test to be the appropriate choice. With that said now let’s perform a Mann-Whitney.
Now you will actually run the Mann-Whitney test and, based on the results, determine that the Medians of the machines are different.
Stat > Nonparametrics > Mann-Whitney…
If the samples are the same, zero would be included within the confidence interval. Since zero (the difference between the 2 Medians) is not contained within the confidence interval we reject the null hypothesis. Also the last line in the Session Window where it says “… is significant at 0.0010” is the equivalent of a P-value for the Mann-Whitney test.
The Practical Conclusion is there is a difference between the Medians of the two machines.
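For readers who want to see what the Mann-Whitney test computes, here is a minimal sketch using only the Python standard library. It ranks the pooled data, sums the ranks of the first sample and uses a normal approximation for the P-value. The cycle times are made-up illustration values (not the Nonparametric.mtw data), the function name is hypothetical, and no tie or continuity correction is applied, so results can differ slightly from MINITAB™.

```python
import math

def mann_whitney(sample_a, sample_b):
    """Return (U for sample_a, z, two-sided P-value) via normal approximation."""
    pooled = sorted((v, g) for g, values in enumerate((sample_a, sample_b)) for v in values)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1                      # extend over tied values
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        rank_sum_a += average_rank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1
    n1, n2 = len(sample_a), len(sample_b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p_value

# Hypothetical cycle times for two machines
machine_a = [10.1, 10.4, 10.2, 10.8, 10.5, 10.3]
machine_b = [11.0, 11.4, 10.9, 11.2, 11.6, 11.1]
u, z, p_value = mann_whitney(machine_a, machine_b)
print(f"U = {u}, z = {z:.2f}, P-value = {p_value:.4f}")
```

A P-value below 0.05 would lead to the same decision as a confidence interval for the difference in Medians that excludes zero.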
Exercise
Exercise objective: To practice solving a problem presented using the appropriate Hypothesis Test.
A credit card company now understands there is no variability difference in customer calls/week for the two different credit card types. This means no difference in strategy of deploying the workforces. However the credit card company wants to see if there is a difference in call volume between the two different card types. The company expects no difference since the total sales of the two credit card types are similar. The Black Belt was told to evaluate with 95% confidence if the averages were the same. The Black Belt reminded the credit card company the calls/week were not Normal distributions so he would have to compare using Medians since Medians are used to describe the central tendency of Non-normal Populations.
1.  Analyze the problem using the Hypothesis Testing roadmap.
2.  Use the columns named CallsperWk1 and CallsperWk2 in MINITAB™ worksheet “Hypoteststud.mtw”.
3.  Is there a difference in call volume between the 2 different card types?
Mann-Whitney Example: Solution
Since we know the data for CallsperWk1 and CallsperWk2 are Non-normal we can proceed to performing a Mann-Whitney Test.
Stat > Nonparametrics > Mann-Whitney…
Mann-Whitney Example: Solution
As you can see there is no significant difference in the Median between CallsperWk1 and CallsperWk2. Therefore, there is no significant difference in call volume between the two different card types.
Mood’s Median Test
The final two tests are the Mood’s Median and the Kruskal-Wallis.
1.  An aluminum company wanted to compare the operation of its three facilities worldwide. They want to see if there is a difference in the recoveries among the three locations. A Black Belt was asked to help management evaluate the recoveries at the locations with 95% confidence.
2.  Ho: M1 = M2 = M3
    Ha: at least one is different
3.  Use the Mood’s Median test.
4.  Based on the smallest sample of 13 the test will be able to detect a difference close to 1.5.
5.  Statistical Conclusions: Use the data in the columns named “Recovery” and “Location” in the MINITAB™ worksheet “Hypoteststud.mtw” for analysis.
Follow the Roadmap…Normality
Instead of using the Anderson-Darling test for Normality this time we used the graphical summary method. It gives a P-value for Normality and allows a view of the data the Normality test does not.
Stat > Basic Statistics > Graphical Summary…
Notice evidence of Outliers in at least 2 of the 3 populations. You could do a Box Plot to get a clearer idea about Outliers.
Follow the Roadmap…Equal Variance
Check for Equal Variance.
Mood’s Median Test
Stat > Nonparametrics > Mood’s Median Test… [Session Output]
We observe the confidence intervals for the Medians of the three populations. Note the 95% confidence interval for Bangor does not overlap the others, so we visually know the P-value is below 0.05.
Kruskal-Wallis Test
Using the same data set analyze using the Kruskal-Wallis test.
This output is the least friendly to interpret. Look for the P-value which tells us we reject the null hypothesis. We have the same conclusion as with the Mood’s Median test.
When comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is more powerful but less robust to Outliers. In this case the variances were equal and the Kruskal-Wallis test led to the same conclusion.
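To make the Kruskal-Wallis output less mysterious, the sketch below computes the H statistic from pooled ranks. The recovery values are invented for illustration (they are not the Hypoteststud.mtw data) and the function names are hypothetical. No tie correction is applied, and the P-value helper only covers the three-group case, where the chi-square upper tail with 2 degrees of freedom has the simple closed form exp(-H/2).

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic from pooled average ranks (no tie correction)."""
    pooled = sorted((v, g) for g, values in enumerate(groups) for v in values)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1                      # extend over tied values
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += average_rank
        i = j + 1
    n_total = len(pooled)
    weighted = sum(r ** 2 / len(values) for r, values in zip(rank_sums, groups))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

def p_value_three_groups(h):
    """Chi-square upper tail for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)

# Hypothetical recoveries at three locations
locations = [[86.1, 87.4, 85.9, 88.0], [90.2, 91.5, 89.8, 90.9], [87.7, 88.9, 88.3, 89.1]]
h = kruskal_wallis_h(locations)
print(f"H = {h:.2f}, P-value = {p_value_three_groups(h):.4f}")
```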
Exercise
Exercise objective: To practice solving a problem presented using the appropriate Hypothesis Test.
A company making cell phones is interested in evaluating the defect rate of 3 months from one of its facilities. A customer felt the defect rate was surprising lately but did not know for sure. A Black Belt was selected to investigate the first three months of this year. She is to report back to senior management with 95% confidence about any shift(s) in defect rates.
1.  Analyze the problem using the Hypothesis Testing roadmap.
2.  Use the columns named ppm defective1, ppm defective2 and ppm defective3 in MINITAB™ worksheet “Hypoteststud.mtw”.
3.  Are the defect rates equal for three months?
Cell Phone Defect Rate Example: Solution
Let’s follow the Roadmap to check to see if the data is Normal. Instead of performing a Normality Test we can find the P-value using the Graphical Summary in MINITAB™.
Stat > Basic Statistics > Graphical Summary…
Now let’s take a moment to compare the 3 variables. Since the P-values for all 3 variables are less than 0.05 the data is Non-normal.
Before we can perform a Mood’s Median Test we must first stack the columns ppm defective1, ppm defective2 and ppm defective3.
Again, when comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is more powerful but less robust to Outliers. In this case the variances were equal and both tests led to the same conclusion.
Cell Phone Defect Rate Example: Solution
The P-value is
over 0.05…
therefore we After stacking the
accept the null columns we can perform
hypothesis.
a Mood’s Median Test.
Stat>Nonparametric>Mood s Median Test
Unequal Variance
Where do you go in the roadmap if the variance is not equal?
–  Unequal variances are usually the result of differences in the shape of the distribution.
•  Extreme tails
•  Outliers
•  Multiple modes
These conditions should be explored through data demographics.
For Skewed Distributions with comparable Medians it is unusual for the variances to be different without some assignable cause impacting the process.
Example
This is an example of comparable products. Model A and Model B are similar in nature (not exact) but are manufactured in the same plant.
First open the MINITAB™ worksheet “Var_Comp.mtw”. Then check for Normality using “Stat > Basic Statistics > Normality Test…”.
Model A is Normal, Model B is Non-normal.
Now let’s check the variance. Does Model B have a larger variance than Model A? The Median for Model B is much lower. How can we capitalize on our knowledge of the process? Let’s look at data demographics to help us explain the differences between the two processes.
Now let’s check for Equal Variances using Levene’s Test, but remember first you will need to stack the data so you can run this test…
The P-value is just under the limit of .05. Whenever the result is borderline, as in this case, use your process knowledge to make a judgment.
Data Demographics
What clues can explain the difference in variances? This example illustrates how Non-normal Data can have significant informational content as revealed through data demographics. Sometimes this is all that is needed to draw conclusions.
Let’s look at data demographics for clues.
Graph > Dotplot > Multiple Y’s, Simple
Black Belt Aptitude Exercise
•  A recent deployment at a client raised the question of which educational background is best suited to be a successful Black Belt candidate.
•  In order to answer the question the MBB instructor randomly sampled the results of a Six Sigma pretest taken by now certified Black Belts at other businesses.
•  Undergraduate backgrounds in Science, Liberal Arts, Business and Engineering were sampled.
•  Management wants to know so they can screen prospective candidates for educational background.
1.  Analyze the problem using the Hypothesis Testing roadmap.
2.  What educational background is best suited for a potential Black Belt?
3.  Use the data within the MINITAB™ worksheet “BBaptitude.mtw”.
Black Belt Aptitude Exercise: Solution
First follow the Roadmap to check the data for Normality.
Stat > Basic Statistics > Normality Test…
Now let’s look at the MINITAB™ Session Window. As you can see the P-value is greater than 0.05.
Next we are going to check for variance. (Remember, stack the data first!)
The data illustrates there is not a difference in variance. Therefore we fail to reject the null hypothesis: there is no difference between a potential Black Belt’s degree and performance.
§  Conduct Hypothesis Testing for Equal Variance
§  Conduct Hypothesis Testing for Medians
§  Analyze and interpret the results
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 1.
Lean Six Sigma

Green Belt Training
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 2
Now we will continue in the Analyze Phase with “Hypothesis Testing Non-Normal Data Part 2”.
Overview
The core fundamentals of this phase are Tests for Proportions and Contingency Tables. We will examine the meaning of each of these and show you how to apply them.
[Diagram: Analyze Phase module map listing X Sifting and Intro to Hypothesis Testing, with this module covering Tests for Proportions and Contingency Tables.]
Hypothesis Testing Roadmap: Attribute Data
One Factor
–  One Sample: One Sample Proportion
–  Two Samples: Two Sample Proportion
   MINITAB™: Stat - Basic Stats - 2 Proportions
   If P-value < 0.05 the proportions are different
–  Two or More Samples: Chi Square Test (Contingency Table)
   MINITAB™: Stat - Tables - Chi-Square Test
   If P-value < 0.05 at least one proportion is different
Two Factors
–  Chi Square Test (Contingency Table)
   MINITAB™: Stat - Tables - Chi-Square Test
   If P-value < 0.05 the factors are not independent
We will now continue with the roadmap for Attribute Data. Since Attribute Data is Non-normal by definition it belongs in this module on Non-normal Data.
Sample Size and Types of Data

Sample size is dependent on the type of data.
For Continuous Data:

–  Capability Analysis – a minimum of 30 samples
–  Hypothesis Testing – depends on the practical difference
to be detected and the inherent variation in the process
as well as the statistical confidence you wish to have.
For Attribute Data:

–  Capability Analysis – a lot of samples
–  Hypothesis Testing – a lot but depends on practical
difference to be detected as well as the statistical
confidence you wish to have.
MINITABTM can estimate sample sizes but

remember the smaller the difference that needs to
be detected the larger the sample size must be!
Proportion versus a Target
This test is used to determine if the process proportion (p) equals some desired value, p0.
The hypotheses:
–  Ho: p = p0
–  Ha: p ≠ p0
The observed test statistic is calculated as follows (normal approximation):
$Z_{obs} = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
This is compared to $Z_{crit} = Z_{\alpha/2}$.
This formula is an approximation for ease of manual calculation.
Now let’s try an example:
1.  Shipping accuracy has a target of 99%; determine if the current process is on target.
2.  Hypotheses:
–  Ho: p = 0.99
–  Ha: p ≠ 0.99
3.  One sample proportion test
–  Choose α = 5%
4.  Sample size:
Stat > Power and Sample Size > 1 Proportion…
Enter multiple values for alternative values of p and MINITAB™ will give the different sample sizes.
Take note of how quickly the sample size increases as the alternative proportion approaches the hypothesized proportion. It would require 1402 samples to tell a difference between 98% and 99% accuracy. Our sample of 500 will do because the observed proportion is 96% according to the proportion formula.
Our sample included 500 shipped items of which 480 were accurate.
$\hat{p} = \dfrac{X}{n} = \dfrac{480}{500} = 0.96$
Stat > Basic Statistics > 1 Proportion…
5.  Statistical Conclusion: Reject the null hypothesis because the hypothesized proportion is not within the confidence interval.
6.  Practical Conclusion: We are not performing to the accuracy target of 99%.
After you analyze the data you will see the statistical conclusion is to reject the null hypothesis. The Practical Conclusion is that the process is not performing to the desired accuracy of 99%.
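The normal-approximation statistic for a proportion versus a target is easy to check by hand. The sketch below applies it to the shipping data (480 accurate out of 500, target 0.99). The function name is hypothetical, and because MINITAB™ may use an exact method by default, its reported P-value can differ from this approximation.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """Return (p_hat, Z_obs, two-sided P-value) using the normal approximation."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_hat, z, p_value

p_hat, z, p_value = one_proportion_z_test(x=480, n=500, p0=0.99)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # Z(alpha/2) for alpha = 5%
print(f"p_hat = {p_hat:.2f}, Z_obs = {z:.2f}, Z_crit = {z_crit:.2f}")
print("Reject Ho" if abs(z) > z_crit else "Fail to reject Ho")
```

Here |Z_obs| is far larger than Z_crit, which matches the statistical conclusion to reject the null hypothesis.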
Sample Size Exercise
You are the shipping manager charged with improving shipping accuracy. Your annual bonus depends on your ability to prove shipping accuracy is better than the target of 80%.
1.  How many samples do you need to take if the anticipated sample proportion is 82%?
2.  Out of 2000 shipments only 1680 were accurate.
•  Do you get your annual bonus?
•  Was the sample size good enough?
Proportion vs Target Example: Solution
First we must determine the proper sample size to detect an improvement over our target of 80%.
Stat > Power and Sample Size > 1 Proportion…
The Alternative Proportion should be .82 and the Hypothesized Proportion should be .80. Select a Power Value of ‘.9’ and click “OK”.
As you can see the Sample Size should be at least 4073 to prove our hypothesis.
Answer: Use alternative proportion of .82, hypothesized proportion of .80. n=4073. Either you had better ship a lot of stuff or you had better improve the process more than just 2%!
Now let’s calculate if we receive our bonus…
Out of the 2000 shipments 1680 were accurate. Was the sample size sufficient?
$\hat{p} = \dfrac{X}{n} = \dfrac{1680}{2000} = 0.84$
Do you get your bonus?
Yes, you get your bonus since .80 is not within the confidence interval. Because the observed accuracy was 84%, a larger improvement than the 2% planned for, the sample size of 2000 was sufficient.
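The sample sizes quoted for the one-proportion examples can be reproduced with a standard normal-approximation formula. This is a sketch under the assumption of a two-sided test with α = 0.05 and power 0.90; the function name is hypothetical and the text does not state which formula MINITAB™ uses internally, although this one matches the 4073 and 1402 quoted above.

```python
import math
from statistics import NormalDist

def sample_size_one_proportion(p0, p1, alpha=0.05, power=0.90):
    """Normal-approximation sample size for a two-sided 1-proportion test.

    p0 is the hypothesized proportion, p1 the alternative to be detected.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)

print(sample_size_one_proportion(0.80, 0.82))  # bonus exercise: 80% vs 82%
print(sample_size_one_proportion(0.99, 0.98))  # shipping example: 99% vs 98%
```

Passing a larger gap, for example 0.80 versus 0.84, shows how quickly the required sample shrinks as the difference to detect grows.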
Comparing Two Proportions
This test is used to determine if the process defect rate (or proportion, p) of one sample differs by a certain amount, D, from that of another sample (e.g., before and after your improvement actions).
The hypotheses:
H0: p1 – p2 = D
Ha: p1 – p2 ≠ D
The test statistic is calculated as follows:
$Z_{obs} = \dfrac{\hat{p}_1 - \hat{p}_2 - D}{\sqrt{\hat{p}_1 (1 - \hat{p}_1) / n_1 + \hat{p}_2 (1 - \hat{p}_2) / n_2}}$
This is compared to $Z_{critical} = Z_{\alpha/2}$.
MINITAB™ gives you a choice of using the normal approximation or the exact method. We will use the exact method. The formula is an approximation for ease of manual calculation.
Catch some Z’s!
Sample Size and Two Proportions Practice
Take a few moments to practice calculating the minimum sample size required to detect a difference between two proportions using a power of 0.90.
Enter the expected proportion for proportion 2 (null hypothesis).
For a more conservative estimate when the null hypothesis is close to 100%, use the smaller proportion for p1. When the null hypothesis is close to 0%, use the larger proportion for p1.
α     δ     p1     p2    n (answer)
5%    .01   0.79   0.8   34,247
5%    .01   0.81   0.8   32,986
5%    .02   0.08   0.1   4,301
5%    .02   0.12   0.1   5,142
5%    .03   0.47   0.5   5,831
5%    .03   0.53   0.5   5,831

1.  Shipping accuracy must improve from a historical baseline of 85% towards a target of 95%. Determine if the process improvements made have increased the accuracy.
2.  Hypotheses:
–  Ho: p1 – p2 = 0.0
–  Ha: p1 – p2 ≠ 0.0
3.  Two sample proportion test
–  Choose α = 5%
4.  Sample size:
In MINITAB™ click “Stat > Power and Sample Size > 2 Proportions”. For the field “Proportion 1 values:” type ‘.85’ and for the field “Power values:” type ‘.90’; the last field “Proportion 2:” is ‘.95’; then click “OK”.
A sample of at least 188 is necessary for each group to be able to detect a 10% difference. If you have reason to believe your process has only improved to 90% and you would like to be able to prove that improvement is occurring, the sample size of 188 is not appropriate. Recalculate using .90 for proportion 2 and leave proportion 1 at .85. It would require a sample size of 918 for each sample!
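The per-group sample sizes for two proportions can also be approximated by hand. The sketch below uses a common normal-approximation formula that pools the two proportions under the null hypothesis; the function name is hypothetical and the choice of formula is an assumption, although it reproduces the 188 and 918 quoted above for a two-sided test with α = 0.05 and power 0.90.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size to detect p1 vs p2 (two-sided, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil((numerator / (p1 - p2)) ** 2)

print(sample_size_two_proportions(0.85, 0.95))  # detect a 10% difference
print(sample_size_two_proportions(0.85, 0.90))  # detect a 5% difference
```

Halving the difference to detect roughly quintuples the sample required in this case, which is why the text warns that 188 is not enough for a 90% improved process.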
The following data were gathered for the two processes:
                      Total Samples   Accurate
Before Improvement    600             510
After Improvement     225             212
Calculate proportions:
Before Improvement: 600 samples, 510 accurate: $\hat{p}_1 = \dfrac{X_1}{n_1} = \dfrac{510}{600} = 0.85$
After Improvement: 225 samples, 212 accurate: $\hat{p}_2 = \dfrac{X_2}{n_2} = \dfrac{212}{225} = 0.942$

To compare two proportions in MINITAB™ select “Stat > Basic Statistics > 2 Proportions…”. Select the “Summarized data” option and in the “Trials:” and “Events:” columns input the appropriate data and click “OK”.
5.  Statistical Conclusion: Reject the null hypothesis.
6.  Practical Conclusion: You have achieved a significant difference in accuracy.
Boris and Igor Exercise
Exercise objective: To practice solving a problem
Boris and Igor tend to make a lot of mistakes writing requisitions.
          # Req's   # Wrong
Boris     356       47
Igor      571       99
1.  Who is worse?
2.  Is the sample size large enough?
2 Proportion vs Target Example: Solution
First we need to calculate our estimated p1 and p2 for Boris and Igor.
Boris: $\hat{p}_1 = \dfrac{X_1}{n_1} = \dfrac{47}{356} = 0.132$
Igor: $\hat{p}_2 = \dfrac{X_2}{n_2} = \dfrac{99}{571} = 0.173$
Results:
Sample   X    N     Sample p
1        47   356   0.132022
2        99   571   0.173380
Difference = p (1) - p (2)
Estimate for difference: -0.0413576
95% CI for difference: (-0.0882694, 0.00555426)
Test for difference = 0 (vs not = 0): Z = -1.73 P-Value = 0.084
As you can see we Fail to reject the null hypothesis with the data given. One conclusion is the sample size is not large enough. It would take a minimum sample of 1673 to distinguish the sample proportions for Boris and Igor.
Now let’s see what the minimum sample size should be…
Stat > Power and Sample Size > 2 Proportions
Power and Sample Size Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for proportion 2 = 0.13
Alpha = 0.05
Proportion 1   Sample Size   Target Power   Actual Power
0.17           1673          0.9            0.900078
The sample size is for each group.
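The Z and P-value in the Boris and Igor session output can be reproduced with the two-proportion formula, taking D = 0. The sketch below is a plain normal-approximation calculation with a hypothetical function name; it uses the unpooled standard error, which is an assumption that happens to match the output shown above.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2, d=0.0):
    """Return (Z_obs, two-sided P-value) for Ho: p1 - p2 = d (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2 - d) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Boris: 47 wrong out of 356; Igor: 99 wrong out of 571
z, p_value = two_proportion_z_test(47, 356, 99, 571)
print(f"Z = {z:.2f}, P-Value = {p_value:.3f}")  # prints "Z = -1.73, P-Value = 0.084"
```

Since the P-value is above 0.05 the data does not show that Igor is worse than Boris, which is consistent with the sample size conclusion above.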
Contingency Tables
Contingency Tables are used to simultaneously compare more than two sample proportions with each other.
It is called a Contingency Table because we are testing whether the proportion is contingent upon, or dependent upon, the factor used to subgroup the data.
This test generally works the best with 5 or more observations in each cell. Observations can be pooled by combining cells.
Some examples for use include:
–  Return proportion by product line
–  Claim proportion by customer
–  Defect proportion by manufacturing line
The null hypothesis is that the population proportions of each group are the same.
–  H0: p1 = p2 = p3 = … = pn
–  Ha: at least one p is different
Statisticians have shown that the following statistic forms a chi-square distribution when H0 is true:
$\sum \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$
Where “observed” is the sample frequency, “expected” is the calculated frequency based on the null hypothesis, and the summation is over all cells in the table.
That? ..oh, that’s my contingency table!
Test Statistic Calculations
Chi-square Test
$\chi^2_o = \sum_{i=1}^{r} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
$E_{ij} = \dfrac{F_{row} \times F_{col}}{F_{total}}$
$\chi^2_{critical} = \chi^2_{\alpha,\nu}$ (from the Chi-Square Table)
Where:
O = the observed value (from sample data)
E = the expected value
r = number of rows
c = number of columns
Frow = total frequency for that row
Fcol = total frequency for that column
Ftotal = total frequency for the table
ν = degrees of freedom [(r-1)(c-1)]
Wow!!! Can you believe this is the math in a Contingency Table? Thank goodness for MINITAB™. Now let’s do an example.
Contingency Table Example
1.  Larry, Curley and Moe are order entry operators and you suspect one of them has a lower defect rate than the others.
2.  Ho: pMoe = pLarry = pCurley
    Ha: at least one p is different
3.  Use Contingency Table since there are 3 proportions.
4.  Sample Size: To ensure a minimum of 5 occurrences were detected the test was run for one day.
            Moe   Larry   Curley
Defective     5       8       20
OK           20      30       25
Can’t you clowns get the entries correct?!
Note the data gathered in the table. Curley is not looking too good right now (as if he ever did).
The sample data are the “observed” frequencies. To calculate the “expected” frequencies, first add the rows and columns:
            Moe   Larry   Curley   Total
Defective     5       8       20      33
OK           20      30       25      75
Total        25      38       45     108
Then calculate the overall proportion for each row, for example 33/108 = 0.306:
            Moe   Larry   Curley   Total   Proportion
Defective     5       8       20      33        0.306
OK           20      30       25      75        0.694
Total        25      38       45     108
Now use these proportions to calculate the expected frequencies in each cell. For example:
Defective, Curley: 0.306 * 45 = 13.8
OK, Larry: 0.694 * 38 = 26.4
Next calculate the χ2 value for each cell in the table:
$\dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$
For example, for Curley’s defective entries: $\dfrac{(20 - 13.8)^2}{13.8} = 2.841$ (the result shown uses the unrounded expected value of 13.75).
            Moe     Larry   Curley
Defective   0.912   1.123   2.841
OK          0.401   0.494   1.250
Finally add these numbers to get the observed chi-square:
$\chi^2_{obs} = 0.912 + 1.123 + 2.841 + 0.401 + 0.494 + 1.250 = 7.02$
The final step is to create a summary table including the observed chi-square.
A summary of the table:
                        Moe     Larry   Curley
Defective   Observed    5       8       20
            Expected    7.6     11.6    13.8
            χ2          0.912   1.123   2.841
OK          Observed    20      30      25
            Expected    17.4    26.4    31.3
            χ2          0.401   0.494   1.250
$\chi^2_{obs} = 7.02$
Critical Value ~
•  Like any other Hypothesis Test compare the observed statistic with the critical statistic. We decide α = 0.05 so what else do we need to know?
•  For a chi-square distribution we need to specify ν, the degrees of freedom. In a Contingency Table:
ν = (r - 1)(c - 1), where
r = # of rows
c = # of columns
•  In our example we have 2 rows and 3 columns so ν = 2
•  What is the critical chi-square? For a Contingency Table all the risk is in the right hand tail (i.e. a one-tail test); look it up in MINITAB™ using Calc > Probability Distributions > Chi-Square…
$\chi^2_{crit} = 5.99$
Graphical Summary:
Since the observed chi-square exceeds the critical chi-square we reject the null hypothesis that the defect rate is independent of which person enters the orders.
[Figure: Chi-square probability density function for ν = 2, with the Accept region to the left of $\chi^2_{crit} = 5.99$, the Reject region to the right, and $\chi^2_{obs} = 7.02$ falling in the Reject region.]
Contingency Table Example (cont.)
Using MINITAB™ ~
•  Of course MINITAB™ eliminates the tedium of crunching these numbers. Type the order entry data from the Contingency Table Example into MINITAB™ as shown:
•  Notice the row labels are not necessary and row and column totals are not used, just the observed counts for each cell.
Stat > Tables > Chi-Square Test (2 way table in worksheet)
As you can see the data confirms we should reject the null hypothesis.
5.  Statistical Conclusion: Reject the null hypothesis.
6.  Practical Conclusion: The defect rate for one of these stooges is different. In other
words, defect rate is contingent upon the stooge.
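The hand calculation for the order entry example can be reproduced in a short script. The sketch below computes the expected frequencies, the observed chi-square and the degrees of freedom for any table of counts. The function names are hypothetical, and to stay within the standard library the P-value helper only handles an even number of degrees of freedom, where the chi-square upper tail has a simple closed form; that covers the 2 x 3 table used here.

```python
from math import exp, factorial

def chi_square_contingency(table):
    """Return (observed chi-square, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # Frow * Fcol / Ftotal
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

def chi_square_p_value_even_dof(chi2, dof):
    """Upper-tail P-value; closed form valid only for positive, even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("this sketch only supports a positive, even dof")
    half = chi2 / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(dof // 2))

# Rows: Defective, OK. Columns: Moe, Larry, Curley.
order_entry = [[5, 8, 20], [20, 30, 25]]
chi2, dof = chi_square_contingency(order_entry)
p_value = chi_square_p_value_even_dof(chi2, dof)
print(f"chi-square = {chi2:.2f}, dof = {dof}, P-value = {p_value:.3f}")
```

A P-value below 0.05 is the same decision as the observed chi-square of 7.02 exceeding the critical value of 5.99.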
Quotations Exercise
•  You are the quotations manager and your team thinks the reason you do not get a contract depends on its complexity.
•  You determine a way to measure complexity and classify lost contracts as follows:
              Low   Med   High
Price           8    10     12
Lead Time      10    11      9
Technology      5     9     16
1.  Write the null and alternative hypothesis.
2.  Does complexity have an effect?
Contingency Table Example: Solution
First we need to create a table in MINITAB™.
Secondly, in MINITAB™ perform a Chi-Square Test.
Stat > Tables > Chi-Square Test
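Contingency Table Example: Solution (cont.)
After analyzing the data we can see the P-value is 0.426 which is larger than 0.05. Therefore we fail to reject the null hypothesis.
Are the factors independent of each other? With a P-value this large there is no evidence that the reason for losing a contract depends on its complexity.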
Overview
Contingency Tables are another form of Hypothesis Testing.

They are used to test for association (or dependency) between two
classifications.
The null hypothesis is that the classifications are independent.
A Chi-square Test is used for frequency (count) type data.
If the data is converted to a rate (over time) then a continuous type
test would be possible. However, determining the period of time that
the rate is based on can be controversial. We do not want to just
pick a convenient interval; there needs to be some rationale behind
the decision. Many times we see rates based on a day because that
is the easiest way to collect data. However a more appropriate way
would be to look at the rate distribution per hour.
Per hour? Per day? Per month?
§  Calculate and explain test for proportions
§  Calculate and explain contingency tests
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 2.
Lean Six Sigma Green Belt Training
Analyze Phase
Now we will conclude the Analyze Phase with “Wrap Up and Action Items”.
Analyze Phase Wrap Up Overview
The goal of the Analyze Phase is to:
•  Locate the variables significantly impacting your Primary Metric.

Then establish Root Causes for X variables using Inferential
Statistical Analysis such as Hypothesis Testing and Simple
Modeling.
•  Gain and demonstrate a working knowledge of Inferential

Statistics as a means of identification of leverage variables.
Six Sigma Behaviors
•  Embracing change
•  Continuous learning
•  Being tenacious and courageous
•  Make data-based decisions
•  Being rigorous
•  Thinking outside of the box
Each player in the Lean Six Sigma process must be

A ROLE MODEL
for the Lean Six Sigma culture.
A Six Sigma Black Belt has a tendency to take on many roles therefore these behaviors help
throughout the journey.
Analyze Deliverables
Sample size is dependent on the type of data.
•  Listed here are the Analyze Phase deliverables each candidate will
present in a Power Point presentation at the beginning of the Improve
Phase training.
•  At this point you should all understand what is necessary to provide
these deliverables in your presentation.
–  Data Demographics
–  Hypothesis Testing (applicable tools)
–  Modeling (applicable tools)
–  Strategy to reduce the X’s
–  Project Plan
It’s your show!
Analyze Phase - The Roadblocks

Each phase will have roadblocks. Many will be similar throughout your project.

–  Lack of data
–  Data presented is the best guess by functional
managers
–  Team members do not have the time to collect data
–  Process participants do not participate in the analysis
planning
–  Lack of access to the process
409
DMAIC Roadmap
Now you should be able to prove/disprove the impact “X’s” have on a problem.
[Figure: DMAIC Roadmap for the Champion/Process Owner: Define, Estimate COPQ, Establish Team, Measure, Analyze, Improve, Control]
Analyze Phase
•  Vital Few X’s Identified
•  State Practical Theories of Vital Few X’s Impact on Problem
•  Translate Practical Theories into Scientific Hypothesis
•  Select Analysis Tools to Prove/Disprove Hypothesis
•  Collect Data
•  Perform Statistical Tests
•  State Practical Conclusion
•  Statistically Significant? (Y/N)
•  Update FMEA
•  Practically Significant? (Y/N)
•  Root Cause? (Y/N)
•  Identify Root Cause
•  Ready for Improve and Control
Over 80% of projects will realize their solutions in the Analyze Phase; then we must move to the
Control Phase to assure we can sustain our improvements.
410
Analyze Phase Checklist
Analyze Questions
Define Performance Objectives Graphical Analysis

•  Is existing data laid out graphically?
•  Are there newly identified secondary metrics?
•  Is the response discrete or continuous?
•  Is it a Mean or a variance problem or both?
Document Potential X’s Root Cause Exploration

•  Are there a reduced number of potential X’s?
•  Who participated in these activities?
•  Are the number of likely X’s reduced to a practical number for analysis?
•  What is the statement of Statistical Problem?
•  Does the process owner buy into these Root Causes?
Analyze Sources of Variability Statistical Tests

•  Are there completed Hypothesis Tests?
•  Is there an updated FMEA?
General Questions
•  Are there any issues or barriers preventing you from completing this phase?
•  Do you have adequate resources to complete the project?
Planning for Action
This is a template that should be used with each project to assure you take the proper steps –
remember, Six Sigma is very much about taking steps. Lots of them and in the correct order.
Qualitative screening of vital from controllable trivial X’s
Qualitative screening for other factors
Quantitative screening of vital from controllable trivial X’s
Ensure compliance to problem solving strategy
Quantify risk of meeting needs of customer, business and people
Predict risk of sustainability
Chart a plan to accomplish desired state of culture
Assess shift in process location
Minimize risk of process failure
Modeling Continuous or Non Continuous Output
Achieving breakthrough in Y with minimum efforts
Validate Financial Benefits
411
§  Have started to develop a project plan to meet the deliverables
§  Be ready to apply the Six Sigma method through your project
You’re on your way!
You have now completed the Analyze Phase. Congratulations!
Notes
412
Lean Six Sigma

Green Belt Training
Improve Phase
Welcome to Improve
Now that we have completed the Analyze Phase we are going to jump into the Improve Phase.
Welcome to Improve will give you a brief look at the topics we are going to cover.
413
Welcome to Improve
Overview
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
Well, now that the Analyze Phase is over, on to a more difficult phase. The good news is… you will
hardly ever use this stuff, so pay close attention! We will examine the meaning of each of these
and show you how to apply them.
DMAIC Roadmap
[Figure: DMAIC Roadmap for the Champion/Process Owner: Define, Estimate COPQ, Establish Team, Measure, Analyze, Improve, Control]
We are currently in the Improve Phase and by now you may be quite sick of Six Sigma, really! In
this module we are going to look at additional approaches to process modeling. It is actually quite
fun in a weird sort of way!
414
Welcome to Improve
Improve Phase
Analysis Complete
Identify Few Vital X’s
Experiment to Optimize Value of X’s
Simulate the New Process
Validate New Process
Implement New Process
Ready for Control
After completing the Improve Phase you will be able to put to use the steps as depicted here.
415
Lean Six Sigma

Green Belt Training
Improve Phase
Process Modeling Regression
Now we will continue in the Improve Phase with “Process Modeling: Regression”.
416
Overview
Welcome to Improve
Process Modeling: Regression
–  Correlation
–  Introduction to Regression
–  Simple Linear Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
In this module of Process Modeling we will study Correlation, Introduction to Regression and Simple
Linear Regression. These are some powerful tools in our data analysis tool box.
We will examine the meaning of each of these and show you how to apply them.
417
Correlation
•  The primary purpose of linear correlation analysis is to measure the

strength of linear association between two variables (X and Y).
•  If X increases with no definite change in the value of Y, there is no
correlation or no association between X and Y.
•  If X increases and there is a shift in the value of Y there is a correlation.
•  The correlation is positive when Y tends to increase with an increase in X
and negative when Y tends to decrease with an increase in X.
•  If the ordered pairs (X, Y) tend to follow a straight line path there is a linear
correlation.
•  The preciseness of the shift in Y as X increases determines the strength of
the linear correlation.
•  To conduct a linear correlation analysis we need:
–  Bivariate Data – Two pieces of data that are variable
–  Bivariate data is comprised of ordered pairs (X/Y)
–  X is the independent variable
–  Y is the dependent variable
Notes
418
Correlation Coefficient
Ho: No Correlation (Ho ho ho….)
Ha: There is Correlation (Ha ha ha….)
The null hypothesis for correlation is that there is no correlation; the alternative is that there is
correlation.
The Correlation Coefficient always assumes a value between –1 and +1.
The Correlation Coefficient of the population, large R, is estimated by the sample Correlation
Coefficient, small r, which is calculated as:
\[ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}} \]
Types and Magnitude of Correlation
[Figure: six scatter plots of Output versus Input. Top row: Strong, Moderate and Weak Positive Correlation. Bottom row: Strong, Moderate and Weak Negative Correlation.]
The graphics shown here are labeled as the type and magnitude of their correlation: Strong,
Moderate or Weak correlation.
419
Limitations of Correlation
•  The magnitude of the Correlation Coefficient is somewhat relative and should be used with caution.
•  As usual statistical significance is judged by comparing a P-value with the chosen degree of alpha risk.
•  Guidelines for practical significance are as follows:
–  If | r | > 0.80, relationship is practically significant
–  If | r | < 0.20, relationship is not practically significant
[Figure: scale of r from -1.0 to +1.0. Area of negative linear correlation: -1.0 to -0.8. No linear correlation: -0.2 to 0.2. Area of positive linear correlation: 0.8 to +1.0.]
To properly understand Regression you must first understand Correlation. Once a relationship is
described a Regression can be performed. A strong positive or negative Correlation between X and
Y does not indicate causality. Correlation provides an indication of the strength but does not
provide us with an exact numerical relationship. Regression however provides us with that
information; more specifically a Y equals a function of X equation. Just like any other statistic, be
certain to assess whether the Correlation Coefficient is both statistically significant and
practically significant.
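The practical significance guidelines above can be sketched as a small Python function. This is only an illustration: the function name and the "inconclusive" label for the band between 0.20 and 0.80 are assumptions, since the guidelines do not name that band.

```python
def practical_significance(r: float) -> str:
    """Classify a Correlation Coefficient using the | r | guidelines above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)
    if magnitude > 0.80:
        return "practically significant"
    if magnitude < 0.20:
        return "not practically significant"
    # The guidelines do not name this middle band; the label is an assumption.
    return "inconclusive"


for value in (0.935, -0.10, 0.50):
    print(value, "->", practical_significance(value))
```

Statistical significance is a separate question and still has to be judged from the P-value.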
Correlation Example
RB Stats Correlation.mtw
The Correlation Coefficient [r]:
•  Is a positive value if one variable increases as the other variable increases.
•  Is a negative value if one variable decreases as the other increases.
Correlation Formula
\[ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}} \]
X values          Y values
Payton carries    Payton yards
196               679
311               1390
339               1852
333               1359
369               1610
317               1460
339               1222
148               596
314               1421
381               1684
324               1551
321               1333
146               586
We will use some data from a National Football League player, Walter Payton, formerly of the
Chicago Bears. Open MINITAB™ worksheet “RB Stats Correlation.mtw” as shown here.
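For readers who want to check the Correlation Formula by hand, here is a minimal Python sketch that applies it to the 13 rows listed above. The function and variable names are illustrative assumptions; MINITAB™ remains the tool used in this course.

```python
from math import sqrt

# Carries (X) and yards (Y) as listed in the worksheet above.
carries = [196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146]
yards = [679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586]


def pearson_r(x, y):
    """Sample Correlation Coefficient r, computed term by term from the formula."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(carries, yards)
print(f"r = {r:.3f}")
```

The value printed should agree with the Pearson correlation MINITAB™ reports for this worksheet.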
420
Correlation Analysis
Graph>Scatter Plot>Simple…
Get outta
my way!
In MINITAB™ select “Graph>Scatter Plot>Simple”. The “Scatterplot – Simple” window will open. To
select your Y variable double-click on “payton yards” in the left-hand box. For the X variable double-
click “payton carries” in the same box. To add a Lowess smoother to the Scatter Plot
click on the “Data View…” button and select the “Smoother” tab; from there you will see a Lowess
option. Select this option and click “OK”.
Correlation Example
Look at the graph. Do you observe any correlation in this graph?
Lowess stands for LOcally-WEighted Scatterplot Smoother. The Lowess routine fits a smoothed
line to the data which should be used to explore the relationship between two variables without
fitting a specific model, such as a regression line or theoretical distribution. Lowess smoothers
are most useful when the curvature of the relationship does not change sharply. In this example
it appears there is correlation in the data.
421
Correlation Example (cont.)

Correlation Coefficient is high and the P-value is low. Reject the null hypothesis; there is a
correlation.
Results for: RB STATS CORRELATION.MTW
Scatterplot of Payton yards vs Payton carries
Correlations: Payton carries, Payton yards
Pearson correlation of Payton carries and Payton yards = 0.935
P-Value = 0.000
Now we will generate the Correlation Coefficient using MINITAB™. Follow the MINITAB™ command
path shown here and, for “Variables:”, double-click on “payton carries” and “payton yards” from the
left box. The Correlation Coefficient is high, at 0.935, which corresponds to the graph on the
previous slide that shows positive correlation. The P-value is low, at 0.000, so we reject the null
hypothesis by saying there is significant correlation between Payton’s carries and the number of
yards.
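One common way to see why the P-value is so low is to convert r into a t statistic with n - 2 degrees of freedom. The sketch below assumes that approach and uses a critical value read from a standard t table (2.201 for alpha = 0.05, two-sided, 11 degrees of freedom); neither is shown in the MINITAB™ output above.

```python
from math import sqrt


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for Ho: no correlation, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


r, n = 0.935, 13  # r from the session output, n = rows in the worksheet
t_stat = correlation_t_statistic(r, n)

# Two-sided critical value for alpha = 0.05 and 11 degrees of freedom,
# taken from a standard t table (an assumption of this sketch).
T_CRITICAL = 2.201

print(f"t = {t_stat:.2f}")
print("Reject Ho" if abs(t_stat) > T_CRITICAL else "Fail to reject Ho")
```

A t statistic far beyond the critical value corresponds to a P-value that rounds to 0.000.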
Regression Analysis
The last step to proper analysis of Continuous Data is to determine the Regression Equation.
The Regression Equation can mathematically predict Y for any given X.
MINITAB™ gives the BEST FIT for the plotted data.
Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx^2 (Quadratic or 2nd order model)
Y = a + bx + cx^2 + dx^3 (Cubic or 3rd order model)
Y = a(b^x) (Exponential)
Correlation ONLY tells us the strength of a relationship while Regression gives the mathematical
relationship or the prediction model.
422
Simple versus Multiple Regression
Simple Regression:
–  One X, One Y
–  Analyze in MINITAB™ using
•  Stat>Regression>Fitted Line Plot or
•  Stat>Regression>Regression
Multiple Regression:
–  Two or More X’s, One Y
–  Analyze in MINITAB™ using:
In both cases the R-sq value signifies the share of the output variation that is explained by
the input variation in the model.
In Simple Regression there is only one X, commonly referred to as a predictor or regressor.
Multiple Regression allows many X’s. Recall we are only presenting Simple Regression in this
module and will present Multiple Regression in detail in the next module.
Regression Analysis Graphical Output
There are two ways to perform a Simple Regression. One is the Fitted Line Plot which will give a
Scatter Plot with a Fitted Line and will generate a limited Regression Equation in the Session
Window of MINITAB™ as shown here.
Follow the MINITAB™ command path shown here, double-click “payton yards” for Response (Y) and
double-click “payton carries” for the Predictor (X) and click “OK” which will produce this output.
423
Regression Analysis Statistical Output
Stat > Regression > Regression
R-Sq value of 87.3% = 1798587 / 2059413

R-Sq (adj) of 86.2% = (1798587 – 23711)/2059413
Mean Squares
R-Sq value of 87.3% quantifies the strength of the association between Carries and Yards. In
this case our Prediction Equation explains 87.3% of the total variation seen in Yards. 12.7% of
the variation seen in Yards is not explained by our equation.
Let’s look at the Regression Analysis Statistical Output. The difference between R squared and
adjusted R squared is not terribly important in Simple Regression.
In Multiple Regression where there are many X’s it becomes more important which you will see
in the next module.
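The R-Sq arithmetic above can be reproduced in a few lines of Python. The sums of squares are the ones quoted on the slide; the variable names and the textbook form of adjusted R-Sq (based on mean squares) are assumptions of this sketch.

```python
ss_regression = 1798587  # sum of squares explained by the model (from the slide)
ss_total = 2059413       # total sum of squares for Yards (from the slide)
n = 13                   # observations in the worksheet
p = 1                    # one predictor in a Simple Regression

ss_error = ss_total - ss_regression
mean_square_error = ss_error / (n - p - 1)

r_sq = ss_regression / ss_total
# Textbook definition: compare mean squares rather than sums of squares.
r_sq_adj = 1 - mean_square_error / (ss_total / (n - 1))
# Form used on the slide; with one predictor it gives the same number.
r_sq_adj_slide = (ss_regression - mean_square_error) / ss_total

print(f"R-Sq       = {r_sq:.1%}")
print(f"R-Sq (adj) = {r_sq_adj:.1%}")
```

With a single X the two adjusted forms coincide, which is why the difference matters little until Multiple Regression.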
Regression (Prediction) Equation
Payton yards = Constant + Coefficient × (Level of X)
The solution:
Payton yards = -163.497 + 4.91622(250) = 1,065.6
The Regression Analysis generates a prediction model based on the best fit line through the
data represented by the equation shown here. To predict the number of yards Payton would run
if he had 250 carries you simply fill in that value in the equation and solve.
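As a cross-check on the Prediction Equation, the least-squares line can be fitted directly from the worksheet values listed earlier in this module. This is a hedged sketch with illustrative names, not the MINITAB™ procedure itself.

```python
# Carries (X) and yards (Y) from the "RB Stats Correlation.mtw" rows shown earlier.
carries = [196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146]
yards = [679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586]


def fit_line(x, y):
    """Ordinary least-squares fit of Y = intercept + slope * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(carries, yards)
predicted = intercept + slope * 250  # level of X = 250 carries

print(f"Payton yards = {intercept:.3f} + {slope:.5f} (Payton carries)")
print(f"Predicted yards at 250 carries: {predicted:.1f}")
```

The fitted constant and coefficient should match the equation above to the precision shown.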
424
Regression (Prediction) Equation (cont.)
Compare to the Fitted Line.
~1067 yds
You could make a fairly accurate estimate by using the Line Plot also.
Regression Graphical Output
For a demonstration check other Regression fits.

Stat>Regression>Fitted Line Plot
Quadratic and Cubic: Check the R-Sq value against the linear model to determine whether the
additional variation explained by the equation is significant.
MINITAB™ will also generate both quadratic and cubic fits. Select the appropriate variables for (Y) and
(X) and choose “Quadratic” or “Cubic” for the type of Regression Model.
425
Regression Graphical Output (cont.)
Quadratic
Cubic
Use the best fitting equation by looking at the R-Sq value. If it improves significantly or if the
assumptions of the residuals are better met as a result of utilizing the quadratic or cubic equation
you should use it.
Here there is no big difference so we will stick with the linear model.
Residuals
As in ANOVA the residuals should:

–  Be Normally Distributed (normal plot of residuals)
–  Be independent of each other
•  no patterns (random)
•  data must be time ordered (residuals vs. order graph)
–  Have a constant variance (visual, see residuals versus fits chart,
should be (approximately) same number of residuals above and
below the line, equally spread.)
426
Residuals (cont.)
Residual Plots can be generated from both the Fitted Line Plot and
regression selection in MINITABTM.
The Standardized Residual is also known as the Studentized Residual or internally Studentized
Residual. The Standardized Residual is the Residual divided by an estimate of its Standard
Deviation.
This form of the Residual takes into account that the Residuals may have different variances,
which can make it easier to detect Outliers.
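A minimal Python sketch of the Standardized Residual for a Simple Regression follows. It assumes the usual leverage formula h = 1/n + (x - x̄)² / Σ(x - x̄)² and uses a cut-off of 2 to flag points, which is a common rule of thumb rather than something stated in this module.

```python
from math import sqrt

carries = [196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146]
yards = [679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586]

n = len(carries)
x_bar = sum(carries) / n
y_bar = sum(yards) / n
s_xx = sum((x - x_bar) ** 2 for x in carries)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(carries, yards)) / s_xx
intercept = y_bar - slope * x_bar

# Raw residual = observed Y minus fitted Y.
residuals = [y - (intercept + slope * x) for x, y in zip(carries, yards)]
# S estimates the Standard Deviation of the model error (n - 2 degrees of freedom).
s = sqrt(sum(e * e for e in residuals) / (n - 2))

# Points far from the mean of X have more leverage and a smaller residual variance.
leverage = [1 / n + (x - x_bar) ** 2 / s_xx for x in carries]
standardized = [e / (s * sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for x, e, z in zip(carries, residuals, standardized):
    flag = "  <-- large" if abs(z) > 2 else ""  # cut-off of 2 is an assumption
    print(f"carries={x:4d}  residual={e:8.1f}  standardized={z:6.2f}{flag}")
```

Dividing each residual by its own estimated Standard Deviation puts all points on one scale, so an unusually large value stands out regardless of where it sits on the X axis.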
Here we produced the graph by selecting the “Four in one” option.
Normality assumption… Equal variance assumption…
Independence assumption…
427
Normal Probability Plot of Residuals
To view a normal probability plot in MINITAB™ select “Stat>Regression>Fitted Line Plot” and click
on the “Graph” button. You will notice underneath “Residual Plots” there are four options to
choose from. For this example select “Normal plot of residuals”. We will test Residuals versus
Fitted Values and Residual versus Order of Data in the next few pages.
Normally Distributed response assumption: Residuals should lie near the straight line
(to within a fat pencil of each other).
As you can see the Normal probability plot of residuals evaluates the Normally Distributed response
assumption. The residuals should lie near the straight line to within a fat pencil. Looking at a
Normal probability plot to determine Normality takes a little practice. Technically speaking, however,
it is inappropriate to use an Anderson-Darling or any other Normality test that generates a P-value
to determine Normality of residuals. The reason is residuals are not independent and so do not meet
a basic assumption for using the Normality tests. Dr. Douglas Montgomery of Arizona State University
coined the phrase “fat pencil test” much to the chagrin of many of his colleagues.
428
Residuals versus Fitted Values
Equal Variance assumption ~
Should be randomly scattered with no patterns.
Residuals versus Fitted Values evaluates the Equal Variance assumption. Here you want to have a
random scattering of points.
You DO NOT want to see a “funnel effect” where the residuals get bigger and bigger as the
Fitted Value gets bigger or smaller.
Residuals versus Order of Data
Independence assumption ~
Should show no trends either up or down and should have approximately the same number of
points above and below the line (approximately constant variance).
Residuals versus the order of data is used to evaluate the Independence Assumption. It should not
show trends either up or down and should have approximately the same number of points above
and below the line.
429
Modeling Y = f(x) Exercise
Exercise objective: To gain an understanding of how to use the regression/correlation functions
in MINITAB™. Examine correlation and regression for the Dorsett data in the
“RB Stats Correlation.mtw” file and answer the following questions.
1.  What is the type and magnitude of the correlation?

a. Strong Positive
b. Moderate Positive
c. Weak Positive
d. Strong Negative
2. What is the Prediction Equation?
3. What is the predicted value or yardage if Dorsett carries the

football 325 times?
4. Are all assumptions met?

RB Stats Correlation.mtw
430
Modeling Y = f(x) Exercise: Question 1 Solution
To determine the Type and Magnitude of the relationship we need to

run a basic Scatter Plot.
From Graph select Scatterplot then Simple …
For Y variables enter “dorsett yards”; for X variables enter “dorsett carries”.
The Scatter Plot demonstrates a Strong Positive Correlation.
431
To determine the Prediction Equation we need to run a Fitted Line Plot.

Stat > Regression > Fitted Line Plot…
Fitted Line Plot
For Response (Y): enter dorsett yards

For Predictor (X): enter dorsett carries
The Prediction Equation is shown here…
432
If Dorsett carries the football 325 times the predicted value

would be determined as follows…
Step 1: Dorsett Yards = -160.1 + 4.993 (Dorsett Carries)
Step 2: Dorsett Yards = -160.1 + 4.993 (325)
Step 3: Dorsett Yards = -160.1 + 1622.725
Solution: Dorsett Yards = 1462.63
If Dorsett carries the football 325 times, the Prediction Equation predicts he would gain
approximately 1462.63 yards.
The Normality Assumptions have been satisfied.
The Equal Variance Assumptions have been satisfied.
The Independence Assumptions have been satisfied.
All three assumptions have been satisfied. Ah, so much satisfaction!
433
§  Perform the steps in a Correlation and a Regression Analysis
§  Explain when Correlation and Regression are appropriate
You have now completed Improve Phase – Process Modeling Regression.
Notes
434
Lean Six Sigma

Green Belt Training
Improve Phase
Advanced Process Modeling
Now we will continue with the Improve Phase “Advanced Process Modeling MLR”.
435
Overview
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
–  Review Corr./Regression
–  Non-Linear Regression
–  Transforming Process Data
–  Multiple Regression
The fundamentals of this phase are as shown. We will examine the meaning of each of these
and show you how to apply them.
Correlation and Linear Regression Review
Correlation and Linear Regression are used:

–  With historical process data. It is NOT a form of experimentation.
–  To determine if two variables are related in a linear fashion.
–  To understand the strength of the relationship.
–  To understand what happens to the value of Y when the value of X
is increased by one unit.
–  To establish a Prediction Equation enabling us to predict Y for any
level of X.
Correlation explores association.

Correlation and regression do
not imply a causal relationship.
Designed experiments allow
for true cause and effect
relationships to be identified.
Correlations: StirRate, Impurity

Pearson correlation of StirRate and Impurity = 0.959
P-value = 0.000
Recall the Simple Linear Regression and Correlation covered in a previous module. The essential
tools presented here describe the relationship between two variables: an independent or input factor
and, typically, an output response. Causation is NOT proved by these tools; they only describe the
association between the two variables.
436
Correlation Review
Correlation is used to measure the linear relationship between two Continuous Variables
(bi-variate data).
Pearson Correlation Coefficient, r, will always fall between –1 and +1.
A Correlation of –1 indicates a strong negative relationship: as one factor increases the other
decreases.
A Correlation of +1 indicates a strong positive relationship: as one factor increases so does the
other.
P-value > 0.05, Ho: No relationship
P-value < 0.05, Ha: Is relationship
[Figure: Decision Points on the scale of r. Strong Correlation at -1.0, No Correlation at 0, Strong Correlation at +1.0.]
The Pearson Correlation Coefficient, represented here as “r”, shows the strength of a relationship
in Correlation. An “r” of zero indicates no correlation. The P-value indicates the statistical
confidence of our conclusion that a relationship exists. Simultaneously, the Pearson Correlation
Coefficient shows the “strength” of the relationship. For example, with the alpha risk set at .05, a
P-value below .05 means we have more than 95% confidence that a relationship exists between the
two factors tested.
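The decision rule above can be written as a short Python function. The function name, the default alpha of 0.05 and the wording of the returned messages are illustrative assumptions.

```python
def correlation_decision(p_value: float, alpha: float = 0.05) -> str:
    """Apply the Ho/Ha decision rule for a correlation P-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < alpha:
        return "Reject Ho: there is a relationship"
    return "Fail to reject Ho: no relationship shown"


print(correlation_decision(0.000))  # e.g. the StirRate / Impurity output
print(correlation_decision(0.320))  # a hypothetical non-significant result
```

The P-value only answers whether a relationship is statistically detectable; the size of r still has to be read separately to judge its strength.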
Linear Regression Review
Linear Regression is used to model the relationship between a Continuous response variable (Y)
and one or more Continuous independent variables (X). The independent predictor variables are
most often Continuous but can be ordinal.
–  Example of ordinal - Shift 1, 2, 3, etc.
P-value > 0.05, Ho: Regression Equation is not significant
P-value < 0.05, Ha: Regression Equation is significant
The change in Impurity for every one unit change in StirRate (Slope of the Line)
Presented here, StirRate is directly related to Impurity of the process: each one-unit increase in
StirRate is associated with a 0.4643 increase in Impurity. With StirRate set at 30, Impurity is
calculated as 30 times 0.4643, minus 0.632, which gives us an Impurity of 13.3. Granted, we have
error in our model; the red points do not lie exactly on the blue line. The dependent response
variable is Impurity and StirRate is the independent predictor; both variables in this example are
Continuous.
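Written out step by step, and assuming the fitted equation has a constant of -0.632 and a slope of 0.4643 as the notes above state, the prediction at a StirRate of 30 is:

```latex
\begin{aligned}
\text{Impurity} &= -0.632 + 0.4643 \times \text{StirRate} \\
                &= -0.632 + 0.4643 \times 30 \\
                &= -0.632 + 13.929 \\
                &\approx 13.3
\end{aligned}
```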
437
Correlation Review
Correlation tells us the strength of a linear relationship not the numerical relationship.
The last step to proper analysis of Continuous Data is to determine the Regression Equation.
The Regression Equation can mathematically predict Y for any given X.
The Regression Equation from MINITAB™ is the best fit for the plotted data.
Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx^2 (Quadratic or 2nd order model)
Y = a + bx + cx^2 + dx^3 (Cubic or 3rd order model)
Y = a(b^x) (Exponential)
Correlation does not give the numerical relationship. Correlation shows the strength of a linear
relationship; the mathematical relationship is shown by the Prediction Equation of Regression.
As shown, these Correlations or Regressions are not proven causal relationships. We are
attempting to demonstrate a statistical association. Simple linear, quadratic, cubic and
exponential relationships all yield REGRESSION equations that predict the output (Y). More
complex relationships are coming up.
Simple versus Multiple Regression Review

Simple Regression
–  One X, One Y
–  Analyze in MINITAB™ using
•  Stat>Regression>Fitted Line Plot or
•  Stat>Regression>Regression
Multiple Regression
–  Two or More X’s, One Y
–  Analyze in MINITAB™ using
•  Stat>Regression>Best Subsets
In both cases the R-sq value estimates the amount of variation explained by the model.
Simple Regressions have one X, referred to as the regressor or predictor. When multiple X’s are
used to explain the output or response variable, this is Multiple Regression. The strength of the
regression is quantified by R squared, which is the share of the overall variation in the output (Y)
explained by the independent variables in the regression equation.
438
Regression Step Review

The basic steps to follow in Regression are as follows:
1.  Create Scatter Plot (Graph>Scatterplot)
2.  Determine Correlation (Stat>Basic Statistics>Correlation – P-value less than 0.05)
3.  Run Fitted Line Plot choosing linear option (Stat>Regression>Fitted Line Plot)
4.  Run Regression (Stat>Regression>Regression) (Unusual Observations?)
5.  Evaluate R^2, adjusted R^2 and P-values
6.  Run Non-linear Regression if necessary (Stat>Regression>Fitted Line Plot)
7.  Analyze residuals to validate assumptions. (Stat>Regression>Fitted Line Plot>Graphs)
a.  Normally distributed
b.  Equal variance
c.  Independence
d.  Confirm one or two points do not overly influence model.
One step at a time….
How to run a Regression is laid out here. First, use a Scatter Plot to understand how Y varies with
the X’s. Second, run a Correlation analysis to indicate whether a linear relationship potentially
exists. The third step is to find the linear mathematical relationship, which calls for a Prediction
Equation, and the fourth is to find the strength of the linear relationship if one exists. The fifth
step quantifies the percentage of the variation in the output that the input explains. It also
answers how much statistical confidence we have in our Linear Regression.
To conclude a Linear Regression exists, most practitioners require statistical confidence of 95%
or above. If the conclusions are unsatisfactory, step 6 is the contingency: we consider a potential
Non-linear Regression. However this is necessary only if we cannot find a Regression Equation that
explains the variation of the output (statistically and practically) from the input, or if analyzing
the model error shows the model is not correct. Step 7 validates the residuals, which is a
necessity for a valid model.
Simple Regression Example
This data set is from the mining industry. It is an evaluation of ore concentrators.
Graph > Scatterplot…
Recalling tools learned in the Analyze Phase, presented here is a Simple Regression example
examining a piece of equipment at a mining company. Following the Regression steps, this
diagram plots output against input. Notice how the output of PGM concentrate changes with the
agitator of the equipment.
Open the MINITAB™ file named “Concentrator.MTW”. The output is always plotted on the Y axis
(dependent) and the input is always plotted on the X axis (independent).
439
Example Correlation
Identifying whether a linear Correlation exists is the second step. With the Pearson Correlation
Coefficient at .847 and a P-value less than .05 we see, with very strong statistical confidence, a
linear relationship. If no Correlation existed the coefficient would be closer to zero, remember?
Example Regression Line
Stat > Regression > Fitted Line Plot…
Now we find the Prediction Equation of the linear relationship, which involves two factors: the
output response and the input variable. Grams per ton of the PGM concentrate is the output and the
RPM of the agitator is the input. Recall the Pearson Correlation Coefficient was greater than zero,
so we know the slope is positive: the PGM concentrate increases as the agitator’s RPM increases.
The slope of the Linear Regression equals 1.333.
440
Example Linear Regression

The P-value < 0.05 therefore the Regression is significant.
Notice the unusual observation may indicate a Non-linear analysis may explain more of the
variation in the data.
Shown here is a Linear Regression that explains about 70% of the process variation. Considering
step five, MINITAB™ flags data point 12 as an unusual observation with a large residual. R squared,
adjusted R squared and the listing of unusual observations all belong to our full Regression
Analysis. With these concerns, refer to the MINITAB™ window (if necessary); a Non-linear
Regression might be worth considering.
Example Regression Line
Stat>Regression>Fitted Line Plot
Notice how the new line is a more appropriate representation of our data since the curvature better
fits the plotted points. This is the essence of choosing a Non-linear Regression, in this case a
Quadratic Regression. To use this model, simply click the “Quadratic” option.
441
Example Linear and Non-Linear Regression
Linear Model
Non-Linear Model
More variation is explained using the Non-linear model since the R-Squared is higher and the S
statistic, which is the estimated Standard Deviation of the error in the model, is lower.
We have here both Regression models. R squared is higher for the Non-linear model than for the
Linear model, so the Non-linear model explains more of the process variation. In addition S, the
estimated Standard Deviation of the errors, is lower for the Non-linear model.
Let’s now consider the model error. You need not be perplexed; model error has many sources.
The output’s dependence on other input variables and measurement system errors in the output
and inputs can be causes. The MINITAB™ Session Window displays both of these Regression
Analyses, so feel free to refer to it.
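To make the R-Squared and S comparison concrete, the sketch below fits a linear and a quadratic model to the same data and reports both statistics. The data are hypothetical values with deliberate curvature, not the Concentrator worksheet, and the helper names are assumptions of this example.

```python
from math import sqrt


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, R-Sq, S)."""
    k = degree + 1
    # Normal equations: sum(x^(i+j)) * coef_j = sum(y * x^i)
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    coef = solve(a, b)
    fitted = [sum(c * xi ** j for j, c in enumerate(coef)) for xi in x]
    y_bar = sum(y) / len(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    r_sq = 1 - ss_error / ss_total
    s = sqrt(ss_error / (len(x) - k))  # estimated Standard Deviation of the error
    return coef, r_sq, s


# Hypothetical input/output pairs with visible curvature.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 4.7, 8.9, 14.1, 21.2, 29.1, 38.9, 49.7]

_, r_sq_lin, s_lin = fit_polynomial(x, y, 1)
_, r_sq_quad, s_quad = fit_polynomial(x, y, 2)
print(f"Linear:    R-Sq = {r_sq_lin:.1%}, S = {s_lin:.3f}")
print(f"Quadratic: R-Sq = {r_sq_quad:.1%}, S = {s_quad:.3f}")
```

R-Sq can never go down when a term is added, so S and the residual plots are the more telling checks on whether the extra term is worth keeping.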
Example Residual Analysis

The recommendation here would be to use standardized residuals and the “Four in one” option for
plotting. In the upper left window “Graph” NEEDS to be clicked to produce the residual plots;
analyzing the residuals concludes the seventh step.
442

Example Residual Analysis
Normally Distributed residuals (Normal Probability Plot)
Equal variance (Residuals vs. Fitted Values)
Independence (Residuals vs. Order of Data)
Having selected the “Four in one” option, we now see all analyses presented and must keep in
mind our assumptions when judging whether the Regression is valid. Residuals must not show a
pattern across the data collected and must have similar variation across the Fitted Values.
Moreover, in a valid Regression the residuals will be Normally Distributed.
The residuals across the Fitted Values in the upper right graph show no major differences in
variation. Random placement of the residuals is shown by the bottom right graph; there is
essentially no pattern. Looking for Normality, the bottom left graph (the Histogram) indicates we
have a bell curve, as does the upper left graph, where the residuals lie near the straight line.
Now, have we met the necessary criteria? The final confirmation is to check that no single
residual data point has an undue impact on the model.
Non-Linear Relationships Summary
Methods to find Non-linear Relationships:

–  Scatter Plot indicating curvature.
–  Unusual observations in Linear Regression model.
–  Trends of the residuals versus the Fitted Values Plot in Simple Linear Regression.
–  Subject matter expert knowledge or team experience.

When identifying Non-linear Relationships, start with the Scatter Plot: curvature in the output plotted against the input makes a Non-linear relationship evident. In step four of the Regression Analysis methodology, unusual observations prompt us to look more closely at the Fitted Line Plots to see which model best fits the historical data. To detect Non-linearity, look carefully at the Residuals vs. Fitted Values graph of a Linear Regression; clustering and/or trends in the data suggest a Non-linear Regression. A team member or expert who has prior knowledge of the process can also supply much information.
Types of Non-Linear Relationships
The simple Linear model, the quadratic model, the logarithm model and the inverse model define the more conventional relationships between outputs and inputs.
Oh, which formula to use?!
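For reference, the four models named above are commonly written as shown below. These are the usual textbook forms and are offered as an assumption; the exact notation on the original slide is not reproduced here. The symbols $\beta_0$, $\beta_1$, $\beta_2$ are the coefficients and $\varepsilon$ is the model error.

```latex
\begin{aligned}
\text{Linear:}    \quad & Y = \beta_0 + \beta_1 X + \varepsilon \\
\text{Quadratic:} \quad & Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon \\
\text{Logarithm:} \quad & Y = \beta_0 + \beta_1 \log(X) + \varepsilon \\
\text{Inverse:}   \quad & Y = \beta_0 + \beta_1 \frac{1}{X} + \varepsilon
\end{aligned}
```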
Mailing Response Example
This example will demonstrate how to use confidence and prediction intervals.
What percent discount should be offered to achieve a minimum 10% response from the mailing?
The discount is created through sales coupons being sent in the mail.
Clip em!
Open the MINITAB™ file called “Mailing Response vs. Discount.mtw”. This shows transactions by a retail store chain, giving the relationship between discount percentages and the customer response. With the input variable displayed in C1 and the output displayed in C2, Belts need to establish which discount rate will yield a 10% response from customers.
Mailing Response Scatter Plot

Graph > Scatterplot…
The output versus the input is graphically plotted with the output plotted on the Y-axis. Notice we have some curvature in the customer response.
Mailing Response Correlation
Now we are testing for a Linear relationship by running a Correlation. The Correlation is statistically significant since the P-value is less than .05. Do you notice the Pearson Correlation Coefficient is almost 1.0? That indicates a strong Linear Correlation.
Mailing Response Fitted Line Plot
This model shows a very high R-squared at 94.5%. Having noticed earlier the apparent curvature of the data, the next step is to consider a Non-linear Regression Analysis.
Note there are no unusual observations.
Even though the R-squared value is high, a Non-linear fit may be better based on the Fitted Line Plot.
Mailing Response Non-Linear Fitted Line Plot
Notice the R-squared value for the Non-linear fit increased to 98.6% from 94.5% in the Linear Regression.
We are satisfied! The application of a Non-linear Regression Model shows an increased R-squared.
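The comparison of a Linear and a Non-linear (quadratic) fit can be sketched outside MINITAB as follows. This is only an illustration under stated assumptions: the `discount` and `response` values are invented to show curvature and are not the contents of the workbook, and the helper names `solve`, `poly_fit` and `r_squared` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - tail) / m[i][i]
    return coef

def poly_fit(x, y, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] via the normal equations."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(a, b)

def r_squared(x, y, coef):
    """Share of the output variation explained by the fitted polynomial."""
    y_bar = sum(y) / len(y)
    fitted = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_error / ss_total

# Hypothetical discount (%) and response (%) values with visible curvature
discount = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
response = [1.0, 1.6, 2.5, 3.9, 5.4, 7.6, 9.9, 12.8, 16.1, 19.7]

r2_linear = r_squared(discount, response, poly_fit(discount, response, 1))
r2_quadratic = r_squared(discount, response, poly_fit(discount, response, 2))
print(f"Linear R-Sq = {r2_linear:.1%}, Quadratic R-Sq = {r2_quadratic:.1%}")
```

Because the quadratic model contains the Linear model as a special case, its R-squared can never be lower; the question is whether the increase is large enough, and the residuals clean enough, to justify the extra term.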
Confidence and Prediction Intervals
Keeping in mind the original question: the store wants 10% of the coupons redeemed by their customers, so what discount rate will generate this response?
In order to answer the original question it is necessary to evaluate the confidence and prediction intervals.
…Options
A powerful option is available in the Fitted Line Plot analysis, so click “Options” after running the “Stat > Regression > Fitted Line Plot” command. Now select “Display confidence interval” and “Display prediction interval” and leave the Confidence Level at 95%.
Take a look at what has changed in the MINITAB™ window after selecting both interval options, Confidence and Prediction. Each interval is assigned a color code: red is Confidence and green is Prediction. In the previous “Options” box we can widen or narrow the intervals by changing the Confidence Level. The Prediction Interval shows the range within which the data is expected to fall at the chosen confidence level of 95%.
To read the answer from the plot:
–  Draw a horizontal line at 10%.
–  Draw a vertical line where 10% intersects the lower prediction interval line.
With 95% confidence a discount of 18% should create at least a 10% response from the mailing.
Much importance lies upon the horizontal line; however, to answer the original question we need to decide which interval line matters most, and here it is the lower Prediction Interval line. The percentage of customers who would respond to an 18% coupon would be 10% to 23% at the 95% Confidence Level. If we had drawn the lines incorrectly, the discount chosen could have produced a response of 10% or less.
Confidence and Prediction Intervals
Having less data available to estimate the Regression Equation usually causes the Confidence Interval to flare out at the extreme ends. The true Prediction Equation is expected to be found within the red lines indicating the Confidence Interval at 95%.
The Prediction Interval is the range where a new observation is expected to fall. In this case we are 95% confident an 18% discount will yield between 10% and 23% response from the mailing.
The Confidence Interval is the range where the Prediction Equation is expected to fall. The true Prediction Equation could be different. However, given the data we are 95% confident the true Prediction Equation falls within the Confidence Interval.
For the question of yielding 10% or more, the exact Regression Equation is of lesser importance than estimating where individual responses are expected to fall. The Prediction Interval provides a degree of confidence in how the customers will respond. This estimate is of great importance.
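The difference between the two intervals can be made concrete with a short sketch. It is an illustration only: it uses a straight-line fit rather than the Non-linear fit of the mailing example, the data values are hypothetical, and the t value of 2.306 (two-sided 95%, 8 degrees of freedom) is taken from a standard t table.

```python
import math

def regression_intervals(x, y, x0, t_crit):
    """Fit, Confidence Interval and Prediction Interval at x0 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # standard deviation of the model error
    fit = b0 + b1 * x0
    core = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(core)      # range for the Prediction Equation
    pi_half = t_crit * s * math.sqrt(1 + core)  # range for one new observation
    return fit, (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)

# Hypothetical data: 10 points, so n - 2 = 8 degrees of freedom
x = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
y = [1.2, 2.9, 4.1, 6.3, 7.8, 10.2, 11.7, 14.1, 15.8, 18.4]
T_95_DF8 = 2.306  # two-sided 95% t value for 8 degrees of freedom

fit_mid, ci_mid, pi_mid = regression_intervals(x, y, 11, T_95_DF8)  # at the mean of x
fit_end, ci_end, pi_end = regression_intervals(x, y, 20, T_95_DF8)  # at the extreme end

print("CI width at the mean of x:", ci_mid[1] - ci_mid[0])
print("CI width at the extreme:  ", ci_end[1] - ci_end[0])
print("PI width at the mean of x:", pi_mid[1] - pi_mid[0])
```

The Prediction Interval is always the wider of the two because it adds the scatter of an individual observation to the uncertainty of the fitted line, and the Confidence Interval is narrowest at the mean of the input, which is the flare described above.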
Residual Analysis
To complete the example check the Residual Analysis for validation of the assumptions for Regression Analysis.
Next we confirm the validity of the model by examining the residuals, which completes step seven. A high R-squared means the model explains much of the variation in the output, but from that information alone we cannot draw the conclusion it is a sufficient model. We can have confidence in our model because all three assumptions are satisfied: the residuals are Normally Distributed, randomly distributed across the observation order and have similar variance across the Fitted Values. The store should give a discount of 18%, expecting at least a 10% response from customers.
Now, does the present data for the response fit the equation as predicted?
Transforming Process Data

In the case where data is Non-linear it is possible to perform Regression
using two different methods:
•  Non-linear Regression (already discussed)
•  Linear Regression on transformed data
Either the X or Y may be transformed.
Any statistical tool requiring transformation uses these methods.
Advantages of transforming data:

•  Linear Regression is easier to visually understand and manage.
•  Non-normal Data can be changed to resemble Normal Data for
statistical analyses where Normality is required.
Disadvantages of transforming data:

•  Difficult to understand transformed units.
•  Difficult without automation or computers.
Belts frequently find data that is not Normally Distributed or is Non-linear. We have learned how to do Non-linear Regression, but another approach is to transform the data and then use Linear Regression. Either the outputs or the inputs can be transformed. Many people will wonder “What's the point?” Simplicity is the answer, and it has a great deal of value.
Data that is asymmetric can often be transformed to make it more symmetric using a numerical
function which operates more strongly on large numbers than small ones; such as logarithms
and roots.
Transform Rules:
1.  The transform must preserve the relative order of the data.
2.  The transform must be a smooth and continuous function.
3.  Most often useful when the ratio of largest to smallest value is greater
than two. In most cases the transform will have little effect when this rule
is violated.
4.  All external reference points (spec limits, etc.) must use the same
transform.
$$x_{trans} = \begin{cases} x^{p} & p \neq 0 \\ \log(x) & p = 0 \end{cases}$$

Transformation    Power (p)
Cube              3
Square            2
No Change         1
Square Root       0.5
Logarithm         0
Reciprocal Root   -0.5
Reciprocal        -1
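The table above translates directly into a small function. This is a minimal sketch, assuming the natural logarithm for the p = 0 case and positive data; the function name `power_transform` and the sample values are hypothetical.

```python
import math

def power_transform(x, p):
    """Apply x**p for p != 0 and log(x) for p == 0, as in the transformation table."""
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    if p == 0:
        return math.log(x)  # assumption: natural log
    return x ** p

# Hypothetical right-skewed data and an upper spec limit in original units
data = [1, 4, 9, 16, 25, 100]
upper_spec = 64

sqrt_data = [power_transform(v, 0.5) for v in data]
# Transform Rule 4: reference points such as spec limits must use the same transform
sqrt_spec = power_transform(upper_spec, 0.5)

print(sqrt_data)
print(sqrt_spec)
```

The square root pulls the large value of 100 much closer to the rest of the data than the small values move, which is how a right skew is reduced.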
Effect of Transformation
Using a mathematical function we have transformed this data. This example shows how taking a
square root of this data yields a Normal distribution. The challenge then is to find the appropriate
transform function.
[Figure: two histograms of Frequency, “Before Transform” (Right Skew) and “After Transform” (Sqrt).]
The transformed data now shows a Normal Distribution.
Transforming Data Using MINITABTM
The Box Cox Transformation procedure in MINITAB™ is a method of determining the transform power (called lambda in the software) for a set of data.
Stat>Control Charts>Box-Cox Transformation
Transform.MTW
To aid the Belt in finding an appropriate transform, MINITAB™ provides a procedure known as the Box Cox Transformation.
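The idea behind the search for lambda can be sketched as a simple grid search. This is an illustration under assumptions, not a reproduction of the software's routine: it uses the standardized Box Cox transform with the geometric mean and picks the lambda giving the smallest standard deviation of the transformed values; the grid, the function names and the data are hypothetical.

```python
import math
import statistics

def standardized_sd(data, lam):
    """Standard deviation of the standardized Box Cox transform for one lambda."""
    g = math.exp(sum(math.log(x) for x in data) / len(data))  # geometric mean
    if lam == 0:
        w = [g * math.log(x) for x in data]
    else:
        w = [(x ** lam - 1) / (lam * g ** (lam - 1)) for x in data]
    return statistics.stdev(w)

def best_lambda(data, grid=None):
    """Return the lambda on the grid that minimizes the standardized standard deviation."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return min(grid, key=lambda lam: standardized_sd(data, lam))

# Hypothetical right-skewed, strictly positive data
skewed = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
lam = best_lambda(skewed)
print("Estimated lambda:", lam)
```

In practice the estimate is usually rounded to a convenient power from the transformation table, such as 0.5 for the square root or 0 for the logarithm.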
Box Cox Transform
In the upper graph MINITAB™ presents a lambda of .5. Lambda is the power applied to the data, so the selected transform is the square root ($x^{0.50}$ or $\sqrt{x}$). Notice the two probability plots in the graphs below, labeled Before Transform and After Transform. The right plot shows the new data set after it has been transformed by the square root. The left plot shows a non-Normal distribution, with red dots away from the blue line and a P-value under .05.
Using the function “Stat > Basic Statistics > Normality Test”, the change in distribution of the data can be confirmed at your discretion.
Transforming Without the Box Cox Routine

Transform.MTW
An alternative method of transforming data is to use standard transforms. The square root and natural log transforms are most commonly used. A disadvantage of using the Box Cox transformation is the difficulty in reversing the transformation.
The column of process data is in C1, labeled Pos Skew. Remember this data was not Normally Distributed as determined with the Anderson Darling Normality test.
Using the MINITAB™ calculator, calculate the square root of each observation in C1 and store in C3, calling it Square Root.
The “Calc > Calculator” command in MINITAB™ can aid you in doing a transformation. Type a new column name in “Store result in variable:”. Next, place the cursor in the “Expression:” box, search for the name of the function in the lower right area of the window and double click it. Before executing the transformation make sure the word “number” is highlighted within the function, then select the column to transform so it appears in the “Expression:” box. Once you click the “OK” button the transformed data will show alongside the unchanged data.
Transforming Without the Box Cox Routine

Transform.MTW
The output should resemble this view.
Confirm if the new data set found in C3 is Normally Distributed.
Our transform is the square root - the same as the Box Cox transform of lambda = 0.5
For the majority of MINITAB™ commands the order of columns is unimportant, so it is not a problem if the square root data set appears in a different column. After creating the transformed data set in the column labeled “Square Root”, it is necessary to confirm the new data is Normally Distributed. The “Stat > Basic Statistics > Normality Test” command from the Measure Phase is now of great importance. Interestingly enough, the Box Cox routine found the best transformation was the same square root we executed.
Multiple Linear Regression

Multiple Linear Regression investigates the effect of multiple input variables on an output simultaneously. It is used when:
–  R² is not as high as desired in the Simple Linear Regression.
–  Process knowledge implies more than one input affects the output.
The assumptions for residuals with Simple Regressions are still necessary for Multiple Linear Regressions.
An additional assumption for MLR is the independence of predictors (X’s).
–  MINITAB™ can test for multicollinearity (Correlation between the predictors or X’s).
Model error (residuals) is impacted by the addition of measurement error for all the input variables.
In review, we do Regression on historical data; Regression is not applied to experimental data. So far we have covered Regression involving one input and one output. Multiple Linear Regression, when applicable, allows us to model one output and more than one input at the same time. If you have not explained enough of the output variation, recall that R-squared measures the amount of variation in the output explained by the input you selected. In the Multiple Linear Regression equation we assume each input is independent of the others, meaning no Correlation exists between them. Having the inputs independent of one another gives each its own slope. The epsilon at the end of the equation represents the fact that every Regression has model error.
Definitions of MLR Equation Elements

The definitions for the elements of the Multiple Linear Regression model are as follows:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$$
$Y$: The response (dependent) variable.
$X_1, X_2, X_3$: The predictor (independent) inputs. The predictor variables used to explain the variation in the observed response variable, $Y$.
$\beta_0$: The value of $Y$ when all the explanatory variables (the X’s) are equal to zero.
$\beta_1, \beta_2, \beta_3$ (Partial Regression Coefficients): The amount by which the response variable ($Y$) changes when the corresponding $X_i$ changes by one unit with the other input variables remaining constant.
$\varepsilon$ (Error or Residual): The observed $Y$ minus the predicted value of $Y$ from the Regression.
The Simple Linear and Multiple Linear Regression equations are very similar. In Multiple Linear Regression each input has its own partial Regression Coefficient, whereas Simple Linear Regression has only beta zero and beta one. Do you recall the residuals from the Regressions we did earlier in this module? Residuals are defined as the observed value minus the predicted value.
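The elements defined above can be seen at work in a small least-squares sketch. It is an illustration under assumptions: the data is constructed from known coefficients (constant 1, slopes 2 and 3) so the fit can be checked by eye, and the helper names `solve`, `fit_mlr` and `predict` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - tail) / m[i][i]
    return coef

def fit_mlr(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk]; rows holds the X values per observation."""
    design = [[1.0] + list(r) for r in rows]  # the leading 1 carries the constant b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, row):
    """Predicted Y for one observation."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

# Hypothetical observations of (X1, X2), with Y built as 1 + 2*X1 + 3*X2
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coef = fit_mlr(rows, y)
# Residual = observed Y minus predicted Y
residuals = [yi - predict(coef, r) for yi, r in zip(y, rows)]
print([round(c, 6) for c in coef])
```

With real process data the residuals would not be zero; they would carry the model error discussed above and would be checked against the Regression assumptions.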
MLR Step Review

The basic steps to follow in Multiple Linear Regression are:
1.  Create matrix plot (Graph>Matrix Plot)
2.  Run Best Subsets Regression (Stat>Regression>Best Subsets)
3.  Evaluate R², adjusted R², Mallows Cp, number of predictors and S.
4.  Iteratively determine appropriate regression model. (Stat>Regression>Regression>Options)
5.  Analyze residuals (Stat>Regression>Regression>Graphs)
a.  Normally Distributed
b.  Equal variance
c.  Independence
d.  Confirm one or two points do not overly influence model
6.  Verify your model by running present process data to confirm your model error.
With many different input variables on hand and only one output, it can be tedious to find which inputs the variation comes from. Using a Matrix Plot can greatly speed the process because it shows which inputs appear to impact the output the most. After narrowing the field of variables, use the Best Subsets command to begin the Multiple Linear Regression. We identify candidate models by examining R-squared, adjusted R-squared, the number of predictors, S and Mallows Cp. Following this we must iteratively confirm the inputs are statistically significant. Once the residuals confirm the model is valid we MUST, especially for Multiple Linear Regressions, verify the model against the presently performing process.
Multiple Linear Regression Model Selection
When comparing and verifying models consider the following:

1.  There should be a reasonably small difference between R² and adjusted R² (much less than 10% difference).
2.  When more terms are included in the model does the adjusted R² increase?
3.  Use the statistic Mallows Cp. It should be small and close to the number of terms in the model (the predictors plus the constant).
4.  Models with smaller S (Standard Deviation of error for the model)
are desired.
5.  Simpler models should be weighed against models with multiple
predictors (independent variables).
6.  The best technique is to use MINITAB™’s Best Subsets command.
Using “Best Subsets Regression” we are given multiple statistics by MINITAB™. It is in our best interest to use the simplest Multiple Linear Regression model that satisfies these guidelines.
Flight Regression Example
An airplane manufacturer wanted to see what variables affect flight speed. The historical data available covered a period of 10 months.
Graph > Matrix Plot…
Flight Regression MLR.MTW
Open the MINITAB™ file “Flight Regression MLR.mtw” to see the historical data being analyzed by the airplane manufacturer. The output is flight speed and the other columns contain the input variables. With these we will build a Matrix Plot to look for possible relationships among the variables. In the “Graph variables:” box we enter all inputs and the output.
Flight Regression Example Matrix Plot

Look for plots that show Correlation.
The Matrix Plot shows the Output Response against each of the Predictors. Now we are given a fairly confusing graph of outputs and inputs to interpret. Do not be discouraged; it is just a collection of Scatter Plots of each pair of variables, for example flight speed vs. altitude. Seeing at least two inputs showing Correlation with the output indicates the need to continue with a Multiple Linear Regression. The lower half of the matrix has identical data to the upper half, just with the axes reversed.
Since two or more predictors show Correlation, run MLR.
Flight Regression Example Best Subsets
Best Subsets Regression: Flight Speed versus Altitude, Turbine Angl, ...
Response is Flight Speed
Predictor columns, in order: Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp

Vars  R-Sq  R-Sq(adj)  Mallows C-p  S       Predictors included (X)
1     72.1  71.1       38.4         28.054  X
1     39.4  37.2       112.8        41.358  X
2     85.9  84.8       9.0          20.316  X X
2     82.0  80.6       17.9         22.958  X X
3     87.5  85.9       7.5          19.561  X X X
3     86.5  84.9       9.6          20.267  X X X
4     89.1  87.3       5.7          18.589  X X X X
4     88.1  86.1       8.2          19.481  X X X X
5     89.9  87.7       6.0          18.309  X X X X X
In MINITAB™ the “Best Subsets Regression” command is efficient and powerful since it evaluates combinations of all the inputs against a single output. Place the output column of data in the “Response:” box and all inputs of interest in the “Free predictors:” box. The command has other options that can be helpful in other circumstances, but they are not needed right now. When the evaluation is done the results are given in rows: 1st column - number of variables, 2nd column - R-squared, 3rd column - adjusted R-squared, 4th column - Mallows Cp, 5th column - Standard Deviation of the model error and finally the remaining columns - the input variables included.
Flight Regression Example Model Selection
List of all the Predictors (X’s): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp.
What model would you select?
Let’s consider the 5 predictor model:
•  Highest R-Sq(adj)
•  Low Mallows Cp (6.0, close to the lowest value of 5.7)
•  Lowest S
•  However there are many terms
In choosing a model, our attention first goes to the bottom row, the 5 term Linear Regression. Are all the terms statistically significant?
Stat>Regression>Regression…
…Options
Let’s go back to “Stat>Regression>Regression” and click on the “Options” button. Place the output in the “Response:” box and the inputs in the “Predictors:” box.
Flight Regression Example Model Selection
Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...
The regression equation is
Flight Speed = 770 + 0.153 Altitude + 5.81 Turbine Angle + 8.70 Fuel/Air ratio
- 52.3 ICR + 4.11 Temp

Predictor       Coef     SE Coef  T      P      VIF
Constant        770.4    229.7    3.35   0.003
Altitude        0.15318  0.06605  2.32   0.030  2.3
Turbine Angle   5.806    2.843    2.04   0.053  1.4
Fuel/Air ratio  8.696    3.327    2.61   0.016  3.2
ICR             -52.269  6.157    -8.49  0.000  2.6
Temp            4.107    3.114    1.32   0.200  5.4

S = 18.3088 R-Sq = 89.9% R-Sq(adj) = 87.7%
The VIF for temp indicates it should be removed from the model. Go back to the Best Subsets analysis and select the best model that does not include the predictor temp.
Variance Inflation Factor (VIF) detects Correlation among predictors.
•  VIF = 1 indicates no relation among predictors
•  VIF > 1 indicates predictors are correlated to some degree
•  VIF of 5 to 10 or more indicates Regression Coefficients are poorly estimated and are unacceptable
Do you notice anything new here? A new column has appeared labeled VIF. This indicates whether a high Correlation among inputs exists. Temp has a high VIF (5.4) so we will remove it.
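The VIF has a simple definition that helps in reading the column: for each predictor, regress it on the other predictors, take that R-squared, and compute 1 / (1 - R-squared). The sketch below applies the standard formula; the function names are hypothetical, and the back-calculated R-squared for Temp is derived from the VIF of 5.4 in the output rather than taken from the software.

```python
def vif(r_squared_with_other_predictors):
    """Variance Inflation Factor from the R-squared of one predictor regressed on the others."""
    return 1.0 / (1.0 - r_squared_with_other_predictors)

def r_squared_from_vif(vif_value):
    """Invert the formula: how much of a predictor is explained by the other predictors."""
    return 1.0 - 1.0 / vif_value

print(vif(0.0))                 # 1.0: the predictor is unrelated to the others
print(vif(0.8))                 # about 5: the lower end of the unacceptable range
print(r_squared_from_vif(5.4))  # about 0.81 for Temp
```

Read this way, a VIF of 5.4 for Temp says roughly 81% of the variation in Temp is explained by the other four predictors, which is why its coefficient is poorly estimated.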
Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...
Flight Speed = 616 + 0.117 Altitude + 6.70 Turbine Angle + 12.2 Fuel/Air ratio
- 48.2 ICR

Predictor       Coef     SE Coef  T      P      VIF
Constant        616.1    200.7    3.07   0.005
Altitude        0.11726  0.06109  1.92   0.067  1.9
Turbine Angle   6.702    2.802    2.39   0.025  1.3
Fuel/Air ratio  12.151   2.082    5.84   0.000  1.2
ICR             -48.158  5.391    -8.93  0.000  1.9

S = 18.5889 R-Sq = 89.1% R-Sq(adj) = 87.3%
The VIF values are NOW acceptable.
Evaluate the P-values.
•  If p > 0.05 the term(s) should be removed from the Regression.
Remove altitude, re-run model.
After removing Temp we rerun the “Stat>Regression>Regression” command with the four remaining terms. We want 95% confidence and the P-value for Altitude (0.067) is greater than 0.05, so we rerun the Multiple Linear Regression after removing Altitude.
Flight Regression Example Model Selection (cont.)
Note: It is not necessary to re-run the Best Subsets analysis. The numbers do not change.
Select a model with 4 terms because Temp was removed as a predictor since it had Correlation with the other variables. Re-run the Regression.
To start step four we consider the Regression Model that does not include Temp. The Best Subsets results still apply, so we need not rerun that command.
Regression Analysis: Flight Speed versus Turbine Angl, Fuel/Air rat, ICR
Flight Speed = 887 + 4.82 Turbine Angle + 12.1 Fuel/Air ratio - 55.0 ICR

Predictor       Coef     SE Coef  T       P      VIF
Constant        886.6    150.4    5.90    0.000
Turbine Angle   4.822    2.763    1.75    0.093  1.1
Fuel/Air ratio  12.106   2.191    5.53    0.000  1.2
ICR             -55.009  4.251    -12.94  0.000  1.1

S = 19.5613 R-Sq = 87.5% R-Sq(adj) = 85.9%
The P-value for Turbine Angle now indicates it should be removed and the Regression re-run because P > 0.05
Here we have removed Altitude from the “Predictors:” box and the Regression output now shows the Turbine Angle is not statistically significant.
Flight Regression Final Regression Model
Regression Analysis: Flight Speed versus Fuel/Air ratio, ICR
Flight Speed = 1101 + 10.9 Fuel/Air ratio - 55.2 ICR

Predictor       Coef     SE Coef  T       P      VIF
Constant        1101.04  90.00    12.23   0.000
Fuel/Air ratio  10.921   2.163    5.05    0.000  1.1
ICR             -55.197  4.414    -12.51  0.000  1.1

S = 20.3162 R-Sq = 85.9% R-Sq(adj) = 84.8%
This is the final Regression model because all remaining terms are statistically significant (we wanted 95% confidence or P-value < 0.05) and the R-Sq shows the remaining terms explain 85.9% of the variation of flight speed.

Analysis of Variance
Source          DF  SS     MS     F      P
Regression      2   65500  32750  79.35  0.000
Residual Error  26  10731  413
Total           28  76231

Source          DF  Seq SS
Fuel/Air ratio  1   951
ICR             1   64549
Note the ICR predictor accounts for 84.7% of the variation.
84.7% = 64549/76231

Unusual Observations
Obs  Fuel/Air ratio  Flight Speed  Fit     SE Fit  Residual  St Resid
1    40.6            618.00        624.29  11.55   -6.29     -0.38 X
22   36.3            578.00        524.45  5.43    53.55     2.74R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
Consider removing this Outlier but be careful, this is historical data that has no further information.
Remember the objective is to get information to be used in a Designed Experiment where true cause and effect relationships can be established.
Shown here is the entire Regression output for a complete discussion of the final Multiple Linear Regression model. We have two predictor variables and both are statistically significant.
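Several statistics in this output can be reproduced by hand from the Analysis of Variance table, which is a useful check on how they are defined. The sketch below uses only numbers printed in this example (sums of squares, degrees of freedom and the S of the 5 predictor model); the function names are hypothetical and the formulas are the standard textbook definitions, assumed to match what the software reports.

```python
def r_sq(ss_regression, ss_total):
    """Share of the total variation explained by the Regression."""
    return ss_regression / ss_total

def adjusted_r_sq(r2, n, predictors):
    """R-squared penalized for the number of predictors in the model."""
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)

def mallows_cp(ss_error, mse_full, n, predictors):
    """Mallows Cp, using the mean square error of the model with all predictors."""
    return ss_error / mse_full - n + 2 * (predictors + 1)

n = 29                    # Total DF of 28 means 29 observations
s_full = 18.3088          # S of the model with all 5 predictors

r2 = r_sq(65500, 76231)                      # final model: Fuel/Air ratio and ICR
adj = adjusted_r_sq(r2, n, predictors=2)
cp = mallows_cp(10731, s_full ** 2, n, predictors=2)
icr_share = 64549 / 76231                    # sequential sum of squares for ICR

print(f"R-Sq = {r2:.1%}")
print(f"R-Sq(adj) = {adj:.1%}")
print(f"Mallows Cp = {cp:.1f}")
print(f"ICR share = {icr_share:.1%}")
```

The results agree with the 2 predictor row of the Best Subsets table (R-Sq 85.9, R-Sq(adj) 84.8, Mallows Cp 9.0) and with the 84.7% noted for ICR.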
Flight Regression Example Residual Analysis
Now having a final model it is VITAL to confirm the residuals satisfy the assumptions and the model is valid. How do we do this? Use the “Graphs” option of the Regression command to plot and analyze the residuals.
Flight Regression Example Residual Analysis (cont.)
•  Normally Distributed Residuals (Normal Probability Plot)

•  Equal Variance (Residuals vs. Fitted Values)
•  Independence (Residuals vs. Order of Data)
It appears our model is valid and the residuals are satisfactory!
§  Perform Non-Linear Regression Analysis
§  Perform Multiple Linear Regression Analysis (MLR)
§  Examine Residuals Analysis and understand its effects
You have now completed Improve Phase – Advanced Process Modeling.
Lean Six Sigma

Green Belt Training
Improve Phase
Now we are going to continue with the Improve Phase “Designing Experiments”.
Overview
Within this module we will provide an introduction to Design of Experiments, explain what they are, how they work and when to use them.
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
–  Reasons for Experiments
–  Graphical Analysis
–  DOE Methodology
Project Status Review
•  Understood our problem and its impact on the business. (Define)
•  Established firm objectives/goals for improvement. (Define)
•  Quantified our output characteristic. (Define)
•  Validated the measurement system for our output characteristic. (Measure)
•  Identified the process input variables in our process. (Measure)
•  Narrowed our input variables to the potential X’s through Statistical Analysis. (Analyze)
•  Selected the vital few X’s to optimize the output response(s). (Improve)
•  Quantified the relationship of the Y’s to the X’s with Y = f(x). (Improve)
Six Sigma Strategy
[Figure: funnel narrowing the many input variables (X1 to X11) down to the vital few. Sources feeding the funnel: Customers, Suppliers, Contractors, Employees, Inputs, Outputs. Tools listed from top to bottom of the funnel: SIPOC, VOC, Project Scope; P-Map, X-Y Matrix, FMEA, Capability; Box Plot, Scatter Plots, Regression; Fractional Factorial, Full Factorial, Center Points.]
This picture should be familiar by now. By using the tools we filter the variables that cause defects. In the Improve Phase of the Six Sigma methodology we use Designed Experiments in many settings: transactional, manufacturing and research.
Reasons for Experiments

The Analyze Phase narrowed down the many inputs to a critical few. Now it is necessary to determine the proper settings for these few inputs because:
–  The vital few potentially have interactions.
–  The vital few will have preferred ranges to achieve optimal results.
–  Confirm cause and effect relationships among factors identified in Analyze
Phase (e.g. Regression)
Understanding the reason for an experiment can help in selecting the
design and focusing the efforts of an experiment.
Reasons for experimenting are:
–  Problem Solving (Improving a process response)
–  Optimizing (Highest yield or lowest customer complaints)
–  Robustness (Constant response time)
–  Screening (Further screening of the critical few to the vital few X’s)
Design where you’re going - be sure you get there!
Designs of Experiments help the Belt to understand the cause and effect between the process output or outputs of interest and the vital few inputs. Some of these causes and effects may include the impact of interactions, often referred to as synergistic or cancelling effects.
Desired Results of Experiments

Designed Experiments allow us to describe a mathematical relationship between the inputs and outputs. However, often the mathematical equation is not necessary or used depending on the focus of the experiment.
Problem Solving
–  Eliminate defective products or services.
–  Reduce cycle time of handling transactional processes.
Optimizing
–  Mathematical model is desired to move the process response.
–  Opportunity to meet differing customer requirements (specifications or VOC).
Robust Design
–  Provide consistent process or product performance.
–  Desensitize the output response(s) to input variable changes including NOISE variables.
–  Design processes knowing which input variables are difficult to maintain.
Screening
–  Past process data is limited or statistical conclusions prevented good narrowing of critical factors in Analyze Phase.
When it rains it PORS!
DOE Models versus Physical Models

Here we have models that are the results of Designed Experiments. Many have difficulty distinguishing DOE models from physical models. A physical model draws on biology, chemistry or physics, usually involves many variables and typically needs complex mathematics such as calculus to describe. DOE models do not include any complex calculus: they include the most important variables and describe the variation of the data collected. DOE will focus on the region of interest.
What are the differences between DOE modeling and physical models?
–  A physical model is known by theory using concepts of physics, chemistry, biology, etc...
–  Physical models explain areas outside the immediate project needs and include more variables than typical DOE models.
–  DOE describes only a small region of the experimental space.
The objective is to minimize the response. The physical model is not important for our business objective. The DOE Model will focus in the region of interest.
Definition for Design of Experiments
Design of Experiments (DOE) is a scientific method of planning and conducting an experiment that will yield the true cause and effect relationship between the X variables and the Y variables of interest.
DOE allows the experimenter to study the effect of many input variables that may influence the product or process simultaneously, as well as possible interaction effects (for example synergistic effects).
The end result of many experiments is to describe the results as a mathematical function.
Y = f (x)
The goal of DOE is to find a design that will produce the information required at a minimum cost.
Properly designed DOE’s are more efficient experiments.
Design of Experiments shows the cause and effect relationship between the variables of interest, X and Y. The input variables for Designed Experiments are identified in the Analyze Phase and the experiments are then executed in the Improve Phase. DOE tightly controls the input variables and carefully monitors the uncontrollable variables.
One Factor at a Time is NOT a DOE
Let’s assume a Belt has One Factor at a Time (OFAT) is an experimental style but not a planned
found in the Analyze Phase experiment or DOE.
that pressure and The graphic shows yield contours for a process that are unknown to the
experimenter.
temperature impact his Trial Temp Press Yield
process and no one knows Yield Contours Are 1 125 30 74
75
what yield is achieved for the Unknown To Experimenter 2 125 31 80
3 125 32 85
possible temperature and 4 125 33 92
pressure combinations. 80 5 125 34 86
6 130 33 85
Pressure (psi)
7 120 33 90
If a Belt inefficiently did a One 135
6
85
Factor at a Time experiment 130

1 2 3
90
5 Optimum identified
4
(referred to as OFAT) one 125
with OFAT
95
variable would be selected to 120 7
change first while the other

variable is held constant. True Optimum available
Once the desired result was
30 31 32 33 34 35 with DOE
Temperature (C)
observed the first variable is
set at that level and the second variable is changed. Basically you pick the winner of the
combinations tested.
Each curve on the graph above represents a constant process yield, which is what the Belt would see
if the theoretical relationship between temperature, pressure and the process output were known.
These contour lines are familiar if you have ever gone hiking in the mountains and looked at an
elevation map which shows contours of constant elevation. As a test we decided to increase
temperature to achieve a higher yield. After achieving a maximum yield with temperature we decided
to change the other factor, pressure. We then came to the conclusion the maximum yield is near 92%
because it was the highest yield noted in our seven trials.
With the Six Sigma methodology we use DOE which would have found a higher yield using
equations. Many sources state that OFAT experimentation is inefficient when compared with DOE
methods. Some people call it hit or miss. Luck has a lot to do with results using OFAT methods.
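The "pick the winner" logic of OFAT can be replayed in a few lines of Python. This is only a sketch built on the seven trials in the table; the variable names are ours, and the column labels follow the table as printed.

```python
# OFAT trials from the table: (trial, temp, press, yield_pct).
# Column labels follow the table as printed in the text.
trials = [
    (1, 125, 30, 74),
    (2, 125, 31, 80),
    (3, 125, 32, 85),
    (4, 125, 33, 92),
    (5, 125, 34, 86),
    (6, 130, 33, 85),
    (7, 120, 33, 90),
]

# OFAT simply picks the winner among the combinations that were run.
best = max(trials, key=lambda t: t[3])

# Compare the settings actually tested with the full grid of the same levels.
tested = {(t[1], t[2]) for t in trials}
temps = sorted({t[1] for t in trials})
presses = sorted({t[2] for t in trials})
grid_size = len(temps) * len(presses)

print(f"Best trial: {best[0]} with yield {best[3]}%")
print(f"Tested {len(tested)} of {grid_size} combinations on the same grid")
```

Every tested point lies on one of two straight lines through trial 4, so any higher yield that sits off those lines is never seen. That is the weakness the contour graphic illustrates.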
Types of Experimental Designs
The most common types of DOE’s are:
–  Fractional Factorials
•  4-15 input variables
–  Full Factorials
•  2-5 input variables
–  Response Surface Methods (RSM)
•  2-4 input variables
[Figure labels: Fractional Factorials, Full Factorial, Response Surface]
DOE is iterative in nature and may require more than one experiment at times. As we learn more
about the important variables our approach will change as well. If we have a very good
understanding of our process maybe we will only need one experiment, if not we very well may need
a series of experiments.
Fractional Factorials or screening designs are used when the process or product knowledge is low.
We may have a long list of possible input variables (often referred to as factors) and need to screen
them down to a more reasonable or workable level.
Full Factorials are used when it is necessary to fully understand the effects of interactions and when
there are between 2 and 5 input variables.
Response surface methods (not typically applicable) are used to optimize a response typically when
the response surface has significant curvature.
Value Chain
The general notation used to designate a full factorial design is given by:
2^k
–  Where k is the number of input variables or factors.
–  2 is the number of levels that will be used for each factor.
•  Quantitative or qualitative factors can be used.
Full factorial designs are generally noted as 2 to the k where k is the number of input variables or
factors and 2 is the number of levels used for all factors. In the table two levels and four factors are
shown; by using the formula how many runs would be involved in this design? 16 is the answer, of
course.
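The run count can be checked by listing every treatment combination. The sketch below uses Python's standard `itertools.product`; the function name `full_factorial_runs` is our own and is not a MINITAB command.

```python
from itertools import product

def full_factorial_runs(k, levels=(-1, 1)):
    """Return every treatment combination for k factors at the given levels."""
    return list(product(levels, repeat=k))

# Four factors at two levels each, as in the question above.
runs = full_factorial_runs(4)
print(len(runs))  # number of runs in a 2^4 design
```

Each added factor doubles the number of runs, which is why Full Factorials are usually limited to a handful of input variables.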
Visualization of 2 Level Full Factorial
[Figure: three views of the 2^2 design. A square with Temp (300F to 350F) on the horizontal axis and
Press (500 to 600) on the vertical axis, with corners labeled (-1,-1), (+1,-1), (-1,+1) and (+1,+1);
a diagram of uncoded levels for factors; and a table of coded levels for factors.]
Coded levels for factors:
T     P     T*P
-1    -1    +1
+1    -1    -1
-1    +1    -1
+1    +1    +1
Four experimental runs:
•  Temp = 300, Press = 500
•  Temp = 350, Press = 500
•  Temp = 300, Press = 600
•  Temp = 350, Press = 600
Let’s consider a 2 squared design which means we have 2 levels for 2 factors. The factors of interest
are temperature and pressure. There are several ways to visualize this 2 level Full Factorial design.
In experimenting we often use what is called coded variables. Coding simplifies the notation. The low
level for a factor is minus one, the high level is plus one. Coding is not very friendly when trying to
run an experiment so we use uncoded or actual variable levels. In our example 300 degrees is the low
level, 350 degrees is the high level for temperature.
Back when we had to calculate the effects of experiments by hand it was much simpler to use coded
variables. Also when you look at the Prediction Equation generated you could easily tell which
variable had the largest effect. Coding also helps us explain some of the math involved in DOE.
Fortunately for us MINITAB™ calculates the equations for both coded and uncoded data.
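The link between coded and uncoded levels is a simple linear scaling around the midpoint of each factor. The sketch below uses the 300/350 temperature and 500/600 pressure levels from the example; the function names are illustrative and this is not a description of how MINITAB™ does it internally.

```python
def to_uncoded(coded, low, high):
    """Map a coded level (-1 to +1) to the actual process setting."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

def to_coded(actual, low, high):
    """Map an actual process setting back to its coded level."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return (actual - center) / half_range

# The four runs of the 2^2 design as (T, P) in coded units.
coded_runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
for t, p in coded_runs:
    temp = to_uncoded(t, 300, 350)
    press = to_uncoded(p, 500, 600)
    print(f"T={t:+d}, P={p:+d}, T*P={t * p:+d} -> Temp = {temp:.0f}, Press = {press:.0f}")
```

The T*P column is just the product of the two coded columns, which is one reason coded units make the hand calculations easier.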
Graphical DOE Analysis - The Cube Plot
Consider a 2^3 design on a catapult...
Run      A: Start   B: Stop              Response:
Number   Angle      Angle     C: Fulcrum   Meters Traveled
1        -1         -1        -1           2.10
2         1         -1        -1           0.90
3        -1          1        -1           3.35
4         1          1        -1           1.50
5        -1         -1         1           5.15
6         1         -1         1           2.40
7        -1          1         1           8.20
8         1          1         1           4.55
[Figure: cube plot with axes Start Angle, Stop Angle and Fulcrum; corner values 2.1, 0.9, 3.35, 1.5,
5.15, 2.4, 8.2 and 4.55.]
What are the inputs being manipulated in this design?
How many runs are there in this experiment?
The representation here is a 2 cubed design, with 2 levels of three factors, and shows a treatment
combination table using coded input level settings. The table has 8 experimental runs. Run 5 shows
the start angle and stop angle at the low level and the fulcrum at the high level.
Graphical DOE Analysis - The Cube Plot (cont.)
This graph is used by the experimenter to visualize how the response data is distributed across the
experimental space.
Stat>DOE>Factorial>Factorial Plots … Cube, select response and factors
[Figure: MINITAB™ cube plot for Catapult.mtw, with callouts "How do you read or interpret this plot?"
and "What are these?"]
MINITAB™ generates various plots, the cube plot is one. Open the MINITAB™ worksheet
“Catapult.mtw”.
This cube plot is a 2 cubed design for a catapult using three variables:
Start Angle
Stop Angle
Fulcrum
Here we used coded variable level settings so we do not know what the actual process settings
were in uncoded units.
The data means for the response distances are on the corners of the cube. If we set the stop angle
high, start angle low and fulcrum high we would expect to launch a ball about 8.2 meters with the
catapult. Make sense?
Graphical DOE Analysis - The Main Effects Plot
This graph is used to see the relative effect of each factor on the output response.
Stat>DOE>Factorial>Factorial Plots … Main Effects, select response and factors
[Figure: Main Effects Plot for the catapult, with the callout "Hint: Check the slope!"]
The Main Effects Plots shown here display the effect that the input values have on the output
response. The y axis is the same for each of the plots so they can be compared side by side.
Which has the steepest Slope? Which factor has the largest impact on the output?
Answer: Fulcrum
Main Effects Plot’s Creation
Avg. distance at Low Setting of Start Angle: (2.10 + 3.35 + 5.15 + 8.20)/4 = 18.80/4 = 4.70
Avg. distance at High Setting of Start Angle: (0.90 + 1.50 + 2.40 + 4.55)/4 = 9.35/4 = 2.34
[Figure: Main Effects Plot (data means) for Distance, with panels for Start Angle, Stop Angle and
Fulcrum, each at levels -1 and 1; vertical axis Dist from 2.0 to 5.2.]
Run #   Start Angle   Stop Angle   Fulcrum   Distance
1       -1            -1           -1        2.10
2        1            -1           -1        0.90
3       -1             1           -1        3.35
4        1             1           -1        1.50
5       -1            -1            1        5.15
6        1            -1            1        2.40
7       -1             1            1        8.20
8        1             1            1        4.55
In order to create the Main Effects Plot we must be able to calculate the average response at the low
and high levels for each Main Effect. The coded values are used to show which responses must be
used to calculate the average.
Let’s review what is happening here. How many experimental runs were operated with the start angle
at the high level, or 1? The answer is 4: run numbers 2, 4, 6 and 8. If we take the 4 distances, or
process outputs, from those runs and average them, we see the average distance when the process had
the start angle running at the high level was 2.34 meters. The second dot from the left in the Main
Effects Plot shows the distance of 2.34 with the start angle at a high level.
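The same averaging can be repeated for all three factors. The sketch below works from the catapult table in coded units and takes the main effect as the high-level average minus the low-level average, which is the usual convention for 2 level designs; it is an illustration in Python, not MINITAB™ output.

```python
# Catapult 2^3 data: (start_angle, stop_angle, fulcrum, distance)
runs = [
    (-1, -1, -1, 2.10),
    ( 1, -1, -1, 0.90),
    (-1,  1, -1, 3.35),
    ( 1,  1, -1, 1.50),
    (-1, -1,  1, 5.15),
    ( 1, -1,  1, 2.40),
    (-1,  1,  1, 8.20),
    ( 1,  1,  1, 4.55),
]
factors = ["Start Angle", "Stop Angle", "Fulcrum"]

def level_mean(col, level):
    """Average distance over the runs where factor `col` is at `level`."""
    values = [r[3] for r in runs if r[col] == level]
    return sum(values) / len(values)

main_effects = {}
for col, name in enumerate(factors):
    low, high = level_mean(col, -1), level_mean(col, 1)
    main_effects[name] = high - low  # slope of the line in the Main Effects Plot
    print(f"{name}: low = {low:.2f}, high = {high:.2f}, effect = {high - low:+.2f}")

largest = max(main_effects, key=lambda n: abs(main_effects[n]))
print("Largest main effect:", largest)
```

The factor with the largest absolute effect is the one with the steepest line on the plot.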
Interaction Definition
Interactions occur when variables act together to impact the output of the process. Interaction plots
are constructed by plotting both variables together on the same graph. They take the form of this
graph. Note the relationship between variables A and Y changes as the level of variable B changes.
When B is at its high (+) level variable A has almost no effect on Y. When B is at its low (-) level A
has a strong effect on Y. The feature of interactions is non-parallelism between the two lines.
[Figure: interaction plot of Output Y (Lower to Higher) against A (- to +), with one line for B- and
one for B+. Callouts: "When B changes from low to high the output drops dramatically." and "When B
changes from low to high the output drops very little."]
Degrees of Interaction Effect
[Figure: five interaction plots of Y (Low to High) against A (- to +), each with lines for B- and B+,
titled Some Interaction, No Interaction, Full Reversal, Strong Interaction and Moderate Reversal.]
Degrees of interaction can be related to non-parallelism and the more non-parallel the lines are the
stronger the interaction.
A common misunderstanding is that the lines must actually cross each other for an interaction to
exist but that is NOT true. The lines may cross at some level OUTSIDE of the experimental region but
we really do not know that. Parallel lines show absolutely no interaction and in all likelihood will
never cross.
Interaction Plot Creation
[Figure: Interaction Plot (data means) for Distance. Horizontal axis Fulcrum (-1, 1); vertical axis
Mean (1.5 to 6.5); one line per Start Angle level (-1, 1). Annotated points:
(0.90 + 1.50)/2 = 1.20 and (4.55 + 2.40)/2 = 3.48.]
Run #   Start Angle   Stop Angle   Fulcrum   Distance
1       -1            -1           -1        2.10
2        1            -1           -1        0.90
3       -1             1           -1        3.35
4        1             1           -1        1.50
5       -1            -1            1        5.15
6        1            -1            1        2.40
7       -1             1            1        8.20
8        1             1            1        4.55
Calculating the points to plot the interaction is not as straightforward as it was in the Main
Effects Plot. Here we have four points to plot and since there are only 8 data points each average
will be created using data points from two experimental runs.
This plot is the interaction of Fulcrum with Start Angle on the distance. Starting with the point
indicated with the green arrow above we must find the response data when the fulcrum is set low and
start angle is set high (notice the color coding MINITAB™ uses in the upper right hand corner of the
plot for the second factor). The point indicated with the purple arrow is where fulcrum is set high
and start angle is high. Take a few moments to verify the remaining two points plotted.
Let’s review what is happening here. The dot indicated by the green arrow is the Mean distance when
the fulcrum is at the low level as indicated by a -1 and when the start angle is at the high level as
indicated by a 1. Experimental runs 2 and 4 had the process running at those conditions so the
distance from those two experimental runs is averaged and plotted in reference to a value of 1.2 on
the vertical axis. You can note the red dotted line shown is for when the start angle is at the high
level as indicated by a 1.
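To verify all four plotted points, the runs can be grouped by their Fulcrum and Start Angle settings and averaged. The sketch below does this in Python from the catapult table; the slope comparison at the end is our own simple way of showing non-parallelism and is not a MINITAB™ statistic.

```python
from collections import defaultdict

# Catapult 2^3 data: (start_angle, stop_angle, fulcrum, distance)
runs = [
    (-1, -1, -1, 2.10),
    ( 1, -1, -1, 0.90),
    (-1,  1, -1, 3.35),
    ( 1,  1, -1, 1.50),
    (-1, -1,  1, 5.15),
    ( 1, -1,  1, 2.40),
    (-1,  1,  1, 8.20),
    ( 1,  1,  1, 4.55),
]

# Group distances by (fulcrum, start_angle); stop angle is averaged over.
cells = defaultdict(list)
for start, _stop, fulcrum, dist in runs:
    cells[(fulcrum, start)].append(dist)
means = {key: sum(v) / len(v) for key, v in cells.items()}

for (fulcrum, start), m in sorted(means.items()):
    print(f"Fulcrum = {fulcrum:+d}, Start Angle = {start:+d}: mean distance = {m:.3f}")

# Change in mean distance from Fulcrum low to high, for each Start Angle line.
slope_start_low = means[(1, -1)] - means[(-1, -1)]
slope_start_high = means[(1, 1)] - means[(-1, 1)]
print(f"Slope at Start Angle low:  {slope_start_low:.3f}")
print(f"Slope at Start Angle high: {slope_start_high:.3f}")
```

The two slopes differ, so the lines are not parallel, which is the visual signature of an interaction described earlier.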
Graphical DOE Analysis - The Interaction Plots
Stat>DOE>Factorial>Factorial Plots … Interactions, select response and factors
When you select more than two variables MINITAB™ generates an Interaction Plot Matrix which allows
you to look at interactions simultaneously. The plot at the upper right shows the effects of Start
Angle on Y at the two different levels of Fulcrum. The red line shows the effects of Fulcrum on Y
when Start Angle is at its high level. The black line represents the effects of Fulcrum on Y when
Start Angle is at its low level.
Note: In setting up this graph we selected options and deselected draw full interaction matrix
Based on how many factors you select MINITAB™ will create a number of interaction plots.
Here there are 3 factors selected so it generates the 3 interaction plots. These are referred to as
2-way interactions.
MINITAB™ will also plot the mirror images just in case it is easier to interpret with the variables
flipped. If you care to create the mirror image of the interaction plots, while creating interaction
plots, click on “Options” and choose “Draw full interaction plot matrix” with a checkmark in the box.
These mirror images present the same data but visually may be easier to understand.
The plots at the lower left in the graph below (outlined in blue) are the mirror image plots of those
in the upper right. It is often useful to look at each interaction in both representations.
[Figure: full interaction plot matrix, with the callout "Choose this option for the additional plots."]
DOE Methodology
1.  Define the Practical Problem

2.  Establish the Experimental Objective
3.  Select the Output (response) Variables
4.  Select the Input (independent) Variables
5.  Choose the Levels for the Input Variables
6.  Select the Experimental Design
7.  Execute the experiment and Collect Data
8.  Analyze the data from the designed experiment and draw Statistical Conclusions
9.  Draw Practical Solutions
10. Replicate or validate the experimental results
11. Implement Solutions
Generate Full Factorial Designs in MINITAB™
DOE > Factorial > Create Factorial Design…
It is easy to generate full factorial designs in MINITAB™. Follow the command path shown here.
These are the designs MINITAB™ will create. They are color coded using Red, Yellow and Green.
Green are the “go” designs, yellow are the “use caution” designs and red are the “stop, wait and
think” designs. The color coding has a similar meaning to street lights.
Create Three Factor Full Factorial Design
Stat>DOE>Factorial>Create Factorial Design
Let’s create a three factor full factorial design using the MINITAB™ command shown at the top of the
slide. This design we selected will give us all possible experimental combinations of 3 factors using 2
levels for each factor.
Be sure to change the number of factors as seen in the upper left of the slide to 3. Also be sure not to
forget to click on the “Full factorial” line within the Designs box shown in the lower right of the slide.
In the “Options” box of the upper left MINITAB™ display one can change the order of the
experimental runs. To view the design in standard order (not randomized for now) be sure to uncheck
the default of “Randomize runs” in the “Options” tab. “Un-checking” means no checkmark is in the
white box next to “Randomize runs”.
Create Three Factor Full Factorial Design (cont.)
Enter the names of the three factors as well as the numbers for the levels shown in the lower right
portion of this slide. To reach this display click on “Factors…” in the upper left hand display.
Remember when we discussed uncoded levels? The process settings of 140 and 180 for the start
angle are examples of uncoded levels.
Three Factor Full Factorial Design
Hold on! Here we go….
Here is the worksheet MINITAB™ creates. If you had left the “Randomize runs” selection checked in
the Options box your design would be in a different order than shown.
Notice the structure of the last 3 columns where the factors are shown. The first factor, start angle,
alternates between low and high as you read down the column. The second factor, stop angle, has 2
low then 2 high all the way down the column and the third factor, fulcrum, has 4 low then 4 high.
Notice the structure just keeps doubling the pattern. If we had created a 4 factor full factorial design
the fourth factor column would have had 8 rows at the low setting then 8 rows at the high setting.
You can see it is very easy to create a full factorial design.
This standard order as we call it is not however the recommended order in which an experiment
should be run. We will discuss this in detail as we continue through the modules.
One warning to you as a Belt new to using MINITAB™. Never copy, paste, delete or move columns
within the first 7 columns or MINITAB™ may not recognize the design you are attempting to use.
Is our experiment done? Not at all. The process must now be run at the 8 sets of experimental
conditions shown above and the output or outputs of interest must be recorded in columns to the
right of our first 7 columns shown. After we have collected the data we will then analyze the
experiment. Remember the 11 Step DOE methodology from earlier?
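The doubling pattern of the standard order is easy to reproduce outside MINITAB™. The sketch below builds the coded 2^3 design in standard order and then shuffles a copy to get a run order; the function name, the seed and the use of Python's `random` module are our assumptions and do not describe how MINITAB™ randomizes runs.

```python
import random
from itertools import product

factors = ["Start Angle", "Stop Angle", "Fulcrum"]

def standard_order(k):
    """Coded 2^k design in standard order: the first factor changes fastest."""
    # product() varies its last element fastest, so reverse each tuple.
    return [tuple(reversed(combo)) for combo in product((-1, 1), repeat=k)]

design = standard_order(len(factors))
for std_order, run in enumerate(design, start=1):
    print(std_order, run)

# Run order for the actual experiment: shuffle a copy (the seed is arbitrary).
rng = random.Random(1)
run_order = design[:]
rng.shuffle(run_order)
print("Randomized run order:", run_order)
```

The first column alternates low and high, the second changes every 2 rows and the third every 4 rows, matching the worksheet structure described above.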
§  Determine the reason for experimenting
§  Describe the difference between a physical model and a DOE model
§  Explain an OFAT experiment and its primary weakness
§  Shown Main Effects Plots and interactions, determine which effects and interactions may be
significant
§  Create a Full Factorial Design
You have now completed Improve Phase – Designing Experiments.
Lean Six Sigma

Green Belt Training
Improve Phase
Congratulations on completing the training portion of the Improve Phase. Now comes the
exciting and challenging part…implementing what you have learned to real world projects.
Improve Phase Overview—The Goal
The goal of the Improve Phase is to:
•  Determine the optimal levels of the variables which are significantly impacting your Primary Metric.
•  Demonstrate a working knowledge of modeling as a means of process optimization.
This is a summary of the purpose for the Improve Phase. Avoid getting into analysis paralysis, only
use DOE’s as necessary. Most problems will NOT require the use of Designed Experiments however
to qualify as a Black Belt you at least need to have an understanding of DOE as described in this
course.
Improve Phase Action Items
•  Listed here are the Improve Phase deliverables each candidate will present in a PowerPoint
presentation at the beginning of the Control Phase training.
•  At this point you should understand what is necessary to provide these deliverables in your
presentation.
–  Experiment Justification
–  Experiment Plan / Objective
–  Experiment Results
–  Project Plan
It’s your show!
Before beginning the Control Phase you should prepare a clear presentation that addresses each
topic shown here.
Six Sigma Behaviors

the
•  Sharing best practices Walk!
Each player in the Lean Six Sigma process must be A ROLE MODEL for the Lean Six Sigma culture.
Improve Phase - The Roadblocks
Look for the potential roadblocks and plan to address them before
they become problems:
–  Lack of data
–  Data presented is the best guess by functional managers
–  Team members do not have the time to collect data
–  Process participants do not participate in the analysis planning
–  Lack of access to the process
Each phase will have roadblocks. Many will be similar throughout your project.
DMAIC Roadmap
[Figure: DMAIC Roadmap with a Champion/Process Owner lane and the phases Define, Measure, Analyze,
Improve and Control. Legible labels: Estimate COPQ; Establish Team; Implement Solutions to Control
or Eliminate Xs Causing Problems.]
The objective of the Improve Phase is simple – utilize advanced statistical methods to identify
contributing variables OR more appropriately optimize variables to create a desired output.
Improve Phase
[Flowchart steps, in order:]
Analysis Complete
Identify Few Vital X’s
Experiment to Optimize Value of X’s
Simulate the New Process
Validate New Process
Implement New Process
Ready for Control
Over 80% of projects will realize their solutions in the Analyze Phase – Designed Experiments can be
extremely effective when used properly. It is imperative that a Designed Experiment is justified. From
an application and practical standpoint if you can identify a solution by utilizing the strategy and
tools within the Measure and Analyze Phases then do it. Do not force Designed Experiments.
Remember your sole objective in conducting a Lean Six Sigma project is to find a solution to the
problem. You created a Problem Statement and an Objective Statement at the beginning of your
project. However you reach a solution that achieves the stated goals in the Objective Statement,
implement it and move on to another issue – there are plenty!
Improve Phase Checklist
Improve Phase Questions
•  Are the potential X’s measurable and controllable for an experiment?
•  Are they statistically and practically significant?
•  How much of the problem have you explained with these X’s?
•  Have you clearly justified the need for conducting a Designed Experiment?
•  Are adequate resources available to complete the project?
•  What next steps are you recommending?
These are questions the participant should be able to answer in clear, understandable language at
the end of this phase.
Planning for Action
A DOE to meet your problem solving strategy
Scheduling your experimental plan
Executing your planned DOE
Analysis of results from your DOE
Obtain mathematical model to represent process
Planning the pilot validation for breakthrough
Present statistical promise to process owner
Prepare for implementation of final model
Schedule resources for implementation timeline
Conclude on expected financial benefits
Over the last decade of deploying Lean Six Sigma it has been found the parallel application of the
tools and techniques in a real project yields the maximum success for the rapid transfer of
knowledge. Thus we have developed a follow up process that involves planning for action between
the conclusion of this phase and the beginning of the Control Phase. It is imperative you complete
this to keep you on the proper path. Thanks and good luck!
§  Have started to develop a project plan to complete the action items
You’re on your way!
You have now completed Improve Phase – Wrap Up and Action Items.
Lean Six Sigma

Green Belt Training
Control Phase
Welcome to Control
Now that we have completed the Improve Phase we are going to jump into the Control Phase.
Welcome to Control will give you a brief look at the topics we are going to cover.
Overview
These are the modules we will cover in the Control Phase as we attempt to ensure that the gains we
have made with our project remain in place.
We will examine the meaning of each of these and show you how to apply them.
Welcome to Control
Lean Controls
Defect Controls
Statistical Process Control (SPC)
Six Sigma Control Plans
DMAIC Roadmap
[Figure: DMAIC Roadmap with a Champion/Process Owner lane and the phases Define, Measure, Analyze,
Improve and Control. Legible labels: Estimate COPQ; Establish Team.]
Control Phase Finality with Control Plans
Improvement Selected
Develop Training Plan
Implement Training Plan
Develop Documentation Plan
Implement Documentation Plan
Develop Monitoring Plan
Implement Monitoring Plan
Develop Response Plan
Implement Response Plan
Develop Plan to Align Systems and Structures
Align Systems and Structures
Go to Next Project
Lean Six Sigma

Green Belt Training
Control Phase
Lean Controls
Now we will continue in the Control Phase with “Lean Controls”.
Lean Controls
Overview
You can see in this section of the course we will look at the Vision of Lean, Lean Tools and
Sustaining Project Success.
Welcome to Control
Lean Controls
–  Vision of Lean Supporting Six Sigma
–  Lean Tool Highlights
–  Project Sustained Success
Defect Controls
Lean Controls
You have begun the process of sustaining your project after finding the “vital few” X’s.
In Advanced Process Capability we discussed removing some of the Special Causes causing
spread from Outliers in the process performance.
This module gives more tools from the Lean toolbox to stabilize your process.
Belts, after some practice, often consider this module’s set of tools a way to improve some
processes that are totally “out of control” or of such poor Process Capability prior to applying the
Six Sigma methodology.
The tools we are going to review within this module can be used to help control a process. They can
be utilized at any time in an improvement effort not just in Control. These Lean concepts can be
applied to help reduce variation, affect Outliers or clean up a process before, during or at the
conclusion of a project.
Let’s get this place cleaned up!!
The Vision of Lean Supporting Your Project
Remember the goal is to achieve and then SUSTAIN our improvements. We discussed 5S in the
Define Phase but we are going to review it with a twist here in the Control Phase.
[Figure: "The Continuous Goal… Sustaining Results", a stack of Lean tools listed from top to bottom:
Kanban, Kaizen, Standardized Work, Visual Factory, 5S Workplace Organization.]
•  We cannot sustain Kanban without Kaizen.
•  We cannot sustain Kaizen (Six Sigma) without Standardized Work.
•  We cannot sustain Standardized Work without a Visual Factory.
•  We cannot sustain a Visual Factory without 5S.
Lean tools add discipline required to further sustain gains realized with Six Sigma Belt Projects.
What is Waste (MUDA)?

The first step toward waste elimination is waste identification which you did originally with your
Project Charter and measured with your primary metric even if you did not use the term waste. All
Belt projects focus efforts into one (or more) of these seven areas.
Waste is often the root of any Six Sigma project.

The 7 basic elements of waste (muda in Japanese) include:
–  Muda of Correction
–  Muda of Overproduction
–  Muda of Processing
–  Muda of Conveyance
–  Muda of Inventory
–  Muda of Motion
–  Muda of Waiting
Get that garbage outta here!
The specifics of the MUDA were discussed in the Define Phase:
–  The reduction of MUDA can reduce your Outliers and help with defect prevention. Outliers exist
because of differing waste among procedures, machines, people, etc.
The Goal
Do not forget the goal ~ Sustain your Project by eliminating MUDA!
With this in mind we will introduce and review some of the Lean tools used to sustain your project
success.
Remember that any project needs to be sustained. Muda (pronounced like mooo dah) are wastes that
can reappear if the following Lean tools are not used. The goal is to have your Belts move on to other
projects and not be used as firefighters.
5S - Workplace Organization
[Figure: workplace photos labeled "Before.." and "After.."]
•  5S means the workplace is clean, there is a place for everything and everything is in its place.
•  5S is the starting point for implementing improvements to a process.
•  To ensure your gains are sustainable you must start with a firm foundation.
•  Its strength is contingent upon the employees and company being committed to maintaining it.
The term “5S” derives from the Japanese words for five practices leading to a clean and
manageable work area. The five “S” are: 'Seiri' means to separate needed tools, parts and
instructions from unneeded materials and to remove the latter. 'Seiton' means to neatly arrange and
identify parts and tools for ease of use. 'Seiso' means to conduct a cleanup campaign. 'Seiketsu'
means to conduct seiri, seiton, and seiso at frequent, indeed daily, intervals to maintain a workplace
in perfect condition. 'Shitsuke' means to form the habit of always following the first four S’s.
On the next page we have translated the Japanese words to English words. Simply put, 5S
means the workplace is clean, there is a place for everything and everything is in its place. The
5S will create a workplace that is suitable for and will stimulate high quality and high productivity
work. It will make the workplace more comfortable and a place that you can be proud of.
Developed in Japan this method assumes no effective and quality job can be done without a clean
and safe environment and without behavioral rules. The 5S allow you to set up a well adapted
and functional work environment ruled by simple yet effective rules. 5S deployment is done in a
logical and progressive way. The first three S’s are workplace actions while the last two are
sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
5S Translation - Workplace Organization
Step      Japanese   Literal Translation     English
Step 1:   Seiri      Clearing Up             Sorting
Step 2:   Seiton     Organizing              Straightening
Step 3:   Seiso      Cleaning                Shining
Step 4:   Seiketsu   Standardizing           Standardizing
Step 5:   Shitsuke   Training & Discipline   Sustaining
Focus on using the English words, much easier to remember.
The English translations are:
Seiri = Sorting
Eliminate everything not required for the current work, keeping only the bare essentials.
Seiton = Straightening
Arrange items in a way that they are easily visible and accessible.
Seiso = Shining
Clean everything and find ways to keep it clean. Make cleaning a part of your everyday work.
Seiketsu = Standardizing
Create rules by which the first three S’s are maintained.
Shitsuke = Sustaining
Keep 5S activities from unraveling.
SORTING - Decide what is needed.
Definition:
–  To sort out necessary and unnecessary items.
–  To store often used items at the work area, infrequently used items away from the work area
and dispose of items that are not needed.
Why:
–  Removes waste.
–  Safer work area.
–  Gains space.
–  Easier to visualize the process.
Things to remember
•  Start in one area then sort through everything.
•  Discuss removal of items with all persons involved.
•  Use appropriate decontamination, environmental and safety procedures.
•  Items that cannot be removed immediately should be tagged for later removal.
•  If necessary use movers and riggers.
The first stage of 5S is to organize the work area, leaving only the tools and materials necessary to
perform daily activities. When “sorting” is well implemented communication between workers is
improved and product quality and productivity are increased.
A Method for Sorting
[Flowchart labels: Item; Useful, Unknown, Useless; Keep & Monitor; Keep & Store; Sorting ABC;
Storage; Dispose.]
5S usually begins with a great initial cleaning where sorting out the items is a highlight. For each
item it must be stated if it is useful, useless or undetermined. For some items the statement may be
touchy as nobody seems to know if they are really useful or not and what their frequency of use is.
Always start with the easiest items to classify. Difficulty should be no excuse, go for it, starting
with the easiest. Sort each item according to three categories:
1. Useful 2. Useless 3. Unknown
The first two categories are no problem to sort as their status is clear. Dispose of any useless items
immediately because they just clutter the workspace and lead to loss of time, confusion and poor
quality. For items in the unknown category, or where the frequency of use is unclear, keep them where
they are for a predetermined period of time and if it is found that they are not used dispose of them.
For items that are useful there is also a method for determining how and where they should be
stored to help you achieve a clean and orderly workplace.
A Method for Sorting
Use this graph as a general guide for deciding where to store items along with the table below.
[Figure: Frequency of Use plotted against Distance, with regions A, B and C.]
Frequency of Utilization       Class   Keep within arms reach   Keep in local location   Keep in remote location
Daily or several times a day   A       YES                      MAYBE                    NO
Weekly                         B       MAYBE                    YES                      NO
Monthly or quarterly           C       NO                       NO                       YES
After you have determined the usefulness of an item set three classes for determining where to store
an item based on the frequency of use and the distance to travel to get the item. “A” is for things
which are to be kept close at hand because the frequency of use is high. “B” is for items used less
frequently, approximately on a weekly basis. Do not put these on your work surface; rather keep them
within easy walking distance, e.g. on a bookshelf or in a nearby cabinet, usually in the same room you
are in. For “C” items it is acceptable to store in a somewhat remote place, meaning a few minutes
walk away.
By rigorously applying the sort action and the prescribed method you will find the remainder of the 5S
items will be quite easy to accomplish. It is very difficult to order a large number of items in a given
space and the amount of cleaning increases with the number of items. Your workplace should only
contain those items needed on a daily to weekly basis to perform your job.
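The ABC rule can be written down as a small lookup, which some teams find handy when tagging a long list of items. The sketch below encodes the table above; the numeric cut-offs for "daily" and "weekly" are our own illustrative assumptions and are not part of the 5S method.

```python
# Storage guidance from the table: class -> (arm's reach, local, remote)
STORAGE_GUIDE = {
    "A": ("YES", "MAYBE", "NO"),
    "B": ("MAYBE", "YES", "NO"),
    "C": ("NO", "NO", "YES"),
}

def storage_class(uses_per_month):
    """Assign an ABC class from an estimated number of uses per month.

    The cut-offs (20 and 4) are illustrative assumptions, not from the text.
    """
    if uses_per_month >= 20:   # roughly daily or several times a day
        return "A"
    if uses_per_month >= 4:    # roughly weekly
        return "B"
    return "C"                 # monthly or quarterly

# Hypothetical items with estimated uses per month.
items = {"torque wrench": 60, "calibration gauge": 4, "spare fixture": 1}
for name, uses in items.items():
    cls = storage_class(uses)
    reach, local, remote = STORAGE_GUIDE[cls]
    print(f"{name}: class {cls} (arm's reach {reach}, local {local}, remote {remote})")
```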
STRAIGHTENING – Arranging Necessary Items
Definition:
–  To arrange all necessary items.
–  To have a designated place for everything.
–  A place for everything and everything in its place.
–  Easily visible and accessible.
Why:
–  Visually shows what is required or is out of place.
–  More efficient to find items and documents (silhouettes/labels).
–  Saves time by not having to search for items.
–  Shorter travel distances.
Things to remember
•  Things used together should be kept together.
•  Use labels, tape, floor markings, signs and shadow outlines.
•  Sharable items should be kept at a central location (eliminates excess).
The second stage of 5S involves the orderly arrangement of needed items so they are easy to use
and accessible for “anyone” to find. Orderliness eliminates waste in production and clerical activities.
SHINING – Cleaning the Workplace
Definition:
–  Clean everything and find ways to keep it clean.
–  Make cleaning a part of your everyday work.
Why:
–  A clean workplace indicates a quality product and process.
–  Dust and dirt cause product contamination and potential health hazards.
–  A clean workplace helps identify abnormal conditions.
Things to remember
•  “Everything in its place” frees up time for cleaning.
•  Use an office or facility layout as a visual aid to identify individual responsibilities for
cleaning. This eliminates “no man’s land”.
•  Cleaning the work area is like bathing. It relieves stress and strain, removes sweat and dirt and
prepares the body for the next day.
The third stage of 5S is keeping everything clean and swept. This maintains a safer work area and
problem areas are quickly identified. An important part of “shining” is “Mess Prevention.” In other
words do not allow litter, scrap, shavings, cuttings, etc. to land on the floor in the first place.
STANDARDIZING – Creating Consistency
The fourth stage of 5S involves creating a consistent approach for carrying out tasks and procedures.
Orderliness is the core of “standardization” and is maintained by Visual Controls which might consist
of: Signboards, Painted Lines, Color-coding strategies and Standardizing “Best Methods” across the
organization.
Definition:
–  To maintain the workplace at a level that uncovers problems and makes them obvious.
–  To continuously improve your office or facility by continuous assessment and action.
Why:
–  To sustain Sorting, Storage and Shining activities every day.
Things to remember
•  We must keep the workplace neat enough for visual identifiers to be effective in uncovering hidden
problems.
•  Develop a system that enables everyone in the workplace to see problems when they occur.
SUSTAINING – Maintaining the 5S
This last stage of 5S is the discipline and commitment of all other stages. Without “sustaining” your
workplace can easily revert back to being dirty and chaotic. That is why it is so crucial for your team to
be empowered to improve and maintain their workplace. Keeping a 5S program vital in an organization
creates a cleaner workplace, a safer workplace. It contributes to how we feel about our product, our
process, our company and ourselves. It provides a customer showcase to promote your business, and
product quality will improve, especially by reducing contaminants. Efficiency will increase also. When
employees take pride in their work and workplace it can lead to greater job satisfaction and higher
productivity.
Definition:
–  To maintain our discipline we need to practice and repeat until it becomes a way of life.
Why:
–  To build 5S into our everyday process.
Things to Remember
•  Develop schedules and check lists.
•  Good habits are hard to establish.
•  Commitment and discipline toward housekeeping are essential first steps toward being world class.
The Visual Factory
A Visual Factory can best be represented by a workplace where a recently hired supervisor can easily
identify inventory levels, extra tools or supplies, scrap issues, downtime concerns or even issues with
setups or changeovers.
The basis and foundation of a Visual Factory are the 5S Standards.
A Visual Factory enables a process to manage its processes with clear indications of opportunities.
Your team should ask the following questions if looking for a project:
–  Can we readily identify Downtime Issues?
–  Can we readily identify Scrap Issues?
–  Can we readily identify Changeover Problems?
–  Can we readily identify Line Balancing Opportunities?
–  Can we readily identify Excessive Inventory Levels?
–  Can we readily identify Extraneous Tools & Supplies?
Exercise:
–  Can you come up with any opportunities for VISUAL aids in your project?
–  What visual aids exist to manage your process?
What is Standardized Work?

Affected employees should understand that once they have together defined the standard they will be
expected to perform the job according to that standard.
If the items are organized and orderly then standardized work can be accomplished.
–  Less Standard Deviation of results
–  Visual factory demands framework of standardized work
The one best way to perform each operation has been identified and agreed upon through general
consensus (not majority rules).
–  This defines the Standard work procedure
We cannot sustain Standardized Work without 5S and the Visual Factory.
[Figure: Lean tools pyramid, top to bottom: Standardized Work, Visual Factory, 5S - Workplace Organization.]
Prerequisites for Standardized Work
Standardized work does not happen without the Visual Factory which can be further described with:
•  Availability of required tools (5S). Operators cannot be expected to maintain standard work if
required to locate needed tools.
•  Consistent flow of raw material. Operators cannot be expected to maintain standard work if they are
searching for needed parts.
•  Visual alert of variation in the process (Visual Factory). Operators, material handlers and office staff
all need visual signals to keep standard work a standard.
•  Identified and labeled in-process stock (5S). As inventory levels of in-process stock decrease a visual
signal should be sent to the material handlers to replenish this stock.
The steps in developing CTQ’s are identifying the customer, capturing the Voice of the Customer and
finally validating the CTQ’s.
What is Kaizen?
Definition of Kaizen*: The philosophy of continual improvement, that every process can and should be
continually evaluated and improved in terms of time required, resources used, resultant quality and
other aspects relevant to the process.
Kaikaku are breakthrough successes which are the first focus of Six Sigma projects.
* Note: Kaizen Definition from: All I Needed To Know About Manufacturing I Learned in Joe’s Garage.
Miller and Schenk, Bayrock Press, 1996. Page 75.
[Figure: Lean tools pyramid, top to bottom: Kaizen, Standardized Work, Visual Factory.]
A Kaizen event is very similar to a Six Sigma project. A Six Sigma project is actually a Kaizen. By
involving your project team or others in the area to assist with implementing the Lean controls or
concepts you will increase buy-in of the team, which will affect your project's sustainability.
Prerequisites for Kaizen
Kaizens need the following cultural elements:
•  Management Support. Consider the corporate support which is the reason why the Six Sigma focus is
a success in your organization.
•  Measurable Process. Without standardized work we really would not have a consistent process to
measure. Cycle times would vary, assembly methods would vary, batches of materials would be
mixed, etc.
•  Analysis Tools. There are improvement projects in each organization that cannot be solved by an
operator. This is why we teach the analysis tools in the breakthrough strategy of Lean Six Sigma.
•  Operator Support. The organization needs to understand its future lies in the success of the
value-adding employees. Our role as Belts is to convince operators that we are here for them; they
will then be there for us.
A Kaizen event can be small or large in scope. Kaizens are improvements with the purpose of
constantly improving a process. Some Kaizens are very small changes, like a new jig or the placement
of a product; others are more involved projects. Kaizens are Six Sigma projects with business impact.
What is Kanban?
Kanbans are the best inventory control method for impacting some of the 7 elements of MUDA.
Kanban provides production, conveyance and delivery information. In its purest form the system will
not allow any goods to be moved within the facility without an appropriate Kanban (or signal)
attached to the goods.
–  The Japanese word for a communication signal or card, typically a signal to begin work.
–  Kanban is the technique used to pull products and material through and into the Lean
manufacturing system.
–  The actual Kanban can be a physical signal such as an empty container or a small card.
[Figure: Lean tools pyramid, top to bottom: Kanban, Kaizen, Standardized Work, Visual Factory.]
This is a building block. A Kanban needs to be supported by the previous steps we have reviewed. If
Kanbans are abused they will actually backfire and affect the process in a negative manner.
Two Types of Kanban

There are two main categories of Kanbans, finished goods Kanbans and incoming material Kanbans, as
depicted here.
Type 1: Finished goods Kanbans (Intra-process)
–  Signal Kanban: Should be posted at the end of the processing area to signal for production to begin.
–  P.I.K. (Production Instruction Kanban): Used for a much more refined level of inventory control.
The Kanban is posted as inventory is depleted, thus ensuring only the minimum allowable level of
product is maintained.
Type 2: Incoming Material Kanbans (Withdrawal)
–  Used to purchase materials from a supplying department either internal or external to the
organization. Regulates the amount of WIP inventory located at a particular process.
[Figure: Kanban types. Intra-process: P.I.K. (Production Instruction Kanban) and Signal. Withdrawal:
Inter-Process (between two processes) and Supplier.]
Prerequisites for a Successful Kanban System

Kanbans should smooth out inventory and keep product flowing but use them cautiously. If you
prematurely implement a Kanban it WILL backfire.
These items support successful Kanbans:
•  Improve changeover procedures.
•  Relatively stable demand cycle.
•  Number of parts per Kanban (card) MUST be standard and SHOULD be kept to as few as possible
parts per card.
•  Small amount of variation (or defects).
•  Near zero defects should be sent to the assembly process (result of earlier belt projects).
•  Consistent cycle times defined by Standardized Work.
•  Material handlers must be trained in the organization of the transportation system.
Warnings Regarding Kanban
As we have indicated, if you do NOT have 5S, Visual Factory, Standardized Work and ongoing Kaizens,
Kanbans cannot succeed.
Kanban systems are not quick fixes to large inventory problems, workforce issues, poor product
planning, fluctuating demand cycles, etc.
Don't forget that weakest link thing!
It is not possible to implement a viable Kanban system without a strong support structure made up
of the prerequisites. One of the most difficult concepts for people to integrate is the simplicity of
the Lean tools, and to keep the discipline. Benchmarks show organizations taking up to seven years
to implement a successful Kanban System all the way through the supplier and customer supply
chain.
The Lean Tools and Sustained Project Success
The Lean tools help sustain project success. The main lessons you should consider are:
1.  The TEAM should 5S the project area and begin integrating Visual Factory indicators.
–  Indications of the need for 5S are:
   •  Outliers in your project metric
   •  Loss of initial gains from project findings
2.  The TEAM should develop Standardized Work instructions.
–  They are required to sustain your system benefits.
–  However, remember that without a workplace organized with 5S, Standardized Work instructions
will not create consistency.
3.  Kaizens and Kanbans cannot be attempted without organized workplaces and organized work
instructions.
–  Remember the need for 5S and Standardized Work instructions to support our projects.
4.  Project Scope dictates how far up the Lean tools ladder you need to implement measures to sustain
any project success from your DMAIC efforts.
The 5 Lean concepts are an excellent method for Belts to sustain their project success. If you have
Outliers, declining benefits or dropping process capability, you need to consider the concepts
presented in this module.
Class Exercise
In the boundaries for your project scope give some examples of Lean
tools in operation.
–  Others can learn from those items you consider basic.
List other Lean tools you are most interested in applying to sustain your
project results.
Lean Controls
•  Describe the Lean tools
•  Understand how these tools can help with project sustainability
•  Understand how the Lean tools depend on each other
•  Understand how tools must document the defect prevention created in the Control Phase
You have now completed Control Phase – Lean Controls.
Lean Six Sigma

Green Belt Training
Control Phase
Defect Controls
Now we will continue in the Control Phase with the “Defect Controls”.
Overview
Roadmap: Welcome to Control; Lean Controls; Defect Controls; Statistical Process Control (SPC).
Topics in this module: Realistic Tolerance and Six Sigma Design; Process Automation or Interruption;
Poka-Yoke.
In an effort to put in place Defect Controls we will examine Tolerances, Process Automation and
Poka-Yoke.
Purpose of Defect Prevention in Control Phase
•  Process improvement efforts often falter during implementation of new operating methods learned
in the Analyze Phase.
•  Sustainable improvements cannot be achieved without control tactics to guarantee permanency.
•  Defect Prevention seeks to gain permanency by eliminating or rigidly defining human intervention in
a process.
Yes sir, we are in CONTROL!
With Defect Prevention we want to ensure the improvements created during the project stay in place.
Sigma Level for Project Sustaining in Control

The methods below are ranked from BEST to WORST:
5-6σ: Six Sigma product and/or process design eliminates an error condition OR an automated system
monitors the process and automatically adjusts Critical X’s to correct settings without human
intervention to sustain process improvements
4-5σ: Automated mechanism shuts down the process and prevents further operation until a required
action is performed
3-5σ: Mistake proofing prevents a product/service from passing onto the next step
3-4σ: SPC on X’s, with Special Causes identified and acted upon by fully trained operators and staff
who adhere to the rules
2-4σ: SPC on Y’s
1-3σ: Development of SOPs and process audits
0-1σ: Training and awareness
The best approach to Defect Prevention is to design Six Sigma right into the process.
6σ Product/Process Design
Designing products and processes such that the output Y meets or exceeds the target capability.
[Figure: The relationship Y = f(x), with the distribution of X on the horizontal axis, the distribution of Y
on the vertical axis and the specification on Y marked.]
The process specifications for X are set such that the target capability for Y is achieved.
Both the target and tolerance of the X must be addressed in the spec limits.
[Figure: The relationship Y = f(x) with its upper and lower prediction intervals, the specification for Y
and the distribution of Xs.]
If the relationship between X and Y is empirically developed through Regressions or DOE’s then
uncertainty can exist.
As a result confidence intervals should be used when establishing the specifications for X.
Product/Process Design Example
Using 95% prediction bands within MINITAB™:
Stat > Regression > Fitted Line Plot… > Options… > Display Prediction Interval
Generate your own Data Set(s) and experiment with this MINITAB™ function.
[Figure: Regression Plot of Output versus Input showing the fitted line Y = 7.75434 + 5.81104X,
R-Sq = 88.0 %, with 95% PI bands.]
What are the spec limits for the output?
What is the tolerance range for the input?
If you want 6σ performance you must remember to tighten the output’s specification to select the
tolerance range of the input.
Usually we use the prediction band provided by MINITAB™. Its width is controlled by the confidence
level chosen: 90%, 95%, 99%, etc. Try adjusting the prediction bands to see the effect it has.
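For readers who want to see what sits behind a prediction band, the following is a minimal Python sketch of the standard prediction interval for a simple linear regression. It is not MINITAB™'s implementation. The function names, the data set and the hard-coded t value are illustrative assumptions; the t value (2.306) is the two-sided 95% value for 8 degrees of freedom and must be looked up again for other sample sizes or confidence levels.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns the quantities needed later to build prediction intervals.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    return {"intercept": intercept, "slope": slope, "s": s,
            "x_bar": x_bar, "sxx": sxx, "n": n}


def prediction_interval(model, x0, t_crit):
    """Interval expected to contain one new observation of Y at input x0."""
    y_hat = model["intercept"] + model["slope"] * x0
    half_width = t_crit * model["s"] * math.sqrt(
        1 + 1 / model["n"] + (x0 - model["x_bar"]) ** 2 / model["sxx"]
    )
    return y_hat - half_width, y_hat + half_width


# Made-up illustrative data, not taken from the text.
inputs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
outputs = [14, 19, 26, 30, 38, 42, 49, 54, 61, 65]

model = fit_line(inputs, outputs)
T_CRIT_95_DF8 = 2.306  # assumption: two-sided 95% t value for n - 2 = 8

low, high = prediction_interval(model, 5.5, T_CRIT_95_DF8)
print(f"95% PI for Y at X = 5.5: {low:.1f} to {high:.1f}")
```

A larger t value (for example the 99% value) widens the band, which is the effect the text suggests experimenting with. The band is also narrowest at the average input and widens toward the ends of the data.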
[Figure: Two regression plots with 95% PI bands. Output2 versus Input2: Y = 2.32891 - 0.282622X,
R-Sq = 96.1 %. Output versus Input: Y = 7.75434 + 5.81104X, R-Sq = 88.0 %. Annotations: “High output
spec connects with top line in both cases” and “Lower input spec”.]
Poor Regression Impacting Tolerancing
Poor Correlation does not allow for tighter tolerancing.
[Figure: Two regression plots against Inp1 with 95% PI bands. Outp1: Y = -4.7E-01 + 0.811312X,
R-Sq = 90.4 %. Outp2: Y = 1.46491 + 0.645476X, R-Sq = 63.0 %.]
5 – 6 σ Full Automation
Full Automation: Systems that monitor the process and automatically adjust Critical X’s to correct
settings.
•  Automatic gauging and system adjustments
•  Automatic detection and system activation systems, e.g. landing gear extension based on aircraft
speed and power setting
•  Systems that count cycles and automatically make adjustments based on an optimum number of
cycles
•  Automated temperature controllers for controlling heating and cooling systems
•  Anti-Lock braking systems
•  Automatic welder control units for volts, amps and distance traveled on each weld cycle
Automation is an option that removes the human element and its inherent variation. Use caution,
however: many times people jump into automation prematurely. If you automate a poor process
what will that do for you?
Full Automation Example
A Black Belt is working on controlling rust on machined surfaces of brake rotors:
–  A rust inhibitor is applied during the wash cycle after final machining is completed
–  Concentration of the inhibitor in the wash tank is a Critical X that must be maintained
–  The previous system was an S.O.P. requiring a process technician to audit and add the inhibitor
manually
As part of the Control Phase the team has implemented an automatic check and replenish system on
the washer.
Don’t worry boss, it’s automated!!
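The logic of a check and replenish system can be sketched in a few lines of Python. This is an illustration only, not the team's actual controller. The function name, the units (grams per litre), the limits and the tank volume are all hypothetical, and the sketch ignores the small change in tank volume caused by the addition itself.

```python
def inhibitor_to_add(measured_g_per_l, target_g_per_l,
                     lower_limit_g_per_l, tank_volume_l):
    """Return grams of inhibitor to add to bring the tank back to target.

    Nothing is added while the measured concentration is at or above the
    lower limit, so the system does not chase ordinary small fluctuations.
    """
    if measured_g_per_l >= lower_limit_g_per_l:
        return 0.0
    return (target_g_per_l - measured_g_per_l) * tank_volume_l


# Hypothetical settings: 200 L tank, target 5.0 g/L, act below 4.5 g/L.
print(inhibitor_to_add(4.8, 5.0, 4.5, 200))  # within limits, no addition
print(inhibitor_to_add(4.0, 5.0, 4.5, 200))  # below limit, dose is computed
```

The point of the sketch is that the Critical X is checked and corrected on every cycle without relying on a technician remembering to audit it.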
4 – 5 σ Process Interruption
Process Interruption: Mechanism installed to shut down the process and prevent further operation
until a required action is performed:
•  Ground fault circuit breakers
•  Child proof caps on medications
•  Software routines to prevent undesirable commands
•  Safety interlocks on equipment such as light curtains, dual palm buttons, ram blocks
•  Transfer system guides or fixtures that prevent over or undersized parts from proceeding
•  Temperature conveyor interlocks on ovens
•  Missing component detection that stops the process when triggered
4 – 5 σ Process Interruption (cont.)
Example:
•  A Black Belt is working on launching a new electric drive unit on a transfer system
–  One common failure mode of the system is a bearing failure on the main motor shaft
–  It was determined a high press fit at bearing installation was causing these failures
–  The Root Cause of the problem turned out to be undersized bearings from the supplier
•  Until the supplier could be brought into control or replaced the team implemented a press load
monitor at the bearing press with an indicator
–  If the monitor detects a press load higher than the set point it shuts down the press, not allowing
the unit to be removed from the press until an interlock key is turned and the ram reset in the
manual mode
–  Only the line lead person and the supervisor have keys to the interlock
–  The non-conforming part is automatically marked with red dye
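The behavior of such an interlock can be sketched as a small state machine in Python. This is an illustration of the idea only. The class name, method names, load values and role names are hypothetical and do not describe any real press controller.

```python
class PressInterlock:
    """Locks the press after an over-load until an authorized key reset."""

    AUTHORIZED_KEY_HOLDERS = {"line lead", "supervisor"}

    def __init__(self, load_set_point):
        self.load_set_point = load_set_point
        self.locked = False

    def record_press(self, load):
        """Return True if the unit may be removed from the press."""
        if self.locked:
            return False  # press stays down until it is reset
        if load > self.load_set_point:
            self.locked = True  # process interruption: stop and hold the part
            return False
        return True

    def reset(self, key_holder):
        """Unlock only for a key holder; return True if the press is unlocked."""
        if key_holder in self.AUTHORIZED_KEY_HOLDERS:
            self.locked = False
        return not self.locked


press = PressInterlock(load_set_point=100.0)  # hypothetical set point
print(press.record_press(90.0))    # normal load, unit released
print(press.record_press(120.0))   # over-load, press locks
print(press.reset("operator"))     # operator has no key, still locked
print(press.reset("supervisor"))   # supervisor key unlocks
```

The design choice worth noting is that the process cannot continue on its own after a fault. A required action by a specific person is built into the mechanism.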
3 – 5 σ Mistake Proofing
Mistake Proofing is great because it is usually inexpensive and very effective. Consider the many
everyday examples of Mistake Proofing. You cannot fit the diesel hose into an unleaded vehicle gas
tank. Pretty straightforward, right?
Mistake Proofing is best defined as:
–  Using wisdom, ingenuity or serendipity to create devices allowing a 100% defect free step 100% of
the time.
Poka-Yoke is the Japanese term for mistake proofing: to avoid (“yokeru”) inadvertent errors (“poka”).
[Figure: Eight numbered pictures. See if you can find the Poka-Yokes!]
Traditional Quality vs. Mistake Proofing
This clearly highlights the difference between the two approaches. What are the benefits to the
Source Inspection method?
Traditional Inspection: Worker or Machine Error > Do Not Do Anything > Defective Result > Sort At
Other Step
Source Inspection: Worker or Machine Error > Discover Error > Take Action/Feedback > No Defect >
Next Step
KEEP ERRORS FROM TURNING INTO DEFECTS
Styles of Mistake Proofing
Two states of a defect are addressed with Mistake Proofing.
ERROR ABOUT TO OCCUR, DEFECT ABOUT TO OCCUR (Prediction):
–  WARNING SIGNAL
–  CONTROL / FEEDBACK
–  SHUTDOWN (Stop Operation)
ERROR HAS OCCURRED, DEFECT HAS OCCURRED (Detection):
–  WARNING SIGNAL
–  CONTROL / FEEDBACK
–  SHUTDOWN (Stop Operation)
Mistake Proofing Devices Design
Hints to help design a Mistake Proofing device:
–  Simple
–  Inexpensive
–  Give prompt feedback
–  Give prompt action (prevention)
–  Focused application
–  Have the correct people’s input
BEST: makes it impossible for errors to occur
BETTER: allows for detection while error is being made
GOOD: detects defect before it continues to the next operation
The very best approaches make creating a defect impossible. Recall the gas hose example: you
cannot put diesel fuel into an unleaded gas tank unless you really try hard or have a hammer.
Types of Mistake Proof Devices
Contact Method
–  Physical or energy contact with product
•  Limit switches
•  Photo-electric beams
Fixed Value Method
–  Number of parts to be attached/assembled etc. are constant
–  Number of steps done in operation
•  Limit switches
Motion-step Method
–  Checks for correct sequencing
–  Checks for correct timing
•  Photo-electric switches and timers
[Figure: Numbered examples: 1 Guide Pins of Different Sizes, 2 Error Detection and Alarms, 3 Limit
Switches, 4 Counters, 5 Checklists.]
Mistake Proofing Examples

Let’s consider examples of mistake proofing or Poka-Yoke devices even in the home. Have a
discussion about them in the work environment as well.
Everyday examples of mistake-proofing:
•  Home
–  Automated shutoffs on electric coffee pots
–  Ground fault circuit breakers for bathroom in or outside electric circuits
–  Pilotless gas ranges and hot water heaters
–  Child proof caps on medications
–  Butane lighters with safety button
•  Computers
–  Mouse insertion
–  USB cable connection
–  Battery insertion
–  Power save feature
•  Automobile
–  Seat belts
–  Air bags
–  Car engine warning lights
•  Office
–  Spell check in word processing software
–  Questioning “Do you want to delete” after depressing the “Delete” button on your computer
•  Factory
–  Dual palm buttons and other guards on machinery
•  Retail
–  Tamper proof packaging
–  Bar code price recognition
Advantages of Mistake Proofing as a Control Method
Mistake Proofing advantages include:
–  Only simple training programs are required
–  Inspection operations are eliminated and the process is simplified
–  Relieves operators from repetitive tasks of typical visual inspection
–  Promotes creativity and value adding activities
–  Results in defect free work
–  Requires immediate action when problems arise
–  Provides 100% inspection internal to the operation
The best resource for pictorial examples of Mistake Proofing is:
Poka-Yoke: Improving Product Quality by Preventing Defects. Overview by Hiroyuki Hirano.
Productivity Press, 1988.
For a much more in-depth review of improving product or service quality by preventing defects you
MUST review the book shown here. It shows a comprehensive 240 Poka-Yoke examples which can be
applied to many industries. The Poka-Yokes are meant to address errors from processing, assembly,
mounting, insertion, measurement, dimensional, labeling, inspection, painting, printing,
misalignment and many other causes.
Defect Prevention Culture and Good Control Plans
Involve everyone in Defect Prevention:
–  Establish Process Capability through SPC
–  Establish and adhere to standard procedures
–  Make daily improvements
–  Invent Mistake Proofing devices
Make immediate feedback and action part of the culture.
Do not just stop at one Mistake Proofing device per product.
Defect Prevention is needed for all potential defects.
Defect Prevention implemented MUST be documented in your living FMEA for the process/product.
All of the Defect Prevention methods used must be documented in your FMEA and the Control
Plan discussed later in the Control Phase.
Class Exercise
Take a look around your work area or office to see what things you can
identify as Mistake Proofed.
Talk with your fellow workers about:
–  How was the need for the control system identified?
–  If a Critical X is Mistake Proofed how was it identified as being critical?
–  How are they maintained?
–  How are they verified as working properly?
–  Are they ever disabled?
Look for other areas where such beneficial things could be applied.
Prepare a probable defect prevention method to apply to your project.
List any potential barriers to implementation.
Defect Controls
•  Describe some methods of Defect Prevention
•  Understand how these techniques can help with project sustainability:

–  Including reducing those Outliers as seen in the Advanced Process
Capability section
–  If the Critical X was identified then prevent the cause of defective Y
•  Understand what tools must document the Defect Prevention created in the
Control Phase
You have now completed Control Phase – Defect Controls.
Lean Six Sigma

Green Belt Training
Control Phase
Statistical Process Control
We will now continue in the Control Phase with “Statistical Process Control or SPC”.
Overview
Roadmap: Welcome to Control; Lean Controls; Defect Controls.
Topics in this module: Elements and Purpose; Methodology; Special Cause Tests; Examples.
Statistical techniques can be used to monitor and manage process performance. Process
performance, as we have learned, is determined by the behavior of the inputs acting upon it in the
form of Y = f(X). As a result it must be well understood that we can only monitor, not control, the
performance of a process output. Many people have applied Statistical Process Control (SPC) only to
process outputs. Because they were using SPC their expectations were high regarding a new potential
level of performance and control over their processes. However, because they applied SPC only to the
outputs they were soon disappointed. When you apply SPC techniques to outputs it is more
appropriately called Statistical Process Monitoring or SPM.
You of course know you can only control an output by controlling the inputs exerting an influence on
the output. This is not to say applying SPC techniques to an output is bad, there are valid reasons
for doing this. Six Sigma has helped us all to better understand where to apply such control
techniques.
In addition to controlling inputs and monitoring outputs control charts are used to determine the
baseline performance of a process, evaluate measurement systems, compare multiple processes,
compare processes before and after a change, etc. Control Charts can be used in many situations
that relate to process characterization, analysis and performance.
To better understand the role of SPC techniques in Six Sigma we will first investigate some of the
factors that influence processes then review how simple probability makes SPC work and finally
look at various approaches to monitoring and controlling a process.
SPC Overview: Collecting Data
Population:
–  An entire group of objects that have been made or will be made containing a characteristic of
interest
Sample:
–  A sample is a subset of the population of interest
–  The group of objects actually measured in a statistical study
–  Samples are used to estimate the true population parameters
[Figure: A population from which several samples are drawn.]
Control Charts are usually derived from samples taken from the larger population. Samples must be
collected in such a way that they do not bias or distort the interpretation of the Control Chart. The
process must be allowed to operate normally when taking a sample. If there is any special treatment
or bias given to the process over the period the data is collected the Control Chart interpretation will
be invalid. The frequency of sampling depends on the volume of activity and the ability to detect
trends and patterns in the data. At the onset you should err on the side of taking extra samples and
then, if the process demonstrates its ability to stay in control, you can reduce the sampling rate.
Using rational subgroups is a common way to assure you collect representative data. A rational
subgroup is a sample of a process characteristic in which all the items in the sample were produced
under very similar conditions over a relatively short time period. Rational subgroups are usually
small in size, typically consisting of 3 to 5 units. It is important that rational subgroups consist of units
produced as closely as possible to each other, especially if you want to detect patterns, shifts and
drifts. If a machine is drilling 30 holes a minute and you want to collect a sample of hole sizes, a good
rational subgroup would consist of 4 consecutively drilled holes. The selection of rational subgroups
enables you to accurately distinguish Special Cause variation from Common Cause variation.
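The drilling example can be sketched in Python as follows. The function name and the sampling interval are illustrative assumptions: the sketch takes 4 consecutive holes out of every 30 drilled, which corresponds to one subgroup per minute at the rate quoted above.

```python
def rational_subgroups(measurements, size=4, every=30):
    """Take `size` consecutive measurements once every `every` units.

    `measurements` must be in production order. Units inside a subgroup
    are made close together in time, so the variation within a subgroup
    reflects Common Cause variation only.
    """
    last_start = len(measurements) - size
    return [measurements[start:start + size]
            for start in range(0, last_start + 1, every)]


# Stand-in data: hole numbers 0..89 in drilling order (three minutes of output).
holes = list(range(90))
print(rational_subgroups(holes))
```

How often to take a subgroup is a separate decision from how to form one; the interval here is only a placeholder.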
Make sure your samples are not biased in any way; meaning they are randomly selected. For
example, do not plot only the first shift’s data if you are running multiple shifts. Do not look at only one
vendor’s material if you want to know how the overall process is really running. Finally do not
concentrate on a specific time to collect your samples; like just before the lunch break.
If your process consists of multiple machines, operators or other process activities producing streams
of the same output characteristic you want to control it would be best to use separate Control Charts
for each of the output streams.
If the process is stable and in control the sample observations will be randomly distributed around the
average. Observations will not show any trends or shifts and will not have any significant outliers from
the random distribution around the average. This type of behavior is to be expected from a normally
operating process and is why it is called Common Cause variation. Unless you are intentionally trying
to optimize the performance of a process to reduce variation or change the average, as in a typical Six
Sigma project, you should not make any adjustments or alterations to the process if it is demonstrating
only Common Cause variation. That can be a big time saver since it prevents “wild goose chases.”
If Special Cause variation occurs you must investigate what created it and find a way to prevent it from
happening again. Some form of action is always required to make a correction and to prevent future
occurrences.
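Two common signals of Special Cause variation can be sketched in Python: a point outside the control limits, and a long run of points on one side of the Center Line. The function names are hypothetical, and the run length of 9 is an assumption taken from a commonly used default; use whatever rule set your organization has adopted.

```python
def beyond_limits(values, lcl, ucl):
    """Indices of points outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


def run_on_one_side(values, center, k=9):
    """Indices at which k or more consecutive points sit on one side of center."""
    flagged = []
    run_length = 0
    last_side = 0
    for i, v in enumerate(values):
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            run_length += 1
        else:
            run_length = 1 if side != 0 else 0
        last_side = side
        if run_length >= k:
            flagged.append(i)
    return flagged


# Made-up data with one obvious outlier at index 4.
data = [10, 11, 9, 10, 15, 10]
print(beyond_limits(data, lcl=7, ucl=13))
```

Neither function says what caused the signal. They only tell you where to start the investigation the text calls for.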
SPC Overview: Collecting Data (cont.)

You may have noticed there has been no mention of the specification limits for the characteristic
being controlled. Specification limits are not evaluated when using a Control Chart. A process in
control does not necessarily mean it is capable of meeting the requirements. It only states it is
stable, consistent and predictable. The ability to meet requirements is called Process Capability, as
previously discussed.
SPC Overview: I-MR Chart
•  An I-MR Chart combines a Control Chart of the average moving range with the Individuals
Chart.
•  You can use Individuals Charts to track the process level and to detect the presence of
Special Causes when the sample size is one batch.
•  Seeing these charts together allows you to track both the process level and process
variation at the same time providing greater sensitivity to help detect the presence of
Special Causes.
[Figure: I-MR Chart. Individual Value versus Observation: UCL = 226.12, X-bar = 219.89, LCL = 213.67.
Moving Range versus Observation: UCL = 7.649, MR-bar = 2.341, LCL = 0.]
Individual Values (I) and Moving Range (MR) Charts are used when each measurement
represents one batch. The subgroup size is equal to one when I-MR charts are used. These
charts are very simple to prepare and use. The graphic shows the Individuals Chart where the
individual measurement values are plotted with the Center Line being the average of the
individual measurements. The Moving Range Chart shows the range between two subsequent
measurements.
There are certain situations when opportunities to collect data are limited or when grouping the
data into subgroups simply does not make practical sense. Perhaps the most obvious of these
cases is when each individual measurement is already a rational subgroup. This might happen
when each measurement represents one batch, when the measurements are widely spaced in
time or when only one measurement is available in evaluating the process. Such situations
include destructive testing, inventory turns, monthly revenue figures and chemical tests of a
characteristic in a large container of material.
All these situations indicate a subgroup size of one. Because this chart is dealing with individual
measurements, it is not as sensitive as the X-Bar Chart in detecting process changes.
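As a rough illustration of how the limits on an I-MR Chart come about, the sketch below computes them in Python from the average moving range. The constants 2.66 and 3.267 are the standard Shewhart factors for a moving range of two consecutive points; the function name `imr_limits` and the batch values are hypothetical and are not taken from this book. Applying the same arithmetic to the Center Lines reported for the I-MR Chart in this section (X-bar = 219.89, MR-bar = 2.341) gives limits close to the ones plotted.

```python
from statistics import mean

# Standard control chart constants for a moving range of two consecutive points.
E2 = 2.66    # 3 / d2, with d2 = 1.128
D4 = 3.267   # upper limit factor for ranges of two points


def imr_limits(values):
    """Return Center Lines and Control Limits for an I-MR Chart."""
    # Moving range: absolute difference between each pair of consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "I": {"LCL": x_bar - E2 * mr_bar, "CL": x_bar, "UCL": x_bar + E2 * mr_bar},
        "MR": {"LCL": 0.0, "CL": mr_bar, "UCL": D4 * mr_bar},
    }


# Hypothetical measurements, one value per batch (subgroup size of one).
batches = [220.1, 218.4, 221.0, 219.6, 222.3, 217.9, 220.5, 219.2]
limits = imr_limits(batches)
for chart, values in limits.items():
    print(chart, {name: round(v, 3) for name, v in values.items()})
```

Because the limits are driven by the point-to-point moving range, they reflect short-term variation rather than the overall Standard Deviation of the data.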
SPC Overview: Xbar-R Chart
If each of your observations consists of a subgroup of data rather than just individual
measurements an Xbar-R chart provides greater sensitivity. Failure to form rational
subgroups correctly will make your Xbar-R Charts incorrect.
[Figure: Xbar-R Chart. Xbar Chart (Sample Mean vs. Sample): UCL=225.76, X-double-bar=221.13, LCL=216.50. R Chart (Sample Range vs. Sample): UCL=16.97, R-bar=8.03, LCL=0.]
An Xbar-R is used primarily to monitor the stability of the average value. The Xbar Chart plots the
average values of each of a number of small sampled subgroups. The averages of the process
subgroups are collected in sequential, or chronological, order from the process. The Xbar Chart,
together with the Rbar Chart shown, is a sensitive method to identify assignable causes of product
and process variation and gives great insight into short-term variations.
These charts are most effective when they are used as a matched pair. Each chart individually
shows only a portion of the information concerning the process characteristic. The upper chart
shows how the process average (central tendency) changes. The lower chart shows how the
variation of the process has changed.
It is important to track both the process average and the variation separately because different
corrective or improvement actions are usually required to effect a change in each of these two
parameters.
The Rbar Chart must be in control in order to interpret the averages chart because the Control
Limits are calculated considering both process variation and Center. When the Rbar Chart shows
not in control, the Control Limits on the averages chart will be inaccurate and may falsely indicate
an out of control condition. In this case, the lack of control will be due to unstable variation rather
than actual changes in the averages.
Xbar and Rbar Charts are often more sensitive than I-MR but are frequently done incorrectly. The
most common error is failure to perform rational sub-grouping correctly.
A rational subgroup is simply a group of items made under conditions that are as nearly identical as
possible. Five consecutive items made on the same machine with the same setup, the same raw
materials and the same operator are a rational subgroup. Five items made at the same time on
different machines are not a rational subgroup. Failure to form rational subgroups correctly will
make your Xbar-Rbar Charts dangerously wrong.
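The dependence of the averages chart on the range chart can be seen in how the limits are usually calculated. The sketch below is a minimal Python example, assuming subgroups of size five and the standard Shewhart constants for that size (A2 = 0.577, D3 = 0, D4 = 2.114); the function name `xbar_r_limits` and the data are hypothetical. The values reported for the Xbar-R Chart in this section (R-bar = 8.03 with UCL = 16.97, and a grand average of 221.13 with limits 216.50 and 225.76) are consistent with these constants.

```python
from statistics import mean

# Standard Shewhart constants for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return Control Limits for the Xbar and R charts from subgroups of size 5."""
    if any(len(s) != 5 for s in subgroups):
        raise ValueError("the constants used here apply only to subgroups of size 5")
    grand_mean = mean(mean(s) for s in subgroups)      # Center Line of the Xbar chart
    r_bar = mean(max(s) - min(s) for s in subgroups)   # Center Line of the R chart
    return {
        # The Xbar limits are built from R-bar, so an unstable R chart distorts them.
        "Xbar": {"LCL": grand_mean - A2 * r_bar, "CL": grand_mean, "UCL": grand_mean + A2 * r_bar},
        "R": {"LCL": D3 * r_bar, "CL": r_bar, "UCL": D4 * r_bar},
    }


# Hypothetical data: four rational subgroups of five consecutive items each.
subgroups = [
    [221, 219, 224, 220, 222],
    [218, 223, 221, 225, 220],
    [222, 220, 217, 223, 221],
    [224, 219, 222, 218, 220],
]
xr = xbar_r_limits(subgroups)
print(xr)
```

In practice many more subgroups would be collected before fixing the limits; four are used here only to keep the arithmetic easy to follow.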
SPC Overview: U Chart
•  C Charts and U Charts are for tracking defects.

•  A U Chart can do everything a C Chart can so we will just learn how to do a
U Chart. This chart counts flaws or errors (defects). One search area can
have more than one flaw or error.
•  Search area (unit) can be practically anything we wish to define. We can look
for typographical errors per page, the number of paint blemishes on a truck
door or the number of bricks a mason drops in a workday.
•  You supply the number of defects on each unit inspected.
[Figure: U Chart of Defects (Sample Count Per Unit vs. Sample): UCL=0.1241, U-bar=0.0546, LCL=0. Two points above the UCL are flagged "1".]
The U Chart plots defects per unit data collected from subgroups of equal or unequal sizes. The
“U” in U Charts stands for defects per Unit. U Charts plot the average number of defects per unit
found in each subgroup.
The U Chart and C Chart are very similar. They both are looking at defects but the U Chart does not
need a constant sample size as does the C Chart. The Control Limits on the U Chart vary with the
sample size and therefore they are not uniform; similar to the P Chart which we will describe next.
Counting defects on forms is a common use for the U Chart. For example, defects on insurance
claim forms are a problem for hospitals. Every claim form has to be checked and corrected before
going to the insurance company. When completing a claim form a particular hospital must fill in 13
fields to indicate the patient’s name, social security number, DRG codes and other pertinent data. A
blank or incorrect field is a defect.
A hospital measured their invoicing performance by calculating the number of defects per unit for
each day’s processing of claims forms. The graph demonstrates their performance on a U Chart.
The general procedure for U Charts is as follows:

1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for sub-grouping
4. Establish sampling interval and determine sample size
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data
7. Count the number of nonconformities for each of the subgroups
8. Input into Excel or other statistical software
9. Interpret chart together with other pertinent sources of information on the process and take
corrective action if necessary
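Steps 7 and 8 above can be sketched in a few lines of Python. This is a minimal example assuming the usual U Chart formula, where the Center Line is total defects divided by total units and each subgroup's limits are the Center Line plus or minus 3 times the square root of (U-bar / subgroup size). The function name `u_chart` and the claim-form counts are hypothetical.

```python
from math import sqrt


def u_chart(defects, units):
    """Return U-bar and per-subgroup plotted values and Control Limits."""
    u_bar = sum(defects) / sum(units)
    rows = []
    for d, n in zip(defects, units):
        half_width = 3 * sqrt(u_bar / n)   # limits widen as the subgroup gets smaller
        rows.append({
            "u": d / n,
            "LCL": max(0.0, u_bar - half_width),   # a defect rate cannot be negative
            "UCL": u_bar + half_width,
        })
    return u_bar, rows


# Hypothetical daily data: defects found and claim forms inspected.
defects_found = [6, 4, 9, 5, 3]
forms_checked = [100, 80, 120, 100, 100]
u_bar, u_rows = u_chart(defects_found, forms_checked)
print(round(u_bar, 4))
for row in u_rows:
    print({k: round(v, 4) for k, v in row.items()})
```

The limits differ from day to day because the number of forms inspected differs, which is why the U Chart does not need a constant sample size.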
SPC Overview: P Chart
•  NP Charts and P Charts are for tracking defectives.

•  A P Chart can do everything an NP Chart can so we will just learn how to do
a P Chart!
•  Used for tracking defectives – the item is either good or bad, pass or fail,
accept or reject.
•  Center Line is the proportion of rejects and is also your Process Capability.
•  Input to the P Chart is a series of integers — number bad, number rejected.
In addition you must supply the sample size.
[Figure: P Chart of Errors (Proportion vs. Sample): UCL=0.2802, P-bar=0.2038, LCL=0.1274.]
The P Chart plots the proportion of nonconforming units collected from subgroups of equal or
unequal size (percent defective). The proportion of defective units observed is obtained by dividing
the number of defective units observed in the sample by the number of units sampled. The P Chart's
name comes from plotting the Proportion of defectives. When using samples of different sizes the
upper and lower control limits will not remain the same - they will look uneven as exhibited in the
graphic. These varying Control Chart limits are effectively managed by Control Charting software.
A common application of a P Chart is when the data is in the form of a percentage and the sample
size for the percentage has the chance to be different from one sample to the next. An example
would be the number of patients arriving late each day for their dental appointments. Another
example is the number of forms processed daily requiring rework due to defects. In both of these
examples the quantity would vary from day to day.
The general procedure for P Charts is as follows:

1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for sub-grouping
4. Establish sampling interval and determine sample size
5. Set up forms for recording and charting data and write specific instructions on chart usage
6. Collect and record data. It is recommended that at least 20 samples be used to calculate the
Control Limits
7. Compute P, the proportion nonconforming for each of the subgroups
8. Load data into Excel or other statistical software
9. Interpret chart together with other pertinent sources of information on the process and take
corrective action if necessary
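Step 7 and the varying Control Limits can be illustrated with a short Python sketch. It assumes the usual P Chart formula: the Center Line is total defectives divided by total units sampled, and each subgroup's limits are the Center Line plus or minus 3 times the square root of P-bar(1 - P-bar) / n. The function name `p_chart` and the appointment counts are hypothetical.

```python
from math import sqrt


def p_chart(defectives, sample_sizes):
    """Return P-bar and per-subgroup proportions with their Control Limits."""
    p_bar = sum(defectives) / sum(sample_sizes)
    rows = []
    for d, n in zip(defectives, sample_sizes):
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        rows.append({
            "p": d / n,
            "LCL": max(0.0, p_bar - half_width),   # a proportion stays within 0 and 1
            "UCL": min(1.0, p_bar + half_width),
        })
    return p_bar, rows


# Hypothetical daily data: patients arriving late and total appointments.
late_patients = [20, 18, 25, 22, 15]
appointments = [100, 90, 110, 100, 80]
p_bar, p_rows = p_chart(late_patients, appointments)
print(round(p_bar, 4))
for row in p_rows:
    print({k: round(v, 4) for k, v in row.items()})
```

Only five subgroups are used here to keep the example short; as noted in step 6, at least 20 samples are recommended before calculating Control Limits.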
SPC Overview: Control Methods/Effectiveness
Type 1 Corrective Action = Countermeasure: improvement made to the process
which will eliminate the error condition from occurring. The defect will never be created.
This is also referred to as a long-term corrective action in the form of Mistake Proofing
or design changes.
Type 2 Corrective Action = Flag: improvement made to the process which will detect
when the error condition has occurred. This flag will shut down the equipment so the
defect will not move forward.
SPC on X’s or Y’s with fully trained operators and staff who respect the rules. Once a
chart signals a problem everyone understands the rules of SPC and agrees to shut
down for Special Cause identification. (Cpk > certain level).
Type 3 Corrective Action = Inspection: implementation of a short-term containment
which is likely to detect the defect caused by the error condition. Containments are
typically audits or 100% inspection.
SPC on X’s or Y’s with fully trained operators. The operators have been trained and
understand the rules of SPC, but management will not empower them to stop for
investigation.
S.O.P. is implemented to attempt to detect the defects. This action is not sustainable
short-term or long-term.
SPC on X’s or Y’s without proper usage = WALL PAPER.
The most effective form of control is called a type 1 corrective action. This is a control applied to the
process which will eliminate the error condition from occurring. The defect can never happen. This
is the “prevention” application of the Poka-Yoke method.
The second most effective control is called a type 2 corrective action. This is a control applied to the
process which will detect when an error condition has occurred and will stop the process or shut
down the equipment so that the defect will not move forward. This is the “detection” application of
the Poka-Yoke method.
The third most effective form of control is to use SPC on the X’s with appropriate monitoring on the
Y’s. To be effective employees must be fully trained, they must respect the rules and management
must empower the employees to take action. Once a chart signals a problem everyone
understands the rules of SPC and agrees to take emergency action for special cause identification
and elimination.
The fourth most effective corrective action is the implementation of a short-term containment which
is likely to detect the defect caused by the error condition. Containments are typically audits or
100% inspection.
Finally you can prepare and implement an S.O.P. (standard operating procedure) to attempt to
manage the process activities and to detect process defects. This action is not sustainable either
short-term or long-term.
Do not do SPC for the sake of just saying you do SPC. It will quickly deteriorate to a waste of time
and a very valuable process tool will be rejected from future use by anyone who was associated
with the improper use of SPC.
Using the correct level of control for an improvement to a process will increase the acceptance of
changes/solutions you may wish to make and it will sustain your improvement for the long-term.
Purpose of Statistical Process Control
Every process has Causes of Variation known as:

–  Common Cause: Natural variability
–  Special Cause: Unnatural variability
•  Assignable: Reason for detected Variability
•  Pattern Change: Presence of trend or unusual pattern
SPC is a basic tool to monitor variation in a process.
SPC is used to detect Special Cause variation telling us the process is
out of control … but does NOT tell us why.
SPC gives a glimpse of ongoing process capability AND is a visual
management tool.
SPC is useful because every process exhibits two kinds of variation: Common Cause and Special
Cause. Special Cause variation is unnatural variability arising from assignable causes or pattern
changes. SPC is a powerful tool to monitor and improve the variation of a process, and it is often
used in visual factories. If a supervisor, operator or staff member is able to quickly see how the
process is operating by looking at its key inputs or outputs, that exemplifies a visual factory.
SPC is used to detect Special Causes in order to have those operating the process find and remove
the Special Cause. When a Special Cause has been detected the process is considered to be “out
of control”.
SPC gives an ongoing look at the Process Capability. It is not a capability measurement but it is a
visual indication of the continued Process Capability of your process.
Elements of Control Charts
Developed by Dr Walter A. Shewhart of Bell Laboratories from 1924.

Graphical and visual plot of changes in the data over time.
–  This is necessary for visual management of your process.
Control Charts were designed as a methodology for indicating change in
performance, either variation or Mean/Median.
Charts have a Central Line and Control Limits to detect Special Cause variation.
[Figure: Control Chart of Recycle (Individual Value vs. Observation): UCL=55.24, X-bar=29.06, LCL=2.87. Callouts: Special Cause Variation Detected (one point above the UCL), Process Center (usually the Mean), Control Limits.]
Control Charts were first developed by Dr. Shewhart in the early 20th century in the U.S. A Control
Chart is a graphical plot of a process over time, like a Time Series Chart. From a visual
management aspect a Time Plot is more powerful than knowledge of the latest measurement. These
charts are meant to indicate change in a process. All SPC charts have a Central Line and Control
Limits to aid in detecting Special Cause variation.
Notice, again, we never discussed showing or considering specifications. We are advising you to
never have specification limits on a Control Chart because of the confusion often generated.
Remember we want to control and maintain the process improvements made during the project.
These Control Charts and their limits are the Voice of the Process. These charts give us a running
view of the output of our process relative to established limits.
Understanding the Power of SPC
Control Charts indicate when a process is out of control or exhibiting Special Cause variation
but NOT why!
SPC Charts incorporate upper and lower Control Limits.

–  The limits are typically +/- 3 σ from the Center Line.
–  These limits represent 99.73% of natural variability for Normal Distributions.
SPC Charts allow workers and supervision to maintain improved process performance from
Lean Six Sigma projects.
Use of SPC Charts can be applied to all processes.

–  Services, manufacturing and retail are just a few industries with SPC applications.
–  Caution must be taken with use of SPC for Non-normal processes.
Control Limits describe the process variability and are unrelated to customer specifications.
(Voice of the Process instead of Voice of the Customer)
–  An undesirable situation is having Control Limits wider than customer specification
limits. This will exist for poorly performing processes with a Cp less than 1.0
Many SPC Charts exist and selection must be appropriate for effectiveness.
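The remark about Control Limits being wider than the customer specification limits can be checked with simple arithmetic. The Python sketch below assumes an Individuals chart with limits at the Mean plus or minus 3 Standard Deviations and the usual definition Cp = (USL - LSL) / (6 x Standard Deviation). The function names and the numbers are hypothetical.

```python
def cp(usl, lsl, sigma):
    """Process Capability index: specification width over six Standard Deviations."""
    return (usl - lsl) / (6 * sigma)


def control_limits(mean, sigma):
    """Control Limits at +/- 3 Standard Deviations from the Center Line."""
    return mean - 3 * sigma, mean + 3 * sigma


# Hypothetical process: centered at 50 with a Standard Deviation of 2,
# and customer specification limits of 45 and 55.
lsl, usl = 45.0, 55.0
capability = cp(usl, lsl, sigma=2.0)
lcl, ucl = control_limits(mean=50.0, sigma=2.0)

print(f"Cp = {capability:.3f}")
print(f"Control Limits = ({lcl}, {ucl}), specification limits = ({lsl}, {usl})")
```

Whenever Cp is below 1.0 the six Standard Deviation spread is larger than the specification width, so for a centered process the Control Limits fall outside the specification limits.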
The Control Chart Cookbook
General Steps for Constructing Control Charts

1. Select characteristic (Critical X or CTQ) to be charted.
2. Determine the purpose of the chart.
3. Select data-collection points.
4. Establish the basis for sub-grouping (only for Y’s).
5. Select the type of Control Chart.
6. Determine the measurement method/criteria.
7. Establish the sampling interval/frequency.
8. Determine the sample size.
9. Establish the basis of calculating the Control Limits.
10. Set up the forms or software for charting data.
11. Set up the forms or software for collecting data.
12. Prepare written instructions for all phases.
13. Conduct the necessary training.
Focus of Six Sigma and the Use of SPC

Y = f(x)
To get results should we focus our behavior on the Y or X?

Y            X1 . . . XN
Dependent    Independent
Output       Input
Effect       Cause
Symptom      Problem
Monitor      Control

When we find the vital few X’s first consider using SPC on the X’s to achieve a desired Y.

This concept should be very familiar to you by now. If we understand the variation caused by the
X’s we should be monitoring with SPC the X’s first.
By this time in the methodology you should clearly understand the concept of Y = f(x). Using SPC
we are attempting to control the Critical X’s in order to control the Y.
Control Chart Anatomy
[Figure: Run Chart of data points against a Process Sequence/Time Scale, showing the Mean with the Upper Control Limit and Lower Control Limit at +/- 3 sigma. Between the limits: Common Cause Variation, Process is In Control. Beyond either limit: Special Cause Variation, Process is Out of Control.]
Statistical Process Control (SPC) involves the use of statistical techniques to interpret data to
control the variation in processes. SPC is used primarily to act on out of control processes but is
also used to monitor the consistency of processes producing products and services.
A primary SPC tool is the Control Chart - a graphical representation for specific quantitative
measurements of a process input or output. In the Control Chart these quantitative measurements
are compared to decision rules calculated based on probabilities from the measurement of process
performance.
Comparison of the decision rules to the performance data detects any unusual variation in the
process that could indicate a problem with the process. Several different descriptive statistics can be
used in Control Charts. In addition there are several different types of Control Charts to test for
different causes, such as how quickly major vs. minor shifts in process averages are detected.
Control Charts are Time Series Charts of all the data points with one addition. The Standard
Deviation of the data is calculated and two additional lines are added to the chart. These
lines are placed +/- 3 Standard Deviations away from the Mean and are called the Upper Control
Limit (UCL) and Lower Control Limit (LCL). Now the chart has three zones: 1. the zone between the
UCL and the LCL, which is called the zone of Common Cause variation, 2. the zone above the UCL,
which is a zone of Special Cause variation, and 3. another zone of Special Cause variation below the
LCL.
Control Charts graphically highlight data points that do not fit the normal level of expected variation.
This is mathematically defined as being more than +/- 3 Standard Deviations from the Mean. It is all
based off probabilities. We will now demonstrate how this is determined.
Control and Out of Control
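[Figure: Normal curve marked from -3 to +3 Standard Deviations from the Mean: 68% of values fall within +/- 1, 95% within +/- 2 and 99.7% within +/- 3. Points beyond +/- 3 are labeled Outlier.]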
Control Charts provide you with two basic functions; one is to provide time based information on the
performance of the process which makes it possible to track events affecting the process and the
second is to alert you when Special Cause variation occurs. Control Charts graphically highlight data
points not fitting the normal level of variation expected. Common Cause variation level is typically
defined as +/- 3 Standard Deviations from the Mean. These boundaries are also known as the UCL and
LCL respectively.
Recall the “area under the curve” discussion in the lesson on Basic Statistics remembering +/- one
Standard Deviation represented 68% of the distribution, +/- 2 was 95% and +/- 3 was 99.7%. You
also learned from a probability perspective your expectation is the output of a process would have a
99.7% chance of being between +/- 3 Standard Deviations. You also learned the sum of all
probability must equal 100%. There is only a 0.3% chance (100% - 99.7%) a data point will be
beyond +/- 3 Standard Deviations. In fact since we are talking about two zones, one zone above the
+3 Standard Deviations and one below it, we have to split 0.3% in two, meaning there is only a
0.15% chance of being in one of the zones.
There is only a 0.0015 (0.15%) probability a data point will be above the UCL and the same probability
it will be below the LCL. This is a very small probability as compared to the 0.997 (99.7%) probability
the data point will be between the UCL and the LCL. What this means is something special must have
happened to cause a data point to be that far from the Mean; like a change in vendor, a mistake, etc. This is
why the term Special Cause or assignable cause variation applies. The probability a data point was
this far from the rest of the population is so low that something special or assignable happened.
Outliers are just that, they have a low probability of occurring, meaning we have lost control of our
process. This simple, quantitative approach using probability is the essence of all Control Charts.
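The percentages quoted above are rounded. As a quick check, the following Python sketch computes the coverage of a Normal Distribution within +/- k Standard Deviations using the error function from the standard library; the function name `coverage` is only illustrative. The result for +/- 3 is about 99.73%, leaving roughly 0.135% in each tail, which the text rounds to 99.7% and 0.15%.

```python
from math import erf, sqrt


def coverage(k):
    """Probability a Normal value falls within +/- k Standard Deviations of the Mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    inside = coverage(k)
    per_tail = (1 - inside) / 2   # split the remainder between the two tails
    print(f"+/- {k} sigma: inside = {inside:.4%}, each tail = {per_tail:.4%}")
```

These figures hold only when the data follow a Normal Distribution, which is one reason caution is advised when applying SPC to Non-normal processes.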
Size of Subgroups
Typical subgroup sizes are 3-12 for variable data:

–  If difficulty of gathering sample or expense of testing exists the size, n,
is smaller.
–  3, 5 and 10 are the most common size of subgroups because of ease of
calculations when SPC is done without computers.
[Figure: Five lots (Lot 1 to Lot 5) over time. Each lot on its own is a short-term study; all lots taken together form the long-term study.]
The Impact of Variation
Remember the Control Limits are based on your PAST data and depending on what sources of
variation you have included in your subgroups, the control limits which detect the Special Cause
variation will be affected. You really want to have subgroups with only Common Cause variation so
if other sources of variation are detected, the sources will be easily found instead of buried within
your definition of subgroups.
[Figure: Sources of Variation. Three panels share the same UCL and LCL.
Panel 1 (Natural Process Variation as defined by subgroup selection): First select the spread we will declare as the Natural Process Variation so whenever any point lands outside these Control Limits an alarm will sound.
Panel 2 (Natural Process Variation, Different Operators): So when a second source of variation appears we will know!
Panel 3 (Natural Process Variation, Different Operators, Supplier Source): And, of course, if two additional sources of variation arrive we will detect that too!
If you base your limits on all three sources of variation, what will sound the alarm?]
Let’s consider if you were tracking delivery times for quotes on new business with an SPC chart. If
you decided to not include averaging across product categories you might find product categories
are assignable causes but you might not find them as Special Causes since you have included
them in the subgroups as part of your rationalization.
You really want to have subgroups with only Common Cause variation so if other sources of
variation are detected the sources will be easily found instead of buried within your definition of
subgroups.
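To make the effect of subgroup composition concrete, the Python sketch below compares the same sixteen measurements grouped two ways. It assumes Xbar limits calculated as the grand average plus or minus A2 times R-bar, with the standard constant A2 = 0.729 for subgroups of size four. The function name `xbar_signals`, the two operators and all values are hypothetical.

```python
from statistics import mean

A2 = 0.729  # standard Shewhart factor for subgroups of size 4


def xbar_signals(subgroups):
    """Return indices of subgroup averages that fall outside the Xbar Control Limits."""
    means = [mean(s) for s in subgroups]
    center = mean(means)
    r_bar = mean(max(s) - min(s) for s in subgroups)
    lcl, ucl = center - A2 * r_bar, center + A2 * r_bar
    return [i for i, m in enumerate(means) if m < lcl or m > ucl]


# Hypothetical data: operator A runs near 10, operator B runs near 12.
# Rational subgroups: every subgroup comes from a single operator.
rational = [
    [10.0, 10.2, 9.9, 10.1],    # operator A
    [10.1, 9.8, 10.0, 10.2],    # operator A
    [12.0, 12.1, 11.9, 12.2],   # operator B
    [11.8, 12.0, 12.1, 11.9],   # operator B
]
# Mixed subgroups: the same values regrouped so each subgroup contains both operators.
mixed = [
    [10.0, 10.2, 12.0, 12.1],
    [9.9, 10.1, 11.9, 12.2],
    [10.1, 9.8, 11.8, 12.0],
    [10.0, 10.2, 12.1, 11.9],
]

out_rational = xbar_signals(rational)
out_mixed = xbar_signals(mixed)
print("signals with rational subgroups:", out_rational)
print("signals with mixed subgroups:", out_mixed)
```

With rational subgroups the within-subgroup ranges are small, the limits are tight and the operator difference shows up as a signal. With mixed subgroups the operator difference inflates the ranges, the limits widen and the same difference is buried inside the subgroups.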
Frequency of Sampling
Sampling Frequency is a balance between cost of sampling and testing versus cost of not detecting
shifts in Mean or variation.
Process knowledge is an input to frequency of samples after the subgroup size has been decided.
- If a process shifts but cannot be detected because of too infrequent sampling the
customer suffers
- If choice is given of large subgroup samples infrequently or smaller subgroups
more frequently most choose to get information more frequently.
- In some processes with automated sampling and testing frequent sampling is
easy.
If undecided as to sample frequency sample more frequently to confirm detection of process shifts
and reduce frequency if process variation is still detectable.
A rule of thumb also states “sample a process at least 10 times more frequently than the frequency of
‘out of control’ conditions”.
Sometimes it can be a struggle to decide how often to sample your process when monitoring results.
Unless the measurement is automated, inexpensive, recorded with computers and able to be charted
with SPC software without operator involvement, frequency of sampling is an issue.
Let’s reemphasize some points. First, you do NOT want to under sample and not have the ability to
find Special Cause variation easily. Second, do not be afraid to sample more frequently and then
reduce the frequency if it is clear Special Causes are still easily detected.
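A small Python sketch can show how a short-lived shift disappears when sampling is too sparse. The readings, the Control Limits and the function name `sampled_signals` are all hypothetical, and the limits are treated as already established from past data.

```python
def sampled_signals(readings, step, lcl, ucl):
    """Return original indices of sampled points that fall outside the Control Limits."""
    return [i for i in range(0, len(readings), step) if not lcl <= readings[i] <= ucl]


# Hypothetical half-hourly readings over one shift, with a brief upward shift
# at the sixth and seventh readings.
readings = [6.1, 6.0, 6.3, 5.9, 6.2, 7.6, 7.5, 6.1,
            6.0, 6.2, 5.8, 6.1, 6.3, 6.0, 5.9, 6.2]
LCL, UCL = 4.8, 7.4  # assumed limits from past data

every_reading = sampled_signals(readings, 1, LCL, UCL)
every_second = sampled_signals(readings, 2, LCL, UCL)
every_fourth = sampled_signals(readings, 4, LCL, UCL)
print("every reading:", every_reading)
print("every second reading:", every_second)
print("every fourth reading:", every_fourth)
```

Sampling every reading or every second reading catches the shift, while sampling every fourth reading misses it entirely. Starting with frequent sampling and backing off only while shifts remain detectable avoids this blind spot.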
Sampling too little will not allow for sufficient detection of shifts in the process because of
Special Causes.
[Figure: Output plotted for all possible samples, followed by I Charts of the same output at three sampling frequencies. Sample every half hour (I Chart of Sample_3): UCL=7.385, X-bar=6.1, LCL=4.815. Sample every hour (I Chart of Sample_6): UCL=8.168, X-bar=6.129, LCL=4.090. Sample 4x per shift (I Chart of Sample_12): UCL=6.559, X-bar=5.85, LCL=5.141.]
SPC Selection Process
The Control Charts you choose to use will always be based first on the type of data you have then
on the objective of the Control Chart. The first selection criteria will be whether you have
Attribute or Continuous Data.

Choose Appropriate Control Chart (by type of data)
–  ATTRIBUTE (by type of attribute data)
   •  DEFECTS (by type of defect)
      CONSTANT: C Chart (Number of Incidences)
      VARIABLE: U Chart (Incidences per Unit)
   •  DEFECTIVES (by type of subgroups)
      CONSTANT: NP Chart (Number of Defectives)
      VARIABLE: P Chart (Proportion Defectives)
–  CONTINUOUS (by subgroup size)
   •  Sample size 1: I-MR Chart (Individuals & Moving Range)
   •  Sample size 2-5: Xbar-R Chart (Mean & Range)
   •  Sample size 10+: Xbar-S Chart (Mean & Std. Dev.)
–  SPECIAL CASES
   •  CumSum Chart (Cumulative Sum)
   •  EWMA Chart (Exponentially Weighted Moving Average)

Continuous SPC refers to Control Charts displaying process input or output characteristics based
on Continuous Data - data where decimal subdivisions have meaning. When these Control Charts
are used to control the Critical X input characteristic it is called Statistical Process Control
(SPC). These charts can also be used to monitor the CTQ’s, the important process outputs. When
this is done it is referred to as Statistical Process Monitoring (SPM).
There are two categories of Control Charts for Continuous Data: charts for controlling the process
average and charts for controlling the process variation. Generally, the two categories are combined.
The principal types of Control Charts used in Six Sigma are: charts for Individual Values and Moving
Ranges (I-MR), charts for Averages and Ranges (XBar-R), charts for Averages and Standard
Deviations (XBar-S) and Exponentially Weighted Moving Average charts (EWMA).
Although it is preferable to monitor and control products, services and supporting processes with
Continuous Data, there will be times when Continuous Data is not available or there is a need to
measure and control processes with higher level metrics, such as defects per unit. There are many
examples where process measurements are in the form of Attribute Data. Fortunately there are
control tools that can be used to monitor these characteristics and to control the critical process inputs
and outputs that are measured with Attribute Data.
Attribute Data, also called discrete data, reflects only one of two conditions: conforming or
nonconforming, pass or fail, go or no go. Four principal types of Control Charts are used to monitor
and control characteristics measured in Attribute Data: the p (proportion nonconforming), np (number
nonconforming), c (number of non-conformities), and u (non-conformities per unit) charts. These
charts are an aid for decision making. With Control Limits, they can help us filter out the probable
noise by adequately reflecting the Voice of the Process.
A defective is defined as an entire unit, whether it be a product or service, that fails to meet
acceptance criteria, regardless of the number of defects in the unit. A defect is defined as the failure to
meet any one of the many acceptance criteria. Any unit with at least one defect may be considered to
be a defective. Sometimes more than one defect is allowed, up to some maximum number, before the
product is considered to be defective.
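The selection logic described in this section can be written down as a small decision function. The Python sketch below follows the flowchart's branches for data type, subgroup size, defects versus defectives and constant versus variable sample size. The function name `choose_control_chart` and its parameters are hypothetical, and the handling of subgroup sizes 6 to 9 is an assumption because the flowchart does not cover them.

```python
def choose_control_chart(data_type, subgroup_size=1, counting="defects",
                         constant_sample_size=True):
    """Suggest a Control Chart type following the selection flowchart in this section."""
    if data_type == "continuous":
        if subgroup_size < 1:
            raise ValueError("subgroup_size must be at least 1")
        if subgroup_size == 1:
            return "I-MR"
        if subgroup_size <= 5:
            return "Xbar-R"
        if subgroup_size >= 10:
            return "Xbar-S"
        # Assumption: the flowchart is silent on sizes 6 to 9, so both are offered.
        return "Xbar-R or Xbar-S"
    if data_type == "attribute":
        if counting == "defects":        # flaws counted, possibly several per unit
            return "C" if constant_sample_size else "U"
        if counting == "defectives":     # whole units judged good or bad
            return "NP" if constant_sample_size else "P"
        raise ValueError("counting must be 'defects' or 'defectives'")
    raise ValueError("data_type must be 'continuous' or 'attribute'")


print(choose_control_chart("continuous", subgroup_size=1))
print(choose_control_chart("continuous", subgroup_size=5))
print(choose_control_chart("attribute", counting="defectives", constant_sample_size=False))
```

The special-case charts (Cumulative Sum and EWMA) are left out of this sketch because choosing them depends on the size of shift to be detected rather than on the data type alone.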
Understanding Variable Control Chart Selection
Type of Chart: When do you need it?
(Listed from most common to less common.)
•  Average & Range or S (Xbar and R or Xbar and S): Production is higher volume; allows process
   Mean and variability to be viewed and assessed together; requires more sampling than
   Individuals (I) and Moving Range (MR) Charts and is used when subgroups are desired. Outliers
   can cause issues with Range (R) charts so Standard Deviation (S) charts are used instead if this
   is a concern.
•  Individual and Moving Range: Production is low volume or cycle time to build product is long or
   homogeneous sample represents entire product (batch etc.); sampling and testing is costly so
   subgroups are not desired. Control limits are wider than Xbar Charts. Used for SPC on most inputs.
•  Pre-Control: Set-up is critical, or cost of setup scrap is high. Use for outputs.
•  Exponentially Weighted Moving Average: Small shift needs to be detected, often because of
   autocorrelation of the output results. Used only for individuals or averages of Outputs.
   Infrequently used because of calculation complexity.
•  Cumulative Sum: Same reasons as EWMA (Exponentially Weighted Moving Average) except the past
   data is as important as present data.
Understanding Attribute Control Chart Selection
Type of Chart: When do you need it?
•  P: Need to track the fraction of defective units; sample size is variable and usually > 50.
•  nP: When you want to track the number of defective units per subgroup; sample size is usually
   constant and usually > 50.
•  C: When you want to track the number of defects per subgroup of units produced; sample size is
   constant.
•  U: When you want to track the number of defects per unit; sample size is variable.
The P Chart is the most common type of chart in understanding Attribute Control Charts.
Detection of Assignable Causes or Patterns
Control Charts indicate Special Causes being either assignable causes or patterns.
The following rules are applicable for both variable and Attribute Data to detect Special Causes.
These four rules are the only applicable tests for Range (R), Moving Range (MR) or Standard
Deviation (S) charts.
–  One point more than 3 Standard Deviations from the Center Line.
–  6 points in a row all either increasing or all decreasing.
–  14 points in a row alternating up and down.
–  9 points in a row on the same side of the center line.
These remaining four rules are only for variable data to detect Special Causes.
–  2 out of 3 points greater than 2 Standard Deviations from the Center Line on the same
side.
–  4 out of 5 points greater than 1 Standard Deviation from the Center Line on the same
side.
–  15 points in a row all within one Standard Deviation of either side of the Center Line.
–  8 points in a row all more than one Standard Deviation from the Center Line, on either
side.
Remember Control Charts are used to monitor a process performance and to detect Special
Causes due to assignable causes or patterns. The standardized rules of your organization may
have some of the numbers slightly differing. For example, some organizations have 7 or 8 points
in a row on the same side of the Center Line. We will soon show you how to find what your
MINITAB™ version has for defaults for the Special Cause tests.
There are typically 8 available tests for detecting Special Cause variation. Only 4 of the 8 Special
Cause tests can be used on Range, Moving Range or Standard Deviation charts, which are used to
monitor “within” variation.
If you are unsure of what is meant by these specific rule definitions, do not worry. The next few
pages will specifically explain how to interpret these rules.
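For readers who find code easier to follow than rule statements, the Python sketch below implements three of the tests listed above: one point more than 3 Standard Deviations from the Center Line, 9 points in a row on the same side of the Center Line, and 6 points in a row all increasing or all decreasing. The function names and data are hypothetical, the Center Line and Standard Deviation are assumed to be supplied by the caller, and this is not a description of how MINITAB™ implements its tests.

```python
def beyond_3_sigma(values, center, sigma):
    """Indices of points more than 3 Standard Deviations from the Center Line."""
    return [i for i, v in enumerate(values) if abs(v - center) > 3 * sigma]


def run_same_side(values, center, run_length=9):
    """Indices at which a run of `run_length` points on one side of the Center Line is reached."""
    hits, run, last_side = [], 0, 0
    for i, v in enumerate(values):
        side = (v > center) - (v < center)   # +1 above, -1 below, 0 on the line
        if side == 0:
            run = 0                          # a point on the line breaks the run
        elif side == last_side:
            run += 1
        else:
            run = 1
        last_side = side
        if run >= run_length:
            hits.append(i)
    return hits


def trend(values, run_length=6):
    """Indices at which `run_length` points in a row are all increasing or all decreasing."""
    hits, run, last_dir = [], 1, 0
    for i in range(1, len(values)):
        direction = (values[i] > values[i - 1]) - (values[i] < values[i - 1])
        if direction == 0:
            run = 1                          # a tie breaks the trend
        elif direction == last_dir:
            run += 1
        else:
            run = 2                          # two points start a new trend
        last_dir = direction
        if run >= run_length:
            hits.append(i)
    return hits


# Hypothetical data with an assumed Center Line of 10 and Standard Deviation of 1.
data = [10.0, 10.5, 9.8, 10.2, 9.9, 14.0, 10.1]
print(beyond_3_sigma(data, center=10.0, sigma=1.0))
print(run_same_side([11.0] * 9 + [9.0], center=10.0))
print(trend([1, 2, 3, 4, 5, 6, 5]))
```

Each function reports the index at which the rule is satisfied, so a chart can flag the point where action should begin. The run lengths are parameters because, as noted above, organizations differ on the exact numbers.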
Recommended Special Cause Detection Rules
•  If implementing SPC manually without software initially the most visually obvious
violations are more easily detected. SPC on manually filled charts are common place
for initial use of Defect Prevention techniques.
•  These three rules are visually the most easily detected by personnel.
–  6 points in a row all either increasing or all decreasing.
–  15 points in a row all within one Standard Deviation of either side of the Center Line.
•  Dr. Shewhart, working with the Western Electric Co., was credited with the following four
rules, referred to as the Western Electric Rules.
–  One point more than 3 Standard Deviations from the Center Line.
–  8 points in a row on the same side of the Center Line.
–  2 out of 3 points greater than 2 Standard Deviations from the Center Line on the same side.
–  4 out of 5 points greater than 1 Standard Deviation from the Center Line on the same side.
•  You might notice the Western Electric rules vary slightly from the rules listed earlier. What
matters is that your organization decides which rules it will use to detect Special
Causes and applies them consistently.
•  VERY few organizations use all eight rules for detecting Special Causes.
Special Cause Rule Default in MINITAB™
If a Belt is using MINITAB™ she must be aware of the default setting rules. Program defaults may be altered by:
Tools>Options>Control Charts and Quality Tools> Tests
Many experts have commented on the appropriate tests and numbers to be used. Decide, then be consistent when implementing.
Special Cause Test Examples
As promised, we will now closely review the definition of the Special Cause tests. The first test is
one point more than 3 sigma from the Center Line. The 3 sigma lines are drawn 3 sigma above and
below the Center Line. The sigma estimation for the short-term variation will be shown later in this
module.
If only one point is above the upper 3 sigma line or below the lower 3 sigma line, then a Special
Cause is indicated. You do not need to wait for another point to fall outside the 3 sigma lines
before taking action. Don’t forget the methodology of using SPC.
This is the MOST common Special Cause test used in SPC charts.
[Figure: Test 1, one point beyond zone A. Zones A, B and C are marked on each side of the Center Line; the violating point is labeled 1.]
If you want to see the MINITAB™ output shown in the figure, execute the MINITAB™ command “Stat,
Control Charts, Variable Charts for Individuals, Individuals”, then select the “Xbar-R Chart –
Options” and then the “Tests” tab. Remember the numbers on your screen may differ from those on the
slide; they are set in the defaults, as you were shown earlier in this module. From now on we will
assume your rules are the same as shown in this module. If not, just adjust the conclusions.
This test is an indication of a shift in the process Mean.
[Figure: Test 2, nine points in a row on same side of center line. Zones A, B and C are marked on each side of the Center Line.]
The second test for detecting Special Causes is nine points in a row on the same side of the
Center Line. This means if nine consecutive points are above the Center Line a Special Cause is
detected that would account for a potential Mean shift in the process. This rule would also be
violated if nine consecutive points are below the Center Line. The amount away from the Center
Line does not matter as long as the consecutive points are all on the same side.
This test is indicating a trend or gradual shift in the Mean.
[Figure: Test 3, six points in a row, all increasing or decreasing. Zones A, B and C are marked on each side of the Center Line.]
The third test looking for a Special Cause is six points in a row all increasing or all
decreasing. This means if six consecutive times the present point is higher than the previous
point then the rule has been violated and the process is out of control. The rule is also violated
if for six consecutive times the present point is lower than the previous point.
This rule obviously needs the time order when plotting on the SPC charts to be valid. Typically,
these charts plot increasing time from left to right with the most recent point on the right hand side
of the chart. Do not make the mistake of seeing six points in a line indicating an out of control
condition. Note that in the example figure the straight line shows 7 points, but it takes that
many in order to have six consecutive points increasing. This rule would be violated no matter
what zone the points occur in.
This test is indicating a non-random pattern.
[Figure: Test 4, fourteen points in a row, alternating up and down. Zones A, B and C are marked on each side of the Center Line.]
The fourth rule for a Special Cause indication is fourteen points in a row alternating up and
down. In other words if the first point increased from the last point and the second point
decreased from the first point and the third point increased from the second point and so on for
fourteen points then the process is considered out of control or a Special Cause is indicated. This
rule does not depend on the points being in any particular zone of the chart. Also note the process
is not considered to be out of control until after the 14th point has followed the alternating up and
down pattern.
This test is indicating a shift in the Mean or a worsening of variation.
[Figure: Test 5, two out of three points in a row in zone A (one side of center line). Two points are labeled 5.]
The fifth Special Cause test looks for 2 out of 3 consecutive points more than 2 sigma away from
the Center Line on the same side. The 2 sigma line is obviously 2/3 of the distance from the
Center Line as the 3 sigma line. Please note it is not required that the points more than 2 sigma
away be in consecutive order; they just have to be within a group of 3 consecutive points. Notice
the example in the figure does NOT have 2 consecutive points 2 sigma away from the Center
Line but 2 out of the 3 consecutive points are more than 2 sigma away. Notice this rule is not
violated if the 2 points are more than 2 sigma away but NOT on the same side.
Have you noticed MINITAB™ will automatically place a number by the point that violates the
Special Cause rule and that number tells you which of the Special Cause tests has been violated?
In the Test 5 example the Special Cause rule was violated two times.
This test is indicating a shift in the Mean or degradation of variation.
[Figure: Test 6, four out of five points in zone B or beyond (one side of center line). Two points are labeled 6.]
The sixth Special Cause test looks for any four out of five points more than one sigma from the
Center Line all on the same side. Only the 4 points that were more than one sigma need to be
on the same side. If four of the five consecutive points are more than one sigma from the Center
Line and on the same side do NOT make the wrong assumption that the rule would not be
violated if one of the four points was actually more than 2 sigma from the Center Line.
This test is indicating a dramatic improvement of the variation in the process.
[Figure: Test 7, fifteen points in a row in zone C (both sides of center line).]
The seventh Special Cause test looks for 15 points in a row all within one sigma from the Center
Line. You might think this is a good thing and it certainly is. However you might want to find the
Special Cause for this reduced variation so the improvement can be sustained in the future.
This test is indicating a severe worsening of variation.
[Figure: Test 8, eight points in a row beyond zone C (both sides of center line).]
The eighth and final test for Special Cause detection is having eight points in a row all more
than one sigma from the Center Line. The eight consecutive points can be any number of sigma
away from the Center Line. Do NOT make the wrong assumption this rule would not be violated if
some of the points were more than 2 sigma away from the Center Line. If you reread the rule it just
states the points must be more than one sigma from the Center Line.
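The eight tests described above can be sketched in a few lines of Python. This is only an illustration, not how MINITAB™ implements them: the function name `special_cause_tests` and the helper `last` are made up for this sketch, the Center Line and sigma are assumed to be already known, and Test 3 follows this module's reading that six consecutive increases span seven plotted points (other software may count points rather than moves).

```python
def special_cause_tests(points, center, sigma):
    """Return {test number: [index of the point that completes a violation]}."""
    z = [(x - center) / sigma for x in points]   # distance from Center Line in sigmas
    # Direction of each move between neighbouring points: +1 up, -1 down, 0 flat.
    step = [(b > a) - (b < a) for a, b in zip(z, z[1:])]
    hits = {t: [] for t in range(1, 9)}

    def last(seq, end, size):
        """The `size` items of seq ending at index `end`, or [] if there are too few."""
        start = end - size + 1
        return seq[start:end + 1] if start >= 0 else []

    for i in range(len(z)):
        if abs(z[i]) > 3:                                   # Test 1
            hits[1].append(i)
        w = last(z, i, 9)                                   # Test 2: same side
        if w and (all(v > 0 for v in w) or all(v < 0 for v in w)):
            hits[2].append(i)
        s = last(step, i - 1, 6)                            # Test 3: six rises or falls
        if s and (all(d == 1 for d in s) or all(d == -1 for d in s)):
            hits[3].append(i)
        s = last(step, i - 1, 13)                           # Test 4: 14 points alternate
        if s and all(a * b == -1 for a, b in zip(s, s[1:])):
            hits[4].append(i)
        w = last(z, i, 3)                                   # Test 5: 2 of 3 beyond 2 sigma
        if w and (sum(v > 2 for v in w) >= 2 or sum(v < -2 for v in w) >= 2):
            hits[5].append(i)
        w = last(z, i, 5)                                   # Test 6: 4 of 5 beyond 1 sigma
        if w and (sum(v > 1 for v in w) >= 4 or sum(v < -1 for v in w) >= 4):
            hits[6].append(i)
        w = last(z, i, 15)                                  # Test 7: 15 within 1 sigma
        if w and all(abs(v) < 1 for v in w):
            hits[7].append(i)
        w = last(z, i, 8)                                   # Test 8: 8 beyond 1 sigma
        if w and all(abs(v) > 1 for v in w):
            hits[8].append(i)
    return hits
```

Each test is checked independently, so a single point can be flagged by more than one test, much as the chart places more than one number beside a point.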
SPC Center Line and Control Limit Calculations
This is a reference in case you really want to get into the nitty-gritty. The formulas shown here are
the basis for Control Charts.
Calculate the parameters of the Individual and MR Control Charts with the following:
Center Line:
$\bar{X} = \frac{\sum_{i=1}^{k} x_i}{k}$ and $\overline{MR} = \frac{\sum_{i=1}^{k} R_i}{k}$
Control Limits:
$UCL_x = \bar{X} + E_2\,\overline{MR}$ and $LCL_x = \bar{X} - E_2\,\overline{MR}$
$UCL_{MR} = D_4\,\overline{MR}$ and $LCL_{MR} = D_3\,\overline{MR}$
Where ~
$\bar{X}$ (Xbar): Average of the individuals, becomes the Center Line on the Individuals Chart
$x_i$: Individual data points
$k$: Number of individual data points
$R_i$: Moving range between individuals, generally calculated using the difference between each successive pair of readings
$\overline{MR}$ (MRbar): The average moving range, the Center Line on the Range Chart
$UCL_x$: Upper Control Limit on Individuals Chart
$LCL_x$: Lower Control Limit on Individuals Chart
$UCL_{MR}$: Upper Control Limit on moving range
$LCL_{MR}$: Lower Control Limit on moving range (does not apply for sample sizes below 7)
$E_2$, $D_3$, $D_4$: Constants that vary according to the sample size used in obtaining the moving range
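As a minimal sketch of the Individuals and MR calculations, the Python function below uses the commonly tabulated constants for a moving range of two successive readings (E2 = 2.66, D3 = 0, D4 = 3.267). The function name and the sample readings are hypothetical. Note that k readings produce k − 1 moving ranges, so the sketch averages over the ranges actually available.

```python
def imr_limits(x):
    """Center Lines and Control Limits for an Individuals and Moving Range chart."""
    E2, D3, D4 = 2.66, 0.0, 3.267            # constants for a moving range of 2 readings
    mr = [abs(b - a) for a, b in zip(x, x[1:])]   # successive differences
    x_bar = sum(x) / len(x)
    mr_bar = sum(mr) / len(mr)
    return {
        "x_center": x_bar,
        "x_ucl": x_bar + E2 * mr_bar,
        "x_lcl": x_bar - E2 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": D4 * mr_bar,
        "mr_lcl": D3 * mr_bar,
    }

# Hypothetical readings, only to show the call.
limits = imr_limits([10, 12, 11, 13, 12])
```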
Calculate the parameters of the Xbar and R Control Charts with the following:
Center Line:
$\bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{x}_i}{k}$ and $\bar{R} = \frac{\sum_{i=1}^{k} R_i}{k}$
Control Limits:
$UCL_{\bar{x}} = \bar{\bar{X}} + A_2 \bar{R}$ and $LCL_{\bar{x}} = \bar{\bar{X}} - A_2 \bar{R}$
$UCL_R = D_4 \bar{R}$ and $LCL_R = D_3 \bar{R}$
Where ~
$\bar{\bar{X}}$: Average of the subgroup averages, it becomes the Center Line of the Control Chart
$\bar{x}_i$: Average of each subgroup
$k$: Number of subgroups
$R_i$: Range of each subgroup (Maximum observation – Minimum observation)
$\bar{R}$ (Rbar): The average range of the subgroups, the Center Line on the Range Chart
$UCL_{\bar{x}}$: Upper Control Limit on Average Chart
$LCL_{\bar{x}}$: Lower Control Limit on Average Chart
$UCL_R$: Upper Control Limit on Range Chart
$LCL_R$: Lower Control Limit on Range Chart
$A_2$, $D_3$, $D_4$: Constants that vary according to the subgroup sample size
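The Xbar and R formulas can be sketched the same way. The constants in the small table are typical published values for subgroup sizes 2 to 5 (some tables differ in the third decimal); the function name and the two sample subgroups are hypothetical.

```python
# Typical published constants by subgroup size n: (A2, D3, D4).
XBAR_R_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}

def xbar_r_limits(subgroups):
    """Center Lines and Control Limits for Xbar and R charts (equal subgroup sizes)."""
    n = len(subgroups[0])
    a2, d3, d4 = XBAR_R_CONSTANTS[n]
    means = [sum(g) / n for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    x_dbar = sum(means) / len(means)      # average of the subgroup averages
    r_bar = sum(ranges) / len(ranges)     # average range
    return {
        "x_center": x_dbar,
        "x_ucl": x_dbar + a2 * r_bar,
        "x_lcl": x_dbar - a2 * r_bar,
        "r_center": r_bar,
        "r_ucl": d4 * r_bar,
        "r_lcl": d3 * r_bar,
    }

# Two hypothetical subgroups of five readings each.
limits = xbar_r_limits([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
```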
Yet another reference just in case anyone wants to do this stuff manually…have fun!!!!
Calculate the parameters of the Xbar and S Control Charts with the following:
Center Line:
$\bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{x}_i}{k}$ and $\bar{S} = \frac{\sum_{i=1}^{k} s_i}{k}$
Control Limits:
$UCL_{\bar{x}} = \bar{\bar{X}} + A_3 \bar{S}$ and $LCL_{\bar{x}} = \bar{\bar{X}} - A_3 \bar{S}$
$UCL_S = B_4 \bar{S}$ and $LCL_S = B_3 \bar{S}$
Where ~
$\bar{\bar{X}}$: Average of the subgroup averages, it becomes the Center Line of the Control Chart
$\bar{x}_i$: Average of each subgroup
$k$: Number of subgroups
$s_i$: Standard Deviation of each subgroup
$\bar{S}$ (Sbar): The average Standard Deviation of the subgroups, the Center Line on the S chart
$UCL_{\bar{x}}$: Upper Control Limit on Average Chart
$LCL_{\bar{x}}$: Lower Control Limit on Average Chart
$UCL_S$: Upper Control Limit on S Chart
$LCL_S$: Lower Control Limit on S Chart
$A_3$, $B_3$, $B_4$: Constants that vary according to the subgroup sample size
We are now moving to the formula summaries for the Attribute SPC Charts. These formulas are fairly
basic. The upper and lower Control Limits are equidistant from the Mean % defective unless you
reach a natural limit of 100% or 0%. Remember the P Chart is for tracking the proportion or %
defective.
Calculate the parameters of the P Control Charts with the following:
$\bar{p} = \frac{\text{Total number of defective items}}{\text{Total number of items inspected}}$
$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n_i}}$
$LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n_i}}$
Where ~
$\bar{p}$: Average proportion defective (0.0 – 1.0)
$n_i$: Number inspected in each subgroup
$LCL_p$: Lower Control Limit on P Chart
$UCL_p$: Upper Control Limit on P Chart
Since the Control Limits are a function of sample size they will vary for each sample.
The nP Chart’s formulas resemble the P Chart. This chart tracks the number of defective items in a
subgroup.
Calculate the parameters of the nP Control Charts with the following:
$n\bar{p} = \frac{\text{Total number of defective items}}{\text{Total number of subgroups}}$
$UCL_{np} = n_i\bar{p} + 3\sqrt{n_i\bar{p}(1-\bar{p})}$
$LCL_{np} = n_i\bar{p} - 3\sqrt{n_i\bar{p}(1-\bar{p})}$
Where ~
$n\bar{p}$: Average number of defective items per subgroup
$LCL_{np}$: Lower Control Limit on nP chart
$UCL_{np}$: Upper Control Limit on nP chart
Since the Control Limits AND Center Line are a function of sample size they will vary for each sample.
The U Chart is also basic in construction and is used to monitor the number of defects per unit.
Calculate the parameters of the U Control Charts with the following:
$\bar{u} = \frac{\text{Total number of defects identified}}{\text{Total number of units inspected}}$
$UCL_u = \bar{u} + 3\sqrt{\frac{\bar{u}}{n_i}}$
$LCL_u = \bar{u} - 3\sqrt{\frac{\bar{u}}{n_i}}$
Where ~
$\bar{u}$: Total number of defects divided by the total number of units inspected.
$n_i$: Number of units inspected in each subgroup.
$LCL_u$: Lower Control Limit on U Chart.
$UCL_u$: Upper Control Limit on U Chart.
Since the Control Limits are a function of sample size they will vary for each sample.
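To see how the limits vary with sample size, here is a minimal Python sketch of the U Chart calculation. The function name and the defect counts are hypothetical, and clipping the Lower Control Limit at zero is an assumption of this sketch (a defect rate cannot be negative), not something stated in the formulas above.

```python
import math

def u_chart(defects, units):
    """Return (u_bar, rows); each row is (u_i, LCL_i, UCL_i) for one subgroup."""
    u_bar = sum(defects) / sum(units)          # Center Line
    rows = []
    for d, n in zip(defects, units):
        half_width = 3 * math.sqrt(u_bar / n)  # narrower when more units are inspected
        lcl = max(0.0, u_bar - half_width)     # assumption: clip at the natural limit of 0
        rows.append((d / n, lcl, u_bar + half_width))
    return u_bar, rows

# Hypothetical data: defects found and units inspected in three subgroups.
u_bar, rows = u_chart(defects=[4, 6, 3], units=[10, 20, 10])
```

The subgroup with 20 units gets tighter limits than the subgroups with 10 units, which is why the limits on a U Chart step up and down from sample to sample.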
The C Control Charts are a nice way of monitoring the number of defects in sampled subgroups.
Calculate the parameters of the C Control Charts with the following:
$\bar{c} = \frac{\text{Total number of defects}}{\text{Total number of subgroups}}$
$UCL_c = \bar{c} + 3\sqrt{\bar{c}}$
$LCL_c = \bar{c} - 3\sqrt{\bar{c}}$
Where ~
$\bar{c}$: Total number of defects divided by the total number of subgroups.
$LCL_c$: Lower Control Limit on C Chart.
$UCL_c$: Upper Control Limit on C Chart.
The EWMA chart can be considered a smoothing monitoring system with Control Limits. It is rarely
used without computers or automated calculations. The items plotted are NOT the actual
measurements but the weighted measurements. The exponentially weighted moving average is
useful for considering past and historical data and is most commonly used for individual
measurements, although it has been used for averages of subgroups.
Calculate the parameters of the EWMA Control Charts with the following:
$Z_t = \lambda X_t + (1-\lambda) Z_{t-1}$
$UCL = \bar{X} + 3\frac{\sigma}{\sqrt{n}}\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1-(1-\lambda)^{2t}\right]}$
$LCL = \bar{X} - 3\frac{\sigma}{\sqrt{n}}\sqrt{\left(\frac{\lambda}{2-\lambda}\right)\left[1-(1-\lambda)^{2t}\right]}$
Where ~
$Z_t$: EWMA statistic plotted on Control Chart at time t
$Z_{t-1}$: EWMA statistic plotted on Control Chart at time t-1
$\lambda$: The weighting factor between 0 and 1 – suggest using 0.2
$\sigma$: Standard Deviation of historical data (pooled Standard Deviation for subgroups – MRbar/d2 for individual observations)
$X_t$: Individual data point or sample averages at time t
$UCL$: Upper Control Limit on EWMA Chart
$LCL$: Lower Control Limit on EWMA Chart
$n$: Subgroup sample size
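A minimal Python sketch of the EWMA recursion and its time-varying limits follows. The function name and the two sample readings are hypothetical, and starting the recursion at Z0 equal to the Center Line is a common convention assumed here, not something the formulas above specify.

```python
import math

def ewma_chart(x, center, sigma, lam=0.2, n=1):
    """Return a list of (Z_t, LCL_t, UCL_t), one tuple per observation."""
    z_prev = center            # assumption: Z_0 starts at the Center Line
    rows = []
    for t, x_t in enumerate(x, start=1):
        z_t = lam * x_t + (1 - lam) * z_prev
        # The limits widen with t and level off as (1 - lam)**(2t) goes to zero.
        half_width = 3 * (sigma / math.sqrt(n)) * math.sqrt(
            (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * t))
        )
        rows.append((z_t, center - half_width, center + half_width))
        z_prev = z_t
    return rows

# Hypothetical individual readings around a Center Line of 10 with sigma = 1.
rows = ewma_chart([10, 12], center=10.0, sigma=1.0)
```

With λ = 0.2 each new reading contributes only 20% of the plotted value, which is why the chart smooths single spikes while still accumulating evidence of a sustained shift.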
Calculate the parameters of the CUSUM control charts with MINITAB™ or other program since the calculations are even more complicated than the EWMA charts.
Because of this complexity of formulas, execution of either this or the EWMA is not done without automation and computer assistance.
Ah, anybody got a laptop?
The CUSUM is an even more difficult technique to handle with manual calculations. We are not even
showing the math behind this rarely used chart. Following the Control Chart selection route shown
earlier, we remember the CUSUM is used when historical information is as important as present data.
Pre-Control Charts
Pre-Control Charts use limits relative to the specification limits. This is the
first and ONLY chart wherein you will see specification limits plotted for
Statistical Process Control. This is the most basic type of chart and the least
sophisticated use of process control.
[Figure: Pre-Control scale marked 0.0, 0.25, 0.5, 0.75 and 1.0 and labeled LSL, Target and USL; the zones from left to right are Red, Yellow, Green, Yellow, Red.]
Red Zones. Zone outside the specification limits. Signals the process is out-of-control and should be stopped.
Yellow Zones. Zone between the PC Lines and the specification limits indicating caution and the need to watch the process closely.
Green Zone. Zone lies between the PC Lines, signals the process is in control.
The Pre-Control Charts are often used for startups with high scrap cost or low production volumes
between setups. Pre-Control Charts are like a stoplight and are the easiest type of SPC for
operators or staff to use. Remember Pre-Control Charts are to be used ONLY for outputs of a process.
Another approach to using Pre-Control Charts is to use process capability to set the limits where
yellow and red meet.
Process Setup and Restart with Pre-Control
Qualifying Process
•  To qualify a process five consecutive parts must fall within the green zone
•  The process should be qualified after tool changes, adjustments,
new operators, material changes, etc.
Monitoring Ongoing Process

•  Sample two consecutive parts at predetermined frequency
–  If either part is in the red, stop production and find reason for variation
–  When one part falls in the yellow zone inspect the other and:
•  If the second part falls in the green zone then continue
•  If the second part falls in the yellow zone on the same side make an
adjustment to the process
•  If second part falls in the yellow zone on the opposite side or in the
red zone the process is out of control and should be stopped
–  If any part falls outside the specification limits or in the red zone the
process is out of control and should be stopped
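The qualifying and monitoring rules above can be sketched as a small Python decision helper. The function names are hypothetical, and the sketch assumes the PC Lines sit at 25% and 75% of the tolerance, consistent with the 0.25 and 0.75 marks on the Pre-Control scale; use your own limits if your organization sets them from process capability.

```python
def zone(x, lsl, usl):
    """Classify one measurement as red, yellow-low, yellow-high or green."""
    width = usl - lsl
    low_pc, high_pc = lsl + 0.25 * width, usl - 0.25 * width   # assumed PC Lines
    if x < lsl or x > usl:
        return "red"
    if x < low_pc:
        return "yellow-low"
    if x > high_pc:
        return "yellow-high"
    return "green"

def qualified(parts, lsl, usl):
    """A process qualifies when five consecutive parts fall in the green zone."""
    return len(parts) == 5 and all(zone(p, lsl, usl) == "green" for p in parts)

def monitor(first, second, lsl, usl):
    """Decision for two consecutive parts: 'continue', 'adjust' or 'stop'."""
    z1, z2 = zone(first, lsl, usl), zone(second, lsl, usl)
    if "red" in (z1, z2):
        return "stop"                 # any part in the red zone
    if "green" in (z1, z2):
        return "continue"             # at most one yellow, the other green
    # Both parts are yellow: same side means adjust, opposite sides mean stop.
    return "adjust" if z1 == z2 else "stop"
```

For example, with LSL = 0.0 and USL = 1.0, `monitor(0.1, 0.2, 0.0, 1.0)` returns `"adjust"` because both parts fall in the lower yellow zone.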
Responding to Out of Control Indications
•  The power of SPC is not to find out what the Center Line and Control Limits are.
•  The power is to react to the Out of Control (OOC) indications with your Out of Control
Action Plans (OCAP) for the process involved. These actions are your corrective
actions to correct the output or input to achieve proper conditions.
VIOLATION: Special Cause is indicated
OCAP: If response time is too high get additional person on phone bank
•  SPC requires immediate response to a Special Cause indication.
•  SPC also requires no sub optimizing by those operating the process.
–  Variability will increase if operators always adjust on every point if not at the
Center Line. ONLY respond when an Out of Control or Special Cause is detected.
–  Training is required to interpret the charts and respond to the charts.
SPC is an exciting tool but we must not get enamored with it. The power of SPC is not to
find the Center Line and Control Limits but to react to out of control indications with an
out of control action plan. For SPC to be effective at controlling and reducing long-term
variation, respond immediately to out of control or Special Cause indications. Plot an
I Chart of Individual Value from the worksheet titled “Individual Chart” and show the point that is out of
control. SPC can actually be harmful if those operating the process respond to process variation with
suboptimizing. A basic rule of SPC is if it is not out of control as indicated by the rules do not make
any adjustments. There are studies where an operator who responds to off center measurements will
actually produce worse variation than a process not altered at all. Remember, being off the Center
Line is NOT a sign of out of control because Common Cause variation exists.
Training is required to use and interpret the charts not to mention training for you as a Belt to properly
create an SPC chart.
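The warning about adjusting on every point can be illustrated with a small simulation. This is a sketch under stated assumptions, not one of the studies mentioned above: the process has only Common Cause variation (normally distributed noise), and the hypothetical operator moves the machine setting by the full deviation of each reading from the Center Line.

```python
import random
import statistics

def simulate(n=20000, sigma=1.0, tamper=False, seed=1):
    """Simulate n readings; if tamper is True, adjust after every reading."""
    rng = random.Random(seed)
    setting = 0.0                  # machine offset relative to the Center Line
    readings = []
    for _ in range(n):
        y = setting + rng.gauss(0.0, sigma)   # Common Cause variation only
        readings.append(y)
        if tamper:
            setting -= y           # operator compensates for the full deviation
    return readings

stable = statistics.pstdev(simulate(tamper=False))
tampered = statistics.pstdev(simulate(tamper=True))
ratio = tampered / stable          # how much worse the adjusted process is
```

Under these assumptions each adjusted reading carries the noise of two readings, so the Standard Deviation grows by a factor of about the square root of 2 compared with leaving the process alone.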
Attribute SPC Example
Practical Problem: A project has been launched to get rework
reduced to less than 25% of paychecks. Rework includes contacting a
manager about overtime hours to be paid. The project made some
progress, but the team decided it needs to implement SPC to sustain the gains
and track % defective. Please analyze the file paycheck2.mtw and
determine the Control Limits and Center Line.
Steps 3 and 5 of the methodology are the primary focus for this example.
–  Select the appropriate Control Chart and Special Cause tests to
employ
–  Calculate the Center Line and Control Limits
–  Looking at the data set we see 20 weeks of data.
–  The sample size is constant at 250.
–  The number of defectives in each sample is in column C3.
Paycheck2.mtw
The example includes % paychecks defective. The metric to be charted is % defective. We see the
P Chart is the most appropriate Attribute SPC Chart.
Notice specifications were never discussed. Let’s calculate the Control Limits and Center Line for this example.
We will confirm what rules for Special Causes are included in our Control Chart analysis. The top 3
were selected.
Remember to click on the Options… and Tests tab to clarify the rules for detecting Special Causes.
…. Chart Options>Tests
No Special Causes were detected. The average % defective checks was 20.38%. The UCL was 28.0% and the LCL was 12.7%.
Now we must see if the next few weeks are showing Special
Cause from the results. The sample size remained at 250 and the
defective checks were 61, 64, 77.
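The limits quoted above can be checked by hand with the P Chart formulas. The short Python sketch below uses only the numbers given in this example (average 20.38% defective, sample size 250, new counts of 61, 64 and 77); the variable names are made up, and the check covers only the 3 sigma rule, not the other selected tests.

```python
import math

p_bar = 0.2038     # average proportion defective from the first 20 weeks
n = 250            # constant sample size

sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
ucl = p_bar + 3 * sigma_p
lcl = max(0.0, p_bar - 3 * sigma_p)

# The three new weeks, compared against the limits from the first 20 weeks.
new_defectives = [61, 64, 77]
new_proportions = [d / n for d in new_defectives]
out_of_control = [not (lcl <= p <= ucl) for p in new_proportions]

print(f"UCL = {ucl:.3f}, LCL = {lcl:.3f}")
print(list(zip(new_proportions, out_of_control)))
```

The calculated limits round to 28.0% and 12.7%. The third new week, 77 of 250 or 30.8% defective, falls above the UCL, while 24.4% and 25.6% stay inside the limits.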
Let’s continue our example:
–  Step 6: Plot process X or Y on the newly created Control Chart
–  Step 7: Check for Out-Of-Control (OOC) conditions after each point
–  Step 8: Interpret findings, investigate Special Cause variation & make
improvements following the Out of Control Action Plan (OCAP)
Remember we have calculated the Control Limits from the first 20 weeks. We must now put in
3 new weeks and NOT have MINITAB™ calculate new Control Limits, which will be done
automatically if we do not follow this technique. Let’s now execute Steps 6 - 8.
Notice the new 3 weeks of data were entered into the spreadsheet.
…… Chart Options>Parameters
Place the pbar from the first chart we created in the “Estimate” tab. This will
prevent MINITAB™ from calculating new Control Limits, which is step 9.
The new updated SPC chart is shown with one Special Cause.
Because of the Special Cause, those running the process must refer to the OCAP or Out of Control
Action Plan, which states what Root Causes need to be investigated and what actions are taken to
get the process back in Control.
After the corrective actions are taken, wait until the next sample is taken to see whether the
process still shows a Special Cause.
–  If still out of control refer to the OCAP to take further action to improve the process.
DO NOT make any more changes if the process shows back in Control after the next
reading.
•  Even if the next reading seems higher than the Center Line! Do not cause more
variability.
If process changes are documented after this project is closed, the Control Limits should be
recalculated as in step 9 of the SPC methodology.
Practical Problem: A job shop drills holes for its largest
customer as a final step to deliver a highly engineered fastener.
This shop uses five drill presses and gathers data every hour
with one sample from each press representing part of the
subgroup. You can assume there is insignificant variation within
the five drills and the subgroup is across the five drills. The data
is gathered in columns C3-C7.
Steps 3 and 5 of the methodology are the primary focus for this example.
–  Select the appropriate Control Chart and Special Cause
tests to employ
–  Calculate the Center Line and Control Limits
Holediameter.mtw
Let’s walk through another example of using SPC within MINITAB™ but in this case it will be with
Continuous Data. Open the MINITAB™ worksheet titled “hole diameter.mtw” and select the
appropriate type of Control Chart and calculate the Center Line and Control Limits.
Let’s try another one, this time variable…
The example has Continuous Data and subgroups, and we have no interest in detecting small
changes in this process output. The Xbar-R Chart is selected; the Xbar-S Chart is not of
interest for this example.
Specifications were never discussed. Let’s calculate the Control Limits and Center Line for this example.
We will confirm what rules for Special Causes are included in our Control Chart analysis. The top 2 of 3 were selected.
Remember to click on the Options… and Tests tab to clarify the rules for detecting Special Causes.
……..Xbar-R Chart Options>Tests
Also confirm the Rbar method is used for estimating Standard Deviation.
Stat>Control Charts>Variable Charts for Subgroups>Xbar-R>Xbar-R Chart Options>Estimate
No Special Causes were detected in the Xbar Chart. The average hole
diameter was 26.33. The UCL was 33.1 and the LCL was 19.6.
Now we will use the Control Chart to monitor the next 2 hours and see if
we are still in control.
Some more steps….

– Step 6: Plot process X or Y on the newly created Control Chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate Special Cause variation, & make
improvements following the Out of Control Action Plan (OCAP)
……..Xbar-R Chart Options>Parameters
The updated SPC Chart is shown with no indicated Special Causes in the Xbar
Chart. The Mean, UCL and LCL are unchanged because the Parameters option was completed.
Because there are no Special Causes, there is no need to refer to the OCAP or Out of
Control Action Plan and NO actions are taken.
If process changes are documented after this project is closed, the Control Limits
should be recalculated as in Step 9 of the SPC methodology.
Recalculation of SPC Chart Limits
•  Step 9 of the methodology refers to recalculating SPC limits.

•  Processes should see improvement in variation after usage of SPC.
•  Reduction in variation or known process shift should result in Center
Line and Control Limits recalculations.
–  Statistical confidence of the changes can be confirmed with
Hypothesis Testing from the Analyze Phase.
•  Consider a periodic time frame for checking Control Limits and
Center Lines.
–  3, 6, 12 months are typical and dependent on resources and
priorities
–  A set frequency allows for process changes to be captured.
•  Incentives to recalculate limits include avoiding false Special Cause
detection on poorly monitored processes.
•  These recommendations are true for both Variable and Attribute
Data.
SPC Chart Option in MINITAB™ for σ Levels
Remembering many of the tests are based on the 1st and 2nd Standard Deviations from
the Center Line, some Belts prefer to have some additional lines displayed.
This is possible with ~
Stat>Quality Charts> ….. Options>S Limits tab
The extra lines can be helpful if users are using MINITAB™ for the SPC.
•  Describe the elements of an SPC Chart and the purposes of SPC

•  Understand how SPC ranks in Defect Prevention
•  Describe the 13 step route or methodology of implementing a chart
•  Design subgroups if needed for SPC usage
•  Determine the frequency of sampling
•  Understand the Control Chart selection methodology
•  Be familiar with Control Chart parameter calculations such as UCL,
LCL and the Center Line
You have now completed Control Phase – Statistical Process Control.
Lean Six Sigma

Green Belt Training
Control Phase
Now we are going to continue in the Control Phase with “Six Sigma Control Plans”.
Overview
The last physical result of the Control Phase is the Control Plan. This module will discuss a
technique for selecting among the various solutions you might want from all of the defect reduction
techniques found earlier in this phase. We will also discuss elements of a Control Plan to aid you and your
organization to sustain your project’s results.
Welcome to Control
Lean Controls
Defect Controls
Solution Selection
Control Plan Elements
End of Control: Your Objectives
You have already decided on some defect reduction methodology.
Final decisions will clarify which defect reduction tools to use.
–  Capital expenditures may be required.
–  Training hurdles to overcome.
–  Management buy-in not completed.
This module will help select solutions with a familiar tool.
The Control Phase allows the Belt and team to tackle other processes in the
future.
–  The elements of a Control Phase help document how to maintain the
process.
This module identifies the elements of strong Control Plans.
Remember: The objective is to sustain the gains initially found in the D, M, A, I Phases.
We have discussed all of the tools to improve and sustain your project success. However you might have
many options or too many options to implement final monitoring or controls. This module will aid you in
defect reduction selection.
Another objective of this module is to understand the elements of a good Control Plan needed to sustain
your gains.
Selecting Solutions
Selecting improvements to implement:

–  High-level objective evaluation of all potential improvements ~
•  Impact of each improvement
•  Cost to implement each improvement
•  Time to implement each improvement
–  Balance desire with quantifiable evaluation ~
•  Engineering always wants the gold standard
•  Sales always wants inventory
•  Production always wants more capacity
The tool for selecting Defect Prevention methods is unnecessary for just a few changes to the process ~
–  Many projects with smaller scopes have few but vital control
methods put into the process.
What’s next??
Selecting solutions comes down to a business decision. The impact, cost and timeliness of the
improvement are all important. These improvement possibilities must be balanced against the
business needs. A cost benefit analysis is always a good tool to use to assist in determining the
priorities.
Recall us talking about the progression of a Six Sigma project? Practical Problem – Statistical
Problem – Statistical Solution – Practical Solution. Consider the Practical Solutions from a
business decision point of view.
Impact Considerations
Impact of the improvement:

–  Time frame of improvements ~
•  Long-term versus Short-term effectiveness
–  If a supplier will lose a major customer because of
defects the short term benefit will prevail first.
–  Effectiveness of the improvement types ~
•  Removing the Root Cause of the defect
•  Monitoring/flagging for the condition that produces a defect
•  Inspecting to determine if the defect occurred
•  Training people to not produce defects
Now that’s IMPACT!
Cost Considerations
Cost to implement improvement:

–  Initial cost to implement improvement ~
•  Cost to train existing work force
•  Cost to purchase any new materials necessary for
improvement
•  Cost of resources used to build improvement
•  Any capital investments required
–  On-going costs to sustain improvement ~
•  Future training, inspection, monitoring and material costs
It’s all about the cash!
Time Considerations
Time to implement improvement:

–  Technical time constraints ~
•  What is the minimum time it would take to implement?
–  Time to build/create improvement, time to implement
improvement
–  Political time constraints ~
•  What other priorities are competing for the technical time to
build the improvement?
–  Cultural time constraints ~
•  How long will it take to gain support from necessary
stakeholders?
The clock’s ticking……
Improvement Selection Matrix
This familiar tool prioritizes proposed improvements based on the three selection criteria of time,
cost and impact.
–  All the process outputs are rated in terms of their relative
importance to the process ~
•  The outputs of interest will be the same as those in your X-Y
Matrix.
•  The relative rankings of importance of the outputs are the same
numbers from the updated X-Y Matrix.
–  Each potential improvement is rated against the three criteria of
time, cost and impact using a standardized rating scale
–  Highest overall rated improvements are best choices for
implementation
This should resemble the X-Y Matrix. This tool is best used when multiple improvement efforts are
considered. The outputs listed above in most cases resemble those of your original X-Y Matrix but
you might have another business output added.
The significance rating is the relative ranking of outputs. If one output is rated a 10 and it is twice the
importance of a second output, the rating for the second output would be a 5. The improvements, usually
impacting the X’s, are listed on the left and each one is rated for its impact on each output.
The overall impact rating for one improvement is the sum of the individual impact ratings
multiplied by the respective significance rating of the output impacted. Items on the left having more
impacts on multiple outputs will have a higher overall impact rating. The cost and timing ratings are
multiplied against the overall impact rating.
The improvements listed with the highest overall ratings are the first to get consideration. The range of
impact ratings can be zero to seven. An impact of zero means no impact. The cost and timing ratings are
also rated zero to seven, with zero being prohibitive in the cost or timing category.
Improvement Selection Matrix Project Outputs
Primary and Secondary Metrics of your Project.

–  List each of the Y’s across the horizontal axis
–  Rate the importance of the process Y’s on a scale of 1 to 10
•  1 is not very important, 10 is critical
•  The Significance rankings must match your updated X-Y
Matrix rankings
Improvement Selection Matrix
Impact Ratings
7  X's are removed from impacting the process output.
6  Continual control and adjustment of critical X's impacting the process output.
5  Continual control of critical X's prevents defects in the process output from X.
4  Defect detection of the process output prevents unknown defects from leaving the process.
3  Process inspection or testing is improved to find defects better.
2  Process is improved with easier control of a critical X impacting the process output.
1  Personnel are trained about X's impact on the process output.
0  X's have no impact on the process output.
Just like when using the FMEA, your ratings may vary for the three Selection Matrix categories. Feel free to use whatever objective ratings you desire. These are some general guideline ratings; customize them to meet your business, but try to standardize whatever criteria you choose.
Cost to Implement Ratings
7  Improvement Costs are minimal with upfront and ongoing expenses.
6  Improvement Costs are low and can be expensed with no capital authorization and recurring expenses are low.
5  Improvement Costs are low and can be expensed with no capital authorization and recurring expenses are higher.
4  Medium capital priority because of relative ranking of return on investment.
3  Low capital priority because of relative ranking of return on investment.
2  High capital and ongoing expenses make a low priority for capital investment.
1  High capital and/or expenses without acceptable return on investment.
0  Significant capital and ongoing expenses without alignment with business priorities.
The recommended cost ratings from zero to seven are here. In many companies, expenditures that are not capitalized usually are desired because they are smaller and are merely expensed. Your business may have different strategies or need of cash so consider your business’ situation.
Improvement Selection Matrix (cont.)
Time to Implement Ratings

7 Less than a week to get in place and workable.
6 7 - 14 days to get in place and workable.
5 2 - 8 weeks to get the improvement in place and workable.
4 2 - 3 months to get the improvement in place and workable.
0 Over a year to get the improvement in place and workable.
All above times include time for approvals process.
These time ratings range from zero to seven. You might wonder why we suggest a zero rating, meaning the improvement should not be considered, for something that would take a year or more. Many businesses have product cycle times of less than a year, so improvements that take that long are ill considered.
Example of Completed Solution Selection Matrix
Outputs and Significance Ratings:
A  Outside noises do not interfere with speakers    10
B  Coffee is hot and rich tasting                    9
C  Plenty of bottled water available                 8
D  Food choices include "healthy choices"            9

                                        Impact Rating   Overall  Cost    Time    Overall
#  Potential Improvements               A   B   C   D   Impact   Rating  Rating  Rating
1  Hotel staff monitors room            2   2   6   0   86       7       7       4214
2  Mgmt visits/leaves ph #              2   0   4   0   52       7       7       2548
3  Replace old coffee makers/coffee     0   7   0   0   63       3       6       1134
4  Menus provided with nutrition info   0   0   0   4   36       5       5       900
5  Comp. gen. "quiet time" scheduled    6   0   0   0   60       3       3       540
6  Dietician approves menus             0   0   0   7   63       5       2       630
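The arithmetic behind the example matrix can be checked with a short script. This is only a sketch: the function and variable names are illustrative and are not part of the course material, and the numbers are the ones shown in the example rows. It assumes the overall rating is the overall impact rating multiplied by the cost rating and the time rating.

```python
# Significance ratings of the four outputs in the example (scale of 1 to 10).
SIGNIFICANCE = [10, 9, 8, 9]

# Each row: (improvement, impact rating per output, cost rating, time rating).
IMPROVEMENTS = [
    ("Hotel staff monitors room", [2, 2, 6, 0], 7, 7),
    ("Mgmt visits/leaves ph #", [2, 0, 4, 0], 7, 7),
    ("Replace old coffee makers/coffee", [0, 7, 0, 0], 3, 6),
    ("Menus provided with nutrition info", [0, 0, 0, 4], 5, 5),
    ('Comp. gen. "quiet time" scheduled', [6, 0, 0, 0], 3, 3),
    ("Dietician approves menus", [0, 0, 0, 7], 5, 2),
]


def overall_impact(impacts, significance):
    """Sum of each impact rating times the significance of its output."""
    return sum(i * s for i, s in zip(impacts, significance))


def overall_rating(impacts, significance, cost, time):
    """Overall impact multiplied by the cost and time ratings.

    A cost or time rating of zero drives the result to zero, which
    removes the improvement from consideration.
    """
    return overall_impact(impacts, significance) * cost * time


def rank_improvements(rows, significance):
    """Return (name, overall impact, overall rating), highest rating first."""
    scored = [
        (name, overall_impact(imp, significance),
         overall_rating(imp, significance, cost, time))
        for name, imp, cost, time in rows
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


RANKED = rank_improvements(IMPROVEMENTS, SIGNIFICANCE)
for name, impact, rating in RANKED:
    print(f"{rating:>5}  {impact:>3}  {name}")
```

Sorting by the overall rating puts "Hotel staff monitors room" first, which matches the guidance that the highest overall ratings get first consideration.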
Improvement Selection Matrix Output

Improvements with the higher overall rating should be given first priority.
Keep in mind that improvements with long time frames, such as capital investments, should be run as parallel efforts to keep further delays from occurring.
This is just an example of a completed Selection Matrix. Remember a cost or time rating of zero
would eliminate the improvement from consideration for your project. Remember your ratings of the
solutions should involve your whole team to get their knowledge and understanding of final
priorities.
Again, higher overall ratings are the improvements to be considered. Do NOT forget about the
potential to run improvements in parallel. Running projects of complexity might need the experience
of a trained project manager. Often projects need to be managed with Gantt charts or timelines
showing critical milestones.
Implementing Solutions in Your Organization
Implementation plans should emphasize the need to:
–  Organize the tasks and resources
–  Establish realistic time frames and deadlines
–  Identify actions necessary to ensure success
Components of an implementation plan include:
–  Work breakdown structure
–  Influence strategy for priorities and resourcing
–  Risk management plan
–  Audit results for completion and risks.
All solutions must be part of Control Plan Document.
We have a plan don’t we?
Once you have made your selection of defect reduction solutions you need to plan those solutions. A plan means more than the proverbial back of the envelope solution and should include timelines, critical milestones, project review dates and specific actions noted for success in your solution implementation. Many people use Excel or MS Project but many options exist to plan your project closing with these future sustaining plans.
What is a Control Plan?
A Control Plan is:

•  Written summary describing systems used for monitoring/controlling process
or product variation
•  Document allowing team to formally document all control methods used to
meet project goal
•  Living document to be updated as new measurement systems and control
methods are added for continuous improvement
•  Often used to create concise operator inspection sheet
•  NOT a replacement of information contained in detailed operating,
maintenance or design instructions
•  ESSENTIAL portion of final project report ~
–  Final projects are organizationally dependent
•  Informal or formal
–  Filed as part of project tracking mechanism for organization
•  Track benefits
•  Reference for unsustained results
WHO Should Create a Control Plan
The team working on the project!!!!
ANYONE who has a role in defining, executing or changing the process:
–  Associates
–  Technical Experts
–  Supervisors
–  Managers
–  Site Manager
–  Human Resources
We did it!!
WHY Do We Need a Control Plan?
Project results need to be sustained.

•  Control Plan requires operators/engineers, managers, etc. to
follow designated control methods to guarantee product quality
throughout system
•  Allows a Belt to move onto other projects!
•  Prevents need for constant heroes in an organization who
repeatedly (seem to) solve the same problems
•  Control Plans are becoming more of a customer requirement
Going for distance, not the sprint!
Control Plan Elements
The 5 elements of a Control Plan include the documentation, monitoring, response, training and
aligning systems and structures.
[Figure: Control Plan elements: Documentation Plan, Response Plan, Monitoring Plan, Training Plan, Aligning Systems & Structures. Callout: Process owners are accountable to maintain new level of process performance. Boxes: Implemented Improvements; Verified Financial Impact.]
Control Plan Information
The team develops the Control Plan by utilizing all available information from the following:
–  Results from the Measure and Analyze Phases
–  Lessons learned from similar products and processes
–  Team’s knowledge of the process
–  Design FMEAs
–  Design reviews
–  Defect Prevention Methods selected
Control Plans use all of the information from the previous phases of your project and the defect prevention methods selected. Control Plans may not be exciting because you are not doing anything new to the process but stabilizing the process in the future with this document.
Training Plan
Who/What organizations require training?
–  Those impacted by the improvements ~

•  People who are involved in the process impacted by the improvement
•  People who support the process impacted by the
improvement
–  Those impacted by the Control Plan ~

•  Process owners/managers
•  People who support the processes involved in the Control
Plan
•  People who will make changes to the process in the future
Who will complete the training?

–  Immediate training ~
•  The planning, development and execution is a responsibility of the project team
•  Typically some of the training is conducted by the project team
–  Qualified trainers ~
•  Typically owned by a training department or process owner
•  Those who are responsible for conducting the on-going
training must be identified
Specific training materials need developing.

–  PowerPoint, On the Job Checklist, Exercises, etc.
When will training be conducted?
What is the timeline to train everyone on the new process(es)?
What will trigger ongoing training?

–  New employee orientation?
–  Refresher training?
–  Part of the response plan when monitoring shows
performance degrading?
Training Plan (cont.)
Training Plan Outline
Columns: Training Module | Who Will Create Modules | Schedule for Training Modules Completion | Who Will be Trained | Schedule for Training | Trainer(s) | Integration into Ongoing New Employee Training | Final Location of Employee Manuals
Documentation Plan
Documentation is necessary to ensure what has been learned from the project is shared and institutionalized:
–  Used to aid implementation of solutions
–  Used for on-going training
This is often the actual Final Report some organizations use.
Documentation must be kept current to be useful.
Documentation Plan (cont.)
Items to be included in the Documentation Plan:
–  Process documentation ~
•  Updated Process Maps/flowcharts

•  Procedures (SOP’s)
•  FMEA
–  Control Plan documentation ~

•  Training manuals
•  Monitoring plan—process management charts, reports, SOPs
•  Response plan—FMEA
•  Systems and structures—job descriptions, performance
management objectives
Assigning responsibility for Documentation Plan:

–  Responsibility at implementation ~
•  Belt ensures all documents are current at hand off
•  Belt ensures there is a process in place to modify documentation as the process changes
•  Belt ensures there is a process in place to review
documentation on regular basis for currency/accuracy
–  Responsibility for ongoing process (organizationally based) ~
•  Plan must outline who is responsible for making updates/
modifications to documentation as they occur
•  Plan must outline who is responsible to review documents—
ensuring currency/accuracy of documentation
Documentation Plan (cont.)
Documentation Plan Outline
Columns: Document | Items Necessary | Immediate Responsibility | Update/Modification Responsibility | Review Responsibility
Monitoring Plan
Purpose of a Monitoring Plan:

–  Assures gains are achieved and sustained
–  Provides insight for future process improvement activities
Development of a Monitoring Plan:

–  Belt is responsible for the development of the monitoring plan
–  Team members will help to develop the plan
–  Stakeholders must be consulted
–  Organizations with financial tracking would monitor results.
Sustaining the Monitoring Plan:

–  Functional managers will be responsible for adherence to the
monitoring plan ~
•  They must be trained on how to do this
•  They must be made accountable for adherence
Monitoring Plan (cont.)
Knowledge Tests:
–  When to Sample ~
•  After training
•  Regular intervals
•  Random intervals (often in auditing sense)
–  How to Sample
–  How to Measure
I knew I should have paid more attention!
Statistical Process Control:

–  Control Charts ~
•  Posted in area where data collected
•  Plot data points real time
–  Act on Out of Control Response with guidelines from
the Out of Control Action Plan (OCAP).
–  Record actions taken to achieve in-control results.
•  Notes impacting performance on chart should be
encouraged
–  Establishing new limits ~
•  Based on signals that process performance has changed
Response Plan
FMEA is a great tool to use for the Monitoring Plan

FMEA columns: # | Process Function (Step) | Potential Failure Modes (process defects) | Potential Failure Effects (Y's) | SEV | Class | Potential Causes of Failure (X's) | OCC | Current Process Controls | DET | RPN | Recommend Actions | Responsible Person & Target Date | Taken Actions | SEV | OCC | DET | RPN
–  Allows process manager and those involved in the process to see the entire process and how everyone can contribute to a defect free product/service.
–  Provides the means to keep the document current—reassessing
RPNs as the process changes
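Reassessing RPNs as the process changes can be sketched as follows. The sketch assumes the conventional calculation of RPN as severity times occurrence times detection; the failure modes and their ratings are hypothetical and only illustrate the before and after comparison.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA row reduced to the ratings needed for the RPN."""
    name: str
    severity: int    # SEV
    occurrence: int  # OCC
    detection: int   # DET

    @property
    def rpn(self) -> int:
        # Conventional Risk Priority Number: SEV x OCC x DET.
        return self.severity * self.occurrence * self.detection


# Hypothetical ratings before and after the actions were taken.
BEFORE = [
    FailureMode("Coffee served cold", 7, 6, 5),
    FailureMode("Water runs out", 5, 4, 3),
]
AFTER = [
    FailureMode("Coffee served cold", 7, 2, 3),
    FailureMode("Water runs out", 5, 4, 3),
]


def rpn_changes(before, after):
    """Pair each failure mode's original RPN with its reassessed RPN."""
    after_by_name = {fm.name: fm for fm in after}
    return [(fm.name, fm.rpn, after_by_name[fm.name].rpn) for fm in before]


CHANGES = rpn_changes(BEFORE, AFTER)
for name, old, new in CHANGES:
    print(f"{name}: RPN {old} -> {new}")
```

A failure mode whose RPN does not move after the recommended actions, like the second row here, is a candidate for further action or closer monitoring.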
Monitoring Plan
Check Lists/Matrices
–  Key items to check
–  Decision criteria; decision road map
–  Multi-variable tables
Visual Management
–  Alerts or signals to trigger action ~
•  Empty bins being returned when stock needs to be replenished
•  Red/yellow/green reports to signal process performance
–  Can be audible also
–  5S is necessary for Visual Management
Response Plan
Response Plans outline the process(es) to follow when there is a defect or Out of Control signal from monitoring ~

–  Out of control point on Control Chart
–  Non random behavior within Control Limits in
Control Chart
–  Condition/variable proven to produce defects present
–  Check sheet failure
–  Automation failure
Response to poor process results is a must in training.
Response Plans are living documents updated with new information as it becomes available.
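The first two triggers listed above, a point outside the Control Limits and non random behavior within them, can be sketched as a simple check. This is a minimal illustration under stated assumptions, not the Minitab procedure: it assumes an Individuals chart with sigma estimated from the average moving range divided by 1.128, and it treats a run of 8 consecutive points on one side of the center line as non random (some software uses 9). The data are hypothetical.

```python
from statistics import mean


def individuals_limits(baseline):
    """Center line and 3-sigma limits for an Individuals chart.

    Sigma is estimated as the average moving range divided by
    d2 = 1.128 (the constant for moving ranges of two points).
    """
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma


def find_triggers(values, lcl, center, ucl, run_length=8):
    """Return (index, reason) for each Response Plan trigger found."""
    triggers = []
    run_side, run_count = 0, 0
    for i, x in enumerate(values):
        if x > ucl or x < lcl:
            triggers.append((i, "beyond control limit"))
        # +1 above the center line, -1 below, 0 exactly on it.
        side = (x > center) - (x < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side, run_count = side, (1 if side != 0 else 0)
        if run_count == run_length:
            triggers.append((i, "run on one side of center line"))
    return triggers


# Hypothetical baseline data used to set the limits.
BASELINE = [10, 11, 9, 10, 12, 10, 9, 11, 10, 8]
LCL, CENTER, UCL = individuals_limits(BASELINE)

# Hypothetical new data: eight points above center, then a spike.
NEW_DATA = [10.5, 10.2, 10.8, 10.1, 10.4, 10.9, 10.3, 10.6, 15.0]
TRIGGERS = find_triggers(NEW_DATA, LCL, CENTER, UCL)
for index, reason in TRIGGERS:
    print(f"point {index}: {reason}")
```

Each trigger found this way would be handled according to the Out of Control Action Plan, with the action taken noted on the chart.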
Components of Response Plan:
–  The triggers for a response ~

•  What are the Failure Modes to check for?
•  Usually monitor the highest risk X's in the process
–  The recommended response for the Failure Mode
–  The responsibilities for responding to the Failure Mode
–  Documentation of response plan being followed in a Failure
Mode
–  Detailed information on the conditions surrounding the Failure
Mode
Response Plan – Abnormality Report
•  Detailed documentation when failure modes occur.
•  Provide a method for on-going continuous improvement.
•  Reinforce commitment to eliminating defects.
•  Fits with ISO 9000 standard of having a CAR or Corrective Action Request.
•  Method to collect frequency of corrective actions.
Abnormality Report form fields: Process; Metric; Current Situation; Signal; Situation Code; Detailed Situation; Date; Investigation of Cause; Code of Cause; Corrective Action; Who To Be Involved; What To Be Done; Root Cause Analysis; Date for completion of analysis; Date for implementation of permanent prevention
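One way to use the report as a method to collect the frequency of corrective actions is to keep each report as a record and count the cause codes. The sketch below is illustrative only: the field names are adapted from the form labels, only a subset of the form is modeled, and the codes and entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class AbnormalityReport:
    """A subset of the Abnormality Report form fields."""
    process: str
    metric: str
    signal: str
    situation_code: str
    detailed_situation: str
    reported_on: date
    cause_code: str
    corrective_action: str


def cause_frequency(reports):
    """Count how often each cause code appears across reports."""
    return Counter(report.cause_code for report in reports)


# Hypothetical reports for illustration.
REPORTS = [
    AbnormalityReport("Coffee service", "Serving temperature", "Point below LCL",
                      "S-01", "Urn found switched off", date(2018, 5, 2),
                      "C-HEAT", "Switch on and verify temperature"),
    AbnormalityReport("Coffee service", "Serving temperature", "Run below center",
                      "S-02", "Heating element degrading", date(2018, 5, 9),
                      "C-HEAT", "Replace heating element"),
    AbnormalityReport("Water supply", "Bottles available", "Check sheet failure",
                      "S-03", "Restock missed at break", date(2018, 5, 10),
                      "C-STOCK", "Add restock step to checklist"),
]

FREQUENCY = cause_frequency(REPORTS)
for code, count in FREQUENCY.most_common():
    print(f"{code}: {count}")
```

The most frequent cause codes point to where permanent prevention would remove the most repeat corrective actions.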
Aligning Systems and Structures
Systems and structures are the basis for allowing people to change their behaviors permanently ~
–  Performance goals/objectives
–  Policies/procedures
–  Job descriptions
–  Incentive compensation
–  Incentive programs, contests, etc
There are long- and short-term strategies for alignment of systems and structures.
Aligning Systems and Structures (cont.)
•  Get rid of measurements not aligned with desired behaviors
•  Get rid of multiple measures for the same desired behaviors
•  Implement measures aligned with desired behaviors currently not motivated by incentives
•  Change management must consider your process changes and how the process will respond
•  Are the hourly incentives hurting your chance of success?
Project Sign Off
Best method to assure acceptance of Control Plan is having supervisors and management for the area involved.
–  Meeting for a summary report
–  Specific changes to the process highlighted
–  Information where Control Plan is filed
Now that’s a Control Plan!
•  Identify all five phases of the Lean Six Sigma methodology
•  Identify at least three tools from each phase
•  Show progress on your ongoing project

Now for the last few questions to ask if you have been progressing on a real world project while taking this learning. First, has your project made progress in the primary metric without compromising your secondary metrics? Second, have you been faithfully updating your metric charts and keeping your process owner and project Champion updated on your team’s activities? If not, then start NOW.
Remember a basic change management idea you learned in the Define Phase. If you get
involvement of team members who work in the process and keep the project Champion and
process owner updated as to results then you have the greatest chance of success.
You have now completed Control Phase – Six Sigma Control Plans.
Lean Six Sigma

Green Belt Training
Control Phase
Control Phase Overview—The Goal
The goal of the Control Phase is to:
•  Assess the final Process Capability.
•  Revisit Lean with an eye for sustaining the project.
•  Evaluate methods for Defect Prevention.
•  Explore various methods to monitor process using SPC.
•  Implement a Control Plan.
Gooooaaallllll!!
Organizational Change
Each player in the process has a role in

SUSTAINING project success achieved.
•  Accept responsibility
•  Monitoring
•  Responding
•  Managing
•  Embracing change & continuous learning
•  Sharing best practices
•  Potential for horizontal replication or expansion of results
Control Phase—The Roadblocks

•  Lack of project sign off

•  Team members are not involved in Control Plan design
•  Management does not have knowledge on monitoring and
reacting needs
•  Financial benefits are not tracked and integrated into
business
•  Lack of buy in of process operators or staff
Breakthrough!!
DMAIC Roadmap
[Figure: DMAIC Roadmap. Labels: Champion/Process Owner; Define; Estimate COPQ; Establish Team; Measure; Analyze; Improve; Control.]
Control Phase
Improvement Selected
Develop Training Plan
Implement Training Plan
Develop Documentation Plan
Implement Documentation Plan
Develop Monitoring Plan
Implement Monitoring Plan
Develop Response Plan
Implement Response Plan
Develop Plan to Align Systems and Structures
Align Systems and Structures
Go to Next Project
Control Phase Checklist
Control Questions
Step One: Process Enhancement And Control Results
• How do the results of the improvement(s) match the requirements of the business
case and improvement goals?
• What are the vital few X’s?
• How will you control or redesign these X’s?
• Is there a process Control Plan in place?
• Has the Control Plan been handed off to the process owner?
Step Two: Capability Analysis for X and Y Process Capability

• How are you monitoring the Y’s?
Step Three: Standardization And Continuous Improvement

• How are you going to ensure this problem does not return?
• Is the learning transferable across the business?
• What is the action plan for spreading the best practice?
• Is there a project documentation file?
• How is this referenced in process procedures and product drawings?
• What is the mechanism to ensure this is not reinvented in the future?
Step Four: Document what you have learned
• Is there an updated FMEA?
• Is the Control Plan fully documented and implemented?
• What are the financial implications?
• Are there any spin-off projects?
• What lessons have you learned?
General Questions
• Are there any issues/barriers preventing the completion of the project?
• Do the Champion, the Belt and Finance all agree this project is complete?
Planning for Action

Test validation plan for a specific time
Calculate benefits for breakthrough
Implement change across project team
Process map of improved process
Finalize Key Process Input Variables (KPIV) to meet goal
Prioritize risks of output failure
Control plan for output
Control plan for inputs
Chart a plan to accomplish the desired state of the culture
Mistake proofing plan for inputs or outputs
Implementation plan for effective procedures
Knowledge transfer between Belt, PO and team members
Knowledge sharing between businesses and divisions
Lean project control plan
Establish continuous or attribute metrics for Cpk
Identify actual versus apparent Cpk
Finalize problem solving strategy
Complete RPN assessment with revised frequency and controls
Show improvement in RPN through action items
Repeat same process for secondary metrics
Summary
•  Have a clear understanding of the specific deliverables to complete your project
•  Have started to develop a Project Plan to meet the deliverables
•  Have identified ways to deal with potential roadblocks
•  Be ready to apply the Lean Six Sigma method on your NEXT project
It’s a Wrap
Congratulations you
have completed Lean
Six Sigma Green Belt
Training!!!
Glossary
Affinity Diagram - A technique for organizing individual pieces of information into groups or broader categories.
ANOVA - Analysis of Variance – A statistical test for identifying significant differences between process or
system treatments or conditions. It is done by comparing the variances around the means of the conditions
being compared.
Attribute Data - Data which take on one of a set of discrete values such as pass or fail, yes or no.
Average - Also called the mean, it is the arithmetic average of all of the sample values. It is calculated by adding
all of the sample values together and dividing by the number of elements (n) in the sample.
Bar Chart - A graphical method which depicts how data fall into different categories.
Black Belt - An individual who receives approximately four weeks training in DMAIC, analytical problem solving,
and change management methods. A Black Belt is a full time six sigma team leader solving problems under the
direction of a Champion.
Breakthrough Improvement - A rate of improvement at or near 70% over baseline performance of the as-is
process characteristic.
Capability - A comparison of the required operation width of a process or system to its actual performance
width. Expressed as a percentage (yield), a defect rate (dpm, dpmo,), an index (Cp, Cpk, Pp, Ppk), or as a
sigma score (Z).
Cause and Effect Diagram - Fishbone Diagram - A pictorial diagram in the shape of a fishbone showing all
possible variables that could affect a given process output measure.
Central Tendency - A measure of the point about which a group of values is clustered; two measures of central
tendency are the mean, and the median.
Champion - A Champion recognizes, defines, assigns and supports the successful completion of six sigma
projects; they are accountable for the results of the project and the business roadmap to achieve six sigma
within their span of control.
Characteristic - A process input or output which can be measured and monitored.
Common Causes of Variation - Those sources of variability in a process which are truly random, i.e., inherent
in the process itself.
Complexity - The level of difficulty to build, solve or understand something based on the number of inputs,
interactions and uncertainty involved.
Control Chart - The most powerful tool of statistical process control. It consists of a run chart, together with
statistically determined upper and lower control limits and a centerline.
Control Limits - Upper and lower bounds in a control chart that are determined by the process itself. They can
be used to detect special or common causes of variation. They are usually set at ±3 standard deviations from
the central tendency.
Correlation Coefficient - A measure of the linear relationship between two variables.
Cost of Poor Quality (COPQ) - The costs associated with any activity that is not doing the right thing right the
first time. It is the financial quantification of any waste that is not integral to the product or service which your
company provides.
Cp - A capability measure defined as the ratio of the specification width to short-term process performance
width.
Cpk - An adjusted short-term capability index that reduces the capability score in proportion to the offset of the
process center from the specification target.
Critical to Quality (CTQ) - Any characteristic that is critical to the perceived quality of the product, process or
system. See Significant Y.
Critical X - An input to a process or system that exerts a significant influence on any one or all of the key
outputs of a process.
Customer - Anyone who uses or consumes a product or service, whether internal or external to the providing
organization or provider.
Cycle Time - The total amount of elapsed time expended from the time a task, product or service is started
until it is completed.
Defect - An output of a process that does not meet a defined specification, requirement or desire such as time,
length, color, finish, quantity, temperature etc.
Defective - A unit of product or service that contains at least one defect.
Deployment (Six Sigma) - The planning, launch, training and implementation management of a six sigma
initiative within a company.
Design of Experiments (DOE) - Generally, it is the discipline of using an efficient, structured, and proven
approach to interrogating a process or system for the purpose of maximizing the gain in process or system
knowledge.
Design for Six Sigma (DFSS) - The use of six sigma thinking, tools and methods applied to the design of
products and services to improve the initial release performance, ongoing reliability, and life-cycle cost.
DMAIC - The acronym for core phases of the six sigma methodology used to solve process and business
problems through data and analytical methods. See define, measure, analyze, improve and control.
DPMO - Defects per million opportunities – The total number of defects observed divided by the total number
of opportunities, expressed in parts per million. Sometimes called Defects per Million (DPM).
DPU - Defects per unit - The total number of defects detected in some number of units divided by the total
number of those units.
Entitlement - The best demonstrated performance for an existing configuration of a process or system. It is an
empirical demonstration of what level of improvement can potentially be reached.
Epsilon ε - Greek symbol used to represent residual error.
Experimental Design - See Design of Experiments.
Failure Mode and Effects Analysis (FMEA) - A procedure used to identify, assess, and mitigate risks
associated with potential product, system, or process failure modes.
Finance Representative - An individual who provides an independent evaluation of a six sigma project in
terms of hard and/or soft savings. They are a project support resource to both Champions and Project
Leaders.
Fishbone Diagram - See cause and effect diagram.
Flowchart - A graphic model of the flow of activities, material, and/or information that occurs during a process.
Gage R&R - Quantitative assessment of how much variation (repeatability and reproducibility) is in a measurement
system compared to the total variation of the process or system.
Green Belt - An individual who receives approximately two weeks of training in DMAIC, analytical problem solving,
and change management methods. A Green Belt is a part time six sigma position that applies six sigma to their
local area, doing smaller-scoped projects and providing support to Black Belt projects.
Hidden Factory or Operation - Corrective and non-value-added work required to produce a unit of output that is
generally not recognized as an unnecessary generator of waste in form of resources, materials and cost.
Histogram - A bar chart that depicts the frequencies (by the height of the plotted bars) of numerical or
measurement categories.
Implementation Team - A cross-functional executive team representing various areas of the company. Its charter
is to drive the implementation of six sigma by defining and documenting practices, methods and operating policies.
Input - A resource consumed, utilized, or added to a process or system. Synonymous with X, characteristic, and
input variable.
Input-Process-Output (IPO) Diagram - A visual representation of a process or system where inputs are
represented by input arrows to a box (representing the process or system) and outputs are shown using arrows
emanating out of the box.
Ishikawa Diagram - See cause and effect diagram and fishbone diagram.
Least Squares - A method of curve-fitting that defines the best fit as the one that minimizes the sum of the squared
deviations of the data points from the fitted curve.
Long-term Variation - The observed variation of an input or output characteristic which has had the opportunity to
experience the majority of the variation effects that influence it.
Lower Control Limit (LCL) - For control charts: the limit above which the subgroup statistics must remain for the
process to be in control. Typically, 3 standard deviations below the central tendency.
Lower Specification Limit (LSL) - The lowest value of a characteristic which is acceptable.
Master Black Belt - An individual who has received training beyond a Black Belt. The technical, go-to expert
regarding technical and project issues in six sigma. Master Black Belts teach and mentor other six sigma Belts,
their projects and support Champions.
Mean - See average.
Measurement - The act of obtaining knowledge about an event or characteristic through measured quantification
or assignment to categories.
Measurement Accuracy - For a repeated measurement, it is a comparison of the average of the measurements
compared to some known standard.
Measurement Precision - For a repeated measurement, it is the amount of variation that exists in the measured
values.
Measurement Systems Analysis (MSA) - An assessment of the accuracy and precision of a method of obtaining
measurements. See also Gage R&R.
Median - The middle value of a data set when the values are arranged in either ascending or descending order.
Metric - A measure that is considered to be a key indicator of performance. It should be linked to goals or
objectives and carefully monitored.
Natural Tolerances of a Process - See Control Limits.
Nominal Group Technique - A structured method that a team can use to generate and rank a list of ideas or items.
Non-Value Added (NVA) - Any activity performed in producing a product or delivering a service that does not add
value, where value is defined as changing the form, fit or function of the product or service and is something for
which the customer is willing to pay.
Normal Distribution - The distribution characterized by the smooth, bell- shaped curve. Synonymous with
Gaussian Distribution.
Objective Statement - A succinct statement of the goals, timing and expectations of a six sigma improvement
project.
Opportunities - The number of characteristics, parameters or features of a product or service that can be classified
as acceptable or unacceptable.
Out of Control - A process is said to be out of control if it exhibits variations larger than its control limits or shows a
pattern of variation.
Output - A resource or item or characteristic that is the product of a process or system. See also Y, CTQ.
Pareto Chart - A bar chart for attribute (or categorical) data in which categories are presented in descending order of
frequency.
Pareto Principle - The general principle originally proposed by Vilfredo Pareto (1848-1923) that the majority of
influence on an outcome is exerted by a minority of input factors.
Poka-Yoke - A translation of a Japanese term meaning to mistake-proof.
Probability - The likelihood of an event or circumstance occurring.
Problem Statement - A succinct statement of a business situation which is used to bound and describe the
problem the six sigma project is attempting to solve.
Process - A set of activities and material and/or information flow which transforms a set of inputs into outputs for
the purpose of producing a product, providing a service or performing a task.
Process Characterization - The act of thoroughly understanding a process, including the specific relationship(s)
between its outputs and the inputs, and its performance and capability.
Process Certification - Establishing documented evidence that a process will consistently produce required
outcome or meet required specifications.
Process Flow Diagram - See flowchart.
Process Member - An individual who performs activities within a process to deliver a process output, a product
or a service to a customer.
Process Owner - Process Owners have responsibility for process performance and resources. They provide
support, resources and functional expertise to six sigma projects. They are accountable for implementing
developed six sigma solutions into their process.
Quality Function Deployment (QFD) - A systematic process used to integrate customer requirements into
every aspect of the design and delivery of products and services.
Range - A measure of the variability in a data set. It is the difference between the largest and smallest values
in a data set.
Regression Analysis - A statistical technique for determining the mathematical relation between a measured
quantity and the variables it depends on. Includes Simple and Multiple Linear Regression.
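A minimal sketch of Simple Linear Regression, assuming ordinary least squares with one input variable. The function name `fit_line` and the sample data are hypothetical; in practice a statistical package would be used for this.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data only: y rises by 2 for every unit of x
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```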
Repeatability (of a Measurement) - The extent to which repeated measurements of a particular object with a
particular instrument produce the same value. See also Gage R&R.
Reproducibility (of a Measurement) - The extent to which measurements of a particular object taken by
different individuals produce the same value. See also Gage R&R.
Rework - Activity required to correct defects produced by a process.
Risk Priority Number (RPN) - In Failure Mode and Effects Analysis, the aggregate score of a failure mode
including its severity, frequency of occurrence, and ability to be detected.
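The aggregate score is conventionally the product of the three ratings. The sketch below assumes each rating is scored from 1 to 10, which is a common convention but is not stated in the definition above; the function name is hypothetical.

```python
def risk_priority_number(severity, occurrence, detection):
    """RPN as the product of the three FMEA ratings (assumed 1-10 scale)."""
    ratings = (("severity", severity), ("occurrence", occurrence), ("detection", detection))
    for name, score in ratings:
        if not 1 <= score <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {score}")
    return severity * occurrence * detection

# Illustrative ratings only
print(risk_priority_number(severity=8, occurrence=3, detection=5))  # 120
```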
Rolled Throughput Yield (RTY) - The probability of a unit going through all process steps or system
characteristics with zero defects.
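A minimal sketch of the calculation, assuming the yield of each process step is known and the steps are independent, so that RTY is the product of the step yields. The function name and the yield values are illustrative.

```python
import math

def rolled_throughput_yield(step_yields):
    """Probability that a unit passes every step defect-free (independent steps assumed)."""
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each step yield must be between 0 and 1")
    return math.prod(step_yields)

# Three hypothetical process steps
print(rolled_throughput_yield([0.95, 0.90, 0.98]))
```

Even with every step at 90% or better, fewer than 84% of units in this example pass through all three steps defect-free.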
R.U.M.B.A. - An acronym used to describe a method to determine the validity of customer requirements. It
stands for Reasonable, Understandable, Measurable, Believable, and Achievable.
Run Chart - A basic graphical tool that charts a characteristic’s performance over time.
Scatter Plot - A chart in which one variable is plotted against another to determine the relationship, if any,
between the two.
Screening Experiment - A type of experiment to identify the subset of significant factors from among a large
group of potential factors.
Short Term Variation - The amount of variation observed in a characteristic which has not had the opportunity
to experience all the sources of variation from the inputs acting on it.
Sigma Score (Z) - A commonly used measure of process capability that represents the number of short-term
standard deviations between the center of a process and the closest specification limit. Sometimes referred to
as sigma level, or simply Sigma.
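The definition can be sketched as a distance divided by the short-term standard deviation. The function and parameter names below are assumptions for illustration, and the sketch accepts a one-sided specification by leaving the other limit unset.

```python
def sigma_score(mean, short_term_sigma, lsl=None, usl=None):
    """Short-term standard deviations between the process center and the closest spec limit."""
    if short_term_sigma <= 0:
        raise ValueError("short_term_sigma must be positive")
    distances = []
    if lsl is not None:
        distances.append(mean - lsl)
    if usl is not None:
        distances.append(usl - mean)
    if not distances:
        raise ValueError("at least one specification limit is required")
    return min(distances) / short_term_sigma

# Illustrative process: the lower limit is the closer one (2.0 units away)
print(sigma_score(mean=10.0, short_term_sigma=0.5, lsl=8.0, usl=13.0))  # 4.0
```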
Significant Y - An output of a process that exerts a significant influence on the success of the process or the
customer.
Six Sigma Leader - An individual who leads the implementation of Six Sigma, coordinates all of the necessary
activities, ensures optimal results are obtained and keeps everyone informed of progress made.
Six Sigma Project - A well-defined effort that states a business problem in quantifiable terms and with known
improvement expectations.
Six Sigma (System) - A proven set of analytical tools, project management techniques, reporting methods and
management techniques combined to form a powerful problem solving and business improvement methodology.
Special Cause Variation - Those non-random causes of variation that can be detected by the use of control charts
and good process documentation.
Specification Limits - The bounds of acceptable performance for a characteristic.
Stability (of a Process) - A process is said to be stable if it shows no recognizable pattern of change and no
special causes of variation are present.
Standard Deviation - One of the most common measures of variability in a data set or in a population. It is the
square root of the variance.
Statistical Problem - A problem that is addressed with facts and data analysis methods.
Statistical Process Control (SPC) - The use of basic graphical and statistical methods for measuring, analyzing,
and controlling the variation of a process for the purpose of continuously improving the process. A process is said to
be in a state of statistical control when it exhibits only random variation.
Statistical Solution - A data driven solution with known confidence/risk levels, as opposed to a qualitative, “I think”
solution.
Supplier - An individual or entity responsible for providing an input to a process in the form of resources or
information.
Trend - A gradual, systematic change over time or some other variable.
TSSW - Thinking the six sigma way – A mental model for improvement which perceives outcomes through a cause
and effect relationship combined with six sigma concepts to solve everyday and business problems.
Two-Level Design - An experiment where all factors are set at one of two levels, denoted as low and high (-1 and
+1).
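As an illustration, the runs of a two-level design can be enumerated in coded units. The sketch below assumes a full factorial design, in which every combination of low and high is run; the function name and factor names are hypothetical.

```python
from itertools import product

def two_level_full_factorial(factor_names):
    """List every run of a full factorial design with each factor at -1 (low) or +1 (high)."""
    levels = (-1, +1)
    return [dict(zip(factor_names, run)) for run in product(levels, repeat=len(factor_names))]

# Three hypothetical factors give 2**3 = 8 runs
for run in two_level_full_factorial(["temperature", "pressure", "time"]):
    print(run)
```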
Upper Control Limit (UCL) for Control Charts - The upper limit below which a process statistic must remain to be
in control. Typically this value is 3 standard deviations above the central tendency.
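A minimal sketch of the typical calculation, assuming the center line and the standard deviation of the plotted statistic are already known. How the standard deviation is estimated depends on the type of control chart and is not covered here; the function name is hypothetical.

```python
def upper_control_limit(center_line, sigma, k=3.0):
    """Upper control limit placed k standard deviations above the center line (k = 3 is typical)."""
    if sigma < 0:
        raise ValueError("sigma must not be negative")
    return center_line + k * sigma

# Illustrative values only
ucl = upper_control_limit(center_line=50.0, sigma=2.0)
print(ucl)           # 56.0
print(57.1 > ucl)    # a plotted point at 57.1 would fall above the limit
```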
Upper Specification Limit (USL) - The highest value of a characteristic which is acceptable.
Variability - A generic term that refers to the property of a characteristic, process or system to take on different
values when it is repeated.
Variables - Quantities which are subject to change or variability.
Variable Data - Data which is continuous, which can be meaningfully subdivided, i.e. can have decimal
subdivisions.
Variance - A specifically defined mathematical measure of variability in a data set or population. It is the square of
the standard deviation.
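The relationship between Range, Variance and Standard Deviation can be checked with Python's standard `statistics` module. The data values are illustrative, and the sketch assumes the data set is a whole population; for a sample, `statistics.variance` and `statistics.stdev` would be used instead.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

data_range = max(data) - min(data)       # largest value minus smallest value
variance = statistics.pvariance(data)    # population variance
std_dev = statistics.pstdev(data)        # population standard deviation

print(data_range, variance, std_dev)
# The standard deviation is the square root of the variance
print(math.isclose(std_dev, math.sqrt(variance)))
```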
Variation - See variability.
VOB - Voice of the business – Represents the needs of the business and the key stakeholders of the business.
It is usually items such as profitability, revenue, growth, market share, etc.
VOC - Voice of the customer – Represents the expressed and non-expressed needs, wants and desires of the
recipient of a process output, a product or a service. It is usually expressed as specifications, requirements or
expectations.
VOP - Voice of the process – Represents the performance and capability of a process to achieve both
business and customer needs. It is usually expressed in some form of an efficiency and/or effectiveness
metric.
Waste - Waste represents material, effort and time that does not add value in the eyes of key stakeholders
(Customers, Employees, Investors).
X - An input characteristic to a process or system. In six sigma it is usually used in the expression of Y=f(X),
where the output (Y) is a function of the inputs (X).
Y - An output characteristic of a process. In six sigma it is usually used in the expression of Y=f(X), where the
output (Y) is a function of the inputs (X).
Yellow Belt - An individual who receives approximately one week of training in problem solving and process
optimization methods. Yellow Belts participate in Process Management activities, participate on Green and
Black Belt projects and apply concepts to their work area and their job.
Z Score - See Sigma Score.
