Section 6 Slides PDF
Data
Description Numbers
Continuous vs Discrete
❖ Continuous Data
❖ Measurements: Length, height, time
❖ More information with fewer samples
❖ More sensitive
❖ Provide more information
❖ More expensive to collect
❖ Discrete Data
❖ Count: Number of students, Number of heads
Measurement Scales
❖ Example (nominal scale): Color: Blue, Green, Red
Measurement Scales
❖ Example (interval scale): Temperature: Celsius
Measurement Scales
❖ Example (ratio scale): Height, mass, volume
Descriptive Statistics
❖ Central Tendency: Mean, Mode, Median, Percentile
❖ Variability: Range, Interquartile Range, Standard Deviation
Mean
❖ Also known as Average
Mode
❖ Mode = 10
Median
❖ Middle value when put in ascending or descending order.
❖ Example: 10, 11, 14, 9, 6
❖ Median = 10
Percentile
❖ Arrange the data in ascending order
❖ Calculate location (i) = P.(n)/100
Variability
❖ Range
❖ Interquartile Range
❖ Standard Deviation
Range
❖ Difference between the lowest and the highest value.
❖ Example: 6, 9, 10, 11, 11, 14
❖ Range = 14 - 6 = 8
Interquartile Range
❖ Range of the middle 50% of the data
Standard Deviation
x        x - x̅     (x - x̅)²
100      0         0
101      1         1
99       -1        1
102      2         4
98       -2        4
100      0         0
x̅ = 100  ∑(x - x̅) = 0  ∑(x - x̅)² = 10
S² = ∑(x - x̅)² / (n - 1) = 10/5 = 2
S = √2 = 1.414
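The table above can be rechecked with a few lines of Python. This is only a sketch using the six values from the slide; the variable names are illustrative.

```python
import statistics

# The six observations from the standard deviation table
x = [100, 101, 99, 102, 98, 100]

mean = statistics.mean(x)                     # x-bar
sum_sq = sum((xi - mean) ** 2 for xi in x)    # sum of squared deviations
variance = sum_sq / (len(x) - 1)              # S^2, dividing by n - 1
std_dev = variance ** 0.5                     # S

print(mean, sum_sq, variance, round(std_dev, 3))
```

The library call `statistics.variance(x)` uses the same n - 1 denominator, so it can serve as a cross-check.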
Graphical methods for depicting relationships
❖ Stem-and-leaf Plot
❖ Box-and-whisker Plots
❖ Scatter Plot
Stem-and-Leaf Plot
❖ 11, 22, 55, 13, 45, 14, 19, 10, 33, 52, 13
Stem Leaf
1 013349
2 2
3 3
4 5
5 25
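A small sketch of how the stems and leaves above can be produced, assuming two-digit values where the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and join the sorted units digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return {stem: "".join(str(leaf) for leaf in leaves) for stem, leaves in plot.items()}

data = [11, 22, 55, 13, 45, 14, 19, 10, 33, 52, 13]
rows = stem_and_leaf(data)
for stem, leaves in rows.items():
    print(stem, leaves)
```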
Stem-and-Leaf Plot
21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2
17.8 16.4 17.3 15.2 10.4 10.4 14.7 32.4 30.4 33.9
21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
15.0 21.4
Box and Whisker Plots
❖ Also known as Box Plot
❖ Shows the median
❖ Shows Q1, Q3 and IQR
[Figure: box plot of Avg No. of orders per mo, marking the Median, 25th and 75th percentiles, Mean, and Outliers]
Scatter Diagram
❖ One of seven basic quality tools
❖ To see relationship between two
variables
❖ Relationship should make practical
sense
❖ Temperature(X) vs Ice cream sale (Y)
❖ Sometimes the relationship between two variables is due to a third variable (ice cream sales vs heat stroke cases).
❖ Correlation/Regression is covered in the
Analyze Phase
Histogram
❖ Graphical representation of the
distribution of numerical data
❖ Values are assigned to “bins” and the frequency for each bin is plotted.
Histograms
Graphical method for depicting distributions
❖ Probability plots
❖ Normal and non-normal
[Figure: histogram of Length; x-axis Length, y-axis Frequency]
[Figure: Histogram of Length with fitted Normal curve; Mean 99.72, StDev 9.997, N 150]
[Figure: histogram of New Length; x-axis New Length, y-axis Frequency]
Q-Q Plot
❖ Quantile-Quantile Plot
❖ Earlier we talked about Quartiles (Q1, Q2, Q3). There are Q0 and Q4 as well. Quartiles divide the data into four parts.
❖ Percentiles divide the data into 100 parts. For example, Q2 is the 50th Percentile.
❖ Quantiles can divide the data into any number of parts. Quartiles and Percentiles are examples of Quantiles.
Q-Q Plot
❖ Quantile-Quantile Plot
[Figure: Q-Q plots of Data Quantile against Theoretical Quantile]
Errors of Statistical Tests
Conclusion | True state of nature: H0 is true | True state of nature: Ha is true
Support H0 / Reject Ha | Correct Conclusion | Type II Error
Support Ha / Reject H0 | Type I Error | Correct Conclusion (Power)
Errors of Statistical Tests
 | Type I error (alpha) | Type II error (beta)
Name | Producer’s risk / Significance level | Consumer’s risk
1 minus error is called | Confidence level | Power of the test
Example of Fire Alarm | False fire alarm leading to inconvenience | Missed fire leading to disaster
Effects on process | Unnecessary cost increase due to frequent changes | Defects may be produced
Control method | Usually fixed at a pre-determined level, 1%, 5% or 10% | Usually controlled to < 10% by appropriate sample size
Simple definition | Innocent declared as guilty | Guilty declared as innocent
Significance Level
Level of Confidence / Confidence Interval:
C = 0.90, 0.95, 0.99 (90%, 95%, 99%)
Level of Significance:
α = 1 – C (0.10, 0.05, 0.01)
Power
❖ Power = 1 – β (or 1 - type II error)
❖ Type II Error: Failing to reject null
hypothesis when null hypothesis is false.
❖ Power: Likelihood of rejecting null
hypothesis when null hypothesis is false.
❖ Mean
❖ Standard Deviation
Normal Probability Distribution
❖ About 68% of the area under the curve
falls within 1 standard deviation of the
mean.
❖ z = (X - μ) / σ, where
❖ z is the z-score,
❖ X is the value of the element,
❖ μ is the population mean,
❖ σ is the standard deviation.
Z Table
Continuous Probability Distributions
❖ Normal probability distribution
❖ Student's t distribution
❖ Chi-square distribution
❖ F distribution
Continuous vs Discrete Variable
❖ If a variable can take on any value
between two specified values, it is called
a continuous variable; otherwise, it is
called a discrete variable.
Discrete Probability Distributions
❖ Binomial Probability Distribution
❖ Bernoulli Distribution
❖ Hypergeometric Probability Distribution
Binomial Distribution
❖ Variance = n.p.(1-p)
❖ 1. There is a fixed number, n, of identical trials.
❖ 2. For each trial, there are only two
possible outcomes (success/failure).
❖ 3. The probability of success, p, remains
the same for each trial.
❖ 4. The trials are independent of each
other.
❖ 5. x = the number of successes observed
for the n trials.
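The conditions above lead to the standard binomial formula P(x) = nCx . p^x . (1-p)^(n-x), which is not shown on the slide. The sketch below evaluates it for a hypothetical case (10 coin tosses, p = 0.5) chosen only for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: exactly 3 heads in 10 tosses of a fair coin
n, p = 10, 0.5
p_three_heads = binomial_pmf(3, n, p)
variance = n * p * (1 - p)          # n.p.(1-p)

print(p_three_heads, variance)
```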
Bernoulli Distribution
❖ Distribution of successes on a single
trial.
❖ What is the probability of getting a head in a single toss of a coin?
Hypergeometric Distribution
❖ There is a fixed number, n , of identical
trials.
❖ For each trial, there are only two possible
outcomes (success/failure).
❖ The probability of success changes from trial to trial, because items are drawn without replacement.
❖ The trials are not independent of each other.
❖ Finite and known population without
replacement.
❖ Number of successes in population are
known
❖ x = the number of successes observed for
the n trials.
Hypergeometric Distribution
❖ $P(x) = \dfrac{{}_{A}C_{x} \cdot {}_{N-A}C_{n-x}}{{}_{N}C_{n}}$
❖ N: size of population
❖ A: number of successes in population
❖ x: The number of successes that result from the
experiment.
❖ n: The number of trials without replacement.
❖ p: The probability of success on an individual
trial.
❖ q: The probability of failure on an individual
trial. (This is equal to 1 - p.)
❖ P(x) : The probability that an n-trial experiment
results in exactly x successes
❖ nCx: The number of combinations of n things,
taken x at a time.
Hypergeometric Distribution
❖ Out of 10 people (6M, 4F), 3 people are selected without replacement. What is the probability that two of them are females?
❖ $P(x) = \dfrac{{}_{A}C_{x} \cdot {}_{N-A}C_{n-x}}{{}_{N}C_{n}}$
❖ P(2) = 4C2 . (10-4)C(3-2) / 10C3
❖ = 4C2 . 6C1 / 10C3 = 6x6/120 = 0.3
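The worked example can be checked with a short sketch that codes the formula directly. The function name `hypergeom_pmf` is illustrative; the argument names follow the slide's symbols.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(x) = ACx . (N-A)C(n-x) / NCn for sampling without replacement."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# 10 people (4 females), 3 selected, probability that exactly 2 are female
p_two_females = hypergeom_pmf(x=2, N=10, A=4, n=3)
print(round(p_two_females, 2))
```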
Point Estimates
❖ Point estimate:
❖ Summarize the sample by a single number
that is an estimate of the population
parameter.
❖ The sample mean x̄ is a point estimate of
the population mean μ. The sample
proportion p is a point estimate of the
population proportion P.
Point vs Interval Estimates
❖ Interval estimate:
❖ A range of values within which, we believe,
the true parameter lies with high
probability.
❖ For example, a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.
Confidence Interval
❖ Factors affecting the width of
confidence interval
❖ sample size
❖ standard deviation
❖ confidence level
Confidence Interval
❖ When population standard deviation is
known/ Sample size is >=30
$CI = \bar{x} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
❖ Zα/2 = z table value for confidence level,
❖ σ = standard deviation
❖ n = sample size.
Confidence Interval
❖ The average income of 100 randomly selected residents of a city was found to be $42,000 per annum with a standard deviation of $5,000. Find the 95% confidence interval for the city's mean income.
$CI = \bar{x} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
❖ Zα/2 = z table value for confidence level = 1.96
❖ σ = standard deviation = 5,000
❖ n = sample size = 100
❖ CI = 42,000 ± 1.96 x 5,000/√100 = 42,000 ± 980
❖ CI = 41,020 to 42,980
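A minimal sketch of the same calculation in Python, using the standard library's `statistics.NormalDist` in place of a z table. The function name `z_confidence_interval` is illustrative.

```python
from statistics import NormalDist

def z_confidence_interval(mean, sigma, n, confidence):
    """Return (lower, upper) for mean +/- z(alpha/2) * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95% confidence
    margin = z * sigma / n ** 0.5
    return mean - margin, mean + margin

low, high = z_confidence_interval(mean=42_000, sigma=5_000, n=100, confidence=0.95)
print(round(low), round(high))
```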
Confidence Interval
❖ When population standard deviation is
unknown and Sample size is < 30
$CI = \bar{x} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$
❖ tα/2 = t distribution value for the confidence
level and (n-1) degrees of freedom
❖ s = sample standard deviation
❖ n = sample size.
Confidence Interval
❖ The average income of 25 random residents of
city was found to be $42,000 per annum with
standard deviation of 5,000. Find the 95%
confidence interval of the town income.
❖ CI = x̄ +/- (tα/2 )* s/√(n).
❖ tα/2 = t distribution value for the confidence
level and (n-1) degrees of freedom
❖ s = sample standard deviation
❖ n = sample size.
Two Sample z Test
❖ Example: From two machines 100 samples each were drawn.
❖ Machine 1: Mean = 151.9 / sd = 2.1
❖ Machine 2: Mean = 151.2 / sd = 2.2
❖ Is there a difference of more than 0.2 cc between these two machines? Check at 95% confidence level.
❖ H0: μ1 – μ2 = 0.2
❖ Zcal = (0.7 – 0.2)/0.304 = 0.5/0.304 = 1.64
❖ Zcritical = 1.645 (one-tailed)
❖ Zcal is marginally below Zcritical: fail to reject the null hypothesis.
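The z statistic for this example can be recomputed with a short sketch. It assumes the usual two-sample form, (difference in means minus hypothesized difference) divided by the standard error sqrt(sd1²/n1 + sd2²/n2); the function name is illustrative.

```python
from math import sqrt

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (z, standard error) for a two-sample z test on the difference of means."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2 - hypothesized_diff) / se
    return z, se

z_cal, se = two_sample_z(151.9, 2.1, 100, 151.2, 2.2, 100, hypothesized_diff=0.2)
print(round(se, 3), round(z_cal, 2))
```

The computed statistic is then compared with the one-tailed critical value for the chosen confidence level.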
Two Sample t Test
❖ First determine whether the two sets of data are independent or dependent.
❖ If the values in one sample reveal no
information about those of the other
sample, then the samples are independent.
❖ Example: Blood pressure of male/female
Test Information
H0: Mean Difference = 0
Ha: Mean Difference Not Equal To 0
Assume Unequal Variance
Results: A C
Count 5 5
Mean 151.80 154.60
Standard Deviation 1.483 15.027
Two Sample t Test
tcritical = 2.776
Two Sample t Test
❖ Minitab 17 output:
Two-sample T for A vs C
Two Sample Variance – F Test
❖ Example: We took 8 samples from machine A and the standard deviation was 1.1. For machine B we took 5 samples and the variance was 11. Is there a difference in variance at 90% confidence level?
❖ n1 = 8, s1 = 1.1, s1² = 1.21, df = 7 (denominator)
❖ n2 = 5, s2² = 11, df = 4 (numerator)
❖ F calculated = 11/1.21 = 9.09 (higher value at top)
❖ F critical = 4.1203
❖ Reject H0
Tests for Variance
❖ F-test
❖ for testing equality of two variances from
different populations
❖ for testing equality of several means with
technique of ANOVA.
❖ Chi-square test
❖ For testing the population variance against
a specified value
❖ testing goodness of fit of some probability
distribution
❖ χ² = (n-1).s²/σ² = 24x5 / 4 = 30
❖ Fail to reject H0
One Sample Chi Square
❖ SigmaXL Output
ANOVA
Pairwise comparisons among 4 samples:
     1    2         3         4
1    x    1 vs 2    1 vs 3    1 vs 4
2         x         2 vs 3    2 vs 4
3                   x         3 vs 4
4                             x
ANOVA
❖ Why ANOVA?
❖ How many t tests do we need to conduct if we have to compare 4 samples? … 6
❖ Each test is done with alpha = 0.05 or 95%
confidence.
❖ 6 tests will result in confidence level of
0.95x0.95x0.95x0.95x0.95x0.95 = 0.735
ANOVA
❖ Comparing three machines:
Machine 1 Machine 2 Machine 3
150 153 156
151 152 154
152 148 155
152 151 156
151 149 157
150 152 155
x̄1 = 151 x̄2 = 150.83 x̄3 = 155.50
ANOVA
❖ Comparing three machines:
[Figure: box plots for Machine 1, Machine 2 and Machine 3, marking the Median, 25th and 75th percentiles, and Mean; x̄1 = 151.00, x̄2 = 150.83, x̄3 = 155.50]
ANOVA
❖ Comparing three machines:
❖ Ratio:
SS between (or treatment) / SS within(or error)
ANOVA
Machine 1  x1 - x̄1  (x1 - x̄1)²   Machine 2  x2 - x̄2  (x2 - x̄2)²   Machine 3  x3 - x̄3  (x3 - x̄3)²
150.00     -1.00     1.00          153.00     2.17      4.69          156.00     0.50      0.25
151.00     0.00      0.00          152.00     1.17      1.36          154.00     -1.50     2.25
152.00     1.00      1.00          148.00     -2.83     8.03          155.00     -0.50     0.25
152.00     1.00      1.00          151.00     0.17      0.03          156.00     0.50      0.25
151.00     0.00      0.00          149.00     -1.83     3.36          157.00     1.50      2.25
150.00     -1.00     1.00          152.00     1.17      1.36          155.00     -0.50     0.25
Means: x̄1 = 151.00, x̄2 = 150.83, x̄3 = 155.50, overall mean = 152.44
Sums of squares within each machine: 4.00, 18.83, 5.50
ANOVA
❖ Degrees of freedom
❖ Total df = df treatment + df error
❖ (N-1) = (C-1) + (N-C)
❖ df treatment = 3-1 = 2, df error = 18-3 = 15
❖ df total = 17
ANOVA
❖ MSbetween = SS between (or treatment) / df treatment
❖ DEMONSTRATE MS Excel
ANOVA
ANOVA Table
Source     SS        DF    MS        F         P-Value
Between    84.111    2     42.056    22.265    0.0000
Within     28.333    15    1.889
Total      112.44    17
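The ANOVA table can be rebuilt from the raw machine data with a short sketch. It uses plain Python rather than a statistics package, and the function name `one_way_anova` is illustrative.

```python
machines = [
    [150, 151, 152, 152, 151, 150],   # Machine 1
    [153, 152, 148, 151, 149, 152],   # Machine 2
    [156, 154, 155, 156, 157, 155],   # Machine 3
]

def one_way_anova(groups):
    """Return SS, df, MS and F for a one-way ANOVA."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1              # C - 1
    df_within = len(values) - len(groups)     # N - C
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

result = one_way_anova(machines)
for name, value in result.items():
    print(name, round(value, 3))
```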
ANOVA
❖ Practice Exercise: Fill in the values for ?A
to ?E in this ANOVA Table:
ANOVA Table
Source SS DF MS F
Between 84.111 ?C ?D ?E
Within ?A 15 1.889
Total ?B 17
Goodness of Fit Test (Chi Square)
❖ A coin is flipped 100 times and the number of heads is noted. Is this coin biased?
❖ χ²cal = 25.8
❖ χ²(4, 0.95) = 9.49
Contingency Tables
❖ Calculate Chi square statistic.
EXPECTED
Operator 1 Operator 2 Operator 3
Shift 1 122x71/347 112x71/347 115x71/347 71
Shift 2 122x116/347 112x116/347 115x116/347 116
Shift 3 122x160/347 112x160/347 115x160/347 160
122 112 115 347
EXPECTED
Operator 1 Operator 2 Operator 3
Shift 1 24.96 22.91 23.53 71
Shift 2 40.78 37.44 38.44 116
Shift 3 56.25 51.64 53.02 160
122 112 115 347
Contingency Tables
OBSERVED
           Operator 1    Operator 2    Operator 3    Total
Shift 1    22            26            23            71
Shift 2    28            62            26            116
Shift 3    72            22            66            160
Total      122           112           115           347
EXPECTED
           Operator 1    Operator 2    Operator 3    Total
Shift 1    24.96         22.91         23.53         71
Shift 2    40.78         37.44         38.44         116
Shift 3    56.25         51.64         53.02         160
Total      122           112           115           347
(O-E)²/E
           Operator 1                     Operator 2    Operator 3
Shift 1    (22-24.96)²/24.96 = 0.35       0.42          0.01
Shift 2    (28-40.78)²/40.78 = 4.00       16.11         4.03
Shift 3    (72-56.25)²/56.25 = 4.41       17.01         3.18
χ² = 49.52
Contingency Tables
❖ Calculate Chi square statistic = 49.52
❖ Degrees of freedom = (r-1)(c-1) = 4
❖ Chi square critical = 9.49
❖ Reject null hypothesis
❖ There is a relationship between the shift
and the operator.
Contingency Tables
❖ Practice Exercise:
❖ Calculate the Expected value for Non Smoker Male? = 80x100/175 = 45.71
❖ What will be the degrees of freedom in this example? (2-1)(2-1) = 1
          Smoker    Non Smoker    Total
Male      60        40            100
Female    35        40            75
Total     95        80            175
Correlation
❖ Y = f(X),
❖ where Y is Dependent variable or the result
(output)
❖ X is Independent variable, input or the
controllable variable
Column 1 Column 2
Column 1 1
Column 2 0.879350768 1
Correlation Coefficient
❖ Correlation
❖ Measures the strength of linear
relationship between Y and X
❖ Pearson Correlation Coefficient, r (r
varies between -1 and +1)
❖ Perfect positive relationship: r = 1
❖ No relationship: r = 0
❖ Perfect negative relationship: r = -1
Correlation Coefficient
Correlation vs Causation
❖ Correlation does not imply causation
❖ a correlation between two variables does
not imply that one causes the other
Correlation – Confidence Interval
❖ Population correlation (ρ) – usually
unknown
❖ Sample correlation (r)
Correlation – Confidence Interval
❖ Since r is not normally distributed, there
are three steps to find out confidence
interval
❖ Convert r to z’ (Fisher’s Transformation)
❖ Calculate confidence interval in terms of z’
❖ Convert confidence interval back to r
❖ z’ = .5[ln(1+r) – ln(1-r)]
❖ Variance = 1/(N-3)
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 1.
❖ Convert r to z’
❖ z’ = .5[ln(1+r) – ln(1-r)]
❖ z’ = .5[ln(1+0.88) – ln(1-0.88)]
❖ z’ = .5[0.63 – (-2.12)] = 1.375
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 2. Confidence interval for z’
❖ Variance = 1/(N-3) = 1/7 = 0.1428
❖ Standard error = Sqrt (0.1428) = 0.378
❖ 95% confidence Z = 1.96
❖ CI = 1.375 +/- (1.96)(0.378)
❖ Lower Limit = 0.635
❖ Upper Limit = 2.11
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 3. Convert back to r
❖ z’ Lower Limit = 0.635
❖ z’ Upper Limit = 2.11
z’ = .5[ln(1+r) – ln(1-r)]
❖ r = 0.88, r2 = 0.77
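The three steps can be sketched in Python. The sketch relies on the identity that .5[ln(1+r) – ln(1-r)] is the inverse hyperbolic tangent, so `math.atanh` performs Step 1 and `math.tanh` converts the limits back to r in Step 3. The function name `correlation_ci` is illustrative.

```python
from math import atanh, tanh, sqrt
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Confidence interval for a correlation via Fisher's transformation."""
    z_prime = atanh(r)                          # Step 1: r to z'
    se = sqrt(1 / (n - 3))                      # standard error of z'
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_low, z_high = z_prime - z * se, z_prime + z * se   # Step 2: CI for z'
    return tanh(z_low), tanh(z_high)            # Step 3: back to r

r_low, r_high = correlation_ci(r=0.88, n=10)
print(round(r_low, 2), round(r_high, 2))
```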
Regression Analysis
❖ Quantifies the relationship between Y
and X (Y = a + bX)
Hours Studied (X) Test Score % (Y) XY X2 Y2
20 40 800 400 1600
24 55 1320 576 3025
46 69 3174 2116 4761
62 83 5146 3844 6889
22 27 594 484 729
37 44 1628 1369 1936
45 61 2745 2025 3721
27 33 891 729 1089
65 71 4615 4225 5041
23 37 851 529 1369
SUM 371 520 21764 16297 30160
Regression Analysis
❖ Quantifies the relationship between Y and X (Y = 15.79 + 0.97 X)
Regression Analysis
❖ For a student studying 50 hrs what is the
expected test score %?
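The fitted line can be derived from the column sums in the table with the usual least-squares formulas, b = (nΣXY - ΣXΣY)/(nΣX² - (ΣX)²) and a = (ΣY - bΣX)/n. These formulas are not shown on the slides, so treat this as a sketch. Note that the unrounded slope is about 0.976, so the prediction differs slightly from one made with the rounded coefficients.

```python
# Column sums from the Hours Studied / Test Score table
n = 10
sum_x, sum_y = 371, 520
sum_xy, sum_x2 = 21764, 16297

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
a = (sum_y - b * sum_x) / n                                    # intercept

predicted = a + b * 50   # expected test score % for 50 hours of study
print(round(a, 2), round(b, 3), round(predicted, 2))
```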
Residual Analysis
❖ Y = 15.79 + 0.97.X
Residual Analysis – No pattern
[Figure: plot of Residual values scattered around zero with no visible pattern]
Time Series
Patterns
• Trend
• Seasonality
Time Series Prediction Techniques
❖ Moving Average Method
❖ Exponential Smoothing
❖ Vector Auto Regression
❖ ARIMA (Autoregressive Integrated Moving Average) Model
Statistical Process Control (SPC)
❖ SPC helps to monitor and control a
process.
❖ Monitoring and controlling the process
ensures that it operates at its full
potential.
❖ At its full potential, the process can
make as much conforming product as
possible with a minimum waste
❖ Products conforming to specification are
acceptable products
Statistical Process Control (SPC)
❖ Two phases of SPC
❖ Understanding the process variation
❖ Monitoring and Controlling
[Figure: control chart selection tree. Variable measurements branch by subgroup size n (n > 9, n = 2 to 9, n = 1); attribute counts branch into # Pieces / # Defectives and # Occurrences / # Defects. n is subgroup size.]
np and p Chart
Total Defectives and Percent Defective
❖ Binomial Distribution
❖ Subgroup size is normally big compared to variable charts
❖ Counts # Pieces / # Defectives
❖ Constant subgroup size: np chart
❖ Variable subgroup size: p chart
np Chart
Equal Subgroup size Control Limits
❖ Control limits are straight lines
[Figure: np chart of NP - Defectives; labelled lines at 10.931, 4.680 and 0.000]
p Chart
Unequal Subgroup size Control Limits
❖ Control limits change with the number of items
in the subgroup (subgroup size)
❖ Larger Subgroup – narrow control limits
❖ Smaller Subgroup – wider control limits
[Figure: p chart of P - Unplanned Return; labelled values 0.053, 0.021 and 0.000, with control limits that vary with subgroup size]
Attribute Measurements
Attribute - Counts
❖ Counts # Occurrences / # Defects
❖ Constant subgroup size: c chart
❖ Variable subgroup size: u chart
c Chart
Equal Subgroup size Control Limits
Average Defects
[Figure: c chart of C - Defects; labelled lines at 20.192, 10.480 and 0.768]
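The three values labelled on the c chart are consistent with the standard c chart limits, center line ± 3 x sqrt(center line). The sketch below assumes 10.480 is the average number of defects; the formula itself is not shown on the slide, and the function name is illustrative.

```python
from math import sqrt

def c_chart_limits(c_bar):
    """Return (LCL, CL, UCL) for a c chart; the LCL is floored at zero."""
    spread = 3 * sqrt(c_bar)
    return max(0.0, c_bar - spread), c_bar, c_bar + spread

lcl, cl, ucl = c_chart_limits(10.480)   # assumed average defects per subgroup
print(round(lcl, 3), cl, round(ucl, 3))
```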
u Chart
Defects per Unit Control Limits
[Figure: u chart of U - Defects; labelled values 0.105 and 0.004]
Control Chart Analysis
What is the problem with this process?
[Figure: X-Bar chart, Shot 1 - Shot 3; labelled lines at 107.45, 100.91 and 94.36]
Control Chart Rules
❖ Nelson Rules
Rule | Pattern | Probable Cause
1 | 1 point more than 3 Stdev from CL | New person, wrong setup
2 | 7 points in a row on same side of CL | Setup change, process change
3 | 7 points in a row all increasing or all decreasing | Trend, tool wear
4 | 14 points in a row alternating up and down | Over control, tampering
5 | 2 out of 3 points more than 2 Stdev from CL (same side) | New person, wrong setup
6 | 4 out of 5 points more than 1 Stdev from CL (same side) | Small shift similar to Rule 1, 5
7 | 14 points in a row within 1 Stdev from CL (either side) | Process change
8 | 8 points in a row more than 1 Stdev from CL (either side) | Process change
Rule 1
❖ 1 point more than 3 Stdev from CL
[Figure: X-Bar chart, Shot 1 - Shot 3, with zones A, B and C marked and rule violations numbered on the points; labelled lines at 107.45, 100.91 and 94.36]
Rule 2
❖ 7 points in a row on same side of CL
[Figure: X-Bar chart, Shot 1 - Shot 3, with rule violations numbered on the points; labelled lines at 107.45, 100.91 and 94.36]
Rule 3
❖ 7 points in a row all increasing or all decreasing
[Figure: X-Bar chart, Shot 1 - Shot 3, with rule violations numbered on the points; labelled lines at 107.45, 100.91 and 94.36]
Rule 4
❖ 14 points in a row alternating up and down
[Figure: X-Bar chart, Shot 4 - Shot 6, with rule violations numbered on the points; labelled lines at 107.32, 100.65 and 93.98]
Rule 5
❖ 2 out of 3 points more than 2 Stdev from CL (same side)
[Figure: X-Bar chart, Shot 1 - Shot 3, with rule violations numbered on the points; labelled lines at 107.45, 100.91 and 94.36]
Rule 6
❖ 4 out of 5 points more than 1 Stdev from CL (same side)
[Figure: X-Bar chart, Shot 1 - Shot 3, with rule violations numbered on the points; labelled lines at 107.45, 100.91 and 94.36]
Rule 7
❖ 14 points in a row within 1 Stdev from CL (either side)
[Figure: X-Bar chart, Shot 7 - Shot 9, with rule violations numbered on the points; labelled lines at 106.80, 100.45 and 94.11]
Rule 8
❖ 8 points in a row more than 1 Stdev from CL (either side)
[Figure: X-Bar chart, Shot 11 - Shot 13, with rule violations numbered on the points; labelled lines at 107.44, 101.71 and 95.98]
Rule 1, 2, 7, 8
❖ Probability of Rule 1
❖ (1-0.9973) = 0.0027
❖ Probability of Rule 2
❖ (0.5)^7 = 0.0078
❖ Probability of Rule 7
❖ (0.68)^14 = 0.0045
❖ Probability of Rule 8
❖ (1-0.68)^8 = 0.0001
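These four probabilities are simple powers and can be checked directly. The sketch uses the same rounded normal-distribution areas as the slide (0.9973 within 3 Stdev, 0.68 within 1 Stdev, 0.5 on each side of CL).

```python
p_rule1 = 1 - 0.9973          # one point beyond 3 Stdev
p_rule2 = 0.5 ** 7            # 7 points in a row on the same side of CL
p_rule7 = 0.68 ** 14          # 14 points in a row within 1 Stdev
p_rule8 = (1 - 0.68) ** 8     # 8 points in a row beyond 1 Stdev

for name, p in [("Rule 1", p_rule1), ("Rule 2", p_rule2),
                ("Rule 7", p_rule7), ("Rule 8", p_rule8)]:
    print(name, round(p, 4))
```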
Pre-Control Charts
❖ Use specification limits instead of
statistically-derived control limits to
determine process capability over time.
❖ Used during the initial setup process.
❖ Easier to setup, implement and interpret
Pre-Control Charts
• Pre-Control Limits (LPCL and UPCL) are 50% of the tolerance.
• To establish process control, 5 items should fall within the Pre-Control Limits.
• After that, 2 successive units are periodically sampled.
• Continue if both fall in green, or one in green and one in yellow.
• Stop and adjust the process if both fall in yellow, or one falls in the red zone.
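The sampling rule above can be sketched as a small decision function. The sketch assumes the green zone is the middle 50% of the tolerance, yellow is the rest of the tolerance, and red is anything outside the specification limits; the spec limits 90 and 110 and the function names are hypothetical.

```python
def zone(x, lsl, usl):
    """Classify a measurement as green, yellow or red for pre-control."""
    tolerance = usl - lsl
    lpcl = lsl + tolerance / 4    # lower pre-control limit (assumed centered)
    upcl = usl - tolerance / 4    # upper pre-control limit (assumed centered)
    if x < lsl or x > usl:
        return "red"
    if lpcl <= x <= upcl:
        return "green"
    return "yellow"

def decide(first, second, lsl, usl):
    """Apply the two-unit rule: stop on any red or on two yellows."""
    zones = [zone(first, lsl, usl), zone(second, lsl, usl)]
    if "red" in zones or zones == ["yellow", "yellow"]:
        return "stop and adjust"
    return "continue"

# Hypothetical specification limits, for illustration only
LSL, USL = 90, 110
print(decide(100, 107, LSL, USL))   # green + yellow
print(decide(107, 93, LSL, USL))    # yellow + yellow
```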
Short-run SPC
❖ A typical Control Chart needs 20-25
samples with 4 to 5 items as the
subgroup size.
❖ You need roughly 100 measurements to
define control limits.
❖ What if very few pieces are manufactured?
❖ Use Short-run Chart
Short-run SPC
❖ Short-run SPC focuses on the process
rather than the product.
❖ Example: Different diameter items
produced
❖ E.g. eight items each of 300, 400 and 500 mm
❖ Options:
❖ 100% inspection – Expensive
❖ First-off inspection – What about process
variation?
❖ Last-off inspection – Too little too late
❖ Separate control chart – limited data
Short-run SPC
302.634 Run A 504.188 Run B 400.548 Run C
300.558 Run A 506.879 Run B 403.193 Run C
301.604 Run A 506.189 Run B 392.790 Run C
298.130 Run A 517.210 Run B 399.538 Run C
298.824 Run A 479.511 Run B 392.192 Run C
301.384 Run A 495.170 Run B 403.812 Run C
302.373 Run A 506.851 Run B 393.457 Run C
298.685 Run A 489.671 Run B 401.051 Run C
Short-run – Difference Chart
❖ Assumption: Each run has similar variance
[Figure: I-MR Chart of C4, plotted against Observation 1 to 23. Individual Value chart: UCL = 21.16, X̄ = -0.15, LCL = -21.45. Moving Range chart: UCL = 26.17, M̄R̄ = 8.01, LCL = 0]
Z-MR Chart
Stamp Data Run Nominal Difference
302.634 Run A 300 2.6338
300.558 Run A 300 0.5579
301.604 Run A 300 1.6043
298.130 Run A 300 -1.8704
298.824 Run A 300 -1.1757
301.384 Run A 300 1.3837
302.373 Run A 300 2.3729
298.685 Run A 300 -1.3154
504.188 Run B 500 4.1884
506.879 Run B 500 6.8792
506.189 Run B 500 6.1887
517.210 Run B 500 17.2103
479.511 Run B 500 -20.4891
495.170 Run B 500 -4.8304
506.851 Run B 500 6.8514
489.671 Run B 500 -10.3293
400.548 Run C 400 0.5483
403.193 Run C 400 3.1932
392.790 Run C 400 -7.2102
399.538 Run C 400 -0.4623
392.192 Run C 400 -7.8076
403.812 Run C 400 3.8119
393.457 Run C 400 -6.5428
401.051 Run C 400 1.0513
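The differences from nominal and the I-MR limits quoted for the difference chart can be recomputed with a short sketch. It assumes the standard individuals-chart constants (2.66 for the individuals limits and 3.267 for the moving range upper limit), which are not stated on the slides, and it uses the stamp data as tabulated, so results match to rounding.

```python
from statistics import mean

# Stamp data grouped by nominal diameter, in run order A, B, C
runs = {
    300: [302.634, 300.558, 301.604, 298.130, 298.824, 301.384, 302.373, 298.685],
    500: [504.188, 506.879, 506.189, 517.210, 479.511, 495.170, 506.851, 489.671],
    400: [400.548, 403.193, 392.790, 399.538, 392.192, 403.812, 393.457, 401.051],
}

# Difference chart plots measurement minus nominal
differences = [x - nominal for nominal, xs in runs.items() for x in xs]
moving_ranges = [abs(b - a) for a, b in zip(differences, differences[1:])]

x_bar = mean(differences)
mr_bar = mean(moving_ranges)
ucl_x = x_bar + 2.66 * mr_bar     # individuals chart limits
lcl_x = x_bar - 2.66 * mr_bar
ucl_mr = 3.267 * mr_bar           # moving range chart upper limit

print(round(x_bar, 2), round(mr_bar, 2))
print(round(lcl_x, 2), round(ucl_x, 2), round(ucl_mr, 2))
```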
Process Capability Studies
❖ Select the process
❖ Data Collection Plan
❖ Measurement System Analysis
❖ Gather data
❖ Confirm normality of data
❖ Confirm that the process is in control
❖ Estimate the process capability
❖ Continually improve process
Process Performance Metrics
❖ Percent Defectives
❖ PPM
❖ DPMO
❖ DPU
❖ Rolled Throughput Yield
Percent Defectives
❖ Percent of parts having one or more
defects
❖ 2 percent – 2 pieces per 100 pieces
Parts per Million (PPM)
❖ Defective parts per million.
❖ 2 percent – 2 pieces per 100 pieces
❖ 0.02 x 1,000,000 = 20,000 PPM
Defect vs Defective
❖ A nonconforming unit is a defective unit
❖ Cr = 6σ / (USL-LSL)
❖ Milk : 40 cc vs 80 cc
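[Figure: square plot of Rating with Sugar (- to +) on the horizontal axis and Milk (- to +) on the vertical axis; corner ratings 3 (Sugar -, Milk -), 6 (Sugar +, Milk -), 6 (Sugar -, Milk +), 9 (Sugar +, Milk +)]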
Design of Experiments
#    Sugar    Milk    Rating    Sequence
1    -        -       3         2
2    -        +       6         4
3    +        -       6         1
4    +        +       9         3
[Figure: two Interaction Charts of Rating, one against Sugar with lines for Milk - and Milk +, one against Milk with lines for Sugar - and Sugar +]
Design of Experiments
[Figure: Contour Plot of Rating over Sugar and Milk for the four runs (ratings 3, 6, 6, 9)]
Design of Experiments
❖ Y = B0 + B1X1 + B2X2
❖ Y = B0 + BsXs + BmXm
❖ Y = 6 + 1.5 Xs + 1.5 Xm
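Design of Experiments
Y = 6 + 1.5 Xs + 1.5 Xm
[Figure: square plot with new corner ratings 3.7 (Sugar -, Milk -), 7.9 (Sugar +, Milk -), 7.2 (Sugar -, Milk +), 8.8 (Sugar +, Milk +)]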
Design of Experiments
Standard Order  Sugar  Milk  Rating  Run Order
1               -      -     3.7     2
2               -      +     7.2     4
3               +      -     7.9     1
4               +      +     8.8     3
[Figure: square plot of Rating, Sugar (-/+) by Milk (-/+); corner values: 3.7 (-,-), 7.9 (+,-), 7.2 (-,+), 8.8 (+,+)]
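Interaction Chart
[Figure: two interaction charts with Rating on a 0 to 10 axis. Left panel: Rating against Sugar (- to +), one line per Milk level (Milk -: 3.7 to 7.9; Milk +: 7.2 to 8.8). Right panel: Rating against Milk (- to +), one line per Sugar level (Sugar -: 3.7 to 7.2; Sugar +: 7.9 to 8.8).]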
Design of Experiments
Standard Order  Sugar  Milk  Rating  Run Order
1               -      -     3.7     2
2               -      +     7.2     4
3               +      -     7.9     1
4               +      +     8.8     3
Contour Plot
[Figure: contour plot of Rating over Sugar (-/+) and Milk (-/+); corner values: 3.7 (-,-), 7.9 (+,-), 7.2 (-,+), 8.8 (+,+)]
Design of Experiments
❖ Y = B0 + B1X1 + B2X2
❖ Y = B0 + BsXs + BmXm
❖ Y = 6.9 + 1.45 Xs + 1.1 Xm
❖ For low milk, low sugar
❖ Y = 6.9 + 1.45 (-1) + 1.1 (-1) = 4.35 (against the observed 3.7)
❖ Hence something else is at play here: the interaction, Xs . Xm
❖ Interaction Xs . Xm = [(8.8-7.2)-(7.9-3.7)]/2
[Figure: square plot of Rating, Sugar (-/+) by Milk (-/+); corner values: 3.7 (-,-), 7.9 (+,-), 7.2 (-,+), 8.8 (+,+)]
Design of Experiments
❖ Y = B0 + B1X1 + B2X2
❖ Y = B0 + BsXs + BmXm
❖ Y = 6.9 + 1.45 Xs + 1.1 Xm
❖ Interaction is half the difference between the effect of sugar when milk is high and the effect of sugar when milk is low
= [(8.8-7.2)-(7.9-3.7)]/2 = -1.3
❖ Report half of this in the equation as the coefficient of Xs . Xm: -1.3/2 = -0.65
❖ Y = 6.9 + 1.45 Xs + 1.1 Xm – 0.65 Xs . Xm
[Figure: square plot of Rating, Sugar (-/+) by Milk (-/+); corner values: 3.7 (-,-), 7.9 (+,-), 7.2 (-,+), 8.8 (+,+)]
Design of Experiments
❖ Y = 6.9 + 1.45 Xs + 1.1 Xm – 0.65 Xs . Xm
❖ For high (+) sugar, high (+) milk
❖ Y = 6.9 + 1.45 (+1) + 1.1 (+1) – 0.65 (+1).(+1)
❖ Y = 6.9 + 1.45 + 1.1 – 0.65 = 8.8
[Figure: square plot of Rating, Sugar (-/+) by Milk (-/+); corner values: 3.7 (-,-), 7.9 (+,-), 7.2 (-,+), 8.8 (+,+)]
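The full model with the interaction term can be checked against all four runs, not only the high sugar, high milk corner. The sketch below assumes coded levels of -1 and +1 and computes each coefficient as the average of the ratings weighted by the sign of the corresponding column. The names `fit_with_interaction` and `predict` are illustrative.

```python
# Ratings from the second table, keyed by coded (sugar, milk) levels.
ratings = {(-1, -1): 3.7, (-1, +1): 7.2, (+1, -1): 7.9, (+1, +1): 8.8}


def fit_with_interaction(y):
    """Return (B0, Bs, Bm, Bsm) for Y = B0 + Bs*Xs + Bm*Xm + Bsm*Xs*Xm."""
    n = len(y)
    b0 = sum(y.values()) / n
    bs = sum(xs * v for (xs, xm), v in y.items()) / n
    bm = sum(xm * v for (xs, xm), v in y.items()) / n
    bsm = sum(xs * xm * v for (xs, xm), v in y.items()) / n
    return b0, bs, bm, bsm


def predict(coeffs, xs, xm):
    """Evaluate the fitted equation at coded levels xs and xm."""
    b0, bs, bm, bsm = coeffs
    return b0 + bs * xs + bm * xm + bsm * xs * xm


coeffs = fit_with_interaction(ratings)
print([round(c, 2) for c in coeffs])  # [6.9, 1.45, 1.1, -0.65]

# With four coefficients and four runs, the equation reproduces every rating.
for (xs, xm), observed in ratings.items():
    print(xs, xm, observed, round(predict(coeffs, xs, xm), 2))
```

For the low sugar, low milk run this gives 6.9 - 1.45 - 1.1 - 0.65 = 3.7, which closes the gap left by the model without the interaction term (4.35 against 3.7).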
Standard Order  Sugar  Milk  Bean
1               -      -     -
2               -      -     +
3               -      +     -
4               -      +     +
5               +      -     -
6               +      -     +
7               +      +     -
8               +      +     +
[Figure: design plot with Sugar (-/+) and Milk (-/+) axes]
Design of Experiments
❖ 2 Factors 2 Levels
❖ Y = B0 + BsXs + BmXm + Bsm Xs . Xm
❖ 3 Factors 2 Levels
❖ Y = B0 + BsXs + BmXm + BbXb
+ Bsm Xs . Xm + Bsb Xs . Xb + Bmb Xm . Xb
+ Bsmb Xs . Xm . Xb
[Figure: design plot with Sugar (-/+) and Milk (-/+) axes]
Design of Experiments
❖ 4 Factors 2 Levels
❖ Y = B0 + B1X1 + B2X2 + B3X3 + B4X4
+ B12 X1 . X2 + B13 X1 . X3 + B14 X1 . X4
+ B23 X2 . X3 + B24 X2 . X4 + B34 X3 . X4
2^(3-1) = 4 experiments (half factorial)
Standard Order  Sugar  Milk  Bean  Run Order
1               -      -     -
2               -      -     +
3               -      +     -
4               -      +     +
5               +      -     -
6               +      -     +
7               +      +     -
8               +      +     +
[Figure: design plot with Sugar (-/+) and Milk (-/+) axes]
Half Factorial
[Figure: cube plot of the eight runs, vertices numbered 1 to 8 in standard order; Sugar and Milk axes marked -/+]
1  -  -  -
2  -  -  +
3  -  +  -
4  -  +  +
5  +  -  -
6  +  -  +
7  +  +  -
8  +  +  +
Half Factorial
[Figure: cube plot with vertices numbered 1 to 8; Sugar and Milk axes marked -/+]
Half Factorial
Full factorial (8 runs):
Standard Order  Sugar  Milk  Bean  Run Order
1               -      -     -
2               -      -     +
3               -      +     -
4               -      +     +
5               +      -     -
6               +      -     +
7               +      +     -
8               +      +     +
Half factorial (4 runs):
Standard Order  Sugar  Milk  Bean           Run Order
1               -      -     (-).(-) = +
2               -      +     (-).(+) = -
3               +      -     (+).(-) = -
4               +      +     (+).(+) = +
+C = A.B
-C = A.B
Half Factorial
#  Sugar (A)  Milk (B)  Bean (C)  A.B  B.C  C.A  A.B.C
1  -          -         +         +    -    -    +
2  -          +         -         -    -    +    +
3  +          -         -         -    +    -    +
4  +          +         +         +    +    +    +
A = B.C
B = A.C
C = A.B
Confounding/Aliased
❖ Effects are confounded (aliased) when they cannot be estimated separately from each other.
A = B.C
B = A.C
C = A.B
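The aliasing above can be checked by building the half fraction and comparing sign columns. The sketch below is an illustration that assumes coded levels of -1 and +1 and the generator C = A.B; the helper name `column` is hypothetical.

```python
from itertools import product

# Half fraction: a full 2^2 design in A and B, with C set to the product A*B.
runs = [(a, b, a * b) for a, b in product((-1, +1), repeat=2)]
index = {"A": 0, "B": 1, "C": 2}


def column(effect):
    """Sign column of an effect such as 'A' or 'BC' across the four runs."""
    signs = []
    for run in runs:
        value = 1
        for letter in effect:
            value *= run[index[letter]]
        signs.append(value)
    return signs


# Aliased effects have identical columns, so they cannot be told apart.
print(column("A") == column("BC"))  # True
print(column("B") == column("AC"))  # True
print(column("C") == column("AB"))  # True
print(column("ABC"))                # [1, 1, 1, 1]
```

Because the column for A is the same as the column for B.C, any contrast computed from it estimates the sum of the two effects rather than either one alone.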
[Figure: chart comparing Machine 1, Machine 2 and Machine 3]
One Factor Experiments
❖ How to deal with nuisance factors?
❖ Known and Controllable
❖ Blocking (shift, operator)
❖ Unknown and Uncontrollable
❖ Randomization (vibration)
[Diagram: Patients are split into an Experimental Group (Treatment A) and a Control Group (Placebo); the results of the two groups are compared]
Randomized Block Design
❖ How to deal with nuisance factors?
❖ Known and Controllable
❖ Blocking (gender)
❖ Unknown and Uncontrollable
❖ Randomization (age, medical history)
❖ Example: 30 patients (16 male, 14 female)
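The blocking and randomization steps for this example can be sketched as follows. The slides do not say how many patients go to each group, so the even split within each block is an assumption, as are the patient labels, the fixed seed and the function name.

```python
import random


def randomized_block_assignment(blocks, seed=0):
    """Randomly split each block in half: Treatment A vs Placebo (assumed even split)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    assignment = {}
    for block_name, patients in blocks.items():
        shuffled = patients[:]
        rng.shuffle(shuffled)  # randomization handles unknown nuisance factors
        half = len(shuffled) // 2
        for patient in shuffled[:half]:
            assignment[patient] = (block_name, "Treatment A")
        for patient in shuffled[half:]:
            assignment[patient] = (block_name, "Placebo")
    return assignment


# Blocking on gender: 16 males and 14 females, with hypothetical labels
blocks = {
    "male": [f"M{i}" for i in range(1, 17)],
    "female": [f"F{i}" for i in range(1, 15)],
}
assignment = randomized_block_assignment(blocks)

for block in blocks:
    for group in ("Treatment A", "Placebo"):
        count = sum(1 for value in assignment.values() if value == (block, group))
        print(block, group, count)
```

Results are then compared within each block, so a difference between males and females does not get mixed into the treatment comparison.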
Randomized Block Design
[Diagram: Patients are first blocked by gender into Males (16) and Females (14). Within each block, patients are split into an Experimental Group (Treatment A) and a Control Group (Placebo), and results are compared within the block]
D A B C
B C D A
A B C D
C D A B
Two Level Factorial Experiments
❖ 3 Factors Two Level Full Factorial requires 8 experiments
Standard Order  Sugar  Milk  Bean  Rating  Run Order
1               -      -     -
2               -      -     +
3               -      +     -
4               -      +     +
5               +      -     -
6               +      -     +
7               +      +     -
8               +      +     +
[Figure: design plot with Sugar (-/+) and Milk (-/+) axes]
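A two-level full factorial table can be generated for any number of factors. The sketch below lists the runs in the same standard order as the table (first factor changing slowest) and fills the Run Order column with a random permutation. The function name and the fixed seed are illustrative assumptions.

```python
import random
from itertools import product


def full_factorial(factors, seed=0):
    """Return the 2^k runs in standard order, each with a randomized run order."""
    level_rows = list(product("-+", repeat=len(factors)))
    run_order = list(range(1, len(level_rows) + 1))
    random.Random(seed).shuffle(run_order)  # random run order; seed fixed for repeatability
    return [
        {"standard_order": i + 1, **dict(zip(factors, levels)), "run_order": run_order[i]}
        for i, levels in enumerate(level_rows)
    ]


design = full_factorial(["Sugar", "Milk", "Bean"])
print(len(design))  # 8
for row in design:
    print(row)
```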
Two Level Factorial
❖ 3 Factors Two Level Half Factorial requires 4 experiments
❖ Partial (fractional) factorial experiments have confounding.
#  Sugar (A)  Milk (B)  Bean (C)  A.B  B.C  C.A  A.B.C
1  -          -         +         +    -    -    +
2  -          +         -         -    -    +    +
3  +          -         -         -    +    -    +
4  +          +         +         +    +    +    +
A = B.C
B = A.C
C = A.B
Resolution
❖ Resolution III
❖ No main effects are aliased with any other main effect,
❖ but main effects are aliased with 2-factor interactions.
A = B.C
B = A.C
C = A.B
❖ Resolution IV
❖ No main effects are aliased with any other
main effect or 2-factor interactions,
❖ but some 2-factor interactions are aliased with
other 2-factor interactions and main effects are
aliased with 3-factor interactions.
❖ Resolution V
❖ No main effects or 2-factor interactions are
aliased with any other main effect or 2-factor
interactions,
❖ but 2-factor interactions are aliased with 3-factor
interactions and main effects are aliased
with 4-factor interactions.
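The resolution levels above can be tied to a design's generators. The sketch below uses the standard result that the resolution equals the length of the shortest word in the defining relation, writing a generator such as C = A.B as the word "ABC" (I = A.B.C). The function names are illustrative and do not come from the slides.

```python
from itertools import combinations


def multiply(word_a, word_b):
    """Multiply two effects; a letter appearing in both cancels (X.X = I)."""
    return "".join(sorted(set(word_a) ^ set(word_b)))


def defining_words(generators):
    """All words of the defining relation: every product of one or more generators."""
    words = set()
    for size in range(1, len(generators) + 1):
        for combo in combinations(generators, size):
            word = ""
            for generator in combo:
                word = multiply(word, generator)
            words.add(word)
    return words


def resolution(generators):
    """Length of the shortest word in the defining relation."""
    return min(len(word) for word in defining_words(generators))


def aliases(effect, generators):
    """Effects that share a sign column with the given effect."""
    return sorted(multiply(effect, word) for word in defining_words(generators))


print(resolution(["ABC"]), aliases("A", ["ABC"]))     # 3 ['BC']
print(resolution(["ABCD"]), aliases("AB", ["ABCD"]))  # 4 ['CD']
print(resolution(["ABCDE"]), aliases("A", ["ABCDE"])) # 5 ['BCDE']
```

The three examples match the bullets: in Resolution III a main effect is aliased with a 2-factor interaction, in Resolution IV 2-factor interactions are aliased with each other, and in Resolution V a main effect is aliased with a 4-factor interaction.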
2^(k-p) Designs