Chapter 2 Final of Final
Probability Distribution
Statistics
• Statistics is a collection of techniques useful for making decisions about a
process or population based on an analysis of the information contained in a
sample from that population.
• Statistical methods play a vital role in quality control and improvement.
Two kinds of statistics
• Descriptive and inferential
DESCRIBING VARIATION
• No two units of product produced by a process are identical. Some variation
is inevitable. As examples, the net content of a can of soft drink varies
slightly from can to can, and the output voltage of a power supply is not
exactly the same from one unit to the next.
• Statistics is the science of analyzing data and drawing conclusions, taking
variation in the data into account.
DESCRIBING VARIATION
• The Stem-and-Leaf Plot
• The Histogram
• Numerical Summary of Data
• The Box Plot
• Probability Distributions
1. The Stem-and-Leaf Plot
• Suppose that the data are represented by x1, x2, . . . , xn and that each
number xi consists of at least two digits.
• To construct a stem-and-leaf plot, we divide each number xi into two parts:
a stem, consisting of one or more of the leading digits; and a leaf,
consisting of the remaining digits.
• For example, if the data consist of percent defective information between 0
and 100 on lots of semiconductor wafers, then we can divide the value 76 into
the stem 7 and the leaf 6.
Example
• 15, 27, 8, 17, 13, 22, 24, 25, 13, 36, 32, 32, 32, 28, 43, 7
• Step 1 – Arrange in ascending order
0th – 7, 8
Question A. 72, 85, 89, 93, 88, 76, 108, 115, 97, 102, 113
Question B. 1.2, 2.3, 1.5, 2.4, 3.6, 1.8, 2.7, 3.2, 4.1, 2.9, 4.5, 7.6, 5.8,
9.3, 10.6, 12.4, 10.9
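The stem-and-leaf construction for the worked example data (15, 27, 8, ...) can be sketched in Python as follows. This is only an illustrative sketch: the function name `stem_and_leaf` is hypothetical, and it assumes non-negative whole numbers with the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole numbers by stem (tens and above) and leaf (units digit)."""
    groups = defaultdict(list)
    for v in sorted(values):          # Step 1: arrange in ascending order
        stem, leaf = divmod(v, 10)    # e.g. 76 -> stem 7, leaf 6
        groups[stem].append(leaf)
    return dict(groups)


data = [15, 27, 8, 17, 13, 22, 24, 25, 13, 36, 32, 32, 32, 28, 43, 7]
plot = stem_and_leaf(data)

# Print one row per stem, including stems that have no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

The first row (stem 0 with leaves 7 and 8) matches the "0th" row started in the example.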
2. Histogram
➢ The histogram is a bar chart showing the distribution of a variable, that is,
a graphical presentation of a grouped frequency distribution.
➢ It consists of a series of adjacent rectangles
❑ whose bases are the class intervals, specified in terms of class boundaries
(so each base equals the class width of the corresponding class), shown on the
x-axis, and
❑ whose heights are proportional to the corresponding class frequencies, shown
on the y-axis.
Histogram
➢It can help suggest both the nature, and possible improvements
for the physical mechanisms/ quality characteristics in the
process.
A histogram is
designed to show:
Step-2. Calculate the number of classes (K), if not given, using Sturges'
formula: K = 1 + 3.3 log N, where N is the number of observations.
Range = 134 − 100 = 34
K = 1 + 3.3 log N = 1 + 3.3 × log 50 = 6.61, so the number of classes can be
taken as 7.
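The class-count step can be checked numerically with a short sketch. It assumes the logarithm is base 10 (which reproduces the 6.61 above) and that a fractional class count is rounded up; the function names are hypothetical.

```python
import math


def sturges_classes(n_obs):
    """Return (exact K, K rounded up) from Sturges' formula K = 1 + 3.3 log10(N)."""
    k = 1 + 3.3 * math.log10(n_obs)
    return k, math.ceil(k)


def class_width(data_range, n_classes):
    """Approximate class width: range divided by the number of classes."""
    return data_range / n_classes


k_exact, k_classes = sturges_classes(50)
width = class_width(134 - 100, k_classes)

print(round(k_exact, 2), k_classes)   # exact K and the class count used
print(math.ceil(width))               # width rounded up to a whole number
```

With a range of 34 and 7 classes the width rounds up to 5, which agrees with class boundaries spaced 5 apart.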
[Figure: histogram; x-axis: Values (degrees Fahrenheit), with class boundaries
99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
3. Numerical Summary of Data
• The stem-and-leaf plot and the histogram provide a visual display of three
properties of sample data:
1 → the shape of the distribution of the data,
2 → the central tendency in the data, and
3 → the scatter or variability in the data.
It is also helpful to use numerical measures of central tendency and scatter.
Average & Variance
• Suppose that x1, x2, . . . , xn are the observations in a sample. The most
important measure of central tendency in the sample is the sample average,
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The sample average represents the center of mass of the sample data.
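As a small numerical illustration, the sample average can be computed directly from its definition. The sketch below reuses the stem-and-leaf example data purely for illustration; the sample variance with divisor n − 1 is the usual textbook definition and is an assumption here, since its formula does not survive in these slides.

```python
import statistics

data = [15, 27, 8, 17, 13, 22, 24, 25, 13, 36, 32, 32, 32, 28, 43, 7]
n = len(data)

# Sample average: sum of the observations divided by n.
sample_mean = sum(data) / n

# Sample variance (assumed definition): squared deviations divided by n - 1.
sample_var = sum((x - sample_mean) ** 2 for x in data) / (n - 1)

print(sample_mean)
print(sample_var, statistics.variance(data))  # the two should agree
```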
Probability deals with predicting the likelihood of future events.
Statistics involves the analysis of the frequency of past events.
Example: Consider a drawer containing 100 socks: 30 red, 20 blue and 50 black.
We can use probability to answer questions about the selection of a random
sample of these socks.
• PQ1. What is the probability that we draw two blue socks or two red socks
from the drawer?
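One way to work PQ1 is sketched below. It assumes the two socks are drawn without replacement, which the question does not state explicitly; `prob_two_same` is a hypothetical helper name.

```python
from fractions import Fraction


def prob_two_same(count, total):
    """P(both draws are the given colour), drawing twice without replacement."""
    return Fraction(count, total) * Fraction(count - 1, total - 1)


# "Two blue" and "two red" are mutually exclusive, so their probabilities add.
p_two_blue = prob_two_same(20, 100)
p_two_red = prob_two_same(30, 100)
p_answer = p_two_blue + p_two_red

print(p_answer, float(p_answer))
```

Under this assumption the answer is 25/198, roughly 0.126.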
Two Kinds of Distributions
Probability Distributions
A Discrete Distribution
Random Numbers and Discrete Distribution
Cumulative Distribution Function
The cumulative distribution function (CDF), F(x), of a discrete random
variable X is defined by
$F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$
Example
x         -1    0     1    2
P(X=x)    0.2   0.3   a    b
$E(X) = \sum_{x} x\,P(X = x)$
Cont’d
The expected value of a function h(X) of a discrete random variable X is:
$E[h(X)] = \sum_{\text{all } x} h(x)\,P(x)$
Example: a linear function
$h(x) = ax + b$
$E(aX + b) = aE(X) + b$
$E[h(X)] = a \sum_{x} x\,P(X = x) + b$
Example
$y = 3x + 5$
X    P(x)    XP(x)
3    0.2     0.6
4    0.1     0.4
5    0.3     1.5
6    0.2     1.2
7    0.2     1.4
$\sum x\,P(x) = 5.1$
$E(y) = 3E(x) + 5 = 3(5.1) + 5 = 20.3$
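The expected value of a linear function in the example above can be verified with a short sketch. The helper name `expected_value` is hypothetical; the distribution is the one in the table.

```python
def expected_value(dist, h=lambda x: x):
    """E[h(X)] = sum of h(x) * P(x) over all x of a discrete distribution."""
    return sum(h(x) * p for x, p in dist.items())


dist = {3: 0.2, 4: 0.1, 5: 0.3, 6: 0.2, 7: 0.2}

e_x = expected_value(dist)                          # E(X)
e_y = expected_value(dist, lambda x: 3 * x + 5)     # E(3X + 5) computed directly

print(round(e_x, 4), round(e_y, 4), round(3 * e_x + 5, 4))
```

Computing E(3X + 5) directly and via 3E(X) + 5 gives the same value, which is the linearity property stated above.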
Cont’d
Example
Monthly sales of a certain product are believed to follow the given probability
distribution. Suppose the company has a fixed monthly production cost of $8000
and that each item brings in $2. Find the expected monthly profit, h(X), from
product sales.
With h(X) = 2X − 8000 and E(X) = 6700 from the distribution:
$E[h(X)] = 2E(X) - 8000 = 2 \times 6700 - 8000 = 5400$
Variance and Standard Deviation of a Random Variable
The variance of a random variable is the expected squared deviation from the
mean:
$\sigma^2 = V(X) = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 P(x)$
$V(X) = E(X^2) - [E(X)]^2$
Example
• Suppose 60% of Ethiopian adults approve of the way Prime Minister Abiy is
handling his job.
• A sample of 2 Ethiopian adults is randomly selected.
• Let X represent the number that approve.
• Calculate V(X).
x 0 1 2
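A sketch of the calculation follows. It assumes the two adults respond independently, each approving with probability 0.6, so that X follows a binomial distribution with n = 2; that modelling assumption is not spelled out on the slide.

```python
from math import comb


def binomial_pmf(n, p):
    """P(X = k) for k = 0..n under a binomial(n, p) model."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


pmf = binomial_pmf(2, 0.6)

mean = sum(x * px for x, px in pmf.items())        # E(X)
mean_sq = sum(x**2 * px for x, px in pmf.items())  # E(X^2)
variance = mean_sq - mean**2                       # V(X) = E(X^2) - [E(X)]^2

for x, px in pmf.items():
    print(x, round(px, 2))
print(round(mean, 2), round(variance, 2))
```

Under this model P(0) = 0.16, P(1) = 0.48, P(2) = 0.36, giving E(X) = 1.2 and V(X) = 0.48.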
Some Properties of Means and Variances of Random Variables
The mean or expected value of the sum of random variables is the sum of their
means or expected values:
$\mu_{(X+Y)} = E(X + Y) = E(X) + E(Y) = \mu_X + \mu_Y$
For example: E(X) = $350 and E(Y) = $200
E(X+Y) = $350 + $200 = $550
The variance of the sum of mutually independent random variables is the sum of
their variances:
$\sigma^2_{(X+Y)} = V(X + Y) = V(X) + V(Y) = \sigma^2_X + \sigma^2_Y$
if X and Y are independent.
NOTE:
$E(a_1 X_1 + a_2 X_2 + \dots + a_k X_k) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_k E(X_k)$
and, for mutually independent $X_1, \dots, X_k$,
$V(a_1 X_1 + a_2 X_2 + \dots + a_k X_k) = a_1^2 V(X_1) + a_2^2 V(X_2) + \dots + a_k^2 V(X_k)$
Cont’d
The variance of a linear function of a random variable is:
$V(aX + b) = a^2 V(X) = a^2 \sigma^2$
Example
Number of items, x    P(x)    xP(x)    x²         x²P(x)
5000                  0.2     1000     25×10⁶     5,000,000
6000                  0.3     1800     36×10⁶     10,800,000
7000                  0.2     1400     49×10⁶     9,800,000
...
$V(X) = E(X^2) - [E(X)]^2 = 46{,}500{,}000 - (6700)^2 = 1{,}610{,}000$
SD = 1,268.86
$V(2X - 8000) = (2^2)V(X) = (4)(1{,}610{,}000) = 6{,}440{,}000$
Cont’d
4. V(c) = 0
• The variance of a constant (c) is zero.
5. V(X + c) = V(X)
• The variance of a random variable plus a constant is just the variance of the
random variable.
6. V(cX) = c²V(X)
• The variance of a random variable multiplied by a constant coefficient is the
coefficient squared times the variance of the random variable.
Continuous Random Variable & Probability Distributions
Random Variables
• Discrete Random Variable
• Continuous Random Variable
Introduction
Rules for CRV
Continuous Probability Distributions
• Continuous Uniform Distributions
• Continuous Normal Distributions
• Continuous Joint Probability Distributions
• Exponential Distributions
• Weibull Distributions
Uniform Distribution
Continuous random variables that appear to have
equally likely outcomes over their range of possible
values possess a uniform probability distribution.
Suppose the random variable
x can assume values only in
an interval c ≤ x ≤ d. Then the
uniform frequency function
has a rectangular shape.
Uniform Probability Distribution of Random Variable x
Probability density function: $f(x) = \frac{1}{d - c}, \quad c \le x \le d$
Mean: $\mu = \frac{c + d}{2}$    Standard deviation: $\sigma = \frac{d - c}{\sqrt{12}}$
$P(a < x < b) = \frac{b - a}{d - c}, \quad c \le a < b \le d$
Cont’d
$f(x) = \frac{1}{d - c} = \frac{1}{12.5 - 11.5} = \frac{1}{1} = 1.0$
[Figure: uniform density f(x) = 1.0 over 11.5 ≤ x ≤ 12.5, with x = 11.8 marked]
for 40 x 60
60 − 40
f ( x) =
0 elsewhere
74
Uniform probability distribution of total expected
travel cost for the firm
Let x represent the total anticipated annual travel costs in thousands of
dollars. The distribution has density f(x) = 1/(60 − 40) = 1/20 over the range
from 40 to 60 and density 0 elsewhere.
[Figure: uniform density f(x) = 1/20 over 40 ≤ x ≤ 60]
Area of the distribution
[Figure: uniform density f(x) = 1/20 over 40 ≤ x ≤ 60; x-axis: Travel costs]
• Note that the total area under the density function f(x) between 40 and 60
equals 1.
Area = height × length = (1/20) × (60 − 40) = (1/20) × 20 = 1, and this equals
the probability that some value within the range from 40 to 60 occurs.
Area and probability over an interval
[Figure: uniform density f(x) = 1/20 over 40 ≤ x ≤ 60, with the region from 40
to 50 marked; x-axis: Travel costs]
The probability that travel costs are between 40 and 50 is the area under the
density between 40 and 50:
Area = height × length = (1/20) × (50 − 40) = (1/20) × 10 = 0.5
and this is the required probability.
Expected value and variance for a uniform probability distribution
For the travel cost example, with a = 40 and b = 60:
$E(x) = \frac{a + b}{2} = (40 + 60)/2 = 50$
$Var(x) = \frac{(b - a)^2}{12} = (60 - 40)^2/12 = 33.333$
$\sigma = \frac{b - a}{\sqrt{12}} = \sqrt{33.333} = 5.774$
Example: Slater's Buffet
Slater customers are charged for the amount of salad they
take. Sampling suggests that the amount of salad taken is
uniformly distributed between 5 ounces and 15 ounces.
What is the probability that a customer will take between
12 and 15 ounces of salad?
Example: Slater's Buffet
[Figure: uniform density f(x) = 1/10 over 5 ≤ x ≤ 15; x-axis: Salad Weight
(oz.), with ticks at 5, 10, 12 and 15]
Example: Slater's Buffet
• Expected Value of x
E(x) = (a + b)/2
= (5 + 15)/2
= 10
• Variance of x
Var(x) = (b − a)²/12
= (15 − 5)²/12
= 8.33
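The uniform-distribution calculations for Slater's Buffet (and the travel cost example before it) can be reproduced with a few helper functions. This is a minimal sketch; the function names are hypothetical.

```python
def uniform_prob(lo, hi, a, b):
    """P(a <= X <= b) for X uniform on [lo, hi]: interval length / total length."""
    a, b = max(a, lo), min(b, hi)      # clip the interval to the support
    return max(b - a, 0) / (hi - lo)


def uniform_mean(lo, hi):
    return (lo + hi) / 2


def uniform_var(lo, hi):
    return (hi - lo) ** 2 / 12


# Slater's Buffet: salad weight uniform between 5 and 15 ounces.
p_salad = uniform_prob(5, 15, 12, 15)
print(p_salad, uniform_mean(5, 15), round(uniform_var(5, 15), 2))

# Travel costs: uniform between 40 and 60 (thousands of dollars).
p_travel = uniform_prob(40, 60, 40, 50)
print(p_travel, uniform_mean(40, 60), round(uniform_var(40, 60), 3))
```

The probability that a customer takes between 12 and 15 ounces is the area of a rectangle of width 3 and height 1/10, which is 0.3.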
Normal probability distribution
• The normal probability distribution is the most common and
important of the continuous probability distributions used in
statistical and econometric work.
• Other names for the normal distribution are the bell curve,
since it has a sort of bell shape, and the Gaussian distribution,
after Gauss, who is considered to be the first to have described
and used the distribution.
• The most often used continuous probability distribution is the normal
distribution; it is also known as the Gaussian distribution.
[Figure: bell-shaped normal density f(x) against x, centred at μ]
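Formula and parameters for the normal distribution
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$
where μ is the mean and σ is the standard deviation.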
Normal Distribution
• The mathematical equation for the probability distribution of the normal variable
depends upon the two parameters 𝜇 and 𝜎, its mean and standard deviation.
[Figure: normal density f(x) against x, centred at μ with spread σ]
Normal Distribution: Summary
1. ‘Bell-shaped’ and symmetrical
2. Mean, median, mode are equal
3. Location is characterized by the mean, μ
4. Spread is characterized by the standard deviation, σ
5. The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell curve f(x) against x, with Mean = Median = Mode at the centre]
Normal Distribution
[Figure, three panels:
normal curves with µ1 < µ2 and σ1 = σ2;
normal curves with µ1 = µ2 and σ1 < σ2;
normal curves with µ1 < µ2 and σ1 < σ2]
Properties of Normal Distribution
– The curve is symmetric about a vertical axis through the mean μ.
– The random variable x can take any value from −∞ to ∞.
– The most frequently used descriptive parameters define the curve itself.
– The mode, which is the point on the horizontal axis where the curve is a
maximum, occurs at x = μ.
– The total area under the curve and above the horizontal axis is equal to 1:
$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx = 1$
– $\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$
– $\sigma^2 = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x-\mu)^2\,e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$
– $P(x_1 < x < x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$
denotes the probability of x in the interval (x1, x2).
Standard Normal Distribution
• Calculating P(x1 < x < x2) directly from the normal density is
computationally awkward, because the integral must be evaluated again for every
pair (x1, x2) and every μ and σ.
• To avoid this difficulty, the z-transformation is used:
$z = \frac{x - \mu}{\sigma}$   [Z-transformation]
so that $f(x; \mu, \sigma)$ becomes $f(z; 0, 1)$.
Effect of Varying Parameters (μ & σ)
Normal Probability Distribution
Probability is area under the curve!
$P(c \le x \le d) = \int_{c}^{d} f(x)\,dx$ ?
[Figure: normal density f(x) against x, with the area between c and d marked]
Standard Normal Distribution
➢ The standard normal distribution is a normal distribution with µ = 0 and
σ = 1. A random variable with a standard normal distribution, denoted by the
symbol z, is called a standard normal random variable.
➢ Probabilities associated with values of this standard normal random variable
are tabulated.
$z = \frac{x - \mu}{\sigma}$
The random variable Z represents the distance of X from its mean in terms of
standard deviations. It is the key step to calculate a probability for an
arbitrary normal random variable.
Example 2.
Students’ scores in the Statistics for Engineers course are approximately
normally distributed with mean 80 and standard deviation 5.
• What is the probability that a student scores 82 or less?
P(X ≤ 82) = P(Z ≤ (82-80)/5) = P(Z ≤ .40) = .6554
• What is the probability that a student scores a 90 or more?
P(X ≥ 90) = P(Z ≥ (90-80)/5) = P(Z ≥ 2.00) = 1 - P(Z ≤ 2.00) = 1 - .9772 = .0228
• What is the probability that a student scores a 74 or less?
P(X ≤ 74) = P(Z ≤ (74-80)/5) = P(Z ≤ -1.20) = .1151
If your table does not have negatives, use P(Z ≤ -1.20) = P(Z ≥ 1.20) = 1 - .8849 =
.1151
• What is the probability that a student scores between 78 and 88?
P(78 ≤ X ≤ 88) = P((78-80)/5 ≤ Z ≤ (88-80)/5) = P(-0.40 ≤ Z ≤ 1.60) = P(Z ≤
1.60) - P(Z ≤ -0.40) = .9452 - .3446 = .6006
• What is the probability that an average of three scores is 82 or less?
P(X̄ ≤ 82) = P(Z ≤ (82-80)/(5/√3)) = P(Z ≤ .69) = .7549
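The table lookups in Example 2 can be cross-checked numerically. The sketch below evaluates the standard normal CDF through the error function, `math.erf`, rather than a printed table; the function names `phi` and `normal_cdf` are hypothetical. Because the table answer for the last part rounds z to .69, the unrounded value differs slightly in the third decimal.

```python
import math


def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma), using z = (x - mu) / sigma."""
    return phi((x - mu) / sigma)


mu, sigma = 80, 5
p_le_82 = normal_cdf(82, mu, sigma)
p_ge_90 = 1 - normal_cdf(90, mu, sigma)
p_le_74 = normal_cdf(74, mu, sigma)
p_78_88 = normal_cdf(88, mu, sigma) - normal_cdf(78, mu, sigma)
# Average of three scores: standard deviation of the mean is sigma / sqrt(3).
p_mean_le_82 = normal_cdf(82, mu, sigma / math.sqrt(3))

for p in (p_le_82, p_ge_90, p_le_74, p_78_88, p_mean_le_82):
    print(round(p, 4))
```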
Home Study
You work in Quality Control for GE. Light bulb life has a normal distribution
with μ = 2000 hours and σ = 200 hours. What’s the probability that a bulb will
last
A. between 2000 and 2400 hours?
B. less than 1470 hours?
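Solution
A. P(2000 ≤ x ≤ 2400)
$z = \frac{x - \mu}{\sigma} = \frac{2400 - 2000}{200} = 2.0$
On the standardized normal distribution (μ = 0, σ = 1), the area between z = 0
and z = 2.0 is .4772, so P(2000 ≤ x ≤ 2400) = .4772.
B. P(x < 1470)
$z = \frac{1470 - 2000}{200} = -2.65$
The area between z = −2.65 and z = 0 is .4960, so
P(x < 1470) = .5000 − .4960 = .0040.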
Finding the Area to the Right of a Number a for
an Exponential Distribution
Estimation and Statistical Intervals
Estimation theory
Estimation refers to any procedure where sample information is used to
estimate or predict the value of a population parameter.
Parameter is a characteristic or measure obtained from a population
Statistic is a characteristic or measure obtained from a sample
An Estimator is a sample statistic that is used in estimating a population parameter.
An Estimate is the value determined from the estimator as an estimate of the
population parameter.
There are two ways of estimation
1) Point Estimation and
2) Interval Estimation
1. Point Estimation
✓A single-valued estimate.
✓A single element chosen from a sampling distribution.
✓Conveys little information about the actual value of the population
parameter and about the accuracy of the estimate.
For example, a population mean (μ) is estimated by a sample mean (x̄), and a
population standard deviation (σx) is estimated by a sample standard deviation
(Sx).
Point Estimation
Property of Estimators
• A desirable property of a good estimator is that it be unbiased.
• An estimator is unbiased if, in repeated random samples, the numerical values
of the estimator stack up around the population parameter that we are trying
to estimate.
• If the values from repeated random samples are centred somewhere else, then
the estimator exhibits some amount of bias.
2. Interval Estimation (Confidence Interval )
Point estimation produces a single value as an estimate of a
population parameter. The estimate may or may not be
close to the actual parameter value; thus, the estimate might
be incorrect.
• An interval estimate describes a range of values within
which a parameter might lie.
• An interval or range of values believed to include the
unknown population parameter.
• Associated with the interval there is a measure of
confidence that the interval does indeed contain the
parameter of interest.
• Because of this, interval estimation is more desirable than point
estimation.
• A confidence interval or interval estimate has two components:
✓A range or interval of values
✓An associated level of confidence
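Confidence Interval Estimation of a Population Mean μ
Normal density:
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ for $-\infty < x < \infty$,
where e = 2.718281... and π = 3.14159265...
[Figure: Normal Distribution: μ = 0, σ = 1; f(x) against x from −5 to 5]
$P\left(\bar{x} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z\frac{\sigma}{\sqrt{n}}\right) = 0.95$   (Z = ?)
$P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$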
Normal Probabilities (Empirical Rule)
• The probability that a normal random variable will be within 1 standard
deviation from its mean (on either side) is approximately 0.68.
[Figure: Standard Normal Distribution, f(z), with 2.5% in each tail; a sample
mean falls within the 95% interval μ ± 1.96σ/√n around the population mean,
and the corresponding interval around the sample mean runs from x̄ − 1.96σ/√n
to x̄ + 1.96σ/√n]
$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha$
(1 − α)100% Confidence Interval:
$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
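The (1 − α)100% interval x̄ ± z(α/2)·σ/√n can be sketched in code. The critical value is obtained from `statistics.NormalDist().inv_cdf` in the Python standard library instead of a printed table. The sample figures in the usage lines (x̄ = 100, σ = 15, n = 36) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """z_(alpha/2): the point with upper-tail area alpha/2 under the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    half_width = z_critical(confidence) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


print(round(z_critical(0.95), 3), round(z_critical(0.99), 3))

# Hypothetical sample: xbar = 100, known sigma = 15, n = 36.
low, high = z_interval(100, 15, 36, confidence=0.95)
print(round(low, 2), round(high, 2))
```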
Critical Values of z and Levels of Confidence
(1 − α)    α/2      z(α/2)
0.99       0.005    2.576
[Figure: Standard Normal Distribution, f(z) against Z from −5 to 5, with
central area (1 − α) between −z(α/2) and z(α/2)]
z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
Look in the row labeled 1.5 and the column labeled .06 to find the area for
z = 1.56: 0.4406.
Cont’d
• The t distributions approach the standard normal distribution as n
increases.
• As a result, we can use the standard normal distribution (z value table)
when σ is not known and n > 30 in constructing an approximate interval
estimate for μ.
• When n < 30 and σ is not known, the t distribution table is used.
Cont’d
df     t.100    t.050    t.025    t.010    t.005
...
9      1.383    1.833    2.262    2.821    3.250
...
17     1.333    1.740    2.110    2.567    2.898
18     1.330    1.734    2.101    2.552    2.878
19     1.328    1.729    2.093    2.539    2.861
20     1.325    1.725    2.086    2.528    2.845
21     1.323    1.721    2.080    2.518    2.831
22     1.321    1.717    2.074    2.508    2.819
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
26     1.315    1.706    2.056    2.479    2.779
27     1.314    1.703    2.052    2.473    2.771
28     1.313    1.701    2.048    2.467    2.763
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.326    2.576
[Figure: t density f(t) against t, with Area = 0.025 in each tail]
Whenever σ is not known (and the population is assumed normal), the correct
distribution to use is the t distribution with n−1 degrees of freedom. Note,
however, that for large degrees of freedom, the t distribution is approximated
well by the Z distribution.
Example 1
A stock market analyst wants to estimate the average return on a certain
stock. A random sample of 15 days yields an average (annualized) return of
x̄ = 10.37% and a standard deviation of s = 3.5%. Assuming a normal
population of returns, give a 95% confidence interval for the average return
on this stock.
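A sketch of the interval for Example 1 follows. Since σ is unknown and n = 15, it uses the t distribution with n − 1 = 14 degrees of freedom. The critical value 2.145 is the standard table value for t(.025, 14); it is hard-coded as an assumption because that row does not appear in the table excerpt above.

```python
import math

# Assumed table value: upper 2.5% point of the t distribution with 14 df.
T_025_DF14 = 2.145


def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar +/- t * s / sqrt(n) for an unknown sigma."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


low, high = t_interval(xbar=10.37, s=3.5, n=15, t_crit=T_025_DF14)
print(round(low, 2), round(high, 2))
```

Under these assumptions the 95% interval for the average return is roughly 8.43% to 12.31%.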
Determination of Sample Size
• Collecting valid information through sampling requires
careful planning, including determination of an appropriate
sample size.
• How large should the sample size be? The answer depends
on the following three factors.
1. How precise (narrow) do we want a confidence interval estimate
to be?
2. How confident do we want to be that the interval estimate is
correct?
3. What is the standard deviation of the population in question?
• Generally the higher the desired precision or level of
confidence, the larger will be the sample size.
• And also, the larger the population variability is, the larger
will be the sample size.
Sample Size for Interval Estimation of μ
• Consider
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Solving this for n we get the following:
$n = \frac{z^2 \sigma^2}{(\bar{x} - \mu)^2}$
Cont’d
Example:
For the purpose of illustration, assume the desired confidence level is 95%.
If σ = 15 and we want an estimate of μ with a maximum error in estimation of 5,
the required sample size would be computed as follows.
Solution: $n = \frac{z^2 \sigma^2}{(\bar{x} - \mu)^2}$
c.l. = 0.95, α = 0.05, z = 1.96
σ = 15
|x̄ − μ| = 5
n = [(1.96)² × (15)²] / 5²
= 34.5744, or 35
In sample size determination, we always round up to the next whole number,
whatever the value of the decimal part.
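The sample-size calculation above can be written as a small function. This is a minimal sketch with a hypothetical function name; it always rounds up, as the note above requires.

```python
import math


def sample_size(z, sigma, max_error):
    """Return (exact n, n rounded up) from n = z^2 * sigma^2 / E^2."""
    n_exact = (z * sigma / max_error) ** 2
    return n_exact, math.ceil(n_exact)


n_exact, n_required = sample_size(z=1.96, sigma=15, max_error=5)
print(round(n_exact, 4), n_required)
```

Halving the maximum error to 2.5 would quadruple the exact value of n, which illustrates why higher precision demands a larger sample.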
HYPOTHESIS TESTING
• The techniques of statistical inference can be classified into two broad
categories: parameter estimation and hypothesis testing. We have already
briefly introduced the general idea of point estimation of process parameters.
• A statistical hypothesis is a statement about the values of the parameters of
a probability distribution. For example, suppose we think that the mean inside
diameter of a bearing is 1.500 in. We may express this statement in a formal
manner as
$H_0: \mu = 1.500$
$H_1: \mu \neq 1.500$
• An important part of any hypothesis testing problem is determining the parameter values specified in the null and
alternative hypotheses.
• Generally, this is done in one of three ways.
• First, the values may result from past evidence or knowledge. This happens frequently in statistical quality
control, where we use past information to specify values for a parameter corresponding to a state of
control, and then periodically test the hypothesis that the parameter value has not changed.
• Second, the values may result from some theory or model of the process.
• Finally, the values chosen for the parameter may be the result of contractual or design specifications, a situation
that occurs frequently.
• To test a hypothesis, we take a random sample from the population under
study, compute an appropriate test statistic, and then either reject or fail to
reject the null hypothesis. The set of values of the test statistic leading to
rejection of 𝐻0 is called the critical region or rejection region for the test.
• Two kinds of errors may be committed when testing hypotheses. If the null
hypothesis is rejected when it is true, then a type I error has occurred. If the
null hypothesis is not rejected when it is false, then a type II error has been
made. The probabilities of these two types of errors are denoted as
$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
• Thus, the power (1 − β) is the probability of correctly rejecting 𝐻0. In
quality control work, α is sometimes called the producer’s risk, because it
denotes the probability that a good lot will be rejected, or the probability
that a process producing acceptable values of a particular quality
characteristic will be rejected as performing unsatisfactorily.
• In addition, β is sometimes called the consumer’s risk, because it denotes the
probability of accepting a lot of poor quality, or allowing a process that is
operating in an unsatisfactory manner relative to some quality characteristic
to continue in operation.
1. Inference on the Mean of a Population, Variance Known
Confidence Interval on the Mean with Variance Known
• Confidence Intervals. An interval estimate of a parameter is the interval between two statistics that includes
the true value of the parameter with some probability.
END