Chapter 6
ESTIMATION
6.1 Introduction.
In practice it is not always possible to work with the entire population to determine the
desired statistical measures, such as the mean and standard deviation. This is because the
population may be infinite, or so large that studying all of it would be very expensive.
Estimation uses sampling techniques: findings from a sample are used
to represent the whole population. Estimators are the formulas used to estimate the
population parameters.
A good estimator must have several properties, including the following:
(a) Unbiasedness
(b) Efficiency
(c) Consistency
Unbiasedness
An estimator $\hat{\theta}$ for a population parameter $\theta$ is said to be unbiased if $E(\hat{\theta}) = \theta$. The
quantity $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.
Efficiency
This property is used to compare estimators of the same population parameter $\theta$.
Among the unbiased estimators of $\theta$, the one with the smallest variance is
known as the MVUE (Minimum Variance Unbiased Estimator). The property is described as
follows:
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators for $\theta$. Then $\hat{\theta}_1$ is said to be more efficient
than $\hat{\theta}_2$ if
$\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$
Consistency
An estimator $\hat{\theta}$ for $\theta$ is said to be consistent if both its bias and variance tend to zero
as the sample size approaches infinity.
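These properties can be illustrated by simulation. The sketch below is an illustration only and is not part of the original notes: it assumes a normal population with mean 10 and standard deviation 3 (arbitrary values) and uses the sample mean as the estimator. The function name `simulate_sample_mean` is hypothetical. The average of the estimates should stay close to 10 (unbiasedness), while their variance should shrink as the sample size grows (consistency).

```python
import random
import statistics

def simulate_sample_mean(n, reps=300, mu=10.0, sigma=3.0, seed=0):
    """Return (average, variance) of the sample mean over `reps` samples of size n.

    mu and sigma describe an assumed normal population (illustrative values).
    """
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.fmean(estimates), statistics.variance(estimates)

# Repeat the experiment for increasing sample sizes.
results = {n: simulate_sample_mean(n) for n in (10, 100, 1000)}
for n, (avg, var) in results.items():
    print(f"n={n:5d}  mean of estimates={avg:.3f}  variance of estimates={var:.4f}")
```

For the sample mean the theoretical variance is $\sigma^2/n$, so the printed variances should be roughly 0.9, 0.09 and 0.009.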
6.2 Point estimation
This is the technique of estimating population parameters using single-valued
statistics (estimators). Commonly used approaches in point estimation are maximum
likelihood and the method of moments. We shall not discuss these approaches.
The value $Z_{\alpha/2}$ is called the critical value and is determined using the inverse probability
technique, that is, by reading the standard normal table in reverse. Depending on the table you use,
it is the value $c$ such that $P(Z \ge c) = \alpha/2$ or, equivalently, $P(0 \le Z \le c) = 0.5 - \alpha/2$.
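For instance, if $\alpha = 5\% = 0.05$, then $\alpha/2 = 0.025$ and $Z_{0.025} = c$.
[Figure: standard normal curve with area 0.4750 between 0 and c, and area 0.0250 to the right of c.]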
We find from the table that $P(0 \le Z \le 1.96) = 0.475$; it follows that $c = 1.96$.
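The table lookup can also be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Return c such that P(Z >= c) = alpha / 2 for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

c = z_critical(0.05)
print(round(c, 2))                            # 1.96
print(round(NormalDist().cdf(c) - 0.5, 3))    # P(0 <= Z <= c) = 0.475
```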
Example 6.1
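A population is known to have a variance of 81. A random sample of size 16 showed that
$\bar{x} = 10.5$. Estimate the population mean by means of a 95% confidence interval.
Solution
Given $\sigma^2 = 81 \Rightarrow \sigma = 9$, $n = 16$, $\bar{x} = 10.5$, $\alpha = 5\% = 0.05$
Then the 95% confidence interval is given by
$\bar{x} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 10.5 \pm Z_{0.025}\,\dfrac{9}{\sqrt{16}}$
$= 10.5 \pm (1.96)(2.25)$
$= 10.5 \pm 4.41$
$\Rightarrow 6.09 \le \mu \le 14.91$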
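The arithmetic in Example 6.1 can be reproduced with a short script. This is a sketch under the example's assumptions (known population standard deviation, normal critical value); the function name `mean_ci_known_sigma` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, conf=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = mean_ci_known_sigma(xbar=10.5, sigma=9, n=16)
print(f"{low:.2f} <= mu <= {high:.2f}")   # 6.09 <= mu <= 14.91
```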
But $Z_{\alpha/2} = Z_{0.025} = 1.96$, so we have
Example 6.3
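Repeat Example 6.2 with sample size 25.
Solution
Given $n = 25$, $\bar{x} = 520$, $s = 91$, $\alpha = 5\%$
Since the sample size is small, the distribution used is t, and hence the formula for the 95%
confidence interval for $\mu$ is
$\bar{x} \pm t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$
But $t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.064$, so we have
$520 \pm (2.064)\,\dfrac{91}{\sqrt{25}} = 520 \pm 37.56$
$\Rightarrow 482.44 \le \mu \le 557.56$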
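Then the formula for the $(1-\alpha)100\%$ confidence interval estimate for the difference between
means $\mu_x - \mu_y$ is given by
$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}$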
Example 6.4
Random variables X and Y are normally distributed with standard deviations $\sigma_x = 1.2$ and
$\sigma_y = 0.9$; random samples of observations on both variables, each of size 32, provide the
following information: $\bar{x} = 4.1$ and $\bar{y} = 3.5$. Estimate the difference between population
means by means of a 95% confidence interval.
Solution
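Since the population variances are known, the distribution used is Z, and hence the
formula for the 95% confidence interval estimate for $\mu_x - \mu_y$ is given by
$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}$
where $Z_{\alpha/2} = Z_{0.025} = 1.96$. The confidence interval is then
$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}} = (4.1 - 3.5) \pm 1.96\sqrt{\dfrac{1.2^2}{32} + \dfrac{0.9^2}{32}}$
$= 0.60 \pm 1.96(0.2652)$
$= 0.60 \pm 0.52$
$\Rightarrow 0.08 \le \mu_x - \mu_y \le 1.12$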
Comment: Since the entire interval consists of positive numbers only, we are 95% confident
that $\mu_x > \mu_y$.
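The interval for a difference between two means can be computed the same way. The sketch below follows the formula used in Example 6.4; the function name `diff_means_ci` and its parameter names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(xbar, ybar, var_x, var_y, n_x, n_y, conf=0.95):
    """Z-based confidence interval for mu_x - mu_y from two independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value Z_{alpha/2}
    margin = z * sqrt(var_x / n_x + var_y / n_y)
    diff = xbar - ybar
    return diff - margin, diff + margin

# Example 6.4: sigma_x = 1.2, sigma_y = 0.9, both samples of size 32.
low, high = diff_means_ci(4.1, 3.5, 1.2**2, 0.9**2, 32, 32)
print(f"{low:.2f} <= mu_x - mu_y <= {high:.2f}")   # 0.08 <= mu_x - mu_y <= 1.12
```

Because the function takes variances as arguments, it can also be used for large samples by passing the sample variances in place of the population variances.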
Large Sample Sizes
The distribution used is still Z, and the population variances are replaced by their
respective sample variances. Hence the formula for the $(1-\alpha)100\%$ confidence interval
estimate for the difference between means $\mu_x - \mu_y$ is given by
$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}$
Example 6.5
A utility company used to send out monthly statements to its customers without addressed
return envelopes. From a random sample of 120 customers it was determined that, on
average, it took 9 days for a payment to be made, with a sample standard deviation of 2
days.
Wishing to speed up receipt of payment, the company subsequently included pre-addressed
return envelopes with the invoices. An independent sample of 130 customers
indicated that average payment time fell to 8 days, with a sample standard deviation of 2.2
days.
Compute a 95% confidence interval estimate for the difference between population
means.
Solution
Let X represent the invoices sent without pre-addressed return envelopes.
Let Y represent the invoices sent with pre-addressed return envelopes.
The following information is given:
$n_x = 120$, $\bar{x} = 9$, $s_x = 2$, $n_y = 130$, $\bar{y} = 8$, $s_y = 2.2$
Since the sample sizes are large, the distribution used is Z; the population variances are
unknown but are replaced by the corresponding sample variances, and hence the formula
for the 95% confidence interval is given by
$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}$
where $Z_{\alpha/2} = Z_{0.025} = 1.96$, so we have
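$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}} = (9 - 8) \pm 1.96\sqrt{\dfrac{2^2}{120} + \dfrac{2.2^2}{130}}$
$= 1.0 \pm 1.96(0.2656)$
$= 1.0 \pm 0.52$
$\Rightarrow 0.48 \le \mu_x - \mu_y \le 1.52$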
$s_p^2 = \dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$
$(\bar{x} - \bar{y}) \pm t_{\alpha/2,\,n_x+n_y-2}\sqrt{\dfrac{s_p^2}{n_x} + \dfrac{s_p^2}{n_y}}$
Example 6.6
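Repeat Example 6.5 with sample sizes $n_x = 19$ and $n_y = 25$.
Solution
The following information is given:
$n_x = 19$, $\bar{x} = 9$, $s_x = 2$, $n_y = 25$, $\bar{y} = 8$, $s_y = 2.2$
Since the population variances are unknown and the sample sizes are small, the
distribution used is the t-distribution and the population variances are replaced by the
pooled sample variance
$s_p^2 = \dfrac{(19 - 1)(2)^2 + (25 - 1)(2.2)^2}{19 + 25 - 2} = \dfrac{188.16}{42} = 4.48$
and hence the formula for the 95% confidence interval is
$(\bar{x} - \bar{y}) \pm t_{\alpha/2,\,n_x+n_y-2}\sqrt{\dfrac{s_p^2}{n_x} + \dfrac{s_p^2}{n_y}} = (9 - 8) \pm 2.021\sqrt{\dfrac{4.48}{19} + \dfrac{4.48}{25}}$
$= 1.0 \pm 2.021(0.6442)$
$= 1.0 \pm 1.30$
$\Rightarrow -0.30 \le \mu_x - \mu_y \le 2.30$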
Comment: This interval contains negative values, zero and positive values, so we
cannot be 95% confident that $\mu_x > \mu_y$.
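The pooled-variance calculation can be sketched in code as follows. The function names are hypothetical, and the critical value 2.021 is simply taken from the worked example as a table value rather than computed, since the Python standard library has no t-distribution.

```python
from math import sqrt

def pooled_variance(s_x, s_y, n_x, n_y):
    """Pooled sample variance s_p^2 of two independent samples."""
    return ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)

def pooled_t_ci(xbar, ybar, s_x, s_y, n_x, n_y, t_crit):
    """t-based interval for mu_x - mu_y using the pooled variance."""
    sp2 = pooled_variance(s_x, s_y, n_x, n_y)
    margin = t_crit * sqrt(sp2 / n_x + sp2 / n_y)
    diff = xbar - ybar
    return diff - margin, diff + margin

T_CRIT = 2.021  # assumption: critical value as used in the worked example

sp2 = pooled_variance(2, 2.2, 19, 25)
low, high = pooled_t_ci(9, 8, 2, 2.2, 19, 25, T_CRIT)
print(f"s_p^2 = {sp2:.2f}")                          # s_p^2 = 4.48
print(f"{low:.2f} <= mu_x - mu_y <= {high:.2f}")     # -0.30 <= mu_x - mu_y <= 2.30
```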
Other formulas for confidence interval estimations are summarized in the following table
Example 6.7
In a random sample of 500 families owning television sets in a certain city, it is found that
340 have not yet subscribed to a newly introduced digital transmission system. Find a
95% confidence interval for the actual proportion of all families in the city who have not
yet subscribed to the system.
Solution
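The point estimate of $\pi$ is $\hat{p} = 340/500 = 0.68$. For 95% confidence, we have $\alpha = 0.05$,
so $Z_{\alpha/2} = Z_{0.025} = 1.96$. Therefore, the 95% confidence interval for $\pi$ is
$\hat{p} \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 0.68 \pm (1.96)\sqrt{\dfrac{0.68(0.32)}{500}} = 0.68 \pm 0.04$
$\Rightarrow 0.64 \le \pi \le 0.72$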
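As a check on the arithmetic, the interval for a single proportion can be computed directly. This is a minimal sketch of the formula used in Example 6.7; the function name `proportion_ci` is an assumption. The limits are 0.68 − 0.04 = 0.64 and 0.68 + 0.04 = 0.72.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, conf=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value Z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(340, 500)
print(f"{low:.2f} <= pi <= {high:.2f}")   # 0.64 <= pi <= 0.72
```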
Example 6.8
In a housing survey in town A, 234 out of 300 respondents reported that they had exclusive
use of a flush toilet inside the house. In a housing survey in town B, 135 out of 150 also
reported that they had exclusive use of a flush toilet inside the house. Construct a 95%
confidence interval for the difference between the proportions in these two towns.
Solution
The following information is given:
$x_A = 234$, $n_A = 300$, $y_B = 135$, $n_B = 150$, $\alpha = 5\%$
Now, $p_A = 234/300 = 0.78$, $p_B = 135/150 = 0.90$, and $Z_{\alpha/2} = Z_{0.025} = 1.96$
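One way to carry these values through to an interval is sketched below. It assumes the usual large-sample interval for a difference of proportions, $(p_A - p_B) \pm Z_{\alpha/2}\sqrt{p_A(1-p_A)/n_A + p_B(1-p_B)/n_B}$, which is not written out in this section; the function name `diff_proportions_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(x_a, n_a, x_b, n_b, conf=0.95):
    """Large-sample interval for pi_A - pi_B (assumed unpooled standard error)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value Z_{alpha/2}
    margin = z * sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - margin, diff + margin

low, high = diff_proportions_ci(234, 300, 135, 150)
print(f"{low:.3f} <= pi_A - pi_B <= {high:.3f}")
```

Under this assumed formula both limits are negative, which would suggest that the proportion in town A is lower than in town B.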
Example 6.9
The following are weights, in decagrams, of 10 packages of grass seed distributed by a
certain company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2 and 46.0. Find a 95%
confidence interval estimate for the variance of all such packages of grass seed distributed
by this company, assuming a normal population.
Solution
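We first compute the sample variance of the data as follows:
$s^2 = \dfrac{1}{n-1}\left[\sum x_i^2 - \dfrac{1}{n}\left(\sum x_i\right)^2\right] = \dfrac{1}{9}\left[21{,}273.12 - \dfrac{1}{10}(461.2)^2\right] = 0.286$
For a 95% confidence interval, we have $\alpha = 0.05$, so
$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,9} = 19.023$ and $\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{0.975,\,9} = 2.700$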
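The remaining step can be sketched numerically. The code below assumes the standard chi-square interval for a normal variance, $(n-1)s^2/\chi^2_{\alpha/2,\,n-1} \le \sigma^2 \le (n-1)s^2/\chi^2_{1-\alpha/2,\,n-1}$, which is not written out in this section, and it takes the two critical values from the example rather than computing them.

```python
from statistics import variance

# Weights (in decagrams) of the 10 packages from Example 6.9.
weights = [46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, 46.0]

n = len(weights)
s2 = variance(weights)      # sample variance with divisor n - 1

CHI2_UPPER = 19.023         # chi-square value for 0.025 and 9 d.f. (from the example)
CHI2_LOWER = 2.700          # chi-square value for 0.975 and 9 d.f. (from the example)

# Assumed standard interval for the variance of a normal population.
low = (n - 1) * s2 / CHI2_UPPER
high = (n - 1) * s2 / CHI2_LOWER

print(f"s^2 = {s2:.3f}")
print(f"{low:.3f} <= sigma^2 <= {high:.3f}")
```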
Example 6.10
What sample size would be required to estimate the population mean for a large set of
company invoices to within $0.30 with 95% confidence, given that the estimated
population standard deviation is $5?
Solution
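The following information is given: $E = 0.30$, $\alpha = 5\%$, $\sigma = 5$.
We are required to find the sample size $n$.
Now, $Z_{\alpha/2} = Z_{0.025} = 1.96$.
Then,
$n \ge \left(\dfrac{Z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\dfrac{1.96 \times 5}{0.3}\right)^2 = 1067.11$
Since $n$ must be a whole number, we round up: a sample of at least 1068 invoices is required.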
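The sample-size formula is easy to wrap in a small helper. This is a sketch only; the function name `required_sample_size` is an assumption, and the result is rounded up because a sample size must be a whole number.

```python
from math import ceil

def required_sample_size(sigma, margin, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= margin, i.e. n >= (z * sigma / margin)^2."""
    return ceil((z * sigma / margin) ** 2)

n = required_sample_size(sigma=5, margin=0.30)
print(n)   # 1068
```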