Chapter 4. Sampling Distributions


CHAPTER 4

SAMPLING DISTRIBUTIONS
In this chapter, you learn:
• Population
• Sample
• Statistical Inference
• Sampling distribution
Population and Sample
• A population is the entire set of observations under study, whereas a sample is a subset of a population.
• In a population, the number of observations is denoted by N. In a sample, the number of observations is the sample size, denoted by n.

• E.g.: Let X be the mark on a statistics exam at FTU.
  N = the number of students who have taken the statistics exam.
  A sample of 10 students → n = 10.
Population and Sample
• The variance and its related measure, the standard
deviation, are arguably the most important statistics.
They are used to measure variability.
Statistical Inference
• Statistical inference is the process of making an
estimate, prediction, or decision about a population
based on sample data. Because populations are
almost always very large, investigating each
member of the population would be impractical and
expensive. It is far easier and cheaper to take a
sample from the population of interest and draw
conclusions or make estimates about the population
on the basis of information provided by the sample.
Population and Sample
• X: the number of spots turning up when a balanced die is rolled.

• We throw the die two times:
  $X_1$: the number of spots turning up on the first throw
  $X_2$: the number of spots turning up on the second throw
• Let $W = (X_1, X_2)$. W is called a random sample of size n = 2.

• The sample mean:
$$\bar{X} = \frac{X_1 + X_2}{2}$$
Sampling Distribution of $\bar{X}$
* Mean and variance of the sampling distribution of $\bar{X}$?
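The two-throw die example is small enough to enumerate exactly. The sketch below lists all 36 equally likely outcomes of (X1, X2), builds the sampling distribution of the sample mean, and computes its mean and variance. The variable names are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Each ordered pair (x1, x2) of a balanced die has probability 1/36.
counts = Counter(Fraction(x1 + x2, 2) for x1, x2 in product(faces, repeat=2))
dist = {value: Fraction(c, 36) for value, c in sorted(counts.items())}

# Mean and variance of the sampling distribution of the sample mean.
mean = sum(v * p for v, p in dist.items())
variance = sum((v - mean) ** 2 * p for v, p in dist.items())

for value, prob in dist.items():
    print(float(value), prob)
print("E(X-bar) =", mean, " V(X-bar) =", variance)
```

The mean matches the mean of a single throw (7/2), while the variance is half the single-throw variance (35/12 divided by n = 2).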
SAMPLING DISTRIBUTION

• Let $X_1, X_2, \ldots, X_n$ be a random sample (r.s.) of size n from a population.
• Let $f(x_1, x_2, \ldots, x_n)$ be a real function whose domain includes the sample space of $(X_1, X_2, \ldots, X_n)$.
• Then the random variable (r.v.) $Y = f(X_1, X_2, \ldots, X_n)$ is called a statistic.
• The probability distribution of a statistic Y is called the sampling distribution of Y.

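The sampling distribution

• Take a random sample of size n: $W = (X_1, X_2, \ldots, X_n)$
• The sample mean:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
• If the population has mean $\mu$ and variance $\sigma^2$, for example $X \sim N(\mu, \sigma^2)$, then $E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$.
• The standard error of the mean: $\sigma_{\bar{X}} = \sqrt{V(\bar{X})} = \sigma/\sqrt{n}$
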
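$X \sim N(\mu, \sigma^2)$ → $W = (X_1, X_2, \ldots, X_n)$
- The Sample Mean:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
- The Sample Variance:
$$\hat{S}^2 = MS = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \overline{X^2} - (\bar{X})^2$$
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2, \qquad S^2 = \frac{n}{n-1}\,MS$$
$$S^{*2} = \frac{1}{n}\sum_{i=1}^{n}(X_i - m)^2, \qquad m = \frac{1}{N}\sum_{i=1}^{N} x_i$$
- The Standard Deviation:
$$S = \sqrt{S^2}$$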
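As a quick numerical check of the relations between MS, S² and S, the sketch below computes them for a small data set. The marks are hypothetical values invented for illustration (a sample of n = 10, as in the exam example), and the function name is an assumption, not something defined in the slides.

```python
import math

def sample_characteristics(xs):
    """Return (mean, MS, S^2, S) for a list of observations."""
    n = len(xs)
    mean = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    ms = mean_sq - mean ** 2            # MS = mean of squares minus squared mean
    s2 = n / (n - 1) * ms               # S^2 = n/(n-1) * MS
    return mean, ms, s2, math.sqrt(s2)

marks = [6, 7, 8, 5, 9, 7, 6, 8, 7, 7]   # hypothetical marks, n = 10
mean, ms, s2, s = sample_characteristics(marks)

# Cross-check S^2 against its definition with divisor n - 1.
s2_direct = sum((x - mean) ** 2 for x in marks) / (len(marks) - 1)
print(mean, round(ms, 4), round(s2, 4), round(s2_direct, 4))
```

Both routes to S² agree up to floating-point rounding, which is the identity S² = n/(n−1)·MS.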

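$X \sim A(p)$ → $W = (X_1, X_2, \ldots, X_n)$. A Sample Proportion:
$$F = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where each $X_i$ has the distribution

X_i     0     1
P(x)    q     p

For observed data given as a frequency table

x_i     x_1   x_2   …   x_k
n_i     n_1   n_2   …   n_k

the observed sample characteristics are
$$n = \sum_{i=1}^{k} n_i, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{k} n_i x_i$$
$$ms = \frac{1}{n}\sum_{i=1}^{k} n_i (x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{k} n_i x_i^2 - \bar{x}^2$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{k} n_i (x_i - \bar{x})^2, \qquad s^2 = \frac{n}{n-1}\,ms$$
$$s^{*2} = \frac{1}{n}\sum_{i=1}^{n} (x_i - m)^2$$
$$f = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{m}{n}, \qquad s = \sqrt{s^2}$$
(In the formula for f, m counts the observations with $x_i = 1$.)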
E.g. The price of a stock sold on the stock market in 100 trading sessions:

Price (1000 đ)   13-15   15-17   17-19   19-21   21-23
n_i                  5      18      42      27       8

Estimate the sample characteristics.

Price    x_i   n_i   n_i x_i   n_i x_i^2
13-15     14     5        70         980
15-17     16    18       288        4608
17-19     18    42       756       13608
19-21     20    27       540       10800
21-23     22     8       176        3872
Σ               100      1830       33868

$$\bar{x} = \frac{\sum n_i x_i}{n} = \frac{1830}{100} = 18.3$$
$$\overline{x^2} = \frac{\sum n_i x_i^2}{n} = \frac{33868}{100} = 338.68$$
$$ms = \overline{x^2} - (\bar{x})^2 = 3.79$$
$$s = \sqrt{\frac{n}{n-1}\,ms} = 1.9566$$
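The stock-price calculation can be reproduced in a few lines. This sketch takes the class midpoints as the x_i values, as the working table does, and applies the grouped-data formulas. Variable names are illustrative.

```python
import math

midpoints = [14, 16, 18, 20, 22]   # class midpoints x_i (1000 đ)
freqs = [5, 18, 42, 27, 8]         # frequencies n_i

n = sum(freqs)
sum_nx = sum(f * x for f, x in zip(freqs, midpoints))
sum_nx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

x_bar = sum_nx / n                 # sample mean
ms = sum_nx2 / n - x_bar ** 2      # mean of squares minus squared mean
s2 = n / (n - 1) * ms              # sample variance with divisor n - 1
s = math.sqrt(s2)                  # sample standard deviation

print(n, sum_nx, sum_nx2)
print(round(x_bar, 2), round(ms, 2), round(s, 4))
```

The totals printed on the first line correspond to the Σ row of the working table.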
Properties of the Sample Mean and
Sample Variance
• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,
a) $\bar{X}$ and $S^2$ are independent r.v.s.
b) $\bar{X} \sim N(\mu, \sigma^2/n)$
c) $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$

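Properties b) and c) can be checked by simulation. The sketch below assumes arbitrary illustrative values for μ, σ, n and the number of replications. It compares the simulated mean and variance of the sample mean with μ and σ²/n, and those of (n−1)S²/σ² with n−1 and 2(n−1), the mean and variance of a chi-square distribution with n−1 degrees of freedom. It does not test independence (property a).

```python
import random
import statistics

random.seed(0)                      # fixed seed so the run is repeatable
mu, sigma, n, reps = 10.0, 2.0, 5, 10000   # illustrative values only

means, scaled_vars = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    s2 = statistics.variance(sample)            # divisor n - 1
    scaled_vars.append((n - 1) * s2 / sigma ** 2)

# Property b): should be close to mu and sigma^2 / n.
print(statistics.fmean(means), statistics.variance(means))
# Property c): should be close to n - 1 and 2 * (n - 1).
print(statistics.fmean(scaled_vars), statistics.variance(scaled_vars))
```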
SAMPLING FROM THE NORMAL
DISTRIBUTION
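• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
• Most of the time $\sigma$ is unknown, so we use:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$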
In statistical inference, Student’s t distribution is very important.

• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu_X, \sigma_X^2)$ distribution and let $Y_1, Y_2, \ldots, Y_m$ be a r.s. of size m from an independent $N(\mu_Y, \sigma_Y^2)$ distribution.
• If we are interested in comparing the variability of the populations, one quantity of interest would be the ratio $\sigma_X^2/\sigma_Y^2$, which we estimate with $S_X^2/S_Y^2$.
• The F distribution allows us to compare these quantities by giving the distribution of
$$F = \frac{S_X^2/S_Y^2}{\sigma_X^2/\sigma_Y^2} = \frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}$$
• If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.
• If $X \sim t_q$, then $X^2 \sim F_{1,q}$.
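A rough simulation can make the F ratio concrete. The sketch below draws two independent normal samples many times, forms the ratio of scaled sample variances, and compares its average with q/(q−2), the mean of an F distribution whose second degrees of freedom is q > 2. The sample sizes and standard deviations are illustrative assumptions.

```python
import random
import statistics

random.seed(2)                 # fixed seed so the run is repeatable
n, m = 6, 11                   # sample sizes, giving (5, 10) degrees of freedom
sigma_x, sigma_y = 2.0, 3.0    # hypothetical population standard deviations
reps = 5000

ratios = []
for _ in range(reps):
    xs = [random.gauss(0.0, sigma_x) for _ in range(n)]
    ys = [random.gauss(0.0, sigma_y) for _ in range(m)]
    s2_x = statistics.variance(xs)     # divisor n - 1
    s2_y = statistics.variance(ys)     # divisor m - 1
    ratios.append((s2_x / sigma_x ** 2) / (s2_y / sigma_y ** 2))

q = m - 1
print(statistics.fmean(ratios), q / (q - 2))   # simulated vs theoretical mean
```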
CENTRAL LIMIT THEOREM
If a random sample is drawn from any population, the
sampling distribution of the sample mean is
approximately normal for a sufficiently large sample
size. The larger the sample size, the more closely the
sampling distribution of $\bar{X}$ will resemble a normal
distribution.
[Figure: a random sample $(X_1, X_2, X_3, \ldots, X_n)$ is drawn from the population distribution of the random variable X; the distribution of the sample mean $\bar{X}$ is shown as $n \to \infty$.]
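The theorem can be illustrated by simulation with a clearly non-normal population. The sketch below uses an exponential population (an assumption chosen only because it is strongly skewed) and tracks the skewness of the sample mean for a few sample sizes. A normal distribution has skewness 0, so values shrinking toward 0 indicate a more nearly normal shape.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)

random.seed(1)      # fixed seed so the run is repeatable
reps = 5000         # number of simulated samples per sample size
skews = {}
for n in (1, 4, 30):
    means = [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    skews[n] = skewness(means)
    print(n, round(statistics.fmean(means), 3), round(skews[n], 3))
```

For an exponential population the skewness of the sample mean is 2/√n in theory, so the printed values should fall as n grows.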
EXAMPLE 1
• The amount of soda pop in each bottle is
normally distributed with a mean of 32.2
ounces and a standard deviation of 0.3
ounces.
– Find the probability that a bottle bought by a
customer will contain more than 32 ounces.
– Solution
• The random variable X is the amount of soda in a bottle.
$$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 0.7486$$
EXAMPLE 1 (contd.)

• Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.
• Solution
– Define the random variable as the mean amount of soda per bottle, $\bar{X}$, with $\mu_{\bar{X}} = 32.2$.
$$P(\bar{X} > 32) = P\left(\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} > \frac{32 - 32.2}{0.3/\sqrt{4}}\right) = P(Z > -1.33) = 0.9082$$
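Both parts of Example 1 can be checked with the standard library's normal distribution. The sketch below computes the two probabilities without rounding z to two decimals, so the results differ slightly from the table-based values 0.7486 and 0.9082.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3      # population mean and standard deviation (ounces)

# Single bottle: P(X > 32)
p_bottle = 1 - NormalDist(mu, sigma).cdf(32)

# Carton of n = 4 bottles: the sample mean has standard error sigma / sqrt(n)
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(round(p_bottle, 4), round(p_carton, 4))
```

The carton probability is larger because the sample mean varies less than a single bottle does.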
Sampling Distribution of
a Proportion

• The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.
• To estimate the population proportion p we use the sample proportion $\hat{p}$:
$$\hat{p} = \frac{X}{n}$$
where X is the number of successes and n is the sample size.
• Since X is binomial, probabilities about $\hat{p}$ can be calculated from the binomial distribution.
• Yet, for inference about $\hat{p}$ we prefer to use the normal approximation to the binomial whenever this approximation is appropriate.

Approximate Sampling Distribution
of a Sample Proportion

• From the laws of expected value and variance, it can be shown that $E(\hat{p}) = p$ and $V(\hat{p}) = p(1-p)/n$.
• If both $np \ge 5$ and $n(1-p) \ge 5$, then
$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
is approximately standard normally distributed.

EXAMPLE
– A state representative received 52% of the
votes in the last election.
– One year later the representative wanted
to study his popularity.
– If his popularity has not changed, what is
the probability that more than half of a
sample of 300 voters would vote for him?

EXAMPLE (contd.)
Solution
• The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 and n(1-p) = 300(1-.52) = 144 (both greater than 5).
$$P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(1-.52)/300}}\right) = P(Z > -0.69) = .7549$$
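The normal approximation in this solution can be compared with the exact binomial probability. The sketch below computes both. It keeps z unrounded, so the approximate value differs slightly from .7549, and the exact sum counts outcomes with more than 150 supporters out of 300.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Normal approximation: P(p_hat > 0.50)
se = sqrt(p * (1 - p) / n)          # standard error of the sample proportion
z = (0.50 - p) / se
p_normal = 1 - NormalDist().cdf(z)

# Exact binomial: P(X > 150) where X ~ Binomial(300, 0.52)
p_exact = sum(
    comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(151, n + 1)
)

print(round(z, 4), round(p_normal, 4), round(p_exact, 4))
```

The exact probability comes out somewhat lower than the approximation, since the approximation here uses no continuity correction.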