Lecture 3: Sampling and Sample Distribution
Sample Distribution
I. Introduction to sampling distribution
II. Sampling distribution of the mean
III. Sampling distribution of proportion
SAMPLING DISTRIBUTION
There are three distinct types of data distribution:
1. Population Distribution: characterizes the distribution of the elements of
a population.
2. Sample Distribution: characterizes the distribution of the elements of a
sample drawn from a population.
3. Sampling Distribution: describes the expected behavior of a statistic across
a large number of simple random samples drawn from the same population.
Sampling distributions constitute the theoretical basis of statistical
inference and are of considerable importance in business decision-making.
Sampling distributions are important in statistics because they provide
a major simplification on the route to statistical inference.
DEFINITION
A sampling distribution is the theoretical probability distribution of a statistic
obtained from a large number of samples drawn from a specific population.
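The definition can be made concrete with a small simulation. The sketch below is illustrative only: it assumes the five-element population 2, 3, 6, 8, 11 used in the worked example later in this lecture, draws samples of size 2 with replacement, and the function name, seed, and number of samples are arbitrary choices.

```python
import random
import statistics


def sampling_distribution_of_mean(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]


population = [2, 3, 6, 8, 11]  # population from the worked example in this lecture
means = sampling_distribution_of_mean(population, sample_size=2, num_samples=20_000)

# The collection of sample means approximates the sampling distribution of the mean.
print(statistics.fmean(means))      # close to the population mean, 6
print(statistics.pvariance(means))  # close to 10.8 / 2 = 5.4
```

Increasing `num_samples` should bring the simulated values closer to the theoretical ones.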
[Diagram: functions of the sampling distribution: (i) properties of statistics, (ii) selection of distribution type to model scores, (iii) hypothesis testing]
i) Properties of Statistics :
Statistics have different properties as estimators of population
parameters. The sampling distribution of a statistic provides a
window into some of the important properties. For example, if the
expected value of a statistic is equal to the corresponding
population parameter, the statistic is said to be unbiased.
Efficiency is another valuable property to have in the
estimation of a population parameter: everything else being equal, the
statistic with the smallest standard error is preferred as an estimator
of the corresponding population parameter.
ii) Selection of distribution type to model scores :
The sampling distribution provides the theoretical foundation for selecting a
distribution for many useful measures. For example, the central limit
theorem describes why a measure, such as intelligence, that may be considered a
summation of a number of independent quantities would be approximately
distributed as a normal (Gaussian) curve.
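$X = X_1 + X_2 + X_3 + \dots + X_n$
$\mu = \mu_1 + \mu_2 + \mu_3 + \dots + \mu_n = n\mu_1$
$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \dots + \sigma_n^2 = n\sigma_1^2$
where $\mu_1$ and $\sigma_1^2$ are the mean and variance of $X_1$ (the terms being independent and identically distributed).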
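The relations for the mean and variance of a sum can be checked numerically. The sketch below is only an illustration under assumed conditions: each term is taken to be Uniform(0, 1), which is not specified in the lecture, and n = 12 and the seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)
n = 12          # number of terms in each sum (illustrative choice)
mu_1 = 0.5      # mean of one Uniform(0, 1) term
var_1 = 1 / 12  # variance of one Uniform(0, 1) term

# Each entry is X = X_1 + X_2 + ... + X_n for independent uniform terms.
sums = [sum(rng.random() for _ in range(n)) for _ in range(50_000)]

mean_of_sums = statistics.fmean(sums)
var_of_sums = statistics.pvariance(sums)
print(mean_of_sums, n * mu_1)  # simulated mean vs. n * mu_1
print(var_of_sums, n * var_1)  # simulated variance vs. n * sigma_1^2

# Share of sums within one standard deviation of the mean;
# a normal curve would give roughly 0.68.
sd = (n * var_1) ** 0.5
within_one_sd = sum(abs(s - n * mu_1) <= sd for s in sums) / len(sums)
print(within_one_sd)
```

Even though the individual terms are far from normal, the sums behave much like a normal variable, which is the point of the central limit theorem.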
UTILITY :
The utility of this theorem is that it requires virtually no conditions on the
distribution patterns of the individual random variables being summed. As a
result, it furnishes a practical method of computing approximate probability
values associated with sums of arbitrarily distributed independent random
variables.
This theorem helps to explain why a vast number of phenomena show
approximately a normal distribution. Because of its theoretical and practical
significance, this theorem is considered the most remarkable theoretical
formulation of all probability laws.
Moreover, much of hypothesis testing and sampling theory is based on this
theorem, so the central limit theorem is perhaps the most fundamental result
in all of statistics.
2) SAMPLING DISTRIBUTION OF THE PROPORTION :
Properties :
Application :
95 % CI = Mean ± ( 1.96 × SEM )
99 % CI = Mean ± ( 2.58 × SEM )
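The two interval formulas can be applied directly to a sample. The sketch below uses hypothetical data and a hypothetical helper name; it takes SEM to be s / sqrt(n), and the multipliers 1.96 and 2.58 assume the sample mean is approximately normally distributed (for small samples a t multiplier would usually be used instead).

```python
import math
import statistics


def confidence_interval(sample, z=1.96):
    """Return (lower, upper) = mean -/+ z * SEM, with SEM = s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return mean - z * sem, mean + z * sem


sample = [4.1, 5.3, 6.0, 5.7, 4.9, 6.4, 5.2, 5.8]  # hypothetical measurements
ci95 = confidence_interval(sample, z=1.96)
ci99 = confidence_interval(sample, z=2.58)
print(ci95)
print(ci99)  # wider than the 95 % interval
```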
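STANDARD ERROR TABLE
SAMPLING DISTRIBUTION          STANDARD ERROR
MEANS                          $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$
STANDARD DEVIATIONS            1. $\sigma_s = \frac{\sigma}{\sqrt{2N}}$
                               2. $\sigma_s = \sqrt{\frac{\mu_4 - \mu_2^2}{4N\mu_2}}$
MEDIANS                        $\sigma_{med} = \sigma\sqrt{\frac{\pi}{2N}} = \frac{1.2533\,\sigma}{\sqrt{N}}$
FIRST & THIRD QUARTILES        $\sigma_{Q_1} = \sigma_{Q_3} = \frac{1.3626\,\sigma}{\sqrt{N}}$
VARIANCES                      1. $\sigma_{s^2} = \sigma^2\sqrt{\frac{2}{N}}$
                               2. $\sigma_{s^2} = \sqrt{\frac{\mu_4 - \mu_2^2}{N}}$
COEFFICIENTS OF VARIATION      $\sigma_v = \frac{v}{\sqrt{2N}}\sqrt{1 + 2v^2}$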
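Two rows of the table, the standard errors of the mean and of the median, can be checked by simulation. The sketch below assumes a standard normal population; the sample size N = 101, the number of repetitions, and the seed are arbitrary illustrative choices, and the median formula is a large-sample approximation, so only rough agreement is expected.

```python
import math
import random
import statistics

rng = random.Random(2)
sigma = 1.0  # population standard deviation (assumed normal population)
N = 101      # sample size, odd so the median is a single observation
reps = 5_000

sample_means = []
sample_medians = []
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))
    sample_medians.append(statistics.median(sample))

se_mean_sim = statistics.pstdev(sample_means)
se_median_sim = statistics.pstdev(sample_medians)
se_mean_formula = sigma / math.sqrt(N)
se_median_formula = 1.2533 * sigma / math.sqrt(N)

print(se_mean_sim, se_mean_formula)
print(se_median_sim, se_median_formula)
```

The median has the larger standard error of the two, which is why the mean is usually preferred as an estimator of the center of a normal population.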
Point & Interval Estimates
[Diagram: estimates are of two kinds: point estimates and interval estimates]
For example, the sample mean x̄ is a point estimate of the population mean μ.
Similarly, the sample proportion p is a point estimate of the population
proportion P.
Interval Estimation :
An interval estimate is defined by two numbers, between which a
population parameter is said to lie.
For example, a < μ < b is an interval estimate of the population
mean μ. It indicates that the population mean is
greater than a but less than b.
This illustrates the fact that $\mu_{\bar{x}} = \mu$.
d) The variance of the sampling distribution of means is
$\sigma_{\bar{x}}^2 = \frac{(2-6)^2 + (2.5-6)^2 + \dots + (11-6)^2}{25} = \frac{135}{25} = 5.40$
(subtracting the mean 6 from each number, squaring the result, adding all 25
numbers thus obtained, and dividing by 25), so the standard deviation is
$\sigma_{\bar{x}} = \sqrt{5.40} = 2.32$.
This illustrates the fact that for finite populations involving sampling with
replacement,
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{N}$,
since the right-hand side is 10.8/2 = 5.40, agreeing with the above value.
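The with-replacement calculation can be reproduced by listing all 25 ordered samples of size 2. This is a small sketch for checking the arithmetic; the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
N = 2  # sample size

# All 5 * 5 = 25 ordered samples of size 2 drawn with replacement.
means = [fmean(sample) for sample in product(population, repeat=N)]

mu_xbar = fmean(means)        # mean of the sampling distribution of means
var_xbar = pvariance(means)   # variance of the sampling distribution of means
sd_xbar = var_xbar ** 0.5

print(len(means), mu_xbar, var_xbar, round(sd_xbar, 2))
print(pvariance(population) / N)  # sigma^2 / N, for comparison with var_xbar
```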
Without Replacement:
c) There are 10 samples of size 2 that can be drawn without replacement from the
population :
(2,3) (2,6) (2,8) (2,11) (3,6) (3,8) (3,11) (6,8) (6,11) (8,11)
The corresponding sample means are :
2.5, 4.0, 5.0, 6.5, 4.5, 5.5, 7.0, 7.0, 8.5, 9.5.
The mean of the sampling distribution of means is
$\mu_{\bar{x}} = \frac{2.5 + 4.0 + \dots + 9.5}{10} = 6.0$
∴ $\mu_{\bar{x}} = \mu$
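(d) The variance of the sampling distribution of means is
$\sigma_{\bar{x}}^2 = \frac{(2.5-6)^2 + (4.0-6)^2 + \dots + (9.5-6)^2}{10} = 4.05$
and $\sigma_{\bar{x}} = 2.01$.
This illustrates
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{N}\left(\frac{N_p - N}{N_p - 1}\right) = \frac{10.8}{2}\left(\frac{5-2}{5-1}\right) = 4.05$
as obtained above.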
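The without-replacement result and the finite population correction can be checked the same way, by listing the 10 unordered samples. This is a sketch for verifying the arithmetic; `N_p` stands for the population size and `N` for the sample size, as in the formula above.

```python
from itertools import combinations
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
N_p = len(population)  # population size
N = 2                  # sample size

# All 10 samples of size 2 drawn without replacement.
means = [fmean(sample) for sample in combinations(population, N)]

mu_xbar = fmean(means)
var_xbar = pvariance(means)

# Formula with the finite population correction factor.
var_formula = (pvariance(population) / N) * (N_p - N) / (N_p - 1)

print(sorted(means))
print(mu_xbar, var_xbar, round(var_xbar ** 0.5, 2))
print(var_formula)
```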
PROPORTIONS
Prob. 2 :
Find the probability that in 120 tosses of a fair coin,
a) between 40 % and 60 % will be heads, and
b) 5/8 or more will be heads.
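Answer:
b) The standard error of the proportion is $\sigma_P = \sqrt{\frac{p(1-p)}{N}} = \sqrt{\frac{(0.5)(0.5)}{120}} = 0.0456$.
With the continuity correction 1/(2N) = 0.00417, the proportion 5/8 = 0.6250 in standard units is
$z = \frac{0.6250 - 0.00417 - 0.50}{0.0456} = 2.65$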
Required probability = (area under normal curve to right of z = 2.65)
= (area to right of z = 0) - (area between z = 0 and z = 2.65)
= 0.5 - 0.4960
= 0.0040.
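Part (b) can be checked numerically. The sketch below assumes the normal approximation with a continuity correction of 1/(2N), and compares it with the exact binomial probability of 75 or more heads in 120 tosses (5/8 of 120 is 75); the helper name is illustrative.

```python
import math


def normal_upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


n, p = 120, 0.5
sigma_p = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
correction = 1 / (2 * n)              # continuity correction (assumed here)

z = (5 / 8 - correction - p) / sigma_p
approx = normal_upper_tail(z)

# Exact binomial probability of 75 or more heads in 120 fair tosses.
exact = sum(math.comb(n, k) for k in range(75, n + 1)) / 2 ** n

print(round(sigma_p, 4), round(z, 2))
print(approx, exact)
```

The normal approximation and the exact binomial value should agree closely for a sample of this size.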