Central Tendency
UNIT-1
FREQUENCY DISTRIBUTION
Structure:
1.0 Introduction
1.1 Objectives
1.2 Measures of Central Tendency
1.2.1 Arithmetic mean
1.2.2 Median
1.2.3 Mode
1.2.4 Empirical relation among mean, median and mode
1.2.5 Geometric mean
1.2.6 Harmonic mean
1.3 Partition values
1.3.1 Quartiles
1.3.2 Deciles
1.3.3 Percentiles
1.4 Measures of dispersion
1.4.1 Range
1.4.2 Semi-interquartile range
1.4.3 Mean deviation
1.4.4 Standard deviation
1.5 Absolute and relative measure of dispersion
1.6 Moments
1.7 Karl Pearson’s β and γ coefficients
1.8 Skewness
1.9 Kurtosis
1.10 Let us sum up
1.11 Check your progress : The key.
1.0 INTRODUCTION
According to Simpson and Kafka, 'a measure of central tendency is a typical
value around which other figures aggregate'.
According to Croxton and Cowden, 'An average is a single value within the
range of the data that is used to represent all the values in the series. Since an
average is somewhere within the range of data, it is sometimes called a measure of
central value'.
1.1 OBJECTIVES
The main aim of this unit is to study the frequency distribution. After going through this unit you
should be able to :
know about measures of dispersion like range, semi-inter-quartile range, mean deviation,
standard deviation;
To find the arithmetic mean, add the values of all the terms and then divide the sum by the
number of terms; the quotient is the arithmetic mean. There are three methods to find
the mean :
(i) Direct method: In an individual series of observations x1, x2, …, xn the arithmetic mean is
obtained by the following formula.
\[ \text{A.M.} = \frac{x_1 + x_2 + x_3 + x_4 + \dots + x_{n-1} + x_n}{n} \]
(ii) Short-cut method: This method is used to make the calculations simpler.
Let A be any assumed mean (or any assumed number) and d = x − A the deviation of
x from A; then we have
\[ M = A + \frac{\sum f d}{N}, \qquad d = x - A, \]
where N = ∑f is the total frequency.
(iii) Step deviation method: If in a frequency table the class intervals have equal width,
say i, then it is convenient to use the following formula.
\[ M = A + \frac{\sum f u}{N} \times i \]
where u = (x − A)/i, i is the length of the interval and A is the assumed mean.
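The three methods above can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and the class mid-points below are assumed values chosen for the demonstration (the class intervals of the examples are not available in the text). All three methods should give the same mean.

```python
def mean_direct(xs, fs):
    """Direct method: sum(f * x) / N."""
    n = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / n


def mean_shortcut(xs, fs, a):
    """Short-cut method: A + sum(f * d) / N, with d = x - A."""
    n = sum(fs)
    return a + sum(f * (x - a) for x, f in zip(xs, fs)) / n


def mean_step_deviation(xs, fs, a, i):
    """Step deviation method: A + (sum(f * u) / N) * i, with u = (x - A) / i."""
    n = sum(fs)
    return a + sum(f * (x - a) / i for x, f in zip(xs, fs)) / n * i


# Assumed mid-points (hypothetical) with an illustrative set of frequencies.
mids = [5, 15, 25, 35, 45]
freqs = [8, 26, 30, 20, 16]

m_direct = mean_direct(mids, freqs)
m_short = mean_shortcut(mids, freqs, a=25)
m_step = mean_step_deviation(mids, freqs, a=25, i=10)
print(m_direct, m_short, m_step)
```

The choice of the assumed mean A does not affect the result; it only changes how large the intermediate products are.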
Example 1. Compute the arithmetic mean of the following by both the direct and short-cut
methods:
Frequency 8 26 30 20 16
Solution.
Example 2. Compute the mean of the following frequency distribution using the step deviation
method :
Frequency 9 17 28 26 15 8
Solution.
Property 1 The algebraic sum of the deviations of all the variates from their arithmetic
mean is zero.
Proof . Let X1, X2,… Xn be the values of the variates and their corresponding frequencies be
f1, f2, …, fn respectively.
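Let xi be the deviation of the variate Xi from the mean M, where i = 1,2, …, n. Then
xi = Xi – M, i = 1,2,…, n.
Since M = ∑ fi Xi / ∑ fi, we have ∑ fi Xi = M ∑ fi, and therefore
\[ \sum_{i=1}^{n} f_i x_i = \sum_{i=1}^{n} f_i (X_i - M) = \sum_{i=1}^{n} f_i X_i - M \sum_{i=1}^{n} f_i = M \sum_{i=1}^{n} f_i - M \sum_{i=1}^{n} f_i = 0. \]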
Exercise 1(a)
52 75 40 70 43 65 40 35 48
Variate : 6 7 8 9 10 11 12
Frequency: 20 43 57 61 72 45 39
Frequency: 31 44 39 58 12
1.2.2 MEDIAN
The median is defined as the measure of the central term, when the given terms (i.e.,
values of the variate) are arranged in the ascending or descending order of magnitudes. In
other words the median is value of the variate for which total of the frequencies above this
value is equal to the total of the frequencies below this value.
According to Connor, "The median is the value of the variable which divides the group into two
equal parts, one part comprising all values greater, and the other all values less than the
median".
For example, the marks obtained by seven students in a paper of Statistics are 15, 20, 23,
32, 34, 39, 48, the maximum marks being 50. The median is then 32, since it is the value of the
4th term, which is situated such that the marks of the 1st, 2nd and 3rd students are less than this
value and those of the 5th, 6th and 7th students are greater than this value.
COMPUTATION OF MEDIAN
Let n be the number of values of a variate (i.e. the total of all frequencies). First of all
we write the values of the variate (i.e., the terms) in ascending or descending order of
magnitude.
Case 1. If n is odd, the measure of the (n + 1)/2 th term gives the median.
Case 2. If n is even, there are two central terms, i.e., the (n/2)th and the (n/2 + 1)th. The mean of
these two values gives the median.
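(b) Median in continuous series (or grouped series). In this case, the median (Md) is
computed by the following formula
\[ M_d = l + \frac{\frac{n}{2} - cf}{f} \times i \]
where Md = median, l = lower limit of the median class, cf = cumulative frequency of the
class preceding the median class, f = frequency of the median class, i = width of the
median class and n = total frequency.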
Example 1. According to the census of 1991, the following are the population figures, in
thousands, of 10 cities :
1400, 1250, 1670, 1800, 700, 650, 570, 488, 2100, 1700.
Arranged in ascending order the values are 488, 570, 650, 700, 1250, 1400, 1670, 1700, 1800, 2100.
Here n = 10, therefore the median is the mean of the measures of the 5th and 6th terms.
Median = (1250 + 1400)/2 = 1325 thousands
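The rule for an individual series (odd n: the middle term; even n: the mean of the two central terms) can be checked with a few lines of Python. This is a minimal sketch; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Median of an individual series: sort, then take the central term(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n + 1)/2 th term
        return ordered[mid]
    # n even: mean of the (n/2)th and (n/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


population = [1400, 1250, 1670, 1800, 700, 650, 570, 488, 2100, 1700]
marks = [15, 20, 23, 32, 34, 39, 48]

print(median(population))  # census example, in thousands
print(median(marks))       # marks of the seven students
```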
No. of workers 22 38 46 35 20
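Here N = 161. Therefore the median is the measure of the (N + 1)/2 th term, i.e. the 81st term. Clearly the 81st
term is situated in the class 20-30. Thus 20-30 is the median class. Consequently
\[ M_d = l + \frac{\frac{n}{2} - cf}{f} \times i = 20 + \frac{\frac{1}{2} \times 161 - 60}{46} \times 10 = 20 + 4.46 = 24.46 \text{ (approximately)}. \]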
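The grouped-series formula can be evaluated directly once the median class has been identified. The sketch below assumes the caller supplies l, n, cf, f and i (the function name `grouped_median` is hypothetical) and uses the values read off in the example above.

```python
def grouped_median(l, n, cf, f, i):
    """Median of a grouped series: l + ((n/2 - cf) / f) * i."""
    return l + (n / 2 - cf) / f * i


# Values from the example: median class 20-30, N = 161, cf = 22 + 38 = 60, f = 46.
md = grouped_median(l=20, n=161, cf=60, f=46, i=10)
print(round(md, 2))
```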
(125 + 1)/2 th term = 63rd term.
\[ M_d = l + \frac{\frac{n}{2} - cf}{f} \times i \]
= 30 + 25/24
= 30+1.04 = 31.04
1.2.3 MODE
The word 'mode' is formed from the French word 'La mode' which means 'in
fashion'. According to Dr. A. L. Bowley, 'the value of the graded quantity in a statistical
group at which the numbers registered are most numerous, is called the mode or the
position of greatest density or the predominant value.'
According to other statisticians, 'The value of the variable which occurs most
frequently in the distribution is called the mode.'
"The mode of a distribution is the value around which the items tend to be most heavily
concentrated. It may be regarded as the most typical value of the series".
Definition. The mode is that value (or size) of the variate for which the frequency is
maximum or the point of maximum frequency or the point of maximum density. In other
words, the mode is the maximum ordinate of the ideal curve which gives the closest fit to
the actual distribution.
Size of shoes 1 2 3 4 5 6 7 8 9
Frequency 1 1 1 1 2 3 2 1 1
Here the maximum frequency is 3, which corresponds to size 6. Hence the mode is size
number 6.
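(b) In a continuous frequency distribution the computation of the mode is done by the following
formula
\[ \text{Mode } M_0 = l + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i \qquad \dots (i) \]
where l = lower limit of the modal class, f1 = frequency of the modal class, f0 = frequency of
the class preceding it, f2 = frequency of the class following it and
i = class interval.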
\[ \text{Mode } M_0 = l + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i \]
72 36
= 21 10
(2 72 36 51)
= 21 + 357 / 87
= 21 + 4.103
= 25.103.
(c) Method of determining mode by the method of grouping frequencies. This method is
usually applied in the cases when there are two maximum frequencies against two different
size of items. This method is also applied in the cases when it is possible that the effect of
neighboring frequencies on the size of item (of maximum frequency) may be greater. The
method is as follows :
Firstly the items are arranged in ascending or descending order and the corresponding
frequencies are written against them. The frequencies are then grouped in twos, then in
threes and then in fours (if necessary). In the first stage of grouping, they are grouped (i.e.,
the frequencies are added) by taking the first and second, third and fourth, and so on. After this, the
frequencies are added in threes. The frequencies are added in the following ways :
1. (i) First and second, third and fourth, fifth and sixth, seventh and eighth, …
(ii) Second and third, fourth and fifth, …
2. (i) First, second and third; fourth, fifth and sixth, …
(ii) Second, third and fourth; fifth, sixth and seventh, …
(iii) Third, fourth and fifth; sixth seventh and eighth, …
Now in each column the group with the maximum total frequency is selected, and the item which
is contained in the greatest number of these groups is called the mode. For illustration see the following example.
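Size of item :            4   5   6   7   8   9  10  11  12  13
Column I (frequency) :    2   5   8   9  12  14  14  15  11  13
Column II (in twos, starting from the first) : 7 (4,5); 17 (6,7); 26 (8,9); 29 (10,11); 24 (12,13)
Column III (in twos, starting from the second) : 13 (5,6); 21 (7,8); 28 (9,10); 26 (11,12)
Column IV (in threes, starting from the first) : 15 (4,5,6); 35 (7,8,9); 40 (10,11,12)
Column V (in threes, starting from the second) : 22 (5,6,7); 40 (8,9,10); 39 (11,12,13)
Column VI (in threes, starting from the third) : 29 (6,7,8); 43 (9,10,11)
The sizes whose frequencies have been grouped are shown in brackets after each total. Now we
shall find the size of the item containing maximum frequency :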
Column Size of item having maximum frequency
I 11
II 10,11
III 9,10
IV 10,11,12
V 8,9,10
VI 9,10,11
Here size 8 occurs 1 time, 9 occurs 3 times, 10 occurs 5 times, 11 occurs 4 times, 12
occurs 1 time.
Since 10 occurs maximum number of times (5 times).
Hence the required mode is size 10.
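The grouping procedure above can be sketched in Python. This is one possible reading of the method, with a hypothetical function name: the six columns are generated as (group width, starting offset) pairs, the group with the largest total in each column is marked, and the size marked most often is returned.

```python
def grouping_mode(sizes, freqs):
    """Mode by the method of grouping; returns (mode, tally of marks per size)."""
    n = len(freqs)
    tally = {s: 0 for s in sizes}
    # (width, start) for columns I to VI: singles, twos (2 ways), threes (3 ways)
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for width, start in columns:
        groups = [
            (sum(freqs[k:k + width]), k)
            for k in range(start, n - width + 1, width)
        ]
        best = max(total for total, _ in groups)
        for total, k in groups:
            if total == best:
                # every size inside the maximal group receives one mark
                for s in sizes[k:k + width]:
                    tally[s] += 1
    mode = max(tally, key=tally.get)
    return mode, tally


sizes = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]
mode, tally = grouping_mode(sizes, freqs)
print(mode, tally)
```

Note that the plain maximum frequency (column I alone) points to size 11, while the grouped analysis points to size 10, which is why the method is used when neighbouring frequencies matter.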
For moderately asymmetrical distribution (or for asymmetrical curve), the relation
Mean – Mode = 3 (Mean - Median),
approximately holds. In such a case, first evaluate mean and median and then mode
is determined by
Mode = 3 Median – 2 Mean.
If in the asymmetrical curve the area on the left of mode is greater than area on the
right then
Mean < median < mode, i. e., (M < Md < M0)
[Figure: asymmetrical frequency curves showing the relative positions of the mean, median and mode]
If in the asymmetrical curve the area on the left of the mode is less than the area on the
right, then in this case
Mode < median < mean, i.e., (M0 < Md < M)
Exercise 1(c)
Q.1) Find the Mode of the following model size number of shoes.
Model size no. of shoes : 3,4,2,1,7,6,6,7,5,6,8,9,5.
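If x1, x2, …, xn are n values of the variate x, none of which is zero, then their
geometric mean G is defined by
\[ G = (x_1 \cdot x_2 \cdots x_n)^{1/n} \qquad (1) \]
If f1, f2, …, fn are the frequencies of x1, x2, …, xn respectively, then the geometric mean G
is given by
\[ G = \left( x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n} \right)^{1/N}, \qquad N = \sum f. \]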
The harmonic mean of a series of values is the reciprocal of the arithmetic mean of
their reciprocals. Thus if x1, x2, …, xn (none of them being zero) is a series and H is its
harmonic mean then
\[ \frac{1}{H} = \frac{1}{n} \left[ \frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n} \right] \]
If f1, f2, …, fn be the frequencies of x1, x2, …, xn (none of them being zero) then the harmonic
mean H is given by
\[ \text{H.M.} = \frac{\sum f}{\sum \left( f \times \frac{1}{x} \right)} \]
Example 1. Find the harmonic mean of the marks obtained in a class test, given below
Marks : 11 12 13 14 15
No. of students: 3 7 8 5 2
Solution.
Marks (x)   Frequency (f)   1/x      f × 1/x
11          3               0.0909   0.2727
12          7               0.0833   0.5831
13          8               0.0769   0.6152
14          5               0.0714   0.3570
15          2               0.0667   0.1334
N = ∑f = 25, ∑(f/x) = 1.9614
\[ \text{H.M.} = \frac{\sum f}{\sum \left( f \times \frac{1}{x} \right)} = \frac{25}{1.9614} = \frac{250000}{19614} = 12.746 \text{ marks.} \]
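The calculation can be reproduced with a short Python sketch (the function name is hypothetical). Note that the table above rounds each reciprocal to four decimal places before summing; without that intermediate rounding the result comes out slightly lower, at roughly 12.74 marks.

```python
def harmonic_mean(xs, fs):
    """Harmonic mean of a frequency distribution: sum(f) / sum(f / x)."""
    return sum(fs) / sum(f / x for x, f in zip(xs, fs))


marks = [11, 12, 13, 14, 15]
students = [3, 7, 8, 5, 2]

h = harmonic_mean(marks, students)
print(round(h, 3))
```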
Property . For two observations x1 and x2, we have
AH = G²
Where A = arithmetic mean, H = harmonic mean and G = geometric mean.
1.3.1 QUARTILES :
Definition. The values of the variate which divide the total frequency into four equal
parts, are called quartiles. That value of the variate which divides the total frequency into
two equal parts is called median. The lower quartile or first quartile denoted by Q1 divides
the frequency between the lowest value and the median into two equal parts and similarly
the upper quartile (or third quartile) denoted by Q3 divides the frequency between the
median and the greatest value into two equal parts. The formulas for computation of
quartiles are given by
\[ Q_1 = l + \frac{\frac{n}{4} - cf}{f} \times i, \qquad Q_3 = l + \frac{\frac{3n}{4} - cf}{f} \times i \]
1.3.2 DECILES :
Definition. The values of the variate which divide the total frequency into ten equal
parts are called deciles. The formulas for computation are given by
\[ D_1 = l + \frac{\frac{n}{10} - cf}{f} \times i, \qquad D_2 = l + \frac{\frac{2n}{10} - cf}{f} \times i, \text{ etc.} \]
1.3.3 PERCENTILES :
Definition. The values of the variate which divide the total frequency into hundred
equal parts are called percentiles. The formulas for computation are :
\[ P_1 = l + \frac{\frac{n}{100} - cf}{f} \times i, \qquad P_{70} = l + \frac{\frac{70n}{100} - cf}{f} \times i, \text{ etc.} \]
Example 1. Compute the lower and upper quartiles, fourth decile and 70th percentile for
the following distribution:
(i) To compute Q1. Here N = 49, ¼ N = ¼ × 49 = 12.25, which clearly lies in the class 15-20.
Thus 15-20 is the lower quartile class.
l = 15, cf = 11, f = 15, i = 20 − 15 = 5
\[ Q_1 = l + \frac{\frac{n}{4} - cf}{f} \times i = 15 + \frac{12.25 - 11}{15} \times 5 = 15 + 0.417 = 15.417. \]
(ii) To compute Q3. Here ¾ N = ¾ × 49 = 36.75, which clearly lies in the class 25-30.
\[ Q_3 = l + \frac{\frac{3n}{4} - cf}{f} \times i \]
(iii) To compute D4. Here 4/10 N = 4/10 × 49 = 19.6, which clearly lies in the class 15-20.
Thus l = 15, cf = 11, f = 15, i = 5
\[ D_4 = l + \frac{\frac{4n}{10} - cf}{f} \times i = 15 + \frac{19.6 - 11}{15} \times 5 = 15 + 2.87 = 17.87 \]
(iv) To compute P70. Here 70N/100 = 7/10 × 49 = 34.3, which clearly lies in the class 20-25.
Thus l = 20, cf = 26, f = 10, i = 5
\[ P_{70} = l + \frac{\frac{70n}{100} - cf}{f} \times i = 20 + \frac{34.3 - 26}{10} \times 5 = 20 + 4.15 = 24.15 \]
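Quartiles, deciles and percentiles all use the same interpolation formula; only the position (n/4, 4n/10, 70n/100, …) changes. A minimal Python sketch under that observation, with a hypothetical helper name and the class values read off in the example above:

```python
def partition_value(l, position, cf, f, i):
    """Generic partition value: l + ((position - cf) / f) * i."""
    return l + (position - cf) / f * i


n = 49
q1 = partition_value(l=15, position=n / 4, cf=11, f=15, i=5)
d4 = partition_value(l=15, position=4 * n / 10, cf=11, f=15, i=5)
p70 = partition_value(l=20, position=70 * n / 100, cf=26, f=10, i=5)
print(round(q1, 3), round(d4, 2), round(p70, 2))
```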
Exercise 1(b)
Q.1) Find the median of the following.
20 18 22 27 25 12 15
Q.2) Calculate the median, lower and upper quartiles, third decile and 60th percentile for the
following distribution.
Class : 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Frequency : 5 8 7 12 28 20 10 10
DISPERSION OR VARIATION
Typist 1 : 15 20 25 25 30 35 (total 150)
Typist 2 : 10 20 25 25 30 40 (total 150)
We see that each of the typists 1 and 2 typed 150 pages in 6 working days and so the
average in both the cases is 25. Thus there is no difference in the average, but we know that
in the first case the number of pages varies from 15 to 35 while in the second case the
number of pages varies from 10 to 40. This denotes that the greatest deviation from the
mean in the first case is 10 and in the second case it is 15, i.e., there is a difference between
the two series. The variation of this type is termed scatter or dispersion or spread.
Definition. The degree to which numerical data tend to spread about an average
value is called variation or dispersion or spread of the data.
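The comparison of the two typists can be checked numerically. The following Python sketch (helper names are illustrative) shows that the two series share the same mean while their spread differs.

```python
typist_1 = [15, 20, 25, 25, 30, 35]
typist_2 = [10, 20, 25, 25, 30, 40]


def mean(values):
    return sum(values) / len(values)


def value_range(values):
    """Range: difference between the largest and the least value."""
    return max(values) - min(values)


def greatest_deviation(values):
    """Largest absolute deviation of any value from the mean."""
    m = mean(values)
    return max(abs(x - m) for x in values)


for name, pages in (("typist 1", typist_1), ("typist 2", typist_2)):
    print(name, mean(pages), value_range(pages), greatest_deviation(pages))
```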
Various measures of dispersion or variation are available, the most common are the
following.
15 20 25 25 30 35
i.e., 35-15=20
Q = (Q3 – Q1)/2
where Q1 and Q3 are respectively the first and third quartiles for the data.
The semi-inter-quartile range is a better measure of dispersion than the range and is easily
computed. Its drawback is that it does not take into account all the items.
Definition. The average (or mean) deviation about any point M, of a set of N
numbers x1, x2, …, xN is defined by
\[ \text{Mean Deviation (M.D.)} = \delta_m = \frac{1}{N} \sum_{i=1}^{N} |x_i - M| \]
where M is the mean or median or mode according as the mean deviation from the
mean or median or mode is to be computed, and |xi – M| represents the absolute (or
numerical) value. Thus |−5| = 5.
If x1, x2, …, xk occur with frequencies f1, f2, …, fk respectively and N = ∑f, then
\[ \delta_m = \frac{1}{N} \sum_{j=1}^{k} f_j |x_j - M| = \frac{1}{N} \sum f |x - M| \]
Mean deviation depends on all the values of the variables and therefore it is a better
measure of dispersion than the range or the quartile deviation. Since signs of the deviations
are ignored (because all deviations are taken positive), some artificiality is created.
Example 1. Find the mean deviation from the arithmetic mean of the following
distribution :
No. of students : 5 8 15 16 6
\[ \text{Arithmetic mean } M = A + \frac{\sum f u}{N} \times i = 25 + \frac{10}{50} \times 10 = 27. \]
\[ \delta_m = \frac{\sum f |x - M|}{N} = \frac{472}{50} = 9.44 \]
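The mean deviation of a frequency distribution can be sketched in Python as below. The class mid-points 5, 15, 25, 35, 45 are an assumption made here (the class intervals are not shown in the text); they were chosen because, with the given frequencies, they reproduce the stated mean of 27 and the stated total ∑f|x − M| = 472.

```python
def mean_deviation(xs, fs, about=None):
    """Mean deviation (1/N) * sum(f * |x - M|); M defaults to the arithmetic mean."""
    n = sum(fs)
    if about is None:
        about = sum(f * x for x, f in zip(xs, fs)) / n
    return sum(f * abs(x - about) for x, f in zip(xs, fs)) / n


mids = [5, 15, 25, 35, 45]        # assumed class mid-points (hypothetical)
students = [5, 8, 15, 16, 6]      # frequencies from the example

arith_mean = sum(f * x for x, f in zip(mids, students)) / sum(students)
md = mean_deviation(mids, students)
print(arith_mean, md)
```

Passing `about=` a median or mode instead gives the mean deviation about that point.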
Solution.
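To calculate the lower quartile Q1. Here N = 63, so the ¼ (N+1)th, i.e., the 16th student lies in the
marks group 20-30. Thus the lower quartile class is 20-30.
\[ Q_1 = l + \frac{\frac{1}{4} N - F}{f} \times i = 20 + \frac{15.75 - 12}{10} \times 10 = 23.75. \]
\[ \text{Similarly } Q_3 = 40 + \frac{47.25 - 38}{11} \times 10 = 48.4. \]
\[ \text{Semi-interquartile range} = \frac{1}{2} (Q_3 - Q_1) = \frac{1}{2} (48.4 - 23.75) = 12.325. \]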
Definition. It is defined as the positive square root of the mean of the squares of
the deviations from an origin A and denoted by s
\[ s = \sqrt{ \frac{1}{N} \sum f (x - A)^2 } \]
Mean square deviation. It is defined as the mean of the squares of the deviations from an
origin A. Thus
\[ s^2 = \frac{\sum f (x - A)^2}{N} \]
Standard Deviation :
Definition. Standard deviation (or S.D.) is the positive square root of the
arithmetic mean of the squared deviations of the various values from their arithmetic mean M.
It is usually denoted by σ. Thus
\[ \sigma = \sqrt{ \frac{1}{N} \sum f (x - M)^2 } \]
Remarks:
1. When the deviations are calculated from the arithmetic mean M, the root mean
square deviation becomes the standard deviation.
x: x1 x2 … xn
f: f1 f2 … fn
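Let A be the assumed mean and M the arithmetic mean. Also suppose M − A = d. Then
\[ \sigma^2 = \frac{1}{N} \sum f (x - M)^2 = \frac{1}{N} \sum f \{ (x - A) - (M - A) \}^2 \]
\[ = \frac{1}{N} \sum f (X - d)^2, \quad \text{where } X = x - A,\ d = M - A \]
\[ = \frac{1}{N} \sum f X^2 - 2d \cdot \frac{1}{N} \sum f X + d^2 \cdot \frac{1}{N} \sum f \]
\[ = \frac{1}{N} \sum f (x - A)^2 - 2d \cdot \frac{1}{N} \sum f (x - A) + d^2 \]
\[ = \frac{1}{N} \sum f (x - A)^2 - 2d \left( \frac{1}{N} \sum f x - \frac{1}{N} \sum f A \right) + d^2 \]
\[ = \frac{1}{N} \sum f (x - A)^2 - 2d (M - A) + d^2 \]
\[ = \frac{1}{N} \sum f (x - A)^2 - d^2 = s^2 - d^2 \]
Hence s² = σ² + d² … (1)
Relation (1) shows that s is least when d = 0, i.e., A = M, and the least value of s is equal to σ.
In other words the standard deviation is the least possible root mean square deviation.
For A ≠ M,
s² > σ²
i.e., the mean square deviation about any point A other than the mean is greater than the variance.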
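The relation s² = σ² + d², with d = M − A, can be verified numerically. The data in this Python sketch are hypothetical and serve only as a check; the function name is likewise illustrative.

```python
def mean_square_deviation(xs, fs, a):
    """s^2 = (1/N) * sum(f * (x - A)^2) about an arbitrary origin A."""
    n = sum(fs)
    return sum(f * (x - a) ** 2 for x, f in zip(xs, fs)) / n


xs = [5, 15, 25, 35, 45]   # hypothetical values
fs = [5, 8, 15, 16, 6]     # hypothetical frequencies

m = sum(f * x for x, f in zip(xs, fs)) / sum(fs)   # arithmetic mean M
variance = mean_square_deviation(xs, fs, m)        # sigma^2 (origin at the mean)

for a in (0, 20, 25, 27, 40):
    d = m - a
    s2 = mean_square_deviation(xs, fs, a)
    # s^2 should equal sigma^2 + d^2, so it is smallest when A = M
    print(a, s2, variance + d ** 2)
```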
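We know that
\[ \sigma^2 = \frac{1}{N} \sum f (x - A)^2 - \left( \frac{1}{N} \sum f (x - A) \right)^2 \]
Putting u = (x − A)/h, i.e., x − A = hu, we get
\[ \sigma^2 = h^2 \left\{ \frac{1}{N} \sum f u^2 - \left( \frac{1}{N} \sum f u \right)^2 \right\} \]
\[ \sigma = h \sqrt{ \frac{1}{N} \sum f u^2 - \left( \frac{1}{N} \sum f u \right)^2 } \]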
They are useful to compare two series in different units and also to compare
variations of two series having different magnitudes.
Some of the relative measures of dispersion (or coefficient of dispersion) which are
in common use are given below :
\[ \text{(i) Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
(ii) Coefficient of mean dispersion = Mean deviation about any point 'a' / a.
Here any point 'a' can be replaced by mean, median, mode etc.
C.V. or V = σ / M
Sometimes, we define
C.V. or V = (σ / M) × 100
Example1. Calculate the S.D. and coefficient of variation (C.V.) for the following table :
Frequency : 5 10 20 40 30 20 10 5
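\[ \text{Mean } M = A + h \frac{\sum f u}{N} = 35 + 4.64 = 39.64 \]
\[ \text{S.D., } \sigma = h \sqrt{ \frac{\sum f u^2}{N} - \left( \frac{\sum f u}{N} \right)^2 } = 10 \sqrt{ \frac{385}{140} - (0.464)^2 } \approx 15.92 \]
\[ \text{C.V.} = \frac{\sigma}{M} \times 100 \approx \frac{15.92}{39.64} \times 100 \approx 40.16 \]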
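The step-deviation computation of the standard deviation and the coefficient of variation can be sketched in Python. The value ∑fu = 65 is an inference made here from the stated ratio ∑fu/N = 0.464 with N = 140; it is not printed in the text, and the function name is hypothetical.

```python
import math


def sd_step_deviation(n, sum_fu, sum_fu2, h):
    """sigma = h * sqrt(sum(f u^2)/N - (sum(f u)/N)^2)."""
    return h * math.sqrt(sum_fu2 / n - (sum_fu / n) ** 2)


n = 140          # total frequency: 5 + 10 + 20 + 40 + 30 + 20 + 10 + 5
a, h = 35, 10    # assumed mean and class width used in the example
sum_fu = 65      # inferred from sum(f u) / N = 0.464 (assumption)
sum_fu2 = 385    # as given in the example

mean = a + h * sum_fu / n
sigma = sd_step_deviation(n, sum_fu, sum_fu2, h)
cv = sigma / mean * 100
print(round(mean, 2), round(sigma, 2), round(cv, 2))
```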
1.6 MOMENTS
For any frequency distribution, the rth moment about any point A is defined as the
arithmetic mean of rth powers of deviations from the point A.
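(i) For an individual series. The rth moment µr about the mean x̄ is defined by
\[ \mu_r = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^r, \quad \text{for } r = 0, 1, 2, 3, \dots \]
(ii) For a frequency distribution. Let
x: x1 x2 … xn
f: f1 f2 … fn
be a discrete frequency distribution. Then the rth moment µr about the mean
x̄ is defined by
\[ \mu_r = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^r, \quad \text{for } r = 0, 1, 2, 3, \dots, \text{ where } \sum_{i=1}^{n} f_i = N \]
or, more briefly,
\[ \mu_r = \frac{1}{N} \sum f (x - \bar{x})^r, \quad \text{for } r = 0, 1, 2, 3, \dots, \text{ where } \sum f = N. \]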
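Particular Cases.
For r = 0,
\[ \mu_0 = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^0 = \frac{1}{N} \sum_{i=1}^{n} f_i = \frac{N}{N} = 1 \]
Hence for all distributions,
µ0 = 1
For r = 1,
\[ \mu_1 = \frac{1}{N} \sum f_i (x_i - \bar{x}) = \frac{1}{N} \sum f_i x_i - \bar{x} \cdot \frac{1}{N} \sum f_i = \bar{x} - \bar{x} = 0. \]
For r = 2,
\[ \mu_2 = \frac{1}{N} \sum f_i (x_i - \bar{x})^2 = \sigma^2 = \text{variance}. \]
For r = 3,
\[ \mu_3 = \frac{1}{N} \sum f_i (x_i - \bar{x})^3, \text{ and so on.} \]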
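For any frequency distribution the rth moment about any point x = A is defined as
the arithmetic mean of the rth powers of the deviations from the point x = A and is denoted
by µr′.
If
X: x1 x2 … xn
F: f1 f2 … fn
is a discrete frequency distribution, then
\[ \mu_r' = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - A)^r, \quad r = 0, 1, 2, 3, \dots, \text{ and } \sum_{i=1}^{n} f_i = N. \]
For an individual series,
\[ \mu_r' = \frac{1}{n} \sum_{i=1}^{n} (x_i - A)^r, \quad r = 0, 1, 2, 3, \dots \]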
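Particular cases.
For r = 0,
\[ \mu_0' = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - A)^0 = \frac{1}{N} \sum_{i=1}^{n} f_i = \frac{N}{N} = 1 \]
For r = 1,
\[ \mu_1' = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - A) = \frac{1}{N} \sum_{i=1}^{n} f_i x_i - \frac{A}{N} \sum_{i=1}^{n} f_i = \bar{x} - \frac{A}{N} \times N = \bar{x} - A = d \text{ (say)} \]
For r = 2,
\[ \mu_2' = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - A)^2 = \frac{1}{N} \sum_{i=1}^{n} f_i \{ (x_i - \bar{x}) + (\bar{x} - A) \}^2 \]
\[ = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2 + (\bar{x} - A)^2 \cdot \frac{1}{N} \sum_{i=1}^{n} f_i \quad \text{(the cross term vanishes since } \mu_1 = 0 \text{)} \]
\[ = \sigma^2 + (\bar{x} - A)^2 = \sigma^2 + d^2. \]
For r = 3, expanding in the same way,
\[ \mu_3' = \mu_3 + 3 \mu_2 d + d^3. \]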
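RELATION BETWEEN CENTRAL MOMENTS (µr) AND MOMENTS ABOUT ANY POINT (µr′):
We have:
\[ \mu_1 = 0, \qquad \mu_2 = \mu_2' - \mu_1'^2, \]
\[ \mu_3 = \mu_3' - 3 \mu_2' \mu_1' + 2 \mu_1'^3, \]
\[ \mu_4 = \mu_4' - 4 \mu_3' \mu_1' + 6 \mu_2' \mu_1'^2 - 3 \mu_1'^4. \]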
If
x: x1 x2 … xn
f: f1 f2 … fn
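be a discrete frequency distribution, then the rth moment about the origin is denoted by
Vr (say) and is defined by
\[ V_r = \frac{1}{N} \sum_{i=1}^{n} f_i x_i^r, \quad r = 0, 1, 2, 3, \dots \text{ and } \sum_{i=1}^{n} f_i = N \]
\[ V_r = \frac{1}{N} \sum_{i=1}^{n} f_i x_i^r = \frac{1}{N} \sum_{i=1}^{n} f_i \{ (x_i - \bar{x}) + \bar{x} \}^r \]
\[ = \frac{1}{N} \sum_{i=1}^{n} f_i \left\{ (x_i - \bar{x})^r + {}^rC_1 (x_i - \bar{x})^{r-1} \bar{x} + {}^rC_2 (x_i - \bar{x})^{r-2} \bar{x}^2 + \dots + \bar{x}^r \right\} \]
\[ V_r = \mu_r + {}^rC_1 \mu_{r-1} \bar{x} + {}^rC_2 \mu_{r-2} \bar{x}^2 + \dots + \bar{x}^r \]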
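Karl Pearson gave the following four coefficients, calculated from the central
moments, which are defined as
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = +\sqrt{\beta_1}, \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3 = \frac{\mu_4 - 3 \mu_2^2}{\mu_2^2} \]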
Example 1. Calculate the first four central moments from the following data
Frequency : 1 3 5 7 4
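\[ u = \frac{x - A}{h} = \frac{x - 25}{10} \]
\[ \mu_r' = \frac{h^r \sum f u^r}{N} \]
\[ \mu_1' = \frac{h \sum f u}{N} = \frac{10 \times 10}{20} = 5 \]
\[ \mu_2' = \frac{h^2 \sum f u^2}{N} = \frac{100 \times 30}{20} = 150 \]
\[ \mu_3' = \frac{h^3 \sum f u^3}{N} = \frac{1000 \times 28}{20} = 1400 \]
\[ \mu_4' = \frac{h^4 \sum f u^4}{N} = \frac{10000 \times 90}{20} = 45000. \]
µ1 = 0 (always)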
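The remaining central moments follow from the moments about A = 25 by the standard conversion formulas. The Python sketch below (hypothetical function name) applies them to the four values just computed and also evaluates Karl Pearson's β and γ coefficients.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary point A into central moments."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return 0, mu2, mu3, mu4


# Moments about A = 25 obtained in the example above.
mu1, mu2, mu3, mu4 = central_from_raw(5, 150, 1400, 45000)

beta1 = mu3 ** 2 / mu2 ** 3   # skewness coefficient
beta2 = mu4 / mu2 ** 2        # kurtosis coefficient
gamma2 = beta2 - 3
print(mu1, mu2, mu3, mu4)
print(round(beta1, 5), round(beta2, 3), round(gamma2, 3))
```

A negative µ3 indicates a longer tail on the left, and β2 below 3 indicates a curve flatter than the normal curve.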
1.8. SKEWNESS
By skewness in a frequency distribution, we mean the lack of symmetry. [If the
frequencies are symmetrically distributed about the mean, then the distribution is called
symmetrical; in other words, a distribution is called symmetrical when the values
equidistant from the mean have equal frequencies.] Skewness is also termed asymmetry.
We know that for a symmetrical distribution the mean, median and mode coincide (M = M0 = Md).
Therefore, skewness in a distribution is shown when these three averages do not coincide.
Skewness indicates that the frequency curve has a longer tail on one side of the average.
When the frequency curve has a longer tail on the right side, the skewness is called positive.
When the frequency curve has a longer tail on the left side, the skewness is called negative. In
other words, the skewness is positive if M0 < Md < M and negative if M < Md < M0, where M,
Md and M0 are the mean, median and mode respectively.
MEASURE OF SKEWNESS :
(i) First coefficient of skewness. It is also known as Bowley's coefficient of skewness and
is defined as
\[ \text{Coefficient of skewness } J_Q = \frac{Q_3 + Q_1 - 2 M_d}{Q_3 - Q_1} = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} \]
where Q1 and Q3 are the lower and upper quartiles respectively and Md is the median.
Clearly this measure is based on the fact that in a skew curve, the median does
not lie half way between Q1 and Q3. This formula for coefficient of skewness is
used when mode is well defined
Note that both of the above coefficients are pure numbers since both the
numerator and denominator have the same dimensions.
If skewness in the series is very small then second coefficient of skewness should
be used.
Coefficients of skewness based on moments are also called Moment Coefficient
of skewness.
30-40 5 24
40-50 3 27
1.9 KURTOSIS
In the Greek language kurtosis means 'bulginess'. Kurtosis indicates the nature of the
vertex of the curve. Several statisticians have defined kurtosis. Some of these definitions are :
"In statistics, kurtosis refers to the degree of flatness or peakedness in the region
about the mode of a frequency curve. The degree of kurtosis of a distribution is measured
relative to the peakedness of the normal curve."
1. Normal Curve or Mesokurtic Curve. A curve which is neither flat nor peaked is
called a normal curve or mesokurtic curve. For such type of curve we have β2 = 3
and γ2 = 0.
2. Leptokurtic Curve. A curve which is more peaked than the normal curve is called
leptokurtic curve. For such type of curve, we have β2>3 and γ2>0.
3. Platykurtic Curve. A curve which is flatter than the normal curve is called
platykurtic curve. For such type of curve, we have β2<3 and γ2<0.
Measure of Kurtosis
Second and fourth moments are used to measure kurtosis. Karl Pearson gave the
following formula to measure kurtosis :
\[ \text{Kurtosis or } \beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3 = \frac{\mu_4 - 3 \mu_2^2}{\mu_2^2} \]
Deductions.
Example 2. The fourth moment about mean of frequency distribution is 768. What must
be value of its standard deviation in order that the distribution be
(i) Leptokurtic
(ii) Mesokurtic
(iii)Platykurtic.
Given µ4 = 768.
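Since µ2 = σ², the coefficient is β2 = µ4/σ⁴ = 768/σ⁴, and the type of curve depends on whether β2 is greater than, equal to or less than 3. The following Python sketch (hypothetical function name) finds the boundary value of σ and classifies a few trial values.

```python
import math


def kurtosis_type(mu4, sigma):
    """Classify a distribution from its fourth central moment and S.D."""
    beta2 = mu4 / sigma ** 4
    if math.isclose(beta2, 3.0):
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"


mu4 = 768
# beta2 = 3 exactly when sigma^4 = mu4 / 3
sigma_boundary = (mu4 / 3) ** 0.25
print(sigma_boundary)

for sigma in (3, sigma_boundary, 5):
    print(sigma, kurtosis_type(mu4, sigma))
```

So, under these definitions, the distribution is mesokurtic for σ = 4, leptokurtic for σ < 4 and platykurtic for σ > 4.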
After going through this unit, you should have achieved the objectives stated earlier in
the unit. Let us recall what we have discussed so far :
* An average is a single value within the range of the data that is used to represent
all the values in the series.
* To find the arithmetic mean, add the values of all the terms and then divide the sum by
the number of terms; the quotient is the arithmetic mean.
* The median is the value of the variable which divides the group into two equal
parts, one part comprising all values greater, and the other all values less than the
median
* The mode is that value (or size) of the variate for which the frequency is maximum
or the point of maximum frequency or the point of maximum density. In other
words, the mode is the maximum ordinate of the ideal curve which gives the closest
fit to the actual distribution.
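* If f1, f2, …, fn are the frequencies of x1, x2, …, xn respectively, then the geometric mean
G is given by
\[ G = \left( x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n} \right)^{1/N}, \qquad N = \sum f \]
* If f1, f2, …, fn be the frequencies of x1, x2, …, xn (none of them being zero) then the harmonic
mean H is given by
\[ \text{H.M.} = \frac{\sum f}{\sum \left( f \times \frac{1}{x} \right)} \]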
* The values of the variate which divide the total frequency into four equal parts, are called
quartiles
* The values of the variate which divide the total frequency into ten equal parts are called
deciles
* The values of the variate which divide the total frequency into hundred equal parts are
called percentiles.
* The range of a set of numbers (data) is the difference between the largest and the least
numbers in the set
Q =(Q3 – Q1)/2
\[ \delta_m = \frac{1}{N} \sum_{j=1}^{k} f_j |x_j - M| = \frac{1}{N} \sum f |x - M| \]
* Standard deviation (or S.D.) is the positive square root of the arithmetic mean of the
square deviations of various values from their arithmetic mean M. It is usually denoted by
σ. Thus
\[ \sigma = \sqrt{ \frac{1}{N} \sum f (x - M)^2 } \]
* For any frequency distribution, the rth moment about any point A is defined as the
arithmetic mean of rth powers of deviations from the point A.
* The measure of kurtosis indicates the degree to which a curve of the frequency distribution is
peaked or flat-topped.
Exercise
Q.1) Calculate the measure of kurtosis for the following distribution.
No. of candidate :1 3 5 7 4
Q.2)The first four moments of the distribution about the value 5 of a variable are 2,20,40 and
Unit-2
Probability
Structure:
2.0 Introduction
2.1 Objectives
2.4 Event
2.14 Mean, Median, Mode and Moments for a continuous distribution function
2.16 Covariance
2.0 INTRODUCTION
2.1 OBJECTIVES
The main aim of this unit is to study the probability. After going through this unit you should
be able to :
describe random experiments, sample space, additive law and multiplicative law of
probability, dependent and independent events etc ;
calculate mean, mode, median, and moments for a continuous distribution function,
mathematical expectation;
Consider a bag containing 4 white and 5 black balls. Suppose 2 balls are
drawn at random. Here the natural phenomenon is that 'both balls may be
white' or 'one white and one black' or 'both black'. Thus there is a
probabilistic situation.
A sample space of a random experiment is the set of all possible outcomes of that
experiment and is denoted by S
2.4 EVENT.
Equally likely events. Two events are considered equally likely if one of
them cannot be expected in preference to the other.
For example, if an unbiased coin is tossed then we may get any of head (H) or
tail (T), thus the two different events are equally likely.
For example, if an unbiased die is rolled, then we may obtain any one of the six
numbers 1, 2, 3, 4, 5 and 6. Hence there are six exhaustive events in this trial.
For example, If a pair of fair dice is tossed then the favorable events to get the
sum 7 are six :
Example 1. In a single toss of a fair die, find (a) sample space (b) event
of getting an even number (c) event of getting an odd number (d) event of getting
numbers greater than 3, (e) event of getting numbers less than 4.
Solution. (a) When we toss a die, we may get any of the six
numbers 1, 2, 3, 4, 5 and 6. Hence the set of these six numbers is the sample space
S for this experiment, i.e.,
S={1,2,3,4,5,6};
(b) E1 ={2,4,6}
(c) E2 ={1,3,5}
(d) E3 ={4,5,6}
(e) E4 ={1,2,3}
(a)Heads on the upper faces of coins, (b)head on one and tail on other, (c)Tails on
both,(d) at least one head.
Solution. If H denotes 'head' and T denotes 'tail' then the toss of two coins can
lead to four cases (H,H), (T,T), (H,T), (T,H), all equally likely. Hence the sample
space S is the set of all these four ordered pairs, thus
S={(H,H),(T,T),(H,T),(T,H)};
In this experiment let E1, E2, E3 and E4 be the events of getting both heads, one
head and one tail, both tails and at least one head respectively, then
(a) E1 ={(H,H)}
(b) E2 ={(H,T),(T,H)}
(c) E3 ={(T,T)}
(d) E4 ={(H,H),(H,T),(T,H)}
Simple event. If E contains only one element of the sample space S, then E is
called a simple event. Thus
E = {ei}
Composition of events
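\[ n(S) = n, \quad n(E_1) = l, \quad n(E_2) = m, \quad n(E_1 \cap E_2) = r. \]
\[ \text{Clearly } n(E_1 \cup E_2) = l + m - r. \]
\[ P(E_1 \cup E_2) = \frac{n(E_1 \cup E_2)}{n(S)} = \frac{l + m - r}{n} = \frac{l}{n} + \frac{m}{n} - \frac{r}{n} = \frac{n(E_1)}{n(S)} + \frac{n(E_2)}{n(S)} - \frac{n(E_1 \cap E_2)}{n(S)} \]
\[ \text{or } P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) \qquad (1) \]
If E1 and E2 are mutually exclusive, then
\[ E_1 \cap E_2 = \varnothing \text{ and } n(E_1 \cap E_2) = 0, \]
\[ P(E_1 \cup E_2) = P(E_1) + P(E_2). \]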
Example 1. If 1/4 is the probability of winning a race by the horse A and 1/3 be
the probability of winning the same race by the horse B, find the probability that
one of these horses will win.
Solution. Let E1 and E2 be the events that the horses A and B win the race
respectively. Then
\[ P(E_1) = \frac{1}{4}, \qquad P(E_2) = \frac{1}{3} \]
We know that if the horse A wins the race then the horse B cannot
win the race and if B wins the race then A cannot win. Hence the events E1
and E2 are mutually exclusive events. Therefore, the probability that any one
of A or B wins the race is given by
\[ P(E_1 \cup E_2) = P(E_1) + P(E_2) = \frac{1}{4} + \frac{1}{3} = \frac{7}{12} \]
which is impossible
Theorem. If E1 and E2 are two events, the respective probabilities of which are
known, then the probability that both will happen simultaneously is the product of the
probability of E1 and the conditional probability of E2 when E1 has already
occurred, i.e.,
\[ P(E_1 \cap E_2) = P(E_1) \, P(E_2 / E_1). \]
Proof. Let S be the sample space for an experiment and E1 and E2 be its two
events.
\[ P(E_2 / E_1) = \frac{n(E_1 \cap E_2)}{n(E_1)} = \frac{n(E_1 \cap E_2) / n(S)}{n(E_1) / n(S)} = \frac{P(E_1 \cap E_2)}{P(E_1)} \]
Hence
\[ P(E_1 \cap E_2) = P(E_1) \, P(E_2 / E_1). \]
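a. P(A/B)
b. P(B/A)
c. P(A ∪ B)
Solution.
a. Since P(A ∩ B) = P(B) · P(A/B),
\[ P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/3} = \frac{3}{4} \]
b.
\[ P(B/A) = \frac{P(B \cap A)}{P(A)} = \frac{P(A \cap B)}{P(A)} = \frac{1/4}{1/2} = \frac{1}{2} \]
c.
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{1}{2} + \frac{1}{3} - \frac{1}{4} = \frac{7}{12} \]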
P(B/A1), P(B/A2), …, P(B/An)
\[ P(A_j / B) = \frac{P(A_j) \, P(B / A_j)}{\sum_{i=1}^{n} P(A_i) \, P(B / A_i)} \]
Proof: (a) Since the event B can occur when either A1 occurs, or A2 occurs,
…, or An occurs, i.e., B can occur in composition with either A1 or A2 … or An,
consequently
\[ P(B) = \sum_{i=1}^{n} P(B A_i) = \sum_{i=1}^{n} P(A_i) \, P(B / A_i) \]
and, since P(Aj/B) = P(BAj)/P(B),
\[ P(A_j / B) = \frac{P(A_j) \, P(B / A_j)}{\sum_{i=1}^{n} P(A_i) \, P(B / A_i)} \qquad \dots (3) \]
(b) The further event C can occur in n mutually exclusive cases, namely A1C/B,
…, AnC/B. Hence the conditional probability of C is given by
\[ P(C / B) = \sum_{i=1}^{n} P(A_i C / B) = \sum_{i=1}^{n} P(A_i / B) \, P(C / A_i B) = \frac{\sum_{i=1}^{n} P(A_i) \, P(B / A_i) \, P(C / A_i B)}{\sum_{i=1}^{n} P(A_i) \, P(B / A_i)} \]
Example 1. A bag contains 3 white and 2 black balls, and another bag contains 5
white and 3 black balls. If a bag is selected at random and a ball is drawn
from it, find the probability that it is white.
Solution. Let B be the event of getting one white ball, and A1, A2 be the events
of choosing the first bag and the second bag respectively.
P(A1) = the probability of selecting the first bag out of two bags = 1/2
Similarly, P(A2) = the probability of selecting the second bag = 1/2
P(B/A1) = the conditional probability of drawing one white ball when the first bag
has been selected
\[ = \frac{{}^3C_1}{{}^5C_1} = \frac{3}{5} \]
Similarly P(B/A2) = 5/8
\[ P(B) = \sum_{i=1}^{2} P(A_i) \, P(B / A_i) = P(A_1) P(B / A_1) + P(A_2) P(B / A_2) = \frac{1}{2} \cdot \frac{3}{5} + \frac{1}{2} \cdot \frac{5}{8} = \frac{49}{80}. \text{ Ans.} \]
A random variable can assume only a set of real values and the values
which the variable takes depends on the chance. Random variable is also
called stochastic variable or simply a variate. For example.
Suppose a perfect die is thrown then x, the number of points on the die
is random variable since x has the following two properties:
The set of values 1,2,3,4,5,6 with their probabilities 1/6 is called the
Probability Distribution of variate x.
The set of values xi (for i=1,2…n) with their probabilities pi (for i=1,2…n) is
called the Probability Distribution of the variable of that trial. It is to be noted
that most of the properties of frequency distribution will be equally applicable
to probability distribution.
Continuous Variate
So far we have dealt with discrete variates, which take a finite set of
values.
When we deal with variates like weight and temperature, we know
that these variates can take an infinite number of values in a given interval.
Such variates are known as continuous variates.
Definition. A variate which is not discrete, i.e., which can take an infinite number
of values in a given interval a ≤ x ≤ b, is called a continuous variate.
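\[ P\left( x - \tfrac{1}{2} dx \le X \le x + \tfrac{1}{2} dx \right) = f(x) \, dx, \]
\[ \text{(i) } f(x) \ge 0 \]
\[ \text{(ii) } \int_a^b f(x) \, dx = 1, \text{ if } a \le X \le b \]
\[ \int_{-\infty}^{\infty} f(x) \, dx = 1, \text{ if } -\infty < X < \infty \]
then the function f(x) is called the probability density function (or in brief p.d.f.)
of the continuous random variable X.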
Remarks:
(1) If the range of X be finite, then also it can be expressed as infinite range.
For example,
f(x) = φ(x), for a ≤ X ≤ b
(2) The probability that a value of the continuous variable X lies within the
interval (c, d) is given by
\[ P(c \le X \le d) = \int_c^d f(x) \, dx \]
(3) The continuous variable always takes values within a given interval
howsoever small the interval may be.
(4) If X be a continuous random variable, then
P(X=k)=0
\[ F(x) = P(X \le x) = \sum_{x_i \le x} p(x_i) \]
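(1) Mean.
\[ M = \bar{X} = E(X) = \int_a^b x f(x) \, dx, \text{ if } a \le x \le b \]
\[ = \int_{-\infty}^{\infty} x f(x) \, dx, \text{ if } -\infty < x < \infty \]
(2) Geometric mean. If G is the geometric mean, then
\[ \log G = \int_a^b \log x \, f(x) \, dx = E(\log x), \text{ if } a \le x \le b \]
(3) Harmonic mean. If H is the harmonic mean, then
\[ \frac{1}{H} = \int_a^b \frac{1}{x} f(x) \, dx = E\left( \frac{1}{x} \right), \text{ if } a \le x \le b \]
(4) Median. The median Md is given by
\[ \int_a^{M_d} f(x) \, dx = \frac{1}{2}, \text{ if } a \le x \le b \]
(5) Quartiles. The quartiles Q1 and Q3 are given by
\[ \int_a^{Q_1} f(x) \, dx = \frac{1}{4} \quad \text{and} \quad \int_a^{Q_3} f(x) \, dx = \frac{3}{4} \]
(6) Mode. The mode is the value of the variate for which
\[ \frac{d}{dx} f(x) = 0 \quad \text{and} \quad \frac{d^2 f(x)}{dx^2} < 0 \]
[In other words, the mode is the value of the variate for which the
probability f(x) is maximum.] The condition for this is that the
values obtained from (d/dx) f(x) = 0 lie within the given range of x.
(7) Moment. The rth moment about any given arbitrary value A is given
by:
\[ \mu_r' = \int_a^b (x - A)^r f(x) \, dx, \text{ if } a \le x \le b \]
\[ = \int_{-\infty}^{\infty} (x - A)^r f(x) \, dx, \text{ if } -\infty < x < \infty \]
In particular, the variance about the mean m is
\[ \mu_2 = \int_{-\infty}^{\infty} (x - m)^2 f(x) \, dx, \text{ if } -\infty < x < \infty \]
\[ = \int_a^b (x - m)^2 f(x) \, dx, \text{ if } a \le x \le b \]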
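$f(x) = 6(x - x^2)$, $0 \le x \le 1$, is a p.d.f.
Mean:
\[ M = \int_0^1 x f(x)\,dx = \int_0^1 x \cdot 6(x - x^2)\,dx = 6\int_0^1 (x^2 - x^3)\,dx
 = 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = 6\left(\frac13 - \frac14\right) = \frac12. \]
Harmonic mean:
\[ \frac{1}{H} = \int_0^1 \frac{1}{x}\cdot 6(x - x^2)\,dx = 6\left[x - \frac{x^2}{2}\right]_0^1 = 3, \]
so H = 1/3.
Median:
\[ \int_0^{M_d} f(x)\,dx = \frac12 \;\Rightarrow\; \int_0^{M_d} 6(x - x^2)\,dx = \frac12
 \;\Rightarrow\; 6\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^{M_d} = \frac12, \]
\[ 4M_d^3 - 6M_d^2 + 1 = 0 \;\Rightarrow\; (2M_d - 1)(2M_d^2 - 2M_d - 1) = 0
 \;\Rightarrow\; M_d = \frac12 \ \text{or} \ M_d = \frac12(1 \pm \sqrt3). \]
Since the median must lie between 0 and 1, $M_d = \frac12$.
Mode: Let $y = 6(x - x^2)$. Then
\[ \frac{dy}{dx} = 6(1 - 2x) \quad \text{and} \quad \frac{d^2y}{dx^2} = -12 < 0, \]
\[ \frac{dy}{dx} = 0 \;\Rightarrow\; 6(1 - 2x) = 0 \;\Rightarrow\; x = \frac12. \]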
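The worked example above can be checked numerically. The following is a minimal sketch, assuming the density f(x) = 6(x - x^2) on [0, 1]; the helper names (integrate, median) are hypothetical and the midpoint rule is only one of several possible integration schemes.

```python
# Numerical check of the example f(x) = 6(x - x^2) on [0, 1].
def f(x):
    return 6.0 * (x - x * x)

def integrate(g, lo, hi, steps=20000):
    """Composite midpoint rule; it never evaluates g at the end points."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

total = integrate(f, 0.0, 1.0)                      # total probability
mean = integrate(lambda x: x * f(x), 0.0, 1.0)      # E(X)
harmonic_mean = 1.0 / integrate(lambda x: f(x) / x, 0.0, 1.0)

def median(lo=0.0, hi=1.0, tol=1e-9):
    """Bisection on F(m) = 3m^2 - 2m^3, the integral of f from 0 to m."""
    cdf = lambda m: 3 * m * m - 2 * m ** 3
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(total, mean, harmonic_mean, median())
```

The printed values should agree with the analytic results (total probability 1, mean 1/2, harmonic mean 1/3, median 1/2) up to the integration error.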
Definition. Let x be a discrete random variable and let its frequency
distribution be as follows :
Variate : x1 x2 x3 ….. xn
Probability : p1 p2 p3 ….. pn
where ∑p=1.
Variance
E[x − E(x)] = 0.
Expectation of a Sum
Theorem. The expectation of the sum of two variates is equal to the sum of their
expectations, i.e., if x and y are two variates then
E(x + y) = E(x) + E(y).
Product of Expectations
E(xy) = E(x)·E(y), provided x and y are independent variates.
2.16 COVARIANCE
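Example 1. What is the expected value of the number of points that will be
obtained in a single throw of an ordinary die? Find the variance also.
Solution. The variate, i.e., the number showing on a die, assumes the values
1, 2, 3, 4, 5, 6 and the probability in each case is 1/6.
x : 1 2 3 4 5 6
p : 1/6 1/6 1/6 1/6 1/6 1/6
\[ E(x) = \sum_{i=1}^{6} p_i x_i = p_1 x_1 + p_2 x_2 + p_3 x_3 + \dots + p_6 x_6
 = \tfrac16\cdot 1 + \tfrac16\cdot 2 + \tfrac16\cdot 3 + \tfrac16\cdot 4 + \tfrac16\cdot 5 + \tfrac16\cdot 6
 = \tfrac16[1 + 2 + 3 + 4 + 5 + 6] = \tfrac16\cdot 21 = \tfrac72. \]
\[ \text{Variance} = E(x^2) - [E(x)]^2 = \tfrac16[1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2] - (7/2)^2 = \tfrac{35}{12}. \]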
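The expectation and variance of the die example can be reproduced exactly with rational arithmetic. This is a small sketch; the function names expectation and variance are illustrative choices, not part of the text.

```python
from fractions import Fraction

def expectation(values, probs):
    """E(x) = sum of p_i * x_i for a discrete distribution."""
    return sum(Fraction(v) * p for v, p in zip(values, probs))

def variance(values, probs):
    """E[(x - E(x))^2], computed directly from the definition."""
    mean = expectation(values, probs)
    return sum((Fraction(v) - mean) ** 2 * p for v, p in zip(values, probs))

faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6          # each face is equally likely

die_mean = expectation(faces, probs)
die_var = variance(faces, probs)
print(die_mean, die_var)
```

Using Fraction avoids rounding, so the results can be compared directly with 7/2 and 35/12.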
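Example 2. Find $E(x)$, $E(x^2)$ and $E\{(x - \bar{x})^2\}$ for the following probability
distribution:
x : 8 12 16 20 24
p(x) : 1/8 1/6 3/8 1/4 1/12
\[ E(x) = \sum x\,p(x) = 8\cdot\tfrac18 + 12\cdot\tfrac16 + 16\cdot\tfrac38 + 20\cdot\tfrac14 + 24\cdot\tfrac1{12} = 16, \]
\[ E(x^2) = \sum x^2 p(x) = 8^2\cdot\tfrac18 + 12^2\cdot\tfrac16 + 16^2\cdot\tfrac38 + 20^2\cdot\tfrac14 + 24^2\cdot\tfrac1{12} = 276, \]
\[ E\{(x - \bar{x})^2\} = \sum (x - \bar{x})^2 p(x)
 = (8-16)^2\cdot\tfrac18 + (12-16)^2\cdot\tfrac16 + (16-16)^2\cdot\tfrac38 + (20-16)^2\cdot\tfrac14 + (24-16)^2\cdot\tfrac1{12} = 20. \]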
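After going through this unit, you should have achieved the objectives
stated earlier in the unit. Let us recall what we have discussed so far –
Bayes' theorem:
\[ P(A_j / B) = \frac{P(A_j)\,P(B / A_j)}{\sum_{i=1}^{n} P(A_i)\,P(B / A_i)}. \]
The probability density function f(x) of a continuous variate X satisfies
(i) $f(x) \ge 0$,
(ii) $\int_a^b f(x)\,dx = 1$ if $a \le X \le b$, or $\int_{-\infty}^{\infty} f(x)\,dx = 1$ if $-\infty < X < \infty$.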
Let x be the discrete random variable and let its frequency distribution
be as follows :
Variate : x1 x2 x3 ….. xn
Probability : p1 p2 p3 ….. pn
The expectation of the sum of two variates is equal to the sum of their
expectations, i.e., if x and y are two variates then
E(x + y) = E(x) + E(y).
Exercise
Q.1) A coin is tossed twice. What is the probability that at least one head occurs?
Q.2) A card is drawn from an ordinary deck. Find the probability that it is a heart.
Q.3) What is the probability of getting a total of 7 or 11 when a pair of dice is tossed?
Q.4) A coin is tossed 6 times in succession. What is the probability that at least one head
occurs?
Q.5) A card is drawn from an ordinary deck and we are told that it is red. What is the
Q.6) Find a formula for the probability distribution of the random variable X
Structure
3.1. Introduction
3.2. Objectives
3.10. Assignment
distributions are divided into two parts: Discrete Theoretical Distributions and
Continuous Theoretical Distributions. Binomial and Poisson distributions are Discrete
Theoretical Distributions, while Normal, Rectangular and Exponential are Continuous
Theoretical Distributions. If a certain hypothesis is assumed, it is sometimes possible to
derive mathematically what the frequency distributions of certain universes should be.
Such distributions are called 'Theoretical Distributions'. The Binomial distribution was
discovered by James Bernoulli in 1700 and is therefore also called the Bernoulli
distribution. The Poisson distribution is a particular limiting form of the Binomial
distribution; it was first discovered by the French mathematician S. D. Poisson in 1837. In
1733 De Moivre discovered the Normal (or Gaussian) distribution as a limiting form
of the Binomial distribution; it is defined by a probability density function. The Rectangular (or
Uniform) and Exponential distributions are also defined by their own probability density
functions. These distributions serve as guiding instruments in research in the
physical and social sciences and in medicine, agriculture and engineering. They are
indispensable tools for the analysis and interpretation of the basic data obtained by
observation and experiment.
3.2: Objectives: After the end of the unit the student will be able to understand/know the
applications
3.3. Theoretical Distributions: When frequency distributions are made by collecting the
data in direct form, such distributions are called observed frequency distributions.
When frequency distributions are made by obtaining probable (or expected)
frequencies by mathematical methods on the basis of a definite hypothesis (or
assumptions), such distributions are said to be theoretical frequency distributions.
Theoretical distributions are obtained from probability distributions. If the probabilities are
taken to be the relative frequencies, then the probability distributions are said to be
theoretical frequency distributions. Theoretical distributions are divided into the following two
parts:
Example: If we throw four coins 80 times, then by the theory of probability the theoretical
distribution is given as :
No. of heads    Probability    Theoretical frequency
0               1/16           80 × 1/16 = 5
1               4/16           80 × 4/16 = 20
2               6/16           80 × 6/16 = 30
3               4/16           80 × 4/16 = 20
4               1/16           80 × 1/16 = 5
Total           1              80
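The theoretical frequencies in this table can be generated directly from the binomial probabilities. A minimal sketch, assuming fair coins (p = 1/2); the function name theoretical_frequencies is hypothetical.

```python
from math import comb

def theoretical_frequencies(n_coins, n_throws, p=0.5):
    """Expected number of throws showing r heads, for r = 0..n_coins."""
    q = 1 - p
    return [n_throws * comb(n_coins, r) * p**r * q**(n_coins - r)
            for r in range(n_coins + 1)]

freqs = theoretical_frequencies(4, 80)
for heads, freq in enumerate(freqs):
    print(heads, freq)
```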
3.4. Binomial Distribution: Let there be an event, the probability of its being success is p
and the probability of its failure is q in one trial, so that p + q = 1. Let the event be tried n
times and suppose that the trials are
(i) Independent
The probability that the first r trials are successes and the remaining n – r are
failures is $p^r q^{n-r}$. But we are to consider all the cases where any r trials are successes;
since r trials out of n can be chosen in ${}^nC_r$ ways, the probability p(r) [or b(n, p, r)] of r
successes out of n independent trials is given by $p(r) = {}^nC_r\, p^r q^{n-r}$.
The binomial distribution contains two independent constants, viz. n and p (or q); these are
called the parameters of the binomial distribution. If p = q = ½, the binomial distribution is
called symmetric, and when p ≠ q it is called a skew distribution.
Let the n independent trials constitute one experiment and let this experiment be repeated N
times, where N is very large. In these N sets there will be a few sets in which there is no
success, a few sets with one success, a few sets with two successes, and so on. Hence in all the N
sets, the number of sets with r successes is $N\,{}^nC_r\, p^r q^{n-r}$. Therefore the numbers of sets
corresponding to the number of successes 0, 1, 2, … , r, … , n are respectively
$N q^n,\ N\,{}^nC_1\, p q^{n-1},\ N\,{}^nC_2\, p^2 q^{n-2},\ \dots,\ N\,{}^nC_r\, p^r q^{n-r},\ \dots,\ N p^n$. Hence for N sets of n trials the
theoretical frequency distribution (or expected frequency distribution) of 0, 1, 2, … , r,
… , n successes is given by the successive terms in the expression
\[ N\left[q^n + {}^nC_1\, p q^{n-1} + {}^nC_2\, p^2 q^{n-2} + \dots + {}^nC_r\, p^r q^{n-r} + \dots + p^n\right], \]
which is the binomial expansion of $N(q + p)^n$. This is called the Binomial Theoretical
Frequency Distribution or simply the Binomial Distribution.
(k = 1, 2, 3, 4, …).
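First moment
\[ \mu'_1 = E(X) = \sum_{r=0}^{n} r\,{}^nC_r\, p^r q^{n-r}, \quad q + p = 1, \]
\[ = \sum_{r=1}^{n} n\,{}^{n-1}C_{r-1}\, p^r q^{n-r} = np \sum_{r=1}^{n} {}^{n-1}C_{r-1}\, p^{r-1} q^{(n-1)-(r-1)} \]
\[ = np\left[q^{n-1} + {}^{n-1}C_1\, q^{n-2}p + {}^{n-1}C_2\, q^{n-3}p^2 + {}^{n-1}C_3\, q^{n-4}p^3 + \dots + p^{n-1}\right] = np(q + p)^{n-1} = np. \]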
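Second moment
\[ \mu'_2 = E(X^2) = \sum_{r=0}^{n} r^2\,{}^nC_r\, p^r q^{n-r}, \quad q + p = 1, \]
\[ = \sum_{r=0}^{n} \{r(r-1) + r\}\,{}^nC_r\, p^r q^{n-r}
 = \sum_{r=0}^{n} r(r-1)\,{}^nC_r\, p^r q^{n-r} + \sum_{r=0}^{n} r\,{}^nC_r\, p^r q^{n-r} \]
\[ = n(n-1)p^2 \sum_{r=2}^{n} {}^{n-2}C_{r-2}\, p^{r-2} q^{(n-2)-(r-2)} + np = n(n-1)p^2 + np. \]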
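Similarly, for the fourth moment, $r^4$ is resolved into factorial terms:
\[ \mu'_4 = E(X^4) = \sum_{r=0}^{n} \{r(r-1)(r-2)(r-3) + 6r(r-1)(r-2) + 7r(r-1) + r\}\,{}^nC_r\, p^r q^{n-r}. \]
The Karl Pearson coefficients are
\[ \gamma_1 = \sqrt{\beta_1} = \frac{1 - 2p}{\sqrt{npq}}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}. \]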
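(Recurrence relation for the moments of the Binomial Distribution). For the Binomial distribution
$(q + p)^n$,
\[ \mu_{k+1} = pq\left[nk\,\mu_{k-1} + \frac{d\mu_k}{dp}\right], \]
where $\mu_k$ is the kth moment about the mean.
Proof : Since
\[ \mu_k = \sum_{r=0}^{n} (r - \mu'_1)^k\, p(r) = \sum_{r=0}^{n} (r - np)^k\,{}^nC_r\, p^r q^{n-r}, \quad q + p = 1, \]
differentiating with respect to p (with q = 1 − p) gives
\[ \frac{d\mu_k}{dp} = (-1)kn\sum_{r=0}^{n} (r - np)^{k-1}\,{}^nC_r\, p^r (1-p)^{n-r}
 + \sum_{r=0}^{n} (r - np)^k\,{}^nC_r \left\{ r p^{r-1}(1-p)^{n-r} - (n-r)p^r(1-p)^{n-r-1} \right\} \]
\[ = -nk\,\mu_{k-1} + \frac{1}{pq}\sum_{r=0}^{n} {}^nC_r\, p^r q^{n-r}\,(r - np)^k\,(rq - np + rp). \]
Therefore
\[ pq\left[nk\,\mu_{k-1} + \frac{d\mu_k}{dp}\right] = \sum_{r=0}^{n} {}^nC_r\, p^r q^{n-r}\,(r - np)^{k+1} = \mu_{k+1}. \]
If we put k = 1, 2, 3, … we have
\[ \mu_2 = pq\left[n\mu_0 + \frac{d\mu_1}{dp}\right] = pq(n + 0) = npq, \quad \text{since } \mu_0 = 1 \text{ and } \mu_1 = 0, \]
\[ \mu_3 = pq\left[2n\mu_1 + \frac{d\mu_2}{dp}\right] = pq(nq - np + 2n\cdot 0) = npq(q - p), \]
\[ \mu_4 = pq\left[3n\mu_2 + \frac{d\mu_3}{dp}\right] = pq\left[3n\cdot np(1-p) + n(6p^2 - 6p + 1)\right] = npq\left[1 + 3(n-2)pq\right]. \]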
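The closed forms obtained from the recurrence relation can be compared with central moments computed directly from the definition. This is an illustrative sketch; the values n = 12 and p = 0.3 are arbitrary assumptions, and binomial_central_moment is a hypothetical helper.

```python
from math import comb

def binomial_central_moment(n, p, k):
    """k-th moment about the mean np, summed directly over r = 0..n."""
    q = 1 - p
    mean = n * p
    return sum((r - mean) ** k * comb(n, r) * p**r * q**(n - r)
               for r in range(n + 1))

n, p = 12, 0.3          # arbitrary illustrative parameters
q = 1 - p

mu2 = binomial_central_moment(n, p, 2)
mu3 = binomial_central_moment(n, p, 3)
mu4 = binomial_central_moment(n, p, 4)

print(mu2, n * p * q)
print(mu3, n * p * q * (q - p))
print(mu4, n * p * q * (1 + 3 * (n - 2) * p * q))
```

Each printed pair should agree to within floating-point rounding.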
$p(r) = {}^nC_r\, p^r q^{n-r}$; r = 0, 1, 2, ..., n, where r = number of successes in n trials, p =
probability of success in a single trial, and q = 1 - p.
Then the m.g.f. about the origin is
\[ M_0(t) = E(e^{tr}) = \sum_{r=0}^{n} e^{tr}\,{}^nC_r\, p^r q^{n-r} = \sum_{r=0}^{n} {}^nC_r\,(pe^t)^r q^{n-r} = (q + pe^t)^n, \]
and the m.g.f. about the mean np is
\[ M_{np}(t) = E[e^{t(r - np)}] = e^{-npt}E(e^{tr}) = e^{-npt}M_0(t) = (qe^{-pt} + pe^{qt})^n. \]
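Expanding the exponential,
\[ M_0(t) = \left[1 + pt + \frac{pt^2}{2!} + \frac{pt^3}{3!} + \dots\right]^n \]
\[ = 1 + {}^nC_1\left(pt + \frac{pt^2}{2!} + \frac{pt^3}{3!} + \dots\right) + {}^nC_2\left(pt + \frac{pt^2}{2!} + \frac{pt^3}{3!} + \dots\right)^2 + \dots \qquad \text{(i)} \]
\[ \mu'_1 = \text{coefficient of } t/1! = np, \]
\[ \mu'_2 = \text{coefficient of } t^2/2! = np + {}^nC_2\, 2!\,p^2 = np + \frac{n(n-1)}{2}\cdot 2p^2 = np + n(n-1)p^2. \]
Also, $\mu_2 = \mu'_2 - (\mu'_1)^2 = npq + (np)^2 - (np)^2 = npq$ = variance, and S.D. = $\sqrt{npq}$.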
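Alternative Method:
\[ \mu'_r = \left[\frac{d^r}{dt^r} M_0(t)\right]_{t=0} = \left[\frac{d^r}{dt^r}(q + pe^t)^n\right]_{t=0}. \]
On putting r = 1, 2, 3, … , we get
\[ \mu'_1 = \left[\frac{d}{dt}(q + pe^t)^n\right]_{t=0} = \left[npe^t(q + pe^t)^{n-1}\right]_{t=0} = np(q + p)^{n-1} = np, \]
\[ \mu'_2 = \left[\frac{d^2}{dt^2}(q + pe^t)^n\right]_{t=0} = \left[\frac{d}{dt}\, npe^t(q + pe^t)^{n-1}\right]_{t=0}
 = \left[npe^t(n-1)pe^t(q + pe^t)^{n-2} + npe^t(q + pe^t)^{n-1}\right]_{t=0} = n(n-1)p^2 + np. \]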
3.4.4. Mode of Binomial Distribution: (The most probable number of successes in a series of n
independent trials, the probability of success in each trial being p.)
Here we are to find the number of successes which has a greater probability than any
other. Let the probability of r successes be greater than or equal to that of r - 1 or r + 1
successes, i.e., p(r - 1) ≤ p(r) ≥ p(r + 1), i.e.
\[ {}^nC_{r-1}\, p^{r-1} q^{n-r+1} \le {}^nC_r\, p^r q^{n-r} \ge {}^nC_{r+1}\, p^{r+1} q^{n-r-1}. \]
Simplifying, we get
\[ \frac{r}{n-r+1}\cdot\frac{q}{p} \le 1 \ge \frac{n-r}{r+1}\cdot\frac{p}{q}
 \;\Rightarrow\; rq \le np - rp + p \ \text{and} \ rq + q \ge np - rp \]
\[ \Rightarrow\; np - q \le r \le np + p \;\Rightarrow\; (n+1)p - 1 \le r \le np + p. \]
Case l. If (n + 1)p = k (an integer), then probability will increase till r = k and it will be the
same for r = k - 1 and after that it will begin to decrease.
(i) If np is a whole number, the mean of the Binomial distribution coincides with the
greatest term. The frequency of r successes is greater than that of r – 1 successes if
${}^nC_{r-1}\, p^{r-1} q^{n-r+1} < {}^nC_r\, p^r q^{n-r}$,
i.e., $\frac{q}{n-r+1} < \frac{p}{r}$, or $r < np + p$. Similarly the frequency of r successes is greater than that
of r + 1 successes if ${}^nC_{r+1}\, p^{r+1} q^{n-r-1} < {}^nC_r\, p^r q^{n-r}$, which implies that r > np – q. Thus if np is
a whole number, r = np gives the greatest term and also the mean of the Binomial distribution.
(ii) The difference of the mean and mode of the Binomial distribution is not greater than unity.
Since the mean of the Binomial distribution = np, there are three cases for the mode:
(a) If np is a positive integer then the mode and mean are equal, therefore
Example 1. Criticize the statement: For a Binomial distribution the mean is 5 and the standard
deviation is 3.
Solution. For the Binomial distribution we are given that mean = np = 5 and standard
deviation = $\sqrt{npq}$ = 3, so that npq = 9. Hence
\[ q = \frac{npq}{np} = \frac{9}{5} = 1.8 > 1, \]
which is impossible. Hence the given statement is not correct.
Example 2. The probability of a head in a single tossing of a biased coin is 3/5. Find the
most probable number of heads and the mean of number of heads in 99 tossing of a coin.
Solution: Let on tossing 99 times the number of getting heads are = 0, 1, 2, 3, … , 99. We
have given that and here p = 0.6 , q = 0.4. Therefore the probability, distribution is
Solution : Given:
x : 0    1            2            …    n            Total
f : 1    ${}^nC_1$    ${}^nC_2$    …    ${}^nC_n$    $(1 + 1)^n = 2^n$
Example 5. Assuming that half the population are consumers of chocolate, so that the
chance of an individual being a consumer is ½, and assuming that 100 investigators each
take 10 individuals to see whether they are consumers, how many investigators would you
expect to report that three people or less were consumers?
\[ = 100\left(\tfrac12\right)^{10}\left[1 + 10 + 45 + 120\right] = \frac{17600}{1024} \approx 17 \ \text{(approx.)}. \]
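The expected number of investigators in Example 5 follows from the cumulative binomial probability of 0, 1, 2 or 3 consumers in a sample of 10. A short sketch under the assumptions of the example (p = 1/2, 100 investigators); expected_reports is a hypothetical helper name.

```python
from math import comb

def expected_reports(n_investigators, sample_size, p, max_consumers):
    """Expected number of investigators finding at most max_consumers consumers."""
    q = 1 - p
    prob = sum(comb(sample_size, r) * p**r * q**(sample_size - r)
               for r in range(max_consumers + 1))
    return n_investigators * prob

expected = expected_reports(100, 10, 0.5, 3)
print(expected)
```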
Example 6. Determine the Binomial distribution for which the mean is 4 and the variance is 3,
and find its mode.
Solution: Let $P(X = r) = {}^nC_r\, p^r q^{n-r}$, r = 0, 1, 2, ...., n. It is given that mean = np = 4 and
variance = npq = 3, which implies that q = ¾, p = 1/4 and n = 16.
Thus the Binomial distribution is $P(X = r) = {}^{16}C_r\,(1/4)^r (3/4)^{16-r}$, r = 0, 1, 2, 3, …, 16.
Since (n + 1)p = 17/4 = 4.25 is not an integer, the mode is its integral part, i.e., r = 4.
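Example 7. The following results are obtained when 100 batches of seeds were allowed to
germinate on damp filter paper in a laboratory: $\beta_1 = \frac{1}{15}$, $\beta_2 = \frac{89}{30}$. Determine the Binomial
distribution. Calculate the expected frequency for x = 8 assuming p > q.
Solution: For the Binomial distribution we are given that
\[ \beta_1 = \frac{(1 - 2p)^2}{npq} = \frac{1}{15} \quad \text{and} \quad \beta_2 = 3 + \frac{1 - 6pq}{npq} = \frac{89}{30}, \]
i.e.
\[ \frac{(1 - 2p)^2}{npq} = \frac{1}{15} \quad \text{and} \quad \frac{1 - 6pq}{npq} = \frac{89}{30} - 3 = -\frac{1}{30}. \]
Dividing,
\[ \frac{(1 - 2p)^2}{1 - 6pq} = \frac{1/15}{-1/30} = -2, \]
so that, with q = 1 − p,
\[ 16p^2 - 16p + 3 = 0, \]
\[ (4p - 1)(4p - 3) = 0, \]
p = ¼ or p = ¾ and so q = 3/4 or q = ¼.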
Example 9. The following data are the number of seeds germinating out of 10 on damp
filter for 80 sets of seeds. Fit a Binomial distribution to these data :
x 0 1 2 3 4 5 6 7 8 9 10 Total
f 6 20 28 12 8 6 0 0 0 0 0 80
\[ \text{A.M.} = \frac{\sum fx}{\sum f} = \frac{1\times 20 + 2\times 28 + 3\times 12 + 4\times 8 + 5\times 6}{80} = \frac{174}{80} = 2.175. \]
Now, mean = np with n = 10 implies that p =
0.2175 and q = 0.7825. Hence the required Binomial distribution for the given data is
$80(0.7825 + 0.2175)^{10}$. From this expansion the successive frequencies of 0, 1, 2, ..., 10 successes
are 6.9, 19.1, 24.0, 17.8, 8.6, 2.9, 0.7, 0.1, 0, 0, 0 respectively.
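The fitting procedure of Example 9 (estimate p from the sample mean, then expand N(q + p)^n) can be sketched as follows. The variable names are illustrative; the computed frequencies may differ slightly in the last decimal from hand-rounded values.

```python
from math import comb

x_values = list(range(11))                       # number of seeds germinating
observed = [6, 20, 28, 12, 8, 6, 0, 0, 0, 0, 0]  # observed frequencies
n_trials = 10                                    # seeds per set

total = sum(observed)
mean = sum(x * f for x, f in zip(x_values, observed)) / total
p_hat = mean / n_trials                          # from mean = n p
q_hat = 1 - p_hat

# Successive terms of N (q + p)^n give the fitted frequencies.
fitted = [total * comb(n_trials, r) * p_hat**r * q_hat**(n_trials - r)
          for r in x_values]

for r, freq in zip(x_values, fitted):
    print(r, round(freq, 1))
```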
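Solution: We are given that $P(X = 1) = {}^5C_1\, p\, q^4 = 0.4096$ and $P(X = 2) = {}^5C_2\, p^2 q^3 = 0.2048$, which implies
that
\[ \frac{P(X = 1)}{P(X = 2)} = \frac{5pq^4}{10p^2q^3} = \frac{1 - p}{2p} = \frac{0.4096}{0.2048} = 2 \;\Rightarrow\; p = 0.2. \]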
Example 11. Six dice are thrown 729 times. Find the probability of obtaining five or six on at
least three dice.
Solution: We know that the probability of getting five or six on a throw of a die = 2/6 = 1/3 =
p. Then q = 1 – p = 2/3. Therefore the probability of getting five or six on at least three
dice out of six = P(3) + P(4) + P(5) + P(6),
\[ = {}^6C_3\,(1/3)^3(2/3)^3 + {}^6C_4\,(1/3)^4(2/3)^2 + {}^6C_5\,(1/3)^5(2/3) + (1/3)^6 = \frac{160 + 60 + 12 + 1}{729} = \frac{233}{729}. \]
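We know that under the conditions, p(r), the probability of r successes in the Binomial
distribution, is given by
\[ p(r) = {}^nC_r\, p^r q^{n-r}
 = \frac{\left(1 - \frac1n\right)\left(1 - \frac2n\right)\cdots\left(1 - \frac{r-1}{n}\right)}{r!}\,(np)^r\,
 \frac{\left(1 - \frac{np}{n}\right)^n}{\left(1 - \frac{np}{n}\right)^r}. \]
Taking the limit as n → ∞ and p → 0 with np = m held fixed,
\[ p(r) = \lim_{\substack{n\to\infty \\ p\to 0 \\ np = m}} b(n, p; r)
 = \lim \frac{\left(1 - \frac1n\right)\left(1 - \frac2n\right)\cdots\left(1 - \frac{r-1}{n}\right)}{r!}\, m^r\,
 \frac{\left(1 - \frac{m}{n}\right)^n}{\left(1 - \frac{m}{n}\right)^r}
 = \frac{m^r e^{-m}}{r!}. \]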
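The limiting process above can be illustrated numerically: holding m = np fixed and letting n grow, the binomial probabilities approach the Poisson probabilities. A hedged sketch; m = 2 and the chosen values of n are arbitrary assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, r):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(m, r):
    return m**r * exp(-m) / factorial(r)

m = 2.0   # fixed mean np
# Largest absolute gap between the two distributions over r = 0..10.
gaps = {
    n: max(abs(binomial_pmf(n, m / n, r) - poisson_pmf(m, r)) for r in range(11))
    for n in (10, 100, 10000)
}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n increases, which is the behaviour the limit describes.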
(ii) This distribution is useful in solving problems of the following types (some examples): (a)
The number of cars passing through a certain street in time t.
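Moments about origin:
\[ \mu'_k = E(X^k) = \sum_{r=0}^{\infty} r^k\, p(r) = \sum_{r=0}^{\infty} r^k\, \frac{m^r e^{-m}}{r!}, \quad (k = 1, 2, 3, 4, \dots). \]
First moment
\[ \mu'_1 = E(X) = \sum_{r=0}^{\infty} r\,\frac{m^r e^{-m}}{r!} = \sum_{r=1}^{\infty} \frac{m^r e^{-m}}{(r-1)!}
 = e^{-m}\left[m + \frac{m^2}{1!} + \frac{m^3}{2!} + \dots\right]
 = me^{-m}\left[1 + \frac{m}{1!} + \frac{m^2}{2!} + \dots\right] = me^{-m}e^{m} = m. \]
Second moment
\[ \mu'_2 = E(X^2) = \sum_{r=0}^{\infty} r^2\,\frac{m^r e^{-m}}{r!} = \sum_{r=0}^{\infty} \{r(r-1) + r\}\,\frac{m^r e^{-m}}{r!}
 = \sum_{r=2}^{\infty} \frac{m^r e^{-m}}{(r-2)!} + \sum_{r=1}^{\infty} \frac{m^r e^{-m}}{(r-1)!}
 = m^2 e^{-m}e^{m} + m = m^2 + m. \]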
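Third moment: $\mu'_3 = E(X^3) = m^3 + 3m^2 + m$, and
\[ \mu'_4 = E(X^4) = \sum_{r=0}^{\infty} r^4\,\frac{m^r e^{-m}}{r!}
 = \sum_{r=0}^{\infty} \{r(r-1)(r-2)(r-3) + 6r(r-1)(r-2) + 7r(r-1) + r\}\,\frac{m^r e^{-m}}{r!}
 = m^4 + 6m^3 + 7m^2 + m. \]
Moments about the mean:
$\mu_1 = 0$ always,
$\mu_2 = \mu'_2 - (\mu'_1)^2 = m^2 + m - m^2 = m$,
$\mu_3 = m$, and $\mu_4 = 3m^2 + m$.
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{m^2}{m^3} = \frac1m, \quad
 \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3m^2 + m}{m^2} = 3 + \frac1m, \quad
 \gamma_1 = \sqrt{\beta_1} = \frac{1}{\sqrt m}, \quad
 \gamma_2 = \beta_2 - 3 = \frac1m. \]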
Note: (i) For Poisson's distribution mean = variance = m > 0 always, so the distribution is
always positively skewed. (ii) Both $\gamma_1$ and $\gamma_2$ tend to 0 as m tends to infinity.
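The moments about the origin of the Poisson distribution can be checked by summing the series numerically. A minimal sketch; m = 2.5 is an arbitrary illustrative mean and poisson_raw_moment is a hypothetical helper that truncates the series after a fixed number of terms.

```python
from math import exp

def poisson_raw_moment(m, k, terms=200):
    """Sum r^k * m^r e^{-m} / r! for r = 0..terms-1, updating the pmf iteratively."""
    total = 0.0
    pmf = exp(-m)                 # probability of r = 0
    for r in range(terms):
        total += r**k * pmf
        pmf *= m / (r + 1)        # p(r+1) = p(r) * m / (r+1)
    return total

m = 2.5   # arbitrary illustrative mean
raw = [poisson_raw_moment(m, k) for k in (1, 2, 3, 4)]
closed = [m, m**2 + m, m**3 + 3 * m**2 + m, m**4 + 6 * m**3 + 7 * m**2 + m]

for numeric, formula in zip(raw, closed):
    print(numeric, formula)
```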
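3.5.2. Recurrence Relation for the Moments of Poisson Distribution. For Poisson's
distribution with mean m,
\[ \mu_{k+1} = mk\,\mu_{k-1} + m\frac{d\mu_k}{dm}, \]
where $\mu_k$ is the kth moment about the mean.
Proof : Since
\[ \mu_k = \sum_{r=0}^{\infty} (r - m)^k\, p(r) = \sum_{r=0}^{\infty} (r - m)^k\,\frac{m^r e^{-m}}{r!}, \]
differentiating with respect to m,
\[ \frac{d\mu_k}{dm} = -k\sum_{r=0}^{\infty} (r - m)^{k-1}\frac{m^r e^{-m}}{r!}
 + \sum_{r=0}^{\infty} (r - m)^k\,\frac{r m^{r-1}e^{-m} - m^r e^{-m}}{r!}
 = -k\,\mu_{k-1} + \sum_{r=0}^{\infty} (r - m)^k\,\frac{e^{-m} m^{r-1}(r - m)}{r!}. \]
Therefore
\[ m\frac{d\mu_k}{dm} = -mk\,\mu_{k-1} + \sum_{r=0}^{\infty} (r - m)^{k+1}\frac{e^{-m} m^r}{r!} = -mk\,\mu_{k-1} + \mu_{k+1}. \]
If we put k = 1, 2, 3, … we have
\[ \mu_2 = m\mu_0 + m\frac{d\mu_1}{dm} = m, \quad
 \mu_3 = 2m\mu_1 + m\frac{d\mu_2}{dm} = m, \quad
 \mu_4 = 3m\mu_2 + m\frac{d\mu_3}{dm} = 3m^2 + m, \]
etc.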
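Let $p(X = r) = \frac{m^r e^{-m}}{r!}$; r = 0, 1, 2, ... , be a Poisson distribution. Then the m.g.f. about the
origin is
\[ M_0(t) = E(e^{tr}) = \sum_{r=0}^{\infty} e^{tr}\,\frac{m^r e^{-m}}{r!} = e^{-m}\sum_{r=0}^{\infty} \frac{(me^t)^r}{r!}
 = e^{-m}e^{me^t} = e^{m(e^t - 1)}, \]
and the m.g.f. about the mean m is
\[ M_m(t) = E(e^{t(r-m)}) = e^{-tm}E(e^{tr}) = e^{-tm}M_0(t) = e^{-tm}e^{-m(1 - e^t)} = e^{me^t - mt - m}. \]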
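\[ M_0(t) = 1 + m(e^t - 1) + \frac{m^2}{2!}(e^t - 1)^2 + \frac{m^3}{3!}(e^t - 1)^3 + \frac{m^4}{4!}(e^t - 1)^4 + \dots \]
\[ = 1 + m\left(t + \frac{t^2}{2!} + \frac{t^3}{3!} + \dots\right) + \frac{m^2}{2}\left(t + \frac{t^2}{2!} + \dots\right)^2
 + \frac{m^3}{6}\left(t + \frac{t^2}{2!} + \dots\right)^3 + \frac{m^4}{24}\left(t + \frac{t^2}{2!} + \dots\right)^4 + \dots \]
Similarly $\mu'_3$, $\mu'_4$, … etc. may be obtained. Also the moments about the mean can be obtained by
expanding $M_m(t)$.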
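Alternative Method:
\[ \mu'_r = \left[\frac{d^r}{dt^r} M_0(t)\right]_{t=0} = \left[\frac{d^r}{dt^r} e^{m(e^t - 1)}\right]_{t=0}. \]
On putting r = 1, 2, 3, … , we get
\[ \mu'_1 = \left[\frac{d}{dt} e^{m(e^t - 1)}\right]_{t=0} = \left[e^{-m}e^{me^t}me^t\right]_{t=0} = m, \]
\[ \mu'_2 = \left[\frac{d^2}{dt^2} e^{m(e^t - 1)}\right]_{t=0}
 = \left[e^{-m}\left\{e^{me^t}(me^t)^2 + e^{me^t}me^t\right\}\right]_{t=0}
 = \left[e^{-m}e^{me^t}(m^2e^{2t} + me^t)\right]_{t=0} = m^2 + m, \ \text{etc.} \]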
The value of r which has a greater probability than any other value is the mode of the
Poisson's distribution. Let the probability of r successes be greater than or equal to that of
r - 1 or r + 1 successes, i.e., p(r - 1) ≤ p(r) ≥ p(r + 1),
\[ \text{i.e.,} \quad \frac{m^{r-1}e^{-m}}{(r-1)!} \le \frac{m^r e^{-m}}{r!} \ge \frac{m^{r+1}e^{-m}}{(r+1)!}. \]
Case 2. If m is not a positive integer, then there is one mode, namely the integral value between (m
– 1) and m.
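Example 3. Find the probability that at most 5 defective fuses will be found in a box of 200
fuses if experience shows that 2 percent of such fuses are defective.
Here m = np = 200 × 0.02 = 4, and
\[ e^{-4} = 1 - 4 + \frac{4^2}{2} - \frac{4^3}{6} + \frac{4^4}{24} - \dots = 0.0183. \]
Hence
\[ P(r \le 5) = \sum_{r=0}^{5} \frac{4^r e^{-4}}{r!}
 = e^{-4}\left(1 + 4 + \frac{4^2}{2} + \frac{4^3}{6} + \frac{4^4}{24} + \frac{4^5}{120}\right) = 0.7845. \]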
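The cumulative Poisson probability in Example 3 can be evaluated without rounding e^{-4}. This sketch assumes m = np = 4 as in the example; poisson_cdf is a hypothetical helper. With full precision the result is about 0.785; the hand value 0.7845 arises from using e^{-4} rounded to 0.0183.

```python
from math import exp, factorial

def poisson_cdf(m, k):
    """P(X <= k) for a Poisson variate with mean m."""
    return exp(-m) * sum(m**r / factorial(r) for r in range(k + 1))

m = 200 * 0.02                 # mean number of defective fuses per box
p_at_most_5 = poisson_cdf(m, 5)
print(p_at_most_5)
```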
Example 4. Six coins are tossed 6400 times. Using the Poisson distribution, find the approximate
probability of getting six heads x times and 2 times.
Solution: Let the coins be unbiased, so that the probability of getting a head = the probability of
getting a tail = 1/2 for each coin. The probability of getting six heads in one toss of the six
coins is then $(1/2)^6$ = 1/64 = p (say). Then m = np = 6400(1/64) = 100.
Now the Poisson's distribution is
\[ p(X = r) = \frac{m^r e^{-m}}{r!} = \frac{100^r e^{-100}}{r!}. \]
Therefore the required probability is
\[ p(X = 2) = \frac{100^2 e^{-100}}{2!} = 5000\,e^{-100}. \]
Solution: The Poisson's distribution is $p(X = r) = \frac{m^r e^{-m}}{r!}$, r = 0, 1, 2, … , ∞. Then
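Example 6. In a Poisson's distribution with unit mean, show that the mean deviation from the
mean is 2/e times the standard deviation, i.e., E|X - 1| = 2/e.
Solution: The Poisson's distribution is $p(X = r) = \frac{m^r e^{-m}}{r!}$, r = 0, 1, 2, … , ∞. Here mean =
m = 1, so S.D. = 1. Therefore $p(X = r) = \frac{e^{-1}}{r!}$. Now the mean deviation from the mean
\[ = \sum_{r=0}^{\infty} |r - m|\,p(r) = \sum_{r=0}^{\infty} |r - 1|\,\frac{e^{-1}}{r!} = \frac1e \sum_{r=0}^{\infty} \frac{|r - 1|}{r!}
 = \frac1e\left[1 + \frac{1}{2!} + \frac{2}{3!} + \frac{3}{4!} + \frac{4}{5!} + \dots\right] \]
\[ = \frac1e\left[1 + \left(\frac{1}{1!} - \frac{1}{2!}\right) + \left(\frac{1}{2!} - \frac{1}{3!}\right) + \left(\frac{1}{3!} - \frac{1}{4!}\right) + \dots\right]
 = \frac1e(1 + 1) = 2/e. \]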
Solution: The Poisson's distribution is $p(X = r) = \frac{m^r e^{-m}}{r!}$, r = 0, 1, 2, … , ∞. Here two
modes are given, so the mean must be an integer and the modes are m – 1 and m.
Therefore, m – 1 = 3, i.e., m = 4. Now
\[ p(X = 3) = \frac{4^3 e^{-4}}{3!} \quad \text{and} \quad p(X = 4) = \frac{4^4 e^{-4}}{4!} = \frac{4^3 e^{-4}}{3!}. \]
Hence the required probability is
\[ p(X = 3 \text{ or } 4) = p(X = 3) + p(X = 4) = \frac{4^3 e^{-4}}{3!} + \frac{4^3 e^{-4}}{3!} = \frac{64}{3}e^{-4}. \]
with parameter m1 + m2. Hence x1 + x2 has a Poisson‘s distribution with mean m1 + m2.
Example 9. In a certain factory turning out razor blades, there is a small chance 1/500 for any
blade to be defective. The blades are in packets of 10. Use Poisson's distribution to
calculate the approximate number of packets containing no defective, one defective and
two defective blades in a consignment of 10,000 packets.
Solution: The probability distribution is $p(X = r) = \frac{m^r e^{-m}}{r!}$, r = 0, 1, 2, … , ∞. Then the
expected frequency = $N\,p(X = r) = N\frac{m^r e^{-m}}{r!}$, r = 0, 1, 2, 3, … , ∞. Here N = 10,000 and m = np =
10(1/500) = 0.02, so that $e^{-0.02}$ = 0.9802. Then the respective expected frequencies for the
number of packets containing no defective, one defective and two defective blades are (i)
$Ne^{-m}$ = 10000 × 0.9802 = 9802,
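The expected numbers of packets in Example 9 follow from N times the Poisson probabilities. A minimal sketch assuming N = 10,000 packets and m = 10 × (1/500) = 0.02; expected_packets is a hypothetical helper name.

```python
from math import exp, factorial

def expected_packets(n_packets, m, r):
    """Expected number of packets containing exactly r defective blades."""
    return n_packets * exp(-m) * m**r / factorial(r)

m = 10 * (1 / 500)             # mean number of defectives per packet
counts = [round(expected_packets(10_000, m, r)) for r in range(3)]
print(counts)                  # packets with 0, 1 and 2 defective blades
```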
Example 10. Fit a Poisson's distribution to the following and calculate the theoretical
frequencies:
Deaths : 0 1 2 3 4
Frequencies : 122 60 15 2 1
Solution :
x    : 0    1   2   3   4   Total
f    : 122  60  15  2   1   200
fx   : 0    60  30  6   4   100
fx²  : 0    60  60  18  16  154
Mean m = ∑fx/∑f = 100/200 = 0.5.
Therefore $e^{-m} = e^{-0.5} = 0.61$ and the theoretical frequency of r deaths is given by
\[ N\frac{m^r e^{-m}}{r!} = 200 \times 0.61 \times \frac{(0.5)^r}{r!}. \]
Therefore the Poisson's distribution will be
Example 11. In a book of 300 pages, a proof reader finds no error in 200 pages, in 75 pages
one error on each page, in 20 pages two errors on each page and in 5 pages 3 errors on each
page. Fit a Poisson distribution to these data and calculate the theoretical frequencies.
[$e^{-0.43}$ = 0.6505]
Here ∑fx = 75 + 40 + 15 = 130. Then mean m = 130/300 = 0.43 and $e^{-0.43}$ = 0.6505, so the theoretical
frequency of r errors is given by
\[ N\frac{m^r e^{-m}}{r!} = 300 \times 0.6505 \times \frac{(0.43)^r}{r!}. \]
Therefore the Poisson's distribution will be
according to the table shown below:
(General Case). Let $N(q + p)^n$ be the binomial distribution where p ≠ q but p ≈ q, i.e. |p
– q| is small. Now the ratio of the frequencies f(r + 1) and f(r) of r + 1 and r successes
respectively is
\[ \frac{f(r+1)}{f(r)} = \frac{{}^nC_{r+1}\, p^{r+1} q^{n-r-1}}{{}^nC_r\, p^r q^{n-r}} = \frac{n-r}{r+1}\cdot\frac{p}{q}. \]
The frequency of r successes is greater than the frequency of (r + 1) successes, i.e. f(r) > f(r + 1), if
\[ \frac{f(r+1)}{f(r)} = \frac{n-r}{r+1}\cdot\frac{p}{q} < 1, \]
or if np – rp < rq + q, or if rq + rp > np – q, i.e. if r > np – q. Similarly
the frequency of r successes is also greater than that of (r - 1) successes if r < np + p.
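Taking np to be an integer, the maximum frequency is
\[ y_0 = f(np) = N\,{}^nC_{np}\, p^{np} q^{n-np} = N\frac{n!}{(np)!\,(nq)!}\, p^{np} q^{nq}. \]
Then the frequency of np + x successes is
\[ y_x = f(np + x) = N\frac{n!}{(np + x)!\,(nq - x)!}\, p^{np+x} q^{nq-x}. \]
Therefore
\[ \frac{y_x}{y_0} = \frac{(np)!\,(nq)!}{(np + x)!\,(nq - x)!}\, p^x q^{-x}. \]
Now for large n, by James Stirling's formula $n! \approx e^{-n} n^{n+\frac12}\sqrt{2\pi}$,
we have
\[ \frac{y_x}{y_0} = \frac{e^{-np}(np)^{np+\frac12}\sqrt{2\pi}\; e^{-nq}(nq)^{nq+\frac12}\sqrt{2\pi}}
 {e^{-(np+x)}(np + x)^{np+x+\frac12}\sqrt{2\pi}\; e^{-(nq-x)}(nq - x)^{nq-x+\frac12}\sqrt{2\pi}}\; p^x q^{-x}
 = \frac{1}{\left(1 + \frac{x}{np}\right)^{np+x+\frac12}\left(1 - \frac{x}{nq}\right)^{nq-x+\frac12}}. \]
Taking logarithms,
\[ \log\frac{y_x}{y_0} = -\left(np + x + \tfrac12\right)\log\left(1 + \frac{x}{np}\right) - \left(nq - x + \tfrac12\right)\log\left(1 - \frac{x}{nq}\right) \]
\[ = -\left(np + x + \tfrac12\right)\left(\frac{x}{np} - \frac{x^2}{2n^2p^2} + \frac{x^3}{3n^3p^3} - \dots\right)
 + \left(nq - x + \tfrac12\right)\left(\frac{x}{nq} + \frac{x^2}{2n^2q^2} + \frac{x^3}{3n^3q^3} + \dots\right) \]
\[ = x\left(-1 + 1 - \frac{1}{2np} + \frac{1}{2nq}\right)
 + x^2\left(\frac{1}{2np} - \frac{1}{np} + \frac{1}{4n^2p^2} + \frac{1}{2nq} - \frac{1}{nq} + \frac{1}{4n^2q^2}\right) + \dots \]
\[ = \frac{p - q}{2npq}\,x + \left(\frac{p^2 + q^2}{4n^2p^2q^2} - \frac{1}{2npq}\right)x^2 + \text{terms of higher order}. \]
Neglecting terms containing $1/n^2$, we get
\[ \log\frac{y_x}{y_0} = \frac{(p - q)\,x}{2npq} - \frac{x^2}{2npq}. \]
Since p < 1, q < 1 and p − q is very small in comparison with n, the first term can be
neglected. Hence
\[ \log\frac{y_x}{y_0} = -\frac{x^2}{2npq}. \]
Therefore $y_x = y_0 e^{-x^2/2npq} = y_0 e^{-x^2/2\sigma^2}$, where $\sigma^2 = npq$.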
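The quality of this approximation can be examined by comparing the exact ratio of binomial frequencies with exp(-x^2 / 2npq). A small sketch under the assumption that np is an integer; n = 1000, p = 1/2 and x = 10 are arbitrary illustrative values and frequency_ratio is a hypothetical helper.

```python
from math import comb, exp

def frequency_ratio(n, p, x):
    """Exact ratio y_x / y_0 of the binomial frequencies at np + x and np."""
    q = 1 - p
    centre = round(n * p)      # np is assumed to be an integer
    # Divide the integer coefficients first to keep the numbers in float range.
    return (comb(n, centre + x) / comb(n, centre)) * (p / q) ** x

n, p, x = 1000, 0.5, 10
exact = frequency_ratio(n, p, x)
approx = exp(-x**2 / (2 * n * p * (1 - p)))
print(exact, approx)
```

For these values the two numbers agree closely, and the agreement improves as n grows with x fixed.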
(Particular Case). Let $N(q + p)^n$ be the binomial distribution where p = q. If p = q then p = q
= 1/2 and consequently the binomial distribution is symmetric. Now the frequency of r
successes is $f(r) = N\,{}^nC_r\, p^r q^{n-r} = N\,{}^nC_r\,(1/2)^n$. Without loss of generality, we assume that n is
an even integer, say n = 2k. Since n → ∞, k → ∞ also, and the frequency is $f(r) = N\,{}^{2k}C_r\,(1/2)^{2k}$. Now the
ratio of the frequencies f(r + 1) and f(r) of r + 1 and r successes respectively is
\[ \frac{f(r+1)}{f(r)} = \frac{{}^{2k}C_{r+1}}{{}^{2k}C_r} = \frac{2k - r}{r + 1}. \]
The frequency of r successes is greater than the frequency
of (r + 1) successes, i.e. f(r) > f(r + 1), if
\[ \frac{f(r+1)}{f(r)} = \frac{2k - r}{r + 1} < 1, \]
or if 2k – r < r + 1, or if r > k – 1/2. Similarly the
frequency of r successes is also greater than that of (r - 1) successes if r < k +
½. Thus we observe that if k – ½ < r < k + ½, the frequency corresponding to r
successes will be greatest. Clearly r = k is the number of successes for which the
frequency is maximum. Suppose $y_0$ is the maximum frequency; then
\[ y_0 = N\,{}^{2k}C_k\,(1/2)^{2k} = N\frac{(2k)!}{k!\,k!}\,(1/2)^{2k}. \]
Then the frequency of k + x successes is
\[ y_x = N\frac{(2k)!}{(k + x)!\,(k - x)!}\,(1/2)^{2k}. \]
Therefore
\[ \frac{y_x}{y_0} = \frac{k!\,k!}{(k + x)!\,(k - x)!} = \frac{k(k-1)(k-2)\cdots(k - x + 1)}{(k + x)(k + x - 1)\cdots(k + 1)}
 = \frac{\left(1 - \frac1k\right)\left(1 - \frac2k\right)\cdots\left(1 - \frac{x-1}{k}\right)}
 {\left(1 + \frac1k\right)\left(1 + \frac2k\right)\cdots\left(1 + \frac{x}{k}\right)}. \]
Then
\[ \log\frac{y_x}{y_0} = \left[\log\left(1 - \tfrac1k\right) + \log\left(1 - \tfrac2k\right) + \dots + \log\left(1 - \tfrac{x-1}{k}\right)\right]
 - \left[\log\left(1 + \tfrac1k\right) + \log\left(1 + \tfrac2k\right) + \dots + \log\left(1 + \tfrac{x}{k}\right)\right]. \]
On expanding each logarithmic term and neglecting the higher powers of x/k, we get (as
k → ∞)
\[ \log\frac{y_x}{y_0} = -\frac1k\{1 + 2 + 3 + \dots + (x - 1)\} - \frac1k\{1 + 2 + 3 + \dots + (x - 1) + x\} \]
\[ = -\frac2k\{1 + 2 + 3 + \dots + (x - 1)\} - \frac{x}{k} = -\frac2k\cdot\frac{(x - 1)(x - 1 + 1)}{2} - \frac{x}{k}
 = -\frac{(x - 1)x}{k} - \frac{x}{k} = -\frac{x^2}{k}. \]
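(c) Determination of $y_0$. Suppose that $y_0$ is such that the total probability is 1, i.e.
\[ \int_{-\infty}^{\infty} y_0 e^{-x^2/2\sigma^2}\,dx = 1 \;\Rightarrow\; 2y_0\int_0^{\infty} e^{-x^2/2\sigma^2}\,dx = 1. \]
Put $x/(\sigma\sqrt2) = t$, $dx = \sigma\sqrt2\,dt$; then we get
\[ 2\sqrt2\,\sigma y_0\int_0^{\infty} e^{-t^2}\,dt = 1 \;\Rightarrow\; 2\sqrt2\,\sigma y_0\cdot\frac{\sqrt\pi}{2} = 1
 \;\Rightarrow\; y_0 = \frac{1}{\sigma\sqrt{2\pi}}. \]
Hence
\[ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2} \]
is the standard form of the normal distribution. If the total frequency is N, the corresponding normal
distribution is
\[ y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}. \]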
If the origin is changed so that the mean is at the point (m = np, 0), where x – m is the excess of
the variate over the mean, then the corresponding normal distribution is
\[ y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}. \]
This distribution is called the Normal (or Gaussian) Distribution; here m and σ are called the
parameters of the distribution. A curve given by $y_x = y_0 e^{-(x-m)^2/2\sigma^2}$ is said to be a normal
curve; when the origin is taken at the mean, $y_x = y_0 e^{-x^2/2\sigma^2}$.
If we replace x by –x in the equation of the normal curve $y_x = y_0 e^{-x^2/2\sigma^2}$, the
equation remains unchanged. Hence the normal curve is symmetric about the y-axis.
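\[ y = y_0 e^{-x^2/2\sigma^2} \;\Rightarrow\; \frac{dy}{dx} = -\frac{y_0}{\sigma^2}\,x\,e^{-x^2/2\sigma^2}
 \quad \text{and} \quad \frac{d^2y}{dx^2} = -\frac{y_0}{\sigma^2}\left(1 - \frac{x^2}{\sigma^2}\right)e^{-x^2/2\sigma^2}. \]
Then $\frac{dy}{dx} = 0 \Rightarrow x = 0$, and at x = 0, $\frac{d^2y}{dx^2} < 0$. Hence at x = 0 y is maximum, i.e., x = 0 is the mode of the
normal distribution.
For the median $M_d$,
\[ \frac{1}{\sqrt\pi}\int_{-\infty}^{M_d/\sigma\sqrt2} e^{-t^2}\,dt = \frac12, \quad \text{and since} \quad
 \frac{1}{\sqrt\pi}\int_{-\infty}^{0} e^{-t^2}\,dt = \frac12, \quad \text{we get} \quad
 \int_0^{M_d/\sigma\sqrt2} e^{-t^2}\,dt = 0 \;\Rightarrow\; M_d = 0. \]
Hence the mean, median and mode
coincide at the origin.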
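(3) The points of inflexion of the normal curve are given by x = ±σ.
Since if $y = y_0 e^{-x^2/2\sigma^2}$, then
\[ \frac{dy}{dx} = -\frac{y_0}{\sigma^2}\,x\,e^{-x^2/2\sigma^2}, \qquad
 \frac{d^2y}{dx^2} = -\frac{y_0}{\sigma^2}\left(1 - \frac{x^2}{\sigma^2}\right)e^{-x^2/2\sigma^2}. \]
Then $\frac{d^2y}{dx^2} = 0 \Rightarrow 1 - \frac{x^2}{\sigma^2} = 0 \Rightarrow x = \pm\sigma$. Now
\[ \frac{d^3y}{dx^3} = \frac{x\,y_0}{\sigma^4}\left(3 - \frac{x^2}{\sigma^2}\right)e^{-x^2/2\sigma^2} \ne 0 \ \text{at} \ x = \pm\sigma. \]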
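Let $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$. Then
\[ \text{Mean} = E(x) = \mu'_1 = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} x\,e^{-(x-m)^2/2\sigma^2}\,dx \]
\[ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (m + \sigma z)\,e^{-z^2/2}\,dz, \quad \text{where } z = (x - m)/\sigma, \]
\[ = \frac{m}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-z^2/2}\,dz + \frac{\sigma}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z\,e^{-z^2/2}\,dz
 = \frac{2m}{\sqrt{2\pi}}\int_0^{\infty} e^{-z^2/2}\,dz, \]
(since the integrand is even in the first term and odd in the second term)
\[ = \frac{m}{\sqrt\pi}\int_0^{\infty} e^{-t}\,t^{-1/2}\,dt, \quad \text{where } t = z^2/2,\ dt = z\,dz, \]
\[ = \frac{m}{\sqrt\pi}\int_0^{\infty} e^{-t}\,t^{(1/2)-1}\,dt = \frac{m}{\sqrt\pi}\,\Gamma\left(\tfrac12\right) = \frac{m}{\sqrt\pi}\sqrt\pi = m. \]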
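\[ \text{Variance: } \operatorname{Var}(X) = E(X - m)^2 = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^2\,e^{-(x-m)^2/2\sigma^2}\,dx \]
\[ = \frac{\sigma^2}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z^2 e^{-z^2/2}\,dz, \quad \text{where } z = (x - m)/\sigma, \]
\[ = \frac{2\sigma^2}{\sqrt{2\pi}}\int_0^{\infty} z^2 e^{-z^2/2}\,dz, \quad \text{(since the integrand is even)} \]
\[ = \frac{2\sigma^2}{\sqrt\pi}\int_0^{\infty} t^{1/2} e^{-t}\,dt, \quad \text{where } t = z^2/2,\ dt = z\,dz, \]
\[ = \frac{2\sigma^2}{\sqrt\pi}\int_0^{\infty} e^{-t}\,t^{(3/2)-1}\,dt = \frac{2\sigma^2}{\sqrt\pi}\,\Gamma\left(\tfrac32\right) = \sigma^2. \]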
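Median: The median $M_d$ satisfies
\[ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{(M_d - m)/\sigma} e^{-z^2/2}\,dz = \frac12, \quad \text{while} \quad
 \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0} e^{-z^2/2}\,dz = \frac12. \]
On comparing, we get $(M_d - m)/\sigma = 0$, which implies that
$M_d = m$.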
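Mode: Now if
\[ y = f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}, \quad \text{then} \quad
 \frac{dy}{dx} = -\frac{1}{\sigma\sqrt{2\pi}}\cdot\frac{x - m}{\sigma^2}\, e^{-(x-m)^2/2\sigma^2} = 0 \;\Rightarrow\; x = m. \]
Also
\[ \frac{d^2y}{dx^2} = -\frac{1}{\sigma^3\sqrt{2\pi}}\left[1 - \frac{(x - m)^2}{\sigma^2}\right] e^{-(x-m)^2/2\sigma^2} < 0 \ \text{at} \ x = m. \]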
Integral Table, we have $Q_1 = m - 0.6745\sigma$. The third quartile $Q_3$ is given by
\[ \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{Q_3} e^{-(x-m)^2/2\sigma^2}\,dx = \frac34. \]
On solving the equation with the help of the 'Normal Probability
Integral Table' we have $Q_3 = m + 0.6745\sigma$.
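Hence the odd moments about origin are zero, i.e., $\mu'_1 = \mu'_3 = \mu'_5 = \dots = \mu'_{2n+1} = 0$.
Even moments.
\[ \mu'_{2n} = E(X^{2n}) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} x^{2n} e^{-x^2/2\sigma^2}\,dx
 = \frac{2}{\sigma\sqrt{2\pi}}\int_0^{\infty} x^{2n} e^{-x^2/2\sigma^2}\,dx, \]
[since the integrand is an even function].
Put $\frac{x^2}{2\sigma^2} = t$, $dx = \frac{\sigma\,dt}{\sqrt{2t}}$; we get
\[ \mu'_{2n} = \frac{2}{\sigma\sqrt{2\pi}}\int_0^{\infty} 2^n\sigma^{2n} t^n e^{-t}\,\frac{\sigma\,dt}{\sqrt{2t}}
 = \frac{2^n\sigma^{2n}}{\sqrt\pi}\int_0^{\infty} t^{n-\frac12} e^{-t}\,dt
 = \frac{2^n\sigma^{2n}}{\sqrt\pi}\,\Gamma\left(n + \tfrac12\right). \]
Similarly, $\mu'_{2n-2} = \frac{2^{n-1}\sigma^{2n-2}}{\sqrt\pi}\,\Gamma\left(n - \tfrac12\right)$. Then
\[ \frac{\mu'_{2n}}{\mu'_{2n-2}} = 2\sigma^2\,\frac{\Gamma\left(n + \frac12\right)}{\Gamma\left(n - \frac12\right)}
 = 2\sigma^2\left(n - \tfrac12\right) = \sigma^2(2n - 1) \;\Rightarrow\; \mu'_{2n} = \sigma^2(2n - 1)\,\mu'_{2n-2}. \]
This is the recurrence formula for even moments. On putting n = n–1, n–2, n-3, …, 3, 2, 1,
we have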
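\[ \mu_{2n+1} = E(X - m)^{2n+1} = \frac{\sigma^{2n+1}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z^{2n+1} e^{-z^2/2}\,dz \quad [\text{where } z = (x - m)/\sigma] \quad = 0 \]
[since the integrand is an odd function].
Hence the odd moments about mean are zero, i.e., $\mu_1 = \mu_3 = \mu_5 = \dots = \mu_{2n+1} = 0$.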
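Even moments.
\[ \mu_{2n} = E(X - m)^{2n} = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n} e^{-(x-m)^2/2\sigma^2}\,dx \]
\[ = \frac{\sigma^{2n}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z^{2n} e^{-z^2/2}\,dz \quad [\text{where } z = (x - m)/\sigma] \]
\[ = \frac{2\sigma^{2n}}{\sqrt{2\pi}}\int_0^{\infty} z^{2n} e^{-z^2/2}\,dz \quad [\text{since the integrand is an even function}] \]
\[ = \frac{2\sigma^{2n}}{\sqrt{2\pi}}\int_0^{\infty} (2t)^n e^{-t}\,\frac{dt}{\sqrt{2t}}, \quad \text{where } t = z^2/2, \]
\[ = \frac{2^n\sigma^{2n}}{\sqrt\pi}\int_0^{\infty} t^{n-\frac12} e^{-t}\,dt = \frac{2^n\sigma^{2n}}{\sqrt\pi}\,\Gamma\left(n + \tfrac12\right). \]
Similarly, $\mu_{2n-2} = \frac{2^{n-1}\sigma^{2n-2}}{\sqrt\pi}\,\Gamma\left(n - \tfrac12\right)$. Then
\[ \frac{\mu_{2n}}{\mu_{2n-2}} = 2\sigma^2\,\frac{\Gamma\left(n + \frac12\right)}{\Gamma\left(n - \frac12\right)}
 = 2\sigma^2\left(n - \tfrac12\right) = \sigma^2(2n - 1) \;\Rightarrow\; \mu_{2n} = \sigma^2(2n - 1)\,\mu_{2n-2}. \]
This is the recurrence formula for even moments. On putting n = n–1, n–2, n-3, …, 3, 2, 1,
we have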
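Note: When the mean of the normal distribution is zero, $\mu'_{2n} = \mu_{2n}$ and $\mu'_{2n+1} = \mu_{2n+1}$.
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0, \quad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\sigma^4}{(\sigma^2)^2} = 3, \quad
 \gamma_1 = \sqrt{\beta_1} = 0, \quad \gamma_2 = \beta_2 - 3 = 0. \]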
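Mean deviation from the mean:
\[ E(|X - m|) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} |x - m|\,e^{-(x-m)^2/2\sigma^2}\,dx
 = \frac{\sigma}{\sqrt{2\pi}}\int_{-\infty}^{\infty} |z|\,e^{-z^2/2}\,dz, \quad \text{where } z = (x - m)/\sigma, \]
\[ = \frac{\sigma}{\sqrt{2\pi}}\left[\int_{-\infty}^{0} (-z)\,e^{-z^2/2}\,dz + \int_0^{\infty} z\,e^{-z^2/2}\,dz\right]
 = \frac{2\sigma}{\sqrt{2\pi}}\int_0^{\infty} z\,e^{-z^2/2}\,dz, \]
\[ = \sigma\sqrt{\frac{2}{\pi}}\int_0^{\infty} e^{-t}\,dt \ (\text{where } t = z^2/2,\ dt = z\,dz) \ = \sigma\sqrt{\frac{2}{\pi}} = 0.7979\,\sigma \approx \frac45\sigma \ \text{(approx.)}. \]
\[ \text{Quartile Deviation: Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{(m + 0.6745\sigma) - (m - 0.6745\sigma)}{2} = 0.6745\,\sigma \approx \frac23\sigma. \]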
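Let $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$. Then the m.g.f. about the origin is
\[ M_0(t) = E(e^{tx}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{t(m + \sigma z)}\,e^{-z^2/2}\,dz, \quad \text{where } z = (x - m)/\sigma, \]
\[ = \frac{e^{tm}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{t\sigma z}\,e^{-z^2/2}\,dz
 = \frac{e^{tm}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12(z^2 - 2t\sigma z + t^2\sigma^2) + \frac12 t^2\sigma^2}\,dz \]
\[ = e^{mt + \frac12 t^2\sigma^2}\,\frac{2}{\sqrt{2\pi}}\int_0^{\infty} e^{-\frac12(z - t\sigma)^2}\,d(z - t\sigma), \quad \text{(since the integrand is even)} \]
\[ = e^{mt + \frac12 t^2\sigma^2}\,\frac{1}{\sqrt\pi}\int_0^{\infty} e^{-y}\,y^{\frac12 - 1}\,dy, \quad \text{where } \tfrac12(z - t\sigma)^2 = y,\ (z - t\sigma)\,dz = dy, \]
\[ = e^{mt + \frac12 t^2\sigma^2}\,\frac{\Gamma\left(\frac12\right)}{\sqrt\pi} = e^{mt + \frac12 t^2\sigma^2}. \]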
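The m.g.f. about the mean is
\[ M_m(t) = E(e^{t(x-m)}) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{t(x-m)}\,e^{-(x-m)^2/2\sigma^2}\,dx
 = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{t\sigma z}\,e^{-z^2/2}\,dz, \quad \text{where } z = (x - m)/\sigma, \]
\[ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12(z^2 - 2t\sigma z + t^2\sigma^2) + \frac12 t^2\sigma^2}\,dz
 = \frac{e^{\frac12 t^2\sigma^2}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12 u^2}\,du, \quad \text{where } u = z - t\sigma, \]
\[ = e^{\frac12 t^2\sigma^2}. \]
$\mu_{2n+1}$ = coefficient of $\frac{t^{2n+1}}{(2n+1)!}$ in $M_m(t)$ = 0 and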
Example 1. Two normal universes have the same total frequency but the standard deviation
of one is k times that of the other. Show that the maximum frequency of the first is 1/k that
of the other.
Solution: Let the total frequency be N and let σ and kσ be the standard deviations
of the two universes. Hence the equations of the two normal curves are
\[ y = y_0 e^{-x^2/2\sigma^2} \quad \text{and} \quad y = y'_0 e^{-x^2/2k^2\sigma^2}. \]
According to the question, the total frequencies for both
are the same, so
\[ \int_{-\infty}^{\infty} y_0 e^{-x^2/2\sigma^2}\,dx = \int_{-\infty}^{\infty} y'_0 e^{-x^2/2k^2\sigma^2}\,dx, \quad \text{or} \quad
 y_0\int_0^{\infty} e^{-x^2/2\sigma^2}\,dx = y'_0\int_0^{\infty} e^{-x^2/2k^2\sigma^2}\,dx, \]
i.e. $y_0\,\sigma\sqrt{2\pi} = y'_0\,k\sigma\sqrt{2\pi}$, so that $y'_0 = y_0/k$.
But $y_0$ and $y'_0$ are the frequencies corresponding to the mean or median or mode (as all the
three averages coincide). Hence the maximum frequency of the universe with standard deviation kσ is 1/k that of the other.
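Example 2. In the case of the normal distribution, prove that
\[ \mu_{2n+2} = \sigma^2\mu_{2n} + \sigma^3\frac{d\mu_{2n}}{d\sigma}. \]
Proof: Since
\[ \mu_{2n} = E(X - m)^{2n} = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n} e^{-(x-m)^2/2\sigma^2}\,dx, \]
differentiating with respect to σ,
\[ \frac{d\mu_{2n}}{d\sigma} = -\frac{1}{\sigma^2\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n} e^{-(x-m)^2/2\sigma^2}\,dx
 + \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n} e^{-(x-m)^2/2\sigma^2}\,\frac{(x - m)^2}{\sigma^3}\,dx \]
\[ = -\frac{1}{\sigma^2\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n} e^{-(x-m)^2/2\sigma^2}\,dx
 + \frac{1}{\sigma^4\sqrt{2\pi}}\int_{-\infty}^{\infty} (x - m)^{2n+2} e^{-(x-m)^2/2\sigma^2}\,dx, \]
\[ \frac{d\mu_{2n}}{d\sigma} = -\frac{1}{\sigma}\mu_{2n} + \frac{1}{\sigma^3}\mu_{2n+2}
 \;\Rightarrow\; \sigma^3\frac{d\mu_{2n}}{d\sigma} = -\sigma^2\mu_{2n} + \mu_{2n+2}, \]
\[ \mu_{2n+2} = \sigma^2\mu_{2n} + \sigma^3\frac{d\mu_{2n}}{d\sigma}. \]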
Example 3. Prove that for the normal distribution, the quartile deviation, the mean
deviation from the mean and the standard deviation are approximately in the ratio 10 : 12 : 15.
\[ \text{S.D.} = \sigma, \quad \text{Q.D.} = \frac23\sigma, \quad \text{M.D.} = \frac45\sigma, \]
\[ \text{Q.D.} : \text{M.D.} : \text{S.D.} = \frac23\sigma : \frac45\sigma : \sigma = 10 : 12 : 15. \]
Example 4. For a certain normal distribution the first moment about 10 is 40 and the fourth
moment about 50 is 48. What is the arithmetic mean and variance of the normal
distribution?
Solution: Let the mean and standard deviation of the normal distribution be m and σ
respectively. The first moment about 10 is E(X − 10) = m − 10 = 40 (given), so m = 50.
Again, since mean = 50, the fourth moment about 50 = fourth moment about the
mean = $\mu_4 = 3\sigma^4$ = 48 (given), so $\sigma^4 = 16$ and $\sigma^2 = 4$ = variance.
Example 5. For a normal distribution with mean 2 and standard deviation 3, find the value
of the variate such that the probability of the interval from the mean to that value is 0.4114.
Given that t = 1.35.
Here $y = f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$. Let the required value be x; then
\[ P(2 < X < x) = \frac{1}{3\sqrt{2\pi}}\int_2^x e^{-(x-2)^2/18}\,dx \quad (\text{since the mean } m = 2 \text{ and } \sigma = 3). \]
On putting $t = \frac{x - m}{\sigma} = \frac{x - 2}{3}$, we get
\[ P(2 < X < x) = \frac{1}{\sqrt{2\pi}}\int_0^t e^{-t^2/2}\,dt = 0.4114 \ (\text{given}), \]
so that t = 1.35. Hence x = 3t + 2 = 3 × 1.35 + 2 = 6.05.
Example 6. In an intelligence test administered to 1000 children, the average score is 42 and
the standard deviation is 24. Find (i) the number of children whose score exceeds 60, (ii) the
number of children with score lying between 20 and 40. It is given that if
\[ f(t) = \frac{1}{\sqrt{2\pi}}\int_0^t e^{-t^2/2}\,dt, \]
then f(0.75) = 0.2734, f(0.91) = 0.3184, f(0.08) = 0.0319.
Solution: (i) For x = 60, $t = \frac{x - m}{\sigma} = \frac{60 - 42}{24} = 0.75$; then P(t > 0.75) = 0.5 – 0.2734 = 0.2266,
so the required number of children = 0.2266 × 1000 ≈ 227.
(ii) For x = 20, $t = \frac{20 - 42}{24} = -0.91$ and for x = 40, $t = \frac{40 - 42}{24} = -0.08$. Then
P[20 < X < 40] = P(- 0.91 < t < - 0.08) = P(0.08 < t < 0.91)
= P(0 < t < 0.91) - P(0 < t < 0.08) = 0.3184 – 0.0319 = 0.2865. Therefore the
required no. of children = 0.2865 × 1000 = 286.5, i.e. about 287.
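The table look-ups in Example 6 can be replaced by the error function from the standard library. This is a sketch, assuming the usual identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal distribution function; normal_cdf is a hypothetical helper. Small differences from the hand calculation arise because the text rounds t to two decimals.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal variate with the given mean and standard deviation."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

mean, sd, n_children = 42, 24, 1000

above_60 = n_children * (1 - normal_cdf(60, mean, sd))
between_20_40 = n_children * (normal_cdf(40, mean, sd) - normal_cdf(20, mean, sd))

print(above_60, between_20_40)
```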
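Example 7. For a normal variate X, mean = 12 and standard deviation = 2. Find P(9.6 <
x < 13.8). Given that for x/σ = 0.9, A = 0.3159 and for x/σ = 1.2, A = 0.3849.
Solution: It is given that m = 12 and σ = 2; therefore when x = 9.6, $t = \frac{x - m}{\sigma} = \frac{9.6 - 12}{2} = -1.2$,
and when x = 13.8, $t = \frac{13.8 - 12}{2} = 0.9$. Therefore the required probability is
\[ P(9.6 < X < 13.8) = \frac{1}{2\sqrt{2\pi}}\int_{9.6}^{13.8} e^{-\frac12 (x - 12)^2/4}\,dx
 = \frac{1}{\sqrt{2\pi}}\int_{-1.2}^{0.9} e^{-\frac12 t^2}\,dt, \quad \text{where } t = (x - 12)/2, \]
\[ = \frac{1}{\sqrt{2\pi}}\int_{-1.2}^{0} e^{-\frac12 t^2}\,dt + \frac{1}{\sqrt{2\pi}}\int_0^{0.9} e^{-\frac12 t^2}\,dt
 = \frac{1}{\sqrt{2\pi}}\int_0^{1.2} e^{-\frac12 t^2}\,dt + \frac{1}{\sqrt{2\pi}}\int_0^{0.9} e^{-\frac12 t^2}\,dt \]
\[ = 0.3849 + 0.3159 = 0.7008. \]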
Example 8. Assume the mean height of soldiers to be 68.22 inches with a variance of 10.8 (in)².
How many soldiers in a regiment of 1000 would you expect to be over 6 feet tall? Given that
the area under the standard normal curve between t = 0 and t = 0.35 is 0.1368 and
between t = 0 and t = 1.15 is 0.3746.
Solution: It is given that m = 68.22 inches and σ² = 10.8 (in)², i.e., σ = 3.28 inches. Therefore
\[ t = \frac{x - m}{\sigma} = \frac{72 - 68.22}{3.28} = 1.15 \quad (\text{where } x = 6 \text{ ft} = 72 \text{ in}). \]
It is given that the area under the
standard normal curve between t = 0 and t = 1.15 is 0.3746. So the area under the standard
normal curve between t = 1.15 and t = ∞ is 0.5 - 0.3746 = 0.1254.
Hence the number of soldiers who are over 6 feet tall = 1000 × 0.1254 = 125.4, i.e. about 125.
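Example 9. In a normal distribution 31% of the items are under 45 and 8% of the items are over
64. Find the mean and standard deviation. Given that if $f(t) = \frac{1}{\sqrt{2\pi}} \int_0^t e^{-x^2/2}\,dx$, then f(0.5) =
0.19, f(1.4) = 0.42.
Solution: Let m and $\sigma^2$ be the mean and variance of the normal distribution. It is given that
P(X < 45) = 31% = 0.31 and P(X > 64) = 8% = 0.08.
Therefore $P\left(\frac{X-m}{\sigma} < \frac{45-m}{\sigma}\right) = 0.31$ and $P\left(\frac{X-m}{\sigma} > \frac{64-m}{\sigma}\right) = 0.08$, or
$P\left(t < \frac{45-m}{\sigma}\right) = 0.31$ and $P\left(t > \frac{64-m}{\sigma}\right) = 0.08$, where $t = \frac{X-m}{\sigma}$.
Since it is given that $0.19 = f(0.5) = \frac{1}{\sqrt{2\pi}} \int_0^{0.5} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \int_{-0.5}^{0} e^{-t^2/2}\,dt$
$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-(1/2)t^2}\,dt - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-0.5} e^{-(1/2)t^2}\,dt$, therefore
$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-0.5} e^{-(1/2)t^2}\,dt = 0.5 - 0.19 = 0.31$, i.e., P(t ≤ -0.5) = 0.31. Again, it is given that
$0.42 = f(1.4) = \frac{1}{\sqrt{2\pi}} \int_0^{1.4} e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} e^{-t^2/2}\,dt - \frac{1}{\sqrt{2\pi}} \int_{1.4}^{\infty} e^{-(1/2)t^2}\,dt$. Therefore
$\frac{1}{\sqrt{2\pi}} \int_{1.4}^{\infty} e^{-(1/2)t^2}\,dt = 0.5 - 0.42 = 0.08$, i.e., P(t ≥ 1.4) = 0.08. On comparing we have
$\frac{45-m}{\sigma} = -0.5$ and $\frac{64-m}{\sigma} = 1.4$, which implies that m = 50 and $\sigma = 10$.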
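The last step of Example 9 is a pair of linear equations in m and σ. A minimal sketch of that step follows; the function name `solve_mean_sd` is hypothetical, and the two standardised values -0.5 and 1.4 are taken from the worked solution above.

```python
def solve_mean_sd(x1, t1, x2, t2):
    """Solve (x1 - m)/sd = t1 and (x2 - m)/sd = t2 for the mean m and sd."""
    sd = (x2 - x1) / (t2 - t1)   # subtracting the two equations eliminates m
    mean = x1 - t1 * sd          # back-substitute into the first equation
    return mean, sd

mean, sd = solve_mean_sd(45, -0.5, 64, 1.4)
print(mean, sd)
```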
Length(cm) 8.60 8.59 8.58 8.57 8.56 8.55 8.54 8.53 8.52
Frequency 2 3 4 9 10 8 4 1 1
Solution:
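Take A = 8.56 as the assumed mean and put $\xi = x - A$, so that the working table has the columns x, f, $\xi$, $f\xi$ and $f\xi^2$. Here $\sum f = 42$, $\sum f\xi = 0.11$ and $\sum f\xi^2 = 0.0133$.
Then mean $m = A + \frac{\sum f\xi}{\sum f} = 8.56 + \frac{0.11}{42} = 8.5626$ and standard deviation
$\sigma = \sqrt{\frac{\sum f\xi^2}{\sum f} - \left(\frac{\sum f\xi}{\sum f}\right)^2} = \sqrt{\frac{0.0133}{42} - \left(\frac{0.11}{42}\right)^2} = 0.0175$ cm (approx.).
Hence the required normal curve is: $y = \frac{N}{\sigma\sqrt{2\pi}} e^{-(x-m)^2/(2\sigma^2)} = 9.8\,e^{-0.163(x-8.563)^2}$.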
This distribution is so called since the curve y = f(x) describes a rectangle over the
X-axis between the ordinates at x = a and x = b. This implies that X is a continuous
variable. Let X be a random variable in the range a to b with p.d.f. f(x) which is constant
in a ≤ x ≤ b and zero elsewhere. Then
$\int_a^b f(x)\,dx = 1 \Rightarrow f(x) \int_a^b dx = 1 \Rightarrow f(x) = \frac{1}{b-a}$, i.e.,
$f(x) = \frac{1}{b-a}$, a ≤ x ≤ b
= 0, elsewhere.
The distribution function is
F(x) = 0, x < a
= $\frac{x-a}{b-a}$, a ≤ x ≤ b
= 1, x > b.
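$\mu_2' = E(X^2) = \int_a^b x^2 f(x)\,dx = \frac{1}{b-a} \int_a^b x^2\,dx = \left[\frac{x^3}{3(b-a)}\right]_a^b = \frac{b^3-a^3}{3(b-a)} = \frac{a^2+ab+b^2}{3}$, and so on.
$\mu_r' = E(X^r) = \int_a^b x^r f(x)\,dx = \frac{1}{b-a} \int_a^b x^r\,dx = \left[\frac{x^{r+1}}{(r+1)(b-a)}\right]_a^b = \frac{b^{r+1}-a^{r+1}}{(r+1)(b-a)}$.
Moments about a:
$\mu_r'(a) = E(X-a)^r = \int_a^b \frac{(x-a)^r}{b-a}\,dx = \left[\frac{(x-a)^{r+1}}{(r+1)(b-a)}\right]_a^b = \frac{(b-a)^{r+1}}{(r+1)(b-a)} = \frac{(b-a)^r}{r+1}$.
In particular, $\mu_1'(a) = \frac{b-a}{2}$, $\mu_2'(a) = \frac{(b-a)^2}{3}$, $\mu_3'(a) = \frac{(b-a)^3}{4}$, $\mu_4'(a) = \frac{(b-a)^4}{5}$.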
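$\mu_2$ = variance $= \mu_2' - (\mu_1')^2 = \frac{a^2+ab+b^2}{3} - \left(\frac{b+a}{2}\right)^2 = \frac{4b^2+4ab+4a^2-3b^2-3a^2-6ab}{12} = \frac{(b-a)^2}{12}$.
Standard deviation, $\sigma = \frac{b-a}{\sqrt{12}}$.
$\mu_3 = \mu_3'(a) - 3\mu_2'(a)\mu_1'(a) + 2[\mu_1'(a)]^3 = \frac{(b-a)^3}{4} - 3 \cdot \frac{(b-a)^2}{3} \cdot \frac{b-a}{2} + 2\left(\frac{b-a}{2}\right)^3 = 0$.
$\mu_4 = \mu_4'(a) - 4\mu_3'(a)\mu_1'(a) + 6\mu_2'(a)[\mu_1'(a)]^2 - 3[\mu_1'(a)]^4 = \frac{(b-a)^4}{80}$.
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0$, $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{(b-a)^4/80}{[(b-a)^2/12]^2} = 1.8$, $\gamma_1 = \sqrt{\beta_1} = 0$, $\gamma_2 = \beta_2 - 3 = -1.2$.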
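The closed forms for the rectangular distribution can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names and the interval [2, 5] are assumptions chosen for the check, and the central moments are obtained from the moments about the origin by the binomial expansion of $(X - \mu_1')^r$.

```python
from fractions import Fraction
from math import comb

def raw_moment(r, a, b):
    """r-th moment about the origin of the rectangular distribution on [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return (b ** (r + 1) - a ** (r + 1)) / ((r + 1) * (b - a))

def central_moment(r, a, b):
    """r-th moment about the mean, expanded in terms of the raw moments."""
    mean = raw_moment(1, a, b)
    return sum(comb(r, k) * raw_moment(k, a, b) * (-mean) ** (r - k)
               for k in range(r + 1))

a, b = 2, 5                      # hypothetical end points
mu2 = central_moment(2, a, b)    # expected (b - a)^2 / 12
mu3 = central_moment(3, a, b)    # expected 0
mu4 = central_moment(4, a, b)    # expected (b - a)^4 / 80
beta2 = mu4 / mu2 ** 2           # expected 1.8, whatever a and b are
print(mu2, mu3, mu4, beta2)
```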
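Let $f(x) = \frac{1}{b-a}$, a ≤ x ≤ b, and f(x) = 0 otherwise. Then the m.g.f. about the origin is
$M_0(t) = E(e^{tx}) = \int_a^b \frac{e^{tx}}{b-a}\,dx = \frac{1}{b-a}\left[\frac{e^{tx}}{t}\right]_a^b = \frac{e^{bt}-e^{at}}{t(b-a)}$
$= \frac{\left(1 + bt + \frac{b^2t^2}{2!} + \dots + \frac{b^rt^r}{r!} + \dots\right) - \left(1 + at + \frac{a^2t^2}{2!} + \dots + \frac{a^rt^r}{r!} + \dots\right)}{t(b-a)}$
$= \frac{(b-a)t + \frac{(b^2-a^2)t^2}{2!} + \dots + \frac{(b^r-a^r)t^r}{r!} + \dots}{t(b-a)}$.
Hence $\mu_r'$ = coefficient of $\frac{t^r}{r!}$ in $M_0(t)$ $= \frac{b^{r+1}-a^{r+1}}{(r+1)(b-a)}$, for r = 0, 1, 2, 3, … .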
Example 1. Find the interquartile range, quartile deviation and its coefficient for the
rectangular distribution $f(x) = \frac{1}{b-a}$, a ≤ x ≤ b.
Solution. Since we have $Q_1 = a + \frac{1}{4}(b-a)$, $Q_3 = a + \frac{3}{4}(b-a)$, therefore
Interquartile Range = Q3 – Q1 = (1/2)(b – a), Q.D. $= \frac{Q_3-Q_1}{2} = \frac{1}{4}(b-a)$ and
coefficient of Q.D. $= \frac{Q_3-Q_1}{Q_3+Q_1} = \frac{(b-a)/2}{2a+(b-a)} = \frac{b-a}{2(b+a)}$.
Example 2. For the rectangular distribution $f(x) = \frac{1}{b-a}$, a ≤ x ≤ b, find the mean deviation from the mean.
Solution: We know that for the given rectangular distribution mean = (a + b)/2, therefore
Mean deviation from mean $= \int_a^b \left|x - \frac{a+b}{2}\right| f(x)\,dx = \frac{1}{b-a} \int_a^b \left|x - \frac{a+b}{2}\right| dx$
$= \frac{1}{b-a} \int_{-(b-a)/2}^{(b-a)/2} |y|\,dy = \frac{b-a}{4}$, where $y = x - \frac{a+b}{2}$.
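Example 3. For the rectangular distribution f(x) = 1/(2a), -a ≤ x ≤ a, find the m.g.f. and the even
and odd moments.
Solution. $M_0(t) = E(e^{tx}) = \int_{-a}^{a} \frac{e^{tx}}{2a}\,dx = \frac{1}{2a}\left[\frac{e^{tx}}{t}\right]_{-a}^{a} = \frac{e^{at}-e^{-at}}{2at} = \frac{1}{at}\sinh at$
$= \frac{1}{at}\left[at + \frac{a^3t^3}{3!} + \frac{a^5t^5}{5!} + \frac{a^7t^7}{7!} + \dots\right] = 1 + \frac{a^2t^2}{3!} + \frac{a^4t^4}{5!} + \frac{a^6t^6}{7!} + \dots$ .
Clearly all odd order moments $\mu_{2n+1}' = 0$ and the even order moments are given by
$\mu_{2n}' = \frac{a^{2n}}{2n+1}$. Since here mean $= \mu_1' = 0$, therefore $\mu_n = \mu_n'$.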
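The probability distribution having the probability density function f(x), defined by
f(x) = 0, x ≤ 0
= $\lambda e^{-\lambda x}$, x > 0, $\lambda > 0$,
is called the exponential distribution with parameter $\lambda$. Here
$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_0^{\infty} f(x)\,dx = 0 + \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 1$ and the distribution function is
F(x) = P(X ≤ x) = 0, x ≤ 0
= $\int_0^x f(x)\,dx = \int_0^x \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda x}$, x > 0,
so that F(x) tends to 1 as x tends to infinity.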
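Mean $= \mu_1' = E(X) = \int_0^{\infty} x f(x)\,dx = \int_0^{\infty} \lambda x e^{-\lambda x}\,dx = \frac{\Gamma(2)}{\lambda} = \frac{1}{\lambda}$.
$\mu_2' = E(X^2) = \int_0^{\infty} x^2 f(x)\,dx = \int_0^{\infty} \lambda x^2 e^{-\lambda x}\,dx = \frac{\Gamma(3)}{\lambda^2} = \frac{2}{\lambda^2}$, and so on.
$\mu_r' = E(X^r) = \int_0^{\infty} x^r f(x)\,dx = \int_0^{\infty} \lambda x^r e^{-\lambda x}\,dx = \frac{\Gamma(r+1)}{\lambda^r} = \frac{r!}{\lambda^r}$.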
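Moments about mean: $\mu_1 = 0$ always, $\mu_2$ = variance $= \mu_2' - (\mu_1')^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.
Standard deviation, $\sigma = \frac{1}{\lambda}$. $\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2[\mu_1']^3 = \frac{6}{\lambda^3} - 3 \cdot \frac{2}{\lambda^2} \cdot \frac{1}{\lambda} + \frac{2}{\lambda^3} = \frac{2}{\lambda^3}$.
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'[\mu_1']^2 - 3[\mu_1']^4 = \frac{9}{\lambda^4}$.
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{4/\lambda^6}{1/\lambda^6} = 4$, $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{9/\lambda^4}{1/\lambda^4} = 9$, $\gamma_1 = \sqrt{\beta_1} = 2$, $\gamma_2 = \beta_2 - 3 = 6$.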
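The first quartile $Q_1$ is given by $\int_0^{Q_1} \lambda e^{-\lambda x}\,dx = \frac{1}{4} \Rightarrow \left[-e^{-\lambda x}\right]_0^{Q_1} = \frac{1}{4} \Rightarrow Q_1 = \frac{1}{\lambda}(\log_e 4 - \log_e 3)$. The second quartile (median) $Q_2$
(Md) is given by $\int_0^{Q_2} \lambda e^{-\lambda x}\,dx = \frac{1}{2} \Rightarrow \left[-e^{-\lambda x}\right]_0^{Q_2} = \frac{1}{2} \Rightarrow Q_2 = \frac{1}{\lambda}\log_e 2$. The third quartile $Q_3$ is given
by $\int_0^{Q_3} \lambda e^{-\lambda x}\,dx = \frac{3}{4} \Rightarrow \left[-e^{-\lambda x}\right]_0^{Q_3} = \frac{3}{4} \Rightarrow Q_3 = \frac{1}{\lambda}\log_e 4$.
Interquartile Range = Q3 – Q1 $= \frac{1}{\lambda}\log_e 4 + \frac{1}{\lambda}(\log_e 3 - \log_e 4) = \frac{1}{\lambda}\log_e 3$ and Q.D.
$= \frac{Q_3-Q_1}{2} = \frac{1}{2\lambda}\log_e 3$. Also the coefficient of quartile deviation is $\frac{Q_3-Q_1}{Q_3+Q_1} = \frac{\log_e 3}{\log_e 16 - \log_e 3}$.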
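Mean deviation from mean $= \int_0^{\infty} \left|x - \frac{1}{\lambda}\right| \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda} \int_0^{\infty} |\lambda x - 1|\,\lambda e^{-\lambda x}\,dx = \frac{1}{\lambda} \int_0^{\infty} |y-1|\,e^{-y}\,dy$, (y = $\lambda$x)
$= \frac{1}{\lambda}\left[\int_0^1 (1-y)e^{-y}\,dy + \int_1^{\infty} (y-1)e^{-y}\,dy\right] = \frac{1}{\lambda}\left[e^{-1} + e^{-1}\right] = \frac{2}{\lambda e}$.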
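$M_0(t) = E(e^{tx}) = \int_0^{\infty} e^{tx} \lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-(\lambda-t)x}\,dx = \lambda\left[\frac{-e^{-(\lambda-t)x}}{\lambda-t}\right]_0^{\infty} = \frac{\lambda}{\lambda-t} = \left(1 - \frac{t}{\lambda}\right)^{-1}$
$= 1 + \frac{t}{\lambda} + \frac{t^2}{\lambda^2} + \frac{t^3}{\lambda^3} + \dots$, for $t < \lambda$.
Hence $\mu_r'$ = coefficient of $\frac{t^r}{r!}$ $= \frac{r!}{\lambda^r}$, for r = 0, 1, 2, 3, … .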
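Solution: We have to find P(X ≥ 4) $= \int_4^{\infty} 3e^{-3x}\,dx = \left[-e^{-3x}\right]_4^{\infty} = e^{-12}$.
Mean = E(X) $= \int_0^{\infty} 3x e^{-3x}\,dx = 3 \cdot \frac{1}{3^2} = \frac{1}{3}$, Variance $= \int_0^{\infty} 3\left(x - \frac{1}{3}\right)^2 e^{-3x}\,dx = \frac{1}{9}$ and
Standard Deviation = 1/3. Hence C.V. = 1.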
Example 2. The income tax of a man is exponentially distributed with the probability
density function given by $f(x) = \frac{1}{3} e^{-x/3}$, x > 0, and f(x) = 0, x ≤ 0. What is the probability that his income will
exceed Rs. 17,000, assuming that the income tax is levied at the rate of 15% on the income
above Rs. 15,000?
Solution. If the income exceeds Rs. 17,000, then the income tax will exceed 15% of Rs. (17000 -
15000), i.e., it exceeds Rs. (15 × 2000)/100 = 300. Hence the required probability is
P(X > 300) $= \int_{300}^{\infty} \frac{1}{3} e^{-x/3}\,dx = \left[-e^{-x/3}\right]_{300}^{\infty} = e^{-100}$.
$\mu_1'$ = Mean = np, $\mu_2' = npq + n^2p^2$, $\mu_3' = n(n-1)(n-2)p^3 + 3n(n-1)p^2 + np$ and
Moments about mean: $\mu_1 = 0$ always, $\mu_2 = npq$ = variance $\sigma^2$ and S.D. $= \sqrt{npq}$, $\mu_3 =
npq(2q - 1) = npq(q - p)$, $\mu_4 = npq[1 + 3(n-2)pq]$.
$\beta_1 = \frac{(1-2p)^2}{npq}$, $\beta_2 = 3 + \frac{1-6pq}{npq}$, $\gamma_1 = \frac{1-2p}{\sqrt{npq}}$, $\gamma_2 = \frac{1-6pq}{npq}$.
$\mu_{k+1} = pq\left[nk\mu_{k-1} + \frac{d\mu_k}{dp}\right]$, where $\mu_k$ is the kth moment about mean.
Moments about mean: $\mu_1 = 0$ always, $\mu_2 = m$ = variance $\sigma^2$, S.D. $= \sqrt{m}$, mean = (S.D.)².
$\mu_3 = m$, $\mu_4 = 3m^2 + m$.
Karl Pearson’s Coefficients: $\beta_1 = 1/m$, $\beta_2 = 3 + 1/m$, $\gamma_1 = \frac{1}{\sqrt{m}}$, $\gamma_2 = \frac{1}{m}$.
$\mu_{r+1} = m\left[r\mu_{r-1} + \frac{d\mu_r}{dm}\right]$, where $\mu_r$ is the rth moment about mean.
Then m.g.f. about origin: $M_0(t) = e^{m(e^t-1)}$ and m.g.f. about mean m: $M_m(t) = e^{me^t - mt - m}$.
and its distribution is called the Normal (or Gaussian) Distribution; here m and $\sigma$ are called
the parameters of the distribution.
(3) The points of inflexion of the normal curve are given by $x = m \pm \sigma$.
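Moments about the mean: $\mu_{2n+1} = 0$, $\mu_{2n} = (2n-1)(2n-3) \dots 3 \cdot 1 \cdot \sigma^{2n}$. Also $\mu_n = \mu_n'$ when m = 0. Karl
Pearson’s Coefficients: $\beta_1 = 0$, $\beta_2 = 3$, $\gamma_1 = 0$, $\gamma_2 = 0$.
Mean Deviation about Mean $= \frac{4}{5}\sigma$ approx. Quartile Deviation $= \frac{2}{3}\sigma$ approx.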
=0, elsewhere
Moments about origin: Mean $= \mu_1' = (a + b)/2$, $\mu_2' = (a^2 + ab + b^2)/3$ etc.
Moments about mean: $\mu_1 = 0$ always, $\mu_2$ = variance $= (b-a)^2/12$, S.D. $= \frac{b-a}{\sqrt{12}}$.
$\mu_3 = 0$, $\mu_4 = \frac{(b-a)^4}{80}$. Karl Pearson’s Coefficients: $\beta_1 = 0$, $\beta_2 = 1.8$, $\gamma_1 = 0$, $\gamma_2 = -1.2$.
Quartiles: $Q_1 = a + \frac{1}{4}(b-a)$, $Q_2 = a + \frac{1}{2}(b-a)$, $Q_3 = a + \frac{3}{4}(b-a)$.
$M_0(t) = \frac{e^{bt}-e^{at}}{t(b-a)}$.
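(iv) Exponential Distribution: The probability distribution having the probability density
function f(x), defined by f(x) = 0, x ≤ 0, and $f(x) = \lambda e^{-\lambda x}$, x > 0, $\lambda > 0$,
is called the exponential distribution with parameter $\lambda$.
Mean $= \mu_1' = 1/\lambda$, $\mu_2' = 2/\lambda^2$. Moments about mean: $\mu_1 = 0$ always, $\mu_2$ = variance $= 1/\lambda^2$ and
S.D. $= \frac{1}{\lambda}$, $\mu_3 = 2/\lambda^3$, $\mu_4 = \frac{9}{\lambda^4}$.
Quartiles: $Q_1 = \frac{1}{\lambda}(\log_e 4 - \log_e 3)$, $Q_2 = \frac{1}{\lambda}\log_e 2$, $Q_3 = \frac{1}{\lambda}\log_e 4$.
Interquartile Range = Q3 – Q1 $= \frac{1}{\lambda}\log_e 3$ and Q.D. $= \frac{Q_3-Q_1}{2} = \frac{1}{2\lambda}\log_e 3$, coefficient of Q.D.
$= \frac{\log_e 3}{\log_e 16 - \log_e 3}$. Mean deviation from mean $= \frac{2}{\lambda e}$.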
3.10. Assignment
Q.1. In case of binomial Distribution, write an expression for (i) the probability of at most
r successes (ii) the probability of at least r successes.
Q 2. The mean and variance of a Binomial distribution are 4 and 4/3 respectively. Find (i)
the probability of 2 successes, (ii) the probability of more than two successes, (iii) the
probability of 3 or more than three successes.
Q 3. A perfect cubical die is thrown a large number of times in sets of 8. The occurrence of
5 or 6 is called a success. In what proportion of the sets you expect 3 successes.
Q 4. An irregular six-faced die is thrown and the expectation that in 10 throws it will give
five even numbers is twice the expectation that it will give four even numbers. How many
times in 10,000 sets of 10 throws would you expect it to give no even numbers ?
Q 5. In a precision bombing attack there is a 50% chance that any one bomb will strike the
target. Two direct hits are required to destroy the target completely. How many bombs
must be dropped to give a 99% chance or better of completely destroying the target ?
Q 6. Show that if two symmetrical Binomial distributions of degree n (the same number of
observations) are so superposed that the first term of the one coincides with the (r + l)th
term of the other, the distribution formed by adding superposed terms is a symmetrical
Binomial distribution of degree (n + 1).
Q 7. In a Poisson distribution the probability for x = 0 is 10%. Find the mean, given that $\log_e 10$
= 2.3026.
Q 8. A car-hire firm has two cars, which it hires out day by day. The number of demands
for a car on each day is distributed as a Poisson distribution with mean 1.5. Calculate the
proportion of days on which neither car is used and the proportion of days on which some
demand is refused. [$e^{-1.5}$ = 0.2231].
Q 9. A telephone switchboard handles 600 calls on the average during a rush hour. The
board can make a maximum of 20 connections per minute. Use the Poisson distribution to
estimate the probability that the board will be overtaxed during any given minute. [$e^{-10}$ =
0.00004539]
Q 10. If p(X = 2) = 9p(X = 4) + 90p(X = 6) in the Poisson distribution, then find E(X).
Q 11. If X is a normal variate with mean 8 and standard deviation 4, find (i)P(X ≤ 5) and
(ii) P(5 ≤ X ≤ 10).
X 100 95 90 85 80 75 70 65 60 55 50 45
Frequency 0 1 3 2 7 12 10 9 5 3 2 0
Frequency 5 18 42 27 8
Q 14. For the rectangular distribution f(x) = 1, 1 ≤ x ≤ 2. Find arithmetic mean, geometric
mean, harmonic mean and standard deviation and verify that AM > GM > HM.
Q 15. If families are selected at random in a certain thickly populated area and their
annual income in excess of Rs. 4000 is treated as a random variable having an exponential
distribution $f(x) = \frac{1}{2000} e^{-x/2000}$, x > 0, what is the probability that 3 out of 4 families
selected in the area have income in excess of Rs. 5000?
Q 2. The number of males in each of 106 eight-pig litters was found and they are given by the
following frequency distribution:
Number of males per litter:  0   1   2   3   4   5   6   7   8   Total
Frequency:                   0   5   9   22  25  26  14  4   1   106
Assuming that the probability of an animal being male or female is even i.e., p = q = ½ and
frequency distribution follows the Binomial law, calculate the expected frequencies.
Number of female mice:  0   1   2   3   4   Total
Number of litters:      8   32  34  24  5   103
If the chance of obtaining a female in a single trial is assumed constant, estimate this
constant of unknown probability. Find also expected frequencies.
Q 4. Find the Binomial distribution if $\beta_1 = \frac{1}{36}$, $\beta_2 = \frac{35}{12}$.
Q 6. Show that a measure of skewness of the Binomial distribution is given by $\frac{q-p}{(npq)^{1/2}}$
and its kurtosis is $3 + \frac{1-6pq}{npq}$.
Q 9. If x and y are two independent Poisson‘s variates where P(X =1) = P (X =2) and P(Y =
2) = P(Y = 3), find the variance of (X - 2Y).
Q 10. Find the mean and standard deviation for the table of deaths of women over 85 year
old recorded in a three year period.
Q 11. In 1,000 extensive sets of trials for an event of small probability the frequencies f of
the number x of successes are found to be:
x 0 1 2 3 4 5 6 7
Q 12. If X is a Poisson variate and P(X = 1) = P(X = 2), find P(X = 4) and P(X ≤ 4).
Q 13. If a Poisson distribution has a double mode x = 1 and x = 2, find P(X = 1).
Q 14. If a Poisson distribution has a double mode x = 4 and x = 5, find probability that x
will have either of these values.
Q 15. In a distribution exactly normal. 7% of the items are under 35 and 89% are under
63. What are the mean and s.d. of the distribution.
Q 16. For a normal distribution with mean 1 and standard deviation 3, find the probability
that 3.43 ≤ x ≤ 6.19.
Q 17. The quartiles of a normal distribution are 8 and 14 respectively. Show that the mean
and standard deviations are respectively 11 and 4.4.
Q 18. In a sample of 1000 cases, the mean of a certain test is 14 and the standard deviation is 2.5.
Assuming the normality of the distribution, find (i) how many candidates score between 12
and 15, (ii) how many score below 8 and (iii) the probability that a candidate selected at
random will score above 15.
Q 19. Show that for the rectangular distribution dF(x) = dx, 0 ≤ x ≤ 1, the m.g.f. is $(e^t - 1)/t$.
Hence or otherwise show that $\mu_k' = 1/(k+1)$, $\mu_2 = 1/12$. Also prove that the mean deviation
about the mean is ¼.
Q 20. The sales tax of a shopkeeper has an exponential distribution with p.d.f.
$f(x) = \frac{1}{4} e^{-x/4}$, x > 0, and f(x) = 0, x ≤ 0. If sales tax is levied at the rate of 5%, what is the probability that his
sales exceed Rs. 10,000?
At the end of the unit student discuss or seek clarification on some points, if so mention the
points:
A: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
B: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
C: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
==========
Structure
4.1. Introduction
4.2. Objectives
4.4.6. Covariance
4.7. Assignment
4.1. Introduction: In the earlier units we studied frequency distributions, measures of central
tendency, mean deviation, standard deviation, moments, skewness, kurtosis, the theory of
probability, mathematical expectation, moment generating functions and theoretical distributions. In
this unit we shall confine ourselves to the study of the method of least squares, curve fitting,
the concepts of correlation and regression, the coefficient of correlation, rank correlation, and multiple
and partial correlation. The fitting of a curve to given data is very important from the
point of view of both theoretical and practical statistics. In theoretical statistics, the study of
correlation and regression can be regarded as the fitting of linear curves to given bivariate
or multivariate frequency distributions. In practical statistics it enables us to get a close
functional relation between x and y. These relations, expressed by a polynomial, exponential
or logarithmic function, can be fitted by using the principle of least squares. We know that (i) if the
number of independent linear equations is equal to the number of unknowns, then the equations have a unique
solution; (ii) if the number of equations is less than the number of unknowns, then the equations
have infinitely many solutions; but (iii) if the number of equations is more than the number of
unknowns, then in general no exact solution of the equations exists. To deal with such equations
Gauss is believed to have used the method of least squares in 1795, but it was named and first
published in 1805 by Legendre, and so this method is also known as the principle of Legendre.
4.2: Objectives: After the end of the unit the student will be able to understand/know the
4.3: Curve Fitting. Curve Fitting means an expression of the relationship between two
variables by algebraic equations on the basis of observed data. It is considered very
important both from the point of view of theoretical and practical Statistics. In theoretical
Statistics the lines of regression can be regarded as fitting of linear curves to the given
bivariate values. In practical Statistics we are required to find a functional relation
between x and y where the dependent variable y is expressed as a function of the
independent variable x, which may involve integral powers.
The general problem of finding equations of approximating curves which fit given
sets of data is called curve fitting. The simplest curve that can be fitted to a number of
points is the straight line. But no straight line passes exactly through all the points although
a great many lines may be drawn which nearly do so. Similarly we can find curves of
second degree, third degree etc., which may give the best representation of the points.
4.3.1. Most Plausible Value. Suppose we have m independent linear equations in n
unknowns, say x, y, z, ...:
$a_r x + b_r y + c_r z + \dots = A_r$, r = 1, 2, …, m,
where $a_r, b_r, c_r, \dots$ and $A_r$ are constants. For m = n, we can find a unique set of values of
x, y, z, ... to satisfy all these equations. However if m > n, i.e., the number of equations is
greater than the number of unknowns, there may exist no such solution. In such cases we
try to find those values of x, y, z, ... which will satisfy the given system of equations as
nearly as possible. The principle of least squares asserts that these values are those which
make U a minimum, where $U = \sum_{r=1}^{m} (a_r x + b_r y + c_r z + \dots - A_r)^2$. Applying the conditions of
minimum, i.e., $\frac{\partial U}{\partial x} = \frac{\partial U}{\partial y} = \frac{\partial U}{\partial z} = \dots = 0$, we get n equations called the normal equations.
When these equations are solved simultaneously, they give the values of x, y, z, ... . These are
called the best or most plausible values. On calculating the second order partial derivatives
and substituting the values of x, y, z, ... thus obtained, we see that the expression
is positive. Hence U is minimum.
4.3.2. Method of Least Squares: Suppose that we have m observations (x1, y1), (x2, y2),
(x3, y3), … , (xm, ym) of two variables x and y and we are required to fit a curve of the type
$y = a + bx + cx^2 + \dots + kx^n$ … … … (1)
to these values. We have to determine the constants a, b, c, ..., k such that (1)
represents the curve of best fit of that degree. If m = n + 1, we can in general find a unique set
of values satisfying the given system of equations. But when m > n + 1, we get m equations by
substituting the different values of x and y in equation (1), while we are required to find only
n + 1 constants. Therefore no solution may exist that satisfies all m equations. We therefore
try to find those values of a, b, ... , k which give the best fit, i.e., which satisfy all
the equations as nearly as possible. The principle of least squares provides a suitable method in
such cases.
Let $Y_r = a + bx_r + cx_r^2 + \dots + kx_r^n$, r = 1, 2, …, m.
Here the quantities Y1, Y2, … , Ym and y1, y2, … , ym are called the expected values and
observed values of y corresponding to the values x1, x2, … , xm of x. The differences $R_r =
y_r - Y_r$ for different values of r are called residuals. A measure of the "goodness of fit" of
the curve to the given data is provided by the quantity $R_1^2 + R_2^2 + \dots + R_m^2$: if this is
small, the fit is good; if it is large, the fit is bad.
"Of all curves approximating a given set of points, the curve having the property that $R_1^2 +
R_2^2 + \dots + R_m^2$ is minimum is called the best fitting curve", and a curve having this property is said
to fit the data in the least squares sense and is called a least squares curve.
Let $U = \sum R_r^2 = \sum (y_r - a - bx_r - cx_r^2 - \dots - kx_r^n)^2$. Here U is said to be the sum of squares of residuals. The principle of least squares asserts that
the constants a, b, c, …, k are chosen in such a way that the sum of squares of residuals is
minimum. The conditions for U to be a maximum or minimum are
$\frac{\partial U}{\partial a} = \frac{\partial U}{\partial b} = \frac{\partial U}{\partial c} = \dots = \frac{\partial U}{\partial k} = 0$, which imply that
$\sum y = ma + b\sum x + \dots + k\sum x^n$
$\sum xy = a\sum x + b\sum x^2 + \dots + k\sum x^{n+1}$
$\sum x^2y = a\sum x^2 + b\sum x^3 + \dots + k\sum x^{n+2}$
… … … …
$\sum x^ny = a\sum x^n + b\sum x^{n+1} + \dots + k\sum x^{2n}$,
where the subscripts have been dropped. These equations are called the normal equations and are (n
+ 1) in number; on solving them we get the values of the (n + 1) unknowns a, b,
c, ... , k. On calculating the second order partial derivatives and putting in these values,
we obtain a positive value. Hence U is minimum.
Particular cases. When n = 1 and n = 2, the curve to be fitted is a straight line y
= a + bx and a second degree parabola $y = a + bx + cx^2$ respectively, and the corresponding normal equations are
$\sum y = ma + b\sum x$, $\sum xy = a\sum x + b\sum x^2$
and
$\sum y = ma + b\sum x + c\sum x^2$, $\sum xy = a\sum x + b\sum x^2 + c\sum x^3$, $\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$ respectively.
4.3.3. Change of Origin. When the values of x are at equal intervals, i.e., x, x + h, x + 2h, …
and m is odd, say 2n + 1, the normal equations can be simplified by taking the origin of x
at the middle value and the interval h as the unit of measurement. Thus
if a is the middle value, then u = (x – a)/h takes the values -n, ..., -1, 0, 1, ..., n and we get
$\sum u = \sum u^3 = \sum u^5 = \dots = 0$. If m is even, say 2n, we take the origin of x at the mean of the
middle pair of values and h/2 as the new unit. The values of u then become -(2n – 1), -
(2n – 3), …, -3, -1, 1, 3, ... , (2n – 3), (2n – 1) and we again get $\sum u = \sum u^3 = \sum u^5 = \dots = 0$.
4.3.4. Curves of type $y = ab^x$ and $y = ax^b$. To fit curves of these types we follow the
following method. Let the curve to be fitted be $y = ab^x$, or log y = log a + x log b. The
normal equations are $\sum \log y = m \log a + \log b \sum x$, $\sum x \log y = \log a \sum x + \log b \sum x^2$.
These equations are solved for log a and log b and then a and b are found by taking
antilogarithms. Similarly, let the curve to be fitted be $y = ax^b$, or log y = log a + b log x. The
normal equations are $\sum \log y = m \log a + b \sum \log x$, $\sum \log x \log y = \log a \sum \log x + b \sum (\log x)^2$.
These equations are solved for log a and b and then a is found by taking the antilogarithm.
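The logarithmic transformation for the curve $y = ab^x$ can be sketched as follows. This is an illustration only: the function name `fit_exponential` is hypothetical, and the data are generated from y = 100(1.2)^x rather than taken from the text, so that the fit should recover those constants.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = log10(a) + x * log10(b)."""
    m = len(xs)
    logs = [math.log10(y) for y in ys]
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    sum_log = sum(logs)
    sum_x_log = sum(x * l for x, l in zip(xs, logs))
    # Solve the two normal equations for log a and log b.
    log_b = (m * sum_x_log - sum_x * sum_log) / (m * sum_x2 - sum_x ** 2)
    log_a = (sum_log - log_b * sum_x) / m
    return 10 ** log_a, 10 ** log_b   # take antilogarithms

xs = [2, 3, 4, 5, 6]
ys = [100 * 1.2 ** x for x in xs]     # hypothetical data lying exactly on the curve
a, b = fit_exponential(xs, ys)
print(a, b)
```

Note that the method minimises the squared residuals of log y, not of y itself, so for noisy data the fitted curve is not the least squares curve in the original scale.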
Example 1. Find the most plausible values of x, y and z from the following equations :
x – y + 2z = 3, 3x + 2y – 5z = 5, 4x + y + 4z = 21, -x + 3y + 3z = 14.
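Solution. Let $U = (x - y + 2z - 3)^2 + (3x + 2y - 5z - 5)^2 + (4x + y + 4z - 21)^2 + (-x + 3y + 3z - 14)^2$. Then
$\frac{\partial U}{\partial x}$ = 2(x – y + 2z – 3) + 6(3x + 2y – 5z – 5) + 8(4x + y + 4z – 21) - 2(-x + 3y + 3z – 14) = 0,
i.e., 27x + 6y = 88.
Similarly $\frac{\partial U}{\partial y} = 0$ and $\frac{\partial U}{\partial z} = 0$ give
6x + 15y + z = 70 and y + 54z = 107 respectively. On solving these equations the most
plausible values of x, y and z are x = 2.47, y = 3.55 and z = 1.92.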
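The normal equations of Example 1 can be formed and solved mechanically. The sketch below uses only the standard library; the helper names `normal_equations` and `solve_linear_system` are hypothetical, and the solver is a plain Gauss-Jordan elimination with partial pivoting chosen for brevity.

```python
def normal_equations(rows, rhs):
    """Return (A^T A, A^T b) for the over-determined system A x = b."""
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(n)]
    return ata, atb

def solve_linear_system(matrix, vector):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [p - factor * q for p, q in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Coefficients and right-hand sides of the four given equations.
rows = [[1, -1, 2], [3, 2, -5], [4, 1, 4], [-1, 3, 3]]
rhs = [3, 5, 21, 14]

ata, atb = normal_equations(rows, rhs)
x, y, z = solve_linear_system(ata, atb)
print(ata, atb)
print(round(x, 2), round(y, 2), round(z, 2))
```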
Example 2. Fit a straight line to the following data regarding x as the independent variable:
x: 0 1 2 3 4
y: 1.0 1.8 3.3 4.5 6.3
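Solution. Let the straight line to be fitted to the data be y = a + bx; then the normal
equations are
$\sum y = ma + b\sum x$, $\sum xy = a\sum x + b\sum x^2$. Now
x      y      xy     x²
0      1.0    0      0
1      1.8    1.8    1
2      3.3    6.6    4
3      4.5    13.5   9
4      6.3    25.2   16
Total  10     16.9   47.1   30
Here m = 5, $\sum x = 10$, $\sum y = 16.9$, $\sum xy = 47.1$ and $\sum x^2 = 30$, so the normal equations become
16.9 = 5a + 10b and 47.1 = 10a + 30b, which give a = 0.72 and b = 1.33. Hence the required line is y = 0.72 + 1.33x.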
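For a straight line the two normal equations can be solved in closed form. A minimal sketch follows, using the data of the table above; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least squares line y = a + b x from the two normal equations."""
    m = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminate a between  sum_y = m a + b sum_x  and  sum_xy = a sum_x + b sum_x2.
    b = (m * sum_xy - sum_x * sum_y) / (m * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / m
    return a, b

a, b = fit_line([0, 1, 2, 3, 4], [1.0, 1.8, 3.3, 4.5, 6.3])
print(round(a, 2), round(b, 2))
```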
Example 3. Fit a second degree parabola to the following data:
x: 0 1 2 3 4
y: 1 5 10 22 38
Solution. Let the second degree parabola to be fitted to the data be $y = a + bx + cx^2$; then
the normal equations are $\sum y = ma + b\sum x + c\sum x^2$,
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ and $\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$. Now
x   y    x²   x³   x⁴    xy    x²y
0   1    0    0    0     0     0
1   5    1    1    1     5     5
2   10   4    8    16    20    40
3   22   9    27   81    66    198
4   38   16   64   256   152   608
Here
m = 5, $\sum y = 76$, $\sum x = 10$, $\sum x^2 = 30$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum xy = 243$, $\sum x^2y = 851$. Then
by the normal equations, we have
76 = 5a + 10b + 30c, 243 = 10a + 30b + 100c, 851 = 30a + 100b + 354c. On solving these
equations, we get a = 1.43, b = 0.24, c = 2.21. Hence the required parabola is $y = 1.43 + 0.24x + 2.21x^2$.
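The same normal equations can be generated for a polynomial of any degree from the power sums of x. The following sketch applies this to the parabola example; the names `fit_polynomial` and `solve_linear_system` are hypothetical, and the elimination routine is a simple illustrative solver rather than a numerically robust one.

```python
def solve_linear_system(matrix, vector):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [p - factor * q for p, q in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_polynomial(xs, ys, degree):
    """Coefficients a, b, c, ... of the least squares polynomial of given degree."""
    size = degree + 1
    # power_sums[p] is the sum of x**p; power_sums[0] equals the number of points m.
    power_sums = [sum(x ** p for x in xs) for p in range(2 * degree + 1)]
    matrix = [[power_sums[i + j] for j in range(size)] for i in range(size)]
    vector = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(matrix, vector)

a, b, c = fit_polynomial([0, 1, 2, 3, 4], [1, 5, 10, 22, 38], degree=2)
print(round(a, 2), round(b, 2), round(c, 2))
```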
Example 4. Fit a straight line to the following data regarding x as the independent variable:
x: 0 5 10 15 20 25
y: 12 15 17 22 24 30
Solution. Here m = 6 is even and the values of x are equally spaced with h = 5. Therefore we take
u = (x – 12.5)/2.5 and v = y – 20. Let the straight line to be fitted to the data be v = a +
bu; then the normal equations are
$\sum v = ma + b\sum u$, $\sum uv = a\sum u + b\sum u^2$. Now
x      y     u     v     uv    u²
0      12    -5    -8    40    25
5      15    -3    -5    15    9
10     17    -1    -3    3     1
15     22    1     2     2     1
20     24    3     4     12    9
25     30    5     10    50    25
Total        0     0     122   70
Now by the normal equations, we have 0 = 6a, 122 = 70b, which imply that a = 0, b = 1.74. Hence
the required line is v = 1.74u or y – 20 = 1.74[(x – 12.5)/2.5] or y = 0.7x + 11.28 approximately.
x: 0 1 2 3 4
Solution. Since m is odd and the values of x are equidistant, we take the origin for the x
series at the middle value 2. Now let us put X = x - 2 and Y = y, so that the curve of fit is
$Y = a + bX + cX^2$.
x   y     X    Y     XY     X²   X²Y    X³   X⁴
0   1     -2   1     -2     4    4      -8   16
1   1.8   -1   1.8   -1.8   1    1.8    -1   1
2   1.3   0    1.3   0      0    0      0    0
3   2.5   1    2.5   2.5    1    2.5    1    1
4   6.3   2    6.3   12.6   4    25.2   8    16
Total     0    12.9  11.3   10   33.5   0    34
The normal equations are $\sum Y = ma + b\sum X + c\sum X^2$, $\sum XY = a\sum X + b\sum X^2 + c\sum X^3$, $\sum X^2Y = a\sum X^2 + b\sum X^3 + c\sum X^4$. On substituting the
values in these equations, we get
12.9 = 5a + 10c, 11.3 = 10b, 33.5 = 10a + 34c. On solving these equations, we get a =
1.48, b = 1.13, c = 0.55. Hence the required parabola is $Y = 1.48 + 1.13X + 0.55X^2$ or, y =
1.48 + 1.13(x - 2) + 0.55(x – 2)², or $y = 1.42 - 1.07x + 0.55x^2$.
x: 2 3 4 5 6
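Solution. We have to fit the curve $y = ab^x$; taking logarithms of both sides we have
log y = log a + x log b, and the normal equations are $\sum \log y = m \log a + \log b \sum x$, $\sum x \log y = \log a \sum x + \log b \sum x^2$.
Then the totals of the columns x, x², log y and x log y of the working table are
$\sum x = 20$, $\sum x^2 = 90$, $\sum \log y = 11.5835$, $\sum x \log y = 47.1254$,
so that 11.5835 = 5 log a + 20 log b and 47.1254 = 20 log a + 90 log b.
Solving these equations and taking antilogarithms, we get a = 100, b = 1.2 approximately.
Hence the equation of the required curve is $y = 100(1.2)^x$.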
4.4. Correlation and Regression: In earlier units we have been mainly concerned with
univariate universes only. In this section we shall discuss bivariate universes and in particular
study the simultaneous variation of two variables, for example: heights and weights of
students in a class, ages of husbands and wives, rainfall and crops etc. Let us consider the
heights of the husbands and the wives at the time of marriage. If the height of the
bridegroom is represented by x in general and that of the bride by y, then to each marriage
there corresponds a pair of values (x1, y1) of the variables x and y. Now our object is to
discover whether there is any connection between the stature of the husband (x) and the stature of the wife
(y). Do tall men tend on the average to wed tall women, or do we find tall men choosing
short women for wives just about as often as they choose tall women? We then try to find
a relation between x and y. Whenever two variables x and y are so related that a change
in one is accompanied by a change in the other in such a way that an increase in the one is
accompanied by an increase or decrease in the other, then the variables are said to be correlated.
When the relationship is of a quantitative nature, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in a brief formula is known as
correlation.
Let the pairs (xr, yr), r = 1, 2, 3, …, of values of two variables x and y be plotted
on the xy-plane and be represented by points P1, P2, P3, … . The values in the brackets are
the corresponding values of the two variables. One of the variables is taken along
the x-axis and the other along the y-axis. Such a graphical representation is called a Scatter or
Dot-diagram.
A universe every member of which bears one of the values of each of two variates is
said to be bivariate. If the pair (x1, y1) occurs f1 times then f1 is called the frequency of that
pair. If the values are grouped according to class-intervals, we have a bivariate frequency
distribution.
25-30 1
30-35 2 1
35-40 1 2 4
40-45 3 1
45-50 3
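$r = \frac{\sum xy}{n\sigma_x\sigma_y} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{p}{\sigma_x\sigma_y}$, where $p = \frac{\sum xy}{n}$, x and y are the deviations measured
from their respective means and $\sigma_x$, $\sigma_y$ are the standard deviations of these series. This
is also called the product moment correlation coefficient. In many cases it is easier to work
with deviations taken from assumed means, and in that case the formula of the correlation coefficient
takes a different form. Let x and y be the deviations measured from the true means $M_x$
and $M_y$ of the two series and $\xi$ and $\eta$ be the deviations measured from the assumed means
$A_x$ and $A_y$ respectively. Then $\xi = X - A_x = X - M_x + M_x - A_x = x + d_x$, where $d_x = M_x - A_x = \frac{\sum \xi}{n}$;
similarly $\eta = Y - A_y = Y - M_y + M_y - A_y = y + d_y$, where $d_y = M_y - A_y = \frac{\sum \eta}{n}$. Then
$\sum \xi\eta = \sum (x + d_x)(y + d_y) = \sum (xy + xd_y + yd_x + d_xd_y)$
$= \sum xy + d_y\sum x + d_x\sum y + nd_xd_y$
$= \sum xy + nd_xd_y$, since $\sum x = 0 = \sum y$.
Hence $\sum xy = \sum \xi\eta - nd_xd_y = \sum \xi\eta - \frac{\sum \xi \sum \eta}{n}$.
Also, $\sum x^2 = \sum \xi^2 - \frac{(\sum \xi)^2}{n}$ and $\sum y^2 = \sum \eta^2 - \frac{(\sum \eta)^2}{n}$. Then the formula of the coefficient is
reduced to $r = \frac{\sum \xi\eta - \frac{\sum \xi \sum \eta}{n}}{\sqrt{\left[\sum \xi^2 - \frac{(\sum \xi)^2}{n}\right]\left[\sum \eta^2 - \frac{(\sum \eta)^2}{n}\right]}}$. This is known as the short cut
method of the correlation coefficient. If we suppose that $u = \xi/h$ and $v = \eta/k$, where h and k
are the scales of the x and y series, then
$r = \frac{\sum uv - \frac{\sum u \sum v}{n}}{\sqrt{\left[\sum u^2 - \frac{(\sum u)^2}{n}\right]\left[\sum v^2 - \frac{(\sum v)^2}{n}\right]}}$ is the step-deviation formula. For a bivariate frequency table it becomes
$r = \frac{\sum fuv - \frac{\sum fu \sum fv}{\sum f}}{\sqrt{\left[\sum fu^2 - \frac{(\sum fu)^2}{\sum f}\right]\left[\sum fv^2 - \frac{(\sum fv)^2}{\sum f}\right]}}$, where f is the frequency in the frequency
table.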
1 Perfect +1 -1
5 No correlation 0 0
S.E. $= \frac{1-r^2}{\sqrt{n}}$ and P.E. $= 0.6745\,\frac{1-r^2}{\sqrt{n}}$. The probable error is used for testing the reliability of
a particular value of r. The formula r ± P.E. gives two limits within which the coefficient of
correlation of the population is expected to lie. If r < P.E., there is no evidence of correlation, and if r > 6 P.E. the correlation is
significant.
Correlation coefficient lies between –1 to +1: If x and y denotes the deviations of the
variates X and Y from their respective means, then by Schwarz‘s Inequality
4.4.3. Correlation of Ranks: An easier method than Karl Pearson’s method of
calculating the correlation coefficient was given by Charles Spearman, and is known as
Spearman’s Rank Difference Method. In this method a group of n individuals is arranged
in order of merit in the possession of a certain characteristic. The same group would in
general give different orders for different characteristics. In this method only the ranks are
considered, and so we call the result the rank correlation coefficient of the characteristics for that
group of individuals. Assuming that no two individuals are equal in either
classification, each of the variables X and Y takes the values 1, 2, 3, ... , n and hence their
arithmetic means are equal, each being $\bar{X} = \bar{Y} = \frac{1 + 2 + 3 + \dots + n}{n} = \frac{n+1}{2}$. Let d = X – Y
$= (X - \bar{X}) - (Y - \bar{Y}) = x - y$, where x and y are the deviations from the mean. Therefore
$\sum x^2 = \sum (X - \bar{X})^2 = \sum X^2 - n\bar{X}^2 = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)^2}{4} = \frac{n(n^2-1)}{12} = \sum y^2$. Now
$\sum d^2 = \sum (x - y)^2 = \sum x^2 + \sum y^2 - 2\sum xy \Rightarrow \sum xy = \frac{1}{2}\left[\sum x^2 + \sum y^2 - \sum d^2\right] = \frac{n(n^2-1)}{12} - \frac{1}{2}\sum d^2$.
Hence $r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{\frac{n(n^2-1)}{12} - \frac{1}{2}\sum d^2}{\frac{n(n^2-1)}{12}} = 1 - \frac{6\sum d^2}{n(n^2-1)}$.
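Spearman's formula is straightforward to evaluate once the two sets of ranks are known. A minimal sketch follows, assuming no tied ranks (as in the derivation above); the function name and the sample rankings are hypothetical.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """r = 1 - 6 * sum(d^2) / (n (n^2 - 1)) for untied ranks 1..n."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

same_order = spearman_rank_correlation([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reverse_order = spearman_rank_correlation([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
mixed_order = spearman_rank_correlation([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(same_order, reverse_order, mixed_order)
```

Identical rankings give +1 and exactly reversed rankings give -1, which are the two extreme values of the coefficient.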
Rank Correlation Coefficient lies between -1 and +1, including both the values. Since $\sum d^2$
is non-negative, the coefficient is maximum when $\sum d^2$ is minimum. Now $\sum d^2$ is minimum if
each d is zero, i.e., $\sum d^2 = 0$. Hence the maximum value of the coefficient is 1. The coefficient is minimum if $\sum d^2$
is maximum, and $\sum d^2$ will be maximum if the ranks of the n
individuals are in opposite order as shown below:
x: 1 2 3 ... n-1 n
y: n n-1 n-2 ... 2 1
Case I. When n is odd, say 2r + 1. Then the values of d are
2r, 2r - 2, ... , 4, 2, 0, -2, -4, ... , -(2r - 2), -2r. Therefore
$\sum d^2 = 2[(2r)^2 + (2r-2)^2 + \dots + 4^2 + 2^2] = 8[r^2 + (r-1)^2 + \dots + 2^2 + 1^2] = \frac{8}{6}r(r+1)(2r+1)$, and the rank correlation coefficient is
$1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{8r(r+1)(2r+1)}{(2r+1)(4r^2+4r+1-1)} = 1 - 2 = -1$.
Case II. When n is even, say 2r. Then the values of d are
(2r - 1), (2r - 3), ... , 3, 1, -1, -3, … , -(2r - 3), -(2r – 1). Therefore
$\sum d^2 = 2[(2r-1)^2 + (2r-3)^2 + \dots + 3^2 + 1^2]$
$= 2[(2r)^2 + (2r-1)^2 + (2r-2)^2 + \dots + 3^2 + 2^2 + 1^2 - \{(2r)^2 + (2r-2)^2 + \dots + 4^2 + 2^2\}] = \frac{2r(4r^2-1)}{3}$, and the rank correlation coefficient is
$1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{4r(4r^2-1)}{2r(4r^2-1)} = 1 - 2 = -1$.
Example 1. The students got the following percentage of marks in Economics and Statistics.
Calculate the coefficient of correlation.
Roll Nos.:            1   2   3   4   5   6   7   8   9   10
Marks in Economics:   78  36  98  25  75  82  90  62  65  39
Marks in Statistics:  84  51  91  60  68  62  86  58  53  47
Solution. Let the marks of two subjects be denoted by X and Y respectively. Then the mean
for X-series = Mx = 650/10 = 65 and the mean for Y-series = My = 660/10 = 66. If x and y
are the deviations of X's and Y's from their respective means, then the data may be
arranged in the following form:
X Y x y x2 y2 xy
75 68 10 2 100 4 20
82 62 17 -4 289 16 -68
62 58 -3 -8 9 64 24
65 53 0 -13 0 169 0
Therefore $r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{2704}{\sqrt{5398 \times 2224}} = 0.78$ approx.
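The totals used in this example can be reproduced from the raw marks. The sketch below assumes that the fourth and tenth Economics marks are 25 and 39, which are the values consistent with the stated mean of 65 and with the totals 5398 and 2704 quoted in the solution; the variable names are illustrative.

```python
import math

economics = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]   # assumed marks, see note above
statistics_marks = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]

def deviations_from_mean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

x = deviations_from_mean(economics)          # deviations from M_x = 65
y = deviations_from_mean(statistics_marks)   # deviations from M_y = 66

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
r = sum_xy / math.sqrt(sum_x2 * sum_y2)
print(sum_xy, sum_x2, sum_y2, round(r, 2))
```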
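Example 2. A computer, while calculating the correlation coefficient between two variates x
and y from 25 pairs of observations, obtained the following constants:
n = 25, $\sum x = 125$, $\sum x^2 = 650$, $\sum y = 100$, $\sum y^2 = 960$, $\sum xy = 508$. It was, however, later
discovered at the time of checking that he had copied down two pairs as
(x, y) = (6, 14) and (8, 6), while the correct values were (8, 12) and (6, 8). Obtain the correct value of the correlation coefficient.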
Solution. On account of the mistake, there will be no change in Σx, Σy and Σx², but there
will be a change in Σy² and Σxy. In Σy², the old value 14² + 6² = 232 is to be replaced by the new value
12² + 8² = 208. Hence the correct value of Σy² = 960 - 232 + 208 = 936. Also,
in Σxy, the old value 6 × 14 + 8 × 6 = 132 is to be replaced by the new value 8 × 12 + 6 × 8 = 144.
Hence the correct value of Σxy = 508 - 132 + 144 = 520.
Hence $r = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \dfrac{520}{\sqrt{650 \times 936}} = 0.666.$
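Example 3. Find the coefficient of correlation between the values of X and Y (short-cut
method):
X: 1 3 5 7 8 10
Y: 8 12 15 17 18 20
Solution. Let the assumed mean for X be 7 and for Y be 15, and write ξ = X - 7, η = Y - 15. Then
 X     Y     ξ     η    ξη    ξ²    η²
 1     8    -6    -7    42    36    49
 3    12    -4    -3    12    16     9
 5    15    -2     0     0     4     0
 7    17     0     2     0     0     4
 8    18     1     3     3     1     9
 10   20     3     5    15     9    25
 Total      -8     0    72    66    96
Hence, with n = 6,
$r = \dfrac{\sum \xi\eta - \frac{\sum\xi \sum\eta}{n}}{\sqrt{\left[\sum\xi^2 - \frac{(\sum\xi)^2}{n}\right]\left[\sum\eta^2 - \frac{(\sum\eta)^2}{n}\right]}} = \dfrac{72 - 0}{\sqrt{(66 - 10.67)(96 - 0)}} = \dfrac{72}{72.88} \approx 0.99.$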
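Example 4. Find the coefficient of correlation for the following table (step deviation
method):
x: 10 14 18 22 26 30
y: 18 12 24  6 30 36
Solution. Let u = (x - 22)/4 and v = (y - 24)/6. Then
 x     y     u     v    uv    u²    v²
 10   18    -3    -1     3     9     1
 14   12    -2    -2     4     4     4
 18   24    -1     0     0     1     0
 22    6     0    -3     0     0     9
 26   30     1     1     1     1     1
 30   36     2     2     4     4     4
 Total      -3    -3    12    19    19
Therefore, with n = 6,
$r = \dfrac{\sum uv - \frac{\sum u \sum v}{n}}{\sqrt{\left[\sum u^2 - \frac{(\sum u)^2}{n}\right]\left[\sum v^2 - \frac{(\sum v)^2}{n}\right]}} = \dfrac{12 - \frac{(-3)(-3)}{6}}{\sqrt{\left[19 - \frac{(-3)^2}{6}\right]\left[19 - \frac{(-3)^2}{6}\right]}} = \dfrac{10.5}{17.5} = 0.6.$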
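The short-cut and step deviation methods work because r is unchanged by a change of origin and scale. The sketch below illustrates this on the x and y values tabulated in Example 4; the function name `pearson_r` is chosen here for illustration and is not part of the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation computed from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return (sxy - sx * sy / n) / sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))

x = [10, 14, 18, 22, 26, 30]
y = [18, 12, 24, 6, 30, 36]
u = [(a - 22) / 4 for a in x]   # step deviations of x
v = [(b - 24) / 6 for b in y]   # step deviations of y

r_raw = pearson_r(x, y)    # from the original values
r_step = pearson_r(u, v)   # from the step deviations
print(round(r_raw, 4), round(r_step, 4))
```

Both calls give the same coefficient, so the smaller numbers u and v can be used for hand calculation without affecting the result.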
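Example 5. Find the coefficient of correlation for the following table:
 x \ y      0-5   5-10   10-15   15-20   20-25
 0-4         1      2
 4-8                4       5       8
 8-12                       3       4
 12-16                              2       1
Solution. Let u = (x - 10)/4 and v = (y - 12.5)/5, where x and y denote the mid-values of the
classes. In each cell the frequency f is followed, in brackets, by the value of fuv.
 x class   mid x    u    v=-2    v=-1    v=0     v=1     v=2      f     fu    fu²   fuv
 0-4         2     -2    1(4)    2(4)                             3     -6    12     8
 4-8         6     -1            4(4)    5(0)    8(-8)           17    -17    17    -4
 8-12       10      0                    3(0)    4(0)             7      0     0     0
 12-16      14      1                            2(2)    1(2)     3      3     3     4
 Total (f)                 1       6       8      14       1     30    -20    32     8
 fv                       -2      -6       0      14       2      8
 fv²                       4       6       0      14       4     28
 fuv                       4       8       0      -6       2      8
Therefore
$r = \dfrac{\sum fuv - \frac{\sum fu \sum fv}{\sum f}}{\sqrt{\left[\sum fu^2 - \frac{(\sum fu)^2}{\sum f}\right]\left[\sum fv^2 - \frac{(\sum fv)^2}{\sum f}\right]}} = \dfrac{8 - \frac{(-20)(8)}{30}}{\sqrt{\left[32 - \frac{(-20)^2}{30}\right]\left[28 - \frac{8^2}{30}\right]}} = \dfrac{8 + 5.333}{\sqrt{(32 - 13.333)(28 - 2.133)}} = \dfrac{13.333}{21.974} \approx 0.607.$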
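The bookkeeping for a bivariate frequency table is easy to get wrong by hand, so a small check is useful. This is a sketch only: the cell frequencies below are those of Example 5 as read from its row and column totals, keyed by the step deviations (u, v), and the variable names are illustrative.

```python
from math import sqrt

# frequency of each non-empty cell, keyed by (u, v)
cells = {
    (-2, -2): 1, (-2, -1): 2,
    (-1, -1): 4, (-1, 0): 5, (-1, 1): 8,
    (0, 0): 3, (0, 1): 4,
    (1, 1): 2, (1, 2): 1,
}

N = sum(cells.values())
sum_fu = sum(f * u for (u, v), f in cells.items())
sum_fv = sum(f * v for (u, v), f in cells.items())
sum_fu2 = sum(f * u * u for (u, v), f in cells.items())
sum_fv2 = sum(f * v * v for (u, v), f in cells.items())
sum_fuv = sum(f * u * v for (u, v), f in cells.items())

r = (sum_fuv - sum_fu * sum_fv / N) / sqrt(
    (sum_fu2 - sum_fu ** 2 / N) * (sum_fv2 - sum_fv ** 2 / N)
)
print(N, sum_fu, sum_fv, sum_fu2, sum_fv2, sum_fuv, round(r, 3))
```

The printed sums can be compared line by line with the marginal totals of the working table before the final coefficient is trusted.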
Example 6. Find the coefficient of correlation for the following table:
 y \ x    67   72   77   82   87   92   97
 92                       1    2    3    1
 87                  1    3    8    1    5
 82        4    4    6    4    9    1
 77        3    3    7    6    4
 72        2    3    5    6    1    1
 67        3    2
 62        1
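Solution. Let u = (x - 82)/5 and v = (y - 77)/5. The frequencies, with the row and column sums, are:
   y     v  |  u: -3   -2   -1    0    1    2    3 |    f     fv    fv²   fuv
  92     3  |                     1    2    3    1 |    7     21     63    33
  87     2  |                1    3    8    1    5 |   18     36     72    48
  82     1  |      4    4    6    4    9    1      |   28     28     28   -15
  77     0  |      3    3    7    6    4           |   23      0      0     0
  72    -1  |      2    3    5    6    1    1      |   18    -18     18    14
  67    -2  |      3    2                          |    5    -10     20    26
  62    -3  |      1                               |    1     -3      9     9
   f        |     13   12   19   20   24    6    6 |  100     54    210   115
  fu        |    -39  -24  -19    0   24   12   18 |  -28
  fu²       |    117   48   19    0   24   24   54 |  286
  fuv       |     21    6   -3    0   30   22   39 |  115
Therefore
$r = \dfrac{\sum fuv - \frac{\sum fu \sum fv}{\sum f}}{\sqrt{\left[\sum fu^2 - \frac{(\sum fu)^2}{\sum f}\right]\left[\sum fv^2 - \frac{(\sum fv)^2}{\sum f}\right]}} = \dfrac{115 - \frac{(-28)(54)}{100}}{\sqrt{\left[286 - \frac{(-28)^2}{100}\right]\left[210 - \frac{54^2}{100}\right]}} = \dfrac{130.12}{\sqrt{278.16 \times 180.84}} \approx 0.58.$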
Example 7. (a) If $d_i$ stands for the difference in ranks of the i-th individual and if $d_i$
= 0 for all values of i, then prove that r = 1.
(c) Show that in a ranked bivariate distribution in which no ties occur and in which the
variables are independent (i) $\sum d_i^2$ is always even
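Solution. (a) Since $d_i = 0$ for all i, $\sum d^2 = 0$ and $r = 1 - \dfrac{6\sum d^2}{n(n^2-1)} = 1 - 0 = 1.$
(b) The formula of rank correlation is $r = 1 - \dfrac{6\sum d^2}{n(n^2-1)}$. In case of perfect positive
correlation r = 1, so that $1 = 1 - \dfrac{6\sum d^2}{n(n^2-1)} \Rightarrow \sum d_i^2 = 0$, which is possible only when $d_i = 0$
for each i, i.e., when the ranks of both the variables are the same. Similarly, in case of perfect
negative correlation r = -1, so that $-1 = 1 - \dfrac{6\sum d^2}{n(n^2-1)}$, i.e., $\dfrac{6\sum d^2}{n(n^2-1)} = 2 \Rightarrow \sum d_i^2 = n(n^2-1)/3 = (n^3 - n)/3$.
But since $r \ge -1$, we have $1 - \dfrac{6\sum d^2}{n(n^2-1)} \ge -1$, which gives $\dfrac{6\sum d^2}{n(n^2-1)} \le 2$.
Hence $\min \sum d_i^2 = 0$ and $\max \sum d_i^2 = \tfrac{1}{3}(n^3 - n)$.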
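(c) Let the ranks of two characteristics A and B of n individuals be $x_1, x_2, \dots, x_n$ and
$y_1, y_2, \dots, y_n$ respectively. Then
$\sum x_i = 1 + 2 + 3 + \dots + n = n(n+1)/2 = \sum y_i \Rightarrow \bar{x} = \bar{y} = \dfrac{n+1}{2}$, and
$\sum x_i^2 = 1^2 + 2^2 + 3^2 + \dots + n^2 = n(n+1)(2n+1)/6 = \sum y_i^2 \Rightarrow \sum x_i^2/n = (n+1)(2n+1)/6 = \sum y_i^2/n.$
Therefore,
$\mathrm{Var}(x) = \dfrac{1}{n}\sum (x_i - \bar{x})^2 = \dfrac{1}{n}\sum x_i^2 - \bar{x}^2 = \dfrac{(n+1)(2n+1)}{6} - \dfrac{(n+1)^2}{4} = \dfrac{n^2-1}{12} = \mathrm{Var}(y).$
Now $d_i = x_i - y_i = (x_i - \bar{x}) - (y_i - \bar{y})$, so that
$\sum d_i^2 = \sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2 - 2\sum (x_i - \bar{x})(y_i - \bar{y})$
$= n\,\mathrm{Var}(x) + n\,\mathrm{Var}(y) - 2n\,\mathrm{Cov}(x, y) = 2n\,\dfrac{n^2-1}{12} - 2n\,\mathrm{Cov}(x, y).$
Equivalently, since $\sum x_i^2 = \sum y_i^2$, we have $\sum d_i^2 = 2\left(\sum x_i^2 - \sum x_i y_i\right)$, which is twice an integer.
This yields that $\sum d_i^2$ is always even.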
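The bounds on $\sum d^2$ and its parity can be confirmed by brute force for small n. The sketch below enumerates every untied ranking against the natural order 1..n; the helper `sum_d2` is a name introduced here for illustration.

```python
from itertools import permutations

def sum_d2(perm):
    """Sum of squared rank differences between 1..n and the ranking perm."""
    return sum((i - p) ** 2 for i, p in enumerate(perm, start=1))

for n in range(2, 7):
    values = [sum_d2(p) for p in permutations(range(1, n + 1))]
    assert all(v % 2 == 0 for v in values)     # always even
    assert min(values) == 0                    # identical rankings
    assert max(values) == (n ** 3 - n) // 3    # exactly reversed rankings
    print(n, min(values), max(values))
```

The enumeration grows as n!, so it is only a check for small n and not a substitute for the algebraic argument.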
Example 8. Calculate the coefficient of correlation from the data given below by the
method of differences:
X: 78 89 97 69 59 79 68 57
Y: 125 137 156 112 107 136 123 108
Solution.
 X     Y     Rank of X   Rank of Y   d = rank X - rank Y   d²
 78   125        4           4               0              0
 89   137        2           2               0              0
 97   156        1           1               0              0
 69   112        5           6              -1              1
 59   107        7           8              -1              1
 79   136        3           3               0              0
 68   123        6           5               1              1
 57   108        8           7               1              1
 Total                                    Σd = 0         Σd² = 4
$r = 1 - \dfrac{6\sum d^2}{n(n^2-1)} = 1 - \dfrac{6 \times 4}{8(64-1)} = 1 - \dfrac{3}{63} = \dfrac{20}{21} = 0.95.$
Example 9. The rankings of ten students in two subjects A and B are as follows:
A: 3 5 8 4 7 10 2 1 6 9
B: 6 4 9 8 1 2 3 10 5 7
Solution.
  A     B    d = A - B    d²
  3     6       -3         9
  5     4        1         1
  8     9       -1         1
  4     8       -4        16
  7     1        6        36
 10     2        8        64
  2     3       -1         1
  1    10       -9        81
  6     5        1         1
  9     7        2         4
 Total        Σd = 0    Σd² = 214
$r = 1 - \dfrac{6\sum d^2}{n(n^2-1)} = 1 - \dfrac{6 \times 214}{10(100-1)} = -\dfrac{147}{495} \approx -0.3.$
Example 10. Ten competitors in a beauty contest got marks by three judges in the
following orders:
First Judge  : 1 6 5 10 3 2 4 9 7 8
Second Judge : 3 5 8 4 7 10 2 1 6 9
Third Judge  : 6 4 9 8 1 2 3 10 5 7
Use the rank correlation coefficient to discuss which pair of judges have the nearest
approach to common tastes in beauty.
Solution. In the table each judge's order is followed by the corresponding rank.
 First Judge   Second Judge   Third Judge   Rank diff.  Rank diff.  Rank diff.
 order  rank   order  rank    order  rank      d12         d13         d23      d12²  d13²  d23²
   1     10      3      8       6      5         2           5           3        4    25     9
   6      5      5      6       4      7        -1          -2          -1        1     4     1
   5      6      8      3       9      2         3           4           1        9    16     1
  10      1      4      7       8      3        -6          -2           4       36     4    16
   3      8      7      4       1     10         4          -2          -6       16     4    36
   2      9     10      1       2      9         8           0          -8       64     0    64
   4      7      2      9       3      8        -2          -1           1        4     1     1
   9      2      1     10      10      1        -8           1           9       64     1    81
   7      4      6      5       5      6        -1          -2          -1        1     4     1
   8      3      9      2       7      4         1          -1          -2        1     1     4
 Total                                                                           200    60   214
$r_{12} = 1 - \dfrac{6\sum d_{12}^2}{n(n^2-1)} = 1 - \dfrac{6 \times 200}{10(100-1)} = -\dfrac{7}{33}, \quad r_{13} = 1 - \dfrac{6\sum d_{13}^2}{n(n^2-1)} = 1 - \dfrac{6 \times 60}{10(100-1)} = \dfrac{7}{11},$
$r_{23} = 1 - \dfrac{6\sum d_{23}^2}{n(n^2-1)} = 1 - \dfrac{6 \times 214}{10(100-1)} = -\dfrac{49}{165}.$
Since $r_{13}$ is the largest, we conclude that the first and third judges have the nearest
approach to common tastes.
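The pairwise comparison of the judges can be sketched in a few lines. The orders are copied from Example 10; the dictionary keys and the helper `spearman` are illustrative names, and no ties are assumed.

```python
from itertools import combinations

judges = {
    "first":  [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "second": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "third":  [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

def spearman(a, b):
    """Rank correlation 1 - 6*sum(d^2) / (n(n^2 - 1)) for untied orders."""
    n = len(a)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

results = {
    (j, k): spearman(judges[j], judges[k])
    for j, k in combinations(judges, 2)
}
closest_pair = max(results, key=results.get)
for pair, value in results.items():
    print(pair, round(value, 3))
print("closest:", closest_pair)
```

Working with the orders or with the reversed ranks (11 minus the order) gives the same squared differences, so either column of the table may be used.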
regression is the straight line which gives the best fit in the least-squares sense to the given
frequency. The method of least squares can be used to fit a straight line to the set of
points given on the scatter diagram. Transfer the origin to the point
(Mx, My), where Mx and My are the means of the x-series and y-series
respectively. Let x, y be the deviations from the respective means Mx
and My, i.e., x = X - Mx and y = Y - My. Let Y = aX + b be the
equation of the line of best fit of Y on X. Changing the origin to (Mx, My),
it will have the form y = ax + b, where y = Y - My and x = X - Mx.
Consider a dot $(x_r, y_r)$; the difference between this point and the line is $y_r - ax_r - b$.
Let U be the sum of the squares of such distances, i.e., $U = \sum (y - ax - b)^2$ over all r. By the
principle of least squares, we choose a and b so that U is minimum:
$\dfrac{\partial U}{\partial a} = -2\sum x(y - ax - b) = 0$ and $\dfrac{\partial U}{\partial b} = -2\sum (y - ax - b) = 0$, which imply $\sum xy - a\sum x^2 - b\sum x = 0$ and $\sum y - a\sum x - nb = 0$.
Since $\sum x = \sum y = 0$, these give b = 0 and $a = \sum xy / \sum x^2 = r\sigma_y/\sigma_x$, so the regression line of Y on X is
$Y - M_y = \dfrac{r\sigma_y}{\sigma_x}(X - M_x).$
Similarly, minimising the sum of the squares of the horizontal distances gives
$X - M_x = \dfrac{r\sigma_x}{\sigma_y}(Y - M_y).$
This is called the regression line of X on Y. Here $\dfrac{r\sigma_y}{\sigma_x}$ and $\dfrac{r\sigma_x}{\sigma_y}$ are called the regression
coefficients of y on x and x on y, denoted by $b_{yx}$ and $b_{xy}$ respectively, and
$b_{yx} b_{xy} = \dfrac{r\sigma_y}{\sigma_x} \cdot \dfrac{r\sigma_x}{\sigma_y} = r^2$, or the coefficient of correlation is the G.M. of the coefficients of
regression.
Note: (i) If r = +1 or -1, the two regression lines will coincide. The variables are perfectly
correlated. If r = -1, the variables are perfectly negatively correlated, low values of one
corresponding to high values of the other. If r = +1, the variables are perfectly positively
correlated, high values of one corresponding to high values of the other.
(ii) If r = 0, the two lines of regression become X = Mx and Y = My, which are two lines
parallel to the Y and X axes respectively, passing through the means Mx and My. They are
perpendicular to each other. It means that the mean values of X and Y do not change with Y
and X respectively, i.e., X and Y are uncorrelated.
4.4.6. Covariance: If $\bar{x}$, $\bar{y}$ be the expected values (or means) of two variates x and y, then
the covariance between x and y is defined by the relation
$\mathrm{Cov}(x, y) = E[(x - \bar{x})(y - \bar{y})].$
Note: The covariance of two independent variates is equal to zero. For if x and y are
independent, then $\mathrm{Cov}(x, y) = E[(x - \bar{x})(y - \bar{y})] = E(x - \bar{x})\,E(y - \bar{y}) = 0$, since here
$E(x - \bar{x}) = E(x) - \bar{x} = 0 = E(y - \bar{y})$.
Correlation coefficient: With the above notation the correlation coefficient r is defined by
the formula
$r = \dfrac{E[(x - \bar{x})(y - \bar{y})]}{\sqrt{E(x - \bar{x})^2\, E(y - \bar{y})^2}} = \dfrac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}.$
If x and y are independent variates then Cov(x, y) = 0, so r = 0, i.e., they are uncorrelated.
$[E(xy)]^2 \le E(x^2)\,E(y^2).$
Proof. For any real constant a, $(ax - y)^2 \ge 0$, therefore $E(ax - y)^2 \ge 0$,
i.e., $a^2E(x^2) + E(y^2) - 2aE(xy) \ge 0$. Since a is arbitrary, put $a = \dfrac{E(xy)}{E(x^2)}$; then
$\dfrac{[E(xy)]^2}{E(x^2)} + E(y^2) - \dfrac{2[E(xy)]^2}{E(x^2)} \ge 0 \Rightarrow [E(xy)]^2 + E(x^2)E(y^2) - 2[E(xy)]^2 \ge 0,$
i.e., $[E(xy)]^2 \le E(x^2)\,E(y^2).$
Theorem 3: If one regression coefficient is greater than 1, then the other is less than 1.
Proof. If $b_{yx} \ge 1$ then $1/b_{yx} \le 1$. Now $b_{yx} b_{xy} = r^2 \le 1$ implies that $b_{xy} \le 1/b_{yx} \le 1$.
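Example 1. Prove that Pearson's coefficient of correlation r lies between -1 and +1.
Solution. With x and y measured from their means, let $U = \sum (y - ax - b)^2$. For U to be a minimum,
$\dfrac{\partial U}{\partial a} = -2\sum x(y - ax - b) = 0$ and $\dfrac{\partial U}{\partial b} = -2\sum (y - ax - b) = 0$, which imply $\sum xy - a\sum x^2 - b\sum x = 0$ and $\sum y - a\sum x - nb = 0$.
Since $\sum x = \sum y = 0$, these give b = 0 and $a = \sum xy/\sum x^2$, and the minimum value of U is
$U = \sum y^2 - a\sum xy = \sum y^2 - \dfrac{(\sum xy)^2}{\sum x^2} = \sum y^2\left[1 - \dfrac{(\sum xy)^2}{\sum x^2 \sum y^2}\right] = \sum y^2 (1 - r^2) \ge 0,$
because U is a sum of squares. Hence $r^2 \le 1$, i.e., $-1 \le r \le 1$.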
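Example 2. Prove that the A.M. of the coefficients of regression is greater than the coefficient of
correlation.
Solution. Here we are to prove that $\dfrac{1}{2}\left(\dfrac{r\sigma_y}{\sigma_x} + \dfrac{r\sigma_x}{\sigma_y}\right) \ge r$,
or $\sigma_x^2 + \sigma_y^2 \ge 2\sigma_x\sigma_y$, or $(\sigma_x - \sigma_y)^2 \ge 0$, which is true.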
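Example 3. If θ is the acute angle between the two regression lines, in the case of two
variables x and y, show that
$\tan\theta = \dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$, where r, $\sigma_x$, $\sigma_y$ have their usual meanings. Moreover, explain the
significance of the formula when r = 0 and r = ± 1.
Solution. If $\theta_1$, $\theta_2$ are the angles which the two regression lines make with the x-axis, then
$\tan\theta_1 = \dfrac{r\sigma_y}{\sigma_x}, \quad \tan\theta_2 = \dfrac{\sigma_y}{r\sigma_x}$. So
$\tan\theta = \tan(\theta_2 - \theta_1) = \dfrac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y}{r\sigma_x} \cdot \dfrac{r\sigma_y}{\sigma_x}} = \dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}.$
When r = 0 then θ = π/2, i.e., the two lines of regression are perpendicular to each other.
The estimated value of y is the same for all values of x, or vice versa. Also, if r = ± 1 then θ =
0, hence the lines of regression coincide and there is a perfect correlation between the two
variates x and y.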
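The behaviour of the angle at the special values of r can be seen numerically. The following is a minimal sketch of the formula above; the function name `regression_angle` and the sample standard deviations 2 and 3 are assumptions made for illustration.

```python
from math import atan, degrees, pi

def regression_angle(r, sigma_x, sigma_y):
    """Acute angle (in radians) between the two regression lines."""
    if r == 0:
        return pi / 2  # the lines X = Mx and Y = My are perpendicular
    tan_theta = abs((1 - r * r) / r) * sigma_x * sigma_y / (sigma_x ** 2 + sigma_y ** 2)
    return atan(tan_theta)

for r in (0, 0.5, 1, -1):
    print(r, round(degrees(regression_angle(r, 2.0, 3.0)), 2))
```

The absolute value keeps the angle acute for negative r, and the case r = 0 is handled separately because the formula involves division by r.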
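Example 4. The following marks have been obtained by a class of students in Statistics.
Paper I  : 45 55 56 58 60 65 68 70 75 80 85
Paper II : 56 50 48 60 62 64 65 70 74 82 90
Compute the coefficient of correlation for the above data. Find also the equations of the
lines of regression.
Solution. Taking assumed means 65 and 70 for the first paper (x) and the second paper
(y) respectively, and writing ξ = x - 65, η = y - 70, we get the following table:
 Paper I (x)   ξ = x - 65    ξ²    Paper II (y)   η = y - 70    η²     ξη
     45           -20        400       56            -14        196    280
     55           -10        100       50            -20        400    200
     56            -9         81       48            -22        484    198
     58            -7         49       60            -10        100     70
     60            -5         25       62             -8         64     40
     65             0          0       64             -6         36      0
     68             3          9       65             -5         25    -15
     70             5         25       70              0          0      0
     75            10        100       74              4         16     40
     80            15        225       82             12        144    180
     85            20        400       90             20        400    400
  Total             2       1414                     -49       1865   1393
Then, with n = 11,
$r = \dfrac{\sum \xi\eta - \frac{\sum\xi \sum\eta}{n}}{\sqrt{\left[\sum\xi^2 - \frac{(\sum\xi)^2}{n}\right]\left[\sum\eta^2 - \frac{(\sum\eta)^2}{n}\right]}} = \dfrac{1393 - \frac{2(-49)}{11}}{\sqrt{\left[1414 - \frac{4}{11}\right]\left[1865 - \frac{49^2}{11}\right]}} = 0.919.$
Also
$\sigma_x = \sqrt{\dfrac{\sum\xi^2}{n} - \left(\dfrac{\sum\xi}{n}\right)^2} = \sqrt{\dfrac{1414}{11} - \left(\dfrac{2}{11}\right)^2} = 11.336, \quad \sigma_y = \sqrt{\dfrac{\sum\eta^2}{n} - \left(\dfrac{\sum\eta}{n}\right)^2} = \sqrt{\dfrac{1865}{11} - \left(\dfrac{49}{11}\right)^2} = 12.235.$
Therefore the regression coefficient of y on x is $b_{yx} = \dfrac{r\sigma_y}{\sigma_x} = \dfrac{0.919 \times 12.235}{11.336} = 0.992$ and the regression
coefficient of x on y is $b_{xy} = \dfrac{r\sigma_x}{\sigma_y} = \dfrac{0.919 \times 11.336}{12.235} = 0.851$. Also the mean of the x-series is
$M_x = \text{Assumed mean} + \dfrac{\sum\xi}{n} = 65 + \dfrac{2}{11} = 65.2$ and the mean of the y-series is $M_y = 70 - \dfrac{49}{11} = 65.5$.
Hence the lines of regression are y - 65.5 = 0.992(x - 65.2) and x - 65.2 = 0.851(y - 65.5).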
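Height of father : 65 66 67 67 68 69 71 73
Height of son    : 67 68 64 68 72 70 69 70
Form the two lines of regression and calculate the expected average height of the son when
the height of the father is 67.5 inches.
Solution. Let x and y denote the heights of father and son, and write ξ = x - 69, η = y - 69. Then
  x     y    ξ = x - 69   η = y - 69    ξ²    η²    ξη
 65    67       -4           -2         16     4     8
 66    68       -3           -1          9     1     3
 67    64       -2           -5          4    25    10
 67    68       -2           -1          4     1     2
 68    72       -1            3          1     9    -3
 69    70        0            1          0     1     0
 71    69        2            0          4     0     0
 73    70        4            1         16     1     4
 Total          -6           -4         54    42    24
With n = 8,
$\sigma_x = \sqrt{\dfrac{\sum\xi^2}{n} - \left(\dfrac{\sum\xi}{n}\right)^2} = \sqrt{\dfrac{54}{8} - \left(\dfrac{6}{8}\right)^2} = 2.49, \quad \sigma_y = \sqrt{\dfrac{\sum\eta^2}{n} - \left(\dfrac{\sum\eta}{n}\right)^2} = \sqrt{\dfrac{42}{8} - \left(\dfrac{4}{8}\right)^2} = 2.23,$ and
$r = \dfrac{\sum \xi\eta - \frac{\sum\xi \sum\eta}{n}}{\sqrt{\left[\sum\xi^2 - \frac{(\sum\xi)^2}{n}\right]\left[\sum\eta^2 - \frac{(\sum\eta)^2}{n}\right]}} = \dfrac{24 - 3}{\sqrt{\left[54 - \frac{9}{2}\right]\left[42 - 2\right]}} = 0.47.$
Therefore the regression coefficient of y on x is $b_{yx} = \dfrac{r\sigma_y}{\sigma_x} = \dfrac{0.47 \times 2.23}{2.49} = 0.421$ and the regression
coefficient of x on y is $b_{xy} = \dfrac{r\sigma_x}{\sigma_y} = \dfrac{0.47 \times 2.49}{2.23} = 0.52$. The means are $M_x = 69 - 6/8 = 68.25$ and
$M_y = 69 - 4/8 = 68.5$. Hence the lines of regression are:
y - 68.5 = 0.421(x - 68.25) and x - 68.25 = 0.52(y - 68.5).
Now for the height of the father x = 67.5, the expected height of the son is given by
y - 68.5 = 0.421(x - 68.25) = 0.421(67.5 - 68.25), i.e., y = 68.19.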
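The whole calculation can be repeated directly from the raw heights, without an assumed mean. This is a sketch for checking purposes only; the variable names are illustrative, and the results differ from the hand-rounded values in the last decimal place because no intermediate rounding is done.

```python
from math import sqrt

father = [65, 66, 67, 67, 68, 69, 71, 73]
son = [67, 68, 64, 68, 72, 70, 69, 70]
n = len(father)

mx = sum(father) / n
my = sum(son) / n
sxx = sum((x - mx) ** 2 for x in father)
syy = sum((y - my) ** 2 for y in son)
sxy = sum((x - mx) * (y - my) for x, y in zip(father, son))

r = sxy / sqrt(sxx * syy)
b_yx = sxy / sxx   # regression coefficient of y on x
b_xy = sxy / syy   # regression coefficient of x on y

# expected height of the son when the father is 67.5 inches tall
predicted_son = my + b_yx * (67.5 - mx)
print(mx, my, round(r, 3), round(b_yx, 3), round(b_xy, 3), round(predicted_son, 2))
```

The product of the two regression coefficients reproduces r squared, which is a convenient check on the arithmetic.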
Solution. On solving the given regression equations, we get the means of x and y as $\bar{x}$ =
13, $\bar{y}$ = 17. Also, the given regression equations show that the regression coefficient of
y on x is $b_{yx}$ = 0.8 and that of x on y is $b_{xy}$ = 0.45, and the correlation coefficient is given by
$r^2 = b_{yx} b_{xy} = 0.8 \times 0.45 = 0.36$, which gives r = 0.6. Further, the variance of x is $\sigma_x^2 = 9$, hence $\sigma_x = 3$.
Therefore $b_{yx} = r\sigma_y/\sigma_x = 0.6\,\sigma_y/3 = 0.8$ gives $\sigma_y = 4$.
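$r = \dfrac{\mathrm{Cov}(u, v)}{\sqrt{\mathrm{Var}(u)\,\mathrm{Var}(v)}}$. Here u = x + y and v = x - y, so that $\bar{u} = \bar{x} + \bar{y}$, $\bar{v} = \bar{x} - \bar{y}$. Then
$\mathrm{Cov}(u, v) = E[(u - \bar{u})(v - \bar{v})] = E(x - \bar{x})^2 - E(y - \bar{y})^2 = \sigma_x^2 - \sigma_y^2$. Also,
when x and y are uncorrelated, $\mathrm{Var}(u) = \mathrm{Var}(v) = \sigma_x^2 + \sigma_y^2$.
Therefore
$r = \dfrac{\mathrm{Cov}(u, v)}{\sqrt{\mathrm{Var}(u)\,\mathrm{Var}(v)}} = \dfrac{\sigma_x^2 - \sigma_y^2}{\sigma_x^2 + \sigma_y^2}.$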
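Example 8. Two independent variables x and y have means 5 and 10 and variances 4 and 9
respectively. Obtain the coefficient of correlation between u and v, where
u = 3x + 4y and v = 3x - y.
Solution. Here E(u) = 3E(x) + 4E(y) and E(v) = 3E(x) - E(y).
Hence u - E(u) = 3[x - E(x)] + 4[y - E(y)] and v - E(v) = 3[x - E(x)] - [y - E(y)]. Therefore
$\mathrm{Cov}(u, v) = E[(u - \bar{u})(v - \bar{v})] = 9E[(x - \bar{x})^2] - 4E[(y - \bar{y})^2] + 9E[(x - \bar{x})(y - \bar{y})]$, i.e.,
$\mathrm{Cov}(u, v) = 9\,\mathrm{Var}(x) - 4\,\mathrm{Var}(y) + 9\,\mathrm{Cov}(x, y) = 9 \times 4 - 4 \times 9 + 9 \times 0 = 0 \Rightarrow r = \dfrac{\mathrm{Cov}(u, v)}{\sigma_u\sigma_v} = 0.$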
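The expansion used above is the general rule for the covariance of two linear combinations. A small sketch follows, with `cov_linear` as a hypothetical helper name; it assumes only the variances of x and y and their covariance are known.

```python
def cov_linear(a, b, var_x, var_y, cov_xy):
    """Cov(a1*x + a2*y, b1*x + b2*y) from Var(x), Var(y) and Cov(x, y)."""
    a1, a2 = a
    b1, b2 = b
    return a1 * b1 * var_x + a2 * b2 * var_y + (a1 * b2 + a2 * b1) * cov_xy

# Example 8: u = 3x + 4y, v = 3x - y, Var(x) = 4, Var(y) = 9, x and y independent
cov_uv = cov_linear((3, 4), (3, -1), 4, 9, 0)
print(cov_uv)

# u = x + y, v = x - y gives Var(x) - Var(y), whatever Cov(x, y) is
print(cov_linear((1, 1), (1, -1), 5, 2, 0.7))
```

A zero covariance makes the correlation coefficient zero without the variances of u and v having to be computed.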
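Example 9. Show that the coefficient of correlation between two variables x and y is given
by $r = \dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\sigma_x\sigma_y}$.
Solution. Since $\sigma_{x-y}^2 = \mathrm{Var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\,\mathrm{Cov}(x, y)$, we have
$\mathrm{Cov}(x, y) = \dfrac{1}{2}\left(\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2\right) \Rightarrow r = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\sigma_y} = \dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\sigma_x\sigma_y}.$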
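Example 11. If $x_1, x_2, x_3$ are three uncorrelated variates having standard deviations $\sigma_1, \sigma_2,
\sigma_3$ respectively, obtain the coefficient of correlation between $(x_1 + x_2)$ and $(x_2 + x_3)$.
Solution. Let $u = x_1 + x_2$ and $v = x_2 + x_3$. Since the variates are uncorrelated,
$\mathrm{Cov}(u, v) = E\{(x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + (x_1 - \bar{x}_1)(x_3 - \bar{x}_3) + (x_2 - \bar{x}_2)(x_3 - \bar{x}_3) + (x_2 - \bar{x}_2)^2\} = \sigma_2^2$, and
$r = \dfrac{\mathrm{Cov}(u, v)}{\sqrt{\mathrm{Var}(u)\,\mathrm{Var}(v)}} = \dfrac{\sigma_2^2}{\sqrt{(\sigma_1^2 + \sigma_2^2)(\sigma_2^2 + \sigma_3^2)}}.$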
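Example 12. The correlation coefficient between two variables x and y is 0.32, the covariance is 7.86
and the variance of x is 10. Find the variance of y.
Solution. We are given that r = 0.32, Cov(x, y) = 7.86 and $\sigma_x^2 = 10$, so that $\sigma_x = \sqrt{10} = 3.162$. Now
$r = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\sigma_y} \Rightarrow \sigma_y = \dfrac{\mathrm{Cov}(x, y)}{r\sigma_x} = \dfrac{7.86}{0.32 \times 3.162} = 7.77$. Therefore Var(y) = $\sigma_y^2$ = 60.3 approx.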
4.5. Partial and Multiple Correlation: In our previous sections we have discussed the
correlation between two variates only. When the values of one variable are influenced by
those of another, the coefficient of correlation provides a useful measure of the degree of
association between them. But it often happens that the values of a variable are influenced
by those of several others. Therefore it becomes necessary to find the correlation between three
or more variates. For example:
(i) Number of children (x1) per family depending on income (x2) of the family and age (x3)
at marriage.
(iii) Crimes (x1) depending upon illiteracy (x2) and increased population (x3).
Here we shall consider the case of three mutually correlated variables only.
When we consider the combined influence of two or more variates (x1, x2, x3, ...)
upon a variate not included among them, our study is of multiple correlation. The
degree of relationship existing between three or more variables is called multiple
correlation. The correlation between two variates (x1, x2) when the linear effect of the third
variate (x3) has been eliminated from both is called partial correlation. For example,
if we study the effects of both rainfall (x2) and fertilizers (x3) on the production of wheat (x1),
then it is an example of multiple correlation, but if we eliminate the effect of any one of
(x2) or (x3), then the correlation between (x3, x1) or (x2, x1) is an example of partial
correlation.
Two variables. Let $\sigma_1$ and $\sigma_2$ be the standard deviations of two variables x1 and x2
measured from their means (i.e., $x_1 = X_1 - \bar{X}_1$ and $x_2 = X_2 - \bar{X}_2$), where E(x1) = 0 = E(x2). Let
the lines of regression of x1 on x2 and of x2 on x1 be denoted by the symbols
$x_1 = b_{12}x_2$ and $x_2 = b_{21}x_1$; here $b_{12}$ and $b_{21}$ are the regression coefficients of the lines of
regression of x1 on x2 and of x2 on x1 respectively. Write
$x_{1.2} = x_1 - b_{12}x_2$ and $x_{2.1} = x_2 - b_{21}x_1$ … (i)
Here $x_{1.2}$ and $x_{2.1}$ are called residuals, which are the deviations of the representative
points from the corresponding line of regression. The values of $b_{12}$ and $b_{21}$ are obtained
by the principle of least squares, which leads to the normal equations
$\sum x_2 x_{1.2} = 0$ and $\sum x_1 x_{2.1} = 0$, so that
$r_{12}^2 = b_{12}b_{21}$ and $b_{12} = r_{12}\dfrac{\sigma_1}{\sigma_2}, \quad b_{21} = r_{21}\dfrac{\sigma_2}{\sigma_1}.$
If we denote the mean squares of the deviations in (i) by $\sigma_{1.2}^2$ and $\sigma_{2.1}^2$, then
$\sigma_{1.2}^2 = \dfrac{1}{N}\sum x_{1.2}^2 = \sigma_1^2(1 - r_{12}^2)$ and $\sigma_{2.1}^2 = \dfrac{1}{N}\sum x_{2.1}^2 = \sigma_2^2(1 - r_{21}^2).$
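Three variables. Let $\sigma_1$, $\sigma_2$ and $\sigma_3$ be the standard deviations of three variables x1, x2 and x3
measured from their means (i.e., $x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$ and $x_3 = X_3 - \bar{X}_3$), where E(x1) =
E(x2) = E(x3) = 0. Let the plane of regression of x1 on x2 and x3 be
$x_1 = a + b_{12.3}x_2 + b_{13.2}x_3$; here $b_{12.3}$ and $b_{13.2}$ are the slopes of the straight lines in the graphs of x1 against
x2 keeping x3 constant, and of x1 against x3 keeping x2 constant, respectively. These are known
as the partial regression coefficients. Here the first subscript attached to the b's is the
subscript of the dependent variable for which an estimate is being found, and the second subscript
is that of the variable x which the coefficient multiplies. These are called the primary subscripts.
After the primary subscripts, and separated from them by a dot, are placed the subscripts
of the other variables that enter into the equation. These are called the secondary
subscripts, and their number determines the order of the regression coefficient. The values
of a and the b's are obtained by the principle of least squares.
Let $U = \sum (x_1 - a - b_{12.3}x_2 - b_{13.2}x_3)^2 = \sum x_{1.23}^2$. Here the $x_{1.23}$ are called residuals, which are the
deviations of the representative points from the corresponding plane of regression.
Now $\dfrac{\partial U}{\partial a} = \dfrac{\partial U}{\partial b_{12.3}} = \dfrac{\partial U}{\partial b_{13.2}} = 0$ gives the normal equations:
$\sum x_1 = Na + b_{12.3}\sum x_2 + b_{13.2}\sum x_3$ … (i)
$\sum x_1x_2 = a\sum x_2 + b_{12.3}\sum x_2^2 + b_{13.2}\sum x_2x_3$ … (ii)
$\sum x_1x_3 = a\sum x_3 + b_{12.3}\sum x_2x_3 + b_{13.2}\sum x_3^2$ … (iii)
Now since $\sum x_1 = \sum x_2 = \sum x_3 = 0$, by (i) a = 0, and (ii), (iii) reduce to
$b_{12.3}\sum x_2^2 + b_{13.2}\sum x_2x_3 - \sum x_1x_2 = 0$ … (iv)
$b_{12.3}\sum x_2x_3 + b_{13.2}\sum x_3^2 - \sum x_1x_3 = 0$ … (v)
Since $\sum x_2^2 = N\sigma_2^2$, $\sum x_3^2 = N\sigma_3^2$, $\sum x_2x_3 = Nr_{23}\sigma_2\sigma_3$ etc., where $r_{ij}$ is the coefficient of correlation
between the variables $x_i$ and $x_j$, equations (iv) and (v) give
$b_{12.3}\sigma_2^2 + b_{13.2}(r_{23}\sigma_2\sigma_3) - r_{12}\sigma_1\sigma_2 = 0$ or $b_{12.3}\sigma_2 + b_{13.2}r_{23}\sigma_3 = r_{12}\sigma_1$ … (vi)
$b_{12.3}(r_{23}\sigma_2\sigma_3) + b_{13.2}\sigma_3^2 - r_{13}\sigma_1\sigma_3 = 0$ or $b_{12.3}r_{23}\sigma_2 + b_{13.2}\sigma_3 = r_{13}\sigma_1$ … (vii).
On solving equations (vi) and (vii) for $b_{12.3}$ and $b_{13.2}$, we have
$b_{12.3} = \dfrac{\sigma_1}{\sigma_2} \cdot \dfrac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}, \quad b_{13.2} = \dfrac{\sigma_1}{\sigma_3} \cdot \dfrac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}.$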
Property 1. The sum of the products of corresponding values of a variate and a residual is
zero, provided the subscript of the variate occurs among the secondary subscripts of the
residual.
Proof. Let the regression equation of x1 on x2 and x3 be $x_1 = b_{12.3}x_2 + b_{13.2}x_3$; then the
normal equations are $\sum x_2x_{1.23} = 0 = \sum x_3x_{1.23}$ etc.
Property 2. The sum of the products of two residuals is unaltered by omitting from one
residual any or all of the secondary subscripts which are common to both.
Proof. Let $x_{1.2} = x_1 - b_{12}x_2$; then with the help of the normal equations
$\sum x_{1.23}x_{1.2} = \sum x_{1.23}(x_1 - b_{12}x_2) = \sum x_{1.23}x_1$ and $\sum x_{1.23}x_{1.23} = \sum x_{1.23}(x_1 - b_{12.3}x_2 - b_{13.2}x_3) = \sum x_{1.23}x_1$.
Property 3. The sum of the products of two residuals is zero, provided all the subscripts of
one residual occur among the secondary subscripts of the second.
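Standard Deviation of the Residuals. The standard deviation of the residual $x_{1.23}$ (of x1 after the
effect of x2 and x3 has been removed) is denoted by $\sigma_{1.23}$. Since a = 0, we have
$\sigma_{1.23}^2 = \dfrac{1}{N}\sum x_{1.23}^2 = \dfrac{1}{N}\sum (x_1 - b_{12.3}x_2 - b_{13.2}x_3)^2$
$= \dfrac{1}{N}\sum (x_1 - b_{12.3}x_2 - b_{13.2}x_3)(x_1 - b_{12.3}x_2 - b_{13.2}x_3) = \dfrac{1}{N}\sum x_1(x_1 - b_{12.3}x_2 - b_{13.2}x_3)$
$= \sigma_1^2 - b_{12.3}\sigma_1\sigma_2r_{12} - b_{13.2}\sigma_1\sigma_3r_{13} = \sigma_1^2\left[1 - r_{12}\dfrac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2} - r_{13}\dfrac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}\right]$
$= \sigma_1^2\,\dfrac{1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}.$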
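Multiple Correlation coefficient. Let $R_{1(23)}$ denote the multiple correlation of the variable x1
on x2 and x3. Then it is defined as the correlation between x1 and its estimate
$X_1 = b_{12.3}x_2 + b_{13.2}x_3 = x_1 - x_{1.23}$, i.e.,
$R_{1(23)} = \dfrac{\sum x_1X_1}{\sqrt{\sum x_1^2 \sum X_1^2}} = \dfrac{\sum x_1(b_{12.3}x_2 + b_{13.2}x_3)}{\sqrt{\sum x_1^2 \sum (b_{12.3}x_2 + b_{13.2}x_3)^2}} = \dfrac{\sum x_1(x_1 - x_{1.23})}{\sqrt{\sum x_1^2 \sum (x_1 - x_{1.23})^2}}$
$= \dfrac{\sum x_1^2 - \sum x_{1.23}^2}{\sqrt{\sum x_1^2\left(\sum x_1^2 - 2\sum x_1x_{1.23} + \sum x_{1.23}^2\right)}} = \dfrac{\sum x_1^2 - \sum x_{1.23}^2}{\sqrt{\sum x_1^2\left(\sum x_1^2 - \sum x_{1.23}^2\right)}}$
$= \dfrac{N\sigma_1^2 - N\sigma_{1.23}^2}{\sqrt{N\sigma_1^2\,(N\sigma_1^2 - N\sigma_{1.23}^2)}} = \left[\dfrac{\sigma_1^2 - \sigma_{1.23}^2}{\sigma_1^2}\right]^{1/2} = \sqrt{1 - \dfrac{\sigma_{1.23}^2}{\sigma_1^2}},$
where we have used $\sum x_1x_{1.23} = \sum x_{1.23}^2$.
This is the required coefficient of multiple correlation. This result may be expressed as
$1 - R_{1(23)}^2 = \sigma_{1.23}^2/\sigma_1^2$. Another notation for $R_{1(23)}$ is $R_{1.23}$; note that $R_{1(23)} \ge 0$, since
the numerator $\sum x_1^2 - \sum x_{1.23}^2 = N(\sigma_1^2 - \sigma_{1.23}^2)$ is non-negative.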
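Partial Correlation Coefficient. The partial correlation coefficient of x1 and x2, keeping x3
constant, is denoted by $r_{12.3}$ and is the square root of the product of $b_{12.3}$ and $b_{21.3}$, i.e.,
$r_{12.3} = \sqrt{b_{12.3}\,b_{21.3}} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{23}^2)(1 - r_{13}^2)}}.$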
x2 : 57 59 49 62 51 50 55 48 52 42 61 57
x3 : 8 10 6 11 8 7 10 9 10 6 12 9
The regression equation is $x_1 = b_{1.23} + b_{12.3}x_2 + b_{13.2}x_3$ and the normal equations are $\sum x_1 = n\,b_{1.23} + b_{12.3}\sum x_2 + b_{13.2}\sum x_3$,
$\sum x_1x_2 = b_{1.23}\sum x_2 + b_{12.3}\sum x_2^2 + b_{13.2}\sum x_2x_3$ and $\sum x_1x_3 = b_{1.23}\sum x_3 + b_{12.3}\sum x_2x_3 + b_{13.2}\sum x_3^2$, i.e.,
$12\,b_{1.23} + 643\,b_{12.3} + 106\,b_{13.2} = 753$, $643\,b_{1.23} + 34843\,b_{12.3} + 5779\,b_{13.2} = 40830$ and
$106\,b_{1.23} + 5779\,b_{12.3} + 976\,b_{13.2} = 6796$. Solving these equations, we get
$b_{1.23} = 3.6512$, $b_{12.3} = 0.8546$, $b_{13.2} = 1.5063$; hence $x_1 = 3.6512 + 0.8546\,x_2 + 1.5063\,x_3$.
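The three normal equations form a small linear system that can be solved mechanically. The sketch below uses Cramer's rule in plain Python; the helper names `det3` and `solve3` are introduced here for illustration, and the coefficients are the sums quoted in the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(A)
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for k in range(3):
            M[k][col] = rhs[k]   # replace one column by the right-hand side
        solution.append(det3(M) / d)
    return solution

# normal equations: unknowns are b1.23, b12.3, b13.2
A = [[12, 643, 106],
     [643, 34843, 5779],
     [106, 5779, 976]]
rhs = [753, 40830, 6796]

b0, b2, b3 = solve3(A, rhs)
print(round(b0, 4), round(b2, 4), round(b3, 4))
```

Cramer's rule is adequate for three unknowns; for larger systems elimination is preferable.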
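Solution. We know that $R_{1(23)}^2 = 1 - \dfrac{\sigma_{1.23}^2}{\sigma_1^2}$, where
$\sigma_{1.23}^2 = \sigma_1^2\,\dfrac{1 - r_{23}^2 - r_{12}^2 - r_{13}^2 + 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}$, therefore
$R_{1(23)}^2 = \dfrac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}.$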
$\sum x_2x_{1.23} = \sum x_{2.3}x_{1.23} = \sum x_{2.3}(x_1 - b_{12.3}x_2 - b_{13.2}x_3) = \sum x_1x_{2.3} - b_{12.3}\sum x_2x_{2.3}$ (since $\sum x_{2.3}x_3 = 0$)
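Example 4. For a trivariate distribution, show that $1 - R_{1(23)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2)$.
Solution. We know that $1 - R_{1(23)}^2 = \sigma_{1.23}^2/\sigma_1^2$, where $\sigma_{1.23}^2 = \sigma_1^2(1 - r_{12}^2)(1 - r_{13.2}^2)$. Hence
$1 - R_{1(23)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2).$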
Example 5. If $r_{12}$ = 0.86, $r_{13}$ = 0.65, $r_{23}$ = 0.72, then find $r_{12.3}$.
Solution. Since
$r_{12.3} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{23}^2)(1 - r_{13}^2)}} = \dfrac{0.86 - (0.65)(0.72)}{\sqrt{(1 - (0.72)^2)(1 - (0.65)^2)}} \approx 0.743.$
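The partial and multiple correlation formulas depend only on the three total correlations, so they are convenient to evaluate together. This is a sketch under that assumption; `partial_r` and `multiple_r_squared` are hypothetical helper names, and the last lines check the identity of Example 4 on the data of Example 5.

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """Partial correlation r12.3 of variables 1 and 2 with 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def multiple_r_squared(r12, r13, r23):
    """Square of the multiple correlation R1(23)."""
    return (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)

r12, r13, r23 = 0.86, 0.65, 0.72
r12_3 = partial_r(r12, r13, r23)
R2 = multiple_r_squared(r12, r13, r23)

# identity 1 - R^2 = (1 - r12^2)(1 - r13.2^2); r13.2 swaps the roles of 2 and 3
r13_2 = partial_r(r13, r12, r23)
lhs = 1 - R2
rhs = (1 - r12 ** 2) * (1 - r13_2 ** 2)
print(round(r12_3, 3), round(R2, 3), round(lhs, 5), round(rhs, 5))
```

A computed value of R squared outside the interval from 0 to 1 signals that the three given correlations are mutually inconsistent.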
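Example 6. If $r_{12.3} = 0$ then show that $r_{13.2} = r_{13}\sqrt{\dfrac{1 - r_{23}^2}{1 - r_{12}^2}}$.
Solution. Since $r_{12.3} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{23}^2)(1 - r_{13}^2)}} = 0 \Rightarrow r_{12} = r_{13}r_{23}$, therefore
$r_{13.2} = \dfrac{r_{13} - r_{12}r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}} = \dfrac{r_{13}(1 - r_{23}^2)}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}} = r_{13}\sqrt{\dfrac{1 - r_{23}^2}{1 - r_{12}^2}}.$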
(1) Curve Fitting means an expression of the relationship between two variables by
algebraic equations on the basis of observed data.
(2) Whenever two variables x and y are so related that a change in one is accompanied by
change in the other in such a way that an increase in the one is accompanied by an increase
or decrease in the other, then variables are said to be correlated. When an increase (or
decrease) in one variate corresponds to an increase (or decrease) in the other, the
correlation is said to be positive. It is negative when increase in one corresponds to decrease
in the other or vice versa.
(3) Karl Pearson's Coefficient of Correlation: The coefficient of correlation r between two
variables X and Y is defined by the relation
$r = \dfrac{\sum xy}{n\sigma_x\sigma_y} = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \dfrac{p}{\sigma_x\sigma_y}$, where $p = \dfrac{1}{n}\sum xy$, x and y are the deviations measured
from their respective means, and $\sigma_x$, $\sigma_y$ are the standard deviations of these series.
(4) The correlation coefficient and the rank correlation coefficient lie between -1 and +1.
(5) By regression we mean an association or relation between two variates x and y. If r
= +1 or -1, the two regression lines will coincide. The variables are perfectly correlated. If r
= -1, the variables are perfectly negatively correlated, low values of one corresponding to
high values of the other. If r = +1, the variables are perfectly positively correlated, high values
of one corresponding to high values of the other. If r = 0, the two lines of regression
become X = Mx and Y = My, which are two lines parallel to the Y and X axes respectively,
passing through the means Mx and My. They are perpendicular to each other. It means
that the mean values of X and Y do not change with Y and X respectively, i.e., X and Y are
uncorrelated.
$[E(xy)]^2 \le E(x^2)\,E(y^2).$
(9) The sum of the products of corresponding values of a variate and a residual is zero
provided the subscript of the variate occurs among the secondary subscripts of the
residual.
(10) The sum of the products of two residuals is unaltered by omitting from the residual
any or all of the secondary subscripts which are common to both.
(11) The sum of the products of two residuals is zero provided all the subscripts of one
residual occur among the secondary subscripts of the second.
4.7. Assignments
1. Form normal equations and hence find the most plausible values of x and y from the
following x+ y = 3.01, 2x – y = 0.03, x + 3y = 7.03, 3x + y = 4.97. Ans x = 0.999, y =
2.004.
2. Fit a second degree parabola to the following data, x is the independent variable.
X 1 2 3 4 5 6 7 8 9
Y 2 6 7 8 10 11 11 10 9
x 1 2 3 4 5
5. Derive the least square equations for fitting a curve of the type y = ax2 + (b/x) to a set of
n points. Hence fit a curve of this type to the data:
X 1 2 3 4
6. Calculate the value of Pearson's coefficient of correlation for the following series A and
B:
Ans. 0.6
7.Calculate the value of r between X and Y for the values given below :
A: 1 5 7 9 19 17
B : 25 27 26 29 34 35
Ans. 0.967
0–5 7
5 – 10 6 8
10 – 15 5 3
15 – 20 7 2
20 – 25 9
Ans. 0.8666
y
29.5 4 3 4 1 1
59.5 1 3 6 18 6 9 2 3 1
89.5 7 3 16 16 4 4 1 1
119.5 5 9 10 9 2 1 2
149.5 3 5 8 1
179.5 4 2 3 1
209.5 4 4
239.5 1 1
Ans. -0.49
11. Calculate the coefficient of correlation from the data given below by the method of
differences :
X: 45 56 39 54 45 40 56 60 30 36
Y: 50 36 30 44 36 32 45 42 20 36
Ans.0.7836
12. Two judges in beauty contest rank the ten competitors as follows:
A: 6 4 3 1 2 7 9 8 10 5
B: 4 1 6 7 5 8 10 9 3 2
13. Illustrate the methodology of computing the correlation coefficient and the equation of
the line of regressions by using the following data:
x: 2 6 4 7 5
y: 8 8 5 6 2
Age of Husband : 18 19 20 21 22 23 24 25 26 27
Age of wife : 17 17 18 18 18 19 19 20 21 22
15. The following data are given for marks in English and Mathematics of a certain
examination: Mean marks in English = 39.5, Mean marks in Maths.= 47.6, r = 0.42,
Standard deviation of marks in English = 10.8, Standard deviation of marks in
Maths.= 16.9. Form the two lines of regression and calculate the expected average marks
in Maths of candidates who received 50 marks in English.
Ans. y = 0.657x + 21.64, i.e., y - 47.6 = 0.657(x - 39.5); x = 0.268y + 26.73, i.e., x - 39.5 = 0.268(y -
47.6); y = 54.5
16. Two lines of regression are given by x + 2y - 5 = 0, 2x + 3y - 8 = 0 and $\sigma_x^2$ = 12. Calculate
the mean values of x and y, variance of y and the coefficient of correlation between x and y.
Ans. 1, 2, 0.86, 4
17. The following regression equations have been obtained from a correlation table:
y = 0.516x + 33.73, x = 0.512y + 32.52. Find the mean of x's and also of y's as well as the
correlation coefficient between x and y. What is the ratio of the standard deviation of x to
that of y.
18. For bivariate data: n = 18, x2 = 60, y2 =96, x = 12, y = 18, xy = 48. Find the
equation of regression lines. Ans. y = -0.18x – 1.12, x = -0.12y + 0.89.
19. If x and y are two correlated variates with the same standard deviation and the
coefficient of correlation r, show that the correlation coefficient between x and x + y is
$\sqrt{\dfrac{1 + r}{2}}$.
20. The variates x and y have variances $\sigma_1^2$ and $\sigma_2^2$ respectively and are correlated with
coefficient of correlation ρ, and ξ and η are defined by ξ = x cos α + y sin α and η = y cos α - x
sin α. Show that ξ and η will be uncorrelated if $\tan 2\alpha = \dfrac{2\rho\sigma_1\sigma_2}{\sigma_1^2 - \sigma_2^2}$.
22. For any two variates x and y, let u = 2x + 7y and v = 3x + ky. Find for what value of k
u and v will be uncorrelated.
23. Show that the correlation coefficient between the residuals $x_{1.23}$ and $x_{2.13}$ is equal and
opposite to that between $x_{1.3}$ and $x_{2.3}$.
24. In a distribution $\sigma_1 = 2$, $\sigma_2 = \sigma_3 = 3$, $r_{12} = 0.7$, $r_{23} = 0.5 = r_{31}$, evaluate $r_{23.1}$, $R_{1.23}$, $b_{12.3}$,
$b_{13.2}$, $\sigma_{1.23}$.
24. Show that $b_{12.3} = \dfrac{b_{12} - b_{13}b_{32}}{1 - b_{23}b_{32}}$.
25. Find partial correlation coefficients, where r12 = 0.70, r13 = 0.61, r23 = 0.4.
1. Find the most plausible values of x and y from the four equations :
X 1 2 3 4 5
Y 4 6 3 5 7
X 1 2 3 4 5
Y 5 7 9 10 11
X 1 2 3 4 5
Y 25 28 33 39 46
5. Taking 1913 as origin for x-series fit a straight line to the following data showing the
production of a commodity in different years in Punjab:
Production y : 10 12 8 10 14.
6. The profits, £100y, of a certain company in the xth year of its life are given by
x: 1 2 3 4 5
y: 25 28 33 39 46
7. The profit of a certain company in the xth year of its life are given by
x: 1 2 3 4 5
8. Fit a straight line using the method of least squares, and calculate the average rate of
growth per week
Age : 1 2 3 4 5 6 7 8 9 10
Weight : 52.5 58.7 65.0 70.2 75.4 81.1 87.2 95.5 102.2 108.4
Fit a Parabola of the second degree to represent the data. Ans. y = 1547.9 + 378.4x – 40x2
10. Fit the curve y = aebx to the following data, e being Napierian base, 2.71828 :
x: 0 2 4
y: 5.012 10 31.62
Ans. y = 4.642e0.46x
x: 1 2 3 4 5 6
Ans. y = 2.978x-5.444
12. Calculate the coefficient of correlation between the values of X and Y given below :
X: 78 89 97 69 59 79 68 61
Ans. 0.957
x: -10 -5 0 5 10
y: 5 9 7 11 13
Ans. 0.9
14. Find Karl Pearson's coefficient of correlation from the following index numbers :
Cost of Living : 98 99 99 97 95 92 95 94 90 91
Ans.0.85
x/y 16 - 18 18 - 20 20 -22 22 – 24
10 – 20 2 1 1
20 – 30 3 2 3 2
30 – 40 3 4 5 6
40 – 50 2 2 3 4
50 – 60 1 2 2
60 – 70 1 2 1
Ans. 0.28
16. Calculate the coefficient of correlation from the data given below:
X: 76 90 98 69 54 82 67 52
Y: 25 37 56 12 7 36 23 11
Ans. 0.952
17. Find the correlation coefficient and the equations of regression lines for the following
values of x and y.
x: 1 2 3 4 5
y: 2 5 3 8 7
17. Find the correlation coefficient and the equations of regression lines for the following
values of x and y.
x : 23 27 28 28 29 30 31 33 35 36
y : 18 20 22 27 21 29 27 29 28 29
18. You are given the following results for the heights (x) and weights (y) of 100 policemen:
Mx = 68 inches, My = 150 lbs., $\sigma_x$ = 2.5 inches, $\sigma_y$ = 20 lbs., r = 0.6. From these data estimate
(a) the height of a particular policeman whose weight is 200 lbs. and (b) the weight of a particular
policeman who is 5 feet tall.
19. Two random variables have the least square regression lines with equations 3x + 2y =
26 and 6x + y = 31. Find the mean values and the correlation coefficient between x
and y. Ans, 4, 7, -0.5
21. Two variates x and y have zero means, the same variance 2 and zero correlation. Show
that u = x cos + y sin and v = x cos - y sin have the same variance 2 and
zero correlation.
23. If θ is the angle between the regression lines of two variates with correlation coefficient
r, prove that sin θ ≤ 1 - r².
25. Find partial correlation coefficients, where r12 = 0.80, r13 = -0.40, r23 = -0.56.
26. Show that the values r12 = 0.60, r13 = -0.40, r23 = 0.7 are inconsistent.
At the end of the unit, the student may wish to discuss or seek clarification on some points. If so, mention the
points:
A: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
B: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
C: -------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
==========
Structure
5.1 Introduction
5.2 Objectives
5.9 Assignment
5.1 Introduction
In practical problems the statistician is often confronted with the necessity of discussing a
universe (or population) of which he cannot examine every member. We are thus led
naturally to the question: what can be said about a universe of which we can examine only
a limited number of sample members? This question is the origin of the theory of sampling.
Any collection of individuals under study is said to be population (or universe). The
individuals often called the members or the units of the population may be physical objects
or measurements expressed numerically or otherwise. A part or small section selected from
the population is called sample and the process of such selection is called sampling.
5.2 Objectives
The fundamental object of sampling is to get as much information as possible of the whole
universe by examining only a part of it. An attempt is thus made through sampling to give
the maximum information about the parent universe with the minimum effort.
Another object of sampling is to determine the reliability of the estimates when they are
obtained. This can be done by drawing successive samples from the same parent universe
and comparing the results obtained from different samples.
Simple Sampling: By simple sampling we mean random sampling in which each event has
the same chance of success and in which the chances of success of each event are
independent of the success or failure of events in the preceding trials.
If we do not replace it, the chance of drawing a king the second time is 3/51, and so on.
However, if the card drawn at the first trial is put back in the pack before the next trial,
the random sampling becomes simple sampling.
Definition 5.3.1: The drawing of samples from a universe whose members possess the
attribute A or a (not-A).
The standard deviation of the proportion of successes $= \dfrac{\sqrt{npq}}{n} = \sqrt{\dfrac{pq}{n}}$.
Tests of Significance for Large Samples: Suppose a large sample is classified
according to the frequencies of an attribute.
If the number of successes in a large sample of size n differs from the expected value np by
more than $3\sqrt{npq}$, we call the difference highly significant and the truth of the
hypothesis is very improbable.
Generally we accept the hypothesis as correct, calculate np and $\sqrt{npq}$, and apply
the above test.
Example5.4.1: A coin is tossed 400 times and it turns up head 216 times. Discuss whether
the coin may be unbiased one.
The deviation of the actual number of head from expected = 216 – 200 = 16.
The standard deviation = $\sqrt{npq} = \sqrt{400 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 10$.
The deviation is only 1.6 times the standard deviation and hence it is likely to appear as a
result of fluctuations of simple sampling. We conclude that the coin may be taken as
unbiased one.
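The check in Example 5.4.1 can be sketched in a few lines of Python. The function name `binomial_deviation_test` is our own, and the 3-standard-deviation rule is the large-sample criterion stated above; this is an illustration under those assumptions, not a general-purpose test.

```python
import math

def binomial_deviation_test(n, successes, p=0.5):
    """Compare observed successes with np in units of sqrt(npq)."""
    q = 1.0 - p
    expected = n * p               # expected number of successes, np
    sd = math.sqrt(n * p * q)      # standard deviation, sqrt(npq)
    deviation = successes - expected
    return deviation, sd, deviation / sd

# Example 5.4.1: 400 tosses, 216 heads, hypothesis p = 1/2
deviation, sd, ratio = binomial_deviation_test(400, 216)
print(f"deviation = {deviation}, sd = {sd}, ratio = {ratio:.2f}")
print("highly significant" if abs(ratio) > 3 else "consistent with the hypothesis")
```

The same function applies to any large simple sample once n, the observed count and the hypothetical p are supplied.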
Standard Error: The standard deviation of the sampling distribution of a statistic is also
called the standard error. It is sometimes written as S.E. Frequencies differing from the
expected frequency by more than 3 times the standard error are almost certainly not due to
fluctuations of sampling.
Probable Error: Instead of the standard error, some authors have used a quantity called the
probable error, which is .67449 times the standard error. It is more easily understood than
the standard error by laymen and businessmen, and was universally used in the past.
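Following are the standard errors of some important parameters when the parent universe
is assumed normal:
Statistic             Standard error
Standard deviation    $\sigma / \sqrt{2n}$
Median                $\sigma \sqrt{\pi / (2n)}$
Variance              $\sigma^2 \sqrt{2 / n}$
$\mu_3$               $\sqrt{6\sigma^6 / n}$
$\mu_4$               $\sqrt{96\sigma^8 / n}$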
Example 5.5.1: Out of a simple sample of 1000 individuals from the inhabitants of a country
we find that 36% of them have blue eyes and the remainder have eyes of some other colour.
What can we infer about the proportion of blue-eyed individuals in the whole population?
Here p = .36, so q = 1 - .36 = .64
Assume that the conditions of this problem give a simple sample. The standard error of the
proportion is
$\sqrt{\frac{pq}{n}} = \sqrt{\frac{0.36 \times 0.64}{1000}} = 0.015$
Hence, taking .36 as the estimate of the proportion of individuals having blue eyes, the
limits are .36 ± (3 × 0.015) = .405 and .315, i.e. 40.5% and 31.5%.
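As a rough sketch of the limits computed in Example 5.5.1, the following Python snippet evaluates the standard error of a proportion and the 3 S.E. limits. The helper name `three_sigma_limits` is hypothetical; note that it does not round the standard error to 0.015, so the limits may differ from the hand calculation in the third decimal place.

```python
import math

def three_sigma_limits(p, n):
    """Return (standard error, lower limit, upper limit) for a sample proportion."""
    se = math.sqrt(p * (1.0 - p) / n)   # sqrt(pq/n)
    return se, p - 3.0 * se, p + 3.0 * se

# Example 5.5.1: 36% blue-eyed in a simple sample of 1000
se, lower, upper = three_sigma_limits(0.36, 1000)
print(f"S.E. = {se:.4f}, limits = {lower:.3f} to {upper:.3f}")
```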
Example5.5.3: A random sample of 500 pineapples was taken from a large consignment and
65 were found to be bad. Show that the S. E. of the proportion of bad ones in a sample of
this size is 0.015 and deduce that the percentage of bad pineapples in the consignment
almost certainly lies between 8.5 and 17.5.
2- Define a test statistic $Z = \frac{x - np}{\sqrt{npq}} \sim N(0,1)$ and calculate the value of Z.
Example 5.6.1: A coin is tossed 1000 times and the head comes up 516 times. Discuss
whether the coin is an unbiased one.
The deviation of the actual number of heads from the expected number = 516 - 500 = 16
The standard deviation = $\sqrt{npq} = \sqrt{1000 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 15.81$.
The deviation (16) is only about 1.01 times the standard deviation (15.81) and hence it is
likely to appear as a result of fluctuations of simple sampling. We conclude that the coin
may be taken as an unbiased one.
Example 5.6.2: In a dice-throwing experiment, Weldon threw a die 49152 times, and of
these 25145 yielded 4, 5 or 6. Is this consistent with the hypothesis that the die was
unbiased?
So q = 1 – ½ = ½.
n = 49152, x = 25145.
Standard deviation = $\sqrt{npq} = \sqrt{49152 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 110.9$
Since |Z| > 3, the hypothesis H0 is rejected at the .27% level. The difference is not due to
sampling fluctuations. Hence the data are not consistent with the hypothesis that the die was
unbiased.
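The value of Z behind this conclusion can be written out from the quantities given in the example, taking p = ½ so that np = 24576:

```latex
Z = \frac{x - np}{\sqrt{npq}} = \frac{25145 - 24576}{110.9} = \frac{569}{110.9} \approx 5.13
```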
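$Z = \frac{x - np}{\sqrt{npq}} \sim N(0,1)$, or equivalently $Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1)$
Now, $|Z| \le 3$
$\left| \frac{p - P}{\sqrt{pq/n}} \right| \le 3$
$p - 3\sqrt{\frac{pq}{n}} \le P \le p + 3\sqrt{\frac{pq}{n}}$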
Let two large simple samples of n1 and n2 members be taken from two universes. Let these
samples give proportions of the attribute A as p1 and p2 respectively.
We have to find:
Is the difference |p1 - p2| due to fluctuations of simple sampling, the two populations being
similar as regards the given attribute A?
On the hypothesis that the populations are similar as regards the given attribute, we can
combine the two samples to give an estimate.
If p0 be this estimate then it is given by $p_0 = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$
If e1 and e2 be the standard errors of the proportion of successes in the two samples then
$e_1^2 = \frac{p_0 q_0}{n_1}, \quad e_2^2 = \frac{p_0 q_0}{n_2}$
If e be the standard error of the difference between the two sample proportions, it is given
by $e^2 = e_1^2 + e_2^2$
If |p1 - p2| < 3e, the difference may be due to fluctuations of sampling, but if |p1 - p2| > 3e,
it may be taken as a real difference in the population proportions.
Illustrative Examples
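Example 5.7.1: In a simple sample of 600 men from a certain large city, 400 are found to be
smokers. In one of 900 from another large city, 450 are smokers. Do the data indicate that
the cities are significantly different with respect to the prevalence of smoking among men?
Solution: Here n1 = 600, n2 = 900,
$p_1 = \frac{x_1}{n_1} = \frac{400}{600} = \frac{2}{3}, \quad p_2 = \frac{x_2}{n_2} = \frac{450}{900} = \frac{1}{2}$
$p_0 = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{\frac{2}{3} \times 600 + \frac{1}{2} \times 900}{600 + 900} = \frac{17}{30}$, so $q_0 = 1 - \frac{17}{30} = \frac{13}{30}$
$e^2 = e_1^2 + e_2^2 = p_0 q_0 \left( \frac{1}{n_1} + \frac{1}{n_2} \right) = \frac{17}{30} \times \frac{13}{30} \left( \frac{1}{600} + \frac{1}{900} \right) = 0.000682$, so e = .026
Now |p1 - p2| = 2/3 - 1/2 = 0.167 > 3e = 0.078
Therefore the difference is highly significant. Our hypothesis that the populations are
similar is almost certainly wrong.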
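The comparison in Example 5.7.1 can be sketched as follows. The function name `two_proportion_test` is our own; it pools the two samples to estimate p0 and returns the absolute difference together with the standard error e, assuming both samples are large simple samples.

```python
import math

def two_proportion_test(x1, n1, x2, n2):
    """Return (|p1 - p2|, e) using the pooled estimate p0."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)                    # pooled proportion
    q0 = 1.0 - p0
    e = math.sqrt(p0 * q0 * (1 / n1 + 1 / n2))    # e^2 = e1^2 + e2^2
    return abs(p1 - p2), e

# Example 5.7.1: 400 smokers out of 600, and 450 out of 900
diff, e = two_proportion_test(400, 600, 450, 900)
print(f"difference = {diff:.3f}, e = {e:.3f}, 3e = {3 * e:.3f}")
print("real difference" if diff > 3 * e else "may be sampling fluctuation")
```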
Example 5.7.2: A machine puts out 16 imperfect articles in a sample of 500. After the machine
is overhauled, it puts out 3 imperfect articles in a batch of 100. Has the machine been
improved?
Solution- Do as above.
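Suppose that all possible samples of size n are drawn without replacement from a finite
population of size N. If $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$ be the mean and s.d. of the
sampling distribution of means and M and σ be the mean and s.d. of the population
respectively, then
$\mu_{\bar{x}} = M, \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
If the population is infinite, or if sampling is with replacement, then
$\mu_{\bar{x}} = M, \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$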
Example 5.7.3: A population consists of the five numbers 2, 3, 6, 8, 11. We are to take all
possible samples of size two which can be drawn from this population with replacement.
Find
Solution:
(a) Mean of population $M = \frac{2 + 3 + 6 + 8 + 11}{5} = \frac{30}{5} = 6$
$\sigma^2 = \frac{(2 - 6)^2 + (3 - 6)^2 + (6 - 6)^2 + (8 - 6)^2 + (11 - 6)^2}{5} = \frac{54}{5} = 10.8$
σ = 3.29.
The number of samples = 5² = 25. The following are the samples of size two
(d) The variance of the sampling distribution of means is obtained by subtracting the mean
6 from each number in (1), squaring the result, adding all 25 numbers thus obtained and
dividing by 25.
The ten samples of size two that can be drawn without replacement are
(2, 3), (2, 6), (2, 8), (2, 11), (3, 6), (3, 8), (3, 11), (6, 8), (6, 11), (8, 11)
and their means are
2.5 4.0 5.0 6.5 4.5 5.5 7.0 7.0 8.5 9.5
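The two formulas for the standard deviation of the sampling distribution of means can be checked by brute-force enumeration on the population of Example 5.7.3. The sketch below is only a verification aid; the helper names `mean` and `variance` are our own, and `variance` uses the divisor equal to the number of values, as in the example.

```python
from itertools import combinations, product

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Variance with divisor equal to the number of values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

population = [2, 3, 6, 8, 11]
N, n = len(population), 2
sigma2 = variance(population)                      # population variance

# Means of all 25 ordered samples drawn with replacement
means_with = [mean(s) for s in product(population, repeat=n)]
# Means of the 10 samples drawn without replacement
means_without = [mean(s) for s in combinations(population, n)]

print(mean(means_with), variance(means_with), sigma2 / n)
print(mean(means_without), variance(means_without),
      (sigma2 / n) * (N - n) / (N - 1))
```

In each printed line the second and third numbers should agree, which is the content of the two formulas for the variance of the sample mean.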
Definition 5.8.1: Let x1, x2,… xn be a random sample of size n from a normal distribution
with mean µ and variance σ². Then
$x_i \sim N(\mu, \sigma^2), \quad i = 1, 2, 3, \ldots, n$
$z_i = \frac{x_i - \mu}{\sigma} \sim N(0,1), \quad i = 1, 2, \ldots, n$
and $\sum_{i=1}^{n} z_i^2$ is denoted by χ² (read as ki square), the χ² statistic with n degrees of
freedom.
Application of χ2 distribution
1- χ2 Test for testing the significance of sample variance (sample size n 30)
Solution: H0: σ = 7.5
Test statistic: $\chi^2 = \frac{\sum (x_i - \bar{x})^2}{\sigma^2} \sim \chi^2$ with (n - 1) d.f.
Computation: $\chi^2 = \frac{n s^2}{\sigma^2}$, under H0
$\chi^2 = \frac{25 \times 10^2}{(7.5)^2} = 44.4$
Conclusion: Since the calculated value of χ² is greater than the tabular value at the 5% level
for 24 d.f., which is 36.415, H0 is rejected. This means there is justification for believing
that the variability has increased.
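A minimal sketch of this computation, assuming (as the arithmetic above indicates) a sample of n = 25 with sample standard deviation s = 10 and a hypothetical population standard deviation of 7.5. The function name is ours, and the table value 36.415 is the one quoted in the text rather than computed.

```python
def chi_square_for_variance(n, s, sigma0):
    """chi^2 = n s^2 / sigma0^2, where s^2 uses the divisor n."""
    return n * s ** 2 / sigma0 ** 2

chi2 = chi_square_for_variance(n=25, s=10, sigma0=7.5)
TABLE_VALUE = 36.415   # 5% value quoted in the text (n - 1 = 24 d.f.)

print(f"chi-square = {chi2:.2f}")
print("reject H0" if chi2 > TABLE_VALUE else "do not reject H0")
```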
2- Test independence of the attributes in contingency table: Procedure and terminology for
this test is as given below.
(a) Contingency table: A table consisting of t rows and s columns, i.e. t × s frequency
cells, is called a contingency table. The grand total N gives the number of individuals.
(b) Expected frequencies: Assuming that A and B are independent, the expected frequency
corresponding to the observed frequency in cell (i, j) is given by
$e_{ij} = \frac{(A_i)(B_j)}{N}$
(c) Test statistic: $\chi^2 = \sum_i \sum_j \frac{O_{ij}^2}{e_{ij}} - N$ with (t – 1)(s – 1) degrees of freedom
(d) Degrees of Freedom: To obtain the number of degrees of freedom for a t × s contingency
table we proceed as follows:
Suppose there are p columns and q rows. The sum of the cell frequencies in each row is
determined as being the border frequency in that row, and similarly for the columns.
Hence each of the p columns and q rows imposes a constraint. Thus there are (p + q)
constraints. From (p + q) subtract 1, because they are not algebraically independent, as the
sum of the border column equals the sum of the border row, i.e., the total frequency. Hence
there are p + q - 1 independent linear constraints. The total number of cells is p × q, so the
number of degrees of freedom is pq - (p + q - 1) = (p - 1)(q - 1).
Linear Constraints: Constraints which involve linear equations in the cell frequencies are
called linear constraints.
(e) Conditions for the application of χ²: The following conditions should be satisfied
before the χ²-test can be applied:
(1) In the first place N, the total frequency, must be large; otherwise the χ's are not
normally distributed. N should reasonably be at least 50, however few the number of cells.
Illustrative Examples
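For the 2 × 2 contingency table
a b
c d
$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(b + d)(a + c)}$
Attribute   B        β        Total
A           a        b        a + b
α           c        d        c + d
Total       a + c    b + d    a + b + c + d
Here the observed frequencies are a, b, c, d and the corresponding expected frequencies,
under the assumption that the two attributes A and B are independent, are
$\frac{(a+b)(a+c)}{N}, \quad \frac{(a+b)(b+d)}{N}, \quad \frac{(c+d)(a+c)}{N}, \quad \frac{(c+d)(b+d)}{N}$
respectively, where N = a + b + c + d. Now
$\chi^2 = \sum \frac{(O_{ij} - e_{ij})^2}{e_{ij}} = \frac{(ad - bc)^2}{a + b + c + d} \left[ \frac{1}{(a+b)(a+c)} + \frac{1}{(a+b)(b+d)} + \frac{1}{(c+d)(a+c)} + \frac{1}{(c+d)(b+d)} \right]$
After solving,
$\chi^2 = \frac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(b + d)(a + c)}$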
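The closed form for the 2 × 2 table can be checked numerically against the general definition of χ². In this sketch both function names are our own and the cell counts are illustrative only; they are not taken from the text.

```python
def chi_square_2x2_shortcut(a, b, c, d):
    """chi^2 = N (ad - bc)^2 / ((a+b)(c+d)(b+d)(a+c))."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (b + d) * (a + c))

def chi_square_2x2_general(a, b, c, d):
    """Sum of (O - e)^2 / e over the four cells, with e = row total * column total / N."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = rows[i] * cols[j] / n
            total += (observed[i][j] - e) ** 2 / e
    return total

counts = (10, 20, 30, 40)   # illustrative counts, not from the text
print(chi_square_2x2_shortcut(*counts), chi_square_2x2_general(*counts))
```

The two printed values should agree up to floating-point rounding, which is what the algebraic simplification asserts.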
Example5.8.3: From the following table, test the hypothesis that the flower colour is
independent of flatness of leaf.
Red Flowers 20 5 25
Use the following table giving the values of χ2 for one degree of freedom for different values
of P—
P 5 1 05
Solution: On the hypothesis that the flower colour is independent of flatness of leaves, the
theoretical frequencies are:
$\chi^2 = (1.4)^2 \left[ \frac{1}{100.41} + \frac{1}{34.59} + \frac{1}{18.59} + \frac{1}{6.41} \right] \approx 0.49$
Degrees of freedom = (2 – 1)(2 – 1) = 1
The 5% value of χ² for one degree of freedom is 3.841. The calculated value is much less
than this. This comparison leads us to the conclusion that there is no cause to suspect the
hypothesis that the flower colour is independent of the flatness of the leaf.
Example5.8.4: The following table shows the result of inoculation against cholera.
Is there any significant association between inoculation and attack? Given that,
The theoretical frequencies have been shown in brackets. The first theoretical frequency
has been obtained as
$\frac{436 \times 722}{736} = 427.7$. Similarly the others can be obtained.
$\chi^2 = (3.3)^2 \left[ \frac{1}{427.7} + \frac{1}{8.3} + \frac{1}{294.3} + \frac{1}{5.7} \right] = 3.28$
Thus if the hypothesis is true, our data give results which would be obtained about 7 times
in a hundred trials. We would therefore be unjustified in rejecting the hypothesis at the 5%
level, so the data give no sufficient ground to believe that inoculation and attack are
associated.
Fruits prickly 47 21 68
Fruits smooth 12 3 15
Using chi-square test, find the association between colour of flowers and character of fruits,
given that,
Solution
Total 59 24 83
The first theoretical frequency has been obtained as
$\frac{68 \times 59}{83} = 48.34$. Similarly the others can be obtained.
$\chi^2 = (1.34)^2 \left[ \frac{1}{48.34} + \frac{1}{19.66} + \frac{1}{10.66} + \frac{1}{4.34} \right] = .7105$
Since P > .05, the value is not significant, i.e., the hypothesis may be accepted.
Hence colour of flowers and character of fruits are not associated and the divergence might
have happened on account of fluctuations of sampling.
Example 5.8.6: Five dice were thrown 192 times and the numbers of times 4, 5 or 6 were
obtained were as follows:
f 6 46 70 48 20 2
Calculate χ².
The theoretical frequencies are the terms of $192 \left( \frac{1}{2} + \frac{1}{2} \right)^5$
Since for the application of the χ²-test no frequency should be less than 5, hence on
regrouping
No. of dice throwing 4, 5, 6    5     4     3     2     1 or 0
f                               6     46    70    48    22
fi                              6     30    60    60    36
$\chi^2 = \sum \frac{(f - f_i)^2}{f_i}$
Example 5.8.7: Five dice were thrown 96 times and the numbers of times 4, 5 or 6 were
obtained were as follows:
f 7 19 35 24 8 3
The theoretical frequencies are the terms of $96 \left( \frac{1}{2} + \frac{1}{2} \right)^5$
Since for the application of the χ²-test no frequency should be less than 5, hence on
regrouping
No. of dice throwing 4, 5, 6    5 or 4         3     2     1 or 0
f                               7 + 19 = 26    35    24    8 + 3 = 11
fi                              3 + 15 = 18    30    30    15 + 3 = 18
$\chi^2 = \sum \frac{(f - f_i)^2}{f_i} = 8.31$
Also the difference between the calculated and table values of χ² is 8.31 – 7.815 = 0.495
So for a difference of 0.495 in χ², the difference in P is $\frac{0.03 \times 0.495}{2.022} = 0.007$
Required probability = .043 = $\frac{43}{1000} = \frac{4.3}{100} \approx \frac{1}{25}$, i.e. about 1 in 25
t-statistic: The statistic t was introduced by W. S. Gosset in 1908, who wrote under the name
“Student”. That is why it is called Student's t. Later on its distribution was rigorously
established by Prof. R. A. Fisher in 1926.
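Definition: Let x1, x2,… xn be a random sample of size n from a normal distribution with
mean µ and variance σ².
Let $s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Then
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ or $t = \frac{(\bar{x} - \mu)\sqrt{n}}{s}$
when n < 30. For large n, the t-statistic tends to the standard normal variate.
Test statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ or $t = \frac{(\bar{x} - \mu)\sqrt{n}}{s}$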
Conclusion: If p < .05, we regard that the value of t is significant. If p < .01, we regard it as
highly significant. A significant value of t throws doubt on the truth of hypothesis.
Illustrative Examples
Example 5.8.8: A machine which produces mica insulating washers for use in electric devices is
set to turn out washers having a thickness of 10 mils (1 mil = 0.001 inch). A sample of 10
washers has an average thickness of 9.52 mils with a standard deviation of 0.60 mil. Find out
t.
Test statistic: $t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} = \frac{(9.52 - 10)\sqrt{10}}{.60} = -\frac{.48}{.60} \times 3.16 = -2.528$
eight: -4, -2, -2, 0, 2, 2, 3, 3, taking the mean of the universe to be zero.
Solution :
Serial no    x    $x - \bar{x}$    $(x - \bar{x})^2$
1 -4 -4.25 18.0625
2 -2 -2.25 5.0625
3 -2 -2.25 5.0625
4 0 -0.25 0.0625
5 2 1.75 3.0625
6 2 1.75 3.0625
7 3 2.75 7.5625
8 3 2.75 7.5625
Total 2 49.5000
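Mean $\bar{x} = \frac{\sum x}{n} = \frac{2}{8} = .25$
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{49.5}{7}} = \sqrt{7.071428} = 2.659$
$t = \frac{(\bar{x} - M)\sqrt{n}}{s} = \frac{(0.25 - 0)\sqrt{8}}{2.659} = .27$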
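The same calculation can be sketched in Python directly from the eight sample values. The function name `one_sample_t` is our own; it uses the divisor n - 1 for s², following the definition of the t-statistic given earlier.

```python
import math

def one_sample_t(data, mu):
    """Return (sample mean, s, t) for testing a hypothetical mean mu."""
    n = len(data)
    x_bar = sum(data) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))
    t = (x_bar - mu) * math.sqrt(n) / s
    return x_bar, s, t

x_bar, s, t = one_sample_t([-4, -2, -2, 0, 2, 2, 3, 3], mu=0)
print(f"mean = {x_bar}, s = {s:.3f}, t = {t:.2f}")
```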
Example 5.8.10: Ten individuals are chosen at random from a population and their heights
are found to be, in inches, 63, 63, 64, 65, 66, 69, 69, 70, 70, 71. Discuss the suggestion that
the mean height in the universe is 65 inches, given that for 9 degrees of freedom the value of
Student's t at the 5 percent level of significance is 2.262.
Solution: Test statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ or $t = \frac{(\bar{x} - \mu)\sqrt{n}}{s}$
Serial no    x    $x - \bar{x}$    $(x - \bar{x})^2$
1 63 -4 16
2 63 -4 16
3 64 -3 9
4 65 -2 4
5 66 -1 1
6 69 2 4
7 69 2 4
8 70 3 9
9 70 3 9
10 71 4 16
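n = 10, $\sum x = 670$, $\sum (x - \bar{x})^2 = 88$
Sample mean, $\bar{x} = \frac{\sum x}{n} = \frac{670}{10} = 67$
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{88}{9}} = 3.13$ inches
$t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} = \frac{(67 - 65)\sqrt{10}}{3.13} = 2.02$
Since the calculated value of t is less than the tabulated value for 9 d.f. (2.02 < 2.262),
this difference could have arisen due to fluctuations of sampling, and we may conclude that
the data are consistent with the assumption of a mean height in the universe of 65 inches.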
Does the mean of the nine items differ significantly from the assumed population mean of
47.5? Given that
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{(45 - 49.11)^2 + \ldots + (49.11 - 51)^2}{9 - 1}} = 2.62$
Thus the chance of getting a value of t greater than that observed is 1 - .95 = .05. The
probability of getting t greater in absolute value is 2 × .05 = .10, which is greater than .05.
This shows that the value of t is not significant. Hence the sample may be a random sample
from a normal population of mean 47.5.
Procedure, to test the difference between two means or to compare two samples.
Example 5.8.12: The heights of six randomly chosen red roses are, in cm, 63, 65, 68, 69, 71 and
72. Those of 10 randomly chosen yellow roses are 61, 62, 65, 66, 69, 69, 70, 71, 72, 73. Discuss
the light that these data throw on the suggestion that red roses are on the average taller
than yellow roses; given that
x     $x - \bar{x}$    $(x - \bar{x})^2$    y     $y - \bar{y}$    $(y - \bar{y})^2$
63    -5     25     61    -6.8    46.24
65    -3     9      62    -5.8    33.64
68    0      0      65    -2.8    7.84
69    1      1      66    -1.8    3.24
71    3      9      69    1.2     1.44
72    4      16     69    1.2     1.44
                    70    2.2     4.84
                    71    3.2     10.24
                    72    4.2     17.64
                    73    5.2     27.04
Mean height of red roses is $\bar{x} = \frac{408}{6} = 68$
Mean height of yellow roses is $\bar{y} = \frac{678}{10} = 67.8$
$s^2 = \frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = \frac{60 + 153.6}{6 + 10 - 2}$
= 15.257
s = 3.906.
Now $t = \frac{\bar{x} - \bar{y}}{s} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} = \frac{68 - 67.8}{3.906} \sqrt{\frac{6 \times 10}{6 + 10}} = 0.099$
Therefore Fisher's P = 2(1 - .538) = .924, which is much greater than .05. Hence the value of
t is not significant. Thus there is nothing to suggest that the universes are unlike as regards
height, i.e., the suggestion that red roses are on the average taller than yellow roses is
wrong.
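A minimal sketch of the two-sample procedure applied to the rose heights follows. The function name `pooled_t` is hypothetical; it assumes, as the procedure above does, that both samples come from normal populations with a common variance, which is estimated by pooling the two sums of squares.

```python
import math

def pooled_t(x, y):
    """Return (t, pooled s, degrees of freedom) for two independent samples."""
    n1, n2 = len(x), len(y)
    mx, my = sum(x) / n1, sum(y) / n2
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    s = math.sqrt(ss / (n1 + n2 - 2))                 # pooled standard deviation
    t = (mx - my) / s * math.sqrt(n1 * n2 / (n1 + n2))
    return t, s, n1 + n2 - 2

red = [63, 65, 68, 69, 71, 72]
yellow = [61, 62, 65, 66, 69, 69, 70, 71, 72, 73]
t, s, df = pooled_t(red, yellow)
print(f"s = {s:.3f}, t = {t:.3f}, d.f. = {df}")
```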
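Example 5.8.13: In a rat feeding experiment, the following results were obtained:
High protein    13 14 10 11 12 16 10 8 11 12 9 12
Low protein     7 11 10 8 10 13 9
Investigate if there is any evidence of superiority of one diet over the other. The value of t
for 17 degrees of freedom at 5% level of significance = 2.11.
x      $x - \bar{x}$    $(x - \bar{x})^2$    y     $y - \bar{y}$    $(y - \bar{y})^2$
13     1.5     2.25     7     -19/7    361/49
14     2.5     6.25     11    9/7      81/49
10     -1.5    2.25     10    2/7      4/49
11     -.5     .25      8     -12/7    144/49
12     .5      .25      10    2/7      4/49
16     4.5     20.25    13    23/7     529/49
10     -1.5    2.25     9     -5/7     25/49
8      -3.5    12.25
11     -.5     .25
12     .5      .25
9      -2.5    6.25
12     .5      .25
138    0       53.00    68    0        1148/49
Mean gain in weight on High Protein = 11.5; Mean gain in weight on Low Protein = $9\tfrac{5}{7}$
$s^2 = \frac{53 + \frac{1148}{49}}{12 + 7 - 2} = \frac{535}{119}$
$t = \frac{\bar{x} - \bar{y}}{s} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} = \frac{25/14}{\sqrt{535/119}} \sqrt{\frac{12 \times 7}{12 + 7}} = 1.77$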
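Paired Samples: When the two samples are of the same size and the observations are paired,
't' can be obtained using
$\bar{d} = \frac{\sum d}{n}, \quad S_d^2 = \frac{1}{n - 1} \sum (d - \bar{d})^2, \quad t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$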
Illustrative Examples
Example 5.8.14: Ten school boys were given a test in Mathematics. They were given a month's
further tuition and a second test of equal difficulty was held at the end of it. Do the marks
give evidence that the students have benefited from the extra coaching?
Boys:                1    2    3    4    5    6    7    8    9    10
Marks in Test I:     68   25   58   56   64   55   57   69   34   44
Marks in Test II:    71   39   59   59   57   68   69   76   43   39
Solution:
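Boy    x     y     d = y - x    $d - \bar{d}$    $(d - \bar{d})^2$
1      68    71    3      -2     4
2      25    39    14     9      81
3      58    59    1      -4     16
4      56    59    3      -2     4
5      64    57    -7     -12    144
6      55    68    13     8      64
7      57    69    12     7      49
8      69    76    7      2      4
9      34    43    9      4      16
10     44    39    -5     -10    100
Total              ∑d = 50       $\sum (d - \bar{d})^2$ = 482
Here n = 10.
$\bar{d} = \frac{\sum d}{n} = \frac{50}{10} = 5$
$S_d^2 = \frac{1}{n - 1} \sum (d - \bar{d})^2 = \frac{482}{9}$
= 53.56
$S_d = (53.56)^{1/2}$
= 7.3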
H0: Students have not benefited from the extra coaching, i.e. µd = 0.
H1: µd > 0 (one-sided; the upper tail of the t-distribution is the critical region)
Test statistic $t = \frac{(\bar{d} - 0)\sqrt{n}}{s_d} = \frac{(5 - 0)\sqrt{10}}{7.3} = 2.17$
Degrees of freedom = 10 - 1 = 9.
Conclusion: Since the calculated value of t is greater than the tabulated value of t, the
hypothesis H0 is rejected. Consequently the coaching may be beneficial to the school boys.
Remark: If we take H0: µd = 0 and H1: µd ≠ 0, i.e., there is a difference in the marks of the
two tests, then the critical region is two-sided.
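The paired calculation can be sketched from the two rows of marks. The function name `paired_t` is our own. Because the sketch does not round s_d to 7.3, it gives t close to 2.16 rather than the 2.17 obtained by hand; the conclusion is unaffected.

```python
import math

def paired_t(first, second):
    """Return (mean difference, s_d, t) for paired observations."""
    d = [b - a for a, b in zip(first, second)]     # d = second - first
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))
    t = (d_bar - 0) * math.sqrt(n) / s_d
    return d_bar, s_d, t

test_1 = [68, 25, 58, 56, 64, 55, 57, 69, 34, 44]
test_2 = [71, 39, 59, 59, 57, 68, 69, 76, 43, 39]
d_bar, s_d, t = paired_t(test_1, test_2)
print(f"mean difference = {d_bar}, s_d = {s_d:.2f}, t = {t:.2f}")
```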
Example 5.8.15: The yields of two types of grain, 'Type 17' and 'Type 51', in pounds per acre
in 6 replications are given below. What comments would you make on the difference in the mean
yields? You may assume that for 5 degrees of freedom and P = 0.2, t is approximately
1.476.
1 20.50 24.86
2 24.60 26.39
3 23.06 28.19
4 29.98 30.75
5 30.37 29.97
6 23.83 22.04
Total: ∑d = 9.86, $\sum (d - \bar{d})^2$ = 36.27
Let us take the null hypothesis that the difference in types has no effect on yields, i.e., the
population mean of the differences is zero. Then
$t = \frac{(\bar{d} - 0)\sqrt{n}}{s_d} = \frac{(1.643 - 0)\sqrt{6}}{2.69} = 1.496$
The value of t is less than t0.05 for 5 d.f. and therefore the difference is not significant at
the 5% level, but it is significant at the 20% level.
Example 5.8.16: The sleep of 10 patients was measured for the effect of the soporific drugs
referred to in the following table as Drug A and Drug B. From the data given below show that
there is a significant difference between the effects of the two drugs, on the assumption that
different random samples of patients were used to test the two drugs A and B. You may
assume that for 9 degrees of freedom and P = .05, t = 2.26.
Let the joint distribution of two variables X and Y be a bivariate normal distribution with
means µ1, µ2, standard deviations σ1, σ2 and correlation coefficient ρ. Let a random sample
(x1, y1), (x2, y2), ..., (xn, yn) be drawn from this bivariate normal population. For the t-test
of H0: ρ = 0, i.e. that the population correlation coefficient is zero, we have
Test statistic, $t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$ with (n - 2) degrees of freedom
Illustrative Examples
$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.5\sqrt{15 - 2}}{\sqrt{1 - (0.5)^2}} = \frac{0.5\sqrt{13}}{\sqrt{1 - .25}} = \frac{1.803}{.866} = 2.082$
The tabulated value of t0.05 for 13 degrees of freedom is 2.16, which is more than the
calculated value of t. So at the 5% level of significance the sample correlation does not
warrant the existence of correlation in the population.
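The test statistic for a sample correlation coefficient, and its inversion to find the least significant r for a given critical value of t, can be sketched as below. Both function names are our own; the inversion follows from solving t = r√(n − 2)/√(1 − r²) for r, which gives r = t/√(t² + n − 2). The critical value 2.16 for 13 degrees of freedom is the one quoted in the text.

```python
import math

def correlation_t(r, n):
    """t = r sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def least_significant_r(t_critical, n):
    """Smallest r whose t-value reaches t_critical, from r = t / sqrt(t^2 + n - 2)."""
    return t_critical / math.sqrt(t_critical ** 2 + n - 2)

t = correlation_t(0.5, 15)
r_min = least_significant_r(2.16, 15)
print(f"t = {t:.3f}, least significant r for n = 15: {r_min:.3f}")
```

The second function is the kind of calculation asked for in problems that request the least value of r significant at a given level, once the appropriate critical t is taken from tables.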
Example 5.8.18: Find the least value of r in a sample of 18 pairs from a bivariate normal
population that is significant at the 5% level, r being the coefficient of correlation of the
sample.
Snedecor's F-Distribution
$F = \frac{S_1^2}{S_2^2}$, where $S_1^2 > S_2^2$
and
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1}, \quad S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1}$
The numerator and denominator of the second member are independent χ² variates with
v1 and v2 degrees of freedom respectively.
Illustrative Examples
Example5.8.19: Two samples of size 9 and 8 give the sum of squares of deviations from their
respective means equal to 160 square and 91 squares respectively. Can they be regarded as
drawn from the two normal populations with same variance? Given that F0.05 for 8 and 7 d.f.
is 3.73
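Solution: Here
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{160}{9 - 1} = 20$
$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{91}{7} = 13$
$F = \frac{S_1^2}{S_2^2} = \frac{20}{13} = 1.54$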
This calculated value of F is less than F0.05 for 8 and 7 degrees of freedom i.e., 3 .73.
Therefore the calculated value of F is not at all significant. Hence the two samples can be
regarded as drawn from two normal populations with the same variance.
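The variance-ratio computation can be sketched as follows. The function name `variance_ratio` is hypothetical; it takes the two sums of squared deviations and the sample sizes, and always places the larger variance estimate in the numerator, as the definition of F above requires.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """Return (F, v1, v2) with the larger variance estimate in the numerator."""
    s1_sq = ss1 / (n1 - 1)
    s2_sq = ss2 / (n2 - 1)
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1

# Example 5.8.19: sums of squares 160 and 91 from samples of sizes 9 and 8
F, v1, v2 = variance_ratio(160, 9, 91, 8)
print(f"F = {F:.2f} with ({v1}, {v2}) degrees of freedom")
```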
Example5.8.20: Two independent samples of 8 and 7 items respectively had the following
values of the variable:
Sample I 9 11 13 11 15 9 12 14
Sample II 10 12 10 14 9 8 10
Does the estimate of Population variance differ significantly? Given that for 7 degrees of
freedom the value of F at 5 % level of significance is 4.20 nearly
Solution:
Sample I Sample II
x x2 y y2
9 81 10 100
11 121 12 144
13 169 10 100
11 121 14 196
15 225 9 81
9 81 8 64
12 144 10 100
14 196 — —
94 1138 73 785
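$\bar{x} = \frac{94}{8} = 11.75, \quad \bar{y} = \frac{73}{7} = 10.43$
$\sum (x - \bar{x})^2 = \sum x^2 - 2\bar{x} \sum x + n\bar{x}^2$
$= 1138 - 2 \times \frac{94}{8} \times 94 + 8 \times \left( \frac{94}{8} \right)^2$
$= 1138 - \frac{94^2}{8}$
$= 33.5$
Similarly, $\sum (y - \bar{y})^2 = 23.7$
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{33.5}{7}$
$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{23.7}{6}$
$F = \frac{S_1^2}{S_2^2} = \frac{33.5 \times 6}{7 \times 23.7} = 1.21$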
Hence differences are not significant. Therefore the samples may well be drawn from the
population with same variance.
Example 5.8.21: Two random samples drawn from two normal populations are:
Sample I 20 16 26 27 23 22 18 24 25 19
Sample II 27 33 42 35 32 34 38 28 41 43 30 37
Obtain the estimates of the variances of the populations and test whether the two populations
have the same variance.
Solution: Do as above
Hint: This calculated value of F (=2.14) is less than F.05 (=3.112) at 11 and 9 degrees of
freedom. Hence, the hypothesis H0 : σ12 = σ22 may be accepted. Therefore the samples may
be regarded as drawn from the populations which have same variance.
Example5.8.22: For a random sample of 10 pigs, fed on diet A, the increases in weight in
pounds in a certain period were
For another random sample of 12 pigs, fed on diet B, the increases in the same period were
7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17 lbs.
Show that the estimate of population variance in the two samples was not significantly
different (for v1 = 11, v2 = 9, the 5% value of F = 3.112)
Hint: Since FCal =2.14 < F.05 = 3.112 at (11, 9) H0 : σ12 = σ22 accepted.
The estimates of population variance in the two samples are not significantly different.
Example5.8.23: Show how you would use Student's t-test and Snedecor's F-test to decide
whether the following two samples have been drawn from the same normal population. Which
of the two tests would you apply first and why?
Sample I 9 68 36
Sample II 10 69 42
Hint-1- Since FCal =1.04 < F9,7, (.05) = 3.4. Hence H0 : σ12 = σ22 is accepted.
Thus from (1) and (2) we may conclude that the two samples have been drawn from the
same normal populations.
Fisher's z-Distribution
Let x1, x2,… xn1 and y1, y2,… yn2 be the values of two independent random samples with
estimated variances $S_1^2$ and $S_2^2$. Suppose we are required to test the significance of
the difference between the two sample variances. To do so Fisher has defined a statistic z by
the relation
$z = \frac{1}{2} \log_e F = \frac{1}{2} \log_e \frac{S_1^2}{S_2^2} = \log_e \frac{S_1}{S_2}$
If we write v1 = n1 - 1 and v2 = n2 - 1, then v1 and v2 are the degrees of freedom of the
estimates $S_1^2$ and $S_2^2$.
R. A. Fisher has shown that if the samples come from the same universe, and that universe
is normal, z is distributed according to the law
$y = y_0 \frac{e^{v_1 z}}{(v_1 e^{2z} + v_2)^{\frac{1}{2}(v_1 + v_2)}}$
Significance Test: The hypothesis to be tested is that $S_1^2$ and $S_2^2$ are estimates of the
same population variance. The divergence of the value of z from 0 is the basis of this
test.
For P = 0.05, a value of z > z0 refutes the hypothesis and we conclude that the samples have
been taken from populations with different variances.
Illustrative Examples
Example5.8.24: Two gauge operators are tested for precision in making measurements. One
operator completes a set of 26 readings with a standard deviation of 1.34 and the other does 34
readings with a standard deviation of 0.98 what is the level of significance of this difference?
You are given that for v1 = 25 and v2= 33, z0.05 = 0.306, z0.01=0.432
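$S_1^2 = \frac{n_1 SD_1^2}{n_1 - 1} = \frac{26 \times (1.34)^2}{26 - 1} = 1.8674$
$S_2^2 = \frac{n_2 SD_2^2}{n_2 - 1} = \frac{34 \times (.98)^2}{34 - 1} = 0.9895$
$F = \frac{S_1^2}{S_2^2} = \frac{1.8674}{0.9895} = 1.887$
$z = \frac{1}{2} \log_e F = \frac{1}{2} \log_e \frac{S_1^2}{S_2^2} = \frac{1}{2} \log_e (1.887)$
$= \frac{1}{2} \log_{10} (1.887) \times \log_e 10$
$= \frac{1}{2} \times 0.2758 \times 2.3026$
$= 0.3175$
Since zcal > z0.05 (0.306) and zcal < z0.01 (0.432), the difference is significant at the 5%
level but not at the 1% level.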
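The z computation for the two gauge operators can be sketched as below. The function name `fisher_z` is our own; it assumes that the quoted standard deviations were computed with the divisor n, so each is converted to a variance estimate with the divisor n - 1 before the ratio is taken. The critical values are those given in the example.

```python
import math

def fisher_z(sd1, n1, sd2, n2):
    """Return (z, F), where z = 0.5 * ln(F) and F is the larger over the smaller variance estimate."""
    s1_sq = n1 * sd1 ** 2 / (n1 - 1)
    s2_sq = n2 * sd2 ** 2 / (n2 - 1)
    f = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
    return 0.5 * math.log(f), f

z, f = fisher_z(1.34, 26, 0.98, 34)
Z_05, Z_01 = 0.306, 0.432      # critical values quoted for v1 = 25, v2 = 33
print(f"F = {f:.3f}, z = {z:.4f}")
print("significant at 5%:", z > Z_05, "| significant at 1%:", z > Z_01)
```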
Example 5.8.25: Show how you would use Student's t-test and Fisher's z-test to decide whether
the two sets of observations:
17 27 18 25 27 29 27 23 17
and 16 16 20 16 20 17 15 21
[The value of z at 5% points for 8 and 7 degrees of freedom is .6575 and the value of z at 1 %
points for 8 and 7 degrees of freedom is 0.9614]
x      x - 23    (x - 23)²    y     y - 16    (y - 16)²
17     -6     36     16    0     0
27     4      16     16    0     0
18     -5     25     20    4     16
25     2      4      16    0     0
27     4      16     20    4     16
29     6      36     17    1     1
27     4      16     15    -1    1
23     0      0      21    5     25
17     -6     36
Total  3      185          13    59
Since the calculated value of z is less than the table value of z at the 1% level of
significance for 8 and 7 degrees of freedom, H0 : σ12 = σ22 is accepted at the 1% level. The
two population variances are the same at the 1% level of significance.
5.9 Assignment
1. The following is the distribution of 106 nine pig-litters according to the numbers of
males in the litters-
Fit a Binomial distribution under the hypothesis that the sex ratio is 1:1.Test the goodness
of fit. Given that χ2 for 4 degrees of freedom at 5% level of significance=9.488.
2. The following table gives the number of aircraft accident that occurred during the
various days of the week. Find whether the accident is uniformly distributed over the week.
No. of 14 16 12 19 9 14 84
Accident
3. Records taken of the number of male and female birth in 800 families having four
children are as follows-
0 4 32
1 3 178
2 2 290
3 1 236
4 0 64
Test whether the data are consistent with the hypothesis that the binomial law holds and
that the chance of a male birth is equal to that of a female birth, that is q=p=1/2.You may
use the table given below-
Degrees of 1 2 3 4 5
Freedom
4. The following data shows the suicides of women in eight German states during fourteen
years-
ObservedFrequen 9 19 7 20 15 11 8 2 3 5 3 112
cy
Fit a Poisson distribution and test its goodness of fit. The value of χ2 for 6d.f. at 5% level of
significance=12.592
5. Genetic theory states that children having one parent of blood type M and the other of
blood type N will always be one of the three types M, MN, N and that the proportions of the
three types will be on average as 1:2:1. A report states that out of 300 children having one
M parent and one N parent, 30% were found to be type M, 45% type MN and the remainder type N.
Test the hypothesis by χ²; for 2 degrees of freedom the 5% value of χ² is 5.991.
Not Light
light
Test whether the colour of the son‘s eyes is associated with that of the father‘s (χ2=3.84,v=1)
7. In an experiment on the immunization of goats from anthrax the following results were
obtained. Derive your inference on the efficiency of the vaccine:
Inoculated withVaccine 2 10 12
Not Inoculated 6 6 12
Total 8 16 24
8. The following table gives the series of controlled experiment. Discuss whether the
treatment may be considered to have any positive effect-
Treatment 9 2 1 12
Control 3 6 3 12
Total 12 8 4 24
Affected Unaffected
Inoculated 12 26
Not inoculated 16 6
11. Five χ²'s computed for fourfold tables in independent replications of an experiment are
0.50, 4.10, 1.20, 2.79 and 5.41. Does the aggregate of these tests yield a significant result?
Given:
Degrees of freedom 4 5 6 7
18. Write down an expression for testing the independence of two attributes.
ANSWERS 5.9
4. Show that in the random sample of size 25 from an uncorrelated normal population the
chance is 1 in 100 that r is greater than about 0.43.
5. Find the least value of r in a sample of 27 paired observations from a bivariate normal
population that is significant at 5% level of significance.
7. Calculate the value of t in the case of two characters A and B whose corresponding value:
are given below:
A 16 10 8 9 9 8
B 8 4 5 9 12 4
Ans. t = 1.66.
8. The figures below are for protein tests of the same variety of wheat grown in two
districts. The average in District I is 12.74 and in District II is 13.03. Calculate t for testing
the significance of the difference between the means of the two districts:
Protein results
Ans. t = 0.85
9. In a Test Examination given to two groups of students the marks obtained were as
follows:
First 18 20 36 50 49 36 34 49 41
group
Second 29 28 26 35 30 44 46
group
Examine the significance of difference between the arithmetic averages of the marks
10. For a random sample of 12 boys fed on diet A, the increases in weight in pounds in a
25, 32, 30, 34, 24, 25,14, 32,24, 30, 31, 35.
For another random sample of 15 boys fed on diet B, the increase weight in pounds in
44, 34, 22,10,47, 31,40, 30, 32, 35,18, 21, 35, 29, 22.
Given that the value of t for 25 degrees of freedom at 5% level of significance is 2.06.
11. The means of two random samples of sizes 9 and 7 respectively, are 196.4 and 198.82
respectively. The sum of the squares of the deviations from the means are 26.94 and
18.73 respectively. Can the samples be considered to have been drawn from the same
normal population?
It being given that the value of t for 14 d.f. at 5% level of significance is 2.145 and at 1%
level of significance is 2.977.
Ans. t = 2.65
12. Two types of batteries-A and B are tested for their length of life and following results
are obtained :