17 Regression Analysis
INTRODUCTION TO REGRESSION:
1. If the income of a family increases, its expenditure will increase; but if the
expenditure increases, there is no guarantee that the income will also increase.
2. If the rainfall is timely and good, the crop will be good; but if the crop is good, there
is no guarantee that the rainfall was timely and good.
REGRESSION EQUATIONS:
DEEPAK SHARMA
5.2 REGRESSION ANALYSIS [C.P.T.]
Regression Equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
Regression Equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
Graphic method :
Under this method the two variables are plotted on a graph. It is normal practice to
plot the dependent variable on the y-axis and the independent variable on the x-axis. The
plotted points form a scatter diagram, which is shown below.
Next we draw a line, called the line of regression, through these points. The line should
leave roughly equal numbers of points on either side and lie as close as possible to all of
them. This is achieved by the method of least squares, in which the sum of the squares of
the differences between the observed value of Y and the value of Y estimated from the line
is minimised over all the points.
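[Figure: scatter diagram of the points, with X on the horizontal axis and Y on the vertical axis, and the regression line Y = a + bX drawn through them.]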
(1) Least Square Method
Yc = a + bX
To determine the values of a and b, the following normal equations are to be solved
simultaneously :
$\sum Y = Na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Similarly, for the regression equation of X on Y, Xc = a + bY, the values of a and b are
obtained by solving the following normal equations simultaneously :
$\sum X = Na + b\sum Y$
$\sum XY = a\sum Y + b\sum Y^2$
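Each pair of normal equations is simply two linear equations in a and b, so it can be solved directly. The sketch below does this for the Y on X line using Cramer's rule. The function name `solve_normal_equations` and the sample data are illustrative assumptions, not part of these notes.

```python
def solve_normal_equations(xs, ys):
    """Return (a, b) for the least-squares line  Yc = a + b*X.

    Solves the two normal equations
        sum(Y)  = N*a      + b*sum(X)
        sum(XY) = a*sum(X) + b*sum(X^2)
    by Cramer's rule.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)

    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all X values are equal; the line is not determined")
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = solve_normal_equations(xs, ys)
print(a, b)
```

Calling `solve_normal_equations(ys, xs)` with the arguments swapped solves the second pair of normal equations and gives the X on Y line.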
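(3) $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$ ; $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$
(4) $b_{xy} = \dfrac{n\sum d_x d_y - \sum d_x \sum d_y}{n\sum d_y^2 - (\sum d_y)^2}$ (Short cut Method)
(6) $b_{xy} = \dfrac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$ (Direct Method)
(7) $b_{yx} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$ (Direct Method)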
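Multiplying the two expressions in (3) gives a relation that several of the objective questions below depend on. As a short worked derivation:

```latex
b_{xy}\, b_{yx}
  = \left(r\,\frac{\sigma_x}{\sigma_y}\right)\left(r\,\frac{\sigma_y}{\sigma_x}\right)
  = r^2
\quad\Longrightarrow\quad
r = \pm\sqrt{b_{xy}\, b_{yx}}
```

Since both standard deviations are positive, the two regression coefficients always share the sign of r, and r takes that common sign.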
EXERCISE :
Q.1 From the following data obtain the two regression equations :
X 6 2 10 4 8
Y 9 11 5 8 7
(Ans: Y = 11.9 – 0.65 X ; X = 16.4 – 1.3 Y)
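The stated answer can be checked with the direct-method formulas (6) and (7). The following is a minimal sketch; the function name `regression_coefficients` is an illustrative choice, and the intercepts rely on the standard fact that both regression lines pass through the point of means.

```python
def regression_coefficients(xs, ys):
    """Return (byx, bxy) using the direct-method formulas (7) and (6)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy          # shared by both coefficients
    byx = numerator / (n * sxx - sx ** 2)  # formula (7)
    bxy = numerator / (n * syy - sy ** 2)  # formula (6)
    return byx, bxy


# Data of Q.1
xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
byx, bxy = regression_coefficients(xs, ys)

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Both regression lines pass through (mean_x, mean_y), which fixes the intercepts.
a_yx = mean_y - byx * mean_x   # intercept of Y on X
a_xy = mean_x - bxy * mean_y   # intercept of X on Y

print(f"Y = {a_yx:.2f} + ({byx:.2f}) X")
print(f"X = {a_xy:.2f} + ({bxy:.2f}) Y")
```

The two coefficients come out as –0.65 and –1.3 with intercepts 11.9 and 16.4, in agreement with the answer given above.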
Q.2 The following table shows the ages (X) and blood pressure (Y) of 8 persons :
X: 52 63 45 36 72 65 47 25
Y: 62 53 51 25 79 43 60 33
Obtain the regression equation of Y on X and find the expected blood pressure of a
person who is 49 years old.
[Ans: Y = 11.87 + 0.768X; expected blood pressure = 49.502]
Q.3 (i) Compute the two regression equations on the basis of the following
information:
                      X     Y
Mean                  40    45
Standard Deviation    10    9
Karl Pearson’s correlation coefficient between X and Y = 0.50
(ii) Also estimate the value of Y for X = 48 using the appropriate regression
equation.
[Ans: Y = 0.45X + 27; X = 0.556Y + 15; the estimate of Y for X = 48 is 48.6]
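When only the means, standard deviations and r are known, formula (3) gives the slope, and the intercept follows from the fact that the line passes through the means. A minimal sketch for Q.3, with `line_from_summary` as a hypothetical helper name:

```python
def line_from_summary(mean_x, mean_y, sd_x, sd_y, r):
    """Return (a, b) of the Y-on-X line  Y = a + b*X  from summary statistics."""
    b = r * sd_y / sd_x          # byx = r * sigma_y / sigma_x, formula (3)
    a = mean_y - b * mean_x      # the line passes through the two means
    return a, b


# Y on X, using the figures given in Q.3
a_yx, b_yx = line_from_summary(40, 45, 10, 9, 0.50)
# X on Y: swap the roles of the two variables
a_xy, b_xy = line_from_summary(45, 40, 9, 10, 0.50)

# Y is estimated from a given X, so the Y on X line is the appropriate one.
estimate = a_yx + b_yx * 48

print(f"Y = {a_yx:.0f} + {b_yx:.2f} X")
print(f"X = {a_xy:.0f} + {b_xy:.3f} Y")
print(f"Estimated Y at X = 48: {estimate:.1f}")
```

This reproduces Y = 0.45X + 27 and the estimate 48.6.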
4. If one regression coefficient is positive and the other is negative, we can say that
(a) correlation coefficient is positive
(b) correlation coefficient is negative
(c) correlation coefficient is imaginary
(d) given information is defective.
5. Generally, there are two regression lines for two regression equations; there can be only one
when the correlation coefficient is equal to
(a) +1 (b) –1
(c) +1 or –1 (d) zero.
6. If the two regression lines coincide completely, the correlation between the X and Y
variables shall be
(a) perfect (b) no correlation
(c) significant (d) insignificant.
10. If both regression coefficients are 2 and 0.45 respectively, the value of r will be
(a) 0.90 (b) 0.30
(c) 0.95 (d) 0.03.
12. If the correlation coefficient between X and Y is 0.50, what percentage of the total
variations remains unexplained by the regression equation?
(a) .25 (b) .50
(c) .75 (d) 1
15. Following are the two normal equations obtained for deriving the regression line of y on
x:
5a + 10b = 40
10a + 25b = 95
The regression line of y on x is given by
(a) 2x + 3y = 5 (b) 2y + 3x = 5 (c) y = 2 + 3x (d) y = 3 + 5x
17. Given the regression equations as 3x + y = 13 and 2x + 5y = 20, which one is the
regression equation of y on x?
(a) 1st equation (b) 2nd equation
(c) both (a) and (b) (d) none of these
18. If the regression coefficients bxy and byx are –0.4 and +1.6 respectively, the coefficient of
correlation should be
(a) 1 (b) –0.8
(c) +0.4 (d) None of the above.
20. For a bivariate data the mean value of X is 20, and that of Y is 45. The regression coefficient
of Y on X is 4 and that of X on Y is 1/9. From these data find out the coefficient of correlation.
(a) 0.35 (b) 0.47
(c) 0.67 (d) 0.87
23. Given the following equations: 2x – 3y = 10 and 3x + 4y = 15, which one is the regression
equation of x on y ?
(a) 1st equation (b) 2nd equation
(c) both the equations (d) none of these
26. If the regression line of y on x and that of x on y are given by y = –2x + 3 and 8x = –y + 3
respectively, what is the coefficient of correlation between x and y?
(a) 0.5 (b) –1/ 2 (c) –0.5 (d) none of these
28. The correlation co-efficient between x and y is 0.6 hence the correlation co-efficient
between x + 0.2 and y + 0.2 is
(a) 0.4 (b) 0.6 (c) –0.6 (d) 0.9.
31. If the regression coefficient of y on x, the coefficient of correlation between x and y and
variance of y are –3/4, –√3/2 and 4 respectively, what is the variance of x?
(a) 2/(√3/2) (b) 16/3 (c) 4/3 (d) 4
32. If y = 3x + 4 is the regression line of y on x and the arithmetic mean of x is –1, what is the
arithmetic mean of y?
(a) 1 (b) –1 (c) 7 (d) none of these
33. The correlation co-efficient between x and y is 0.87, hence the correlation co-efficient
between –x and –y is
(a) –0.87 (b) 0.87 (c) either a or b (d) none of them.
34. Estimate the loss of production in a week when the number of workers on strike is 1800
from the following information:
Average number of workers on strike = 800; Average loss of daily production in ‘000 Rs. = 35;
Standard deviation of numbers of workers on strike = 100; Standard deviation of loss of daily
production in ‘000 Rs. = 2; Coefficient of correlation between the number of workers on strike and the
daily production loss = 0.8. Assume 6 working days in a week.
(a) 3,00,000 (b) 3,05,000
(c) 3,06,000 (d) 3,07,000
36. The two lines of regression are given by 8x + 10y = 25 and 16x + 5y = 12 respectively. If
the variance of x is 25, what is the standard deviation of y?
(a) 16 (b) 8 (c) 64 (d) 4
37. Given below the information about the capital employed and profit earned by a company
over the last twenty five years:
Mean SD
Capital employed ( 0000 Rs) 62 5
Profit earned ( 000 Rs) 25 6
Correlation Coefficient between capital and profit = 0.92. The sum of the Regression
coefficients for the above data would be:
(a) 1.871 (b) 2.358 (c) 1.968 (d) 2.346
38. If the two lines of regression in a bivariate distribution are X + 9Y = 7 and Y + 4X = 16,
then Sx : Sy is
(a) 3:2 (b) 2:3 (c) 9:4 (d) 4:9.
39. If the relationship between two variables x and u is u + 3x = 10 and between two other
variables y and v is 2y + 5v = 25, and the regression coefficient of y on x is known as 0.80,
what would be the regression coefficient of v on u?
(a) 0.1067 (b) 2.3 (c) 0.945 (d) 0.498.
ANSWERS: