1. Introduction
2. Definition
3. Use of Regression
4. Difference between correlation and regression
5. Method of studying Regression
6. Conclusion
7. Reference

Introduction :

So far we have studied correlation analysis, which measures the direction
and strength of the relationship between two variables. In regression analysis
we go a step further and estimate or predict the value of one variable from the
given value of the other variable. For instance, price and supply are correlated.
We can find out the expected amount of supply for a given price, or the price
level required for attaining a given amount of supply.

Regression helps us to estimate one variable, the dependent variable, from
the other variable, the independent variable. In other words, we can estimate
the value of one variable, provided the value of the other variable is given.
The statistical method which helps us to estimate the unknown value of one
variable from the known value of the related variable is called Regression.

The dictionary meaning of the word regression is ‘return’ or ‘going back’. In
1877, Sir Francis Galton first introduced the word regression while studying
the relationship between the heights of fathers and sons. He studied the
heights of 100 fathers and sons and concluded that tall fathers tended to have
tall sons and short fathers tended to have short sons. He also found that the
average height of the sons of tall fathers was less than the average height of
the tall fathers, and that the average height of the sons of short fathers was
more than the average height of the short fathers. Galton called the line
describing this tendency to regress, or go back, the Line of Regression. The
line describing the average relationship between two variables is known as the
line of regression. Modern writers use the term estimating line instead of
regression line.


According to Blair, “Regression is the measure of the average
relationship between two or more variables in terms of the original units of the
data.”

According to Taro Yamane, “One of the most frequently used techniques in
economics and business research, to find a relation between two or more
variables that are related causally, is regression analysis.”

According to Wallis and Roberts, “It is often more important to find out
what the relation actually is in order to estimate or predict one variable (the
dependent variable), and the statistical technique appropriate in such a case is
called Regression Analysis”.

According to Ya-Lun Chou, regression analysis attempts to establish
the nature of the relationship between the variables, that is, to study the
functional relationship between the variables X and Y, and thereby provides a
mechanism for prediction or forecasting.

There are two types of variables in regression analysis: (a) the independent
variable and (b) the dependent variable. The variable whose value is influenced
or is to be predicted is called the dependent variable, whereas the variable
which influences the value or is used for prediction is called the independent
variable.

According to Ya-Lun Chou, regression analysis is a statistical device with
the help of which we can estimate or predict the unknown values of one variable
from the known values of another variable. In regression analysis, the
independent variable is also known as the “Regressor”, “Predictor” or
“Explanator”, and the dependent variable is known as the “regressed” or
“explained” variable.

Use of Regression Analysis:

• Regression analysis is used in statistics and in all those fields where two
or more related variables have a tendency to go back to the average. It is
used more than correlation analysis in many scientific studies. It is
widely used in social sciences like economics, and in the natural and
physical sciences. It is used to estimate the relationship between two
economic variables like income and expenditure. If we know the income, we
can find out the probable expenditure. Thus it is a highly valuable tool in
economics and business. Most economic issues are based on cause and effect
relationships, so it is very useful for prediction purposes. In business
also, it is very helpful for making predictions; for example, the cost of
production is affected by the sale of the product. Economists have arrived
at many predictions and theories on the basis of regression.
• Regression analysis predicts the values of dependent variables from the
values of independent variables.
• The regression line equation helps to estimate the value of the dependent
variable when the values of the independent variables are substituted into
the equation.
• We can calculate the coefficient of correlation (r) and the coefficient of
determination (r²) with the help of the regression coefficients.
• Using regression analysis, statistical estimates of demand curves,
production functions, consumption functions, etc. can be obtained.
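The point about obtaining r and r² from the regression coefficients can be sketched in a few lines of Python. This is only an illustration: it assumes the standard relation that r is the square root of the product of the two regression coefficients, taken with their common sign. The function name and the sample coefficients are hypothetical.

```python
import math

def correlation_from_regression(b_yx, b_xy):
    """Return (r, r squared) from the two regression coefficients.

    b_yx is the slope of the regression of Y on X and b_xy is the slope
    of the regression of X on Y. Both always carry the same sign, and r
    takes that sign.
    """
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.copysign(math.sqrt(product), b_yx)
    return r, product

# Made-up coefficients, used only to illustrate the calculation.
r, r2 = correlation_from_regression(0.8, 0.45)
print(round(r, 4), round(r2, 4))  # prints: 0.6 0.36
```

The product of the two coefficients is itself the coefficient of determination, so no separate calculation of r² is needed.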

Significance of Regression Study:

The coefficient of correlation between two variables is an abstract, pure
number which measures the degree of the relationship between them. While
dealing with economic and commercial data, we are required to make predictions
or estimates. For instance, with a rise in price the demand for a commodity
goes down; with a better monsoon the output of agricultural produce increases,
etc. The first objective of regression analysis is to provide estimates of the
values of the dependent variable from values of the independent variable.
Prediction or estimation is one of the major problems in almost all spheres of
human activity. The prediction or estimation of future activities is important
for businessmen. Thus the statistical device with the help of which we estimate
or predict the unknown values of one variable from the known values of another
variable is known as regression. Regression analysis is one of the scientific
methods for making such predictions. According to Blair, “Regression analysis
is a mathematical measure of the average relationship between two or more
variables in terms of the original units of the data”. This is done with the
help of the regression line. The regression line describes the average
relationship existing between the X and Y variables, i.e., it is a line which
displays mean values of Y for given values of X.

Regression analysis confined to the study of only two variables at a
time is termed simple regression. Regression analysis for studying more
than two variables at a time is known as multiple regression.

Correlation and Regression:

The correlation coefficient is a measure of the degree of covariability
between two variables, while regression establishes a functional relationship
between the dependent and independent variables so that the former can be
predicted for a given value of the latter. Correlation should precede
regression, for if the relationship is not sufficiently strong there would
appear to be no sound basis for prediction.

The differences between correlation and regression are:

1. Correlation: Correlation is the relationship between two or more
   variables which vary in sympathy with one another, in the same or the
   opposite direction.
   Regression: Regression means going back, and it is a mathematical measure
   showing the average relationship between two variables.
2. Correlation: Both the variables X and Y are random variables.
   Regression: The dependent variable is treated as a random variable and the
   independent variable as fixed. Sometimes both variables may be random
   variables.
3. Correlation: It finds out the degree of relationship between two variables
   and not the cause and effect of the variables.
   Regression: It indicates the cause and effect relationship between the
   variables and establishes a functional relationship.
4. Correlation: It is used for testing and verifying the relation between two
   variables and gives limited information.
   Regression: Besides verification, it is used for the prediction of one
   value in relation to the other given value.
5. Correlation: The coefficient of correlation is a relative measure. The
   range of the relationship lies between -1 and +1.
   Regression: The regression coefficient is an absolute figure. If we know
   the value of the independent variable, we can find the value of the
   dependent variable.
6. Correlation: There may be nonsense correlation between two variables.
   Regression: In regression there is no such nonsense regression.
7. Correlation: It has limited application, because it is confined only to
   linear relationships between the variables.
   Regression: It has wider application, as it studies linear and non-linear
   relationships between the variables.
8. Correlation: It is not very useful for further mathematical treatment.
   Regression: It is widely used for further mathematical treatment.
9. Correlation: If the coefficient of correlation is positive, then the two
   variables are positively correlated, and vice versa.
   Regression: The sign of the regression coefficient shows the direction of
   change; a negative coefficient means that a decrease in one variable is
   associated with an increase in the other variable.
10. Correlation: It is immaterial whether X depends upon Y or Y depends upon
    X.
    Regression: There is a functional relationship between the two variables,
    so that we may distinguish between the independent and dependent
    variables.

Method of Studying Regression:

There are two methods :

1. Graphic Method
2. Algebraic Method
1. Graphic Method: The points are plotted on graph paper, representing
pairs of values of the concerned variables. In this diagram the
independent variable is taken on the horizontal axis and the dependent
variable on the vertical axis. These points give a picture of a scatter
diagram. A regression line may be drawn in between these points by free
hand or with a ruler. It should be drawn carefully as the line of best
fit, leaving roughly an equal number of points on both sides, in such a
manner that the sum of the squares of the distances of the points from
the line is the least.

The following illustration shows the regression line fitted on the scatter
diagram:

65 68
63 66
67 68
64 65
68 69
62 66
70 68
60 65
68 71
67 67
69 68
71 70
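As a rough numerical counterpart to drawing the line by hand, the least squares line can be computed directly from the pairs of values listed above. This is a minimal sketch, not the method used for the original illustration. It assumes that the first column is the independent variable X and the second the dependent variable Y; the function name is hypothetical.

```python
# Pairs of values as listed above. Treating the first column as X
# (independent) and the second as Y (dependent) is an assumption.
pairs = [(65, 68), (63, 66), (67, 68), (64, 65), (68, 69), (62, 66),
         (70, 68), (60, 65), (68, 71), (67, 67), (69, 68), (71, 70)]

def fit_y_on_x(points):
    """Least squares line Y = a + b*X through the given (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b = sxy / sxx             # slope of the fitted line
    a = mean_y - b * mean_x   # the line passes through (mean_x, mean_y)
    return a, b

a, b = fit_y_on_x(pairs)
print(f"Y = {a:.3f} + {b:.3f} X")
```

Plotting this line over the scatter of points gives the line that a careful freehand drawing is trying to approximate.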

2. Algebraic Method:

Regression Line: In graphical terms, a regression line is a straight
line fitted to the data by the method of least squares. It indicates the best
probable mean value of one variable corresponding to a given value of the
other. Since a regression line is the line of best fit, it cannot be used
conversely; therefore there are always two regression lines constructed for the
relationship between two variables, say X and Y: (a) the regression line of
Y on X and (b) the regression line of X on Y.

A line of regression gives the best average value of one variable for any
given value of the other variable. The regression line of X on Y gives the most
probable values of X for any given value of Y. In the same manner, the
regression line of Y on X gives the most probable values of Y for any given
value of X.

When there is perfect positive correlation (+1) or perfect negative
correlation (-1), the two regression lines coincide with each other, i.e., there
will be only one line. If the regression lines are nearer to each other, then
there is a higher degree of correlation. If the two lines are farther away from
each other, then there is a lesser degree of correlation. If r = 0, the two
variables are uncorrelated and the two regression lines cut each other at right
angles.
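The link between r and the angle between the two regression lines can be checked with a small sketch. It assumes the standard slopes r·σy/σx for the line of Y on X and r·σx/σy for the line of X on Y; the function name and the standard deviations used are hypothetical.

```python
import math

def angle_between_regression_lines(r, sd_x, sd_y):
    """Acute angle in degrees between the two regression lines.

    In the X-Y plane the line of Y on X runs along the direction
    (sd_x, r*sd_y) and the line of X on Y along (r*sd_x, sd_y).
    """
    d1 = (sd_x, r * sd_y)
    d2 = (r * sd_x, sd_y)
    dot = abs(d1[0] * d2[0] + d1[1] * d2[1])
    cos_theta = dot / (math.hypot(*d1) * math.hypot(*d2))
    # Guard against rounding slightly above 1 before taking the arc cosine.
    return math.degrees(math.acos(min(1.0, cos_theta)))

# Illustrative standard deviations (sd_x = 2, sd_y = 3) and several values of r.
for r in (1.0, 0.9, 0.5, 0.0, -1.0):
    print(r, round(angle_between_regression_lines(r, 2.0, 3.0), 2))
```

The angle shrinks towards zero as r approaches +1 or -1 and reaches 90 degrees at r = 0.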

The regression lines cut each other at the point of the averages of X and
Y. If we draw a perpendicular from this point to the X axis we get the average
value of X, and if we draw a perpendicular from this point to the Y axis we get
the average value of Y.
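That the two lines meet at the point of averages can be verified on a small data set. The sketch below uses made-up values and hypothetical function names; it fits both lines by least squares and solves for their point of intersection.

```python
def regression_lines(xs, ys):
    """Return ((a1, b_yx), (a2, b_xy)) for Y = a1 + b_yx*X and X = a2 + b_xy*Y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sxy / sxx
    b_xy = sxy / syy
    return (mean_y - b_yx * mean_x, b_yx), (mean_x - b_xy * mean_y, b_xy)

def intersection(line_yx, line_xy):
    """Point where the two lines cross (requires r squared below 1)."""
    a1, b_yx = line_yx
    a2, b_xy = line_xy
    # Substitute Y = a1 + b_yx*X into X = a2 + b_xy*Y and solve for X.
    x = (a2 + b_xy * a1) / (1 - b_yx * b_xy)
    return x, a1 + b_yx * x

# Made-up data for illustration; the means are X = 3 and Y = 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
line_yx, line_xy = regression_lines(xs, ys)
print(intersection(line_yx, line_xy))  # approximately (3.0, 4.0)
```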

Regression Equations: A regression equation is the algebraic expression of
the regression line. The study of regression equations can be classified into
regression equation, regression coefficient, individual observation and group
distribution.

As there are two regression lines, there are two regression equations for
the two variables X and Y: the regression equation of X on Y and the
regression equation of Y on X. The former states the change in the value of X
for a given change in Y, and the latter states the change in the value of Y for
a given change in X.

Regression Equation of X on Y:
Xc = a + bY

Here ‘a’ and ‘b’ are the two unknown constants, or parameters, of the line,
which determine its position. The constant ‘a’ shows the level of the fitted
line: it is the value of Xc when Y = 0, that is, the distance between the
origin and the point where the regression line meets the X axis. The constant
‘b’ shows the slope of the line. Xc is the value of X computed from the
relationship for a given Y. By the least squares method we can find the values
of ‘a’ and ‘b’ and so determine the regression line, which is known as the line
of best fit. The normal equations are:

∑X = Na + b∑Y

∑XY = a∑Y + b∑Y²

N is the number of observed pairs of values. ∑X, ∑Y, ∑XY and ∑Y² are the
totals computed from the values of the two variables X and Y. Solving these two
equations for ‘a’ and ‘b’, we can fit a least squares line.
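The two normal equations for the regression of X on Y can be solved for ‘a’ and ‘b’ directly. The following is a minimal sketch using Cramer's rule; the function name and the data are made up for illustration.

```python
def fit_x_on_y(xs, ys):
    """Solve the normal equations of X on Y for a and b in Xc = a + b*Y.

    sum(X)  = N*a      + b*sum(Y)
    sum(XY) = a*sum(Y) + b*sum(Y^2)
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_yy = sum(y * y for y in ys)
    det = n * sum_yy - sum_y ** 2   # determinant of the 2 x 2 system
    if det == 0:
        raise ValueError("all Y values are equal; the line is not defined")
    a = (sum_x * sum_yy - sum_y * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_x_on_y(xs, ys)
print(a, b)        # prints: -1.0 1.0
print(a + b * 6)   # estimated X for Y = 6, prints: 5.0
```

The regression equation of Y on X is obtained in the same way by interchanging the roles of X and Y.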

Conclusion :

Reference :

1. Mathematics and Statistics for Economics, G S Monga (2008), Vikas
Publishing House Pvt. Ltd.
2. Statistics, T.S.N Pillai and Bagavathi (1994), S. Chand and Company
Ltd, Ram Nagar, New Delhi.

