Properties of correlation coefficient
1). Correlation coefficient has a well defined formula.
2). It lies between -1 and +1.
3). It is a pure number and is independent of the units of measurement.
4). Correlation coefficient does not change with reference to change of origin or change of scale.
5). Coefficient of correlation between X and Y is same as that of Y and X.
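The properties listed above can be checked numerically. The sketch below is only an illustration: the data are hypothetical, and `pearson_r` is a helper written for this example using Karl Pearson's product-moment formula, which is assumed to be the formula meant in property 1.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)

# Property 2: r lies between -1 and +1.
in_range = -1 <= r <= 1

# Property 4: change of origin (subtract a constant) and change of scale
# (divide by a positive constant) leave r unchanged.
u = [(xi - 3) / 2 for xi in x]
v = [(yi - 4) / 10 for yi in y]
r_after_change = pearson_r(u, v)

# Property 5: r between X and Y equals r between Y and X.
r_swapped = pearson_r(y, x)

print(round(r, 4), in_range, round(r_after_change, 4), round(r_swapped, 4))
```

The scale factors used here are positive; dividing only one of the variables by a negative constant would reverse the sign of r while leaving its magnitude unchanged.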
Probable error
Probable error is used to measure the reliability and dependability of the value of correlation
coefficient. If probable error is added to or subtracted from the value of correlation coefficient, we get two
limits within which the value of correlation coefficient may be expected to lie.
CORRELATION
Correlation is defined as the relationship between two or more variables. Two variables are said to be
correlated if the change in one variable results in a corresponding change in the other variable.
For example: when the price of a commodity rises, the supply of that commodity also rises.
Different kinds of correlation
1. Positive and negative correlation
Positive correlation:
If two variables move in the same direction, then the correlation is said to be positive correlation.
That is, an increase in the value of one variable is accompanied by an increase in the value of the other variable, or a
decrease in the value of one variable is accompanied by a decrease in the value of the other variable.
For example: height, weight
Negative correlation:
If two variables move in opposite directions, then the correlation is said to be negative correlation.
That is, an increase in the value of one variable is accompanied by a decrease in the value of the other variable,
or a decrease in the value of one variable is accompanied by an increase in the value of the other variable.
For example: price, demand
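As a small illustration of the two cases, the sketch below computes the correlation coefficient for two sets of made-up figures. The height/weight and price/demand numbers are hypothetical and chosen only so that one pair moves together and the other moves in opposite directions; `pearson_r` is a helper defined for this example.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for paired values."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: both variables rise together.
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [50, 54, 59, 63, 68]

# Hypothetical data: one variable falls as the other rises.
prices = [10, 12, 14, 16, 18]
demand = [100, 90, 85, 70, 60]

r_height_weight = pearson_r(heights_cm, weights_kg)
r_price_demand = pearson_r(prices, demand)

print("height vs weight:", "positive" if r_height_weight > 0 else "negative")
print("price vs demand:", "positive" if r_price_demand > 0 else "negative")
```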
2. Linear and non-linear correlation
Linear correlation:
When the amount of change in one variable bears a constant ratio to the amount of change in the other
variable, then the correlation is said to be linear.
3. Simple, partial and multiple correlation
Simple correlation
In the study of relationship between variables, if there are only two variables, the correlation is said
to be simple.
Example: price, demand
Partial correlation
In partial correlation, when there are three variables, we study the relationship between any two of them
while the third variable remains constant.
Example: pressure, volume and temperature
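One common way to put a number on partial correlation is the first-order partial correlation coefficient, computed from the three simple correlation coefficients. The notes do not give this formula, so it is brought in here as an assumption about what is intended: $r_{12.3} = (r_{12} - r_{13} r_{23}) / \sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}$. The input values below are hypothetical.

```python
import math

def partial_r(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2,
    holding variable 3 constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical simple correlation coefficients among three variables.
r12, r13, r23 = 0.8, 0.6, 0.5

r12_3 = partial_r(r12, r13, r23)
print(round(r12_3, 4))

# If the third variable is uncorrelated with both of the others,
# the partial correlation reduces to the simple correlation.
unchanged = partial_r(0.8, 0.0, 0.0)
```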
Multiple correlation
In multiple correlation we study the relationship between one variable on one side and the
remaining variables on the other side.
Example: yield, rainfall and temperature
Distinction between correlation and Regression
1. In correlation analysis, we study the degree of relationship between variables. In regression
we study the nature of relationship.
2. In correlation analysis, the choice of dependent and independent variables depends purely
on personal choice and is of no practical significance.
In regression analysis, one has to decide which variable shall be taken as dependent and which as
independent.
3. Correlation analysis is not for the purpose of prediction whereas the regression analysis is
basically used for prediction purposes.
Properties of regression lines
1). The two lines intersect at $(\bar{x}, \bar{y})$.
2). When $r = \pm 1$, the two lines coincide.
3). When $r = 0$, the two lines are mutually perpendicular.
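The property that the two regression lines meet at the point of means can be checked on a small example. The sketch below uses hypothetical data. It writes the line of y on x as $y - \bar{y} = b_{yx}(x - \bar{x})$ and the line of x on y as $x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{yx}$ and $b_{xy}$ are the least squares slopes, and then solves the two equations together.

```python
# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b_yx = sxy / sxx  # slope of the regression line of y on x
b_xy = sxy / syy  # slope of the regression line of x on y

# Write both lines in the form a*x + b*y = c and solve by Cramer's rule.
# Line of y on x:  -b_yx * x + y = mean_y - b_yx * mean_x
# Line of x on y:   x - b_xy * y = mean_x - b_xy * mean_y
a1, b1, c1 = -b_yx, 1.0, mean_y - b_yx * mean_x
a2, b2, c2 = 1.0, -b_xy, mean_x - b_xy * mean_y

det = a1 * b2 - a2 * b1  # equals b_yx * b_xy - 1, zero only when r = +1 or -1
x_cross = (c1 * b2 - c2 * b1) / det
y_cross = (a1 * c2 - a2 * c1) / det

print((x_cross, y_cross), (mean_x, mean_y))
```

When $r = \pm 1$ the determinant is zero, which is the case where the two lines coincide and no single intersection point exists.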
Properties of regression coefficients
$b_{yx}$ is the regression coefficient of y on x and $b_{xy}$ is the regression coefficient of x on y.
1). The sign of both regression coefficients will be the same. That is, both will be positive or both will be
negative.
2). Product of the two regression coefficients is the square of the correlation coefficient.
$b_{yx} \cdot b_{xy} = r^2$
3). $b_{yx}$ and $b_{xy}$ will have the same sign as $r$.
4). When there is perfect correlation, $b_{yx}$ and $b_{xy}$ are reciprocals of each other.
2 ang
6). Both the regression coefficients cannot be greater than 1 at the same time. One of them can be greater than 1, or
both can be less than 1.
2). Method of least squares (curve fitting)
The principle of least squares is that principle which states that the line of best fit should be drawn in
such a manner that the sum of the squares of difference between the known values of the dependent
variable and the corresponding values of it obtained from the line of best fit should be the least.
i.e. $\sum (y_o - y_e)^2$ should be the least, where $y_o$ stands for known values of the dependent variable and $y_e$
stands for the corresponding value obtained from the line of best fit.
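The least squares principle stated above can be illustrated with a short sketch. It assumes a straight line $y_e = a + b x$ and uses the usual closed-form solution for the slope and intercept; the data and the alternative lines tried for comparison are hypothetical.

```python
# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form least squares estimates for the line y_e = a + b * x.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

def sum_of_squares(intercept, slope):
    """Sum of squared differences between known and fitted y values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

sse_best = sum_of_squares(a, b)

# Any other line gives a larger sum of squares; a few arbitrary alternatives:
alternatives = [(a + 0.5, b), (a - 0.5, b), (a, b + 0.1), (a, b - 0.1)]
sse_others = [sum_of_squares(i, s) for i, s in alternatives]

print(round(a, 4), round(b, 4), round(sse_best, 4))
print([round(s, 4) for s in sse_others])
```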
Regression lines
There are two types of regression lines.
1). Regression line of X on Y.
2). Regression line of Y on X.
Regression Equations
1). Regression equation of X on Y is $x - \bar{x} = b_{xy}(y - \bar{y})$.
2). Regression equation of Y on X is $y - \bar{y} = b_{yx}(x - \bar{x})$.
$b_{xy}$ and $b_{yx}$ are known as regression
coefficients.
REGRESSION
Regression means the estimation or prediction of the unknown value of one variable from the known value of another
variable. It is a statistical device used to study the relationship between two or more variables which are
related.
Simple regression
In the study of regression analysis, if there are two variables, it is known as simple regression.
Multiple regression
In multiple regression analysis, there are more than two variables and we try to find out the effect of
two or more independent variables on one dependent variable.
Regression curve
If the given bivariate data are plotted on a graph, the points so obtained on the scatter diagram will be
more or less concentrated around a curve, called the regression curve.
Linear regression
If the regression curve is a straight line, we say that there is linear regression between the variables
under study.
Non-linear regression
If the curve of regression is not a straight line, then the regression is termed non-linear or curvilinear
regression.
Line of best fit
If the points of the scatter diagram concentrate around a straight line, that line is called the line of best fit.
This line is also known as the regression line.
Method of drawing regression lines (graphic method)
1). Free hand curve method
Take the independent variable on the X-axis and the dependent variable on the Y-axis. We draw a smooth free
hand line in such a way that it clearly indicates the tendency of the original data. The line is fitted
by inspection. The free hand curve should be drawn in such a way that the areas of the curve below and above
the line are approximately equal. Such a line will describe the general tendency of the original data. This
is the regression line.
Time series are usually affected by a multiplicity of causes.
The change in the values of a variable related to time can be the result of
a large variety of factors like changes in the tastes and habits of people,
changes in population, reduction in cost of production, increase in
income etc. The value of a variable changes due to the interaction
of such forces. These forces are interconnected and cannot be
distinguished easily. The effects of these forces on a time series are
called the components of a time series.
These components are
(1) Secular trend
(2) Seasonal variations
(3) Cyclic variations
(4) Irregular variations.
1. Secular trend
Trend refers to long period changes. It shows the definite and
basic tendency of the statistical data with the passage of time. It is a
smooth, regular and long term movement. It refers to the general tendency
of statistical data to rise or to fall or to remain the same. For example, in
a series concerning population or national income an upward tendency
can be noticed, while in data on birth or death or illiteracy a downward
tendency can be noticed.
2. Seasonal variation
Seasonal variations are those variations which occur with some
degree of regularity within a specific period of one year or shorter. Climatic
conditions, social customs, religious functions etc. are the factors
responsible for seasonal variations. The prices of rice will go up in the
sowing season and will come down during the harvest season. This is
the seasonal variation in the case of price of rice.
3. Cyclic variation
Cyclic variations are periodic movements. These variations occur
at intervals (or periods) of more than one year. Business cycles and
trade cycles work on business and economic series. These cyclic movements
pass through different stages like prosperity, recession, depression
and recovery. During these different stages, the time series show
changes. These changes are called cyclic fluctuations. They are visible
in the case of most of the business and economic activities. Series
relating to prices, production, demand etc. undergo such cyclic changes.
4. Irregular fluctuations (Random variations)
Irregular fluctuations are those caused by unusual, unexpected and
Merits of Moving Average Method
1. It is quite simple and easy to calculate as compared with the method
of least squares.
2. There is no scope for bias or personal prejudice.
3. It can be extended both ways to give figures for the past and for the
future years.
4. It is most commonly used and in a very long series it is the only
practical method.
5. This method is associated with a high degree of accuracy and it can
be made the basis for further analysis of time series.
6. When the period of moving average is equivalent to the period of
the cyclic fluctuations, such fluctuations are completely eliminated.
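Point 6 can be seen in a small constructed example. The sketch below is an illustration under simplifying assumptions: the series is hypothetical, built as a straight-line trend plus a strictly repeating three-period cycle whose values sum to zero, and `moving_average` is a helper written for this example.

```python
def moving_average(values, period):
    """Simple moving averages of an odd period.

    Each average belongs to the middle item of its window, so the first
    and last (period - 1) / 2 items get no moving average.
    """
    if period % 2 == 0 or period > len(values):
        raise ValueError("period must be odd and no longer than the series")
    return [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]

# Hypothetical series: linear trend plus a cycle of period 3.
trend = [10 + 2 * i for i in range(9)]
cycle = [3, 0, -3] * 3
series = [t + c for t, c in zip(trend, cycle)]

# A 3-period moving average matches the period of the cycle.
smoothed = moving_average(series, 3)

print(series)
print(smoothed)
```

With these inputs the moving averages reproduce the trend values for the middle seven items exactly, while the first and the last item have no moving average.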
Demerits of Moving Average Method
1. Trend values cannot be computed for all the years. The moving
averages for the first few years and last few years cannot be obtained.
It is often these extreme years in which we may be interested.
2. Selection of proper period is a great difficulty. If a wrong period is
selected, there is every likelihood that conclusions may be misleading.
3. Since the moving average is not represented by a mathematical
function, this method cannot be used for forecasting.
4. It can be applied only to those series which show periodicity.
(1) Scatter diagram
This is a graphical method of studying correlation between two variables.
One of the variables is shown on the X-axis and the other on the
Y-axis. Each pair of values is plotted on the graph by means of a dot mark.
After all the items are plotted we get as many dots on the graph paper as
the number of points. If these points show some trend either upward or
downward, the two variables are said to be correlated. If the plotted
points do not show any trend, the two variables are not correlated. If the
points show an upward trend from the left bottom to the right top,
correlation is positive. If on the other hand the tendency is reverse so that
the points show a downward trend from the left top to the right bottom,
correlation is negative.
The scatter diagram is a visual aid to show the presence or absence
of correlation between two variables. A line of best fit can be drawn
using the method of least squares. This line will be as close to the points
as possible. If the points are falling very close to this line, there is a very
high degree of correlation. If they lie very much away from this line it
shows that the correlation is not much.
For the two sets of data given below draw scatter diagrams and comment
on the relationship between variables x and y.
Merits and Demerits of Rank correlation
Merits
1. It is easy to calculate.
2. It is simple to understand.
3. It can be applied to both quantitative and qualitative data.
Demerits
1. Rank correlation coefficient is only an approximate measure as the
actual values are not used.
2. It is not convenient when ‘n’ is large.
3. Further algebraic treatment is not possible.
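For reference, the rank correlation coefficient discussed here is commonly computed with Spearman's formula $\rho = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$, where $d$ is the difference between the two ranks of an item. The notes do not state the formula, so its use here is an assumption, and it holds in this form only when there are no tied ranks. The ranks below are hypothetical.

```python
def rank_correlation(ranks_a, ranks_b):
    """Spearman's rank correlation coefficient for untied ranks."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given to five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = rank_correlation(judge_a, judge_b)
print(round(rho, 4))
```

Because only ranks enter the calculation, the same function works for qualitative data once the items have been ranked, which is the point of merit 3.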
CONCURRENT DEVIATION METHOD
This is a very simple method of measuring correlation. Under this
method only the directions of deviations are taken. The magnitudes of
the values are ignored. So this method is useful when we are interested
in studying correlation between two variables in a casual manner and are
Merits of Scatter diagram are
1. It is easy to plot the points.
2. It is simple to understand.
3. Abnormal values in the data can be easily detected.
4. The value of dependent variable for a given value of independent
variable can be detected.
5. The extreme values do not affect it.
Demerits of Scatter diagram are
1. The degree of correlation cannot be easily estimated.
2. Algebraic treatment is not possible.
3. When the number of pairs of observations is either very big or very
small, the method is not e
(2) Correlation Graph (Graphic method)
Under this method, separate curves are drawn for the X variable
and Y variable on the same graph paper. The values of the variable are
taken as ordinates of the points plotted. From the direction and closeness
of the two curves we can infer whether the variables are related. If
both the curves move in the same direction (upward or downward), correlation
is said to be positive. If the curves are moving in the opposite
direction, correlation is said to be negative.
(3) Coefficient of correlation
Coefficient of correlation is an algebraic method of measuring correlation.
Under this method, we measure correlation by finding a value
known as the coefficient of correlation using an appropriate formula.
Correlation coefficient is a numerical value. It shows the degree or the
extent of correlation between two variables.
Coefficient of determination
Coefficient of determination gives the percentage variation in the
dependent variable in relation with the independent variable. In other
words, coefficient of determination gives the ratio of the explained variance
to the total variance. The coefficient of determination is the square
of the correlation coefficient.
Thus, Coefficient of determination $= r^2 = \dfrac{\text{Explained Variance}}{\text{Total Variance}}$
The coefficient of determination is a very useful and better measure
for interpreting the value of r. Coefficient of determination states
what percentage of variations in the dependent variable is explained by
the independent variable.
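The equality between $r^2$ and the ratio of explained to total variance can be verified on a small example. The sketch below uses hypothetical data and assumes that "explained" variation means the variation of the values fitted by the least squares line of y on x about the mean of y.

```python
# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Square of the correlation coefficient.
r_squared = sxy ** 2 / (sxx * syy)

# Least squares line of y on x and its fitted values.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

# Explained and total variation about the mean of y (the common divisor n
# cancels in the ratio, so sums of squares are used directly).
explained = sum((f - mean_y) ** 2 for f in fitted)
total = syy
ratio = explained / total

print(round(r_squared, 4), round(ratio, 4))
```

Here both quantities come out as 0.6, meaning that 60 per cent of the variation in y is explained by x in this made-up data set.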