Multiple Correlation & Regression

• In statistics, the coefficient of multiple correlation is a measure of how well a given variable can be predicted using a linear function of a set of other variables.
• It is the correlation between the variable's values and the best predictions that can be computed linearly from the predictive variables.
• Regression analysis predicts the value of the dependent variable based on the known values of the independent variables, assuming an average mathematical relationship between two or more variables.
• Correlation & Regression are generally performed together.
• Correlation – degree of association between two sets of quantitative data.
• Regression analysis – explains the variation in one variable (called the dependent variable) based on the variation in one or more other variables (called the independent variables).
• One Dependent Variable (D.V.) & one Independent Variable (I.V.) – Simple Regression
• Multiple Independent Variables & one Dependent Variable – Multiple Regression
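In symbols, the two cases above can be written as follows (with Ŷ the predicted D.V., a the intercept, and b the regression coefficients):

```latex
\hat{Y} = a + bX                                        % Simple regression: one I.V.
\hat{Y} = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k       % Multiple regression: k I.V.s
```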
• Worked Out Example –
• A manufacturer & marketer of electric motors would like to build a regression model consisting of 5 or 6 independent variables to predict sales. Past data has been collected for 15 sales territories on sales & 6 different independent variables. Build a regression model & recommend whether or not it should be used by the company.
• Data –
• Dependent Variable –
• Y = Sales in Rs Lakh in the territory
• Independent Variables –
• X1 = Market Potential in the territory (in Rs Lakh)
• X2 = No. of Dealers of the company in the territory
• X3 = No. of Sales Persons in the territory
• X4 = Index of competitor activity in the territory on a 5-point scale (1 – Low, 5 – High)
• X5 = No. of Service People in the territory
• X6 = No. of existing customers in the territory
(Data table of Y and X1–X6 for the 15 territories – shown as an image in the original slides.)
• SPSS Procedure –
• Correlation:
– Click on ANALYZE (or STATISTICS, depending upon the SPSS version)
– Click on CORRELATE, followed by BIVARIATE
– Select all the variables from the list with the right arrow
– Select PEARSON under the heading Correlation Coefficient
– Select '2-tailed' under the heading Test of Significance
– Click OK to get the matrix of pairwise Pearson correlations among all the selected variables, along with the two-tailed significance of each pairwise correlation
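The same pairwise matrix can be sketched outside SPSS. The following is a minimal Python illustration, assuming pandas and SciPy are available, on a small made-up dataset (not the slide's data):

```python
# Rough analogue of SPSS CORRELATE > BIVARIATE output:
# pairwise Pearson r with two-tailed p-values.
import pandas as pd
from scipy import stats

# Hypothetical mini dataset standing in for Y, X1, X2 (illustrative only).
df = pd.DataFrame({
    "Y":  [10.0, 12.5, 9.8, 15.2, 11.1, 14.0],
    "X1": [100, 130, 95, 160, 110, 150],
    "X2": [4, 5, 3, 7, 4, 6],
})

for a in df.columns:
    for b in df.columns:
        if a < b:  # print each pair once
            r, p = stats.pearsonr(df[a], df[b])
            print(f"r({a},{b}) = {r:+.3f}, two-tailed p = {p:.3f}")
```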
• SPSS Procedure –
• Regression:
• Click on ANALYZE (or STATISTICS)
• Click on REGRESSION, followed by LINEAR
• Select the dependent variable & transfer it to the Dependent Variable box using the arrow keys
• Select the independent variables & transfer them to the Independent Variables box using the arrow keys
• Select either ENTER, STEPWISE, or BACKWARD
• Click OK
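As a rough non-SPSS analogue of the ENTER method (all predictors fitted in one step), here is a least-squares sketch; the data, coefficients, and noise level are synthetic, made up for illustration:

```python
# ENTER method sketch: fit all 6 independent variables at once by
# ordinary least squares. Synthetic data, not the slide's dataset.
import numpy as np

rng = np.random.default_rng(42)
n, k = 15, 6                                  # 15 territories, 6 I.V.s
X = rng.normal(size=(n, k))
true_b = np.array([3.0, 1.0, -2.0, 0.5, 0.0, 0.0])
y = 10.0 + X @ true_b + rng.normal(scale=0.5, size=n)

A = np.column_stack([np.ones(n), X])          # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # fit all predictors together
print("intercept:", coef[0])
print("b1..b6:", coef[1:])
```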
Multiple Correlation & Regression – SPSS Output – 'Enter'
(Output tables shown as images in the original slides.)
The Standard Error of the Estimate for regression measures the amount of variability of the points around the regression line: it is the standard deviation of the data points as they are distributed around the regression line.
R Square is a basic metric which tells you how much of the variance is explained by the model. In a multivariate linear regression, if you keep adding new variables, the R Square value will always increase, irrespective of the significance of those variables.
Adjusted R Square corrects for this: it adjusts R Square for the number of predictors, so it increases only when a newly added variable improves the model more than would be expected by chance. So, when doing multivariate linear regression, look at Adjusted R Square rather than R Square.
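A small sketch of this point, assuming NumPy and synthetic data: adding a pure-noise predictor never lowers R Square, while Adjusted R Square applies a penalty for the extra predictor:

```python
# R-squared vs adjusted R-squared on synthetic data.
import numpy as np

def fit_r2(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with intercept."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

rng = np.random.default_rng(1)
n = 15
x1 = rng.normal(size=n)
noise_var = rng.normal(size=n)                 # unrelated to y by construction
y = 5 + 2 * x1 + rng.normal(scale=0.3, size=n)

r2_small, adj_small = fit_r2(x1[:, None], y)
r2_big, adj_big = fit_r2(np.column_stack([x1, noise_var]), y)
print(f"one I.V. : R^2={r2_small:.4f}  adj={adj_small:.4f}")
print(f"two I.V.s: R^2={r2_big:.4f}  adj={adj_big:.4f}")
```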
Residuals: the difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e). Each data point has one residual. Both the sum and the mean of the residuals are equal to zero.
df – Degrees of Freedom:
For Regression: p (the number of I.V.s / predictors)
For Residual: n – p – 1 (n = sample size)
MS = SS / df
F = MS(Regression) / MS(Residual)
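These ANOVA quantities can be computed directly; a NumPy sketch on synthetic data:

```python
# ANOVA breakdown behind the F test:
# SS_total = SS_regression + SS_residual, MS = SS/df, F = MS_reg / MS_res.
import numpy as np

rng = np.random.default_rng(2)
n, p = 15, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.4, size=n)

A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
resid = y - y_hat                               # residuals sum to zero

ss_res = float(resid @ resid)
ss_reg = float(((y_hat - y.mean()) ** 2).sum())
df_reg, df_res = p, n - p - 1
ms_reg, ms_res = ss_reg / df_reg, ss_res / df_res
F = ms_reg / ms_res
print(f"SS_reg={ss_reg:.2f} (df={df_reg}), SS_res={ss_res:.2f} (df={df_res}), F={F:.2f}")
```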
The t statistic is the coefficient divided by its standard error. The standard error is an estimate of the standard deviation of the coefficient – the amount it varies across cases. It can be thought of as a measure of the precision with which the regression coefficient is measured.
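A sketch of this computation, with standard errors taken from the usual OLS covariance formula, on synthetic data:

```python
# t = coefficient / standard error, with SE_j = sqrt(MS_res * [(A'A)^-1]_jj).
import numpy as np

rng = np.random.default_rng(3)
n, p = 15, 2
X = rng.normal(size=(n, p))
y = 4.0 + 3.0 * X[:, 0] + rng.normal(scale=0.5, size=n)   # X2 is irrelevant

A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef
ms_res = float(resid @ resid) / (n - p - 1)
cov = ms_res * np.linalg.inv(A.T @ A)       # covariance matrix of coefficients
se = np.sqrt(np.diag(cov))
t = coef / se
for name, c, s, ti in zip(["const", "X1", "X2"], coef, se, t):
    print(f"{name}: coef={c:+.3f}  SE={s:.3f}  t={ti:+.2f}")
```

The relevant predictor (X1) gets a large |t|, while the irrelevant one does not.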
Standardization of the coefficients is usually done to answer the question of which of the independent variables has a greater effect on the dependent variable in a multiple regression analysis, when the variables are measured in different units of measurement (for example, income measured in dollars and family size measured in number of individuals).

A regression carried out on original (unstandardized) variables produces unstandardized coefficients. A regression carried out on standardized variables produces standardized coefficients. Values for standardized and unstandardized coefficients can also be derived subsequent to either type of analysis.

Before solving a multiple regression problem, all variables (independent and dependent) can be standardized. Each variable is standardized by subtracting its mean from each of its values and then dividing these new values by the standard deviation of the variable. Standardizing all variables in a multiple regression yields standardized regression coefficients that show the change in the dependent variable measured in standard deviations.
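A minimal sketch of standardized (beta) coefficients, assuming NumPy, echoing the dollars-vs-family-size example with made-up numbers:

```python
# Standardized (beta) coefficients via z-scoring all variables first.
# beta_j = SD change in Y per 1-SD change in X_j. Synthetic data.
import numpy as np

rng = np.random.default_rng(4)
n = 15
income = rng.normal(50_000, 10_000, size=n)          # dollars
famsize = rng.integers(1, 7, size=n).astype(float)   # number of individuals
y = 0.0002 * income + 1.5 * famsize + rng.normal(scale=0.5, size=n)

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

Xz = np.column_stack([zscore(income), zscore(famsize)])
yz = zscore(y)
# No intercept column needed: z-scored variables all have mean zero.
beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
print("standardized coefficients (income, famsize):", beta)
```

Despite the wildly different raw units, the betas are directly comparable.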
Multiple Correlation & Regression – SPSS Output – 'Forward'
(Forward/stepwise-selection output shown as images in the original slides.)
Multiple Correlation & Regression – SPSS Output – 'Backward'
(Backward-elimination output shown as images in the original slides.)
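SPSS's BACKWARD method starts from the full model and removes the weakest predictor step by step. A rough sketch of that idea, using a fixed |t| threshold rather than SPSS's exact p-value criteria, on synthetic data:

```python
# Backward elimination sketch: start with all predictors and repeatedly
# drop the one with the smallest |t| until every remaining |t| >= 2.
import numpy as np

def t_stats(A, y):
    """t statistics (coef / SE) for an OLS fit; A includes the intercept."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ms_res = float(resid @ resid) / (A.shape[0] - A.shape[1])
    se = np.sqrt(np.diag(ms_res * np.linalg.inv(A.T @ A)))
    return coef / se

rng = np.random.default_rng(5)
n = 30
X = rng.normal(size=(n, 4))                  # X3, X4 are pure noise
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

keep = list(range(4))
while True:
    A = np.column_stack([np.ones(n), X[:, keep]])
    t = t_stats(A, y)[1:]                    # skip the intercept's t
    worst = int(np.argmin(np.abs(t)))
    if abs(t[worst]) >= 2.0 or len(keep) == 1:
        break
    keep.pop(worst)                          # drop the least significant I.V.
print("retained predictors:", [f"X{j+1}" for j in keep])
```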
• Ex 2 – An organization would like to build a regression model consisting of four independent variables to predict the compensation (dependent variable) of its employees. Past data has been collected for 15 different employees & four independent variables. Build a regression model & recommend its proper usage.
• The data is as follows –
• Dependent Variable – (D.V.)
• Y = Compensation in Rs.
• Independent Variables (I.V.) –
• 1. Experience in Years
• 2. Education in Years (after 10th)
• 3. Number of Employees Supervised
• 4. Number of Projects Handled
The dataset consisting of the observations is given on the next slide –
(Dataset and SPSS output for Ex 2 shown as images in the original slides.)
