Linear Regression
• Linear Regression
• Multiple Regression
• Logistic Regression
Multiple Linear Regression
• If Cook’s distance is requested (Save option in the Linear Regression dialog), a new
variable will be added in the data showing Cook’s distance for each respondent; a
common rule of thumb flags values above 1 as influential cases.
Dependent variable is normally distributed
• Analyze > Descriptive Statistics > Explore
• In Plots, check Histogram and Normality plots with tests
• The Shapiro-Wilk test should have a p-value greater than 0.05
(no evidence against normality)
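Outside SPSS, the same check can be sketched in Python. This is a minimal illustration, assuming SciPy is available (`scipy.stats.shapiro` implements the Shapiro-Wilk test that SPSS reports); the data here are synthetic, a hypothetical stand-in for the dependent variable:

```python
import random

from scipy import stats  # assumed available; shapiro() is the Shapiro-Wilk test

# Synthetic stand-in for the dependent variable (hypothetical data)
random.seed(42)
values = [random.gauss(50.0, 10.0) for _ in range(100)]

w_stat, p_value = stats.shapiro(values)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
# As in the SPSS rule above: p > 0.05 means no evidence against normality,
# so the assumption is considered satisfied.
```

The decision rule is the same as in the Explore output: only a p-value at or below 0.05 would lead you to reject normality.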
Linear relationship
• Visually inspect a scatter plot: Graphs > Chart Builder > Scatter/Dot
• Place the dependent variable on the y-axis
• Place each independent variable, one at a time, on the x-axis
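The scatter plot is a visual check, but the strength of a linear relationship can also be quantified with the Pearson correlation coefficient. A minimal pure-Python sketch, using hypothetical x/y data chosen to be nearly linear:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient; a numeric complement to the scatter plot."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: y roughly linear in x
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
print(round(pearson_r(x, y), 3))  # close to 1 for a near-linear relationship
```

Values near +1 or -1 support the linearity assumption; values near 0 suggest the scatter plot will show no linear trend.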
Multicollinearity
• Multicollinearity refers to a situation in which two or more explanatory
variables in a multiple regression model are highly linearly related.
• Check Collinearity diagnostics under the Statistics option when
conducting Linear Regression.
• VIF values should be less than 10 (a common rule of thumb).
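The VIF reported by SPSS comes from regressing each predictor on the remaining predictors: VIF = 1 / (1 - R²). A minimal pure-Python sketch for the two-predictor case, with hypothetical, deliberately collinear data:

```python
def r_squared(y, x):
    """R-squared from a simple least-squares regression of y on a single x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

def vif(r2):
    """Variance inflation factor: VIF = 1 / (1 - R^2)."""
    return 1.0 / (1.0 - r2)

# Hypothetical predictors: x2 is nearly a linear copy of x1
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.0, 4.1, 5.9, 8.2, 10.1, 11.8]
print(f"VIF = {vif(r_squared(x2, x1)):.1f}")  # large VIF (> 10) flags collinearity
```

With more than two predictors the same formula applies, but each auxiliary regression uses all of the other predictors at once.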
No autocorrelation
• Autocorrelation means data that is correlated with its own past values, as
opposed to being correlated with some other data.
• May happen in time series data, e.g. stock prices, where each price is not
independent of the previous price.
• Check Durbin-Watson under the Statistics option when conducting Linear
Regression.
• The Durbin-Watson statistic should be between 1.5 and 2.5 (values near 2
indicate no autocorrelation).
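The Durbin-Watson statistic itself is simple to compute from the residuals: the sum of squared successive differences divided by the sum of squared residuals. A minimal pure-Python sketch with hypothetical residuals:

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals with no strong autocorrelation
resid = [0.5, 0.3, -0.2, -0.4, 0.1, 0.6, -0.3, -0.1]
dw = durbin_watson(resid)
print(round(dw, 2))  # about 1.66, within the 1.5-2.5 rule of thumb
```

Values well below 2 suggest positive autocorrelation (residuals trail their predecessors); values well above 2 suggest negative autocorrelation (residuals flip sign).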
Homoskedasticity
• In the Plots option, put *ZRESID on the y-axis and *ZPRED on the x-axis.
• Residuals should scatter randomly around zero with no funnel shape; a
spread that widens or narrows with the predicted values indicates
heteroskedasticity.
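ZRESID and ZPRED are simply z-scored (standardized) residuals and predicted values. The standardization step can be sketched in pure Python, here with hypothetical residuals:

```python
import statistics

def standardize(values):
    """Convert values to z-scores, as SPSS does for ZRESID and ZPRED."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical raw residuals from a fitted model
resid = [0.5, 0.3, -0.2, -0.4, 0.1, 0.6, -0.3, -0.1]
zresid = standardize(resid)
# z-scores have mean ~0 and standard deviation 1, so the plot is
# comparable across models regardless of the scale of y
```

Plotting these z-scores against standardized predicted values is exactly the *ZRESID-vs-*ZPRED plot described above.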
Regression Interpretation
• In the Model Summary table, check the value of Adjusted R Square.
• As a rule of thumb, it should be greater than 0.3.
• In the ANOVA table, check the significance level.
• It should be less than 0.05, which means the model as a whole is significant.
• In the Coefficients table, first check the significance level of each variable.
• If the significance of a variable is greater than 0.05, that variable is
insignificant. Remove it (one at a time, starting with the highest p-value)
and run the regression again. Repeat until there are no insignificant
variables.
• Finally, note down the B (unstandardized coefficient) values of all the
variables and the constant.
• Develop the regression equation: y = B0 + B1x1 + B2x2 + …
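The final step, turning the B values into an equation, can be illustrated with a minimal least-squares fit in pure Python. This sketch uses one predictor and hypothetical data; SPSS performs the same estimation for multiple predictors:

```python
def fit_simple_ols(x, y):
    """Least-squares intercept (B0) and slope (B1) for y = B0 + B1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 10.1]
b0, b1 = fit_simple_ols(x, y)
# Prints the fitted equation, e.g. y = -0.03 + 2.03 * x
print(f"y = {b0:.2f} + {b1:.2f} * x")
```

The printed coefficients play the role of the Constant and B entries in the SPSS Coefficients table; writing them into the equation is the "develop the regression equation" step above.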