
Simple Linear Correlation and Regression

Paper II: Statistics and Research Methodology


Under Ms. Aarti Shukla
Assistant Professor, AICP, AUH

Presented by: Pratosh Kamal Das


(A50710923001)
M. Phil. Clinical Psychology
Simple Linear Correlation
Correlation and its Types

In statistics, correlation studies and measures the direction and extent of the relationship among variables; correlation therefore measures co-variation, not causation. For example, if there is a correlation between two variables X and Y, then as the value of one variable changes in one direction, the value of the other variable is found to change either in the same direction (i.e. a positive change) or in the opposite direction (i.e. a negative change).

The correlation coefficient, r, is a summary measure that describes the extent of the statistical relationship between two interval- or ratio-level variables. The correlation coefficient is scaled so that it always lies between -1 and +1. When r is close to 0, there is little relationship between the variables; the farther r is from 0, in either the positive or negative direction, the stronger the relationship between the two variables.
Types of Correlation

There can be three such situations describing the relation between the two variables (illustrated in the sketch after this list):
Positive Correlation – the values of the two variables move in the same direction, so that an increase/decrease in the value of one variable is accompanied by an increase/decrease in the value of the other variable.
Negative Correlation – the values of the two variables move in opposite directions, so that an increase/decrease in the value of one variable is accompanied by a decrease/increase in the value of the other variable.
No Correlation – there is no linear dependence or relation between the two variables.
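As a minimal illustration of the three situations (assuming NumPy is available; the data below is invented for illustration), the sign and size of r computed with numpy.corrcoef distinguishes them:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pos = np.array([2.1, 3.9, 6.2, 8.1, 9.8])    # rises with x: positive correlation
y_neg = np.array([9.7, 8.2, 5.9, 4.1, 2.2])    # falls as x rises: negative correlation
y_none = np.array([5.0, 1.0, 6.0, 2.0, 5.5])   # no systematic pattern: r near 0

# np.corrcoef returns a 2x2 correlation matrix; element [0, 1] is r
for label, y in [("positive", y_pos), ("negative", y_neg), ("none", y_none)]:
    r = np.corrcoef(x, y)[0, 1]
    print(label, "r =", round(r, 2))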
Scatter Diagrams
A scatter diagram is a diagram that shows the values of two variables X
and Y, along with the way in which these two variables relate to each
other. The values of variable X are given along the horizontal axis, with
the values of the variable Y given on the vertical axis. A scatter diagram
is given in the following example, showing various types of correlation.
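A minimal sketch of how such a diagram can be drawn, assuming Matplotlib and NumPy are available (the data here is invented for illustration):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 * x + rng.normal(0, 3, 50)   # a roughly linear, positive relationship

plt.scatter(x, y)                    # one point per (X, Y) pair
plt.xlabel("X (horizontal axis)")
plt.ylabel("Y (vertical axis)")
plt.title("Scatter diagram of Y against X")
plt.show()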
In the case of a bivariate population, correlation can be studied through (a) Charles Spearman's coefficient of correlation or (b) Karl Pearson's coefficient of correlation, whereas a cause-and-effect relationship can be studied through simple regression equations.

Charles Spearman's coefficient of correlation (or rank correlation): This is the technique of determining the degree of correlation between two variables in the case of ordinal data, where ranks are given to the different values of the variables. The main objective of this coefficient is to determine the extent to which the two sets of rankings are similar or dissimilar. This coefficient is determined as under:
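The standard formula, where d is the difference between the two ranks given to each observation and n is the number of paired observations (assuming no tied ranks), is:

rs = 1 – (6 Σd²) / (n(n² – 1))

A minimal sketch computing it directly (assuming NumPy; the two sets of scores are invented for illustration; scipy.stats.spearmanr gives the same result and also handles ties):

import numpy as np

def rank(values):
    # Rank observations from 1 (smallest) to n (largest); assumes no ties
    order = np.argsort(values)
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

x = np.array([35.0, 23.0, 47.0, 17.0, 10.0, 43.0, 9.0, 6.0, 28.0])
y = np.array([30.0, 33.0, 45.0, 23.0, 8.0, 49.0, 12.0, 4.0, 31.0])

d = rank(x) - rank(y)                      # differences between the two rankings
n = len(x)
r_s = 1 - (6 * np.sum(d**2)) / (n * (n**2 - 1))
print("Spearman's rank correlation:", round(r_s, 3))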
Pearson’s Product Moment Correlation

Karl Pearson's coefficient of correlation (or simple correlation) is the most widely used method of measuring the degree of relationship between two variables. This coefficient assumes the following: (i) that there is a linear relationship between the two variables; (ii) that the two variables are causally related, which means that one of the variables is independent and the other one is dependent; and (iii) that a large number of independent causes are operating in both variables so as to produce a normal distribution. Karl Pearson's coefficient of correlation can be worked out as under:
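The usual product-moment formula, with X̄ and Ȳ the means of the two series, is:

r = Σ(X – X̄)(Y – Ȳ) / √( Σ(X – X̄)² · Σ(Y – Ȳ)² )

A minimal sketch computing it from this formula (assuming NumPy; the paired scores are invented for illustration):

import numpy as np

x = np.array([65.0, 66.0, 67.0, 67.0, 68.0, 69.0, 70.0, 72.0])
y = np.array([67.0, 68.0, 65.0, 68.0, 72.0, 72.0, 69.0, 71.0])

dx = x - x.mean()                     # deviations from the mean of X
dy = y - y.mean()                     # deviations from the mean of Y
r = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

print("Pearson's r:", round(r, 3))
# np.corrcoef(x, y)[0, 1] gives the same value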
SIMPLE LINEAR REGRESSION ANALYSIS

Regression is the determination of a statistical relationship between two or more variables. In simple regression we have only two variables: one variable (defined as independent) is the cause of the behavior of the other (defined as the dependent variable). Regression can only interpret what exists physically, i.e., there must be a physical way in which the independent variable X can affect the dependent variable Y. The basic relationship between X and Y is given by

Y’ = a + bX

where the symbol Y’ denotes the estimated value of Y for a given value of X, a is the Y-intercept (the value of Y’ when X = 0), and b is the slope, or regression coefficient. This equation is known as the regression equation of Y on X (it also represents the regression line of Y on X when drawn on a graph), which means that each unit change in X produces a change of b in Y; b is positive for a direct relationship and negative for an inverse relationship.
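Under ordinary least squares, the two constants can be estimated as b = Σ(X – X̄)(Y – Ȳ) / Σ(X – X̄)² and a = Ȳ – bX̄. A minimal sketch (assuming NumPy; the data is invented for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.4, 3.1, 4.9, 5.9, 7.2, 8.1])

dx = x - x.mean()
b = np.sum(dx * (y - y.mean())) / np.sum(dx**2)   # slope
a = y.mean() - b * x.mean()                        # intercept

y_hat = a + b * x                                  # estimated values Y'
print("Regression equation: Y' =", round(a, 2), "+", round(b, 2), "X")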
Scattergram (figure)
Other types of Regression
1. MULTIPLE REGRESSION extends linear regression by incorporating two or more independent variables to predict the dependent variable. It allows for examining the simultaneous effects of multiple predictors on the outcome variable. The multiple regression equation takes the form:

Y = β0 + β1X1 + β2X2 + … + βnXn + ε

In this formula:
Y represents the dependent variable (response variable).
X1, X2, …, Xn represent the independent variables (predictor variables).
β0, β1, β2, …, βn are the regression coefficients or parameters that need to be estimated.
ε represents the error term or residual (the difference between the observed and predicted values).
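A minimal sketch of estimating the coefficients by ordinary least squares with numpy.linalg.lstsq (assuming NumPy; the two predictors and outcome below are invented for illustration):

import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([4.1, 4.9, 8.2, 8.8, 12.3, 12.7])

# Design matrix with a leading column of ones for the intercept β0
X = np.column_stack([np.ones_like(x1), x1, x2])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # estimated [β0, β1, β2]
residuals = y - X @ beta                        # ε: observed minus predicted values
print("Estimated coefficients:", np.round(beta, 2))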
Other types of Regression (contd.)
Polynomial Regression

Polynomial regression models non-linear relationships between variables by adding polynomial terms (e.g., squared or cubic terms) to the regression equation. It can capture curved or nonlinear patterns in the data.
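A minimal sketch fitting a quadratic (degree-2) polynomial with numpy.polyfit (assuming NumPy; the data is invented and roughly quadratic):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 4.8, 9.6, 17.1, 26.2])

coeffs = np.polyfit(x, y, deg=2)     # coefficients, highest degree first
y_hat = np.polyval(coeffs, x)        # fitted values on the curved line
print("Quadratic coefficients:", np.round(coeffs, 2))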

Logistic Regression

Logistic regression is used when the dependent variable is binary or categorical. It models the probability of the occurrence of a certain event or outcome based on the independent variables. Logistic regression estimates the coefficients using the logistic function, which transforms the linear combination of predictors into a probability.
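A minimal sketch of the key idea: the logistic function maps the linear combination β0 + β1X onto a probability between 0 and 1, and the coefficients can be estimated by gradually improving the log-likelihood (plain gradient ascent here, purely for illustration; assuming NumPy and invented binary data):

import numpy as np

def logistic(z):
    # Maps any real number to a probability strictly between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0,   0,   0,   1,   0,   1,   1,   1])   # binary outcome

X = np.column_stack([np.ones_like(x), x])     # intercept + predictor
beta = np.zeros(2)
for _ in range(5000):                         # gradient ascent on the log-likelihood
    p = logistic(X @ beta)                    # current predicted probabilities
    beta += 0.05 * (X.T @ (y - p))

print("Estimated [β0, β1]:", np.round(beta, 2))
print("Estimated P(Y=1) at x = 3.0:", round(float(logistic(beta[0] + beta[1] * 3.0)), 2))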


Other types of Regression (contd.)
Ridge Regression and Lasso Regression

Ridge regression and Lasso regression are techniques used for addressing multicollinearity (high
correlation between independent variables) and variable selection. Both methods introduce a penalty
term to the regression equation to shrink or eliminate less important variables. Ridge regression uses
L2 regularization, while Lasso regression uses L1 regularization.
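A minimal sketch of ridge regression's L2-penalized closed-form solution, β = (XᵀX + λI)⁻¹Xᵀy, on two almost identical predictors (assuming NumPy; the data is simulated inside the sketch; Lasso's L1 penalty has no closed form and is normally fitted iteratively, e.g. by coordinate descent):

import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=30)
x2 = x1 + rng.normal(scale=0.01, size=30)    # nearly identical to x1: multicollinearity
y = 3.0 * x1 + rng.normal(scale=0.5, size=30)

X = np.column_stack([x1, x2])                # intercept omitted to keep the sketch short
lam = 1.0                                    # penalty strength λ

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print("Ordinary least squares:", np.round(beta_ols, 2))    # high-variance estimates under collinearity
print("Ridge (lambda = 1):    ", np.round(beta_ridge, 2))  # coefficients shrunk and stabilized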

Time Series Regression

Time series regression analyzes the relationship between a dependent variable and independent
variables when the data is collected over time. It accounts for autocorrelation and trends in the data
and is used in forecasting and studying temporal relationships.
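A minimal sketch regressing a series on a time trend and on its own previous value (a one-period lag), which is one simple way to allow for trend and autocorrelation (assuming NumPy; the series is simulated inside the sketch):

import numpy as np

rng = np.random.default_rng(2)
n = 40
y = np.empty(n)
y[0] = 10.0
for t in range(1, n):
    # Simulated series: trend + dependence on the previous value + noise
    y[t] = 2.0 + 0.2 * t + 0.6 * y[t - 1] + rng.normal(scale=0.5)

trend = np.arange(1, n, dtype=float)
X = np.column_stack([np.ones(n - 1), trend, y[:-1]])   # intercept, trend, lagged value
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)

forecast = beta @ np.array([1.0, float(n), y[-1]])     # one-step-ahead forecast
print("Estimated [intercept, trend, lag]:", np.round(beta, 2))  # series simulated with 2.0, 0.2, 0.6
print("Forecast for the next period:", round(float(forecast), 1))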
Other types of Regression (contd.)
Nonlinear Regression

Nonlinear regression models are used when the relationship between the
dependent variable and independent variables is not linear. These models can
take various functional forms and require estimation techniques different from
those used in linear regression.
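A minimal sketch using scipy.optimize.curve_fit to fit one common nonlinear functional form, the exponential model y = a·exp(b·x) (assuming NumPy and SciPy; the data is invented for illustration):

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Exponential growth: cannot be written as a linear equation in a and b
    return a * np.exp(b * x)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.4, 5.8, 10.2, 16.9, 28.5])

params, _ = curve_fit(model, x, y, p0=[1.0, 0.5])   # p0: starting guess for the iterative fit
a_hat, b_hat = params
print(f"Fitted model: y = {a_hat:.2f} * exp({b_hat:.2f} x)")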

Poisson Regression

Poisson regression is employed when the dependent variable represents count data. It models the relationship between the independent variables and the expected count, assuming a Poisson distribution for the dependent variable.
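A minimal sketch of the idea, fitting log(expected count) = β0 + β1X by Newton-Raphson on the Poisson log-likelihood (assuming NumPy; the counts are invented for illustration; statistical libraries such as statsmodels provide this as a Poisson generalized linear model):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([1, 0, 2, 3, 3, 5, 8, 9])           # observed counts

X = np.column_stack([np.ones_like(x), x])
beta = np.array([np.log(y.mean()), 0.0])         # start from a constant expected count
for _ in range(25):                              # Newton-Raphson iterations
    mu = np.exp(X @ beta)                        # expected counts under the current β
    grad = X.T @ (y - mu)                        # gradient of the log-likelihood
    hess = X.T @ (X * mu[:, None])               # XᵀWX with W = diag(μ)
    beta += np.linalg.solve(hess, grad)

print("Estimated [β0, β1]:", np.round(beta, 2))
print("Expected count at x = 5:", round(float(np.exp(beta[0] + beta[1] * 5.0)), 2))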
Scattergrams (figures)
