Linear Regression


R programming

ATHIRA B

LINEAR REGRESSION

Linear regression is one of the easiest and most popular Machine Learning algorithms.
Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, and product price.
Regression fits a line or curve among the data points on a target-predictor graph in such a way that the vertical distance between the data points and the regression line is minimized.
The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression.
Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable.
 The linear regression model provides a sloped straight line representing the relationship between the variables.

Mathematically, y = a0 + a1x

y = Dependent Variable (Target Variable)
x = Independent Variable (Predictor Variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
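As a quick illustration of the equation, the following R snippet (a minimal sketch with made-up values for a0 and a1) computes predictions from the line:

a0 <- 2                   # hypothetical intercept
a1 <- 0.5                 # hypothetical slope (regression coefficient)
x <- c(10, 20, 30)        # independent (predictor) values
y <- a0 + a1 * x          # predicted dependent (target) values
print(y)                  # 7 12 17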
Linear regression can be further divided into two types of algorithm (both are fitted in the R sketch after this list):
• Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, the algorithm is called Simple Linear Regression.
• Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, the algorithm is called Multiple Linear Regression.
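In R, both types can be fitted with the built-in lm() function. Below is a minimal sketch using the built-in mtcars data set; the choice of predictor variables is only for illustration:

# Simple linear regression: mpg predicted from a single variable (wt)
simple_model <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: mpg predicted from several variables
multiple_model <- lm(mpg ~ wt + hp + disp, data = mtcars)

summary(simple_model)     # the intercept and slope correspond to a0 and a1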
Finding the best fit line:
 When working with linear regression, our main goal is to find the best fit line, meaning the error between the predicted values and the actual values should be minimized. The best fit line is the one with the least error.
 Different values for the weights or coefficients of the line (a0, a1) give different regression lines, so we need to find the best values for a0 and a1. To do this, we use a cost function and the gradient descent algorithm.
 Cost function
 The cost function is used to estimate the values of the coefficients (a0, a1) that produce the best fit line.
For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted values and the actual values. It can be written as:
MSE = (1/N) Σ (yi − (a0 + a1xi))²

Where N = total number of observations
yi = actual value
a0 + a1xi = predicted value
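The MSE can be computed directly in R. A minimal sketch, assuming made-up data and hypothetical coefficient values for a0 and a1:

x <- c(1, 2, 3, 4, 5)           # independent variable
y <- c(2.1, 3.9, 6.2, 7.8, 10)  # actual values
a0 <- 0                         # hypothetical intercept
a1 <- 2                         # hypothetical slope
pred <- a0 + a1 * x             # predicted values
mse <- mean((y - pred)^2)       # average of the squared errors
print(mse)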
Gradient descent
o Gradient descent is a method of updating a0 and a1 to minimize the cost function (MSE).
o A regression model uses gradient descent to update the coefficients of the line (a0, a1): it starts from randomly selected coefficient values and then iteratively updates them to reach the minimum of the cost function. To find these gradients, we take partial derivatives of the cost function with respect to a0 and a1.
o The partial derivatives are the gradients, and they are used to update the values of a0 and a1 at each step: a0 = a0 − alpha * (dMSE/da0) and a1 = a1 − alpha * (dMSE/da1), where alpha is the learning rate (see the sketch below).
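These update rules can be written in a few lines of R. The sketch below is only illustrative: the data, the zero starting values, and the fixed iteration count are all assumptions.

gradient_descent <- function(x, y, alpha = 0.05, iters = 1000) {
  a0 <- 0                                   # starting value for the intercept
  a1 <- 0                                   # starting value for the slope
  n <- length(y)
  for (i in 1:iters) {
    pred <- a0 + a1 * x
    d_a0 <- (-2 / n) * sum(y - pred)        # partial derivative of MSE w.r.t. a0
    d_a1 <- (-2 / n) * sum((y - pred) * x)  # partial derivative of MSE w.r.t. a1
    a0 <- a0 - alpha * d_a0                 # update scaled by the learning rate
    a1 <- a1 - alpha * d_a1
  }
  c(intercept = a0, slope = a1)
}

x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10)
gradient_descent(x, y)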
Gradient descent: Impact of learning rate
If the learning rate is too small, gradient descent converges very slowly; if it is too large, the updates can overshoot the minimum and fail to converge.
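Reusing the gradient_descent() sketch above, the impact of the learning rate can be observed by varying alpha (the specific values are only illustrative):

gradient_descent(x, y, alpha = 0.001)  # too small: converges slowly, still far from the best fit
gradient_descent(x, y, alpha = 0.05)   # moderate: reaches the minimum of the cost function
gradient_descent(x, y, alpha = 0.5)    # too large: overshoots and diverges (values blow up)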
