This document provides an overview of linear regression, an algorithm that models a linear relationship between a dependent variable (y) and one or more independent variables (x) in order to make predictions. Simple linear regression uses one independent variable; multiple linear regression uses more than one. The goal is to find the "best fit" line by minimizing the error between predicted and actual values, using a cost function (mean squared error) and the gradient descent algorithm, which iteratively updates the regression coefficients to reduce the cost function toward its minimum.
R programming
ATHIRA B
LINEAR REGRESSION
Linear regression is one of the simplest and most popular machine learning algorithms. It makes predictions for continuous/real-valued or numeric variables such as sales, salary, age, and product price. Regression fits a line (or curve) through the data points on a target-predictor graph so that the vertical distance between the data points and the regression line is as small as possible; the line need not pass through every point. The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name. Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable, and it is represented by a sloped straight line.
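As a small illustration, the sketch below fits a simple linear regression in R (the language named in the title slide) using the built-in lm() function. The data is synthetic and the true coefficients (intercept 3, slope 2) are invented purely for the example.

    # Synthetic data: a true line y = 3 + 2x, plus random noise
    set.seed(42)
    x <- 1:50
    y <- 3 + 2 * x + rnorm(50, mean = 0, sd = 5)

    model <- lm(y ~ x)      # fit the sloped straight line by least squares
    coef(model)             # estimated intercept (a0) and slope (a1)
    predict(model, newdata = data.frame(x = 60))  # predict y for a new x value

Note that lm() computes the least-squares coefficients directly rather than iteratively; the cost function and gradient descent method described below reach the same solution by a different route.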
Mathematically, y = a0 + a1x
y = Dependent variable (target variable)
x = Independent variable (predictor variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
Linear regression can be further divided into two types of algorithm:
• Simple Linear Regression: a single independent variable is used to predict the value of a numerical dependent variable.
• Multiple Linear Regression: more than one independent variable is used to predict the value of a numerical dependent variable.
Finding the best fit line: When working with linear regression, our main goal is to find the best fit line, meaning the error between the predicted values and the actual values should be minimized; the best fit line is the one with the least error. Different values for the weights or line coefficients (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1, and to do this we use a cost function together with the gradient descent algorithm.
Cost function: The cost function is used to estimate the values of the coefficients for the best fit line. For linear regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted values and the actual values:
MSE = (1/N) * Σ (yi − (a0 + a1xi))²
where N is the total number of observations, yi is the actual value, and a0 + a1xi is the predicted value for observation i.
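As a hedged sketch, this cost function takes only a few lines of R, reusing the synthetic x and y from the earlier example; the helper name mse is our own, not a standard function.

    mse <- function(a0, a1, x, y) {
      y_pred <- a0 + a1 * x   # predicted values from the candidate line
      mean((y - y_pred)^2)    # average of the squared errors
    }

    mse(0, 0, x, y)   # a poor choice of coefficients gives a large cost
    mse(3, 2, x, y)   # coefficients near the true line give a much smaller cost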
Gradient descent
o Gradient descent is a method of updating a0 and a1 to minimize the cost function (MSE).
o A regression model uses gradient descent to update the line coefficients (a0, a1): it starts from randomly selected coefficient values and then iteratively updates them to reach the minimum of the cost function.
To find these gradients, we take the partial derivatives of the cost function with respect to a0 and a1. The partial derivatives are the gradients, and they are used to update the values of a0 and a1:
a0 = a0 − alpha * (∂MSE/∂a0)
a1 = a1 − alpha * (∂MSE/∂a1)
where alpha is the learning rate.
Gradient descent: impact of the learning rate. If alpha is too small, convergence is slow; if alpha is too large, the updates overshoot the minimum and the cost can diverge.
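The following sketch implements this update loop in R, reusing the synthetic x and y from above. The starting values, step count, and learning rates are arbitrary illustrative choices, and x is standardized first so that a single alpha behaves reasonably for both coefficients (the fitted values are then on the standardized scale).

    gradient_descent <- function(x, y, alpha, steps = 1000) {
      a0 <- 0; a1 <- 0                              # arbitrary starting values
      n <- length(y)
      for (i in seq_len(steps)) {
        y_pred <- a0 + a1 * x
        grad_a0 <- -2 / n * sum(y - y_pred)         # partial derivative of MSE wrt a0
        grad_a1 <- -2 / n * sum((y - y_pred) * x)   # partial derivative of MSE wrt a1
        a0 <- a0 - alpha * grad_a0                  # update step, scaled by alpha
        a1 <- a1 - alpha * grad_a1
      }
      c(a0 = a0, a1 = a1)
    }

    x_s <- scale(x)[, 1]   # standardize x: mean 0, standard deviation 1
    gradient_descent(x_s, y, alpha = 0.1)    # converges to the least-squares fit
    gradient_descent(x_s, y, alpha = 1.1)    # too large: updates overshoot and diverge
    gradient_descent(x_s, y, alpha = 0.001)  # too small: still far from the minimum

Comparing the first call with coef(lm(y ~ x_s)) confirms that gradient descent reaches the same least-squares solution when the learning rate is well chosen.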