Module 4
Simple Linear Regression
Prepared by,
Ashritha K P
Asst. Professor
Sahyadri College of Engineering and Management
Mangaluru
Linear Model
Any choice of alpha and beta gives us a predicted output, beta * x_i + alpha, for each input x_i.
Since we know the actual output y_i, we can compute the error for each pair:
error(alpha, beta, x_i, y_i) = y_i - (beta * x_i + alpha)
We then square each error and add the squares up; the least squares solution is to choose
the alpha and beta that make sum_of_sqerrors as small as possible.
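Using calculus (or tedious algebra), the error-minimizing alpha and beta have closed forms.
A minimal sketch in Python, using the standard-library statistics module (its correlation
function requires Python 3.10+):

    from statistics import mean, stdev, correlation  # correlation: Python 3.10+

    def least_squares_fit(x, y):
        """Given lists x and y, return the alpha and beta that
        minimize the sum of squared errors."""
        beta = correlation(x, y) * stdev(y) / stdev(x)
        alpha = mean(y) - beta * mean(x)
        return alpha, beta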
The choice of beta means that when the input value increases by
standard_deviation(x), the prediction then increases by correlation(x, y) *
standard_deviation(y).
The choice of alpha simply says that when we see the average value of the
independent variable x, we predict the average value of the dependent
variable y.
In the case where x and y are perfectly correlated, a one-standard-deviation
increase in x results in a one-standard-deviation-of-y increase in the
prediction.
When they’re perfectly anticorrelated, a one-standard-deviation increase in x results in a
one-standard-deviation-of-y decrease in the prediction.
When the correlation is 0, beta is 0, which means that changes in x don’t
affect the prediction at all.
This gives values of alpha = 22.95 and beta = 0.903.
So our model says that we expect a user with n friends to spend
22.95 + n * 0.903 minutes on the site each day.
That is, we predict that a user with no friends on DataSciencester would still
spend about 23 minutes a day on the site, and for each additional friend we
expect a user to spend almost a minute more on the site each day.
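As a quick check, a small sketch of the prediction rule using the fitted values above
(predict is a hypothetical helper name):

    def predict(alpha, beta, x_i):
        return beta * x_i + alpha

    alpha, beta = 22.95, 0.903
    print(predict(alpha, beta, 0))   # about 22.95 minutes for a user with no friends
    print(predict(alpha, beta, 10))  # about 31.98 minutes for a user with 10 friends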
R-Squared
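R-squared (the coefficient of determination) measures the fraction of the total variation
in the dependent variable that is captured by the model: 1 minus the ratio of the sum of
squared errors to the total sum of squares. A minimal sketch, reusing the names above:

    def total_sum_of_squares(y):
        # Total variation of the y_i around their mean.
        y_bar = sum(y) / len(y)
        return sum((y_i - y_bar) ** 2 for y_i in y)

    def r_squared(alpha, beta, x, y):
        # Fraction of the variation in y explained by the model:
        # 1.0 is a perfect fit, 0.0 is no better than always predicting mean(y).
        sum_of_sqerrors = sum((y_i - (beta * x_i + alpha)) ** 2
                              for x_i, y_i in zip(x, y))
        return 1.0 - sum_of_sqerrors / total_sum_of_squares(y)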
Variable Selection: choose which variables to include in the model based on theoretical
considerations or prior knowledge. If both "friends" and "time on site" are highly
correlated with "work hours", it may be necessary to select only one of them to avoid
multicollinearity issues.
In practice, you’d often like to apply linear regression to datasets with large
numbers of variables. Two problems arise here:
First, the more variables you use, the more likely you are to overfit your
model to the training set.
Second, the more nonzero coefficients you have, the harder they are to make
sense of (or to compute).
Regularization is an approach in which we add to the error term a penalty
that gets larger as beta gets larger. We then minimize the combined error and
penalty. The more importance we place on the penalty term, the more we
discourage large coefficients.
Ridge Regression (L2 Regularization): in ridge regression, we add a penalty proportional
to the sum of the squares of the beta_i (except that typically we don’t penalize beta_0,
the constant term).
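A minimal sketch of the ridge penalty and the combined per-point loss; lambda_ is an
assumed name for the hyperparameter controlling how much the penalty matters, and each
x_i is assumed to start with a constant 1 so that beta[0] is the intercept:

    def dot(v, w):
        return sum(v_j * w_j for v_j, w_j in zip(v, w))

    def error(x_i, y_i, beta):
        # Prediction error for one point; dot(x_i, beta) is the prediction.
        return dot(x_i, beta) - y_i

    def ridge_penalty(beta, lambda_):
        # Sum of squared coefficients, skipping beta[0], the constant term.
        return lambda_ * sum(b ** 2 for b in beta[1:])

    def squared_error_ridge(x_i, y_i, beta, lambda_):
        # Combined objective for one point: squared error plus penalty.
        return error(x_i, y_i, beta) ** 2 + ridge_penalty(beta, lambda_)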
Lasso Regression (L1 Regularization): it adds a penalty term proportional to the sum of
the absolute values of the model’s coefficients.
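The corresponding sketch for the lasso penalty (same assumed lambda_ hyperparameter);
unlike the ridge penalty, this one tends to force coefficients all the way to zero,
which makes lasso useful for variable selection:

    def lasso_penalty(beta, lambda_):
        # Sum of absolute coefficients, again skipping the constant term.
        return lambda_ * sum(abs(b) for b in beta[1:])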
Logistic Regression
Seminar
Refer to the textbook.
Natural Language Processing
Seminar
Refer to the textbook.