Machine Learning Unit 2
Unit-2
Regression
Regression is a supervised learning technique
for investigating the relationship between
independent variables or features and a dependent
variable or outcome.
Once the relationship between the independent and
dependent variables has been estimated, outcomes
can be predicted.
E.g.: house price prediction, stock price prediction
Example
Suppose a marketing company A runs various
advertisements every year and records the sales they generate.
Terminology Related to Regression Analysis:
Dependent Variable: The main factor in regression
analysis that we want to predict or understand is called
the dependent variable. It is also called the target variable
or response variable.
Independent Variable: The factors that affect the
dependent variable, or that are used to predict its values,
are called independent variables. An independent variable
is also called a predictor.
Outliers: An outlier is an observation with a very low or
very high value compared with the other observed values.
An outlier may distort the result, so outliers should be
identified and handled before fitting the model.
Types of Regression
[Diagram: Regression splits into Linear Regression and Non-Linear Regression. The leaf nodes shown are Simple Linear Regression, Multiple Linear Regression, and Logistic Regression.]
Linear Regression: Linear regression models a linear
relationship between the independent variable (plotted on
the X-axis) and the dependent variable (plotted on the
Y-axis), hence the name linear regression.
If there is only one input variable (x), the model is
called simple linear regression.
If there is more than one input variable, the model is
called multiple linear regression.
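A minimal sketch of simple linear regression, fitting the line y = slope * x + intercept by ordinary least squares. The data values and the names `ad_spend` and `sales` are hypothetical, chosen only to echo the advertising example; they are not taken from the slides.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one input variable (advertising spend) and one outcome (sales).
# The points lie exactly on y = 2x + 1, so the fit should recover that line.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(ad_spend, sales)
predicted_sales = slope * 6.0 + intercept  # prediction for an unseen input
print(slope, intercept, predicted_sales)
```

With more than one input variable the same idea applies, but the single slope becomes one coefficient per input.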
Example of Linear Regression
Sig(x) → 1 as x → ∞
Sig(x) → 0 as x → -∞
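The two limits can be checked numerically. This sketch assumes Sig(x) is the standard logistic sigmoid, 1 / (1 + e^(-x)), since the definition itself is not present in this part of the slides.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Outputs move towards 0 for large negative x and towards 1 for large positive x.
for x in (-20, -2, 0, 2, 20):
    print(x, sigmoid(x))
```

Because the output always lies between 0 and 1, it can be read as a probability, which is what logistic regression relies on.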
Logistic Regression Assumptions
The dependent variable must be categorical.
The independent variables should not exhibit
multicollinearity.
Multicollinearity occurs when independent variables
in a regression model are correlated with one another.
This correlation is a problem because independent
variables should be independent. If the degree of
correlation between variables is high enough, it can
cause problems when you fit the model and interpret
the results.
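One simple way to spot multicollinearity is to compute the correlation between pairs of independent variables. The sketch below uses the Pearson correlation coefficient. The feature names and values are hypothetical, and the 0.8 cut-off mentioned in the comment is a common rule of thumb assumed here, not something stated in the slides.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical features for a house price model.
area_sqft = [800, 1000, 1200, 1500, 1800]
area_sqm = [v * 0.0929 for v in area_sqft]  # same information in other units
age_years = [30, 5, 18, 12, 25]

r_redundant = pearson_correlation(area_sqft, area_sqm)
r_other = pearson_correlation(area_sqft, age_years)

# Assumed rule of thumb: |r| above roughly 0.8 suggests the two features overlap.
print(r_redundant, r_other)
```

Here the two area features carry the same information, so one of them would normally be dropped before fitting.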
Difference between Linear and Logistic Regression
Types of Logistic Regression
Binary Logistic Regression is a type of logistic regression in
which the response variable can belong to only two categories.
E.g.: spam, not spam
Multinomial Logistic Regression is a type of logistic regression
in which the response variable can belong to one of three or more
categories and there is no natural ordering among the categories.
E.g.: color (red, blue, green)
Ordinal Logistic Regression is a type of logistic regression in
which the response variable can belong to one of three or more
categories and there is a natural ordering among the categories.
E.g.: movie rating (1-5), grading (excellent, fair, poor)
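As an illustration of the binary case, here is a minimal sketch that fits a one-feature logistic regression with plain gradient descent. The data (suspicious word counts per email), the learning rate, the number of epochs, and the 0.5 threshold are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary_logistic(xs, labels, learning_rate=0.1, epochs=5000):
    """Fit w and b so that P(label = 1 | x) = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the score
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict(x, w, b, threshold=0.5):
    """Return 1 (spam) if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical data: suspicious words per email; 1 = spam, 0 = not spam.
suspicious_words = [0, 1, 2, 3, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_binary_logistic(suspicious_words, is_spam)
predictions = [predict(x, w, b) for x in suspicious_words]
print(predictions)
```

The multinomial and ordinal variants follow the same pattern but need one score per category or an ordered set of thresholds, which this sketch does not cover.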
Summary
Bayes Theorem
Bayes theorem determines the conditional probability
of an event A given that event B has already occurred.
Bayes theorem is also known as the Bayes Rule or Bayes
Law.
It is a method to determine the probability of an event
based on the occurrences of prior events.
It is used to calculate conditional probability.
Bayes Theorem is stated as:
P(A|B) = P(B|A) * P(A) / P(B)
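A short worked example of Bayes theorem, P(A|B) = P(B|A) * P(A) / P(B), where P(B) is expanded with the law of total probability. The events and all probability values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical events: A = "email is spam", B = "email contains the word 'offer'".
p_spam = 0.2
p_offer_given_spam = 0.6
p_offer_given_not_spam = 0.05

p_spam_given_offer = bayes_posterior(p_spam, p_offer_given_spam, p_offer_given_not_spam)
print(p_spam_given_offer)  # 0.12 / (0.12 + 0.04), about 0.75
```

Seeing the word raises the probability of spam from the prior of 0.2 to about 0.75, which is the update Bayes theorem describes.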
The value of gamma is typically chosen between 0 and 1. It is a
hyperparameter that you must set manually in the code. A commonly
used value is 0.1.
When gamma = 1/(2σ²), this kernel is called the Gaussian kernel.
σ² = variance (σ is the standard deviation)
Gamma: It controls how much curvature the decision boundary has.
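A minimal sketch of how gamma enters the kernel, assuming the kernel under discussion is the Gaussian (RBF) kernel K(x, y) = exp(-gamma * ||x - y||^2) with gamma = 1/(2σ²). The points and gamma values are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gamma_from_sigma(sigma):
    """Convert a standard deviation sigma into gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)


point_a = [1.0, 2.0]
point_b = [2.0, 4.0]  # squared distance to point_a is 1 + 4 = 5

similarity_small_gamma = rbf_kernel(point_a, point_b, gamma=0.1)
similarity_large_gamma = rbf_kernel(point_a, point_b, gamma=1.0)
print(similarity_small_gamma, similarity_large_gamma)
```

With a larger gamma the similarity falls off faster with distance, so each training point influences a smaller region and the decision boundary can bend more sharply.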
SVM Kernel Functions
3) Sigmoid Kernel: It is mostly used in the context of neural
networks. This kernel function resembles a two-layer
perceptron model of a neural network, where it acts as
the activation function for the neurons.
Sigmoid Kernel Function
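The formula itself is missing from this part of the slides. A commonly used form of the sigmoid kernel is K(x, y) = tanh(gamma * (x · y) + r), and the sketch below assumes that form. The parameter names `gamma` and `coef0` (for r) and their default values are assumptions made for illustration.

```python
import math


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    """Sigmoid (hyperbolic tangent) kernel: tanh(gamma * dot(x, y) + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


u = [1.0, 2.0]
v = [3.0, 4.0]  # dot product with u is 3 + 8 = 11

k_uv = sigmoid_kernel(u, v)
print(k_uv)
```

The tanh squashes the scaled dot product into the range (-1, 1), which is the same role an activation function plays in a perceptron.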