Lec 05
Jia-Bin Huang
ECE-5424G / CS-5824 Virginia Tech Spring 2019
Administrative
• Please start HW 1 early!
• $P(X_1, \ldots, X_n \mid Y)$: Need $2(2^n - 1)$ parameters (binary attributes, binary label, no independence assumption)
• $P(Y)$: Need 1 parameter
• Test example: classify a new $x$ by comparing
• $P(y = 1) \prod_i P(x_i \mid y = 1)$
• $P(y = 0) \prod_i P(x_i \mid y = 0)$
Naïve Bayes algorithm – discrete
• For each value $y_k$:
Estimate $\pi_k = P(Y = y_k)$
For each value $x_{ij}$ of each attribute $X_i$:
Estimate $\theta_{ijk} = P(X_i = x_{ij} \mid Y = y_k)$
• Classify: $y^{new} = \arg\max_{y_k} P(y_k) \prod_i P(x_i^{new} \mid y_k)$
• Additional assumptions on the variance $\sigma_{ik}$ (Gaussian naïve Bayes):
• Is independent of $Y$ ($\sigma_i$)
• Is independent of $X_i$ ($\sigma_k$)
• Is independent of both $X_i$ and $Y$ ($\sigma$)
• Classify: $y^{new} = \arg\max_{y_k} P(y_k) \prod_i \mathcal{N}(x_i^{new};\, \mu_{ik}, \sigma_{ik})$
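The estimate-then-classify recipe above can be sketched for binary attributes. This is a minimal NumPy sketch with made-up toy data; the Laplace smoothing constant `alpha` is an added assumption, not part of the slide.

```python
import numpy as np

# Hypothetical toy data: 4 binary attributes, binary label.
X = np.array([[1, 0, 1, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
y = np.array([1, 1, 0, 0, 0, 1])

def train_nb(X, y, alpha=1.0):
    """Estimate P(Y = y_k) and P(X_i = 1 | Y = y_k) with Laplace smoothing."""
    classes = np.unique(y)
    priors = {k: np.mean(y == k) for k in classes}
    # theta[k][i] = P(X_i = 1 | Y = k)
    theta = {k: (X[y == k].sum(axis=0) + alpha) / ((y == k).sum() + 2 * alpha)
             for k in classes}
    return priors, theta

def predict_nb(x, priors, theta):
    """Classify: argmax_k P(y_k) * prod_i P(x_i | y_k), computed in log space."""
    scores = {}
    for k in priors:
        p = theta[k]
        log_lik = np.sum(np.log(np.where(x == 1, p, 1 - p)))
        scores[k] = np.log(priors[k]) + log_lik
    return max(scores, key=scores.get)

priors, theta = train_nb(X, y)
print(predict_nb(np.array([1, 0, 1, 0]), priors, theta))  # 1
```

Working in log space avoids underflow when the product runs over many attributes.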
Logistic Regression
• Hypothesis representation
• Cost function
• Regularization
• Multi-class classification
[Figure: classifying tumors as malignant (y = 1 "Yes") or not (y = 0 "No") by Tumor Size]
Linear regression: $h_\theta(x) = \theta^\top x$
Logistic regression: $h_\theta(x) = g(\theta^\top x)$,
where $g(z) = \dfrac{1}{1 + e^{-z}}$
• Sigmoid function
• Logistic function
Slide credit: Andrew Ng
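The hypothesis above can be sketched in a few lines of NumPy (the function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Logistic regression hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(theta @ x)

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # ~1 as z -> +inf
print(sigmoid(-10.0))  # ~0 as z -> -inf
```

The squashing to $(0, 1)$ is what lets the output be read as a probability.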
Interpretation of hypothesis output
• $h_\theta(x) = P(y = 1 \mid x; \theta)$: estimated probability that $y = 1$ on input $x$
• Example: If $h_\theta(x) = 0.7$, the model estimates a 70% chance that the tumor is malignant
(Here $z = \theta^\top x$.)
Suppose predict “$y = 1$” if $h_\theta(x) \ge 0.5$
predict “$y = 0$” if $h_\theta(x) < 0.5$
[Figure: training data and a linear decision boundary in the (Tumor Size, Age) plane]
• Predict “$y = 1$” if $\theta^\top x \ge 0$
• Predict “$y = 0$” if $\theta^\top x < 0$
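The two forms of the rule agree because $g$ is monotone increasing with $g(0) = 0.5$. A small sketch (the parameter values are made up for illustration):

```python
import numpy as np

def predict(theta, x):
    """Predict y = 1 iff h_theta(x) >= 0.5, equivalently iff theta^T x >= 0
    (g is monotone increasing and g(0) = 0.5, so the two tests agree)."""
    return 1 if theta @ x >= 0 else 0

# Hypothetical boundary: theta = [-3, 1, 1] predicts y = 1 exactly when
# x1 + x2 >= 3, with x = [1, x1, x2] carrying an intercept term.
theta = np.array([-3.0, 1.0, 1.0])
print(predict(theta, np.array([1.0, 2.0, 2.0])))  # 1, since 2 + 2 >= 3
print(predict(theta, np.array([1.0, 1.0, 1.0])))  # 0, since 1 + 1 < 3
```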
Apply Bayes rule and plug in the Gaussian class-conditional density
$$P(x_i \mid y_k) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\!\left(-\frac{(x_i - \mu_{ik})^2}{2\sigma_i^2}\right)$$
which yields the logistic form
$$P(y = 1 \mid x) = \frac{1}{1 + \exp\!\left(\ln\frac{1-\pi}{\pi} + \sum_i \left(\frac{\mu_{i0} - \mu_{i1}}{\sigma_i^2}\, x_i + \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2}\right)\right)}$$
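A numeric sanity check of this derivation (all parameter values below are made up): the logistic form with the implied weights should match the posterior computed directly from Bayes rule.

```python
import numpy as np

# Hypothetical Gaussian naive Bayes parameters for two features.
mu0 = np.array([1.0, 2.0])    # class-conditional means for y = 0
mu1 = np.array([3.0, 1.0])    # class-conditional means for y = 1
sigma = np.array([1.0, 0.5])  # per-feature std dev, independent of y
pi = 0.4                      # prior P(y = 1)

# Weights implied by the derivation above.
w = (mu0 - mu1) / sigma**2
w0 = np.log((1 - pi) / pi) + np.sum((mu1**2 - mu0**2) / (2 * sigma**2))

def posterior_logistic(x):
    """P(y=1|x) in the logistic form 1 / (1 + exp(w0 + sum_i w_i x_i))."""
    return 1.0 / (1.0 + np.exp(w0 + w @ x))

def posterior_bayes(x):
    """P(y=1|x) computed directly from Bayes rule with Gaussian densities."""
    def lik(mu):
        return np.prod(np.exp(-(x - mu)**2 / (2 * sigma**2))
                       / (np.sqrt(2 * np.pi) * sigma))
    return pi * lik(mu1) / (pi * lik(mu1) + (1 - pi) * lik(mu0))

x = np.array([2.0, 1.5])
print(posterior_logistic(x), posterior_bayes(x))  # the two agree
```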
Logistic Regression
• Hypothesis representation
• Cost function
• Regularization
• Multi-class classification
Training set with $m$ examples: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$
• Training data: $D = \{\langle x^{(j)}, y^{(j)} \rangle\}_{j=1}^{m}$
• Data likelihood: $\prod_j P(x^{(j)}, y^{(j)} \mid \theta)$
• Data conditional likelihood: $\prod_j P(y^{(j)} \mid x^{(j)}, \theta)$
$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$
[Figure: the cost curves $-\log(h_\theta(x))$ for $y = 1$ and $-\log(1 - h_\theta(x))$ for $y = 0$]
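The piecewise cost above heavily penalizes confident wrong answers. A minimal sketch (function names are illustrative):

```python
import numpy as np

def cost(h, y):
    """Per-example logistic cost: -log(h) if y = 1, -log(1 - h) if y = 0."""
    return -np.log(h) if y == 1 else -np.log(1 - h)

# A confident correct prediction is cheap, a confident wrong one is expensive:
print(cost(0.99, 1))  # ~0.01
print(cost(0.01, 1))  # ~4.6

def J(theta, X, Y):
    """Average cost over the training set (the cross-entropy loss)."""
    H = 1.0 / (1.0 + np.exp(-X @ theta))
    return np.mean(-Y * np.log(H) - (1 - Y) * np.log(1 - H))
```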
Logistic Regression
• Hypothesis representation
• Cost function
• Regularization
• Multi-class classification
Gradient descent
Goal: $\min_\theta J(\theta)$
Good news: Convex function!
Bad news: No analytical solution
Repeat {
$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
(Simultaneously update all $\theta_j$)
}
where $h_\theta(x) = \dfrac{1}{1 + e^{-\theta^\top x}}$
Slide credit: Andrew Ng
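The update above, vectorized over all $j$ at once (the toy 1-D data and step size are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, Y, alpha=0.1, iters=5000):
    """Batch gradient descent on the logistic regression cost J(theta).
    Update: theta_j := theta_j - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_ij,
    applied to all j simultaneously (vectorized)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        H = sigmoid(X @ theta)        # predictions for all m examples
        grad = X.T @ (H - Y) / m      # gradient of J(theta)
        theta = theta - alpha * grad  # simultaneous update of all theta_j
    return theta

# Hypothetical 1-D toy problem (intercept column + one feature):
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, Y)
print(sigmoid(X @ theta))  # each prediction on the correct side of 0.5
```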
Logistic Regression
• Hypothesis representation
• Cost function
• Regularization
• Multi-class classification
How about MAP?
• Maximum conditional likelihood estimate (MCLE): $\hat{\theta}_{MCLE} = \arg\max_\theta \prod_j P(y^{(j)} \mid x^{(j)}, \theta)$
Logistic Regression
• Hypothesis representation
• Cost function
• Regularization
• Multi-class classification
Multi-class classification
• Email foldering/tagging: Work, Friends, Family, Hobby
[Figure: binary vs. multi-class data in the $(x_1, x_2)$ plane]
One-vs-all (one-vs-rest):
• Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$
• Class 1: $h_\theta^{(1)}(x)$
• Class 2: $h_\theta^{(2)}(x)$
• Class 3: $h_\theta^{(3)}(x)$
• Prediction: on a new input $x$, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$
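One-vs-rest can be sketched end to end (the synthetic data and hyperparameters below are arbitrary choices for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_binary(X, Y, alpha=0.5, iters=3000):
    """Fit one logistic regression classifier by gradient descent."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * X.T @ (sigmoid(X @ theta) - Y) / len(Y)
    return theta

def train_one_vs_rest(X, y, classes):
    """One classifier h^(i) per class: class i vs. everything else."""
    return {c: train_binary(X, (y == c).astype(float)) for c in classes}

def predict(thetas, x):
    """Pick the class whose classifier is most confident: argmax_i h^(i)(x)."""
    return max(thetas, key=lambda c: sigmoid(thetas[c] @ x))

# Hypothetical toy data: 3 well-separated classes in the (x1, x2) plane.
rng = np.random.default_rng(0)
centers = {1: [0, 0], 2: [4, 0], 3: [2, 4]}
X = np.vstack([rng.normal(centers[c], 0.3, size=(20, 2)) for c in (1, 2, 3)])
X = np.hstack([np.ones((60, 1)), X])  # add an intercept column
y = np.repeat([1, 2, 3], 20)
thetas = train_one_vs_rest(X, y, (1, 2, 3))
print(predict(thetas, np.array([1.0, 4.0, 0.0])))  # point near (4, 0) -> class 2
```

Each binary problem reuses the same cost and update from the earlier slides; only the labels change.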
Further readings
• Tom M. Mitchell
Generative and discriminative classifiers: Naïve Bayes and Logistic
Regression
http://www.cs.cmu.edu/~tom/mlbook/NBayesLogReg.pdf
• Regularization
$$\theta_j := \theta_j - \alpha \frac{\lambda}{m} \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
• Multi-class classification
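The regularized update can be sketched as a single gradient step (following the common convention, not shown on the slide, of leaving the intercept $\theta_0$ unpenalized; all numbers are illustrative):

```python
import numpy as np

def regularized_step(theta, X, Y, alpha=0.1, lam=1.0):
    """One gradient step with L2 regularization:
    theta_j := theta_j - alpha*(lam/m)*theta_j
                       - alpha*(1/m)*sum_i (h(x_i) - y_i) * x_ij.
    By convention the intercept theta_0 is not regularized."""
    m = len(Y)
    H = 1.0 / (1.0 + np.exp(-X @ theta))
    reg = (lam / m) * theta
    reg[0] = 0.0  # leave the intercept unpenalized
    return theta - alpha * (reg + X.T @ (H - Y) / m)

theta = np.array([0.0, 2.0])
theta = regularized_step(theta, np.array([[1.0, 0.0]]), np.array([1.0]))
print(theta)  # the penalized weight shrinks toward zero
```

The extra $-\alpha \frac{\lambda}{m} \theta_j$ term shrinks each weight a little every iteration, which is what discourages overfitting.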
Coming up…
• Regularization