
STATISTICAL METHODS - 2

Regression Analysis - PSTAT 126


NOTES 15

LOGISTIC REGRESSION AND GLMs

Department of Statistics and Applied Probability


University of California, Santa Barbara
Today’s Lecture

• Logistic regression / GLMs
  – Model framework
  – Interpretation
  – Estimation

Linear regression

Course started with the model

    yi = β0 + β1 xi + εi,   where εi ∼ (0, σ²)
In particular, yi has been continuous throughout the course

Binary responses

Binary outcomes are common in practice; they usually indicate
some event:
• Yes vs. no
• Transplant vs. no transplant
• Death vs. no death

Binary responses

How should we deal with binary (0/1) y’s?


• Regression focuses on E(y|x)
• For binary outcomes, E(y|x) = P(y = 1|x)
• Does pi = P(yi = 1|xi) = β0 + β1 xi work?

Linear regression for binary outcome

[Figure: binary responses y ∈ {0, 1} plotted against x (roughly −10 to 5);
the fitted least-squares line takes values outside the interval (0, 1).]

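The problem is easy to reproduce in R by fitting ordinary least squares to
simulated 0/1 data (the data-generating mechanism below is made up purely
for illustration):

# Simulate binary data and fit OLS; fitted values escape (0, 1)
set.seed(1)
x <- runif(100, -10, 5)
y <- as.numeric(x + 2 * rnorm(100) > 0)   # made-up mechanism for 0/1 data
fit <- lm(y ~ x)
range(fitted(fit))                        # typically dips below 0 or exceeds 1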
What we need for binary outcomes

• Fitted probabilities should be between 0 and 1
• Use an invertible function g : (0, 1) → (−∞, ∞) to link
  probabilities to the real line
• Build a model for g(pi) = β0 + β1 xi

Link functions

• Lots of possible link functions: logit, probit,
  complementary log-log
• By far, the most common is the logit link:

      g(pi) = logit(pi) = log( pi / (1 − pi) )

• The inverse link function is also useful:

      g⁻¹(z) = exp(z) / (1 + exp(z))

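In R, the logit link and its inverse are available as the base functions
qlogis() and plogis(); a quick check that each undoes the other:

p <- c(0.1, 0.5, 0.9)
z <- qlogis(p)    # g(p) = log(p / (1 - p))
plogis(z)         # g^{-1}(z) = exp(z) / (1 + exp(z)); recovers p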
Logistic regression

Model is now

    E(yi | xi) = pi

    g(pi) = log( pi / (1 − pi) ) = β0 + β1 xi

Using the logit link, we have

    pi = g⁻¹(β0 + β1 xi) = exp(β0 + β1 xi) / (1 + exp(β0 + β1 xi))

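Whatever the coefficients, this mean function stays strictly inside (0, 1);
a quick sketch with made-up values β0 = −1, β1 = 2:

beta0 <- -1; beta1 <- 2                   # made-up coefficients
xgrid <- seq(-4, 4, by = 0.5)
p <- exp(beta0 + beta1 * xgrid) / (1 + exp(beta0 + beta1 * xgrid))
range(p)                                  # strictly inside (0, 1)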
Parameter interpretation

Suppose we can estimate β0 , β1 ; what do they mean?


For a binary predictor x ∈ {0, 1}, β1 is the difference in log odds between
the x = 1 and x = 0 groups, so exp(β1) is the odds ratio comparing the two
groups.

Parameter interpretation

For a continuous predictor, a one-unit increase in x adds β1 to the log
odds; equivalently, it multiplies the odds by exp(β1).

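In practice, both interpretations come from exponentiating fitted
coefficients; a small sketch on simulated data (all numbers made up):

set.seed(5)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.3 + 0.9 * x))  # made-up true model
fit <- glm(y ~ x, family = binomial)
exp(coef(fit))   # exp(beta1) = multiplicative change in odds per unit of x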
Parameter estimation

• For linear regression, we used least squares and found that
  this corresponded to ML
• Try using maximum likelihood for logistic regression; need
  a likelihood ...

ML for logistic regression

• Assume that [yi | xi] ∼ Bern(pi)
• The density function is p(yi) = pi^yi (1 − pi)^(1−yi)
• As before, use that logit(pi) = β0 + β1 xi
• The likelihood is

      L(β0, β1; y) = ∏_{i=1}^n pi^yi (1 − pi)^(1−yi)

ML for logistic regression

• The log likelihood is easier to work with, but it is typically not
  possible to find a closed-form solution
• Iterative algorithms are used instead (Newton-Raphson,
  Iteratively Reweighted Least Squares)
• These are implemented for a variety of link functions in R,
  as shown below

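A minimal sketch of fitting a logistic regression with R's glm(); the data
are simulated, so the true coefficients below are made up:

set.seed(42)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(-1 + 2 * x))   # made-up true beta0 = -1, beta1 = 2
fit <- glm(y ~ x, family = binomial(link = "logit"))
coef(fit)                                 # estimates of beta0, beta1
predict(fit, type = "response")[1:5]      # fitted probabilities p_i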
♠ Likelihood Functions

• Data: {(yi, xi), i = 1, 2, . . . , n}, where xi = (1, Xi1, Xi2, . . . , Xik)′.

• Parameters: β = (β0, β1, . . . , βk)′.

• Binary Logistic Regression Models


Let yi be a binary response taking values 0 or 1. Then the likelihood
function is given by
    L(β) = ∏_{i=1}^n pi^yi (1 − pi)^(1−yi) = ∏_{i=1}^n exp(yi xi′β) / [1 + exp(xi′β)]

and the log-likelihood function is


    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − log[1 + exp(xi′β)] }.

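This log-likelihood is simple to code directly and maximize numerically; a
sketch using optim() on simulated data (this is not what glm() does
internally, which is IRLS, but it reaches the same maximizer):

# Binary logistic log-likelihood l(beta) from above
loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)                 # x_i' beta for each i
  sum(y * eta - log1p(exp(eta)))          # sum{ y_i eta_i - log(1 + exp(eta_i)) }
}
set.seed(9)
X <- cbind(1, rnorm(150))                 # design matrix with intercept column
y <- rbinom(150, 1, plogis(X %*% c(-0.5, 1)))
optim(c(0, 0), loglik, X = X, y = y, method = "BFGS",
      control = list(fnscale = -1))$par   # fnscale = -1: maximize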
• Binomial Logistic Regression Models
Let yi be a binomial response taking values 0, 1, . . . , ni (the number of
successes in ni independent trials). Then the likelihood function is given
by

    L(β) = ∏_{i=1}^n (ni choose yi) pi^yi (1 − pi)^(ni−yi)
         = ∏_{i=1}^n (ni choose yi) exp(yi xi′β) / [1 + exp(xi′β)]^ni

and the log-likelihood function is


    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − ni log[1 + exp(xi′β)] + log (ni choose yi) }.
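In R, a binomial logistic model can be fit by handing glm() the
success/failure counts; a sketch on simulated grouped data (all values
made up):

set.seed(7)
x <- runif(30, -2, 2)
n <- sample(5:20, 30, replace = TRUE)     # made-up group sizes n_i
y <- rbinom(30, size = n, prob = plogis(0.5 + 1.5 * x))
fit <- glm(cbind(y, n - y) ~ x, family = binomial)  # successes, failures
coef(fit)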

• Poisson Regression Models


Let yi be a count variable taking values 0, 1, . . . . Then the likelihood
function is given by
    L(β) = ∏_{i=1}^n [exp(yi xi′β) / yi!] exp{− exp(xi′β)}.

and the log-likelihood function is
    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − exp(xi′β) − log(yi!) }.
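The same glm() pattern fits the Poisson model with its canonical log link;
simulated data, made-up coefficients:

set.seed(3)
x <- rnorm(100)
y <- rpois(100, lambda = exp(0.2 + 0.7 * x))  # E(y_i | x_i) = exp(x_i' beta)
fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)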

♠ Maximum Likelihood Estimation (MLE)


For the GLM, no closed-form expression for the MLE β̂ is generally
available. The Newton-Raphson (NR) algorithm is commonly used for
computing the MLE β̂ of β; a sketch follows.

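Below is a minimal IRLS sketch for logistic regression (with the canonical
logit link this coincides with Newton-Raphson); the function name and all
data are made up, and in practice one would simply call glm():

irls_logistic <- function(X, y, tol = 1e-8, max_iter = 25) {
  beta <- rep(0, ncol(X))                 # starting value
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)               # linear predictor x_i' beta
    p   <- plogis(eta)                    # fitted probabilities
    w   <- p * (1 - p)                    # Bernoulli variance weights
    z   <- eta + (y - p) / w              # working response
    # Weighted least squares step: solve (X'WX) beta = X'Wz
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + 1.2 * x))
irls_logistic(cbind(1, x), y)   # should match coef(glm(y ~ x, family = binomial))

At convergence, the same weights give the usual asymptotic covariance of β̂
as (X′WX)⁻¹, which is where the standard errors reported by glm() come from.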
