
STATISTICAL METHODS - 2

Regression Analysis - PSTAT 126


NOTES 15

LOGISTIC REGRESSION AND GLMs

Department of Statistics and Applied Probability


University of California, Santa Barbara
Today’s Lecture

• Logistic regression / GLMs
  – Model framework
  – Interpretation
  – Estimation

Linear regression

Course started with the model

    yi = β0 + β1 xi + εi,   where εi ∼ (0, σ²)
In particular, yi has been continuous throughout the course

Binary responses

Binary outcomes are common in practice; they usually indicate
some event:
• Yes vs. no
• Transplant vs. no transplant
• Death vs. no death

Binary responses

How should we deal with binary (0/1) y’s?


• Regression focuses on E(y|x)
• For binary outcomes, E(y|x) = P(y = 1|x)
• Does pi = P(yi = 1|xi) = β0 + β1 xi work?

Linear regression for binary outcome

[Figure: binary responses y ∈ {0, 1} plotted against x (roughly −10 to 5);
the fitted least-squares line takes values outside the interval (0, 1).]

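The problem is easy to reproduce in R by fitting ordinary least squares to
simulated 0/1 data (the data-generating mechanism below is made up purely
for illustration):

# Simulate binary data and fit OLS; fitted values escape (0, 1)
set.seed(1)
x <- runif(100, -10, 5)
y <- as.numeric(x + 2 * rnorm(100) > 0)   # made-up mechanism for 0/1 data
fit <- lm(y ~ x)
range(fitted(fit))                        # typically dips below 0 or exceeds 1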
What we need for binary outcomes

• Fitted probabilities should be between 0 and 1
• Use an invertible function g : (0, 1) → (−∞, ∞) to link
  probabilities to the real line
• Build a model for g(pi) = β0 + β1 xi

Link functions

• Lots of possible link functions: logit, probit,
  complementary log-log
• By far, the most common is the logit link:

      g(pi) = logit(pi) = log( pi / (1 − pi) )

• The inverse link function is also useful:

      g⁻¹(z) = exp(z) / (1 + exp(z))

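In R, the logit link and its inverse are available as the base functions
qlogis() and plogis(); a quick check that each undoes the other:

p <- c(0.1, 0.5, 0.9)
z <- qlogis(p)    # g(p) = log(p / (1 - p))
plogis(z)         # g^{-1}(z) = exp(z) / (1 + exp(z)); recovers p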
Logistic regression

Model is now

    E(yi | xi) = pi

    g(pi) = log( pi / (1 − pi) ) = β0 + β1 xi

Using the logit link, we have

    pi = g⁻¹(β0 + β1 xi) = exp(β0 + β1 xi) / (1 + exp(β0 + β1 xi))

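Whatever the coefficients, this mean function stays strictly inside (0, 1);
a quick sketch with made-up values β0 = −1, β1 = 2:

beta0 <- -1; beta1 <- 2                   # made-up coefficients
xgrid <- seq(-4, 4, by = 0.5)
p <- exp(beta0 + beta1 * xgrid) / (1 + exp(beta0 + beta1 * xgrid))
range(p)                                  # strictly inside (0, 1)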
Parameter interpretation

Suppose we can estimate β0 , β1 ; what do they mean?


For a binary predictor x ∈ {0, 1}, β1 is the difference in log odds between
the x = 1 and x = 0 groups, so exp(β1) is the odds ratio comparing the two
groups.

Parameter interpretation

For a continuous predictor, a one-unit increase in x adds β1 to the log
odds; equivalently, it multiplies the odds by exp(β1).

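In practice, both interpretations come from exponentiating fitted
coefficients; a small sketch on simulated data (all numbers made up):

set.seed(5)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.3 + 0.9 * x))  # made-up true model
fit <- glm(y ~ x, family = binomial)
exp(coef(fit))   # exp(beta1) = multiplicative change in odds per unit of x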
Parameter estimation

• For linear regression, we used least squares and found that
  this corresponded to ML
• Try using maximum likelihood for logistic regression; need
  a likelihood ...

ML for logistic regression

• Assume that [yi | xi] ∼ Bern(pi)
• The density function is p(yi) = pi^yi (1 − pi)^(1−yi)
• As before, use that logit(pi) = β0 + β1 xi
• The likelihood is

      L(β0, β1; y) = ∏_{i=1}^n pi^yi (1 − pi)^(1−yi)

ML for logistic regression

• The log likelihood is easier to work with, but it is typically not
  possible to find a closed-form solution
• Iterative algorithms are used instead (Newton-Raphson,
  Iteratively Reweighted Least Squares)
• These are implemented for a variety of link functions in R,
  as shown below

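A minimal sketch of fitting a logistic regression with R's glm(); the data
are simulated, so the true coefficients below are made up:

set.seed(42)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(-1 + 2 * x))   # made-up true beta0 = -1, beta1 = 2
fit <- glm(y ~ x, family = binomial(link = "logit"))
coef(fit)                                 # estimates of beta0, beta1
predict(fit, type = "response")[1:5]      # fitted probabilities p_i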
♠ Likelihood Functions

• Data: {(yi, xi), i = 1, 2, . . . , n}, where xi = (1, Xi1, Xi2, . . . , Xik)′.

• Parameters: β = (β0, β1, . . . , βk)′.

• Binary Logistic Regression Models


Let yi be a binary response taking values 0 or 1. Then the likelihood
function is given by
    L(β) = ∏_{i=1}^n pi^yi (1 − pi)^(1−yi) = ∏_{i=1}^n exp(yi xi′β) / [1 + exp(xi′β)]

and the log-likelihood function is


    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − log[1 + exp(xi′β)] }.

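This log-likelihood is simple to code directly and maximize numerically; a
sketch using optim() on simulated data (this is not what glm() does
internally, which is IRLS, but it reaches the same maximizer):

# Binary logistic log-likelihood l(beta) from above
loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)                 # x_i' beta for each i
  sum(y * eta - log1p(exp(eta)))          # sum{ y_i eta_i - log(1 + exp(eta_i)) }
}
set.seed(9)
X <- cbind(1, rnorm(150))                 # design matrix with intercept column
y <- rbinom(150, 1, plogis(X %*% c(-0.5, 1)))
optim(c(0, 0), loglik, X = X, y = y, method = "BFGS",
      control = list(fnscale = -1))$par   # fnscale = -1: maximize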
• Binomial Logistic Regression Models
Let yi be a binomial response taking values 0, 1, . . . , ni (the number of
successes in ni independent trials). Then the likelihood function is given
by

    L(β) = ∏_{i=1}^n (ni choose yi) pi^yi (1 − pi)^(ni−yi)
         = ∏_{i=1}^n (ni choose yi) exp(yi xi′β) / [1 + exp(xi′β)]^ni

and the log-likelihood function is


    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − ni log[1 + exp(xi′β)] + log (ni choose yi) }.
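In R, a binomial logistic model can be fit by handing glm() the
success/failure counts; a sketch on simulated grouped data (all values
made up):

set.seed(7)
x <- runif(30, -2, 2)
n <- sample(5:20, 30, replace = TRUE)     # made-up group sizes n_i
y <- rbinom(30, size = n, prob = plogis(0.5 + 1.5 * x))
fit <- glm(cbind(y, n - y) ~ x, family = binomial)  # successes, failures
coef(fit)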

• Poisson Regression Models


Let yi be a count variable taking values 0, 1, . . . . Then the likelihood
function is given by
    L(β) = ∏_{i=1}^n [exp(yi xi′β) / yi!] exp{− exp(xi′β)}.

and the log-likelihood function is
    l(β) = log L(β) = ∑_{i=1}^n { yi xi′β − exp(xi′β) − log(yi!) }.
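The same glm() pattern fits the Poisson model with its canonical log link;
simulated data, made-up coefficients:

set.seed(3)
x <- rnorm(100)
y <- rpois(100, lambda = exp(0.2 + 0.7 * x))  # E(y_i | x_i) = exp(x_i' beta)
fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)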

♠ Maximum Likelihood Estimation (MLE)


For the GLM, no closed-form expression for the MLE β̂ is generally
available. The Newton-Raphson (NR) algorithm is commonly used for
computing the MLE β̂ of β; a sketch follows.

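Below is a minimal IRLS sketch for logistic regression (with the canonical
logit link this coincides with Newton-Raphson); the function name and all
data are made up, and in practice one would simply call glm():

irls_logistic <- function(X, y, tol = 1e-8, max_iter = 25) {
  beta <- rep(0, ncol(X))                 # starting value
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)               # linear predictor x_i' beta
    p   <- plogis(eta)                    # fitted probabilities
    w   <- p * (1 - p)                    # Bernoulli variance weights
    z   <- eta + (y - p) / w              # working response
    # Weighted least squares step: solve (X'WX) beta = X'Wz
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + 1.2 * x))
irls_logistic(cbind(1, x), y)   # should match coef(glm(y ~ x, family = binomial))

At convergence, the same weights give the usual asymptotic covariance of β̂
as (X′WX)⁻¹, which is where the standard errors reported by glm() come from.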
