CUHK STAT5102 Ch7
•Logistic regression
Binary response variable
Simple logistic regression
Multiple logistic regression
Example
Binary response variable
$Y_i = 0, 1$
Special problems when response variable is binary
$Y = \beta_0 + \beta_1 X + \varepsilon$
Logistic regression
A simple example: $Y(t) = \dfrac{1}{1 + e^{-t}}$
Logistic function (logistic curve)
[Figure: logistic function, plotting E(y) on the vertical axis against x on the horizontal axis]
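As a minimal sketch, the S-shaped curve can be evaluated numerically using the standard logistic function. The function name `logistic` and the sample points are illustrative choices, not part of the course notes.

```python
import math


def logistic(t: float) -> float:
    """Standard logistic function 1 / (1 + e^(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


# Illustrative evaluation points (chosen here, not taken from the slides).
for t in (-4, -2, 0, 2, 4):
    print(f"t = {t:+d}  logistic(t) = {logistic(t):.4f}")
```

The values stay strictly between 0 and 1 and increase with t, which is what makes the curve suitable for modelling a probability.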
The simple logistic regression model
$E(Y_i) = \pi_i = \dfrac{\exp(\beta_0 + \beta_1 X_i)}{1 + \exp(\beta_0 + \beta_1 X_i)}$
Note: the response Y has a Bernoulli distribution
Multiple logistic regression model
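$E(Y) = \pi = \dfrac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)} = \dfrac{1}{1 + \exp[-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)]}$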
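The two algebraic forms of the multiple logistic mean response can be checked against each other numerically. The sketch below is illustrative only: the function names, the coefficient vector `beta`, and the predictor values `x` are hypothetical and do not come from the notes.

```python
import math


def linear_predictor(beta, x):
    """beta = [b0, b1, ..., bk], x = [x1, ..., xk]; returns b0 + b1*x1 + ... + bk*xk."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def mean_response(beta, x):
    """First form: exp(eta) / (1 + exp(eta))."""
    eta = linear_predictor(beta, x)
    return math.exp(eta) / (1.0 + math.exp(eta))


def mean_response_alt(beta, x):
    """Second form: 1 / (1 + exp(-eta))."""
    eta = linear_predictor(beta, x)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients and predictor values, for illustration only.
beta = [-1.0, 0.5, -0.25]
x = [2.0, 4.0]

print(mean_response(beta, x), mean_response_alt(beta, x))
```

Both functions return the same probability because dividing the numerator and denominator of the first form by exp(eta) gives the second.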
Multiple logistic regression model
Odds $= \dfrac{\pi}{1 - \pi} = \dfrac{P(Y = 1)}{P(Y = 0)}$
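Therefore, Odds $= \exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k)$, i.e. $\ln\left(\dfrac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k$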
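Since the odds are the ratio of P(Y = 1) to P(Y = 0), converting between probability, odds, and log-odds takes only a few lines. The sketch below uses illustrative function names (`odds`, `logit`) and arbitrary example probabilities that are not taken from the notes.

```python
import math


def odds(pi: float) -> float:
    """Odds of the event: pi / (1 - pi), for 0 < pi < 1."""
    return pi / (1.0 - pi)


def logit(pi: float) -> float:
    """Log-odds: ln(pi / (1 - pi))."""
    return math.log(odds(pi))


# Arbitrary example probabilities, for illustration only.
for pi in (0.2, 0.5, 0.8):
    print(f"pi = {pi:.1f}  odds = {odds(pi):.4f}  log-odds = {logit(pi):+.4f}")
```

A probability of 0.5 corresponds to odds of 1 and log-odds of 0, and the log-odds are symmetric about that point.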
Interpretations of parameters in the logistic model
[Figure panels: $\beta_i < 0$ and $\beta_i > 0$]
$\beta_i$ = Change in log-odds for every 1-unit increase in $X_i$, holding all other X's fixed.
$100(e^{\beta_i} - 1)$ = Percentage change in odds for every 1-unit increase in $X_i$, holding all other X's fixed.
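The percentage-change formula can be illustrated with a short calculation. As an assumption for this sketch, the slope 0.0189 is taken from the fitted equation quoted in Example 1 of these notes; the function name is illustrative.

```python
import math


def pct_change_in_odds(beta_i: float) -> float:
    """100 * (e^beta_i - 1): percentage change in odds per 1-unit increase in X_i."""
    return 100.0 * (math.exp(beta_i) - 1.0)


# Slope assumed from the fitted equation quoted in Example 1 of these notes.
b1 = 0.0189
print(f"Odds change by about {pct_change_in_odds(b1):.3f}% per unit increase in X")
```

A positive coefficient gives a positive percentage change, a negative coefficient gives a negative one, and a coefficient of zero leaves the odds unchanged.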
Logit response function
Problems of Least squares estimation
Hypothesis testing
Example 1
A psychologist conducted a study to examine the nature of the
relation, if any, between an employee’s emotional stability (X) and the
employee’s ability to perform in a task group (Y). Emotional stability
was measured by a written test for which the higher the score, the
greater is the emotional stability. Ability to perform in a task group
(Y=1 if able, Y=0 if unable) was evaluated by the supervisor. The
results of 27 employees were used for analysis with a logistic
regression model.
Example 1 (SAS output)
1. The hypotheses
$H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
By the likelihood ratio test, the test statistic is $\chi^2 = 8.1512$ with p-value = 0.0043.
Since the p-value is below $\alpha = 0.05$, there is sufficient evidence to reject the null
hypothesis, and we conclude that the model is useful.
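The reported p-value can be reproduced from the test statistic. This sketch assumes the likelihood ratio statistic is referred to a chi-square distribution with 1 degree of freedom (one slope parameter is being tested), and uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x / 2)); the function name is illustrative.

```python
import math


def chi2_upper_tail_1df(x: float) -> float:
    """P(chi-square with 1 df > x), using P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


lr_statistic = 8.1512  # likelihood ratio chi-square statistic from Example 1
alpha = 0.05

p_value = chi2_upper_tail_1df(lr_statistic)
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```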
Example 1 (solution)
3. For X = 550
$\hat{\pi} = [1 + \exp(10.3089 - 0.0189(550))]^{-1} = 0.5215$
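The calculation above can be checked with a short script. The coefficients are read off the fitted equation as an intercept of -10.3089 and a slope of 0.0189; the function and variable names are illustrative.

```python
import math

# Coefficients read off the fitted equation in Example 1.
b0 = -10.3089
b1 = 0.0189


def predicted_probability(x: float) -> float:
    """Estimated P(Y = 1) at emotional stability score x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


pi_hat = predicted_probability(550)
print(f"pi_hat at X = 550: {pi_hat:.4f}")
```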
Example 1 (solution)
3. For $\hat{\pi} = 0.7$
Therefore,
$X = \dfrac{10.3089 - \ln\left[\frac{1}{0.7} - 1\right]}{0.0189} = 590.275$
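Solving the fitted equation for X at a target probability amounts to inverting the logistic function. A minimal sketch, assuming the same fitted coefficients as in Example 1 (intercept -10.3089, slope 0.0189) and an illustrative function name:

```python
import math


def x_for_probability(pi: float, b0: float = -10.3089, b1: float = 0.0189) -> float:
    """Score X at which the fitted probability equals pi: (ln(pi / (1 - pi)) - b0) / b1."""
    return (math.log(pi / (1.0 - pi)) - b0) / b1


x_needed = x_for_probability(0.7)
print(f"X needed for pi_hat = 0.7: {x_needed:.3f}")
```

Because ln(pi / (1 - pi)) = -ln(1/pi - 1), this is the same expression as the one used in the solution.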
Example 2
A local health clinic sent fliers to its clients to encourage everyone,
but especially older persons at high risk of complications, to get a flu
shot in time for protection against an expected flu epidemic. In a
pilot follow-up study, 159 clients were randomly selected and asked
whether they actually received a flu shot. A client who received a flu
shot was coded Y = 1, and a client who did not receive a flu shot was
coded Y = 0. In addition, data were collected on their age (X1) and their
health awareness index (X2), for which higher values
indicate greater awareness. Also included in the data was client
gender, where males were coded X3 = 1 and females were coded
X3 = 0.
Example 2 (SAS output)