CUHK STAT5102 Ch7

2016S_STAT5102_L7 |Department of Statistics, The Chinese University of Hong Kong
Logistic regression and piecewise linear regression
In this chapter, we shall cover
• Logistic regression
o Binary response variable
o Simple logistic regression
o Multiple logistic regression
o Example
The response variable has only two possible qualitative outcomes, and therefore can be represented by a binary indicator variable taking on values 0 and 1 (binary responses or dichotomous responses).
Meaning of response function when outcome variable is binary
Assume a binary response, and suppose the simple linear regression model is used:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i = 0, 1$
Special problems when response variable is binary
(when linear model is used)
Let $\pi_i$ be the probability that $Y_i = 1$.
• $E(Y_i) = (0)P(Y_i = 0) + (1)P(Y_i = 1) = P(Y_i = 1) = \pi_i$
• But from the linear model, we have
o $E(Y_i) = \beta_0 + \beta_1 X_i$
o Therefore $\beta_0 + \beta_1 X_i = \pi_i$, so the linear function must itself be a probability.
o This requires the constraint $0 \le \pi_i \le 1$, which is not feasible over a large range of $X$.
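The infeasible constraint is easy to see numerically. The sketch below fits an ordinary least squares line to a small binary data set and flags fitted values that fall outside [0, 1]. The data and the helper name `least_squares_line` are hypothetical and invented only for this illustration.

```python
def least_squares_line(x, y):
    """Ordinary least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical binary responses, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = least_squares_line(x, y)
fitted = [b0 + b1 * xi for xi in x]

# A fitted value is meant to estimate a probability, so flag impossible ones.
for xi, fi in zip(x, fitted):
    flag = "" if 0.0 <= fi <= 1.0 else "  <-- outside [0, 1]"
    print(f"x = {xi}   fitted = {fi:7.4f}{flag}")
```

With these data the fitted line drops below 0 at the smallest x and rises above 1 at the largest x, so it cannot be read as a probability there.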
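Non-normal error terms
The standard least squares assumption of normal errors is violated.
Consider the simple linear regression model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Therefore the error is $\varepsilon_i = Y_i - (\beta_0 + \beta_1 X_i)$.
Hence, if $Y_i = 1$, $\varepsilon_i = 1 - (\beta_0 + \beta_1 X_i)$ with probability $\pi_i$,
and if $Y_i = 0$, $\varepsilon_i = -(\beta_0 + \beta_1 X_i)$ with probability $1 - \pi_i$.
The error takes only two possible values at each $X_i$, so it cannot be normally distributed.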
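Unequal error variances
The variance of the random error is
$\mathrm{Var}(\varepsilon_i) = \pi_i (1 - \pi_i)$,
which changes with $X_i$ through $\pi_i$, so the errors do not have constant variance.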
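The variance formula follows from the two-point distribution of the error given under "Non-normal error terms". A short derivation, assuming the linear model holds so that $\pi_i = \beta_0 + \beta_1 X_i$:

```latex
\begin{align*}
E(\varepsilon_i) &= (1 - \pi_i)\,\pi_i + (-\pi_i)(1 - \pi_i) = 0, \\
\mathrm{Var}(\varepsilon_i) &= (1 - \pi_i)^2\,\pi_i + (-\pi_i)^2\,(1 - \pi_i) \\
  &= \pi_i (1 - \pi_i)\,\bigl[(1 - \pi_i) + \pi_i\bigr] = \pi_i (1 - \pi_i).
\end{align*}
```

The variance is largest at $\pi_i = 0.5$ and shrinks toward 0 as $\pi_i$ approaches 0 or 1.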
Logistic regression
The logistic regression model was originally developed for use in survival analysis, where the response is typically measured as 0 or 1, depending on whether the experimental unit (for example, a patient) “survives”.
Logistic regression
• The relationship between the binary response y and a single predictor variable x is curvilinear. The curvilinear pattern frequently encountered in practice is the S-shaped curve, and the model that accounts for this type of curvature is the logistic regression model.
• The probability of occurrence is fitted by a logistic function.
Logistic function (logistic curve)
A simple example: $Y(t) = \dfrac{1}{1 + e^{-t}}$
[Figure: the logistic function, an S-shaped curve of E(y) (vertical axis, 0 to 1.2) against x (horizontal axis, 0 to 10).]
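To get a feel for the S shape, the simple example $Y(t) = 1/(1 + e^{-t})$ can be evaluated on a small grid. This is a minimal sketch; the function name `logistic` and the grid of t values are chosen here for illustration.

```python
import math

def logistic(t: float) -> float:
    """Standard logistic function 1 / (1 + exp(-t)), computed stably."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    # For negative t use the equivalent form exp(t) / (1 + exp(t)),
    # so exp() never receives a large positive argument.
    e = math.exp(t)
    return e / (1.0 + e)

# Evaluate the curve at a few points on either side of t = 0.
for t in (-6, -2, 0, 2, 6):
    print(f"t = {t:3d}   Y(t) = {logistic(t):.4f}")
```

The values rise from near 0 to near 1 and satisfy $Y(-t) = 1 - Y(t)$, which is the symmetry of the curve about $t = 0$.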
The simple logistic regression model
The simple logistic regression model is:
$Y_i = E(Y_i) + \varepsilon_i$
and
$E(Y_i) = \pi_i = \dfrac{\exp(\beta_0 + \beta_1 X_i)}{1 + \exp(\beta_0 + \beta_1 X_i)}$
Note: the response $Y_i$ has a Bernoulli distribution.
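Multiple logistic regression model
The multiple logistic regression model is:
$E(Y) = \pi = \dfrac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}$
$= \dfrac{1}{1 + \exp[-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)]}$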
Multiple logistic regression model
Define the odds of the event (Y = 1) occurring as follows:
$\text{Odds} = \dfrac{\pi}{1 - \pi} = \dfrac{P(Y = 1)}{P(Y = 0)}$
Therefore,
$\text{odds} = \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)$
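The step from the definition of the odds to the exponential form is short. Writing $\eta$ for the linear predictor $\beta_0 + \beta_1 X_1 + \cdots$ (the symbol $\eta$ is shorthand introduced here, not notation from the slides), the algebra can be sketched as:

```latex
\begin{align*}
\pi &= \frac{e^{\eta}}{1 + e^{\eta}}, \qquad
1 - \pi = \frac{1}{1 + e^{\eta}}, \\
\frac{\pi}{1 - \pi} &= \frac{e^{\eta} / (1 + e^{\eta})}{1 / (1 + e^{\eta})} = e^{\eta}.
\end{align*}
```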
Interpretations of parameters in the logistic model
• Interpretation of $\beta_j$, keeping all $x_i$ fixed except $x_j$
• $\pi$ is decreasing in $x_j$ when $\beta_j < 0$ and increasing in $x_j$ when $\beta_j > 0$
o horizontal asymptotes at 0 and 1
o $\pi$ falls in [0, 1] over an unbounded range of $x_j$
• As $|\beta_j| \to 0$, the curve flattens to a horizontal straight line
o when $\beta_j = 0$, Y is independent of $x_j$
• $\pi$ approaches 0 and 1 at the same rate
o symmetric function
Interpretations of parameters in the logistic model
$\text{logit}(\pi) = \pi' = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$
where
$\pi' = \ln\left(\dfrac{\pi}{1 - \pi}\right) = \log(\text{odds})$
$\pi = P(Y = 1)$
$\beta_i$ = change in log-odds for every 1-unit increase in $X_i$, holding all other X’s fixed.
$100(e^{\beta_i} - 1)$ = percentage change in odds for every 1-unit increase in $X_i$, holding all other X’s fixed.
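Both interpretations come from comparing the odds at $X_i + 1$ with the odds at $X_i$ while every other predictor stays fixed. A brief derivation, where $\text{odds}(\cdot)$ is informal shorthand introduced here for the odds as a function of $X_i$:

```latex
\begin{align*}
\frac{\text{odds}(X_i + 1)}{\text{odds}(X_i)}
  &= \frac{\exp\bigl(\beta_0 + \cdots + \beta_i (X_i + 1) + \cdots\bigr)}
          {\exp\bigl(\beta_0 + \cdots + \beta_i X_i + \cdots\bigr)}
   = e^{\beta_i}, \\
100 \cdot \frac{\text{odds}(X_i + 1) - \text{odds}(X_i)}{\text{odds}(X_i)}
  &= 100\,(e^{\beta_i} - 1).
\end{align*}
```

Taking logs of the first line gives a change of exactly $\beta_i$ in the log-odds.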
Logit response function
$\hat{\pi}' = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$
is called the fitted logit response function.
Problems of least squares estimation
• The true probability $\pi_i$ is unknown. To estimate it, we must have replicated observations of the response at each combination of the levels of the independent variables. Thus, the least squares transformation approach is limited to replicated experiments, which occur infrequently in a practical business setting.
• The error variances are unequal.
Maximum likelihood estimation
Maximum likelihood estimators have several desirable properties, and the data need not be replicated to apply maximum likelihood estimation.
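The slides do not say how the likelihood is maximized. As an illustration only, the sketch below fits the simple logistic regression model by Newton-Raphson iterations on the Bernoulli log likelihood, one common way to compute maximum likelihood estimates. The data, the function name `fit_simple_logistic`, and the iteration settings are hypothetical; the x values are not replicated.

```python
import math

def fit_simple_logistic(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson MLE for pi_i = 1 / (1 + exp(-(b0 + b1 * x_i))).

    Returns (b0, b1, se0, se1, loglik). Illustrative sketch only.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        pi = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: first derivatives of the log likelihood.
        u0 = sum(yi - p for yi, p in zip(y, pi))
        u1 = sum(xi * (yi - p) for xi, yi, p in zip(x, y, pi))
        # Information matrix, weighted by the Bernoulli variance pi(1 - pi).
        w = [p * (1.0 - p) for p in pi]
        i00 = sum(w)
        i01 = sum(xi * wi for xi, wi in zip(x, w))
        i11 = sum(xi * xi * wi for xi, wi in zip(x, w))
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information (2x2) times the score.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (-i01 * u0 + i00 * u1) / det
        if max(abs(d0), abs(d1)) < tol:
            break  # converged: pi and the information match (b0, b1)
        b0, b1 = b0 + d0, b1 + d1
    loglik = sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
                 for yi, p in zip(y, pi))
    # Standard errors from the diagonal of the inverse information matrix.
    se0 = math.sqrt(i11 / det)
    se1 = math.sqrt(i00 / det)
    return b0, b1, se0, se1, loglik

# Hypothetical, unreplicated data invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1, se0, se1, loglik = fit_simple_logistic(x, y)
print(f"b0 = {b0:.4f} (SE {se0:.4f}), b1 = {b1:.4f} (SE {se1:.4f})")
print(f"log likelihood = {loglik:.4f}")
```

The standard errors returned here are the kind of quantity a Wald test uses, and the log likelihood is the kind of quantity a likelihood ratio test compares between models.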
Hypothesis testing
Effect of individual regressor $x_j$ ($H_0: \beta_j = 0$ vs $H_1: \beta_j \ne 0$)
Wald test (z)
Test statistic: $z = \hat{\beta}_j / SE(\hat{\beta}_j)$
Under $H_0$, $z \sim N(0, 1)$ approximately in large samples.
If $|z| > z_{\alpha/2}$, $H_0$ is rejected.
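The Wald test is simple enough to carry out by hand from any table of estimates and standard errors. The sketch below applies it to the estimate and standard error reported for X in the Example 1 SAS output; the function name `wald_test` is hypothetical.

```python
from statistics import NormalDist

def wald_test(estimate, std_error, alpha=0.05):
    """Two-sided Wald z test of H0: beta_j = 0."""
    z = estimate / std_error
    std_normal = NormalDist()  # N(0, 1)
    critical = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, critical, p_value, abs(z) > critical

# Estimate and standard error for X, taken from the Example 1 SAS output.
z, critical, p_value, reject = wald_test(0.0189, 0.00788)
print(f"z = {z:.4f}, z^2 = {z * z:.4f}, critical value = {critical:.4f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

SAS reports the squared statistic $z^2$ as the Wald chi-square. Computed from the rounded estimate and standard error, $z^2$ comes out close to, but not exactly equal to, the printed 5.7692.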
Hypothesis testing
Likelihood ratio test: $G^2 = -2(L_0 - L_1)$
On the overall adequacy of the model
$H_0: \beta_1 = \cdots = \beta_k = 0$ vs $H_1$: at least one $\beta_j \ne 0$
$L_1$: log likelihood for the full model
$L_0$: log likelihood for the intercept-only model
Under $H_0$, $G^2 \sim \chi^2_k$
If $G^2 > \chi^2_{k,\alpha}$, $H_0$ is rejected.
Hypothesis testing
Effect of individual regressor $x_j$ ($H_0: \beta_j = 0$ vs $H_1: \beta_j \ne 0$)
Likelihood ratio test ($G^2$)
Test statistic: $G^2 = -2(L_0 - L_1)$
$L_1$: log likelihood for the model with regressor $x_j$
$L_0$: log likelihood for the model without regressor $x_j$
Under $H_0$, $G^2 \sim \chi^2_1$
If $G^2 > \chi^2_{1,\alpha}$, $H_0$ is rejected.
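Both likelihood ratio tests compare $G^2$ with a chi-square distribution, which is equivalent to rejecting when the upper-tail p-value is below $\alpha$. The sketch below computes that tail probability with the standard library only and applies it to chi-square statistics printed in the SAS outputs of Examples 1 and 2. The function name `chi2_sf` is hypothetical, and the recurrence is one of several ways to evaluate the tail for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df d.f. > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for 1 and 2 degrees of freedom start the recurrence.
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    # sf_{k+2}(x) = sf_k(x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail

# Statistics and degrees of freedom as printed in the SAS outputs.
for label, stat, df in [
    ("Example 1 likelihood ratio", 8.1512, 1),
    ("Example 1 score", 7.3223, 1),
    ("Example 2 likelihood ratio", 29.8476, 3),
    ("Example 2 Wald", 19.9803, 3),
]:
    p = chi2_sf(stat, df)
    print(f"{label}: chi-square = {stat}, df = {df}, p = {p:.6f}, reject at 0.05: {p < 0.05}")
```

The computed p-values can be compared with the "Pr > ChiSq" column of the corresponding SAS tables.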
Example 1
A psychologist conducted a study to examine the nature of the
relation, if any, between an employee’s emotional stability (X) and the
employee’s ability to perform in a task group (Y). Emotional stability
was measured by a written test for which the higher the score, the
greater is the emotional stability. Ability to perform in a task group
(Y=1 if able, Y=0 if unable) was evaluated by the supervisor. The
results of 27 employees were used for analysis with a logistic
regression model.
Example 1 (SAS output)

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       8.1512     1        0.0043
Score                  7.3223     1        0.0068
Wald                   5.7692     1        0.0163

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1    -10.3089            4.3770             5.5472        0.0185
X             1      0.0189           0.00788             5.7692        0.0163
Example 1
1. Conduct a test of model adequacy and state the hypotheses clearly. Use $\alpha = 0.05$.
2. State the fitted response function.
3. What is the estimated probability that employees with an emotional stability test score of 550 will be able to perform in a task group?
4. Estimate the emotional stability test score for which 70 percent of the employees with this test score are expected to be able to perform in a task group.
Example 1 (solution)
1. The hypotheses are
$H_0: \beta_1 = 0$ vs $H_1: \beta_1 \ne 0$
By the likelihood ratio test, the test statistic is $\chi^2 = 8.1512$ with p-value = 0.0043. Since the p-value is below $\alpha = 0.05$, there is sufficient evidence to reject the null hypothesis, and we conclude that the model is useful.
2. The estimated logistic function is
$\hat{\pi} = [1 + \exp(10.3089 - 0.0189X)]^{-1}$
3. For X = 550,
$\hat{\pi} = [1 + \exp(10.3089 - 0.0189(550))]^{-1} = 0.5215$
4. For $\hat{\pi} = 0.7$,
$0.7 = [1 + \exp(10.3089 - 0.0189X)]^{-1}$
Therefore,
$X = \dfrac{10.3089 - \ln\left[\frac{1}{0.7} - 1\right]}{0.0189} = 590.275$
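The arithmetic in parts 3 and 4 can be checked with a few lines of code. This is a sketch that uses the estimates from the SAS output; the function names `fitted_probability` and `score_for_probability` are hypothetical.

```python
import math

# Estimates from the Example 1 SAS output.
B0, B1 = -10.3089, 0.0189

def fitted_probability(x):
    """Fitted response function: 1 / (1 + exp(-(B0 + B1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * x)))

def score_for_probability(p):
    """Invert the fitted logit: ln(p / (1 - p)) = B0 + B1 * x."""
    return (math.log(p / (1.0 - p)) - B0) / B1

print(f"Estimated probability at X = 550: {fitted_probability(550):.4f}")
print(f"Score giving probability 0.7:     {score_for_probability(0.7):.3f}")
```

The inversion uses $\ln[\pi/(1-\pi)]$, which equals $-\ln[1/\pi - 1]$, so it matches the formula in part 4 and reproduces the values 0.5215 and 590.275.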
Example 2
A local health clinic sent fliers to its clients to encourage everyone,
but especially older persons at high risk of complications, to get a flu
shot in time for protection against an expected flu epidemic. In a
pilot follow-up study, 159 clients were randomly selected and asked
whether they actually received a flu shot. A client who received a flu
shot was coded Y = 1, and a client who did not receive a flu shot was
coded Y = 0. In addition, data were collected on their age (X1) and their
health awareness index (X2), for which higher values
indicate greater awareness. Also included in the data was client
gender, where males were coded X3 = 1 and females were coded
X3 = 0.
Example 2 (SAS output)

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square    DF    Pr > ChiSq
Likelihood Ratio      29.8476     3        <.0001
Score                 27.0173     3        <.0001
Wald                  19.9803     3        0.0002

Odds Ratio Estimates

Effect    Point Estimate    95% Wald Confidence Limits
X1                 1.076         1.013          1.141
X2                 0.906         0.848          0.967
X3                 1.543         0.555          4.291

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1     -1.1772            2.9824             0.1558        0.6930
X1            1      0.0728            0.0304             5.7401        0.0166
X2            1     -0.0990            0.0335             8.7419        0.0031
X3            1      0.4339            0.5218             0.6917        0.4056
Example 2
1. State the fitted response function.
2. Obtain Exp(b1), Exp(b2), Exp(b3). Interpret these numbers.
3. What is the estimated probability that male clients aged 55 with a health awareness index of 60 will receive a flu shot?
Example 2 (solution)
1. Let the probability that a client received a flu shot be E(Y) = $\pi$. The fitted response function is
$\hat{\pi} = [1 + \exp(1.1772 - 0.0728X_1 + 0.0990X_2 - 0.4339X_3)]^{-1}$
2. $\exp(b_1) = \exp(0.0728) = 1.076$
The odds are multiplied by 1.076, that is, the odds increase by 7.6% for every 1-unit increase in age ($X_1$), holding $X_2$ and $X_3$ fixed.
$\exp(b_2) = \exp(-0.0990) = 0.906$
The odds are multiplied by 0.906, that is, the odds decrease by 9.4% for every 1-unit increase in the health awareness index ($X_2$), holding $X_1$ and $X_3$ fixed.
$\exp(b_3) = \exp(0.4339) = 1.543$
The odds for a male client are 1.543 times the odds for a female client, that is, 54.3% higher, holding $X_1$ and $X_2$ fixed.
3. For $X_1 = 55$, $X_2 = 60$, $X_3 = 1$, the estimated probability is
$\hat{\pi} = [1 + \exp(1.1772 - 0.0728(55) + 0.0990(60) - 0.4339(1))]^{-1} = 0.0642$
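The odds ratios, percentage changes, and the estimated probability can be recomputed from the coefficient table as a check. This is a sketch that uses the estimates from the Example 2 SAS output; the variable and function names are hypothetical.

```python
import math

# Estimates from the Example 2 SAS output.
b0, b1, b2, b3 = -1.1772, 0.0728, -0.0990, 0.4339

# Odds ratio exp(b) and percentage change in odds 100 * (exp(b) - 1).
for name, b in (("X1", b1), ("X2", b2), ("X3", b3)):
    odds_ratio = math.exp(b)
    pct_change = 100.0 * (odds_ratio - 1.0)
    print(f"{name}: exp(b) = {odds_ratio:.3f}, change in odds = {pct_change:+.1f}%")

def fitted_probability(x1, x2, x3):
    """Fitted response function evaluated at (x1, x2, x3)."""
    eta = b0 + b1 * x1 + b2 * x2 + b3 * x3  # fitted logit
    return 1.0 / (1.0 + math.exp(-eta))

# Male client (X3 = 1), aged 55, health awareness index 60.
p = fitted_probability(55, 60, 1)
print(f"Estimated probability of a flu shot: {p:.4f}")
```

The exponentiated coefficients agree with the "Odds Ratio Estimates" table in the SAS output, and the probability agrees with the value 0.0642 in part 3.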
