CUHK STAT5102 Ch7


2016S_STAT5102_L7 |Department of Statistics, The Chinese University of Hong Kong

Logistic regression and piecewise linear regression

In this chapter, we shall cover

•Logistic regression
Binary response variable
Simple logistic regression
Multiple logistic regression
Example

Binary response variable

The response variable has only two possible qualitative outcomes, and therefore can be represented by a binary indicator variable taking on values 0 and 1 (binary responses or dichotomous responses).

Meaning of response function when outcome variable is binary

Assume a binary response, and suppose the simple linear regression model is used:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

$Y_i = 0, 1$
Special problems when response variable is binary

(when linear model is used)

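Let $\pi_i$ be the probability that $Y_i = 1$.

• $E(Y_i) = (0)P(Y_i = 0) + (1)P(Y_i = 1) = P(Y_i = 1) = \pi_i$

• But from the linear model, we have
o $E(Y_i) = \beta_0 + \beta_1 X_i$
o Therefore $\beta_0 + \beta_1 X_i = \pi_i$, which is questionable
o The constraint $0 \le \pi_i \le 1$ is not feasible over a large range of X, because a linear function of X is unbounded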
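The range problem can be seen numerically. The following is a minimal sketch in which the coefficients `b0` and `b1` are hypothetical values chosen only for illustration; they do not come from the lecture.

```python
# Hypothetical coefficients for a linear model E(Y) = b0 + b1 * X
# (illustrative values, not taken from the lecture).
b0, b1 = -0.5, 0.02


def linear_mean(x):
    """Mean response under the linear model."""
    return b0 + b1 * x


# Evaluate the fitted mean over a wide range of X and flag values
# that cannot be probabilities.
for x in (0, 25, 50, 75, 100):
    p = linear_mean(x)
    status = "valid" if 0.0 <= p <= 1.0 else "not a probability"
    print(f"X = {x:3d}: E(Y) = {p:5.2f} ({status})")
```

With these assumed coefficients the fitted mean is negative at X = 0 and exceeds 1 at X = 100, so it cannot be read as a probability at either end.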

Non-normal error terms


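The standard least squares assumption of normal errors is violated. Consider the simple linear regression model

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

Therefore the error is $\varepsilon_i = Y_i - (\beta_0 + \beta_1 X_i)$.

Hence, if $Y_i = 1$, $\varepsilon_i = 1 - (\beta_0 + \beta_1 X_i)$ with probability $\pi_i$.
And if $Y_i = 0$, $\varepsilon_i = -(\beta_0 + \beta_1 X_i)$ with probability $1 - \pi_i$.

The error takes only two possible values at each $X_i$, so it cannot be normally distributed.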

Unequal error variances

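The variance of the random error is

$\operatorname{Var}(\varepsilon_i) = \pi_i(1 - \pi_i)$

which changes with $X_i$ through $\pi_i$, so the errors do not have constant variance.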
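The variance expression follows from the two-point distribution of the response. A short derivation, using only the definitions above and the fact that $Y_i^2 = Y_i$ for a 0/1 variable:

```latex
\begin{aligned}
\operatorname{Var}(\varepsilon_i)
  &= \operatorname{Var}(Y_i) = E(Y_i^2) - [E(Y_i)]^2 \\
  &= \pi_i - \pi_i^2 = \pi_i(1 - \pi_i) \\
  &= (\beta_0 + \beta_1 X_i)(1 - \beta_0 - \beta_1 X_i)
\end{aligned}
```

The last line uses the linear model for $\pi_i$ and makes the dependence on $X_i$ explicit.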

Logistic regression

The logistic regression model was originally developed for use in survival analysis, where the response is typically measured as 0 or 1, depending on whether the experimental unit (for example, a patient) “survives”.

• The relationship between the binary response y and a single predictor variable x is curvilinear. The curvilinear pattern most frequently encountered in practice is the S-shaped curve, and the model that accounts for this type of curvature is the logistic regression model.
• The probability of occurrence is modeled with a logistic function.
Logistic function (logistic curve)

A simple example: $Y(t) = \dfrac{1}{1 + e^{-t}}$
[Figure: logistic function, an S-shaped curve of E(y) plotted against x]
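The S shape can be checked by tabulating the standard logistic function at a few points. This is a minimal sketch; the evaluation points are arbitrary.

```python
import math


def logistic(t):
    """Standard logistic function 1 / (1 + e^(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


# The values rise monotonically from near 0 to near 1 and pass
# through 0.5 at t = 0.
for t in (-4, -2, 0, 2, 4):
    print(f"t = {t:2d}: Y(t) = {logistic(t):.4f}")
```

The function is symmetric about t = 0 in the sense that `logistic(t) + logistic(-t)` equals 1.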
The simple logistic regression model

The simple logistic regression model is:

$Y_i = E(Y_i) + \varepsilon_i$

and

$E(Y_i) = \pi_i = \dfrac{\exp(\beta_0 + \beta_1 X_i)}{1 + \exp(\beta_0 + \beta_1 X_i)}$

Note: the response $Y_i$ has a Bernoulli distribution.

Multiple logistic regression model

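The multiple logistic regression model is:

$E(Y) = \pi = \dfrac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}$

$= \dfrac{1}{1 + \exp[-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)]}$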


Define the odds of the event (Y=1) occurring as follows:

$\text{odds} = \dfrac{\pi}{1 - \pi} = \dfrac{P(Y = 1)}{P(Y = 0)}$

Therefore,

$\text{odds} = \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)$
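The step from the response function to the odds can be written out as follows. The symbol $\eta$ is shorthand introduced here for the linear predictor; it is not notation from the slides.

```latex
\begin{aligned}
\eta &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \\
\pi &= \frac{e^{\eta}}{1 + e^{\eta}}, \qquad
1 - \pi = \frac{1}{1 + e^{\eta}} \\
\frac{\pi}{1 - \pi} &= e^{\eta}
\end{aligned}
```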

Interpretations of parameters in the logistic model

• Interpretation of $\beta_j$, keeping all $x_i$ fixed except $x_j$
• $\pi$ is decreasing in $x_j$ when $\beta_j < 0$ and increasing when $\beta_j > 0$
o horizontal asymptotes at 0 and 1
o $\pi$ stays within [0, 1] over an unbounded range of $x_j$

[Figure: curves of $\pi$ against $x_j$ for $\beta_j < 0$ and $\beta_j > 0$]

• As $|\beta_j| \to 0$, the curve flattens to a horizontal straight line
o When $\beta_j = 0$, Y is independent of $x_j$
• $\pi$ approaches 0 and 1 at the same rate
o symmetric function

$\text{logit}(\pi) = \pi' = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$

where

$\pi' = \ln\left(\dfrac{\pi}{1 - \pi}\right) = \log(\text{odds})$
$\pi = P(Y = 1)$

$\beta_i$ = change in log-odds for every 1-unit increase in $X_i$, holding all other X’s fixed.
$100(e^{\beta_i} - 1)$ = percentage change in odds for every 1-unit increase in $X_i$, holding all other X’s fixed.
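The percentage-change formula is easy to evaluate directly. In this minimal sketch the coefficient values are hypothetical and serve only to show the sign behaviour.

```python
import math


def percent_change_in_odds(beta):
    """Percentage change in the odds for a 1-unit increase in the regressor."""
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical coefficients: negative, zero and positive.
for beta in (-0.5, 0.0, 0.5):
    print(f"beta = {beta:4.1f}: odds change by {percent_change_in_odds(beta):6.2f}%")
```

A negative coefficient lowers the odds, a zero coefficient leaves them unchanged, and a positive coefficient raises them.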

Logit response function

$\hat{\pi}' = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$

is called the fitted logit response function.

Problems of Least squares estimation

• The true probability $\pi_i$ is unknown. In order to estimate it, we must have replicated observations of the response at each combination of the levels of the independent variables. Thus, the least squares transformation approach is limited to replicated experiments, which occur infrequently in a practical business setting.
• Unequal error variances.
Maximum likelihood estimation
Maximum likelihood estimators have several desirable properties, and the data need not be replicated to apply maximum likelihood estimation.
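The slides do not say how the estimates are computed. One common approach is Newton-Raphson iteration on the Bernoulli log-likelihood, sketched below for the simple logistic model. The data are hypothetical and unreplicated, and the function names are illustrative; this is not the lecture's or SAS's implementation.

```python
import math

# Hypothetical, unreplicated and non-separable data (not from the lecture).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]


def fitted_probs(b0, b1):
    """Fitted probabilities under the simple logistic model."""
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]


def score(b0, b1):
    """Gradient of the log-likelihood with respect to (b0, b1)."""
    p = fitted_probs(b0, b1)
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    return g0, g1


def newton_raphson(iterations=25):
    """Maximize the log-likelihood by Newton-Raphson, starting from (0, 0)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = fitted_probs(b0, b1)
        g0, g1 = score(b0, b1)
        # Information matrix entries with weights w_i = p_i * (1 - p_i).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the score.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
    return b0, b1


b0_hat, b1_hat = newton_raphson()
print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}")
```

At the returned estimates the score is numerically zero, which is the defining condition of the maximum likelihood solution. No observation is replicated in this data set, yet the fit still goes through.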
Hypothesis testing
Effect of individual regressor $x_j$ ($H_0: \beta_j = 0$ vs $H_1: \beta_j \ne 0$)

Wald test (z)

Test statistic: $z = \hat{\beta}_j / SE(\hat{\beta}_j)$
Under $H_0$, $z \sim N(0, 1)$
If $|z| > z_{\alpha/2}$, $H_0$ is rejected
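The Wald test can be sketched with the standard library alone, using the identity that the two-sided standard normal tail probability equals erfc(|z| / sqrt(2)). The estimate and standard error below are hypothetical inputs, and the function name is illustrative.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided p-value under N(0, 1)."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical estimate and standard error.
z, p = wald_test(0.5, 0.2)
print(f"z = {z:.3f}, p-value = {p:.4f}")
# At alpha = 0.05 the critical value z_{0.025} is about 1.96.
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```

SAS reports the square of this statistic as the Wald chi-square with 1 degree of freedom, which leads to the same p-value.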


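Likelihood ratio test: $G^2 = -2(L_0 - L_1)$

On the overall adequacy of the model

$H_0: \beta_1 = \cdots = \beta_k = 0$ vs $H_1$: at least one $\beta_j \ne 0$

$L_1$: log likelihood for the full model
$L_0$: log likelihood for the intercept-only model
Under $H_0$, $G^2 \sim \chi^2_k$
If $G^2 > \chi^2_{k,\alpha}$, $H_0$ is rejected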


Effect of individual regressor $x_j$ ($H_0: \beta_j = 0$ vs $H_1: \beta_j \ne 0$)

Likelihood ratio test ($G^2$)

Test statistic: $G^2 = -2(L_0 - L_1)$
$L_1$: log likelihood for the model with regressor $x_j$
$L_0$: log likelihood for the model without regressor $x_j$
Under $H_0$, $G^2 \sim \chi^2_1$
If $G^2 > \chi^2_{1,\alpha}$, $H_0$ is rejected
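For the 1-degree-of-freedom case the p-value can be computed without a statistics library, because a chi-square variable with 1 df is the square of a standard normal. The sketch below assumes the two log likelihoods are already available; the numbers passed in are hypothetical and the function name is illustrative.

```python
import math


def lr_test_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio statistic G2 and its p-value on 1 degree of freedom."""
    g2 = -2.0 * (loglik_reduced - loglik_full)
    # For 1 df: P(chi2_1 > g2) = P(|Z| > sqrt(g2)) = erfc(sqrt(g2 / 2)).
    p_value = math.erfc(math.sqrt(g2 / 2.0))
    return g2, p_value


# Hypothetical log likelihoods for the models without and with the regressor.
g2, p = lr_test_one_df(-18.0, -14.0)
print(f"G2 = {g2:.4f}, p-value = {p:.4f}")
# The 5% critical value of chi-square with 1 df is about 3.841.
print("reject H0" if g2 > 3.841 else "do not reject H0")
```

Tests with more than one degree of freedom need a general chi-square tail function, which this sketch does not provide.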

Example 1
A psychologist conducted a study to examine the nature of the
relation, if any, between an employee’s emotional stability (X) and the
employee’s ability to perform in a task group (Y). Emotional stability
was measured by a written test for which the higher the score, the
greater is the emotional stability. Ability to perform in a task group
(Y=1 if able, Y=0 if unable) was evaluated by the supervisor. The
results of 27 employees were used for analysis with a logistic
regression model.
Example 1 (SAS output)

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio        8.1512     1        0.0043
Score                   7.3223     1        0.0068
Wald                    5.7692     1        0.0163

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1    -10.3089            4.3770             5.5472        0.0185
X             1      0.0189           0.00788             5.7692        0.0163

1. Conduct a test of model adequacy and state the hypotheses clearly. Use $\alpha = 0.05$.
2. State the fitted response function.
3. What is the estimated probability that employees with an emotional stability test score of 550 will be able to perform in a task group?
4. Estimate the emotional stability test score for which 70 percent of the employees with this test score are expected to be able to perform in a task group.
Example 1 (solution)

1. The hypotheses are

$H_0: \beta_1 = 0$ vs $H_1: \beta_1 \ne 0$

By the likelihood ratio test, the test statistic is $\chi^2 = 8.1512$ with p-value = 0.0043. Since the p-value is below $\alpha = 0.05$, there is sufficient evidence to reject the null hypothesis, and we conclude that the model is useful.

2. The estimated logistic function is

$\hat{\pi} = [1 + \exp(10.3089 - 0.0189X)]^{-1}$

3. For X = 550,

$\hat{\pi} = [1 + \exp(10.3089 - 0.0189(550))]^{-1} = 0.5215$

4. For $\hat{\pi} = 0.7$,

$0.7 = [1 + \exp(10.3089 - 0.0189X)]^{-1}$

Therefore,

$X = \dfrac{10.3089 - \ln\left[\dfrac{1}{0.7} - 1\right]}{0.0189} = 590.275$
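The answers to questions 3 and 4 can be checked numerically. The sketch below uses the rounded estimates from the SAS output; the function names are illustrative.

```python
import math

# Estimates from the SAS output for Example 1 (as printed, i.e. rounded).
b0, b1 = -10.3089, 0.0189


def prob(x):
    """Estimated probability of being able to perform at test score x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def score_for_prob(p):
    """Invert the fitted logit: b0 + b1 * X = ln(p / (1 - p))."""
    return (math.log(p / (1.0 - p)) - b0) / b1


print(f"Estimated probability at X = 550: {prob(550):.4f}")
print(f"Test score giving probability 0.7: {score_for_prob(0.7):.3f}")
```

The inverse uses ln(p / (1 - p)), which equals the negative of ln(1/p - 1) in the solution formula, so the two expressions give the same score.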
Example 2
A local health clinic sent fliers to its clients to encourage everyone, but especially older persons at high risk of complications, to get a flu shot in time for protection against an expected flu epidemic. In a pilot follow-up study, 159 clients were randomly selected and asked whether they actually received a flu shot. A client who received a flu shot was coded Y = 1, and a client who did not receive a flu shot was coded Y = 0. In addition, data were collected on their age (X1) and their health awareness index (X2), for which higher values indicate greater awareness. Also included in the data was client gender, where males were coded X3 = 1 and females were coded X3 = 0.
Example 2 (SAS output)

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       29.8476     3        <.0001
Score                  27.0173     3        <.0001
Wald                   19.9803     3        0.0002

Odds Ratio Estimates

Effect    Point Estimate    95% Wald Confidence Limits
X1                 1.076             1.013       1.141
X2                 0.906             0.848       0.967
X3                 1.543             0.555       4.291

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1     -1.1772            2.9824             0.1558        0.6930
X1            1      0.0728            0.0304             5.7401        0.0166
X2            1     -0.0990            0.0335             8.7419        0.0031
X3            1      0.4339            0.5218             0.6917        0.4056

1. State the fitted response function.
2. Obtain Exp(b1), Exp(b2), Exp(b3). Interpret these numbers.
3. What is the estimated probability that male clients aged 55 with a health awareness index of 60 will receive a flu shot?
Example 2 (solution)

1. Let E(Y) = $\pi$ be the probability that a client received a flu shot. The fitted response function is

$\hat{\pi} = [1 + \exp(1.1772 - 0.0728X_1 + 0.0990X_2 - 0.4339X_3)]^{-1}$

2. Exp(b1) = Exp(0.0728) = 1.076

The odds are multiplied by 1.076, that is, they increase by 7.6%, for every 1-unit increase in age (X1), holding X2 and X3 fixed.

Exp(b2) = Exp(-0.0990) = 0.906

The odds are multiplied by 0.906, that is, they decrease by 9.4%, for every 1-unit increase in the health awareness index (X2), holding X1 and X3 fixed.

Exp(b3) = Exp(0.4339) = 1.543

The odds for a male client are 1.543 times the odds for a female client, that is, 54.3% higher, holding X1 and X2 fixed.
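The three odds ratios can be recomputed from the coefficient estimates in the SAS output. This is a minimal sketch; the dictionary names are illustrative.

```python
import math

# Coefficient estimates from the SAS output for Example 2.
estimates = {"X1": 0.0728, "X2": -0.0990, "X3": 0.4339}

# Odds ratio for a 1-unit increase in each regressor.
odds_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, ratio in odds_ratios.items():
    change = 100.0 * (ratio - 1.0)
    print(f"{name}: odds ratio = {ratio:.3f}, change in odds = {change:+.1f}%")
```

Up to rounding, the results agree with the point estimates in the Odds Ratio Estimates table.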

3. For X1 = 55, X2 = 60, X3 = 1, the estimated probability is

$\hat{\pi} = [1 + \exp(1.1772 - 0.0728(55) + 0.0990(60) - 0.4339(1))]^{-1} = 0.0642$
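The estimated probability can be verified by evaluating the fitted multiple logistic function. The sketch uses the estimates from the SAS output; the function name is illustrative.

```python
import math

# Estimates from the SAS output for Example 2: intercept, X1, X2, X3.
b = [-1.1772, 0.0728, -0.0990, 0.4339]


def prob(age, awareness, male):
    """Estimated probability of receiving a flu shot."""
    eta = b[0] + b[1] * age + b[2] * awareness + b[3] * male
    return 1.0 / (1.0 + math.exp(-eta))


print(f"Male, age 55, awareness 60: {prob(55, 60, 1):.4f}")
print(f"Female, age 55, awareness 60: {prob(55, 60, 0):.4f}")
```

The female case is included only for comparison. Its probability is lower because the coefficient of X3 is positive.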
