Lecture 4

Introduction to Mantel-Haenszel estimate
• The two examples in Lecture 3 illustrated why the
marginal (crude) odds ratio is not appropriate for
examining the association of the exposure variable (x)
with the response variable (y), and why conditional
(adjusted) odds ratios are needed. The population
parameters of interest are therefore the conditional
(adjusted) odds ratios rather than the marginal (crude)
odds ratio.
• Q: The data in both examples in Lecture 3 are
population data, so the conditional (adjusted)
odds ratios are known exactly. In practice, one
has only sample data. As before, the natural
question is how to estimate the conditional
(adjusted) odds ratios from sample data.
• The answer depends on whether the conditional
(adjusted) odds ratios differ across the levels
of z:
– Case 1: the conditional (adjusted) odds ratios
differ. In this case z is called an effect
modifier.
– Case 2: the conditional (adjusted) odds ratios
are the same.
• In case 1, the conditional (adjusted) odds
ratio that corresponds to a specific level of z
can be estimated using the 2 by 2 table of y
and x that corresponds to that level of z.
• In case 2, as in both examples in Lecture 3,
the common conditional (adjusted) odds
ratio can be estimated by the so-called
Mantel-Haenszel estimate of the common odds ratio.
• Let $\theta$ denote the common conditional (adjusted)
odds ratio. An obvious ad hoc approach to
estimating $\theta$ is to use just one of the
z-level-specific 2 by 2 tables of y and x.
• The disadvantage of this ad hoc approach is that
it fails to use all the data on y and x,
which leads to a loss of efficiency, i.e., a wider
confidence interval.
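To make the efficiency loss concrete, here is a minimal Python sketch. It computes the odds ratio and a 95% Wald (Woolf) interval from a single stratum, using the first center of the eight-center smoking data analyzed later in this lecture, and compares the interval width with the pooled Mantel-Haenszel limits reported there by SAS (1.9840, 2.3832). The Woolf standard error and the function name `single_table_or_ci` are choices made for this illustration and are not part of the original slides.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def single_table_or_ci(n11, n10, n01, n00, level=0.95):
    """Odds ratio and Wald (Woolf) interval from one 2 by 2 table."""
    log_or = log(n11 * n00 / (n10 * n01))
    # Woolf standard error of the log odds ratio (an assumption of this sketch).
    se = sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)


# Center 1 only: smokers (cancer, no cancer), non-smokers (cancer, no cancer).
or_1, lo_1, hi_1 = single_table_or_ci(126, 100, 35, 61)

# Pooled Mantel-Haenszel limits copied from the SAS output in this lecture.
pooled_lo, pooled_hi = 1.9840, 2.3832

print(f"center 1 only: OR = {or_1:.3f}, 95% CI = ({lo_1:.3f}, {hi_1:.3f})")
print(f"width: single stratum {hi_1 - lo_1:.3f}, pooled {pooled_hi - pooled_lo:.3f}")
```

The single-stratum interval is several times wider than the pooled one, which is the loss of efficiency described above.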
• Q: How can the 2 by 2 tables be combined to
estimate $\theta$?
• A: There are three types of estimates of $\theta$:
– Maximum likelihood estimate
– Mantel-Haenszel estimate
– Logit estimate
Likelihood estimate of common odds ratio
• The maximum likelihood estimate of $\theta$ is not
covered here, since it will be discussed in the
logistic regression context.
Mantel-Haenszel estimate of common odds ratio
• Mantel and Haenszel (1959) proposed a
computationally simpler estimate of the
common conditional (adjusted) odds ratio,
called the Mantel-Haenszel estimate of the
common odds ratio.
• To express this estimate, we need some
notation.
• The individual data on (y, x, z) can be represented
by r 2 by 2 tables of y and x, with the k-th
table corresponding to the k-th level of z. The k-th
table is denoted as follows:

            y = 1     y = 0     Total
  x = 1     n_k11     n_k10     n_k1
  x = 0     n_k01     n_k00     n_k0
  Total     m_k1      m_k0      n_k
• The Mantel-Haenszel estimate of the common odds
ratio takes the form:
$$\hat\theta_{MH} = \frac{\sum_{k=1}^{r} n_{k11}\, n_{k00} / n_k}{\sum_{k=1}^{r} n_{k10}\, n_{k01} / n_k}$$
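The Mantel-Haenszel formula can be checked by hand with a few lines of Python. The sketch below is illustrative only: it applies the formula to the eight-center smoking and lung cancer counts used in the SAS example of this lecture, with each table stored as (n_k11, n_k10, n_k01, n_k00). The names `TABLES` and `mantel_haenszel_or` are hypothetical.

```python
# Each tuple is one center: (n11, n10, n01, n00) =
# (smoker & cancer, smoker & no cancer, non-smoker & cancer, non-smoker & no cancer).
TABLES = [
    (126, 100, 35, 61),
    (908, 688, 497, 807),
    (913, 747, 336, 598),
    (235, 172, 58, 121),
    (402, 308, 121, 215),
    (182, 156, 72, 98),
    (60, 99, 11, 43),
    (104, 89, 21, 36),
]


def mantel_haenszel_or(tables):
    """Mantel-Haenszel estimate: sum(n11*n00/n) / sum(n10*n01/n)."""
    numerator = sum(n11 * n00 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in tables)
    denominator = sum(n10 * n01 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in tables)
    return numerator / denominator


theta_mh = mantel_haenszel_or(TABLES)
print(f"Mantel-Haenszel common odds ratio: {theta_mh:.4f}")
```

The result agrees with the value 2.1745 that SAS reports for these data.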
• The Mantel-Haenszel estimate of the common odds ratio can be
viewed as a weighted average of the individual odds
ratios:
$$\hat\theta_{MH} = \sum_{k=1}^{r} w_k\, \hat\theta_k$$
where
$$w_k = \frac{n_{k10}\, n_{k01} / n_k}{\sum_{j=1}^{r} n_{j10}\, n_{j01} / n_j}$$
is the weight associated with the k-th odds ratio
estimate, $\hat\theta_k = n_{k11}\, n_{k00} / (n_{k10}\, n_{k01})$. The weight $w_k$ is
approximately proportional to the inverse variance of $\hat\theta_k$
when $\theta$ is near 1.
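The weighted-average form can be verified numerically. The following Python sketch is an illustration, not part of the original slides. It computes the weights w_k = (n_k10 n_k01 / n_k) / sum_j (n_j10 n_j01 / n_j) for the eight-center data used in this lecture and confirms that the weighted average of the stratum odds ratios equals the ratio-of-sums form. All names are hypothetical.

```python
# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]

# Unnormalized weights n10*n01/n, then scaled so that they sum to one.
raw = [n10 * n01 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in TABLES]
weights = [value / sum(raw) for value in raw]

# Stratum-specific odds ratios n11*n00 / (n10*n01).
stratum_ors = [n11 * n00 / (n10 * n01) for n11, n10, n01, n00 in TABLES]

weighted_average = sum(w * t for w, t in zip(weights, stratum_ors))

# Ratio-of-sums form of the same estimate, for comparison.
ratio_of_sums = (
    sum(n11 * n00 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in TABLES)
    / sum(raw)
)

for k, (w, t) in enumerate(zip(weights, stratum_ors), start=1):
    print(f"center {k}: weight = {w:.3f}, odds ratio = {t:.3f}")
print(f"weighted average = {weighted_average:.4f}, ratio of sums = {ratio_of_sums:.4f}")
```

The two forms agree because each term w_k times the k-th odds ratio reduces to (n_k11 n_k00 / n_k) divided by the common denominator.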
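/* One record per cell: center (stratum z), smoke (exposure x),
   cancer (response y), and the cell count */
data cmh;
input center smoke cancer count @@;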
cards;
1 1 1 126 1 1 0 100 1 0 1 35 1 0 0 61
2 1 1 908 2 1 0 688 2 0 1 497 2 0 0 807
3 1 1 913 3 1 0 747 3 0 1 336 3 0 0 598
4 1 1 235 4 1 0 172 4 0 1 58 4 0 0 121
5 1 1 402 5 1 0 308 5 0 1 121 5 0 0 215
6 1 1 182 6 1 0 156 6 0 1 72 6 0 0 98
7 1 1 60 7 1 0 99 7 0 1 11 7 0 0 43
8 1 1 104 8 1 0 89 8 0 1 21 8 0 0 36
;
run;
/* order=data keeps smoke=1 as row 1 and cancer=1 as column 1
   in the Row1/Row2 estimates */
proc freq data=cmh order=data;
weight count;
table center*smoke*cancer / cmh;
run;
Note
• The CMH option requests the Mantel-Haenszel
and logit estimates of the common odds ratio
and relative risks with the corresponding
confidence intervals, together with the
Breslow-Day and Cochran-Mantel-Haenszel
tests and their p-values.
Estimates of the Common Relative Risk (Row1/Row2)
Type of Study    Method             Value     95% Confidence Limits
-------------------------------------------------------------------
Case-Control     Mantel-Haenszel    2.1745    1.9840    2.3832
(Odds Ratio)     Logit              2.1734    1.9829    2.3823
Cohort           Mantel-Haenszel    1.5192    1.4417    1.6008
(Col1 Risk)      Logit              1.5132    1.4362    1.5942
Cohort           Mantel-Haenszel    0.6999    0.6721    0.7290
(Col2 Risk)      Logit              0.7011    0.6734    0.7300
Asymptotic Confidence Interval for common odds ratio
• Fact: When the $n_k$ are sufficiently large,
$\log(\hat\theta_{MH})$ is approximately normally
distributed with mean
$$E(\log\hat\theta_{MH}) = \log\theta$$
and standard error $\sigma(\log\hat\theta_{MH})$, whose formula is
too complicated to give here.
• The asymptotic confidence interval for
$\log\theta$ is
$$\log(\hat\theta_{MH}) \pm z_{1-\alpha/2}\, \sigma(\log\hat\theta_{MH})$$
• The asymptotic confidence interval for $\theta$ can
be obtained by exponentiating the endpoints of
the above confidence interval.
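The slides do not give the standard error of the log Mantel-Haenszel estimate. For readers who want to reproduce the interval, the Python sketch below assumes the Robins-Breslow-Greenland variance estimator, which is the one documented for this interval in SAS PROC FREQ. That choice and the names `TABLES` and `mh_or_with_ci` are assumptions of this illustration, not part of the lecture.

```python
from math import exp, sqrt
from statistics import NormalDist

# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def mh_or_with_ci(tables, level=0.95):
    """Mantel-Haenszel odds ratio with an interval built on the log scale.

    Assumes the Robins-Breslow-Greenland variance of log(theta_MH).
    """
    sum_r = sum_s = sum_pr = sum_ps_qr = sum_qs = 0.0
    for n11, n10, n01, n00 in tables:
        n = n11 + n10 + n01 + n00
        r, s = n11 * n00 / n, n10 * n01 / n      # numerator and denominator terms
        p, q = (n11 + n00) / n, (n10 + n01) / n  # diagonal and off-diagonal shares
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s
    theta = sum_r / sum_s
    var_log = (sum_pr / (2 * sum_r ** 2)
               + sum_ps_qr / (2 * sum_r * sum_s)
               + sum_qs / (2 * sum_s ** 2))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(var_log)
    # Exponentiate the endpoints of the interval for log(theta).
    return theta, theta * exp(-half_width), theta * exp(half_width)


theta_mh, lower, upper = mh_or_with_ci(TABLES)
print(f"MH odds ratio = {theta_mh:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Under that assumption the sketch reproduces the SAS limits 1.9840 and 2.3832 for these data.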
Logit Estimate of Common Odds Ratio
• An alternative estimate of the common odds ratio is
the logit estimate. The idea is to estimate
$\beta = \log(\theta)$ first, and then exponentiate it
to obtain the estimate of $\theta$.
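• Calculating this estimate takes two steps.
Step 1: Estimate $\beta = \log(\theta)$ by a weighted average of
the individual log odds ratios,
$$\hat\beta = \sum_{k=1}^{r} w_k \log(\hat\theta_k)$$
where
$$w_k = \frac{\sigma_k^2}{\sum_{j=1}^{r} \sigma_j^2}, \qquad \sigma_k^2 = \left(\frac{1}{n_{k11}} + \frac{1}{n_{k10}} + \frac{1}{n_{k01}} + \frac{1}{n_{k00}}\right)^{-1}$$
$$\log(\hat\theta_k) = \log\!\left(\frac{n_{k11}\, n_{k00}}{n_{k10}\, n_{k01}}\right)$$
In this notation $\sigma_k^2$ is the reciprocal of the estimated
variance of $\log(\hat\theta_k)$, so more precise strata receive
larger weights.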
• Step 2: The estimate of $\theta$ is obtained by
exponentiating the estimate of $\beta = \log(\theta)$, i.e.,
$$\hat\theta_L = \exp(\hat\beta)$$
• The logit estimate is also a reasonable estimate of the
common odds ratio, but unlike the M-H estimate it breaks
down when a table contains a zero cell.
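The two steps can be sketched in Python as follows. This is an illustration using the eight-center data from this lecture. The precision of each stratum is taken as the reciprocal of 1/n_k11 + 1/n_k10 + 1/n_k01 + 1/n_k00, and the names `TABLES` and `logit_common_or` are hypothetical.

```python
from math import exp, log

# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def logit_common_or(tables):
    """Logit estimate: precision-weighted mean of log odds ratios, exponentiated.

    A zero cell makes 1/n (or the log) undefined, so this sketch raises an
    error in that case instead of applying a correction.
    """
    precisions, log_ors = [], []
    for n11, n10, n01, n00 in tables:
        precisions.append(1 / (1 / n11 + 1 / n10 + 1 / n01 + 1 / n00))
        log_ors.append(log(n11 * n00 / (n10 * n01)))
    total = sum(precisions)
    # Step 1: weighted average of the stratum log odds ratios.
    beta_hat = sum(p / total * value for p, value in zip(precisions, log_ors))
    # Step 2: back-transform to the odds ratio scale.
    return exp(beta_hat)


theta_logit = logit_common_or(TABLES)
print(f"logit estimate of the common odds ratio: {theta_logit:.4f}")
```

For these data the result matches the SAS logit value of 2.1734. A table with a zero cell would raise `ZeroDivisionError` here, which is the zero-cell problem noted above.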
• Fact: When the $n_k$ are sufficiently large,
$\log(\hat\theta_L)$ is approximately normally
distributed with mean
$$E(\log\hat\theta_L) = \log\theta$$
and standard error
$$\sigma(\log\hat\theta_L) = \left(\sum_{k=1}^{r} \sigma_k^2\right)^{-1/2}$$
• The asymptotic confidence interval for
$\log\theta$ is
$$\log(\hat\theta_L) \pm z_{1-\alpha/2}\, \sigma(\log\hat\theta_L)$$
• The asymptotic confidence interval for $\theta$ can
be obtained by exponentiating the endpoints of
the above confidence interval.
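As a worked check of this interval, the Python sketch below computes the standard error as the inverse square root of the summed stratum precisions and exponentiates the endpoints. It uses the eight-center data from this lecture. The function name `logit_or_with_ci` is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def logit_or_with_ci(tables, level=0.95):
    """Logit estimate of the common odds ratio with its asymptotic interval."""
    # Precision of each stratum: reciprocal of the variance of its log odds ratio.
    precisions = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in tables]
    log_ors = [log(a * d / (b * c)) for a, b, c, d in tables]
    total = sum(precisions)
    beta_hat = sum(p * value for p, value in zip(precisions, log_ors)) / total
    se = total ** -0.5  # standard error of log(theta_L)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return exp(beta_hat), exp(beta_hat - z * se), exp(beta_hat + z * se), se


theta_logit, lower, upper, se_log = logit_or_with_ci(TABLES)
print(f"logit OR = {theta_logit:.4f}, SE(log) = {se_log:.4f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
```

The limits agree with the SAS output for the logit method (1.9829, 2.3823).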
Breslow-Day Test
• The Mantel-Haenszel and logit estimates of the common odds
ratio are derived under the hypothesis that the
conditional odds ratios are equal. This homogeneity
hypothesis,
$$H_0: \theta_1 = \cdots = \theta_r$$
should therefore be tested before either estimate is reported,
where $\theta_k$ is the conditional odds ratio
corresponding to the k-th level of z (k = 1, …, r).
Non-central Hypergeometric distribution
• Fact: For the k-th 2 by 2 table, the
conditional distribution of $n_{k11}$, given that the
column totals $m_{k1}$ and $m_{k0}$ and the row totals
$n_{k1}$ and $n_{k0}$ are fixed, is the so-called non-central
hypergeometric distribution, which has the
following probability mass function:
$$P(n_{k11} = i) = \frac{\binom{n_{k1}}{i} \binom{n_{k0}}{m_{k1} - i}\, \theta_k^{\,i}}{\sum_{u} \binom{n_{k1}}{u} \binom{n_{k0}}{m_{k1} - u}\, \theta_k^{\,u}}$$
• The hypergeometric distribution is a special case of the
non-central hypergeometric distribution: when the
odds ratio $\theta_k$ equals 1, the mass function of the
non-central hypergeometric distribution becomes that of the
hypergeometric distribution:
$$P(n_{k11} = i) = \frac{\binom{n_{k1}}{i} \binom{n_{k0}}{m_{k1} - i}}{\sum_{u} \binom{n_{k1}}{u} \binom{n_{k0}}{m_{k1} - u}} = \frac{\binom{n_{k1}}{i} \binom{n_{k0}}{m_{k1} - i}}{\binom{n_k}{m_{k1}}}$$
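The mass function and its special case can be checked numerically. The Python sketch below is illustrative. It evaluates the non-central hypergeometric probabilities for a small made-up table (row totals 5 and 4, first column total 3) and confirms that an odds ratio of 1 gives back the ordinary hypergeometric probabilities. The margins and all names are hypothetical. For tables as large as those in the lecture's data, the terms would need to be computed on the log scale to avoid overflow.

```python
from math import comb


def noncentral_hypergeom_pmf(i, n1, n0, m1, theta):
    """P(n11 = i) given row totals n1, n0, first column total m1, odds ratio theta."""
    lowest, highest = max(0, m1 - n0), min(n1, m1)  # support of n11
    if not lowest <= i <= highest:
        return 0.0

    def term(u):
        return comb(n1, u) * comb(n0, m1 - u) * theta ** u

    return term(i) / sum(term(u) for u in range(lowest, highest + 1))


# Hypothetical small margins, chosen only for illustration.
n1, n0, m1 = 5, 4, 3
values = range(0, 4)

central = [noncentral_hypergeom_pmf(i, n1, n0, m1, 1.0) for i in values]
classic = [comb(n1, i) * comb(n0, m1 - i) / comb(n1 + n0, m1) for i in values]
shifted = [noncentral_hypergeom_pmf(i, n1, n0, m1, 2.0) for i in values]

mean_central = sum(i * p for i, p in zip(values, central))
mean_shifted = sum(i * p for i, p in zip(values, shifted))

for i in values:
    print(f"i = {i}: theta=1 -> {central[i]:.4f}, hypergeometric -> {classic[i]:.4f}, "
          f"theta=2 -> {shifted[i]:.4f}")
print(f"means: theta=1 -> {mean_central:.4f}, theta=2 -> {mean_shifted:.4f}")
```

An odds ratio above 1 shifts probability toward larger values of the (1, 1) cell, so the mean increases with the odds ratio.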
Breslow-Day Test
• The idea of the Breslow-Day test is that, under the null
hypothesis (the $\theta_k$ are equal), $n_{k11}$ approximately follows
the non-central hypergeometric distribution with
$\theta_k = \hat\theta_{MH}$,
and hence $n_{k11}$ should be close to $E(n_{k11}; \hat\theta_{MH})$, the
mean of this non-central hypergeometric
distribution.
• The Breslow-Day test statistic takes the form:
$$\chi^2_{BD} = \sum_{k=1}^{r} \frac{\left[n_{k11} - E(n_{k11}; \hat\theta_{MH})\right]^2}{\mathrm{Var}(n_{k11}; \hat\theta_{MH})}$$
• Under $H_0$, the Breslow-Day test statistic has
approximately a chi-squared distribution with
r-1 degrees of freedom.
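The statistic can be sketched in Python as follows. This illustration assumes the usual large-sample forms. The expected count is taken as the value of the (1, 1) cell that gives the table an odds ratio equal to the Mantel-Haenszel estimate (the root of a quadratic), and the variance is the reciprocal of the summed reciprocals of the four fitted cells. These approximations and the function names are assumptions of the sketch, not something spelled out on the slides.

```python
from math import sqrt

# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def mantel_haenszel_or(tables):
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def fitted_n11(n1, n0, m1, theta):
    """Value e with e*(n0-m1+e) / ((n1-e)*(m1-e)) = theta, inside the support."""
    if abs(theta - 1.0) < 1e-12:
        return n1 * m1 / (n1 + n0)
    qa = 1.0 - theta
    qb = (n0 - m1) + theta * (n1 + m1)
    qc = -theta * n1 * m1
    disc = sqrt(qb * qb - 4 * qa * qc)
    lowest, highest = max(0, m1 - n0), min(n1, m1)
    for root in ((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)):
        if lowest <= root <= highest:
            return root
    raise ValueError("no admissible root")


def breslow_day(tables):
    theta = mantel_haenszel_or(tables)
    statistic = 0.0
    for n11, n10, n01, n00 in tables:
        n1, n0, m1 = n11 + n10, n01 + n00, n11 + n01
        e = fitted_n11(n1, n0, m1, theta)
        # Large-sample variance from the four fitted cells.
        variance = 1 / (1 / e + 1 / (n1 - e) + 1 / (m1 - e) + 1 / (n0 - m1 + e))
        statistic += (n11 - e) ** 2 / variance
    return statistic, len(tables) - 1


bd_statistic, bd_df = breslow_day(TABLES)
print(f"Breslow-Day chi-square = {bd_statistic:.3f} on {bd_df} degrees of freedom")
```

For the eight-center data this gives a value close to the 5.199 on 7 degrees of freedom reported by SAS, so there is little evidence against a common odds ratio.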
Breslow-Day Test for
Homogeneity of the Odds Ratios
------------------------------
Chi-Square        5.199
DF                7
Pr > ChiSq        0.6356
Cochran-Mantel-Haenszel Test
• The Cochran-Mantel-Haenszel test tests whether the
common conditional (adjusted) odds ratio of y and x
equals one, i.e.,
$$H_0: \theta = 1$$
• Of course, one can use the confidence interval for $\theta$ to
test this null hypothesis. The drawback of using a
confidence interval for hypothesis testing is that it
does not provide a p-value.
• The idea of the CMH test is similar to that of the
Breslow-Day test: under the null hypothesis,
$n_{k11}$ is close to its mean $E(n_{k11}; 1)$ for each k.
As a result, the total $\sum_{k=1}^{r} n_{k11}$ is also close to
its mean, $\sum_{k=1}^{r} E(n_{k11}; 1)$.
• The Cochran-Mantel-Haenszel test statistic takes the
form:
$$\chi^2_{CMH} = \frac{\left[\sum_{k=1}^{r} n_{k11} - \sum_{k=1}^{r} E(n_{k11}; 1)\right]^2}{\sum_{k=1}^{r} \mathrm{Var}(n_{k11}; 1)}$$
• Under the null hypothesis, the Cochran-Mantel-Haenszel
test statistic has approximately a chi-squared
distribution with 1 degree of freedom.
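A minimal Python sketch of the CMH statistic follows. It is an illustration that assumes the hypergeometric mean n_k1 m_k1 / n_k and variance n_k1 n_k0 m_k1 m_k0 / (n_k^2 (n_k - 1)) under an odds ratio of 1, with no continuity correction. The names are hypothetical. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal.

```python
from math import erfc, sqrt

# (n11, n10, n01, n00) for each of the eight centers in the lecture's SAS example.
TABLES = [
    (126, 100, 35, 61), (908, 688, 497, 807), (913, 747, 336, 598),
    (235, 172, 58, 121), (402, 308, 121, 215), (182, 156, 72, 98),
    (60, 99, 11, 43), (104, 89, 21, 36),
]


def cmh_test(tables):
    """Cochran-Mantel-Haenszel statistic (no continuity correction) and p-value."""
    observed = expected = variance = 0.0
    for n11, n10, n01, n00 in tables:
        n1, n0 = n11 + n10, n01 + n00  # row totals (x = 1, x = 0)
        m1, m0 = n11 + n01, n10 + n00  # column totals (y = 1, y = 0)
        n = n1 + n0
        observed += n11
        expected += n1 * m1 / n  # hypergeometric mean when theta = 1
        variance += n1 * n0 * m1 * m0 / (n * n * (n - 1))  # hypergeometric variance
    statistic = (observed - expected) ** 2 / variance
    p_value = erfc(sqrt(statistic / 2))  # upper tail of chi-squared with 1 df
    return statistic, p_value


cmh_statistic, cmh_p = cmh_test(TABLES)
print(f"CMH chi-square = {cmh_statistic:.4f}, p-value = {cmh_p:.3g}")
```

For these data the statistic agrees with the SAS value of 280.1375, and the p-value is far below 0.0001.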
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic    Alternative Hypothesis     DF    Value       Prob
---------------------------------------------------------------
1            Nonzero Correlation        1     280.1375    <.0001
2            Row Mean Scores Differ     1     280.1375    <.0001
3            General Association        1     280.1375    <.0001
Effect Modifier
• An effect modifier is a variable (z) that modifies
the association of the exposure variable (x) with
the response variable (y). In other words, if z is an
effect modifier, then the conditional (adjusted)
odds ratio of x and y changes across the levels of z.
• For example, let y = Alzheimer's disease (AD),
x = gender, and z = APOE-4. The association of gender
with AD in the APOE-4 group is stronger than that
in the non-APOE-4 group.
• In general, a variable (z) can be classified
into four categories according to whether it
is a confounder and whether it is an effect
modifier:
1. confounder (yes) and effect modifier (no)

2. confounder (yes) and effect modifier (yes)
3. confounder (no) and effect modifier (no)
4. confounder (no) and effect modifier (yes)
• Categories 3 and 4 are relevant in clinical trials,
where z cannot be a confounder because of
randomization, even though it can be an effect
modifier.
• Categories 1 and 2 are relevant in
observational studies, where z can be both a
confounder and an effect modifier.
Homework Assignment
Problem 1 (3.8 on page 68)
Table 3.5 (given on slide 42) refers to the effect of passive smoking on lung
cancer. It summarizes results of case-control studies from three countries
among nonsmoking women married to smokers. Test the hypothesis that
having lung cancer is independent of passive smoking, controlling for
country. Report the P-value, and interpret.
(Note: Weak associations in observational studies are suspect. With relatively
small changes in the data, perhaps representing effects of misclassification or
other bias, the association could disappear. See, for instance, R.L.Tweedie et
al., Garbage in, garbage out, Chance, 7: no. 2, 20-27(1994))
Problem 2 (3.9 on page 68)
Refer to the previous problem. Assume that the true odds ratio between
passive smoking and lung cancer is the same for each study. Estimate its
value, and use software to find a 95% confidence interval. Interpret. Analyze
whether the odds ratios truly are identical.
Table 3.5
Country          Spouse Smoked    Cases    Controls
Japan            No                  21          82
                 Yes                 73         188
Great Britain    No                   5          16
                 Yes                 19          38
United States    No                  71         249
                 Yes                137         363
