Estimation and Inference For The Mediation Proportion
∗ Harvard School of Public Health
† Harvard School of Public Health
‡ Harvard School of Public Health
This working paper is hosted by The Berkeley Electronic Press (bepress) and may not be
commercially reproduced without the permission of the copyright holder.
http://biostats.bepress.com/harvardbiostat/paper204
Copyright © 2016 by the authors.
Estimation and Inference for the Mediation
Proportion
Daniel Nevo, Xiaomei Liao, and Donna Spiegelman
Abstract
In epidemiology, public health and social science, mediation analysis is often
undertaken to investigate the extent to which the effect of a risk factor on an outcome
of interest is mediated by other covariates. A pivotal quantity of interest in such an
analysis is the mediation proportion. A common method for estimating it, termed
the “difference method”, compares estimates from models with and without the
hypothesized mediator. However, rigorous methodology for estimation and statistical
inference for this quantity has not previously been available. We formulate
the problem for the Cox model and generalized linear models, and utilize a data
duplication algorithm together with a generalized estimating equations approach
for estimating the mediation proportion and its variance. We further consider
the assumption that the same link function holds for the marginal and conditional
models, a property which we term “g-linkability”. We show that our approach
is valid whenever g-linkability holds, exactly or approximately, and present
results from an extensive simulation study to explore finite sample properties. We
develop estimation and inference methods that reflect the fact that the mediation
proportion is bounded between zero and one. In particular, we develop statistical
testing procedures for the existence of mediation that honor these bounds,
and compare the empirical behavior of crude and logit-based confidence intervals.
The methodology is illustrated by an analysis of pre-menopausal breast cancer
incidence in the Nurses’ Health Study. User-friendly, publicly available software
implementing these methods can be downloaded from the last author’s website.
1 Introduction
In many public health, biological, and biomedical systems, the mechanism that explains how
an intervention or exposure affects the outcome of interest is unknown, even after a causal
association between the exposure and the outcome is established. It is sometimes hypothesized
that there exists a mediator that connects the exposure and the outcome, sitting on the causal
pathway between the exposure and the outcome. In observational studies, identifying a plausible,
ideally pre-specified, mediator can strengthen the causal interpretation of the findings. For example,
in an evaluation of the effectiveness of the ongoing, multi-billion dollar President’s Emergency Plan
for AIDS Relief (PEPFAR) in reducing HIV incidence and prevention in sub-Saharan Africa,
it would strengthen the evidence for a causal effect if it could be shown that a substantial
proportion of the reduction in disease incidence over time was mediated by increased programmatic
coverage in the region, thus diminishing exogenous time trends as the best explanation for any
observed decline.
Several methods have been proposed to assess whether mediation exists and to quantify its
magnitude [21, 9, 18, 20]. Baron and Kenny [2] described a sequence of hypothesis tests to assess
the evidence in the data for mediation by a specific covariate. They assumed a linear model for
the use of the methodology developed in studying mediation of the effect of risk factors for pre-
menopausal breast cancer incidence by mammographic density in the Nurses’ Health Studies
NHSI [3] and NHSII [38]. In Section 7 we discuss results and related issues. We describe the
software we make publicly available in the Appendix.
2 The models
Assume $Y_1, \ldots, Y_n$ is a sample of results of an outcome of interest, and that for each subject $i$
we also observe a vector of factors $Z_i = (X_i, M_i, W_i)$, where $X_i$, $M_i$ and $W_i$ are an exposure of
interest, a mediator and a vector of confounders, respectively. We assume the conditional mean
function for the outcome is $E(Y_i \mid Z_i) = g^{-1}(Z_i^T \beta)$, with $g$ being the link function and where $\beta$,
an unknown parameter vector, is composed of the appropriate components $\beta_X$, $\beta_M$ and $\beta_W$. A
consistent estimator, $\hat{\beta}$, for $\beta$ is obtained as the solution to the estimating equations

$$U(\beta) = \sum_{i=1}^{n} D_i v_i^{-1} [y_i - E(Y_i \mid Z_i)] = 0 \tag{1}$$

where $D_i = \partial E(Y_i \mid Z_i)/\partial \beta$ and $v_i$ is a working variance of $y_i$. By GEE theory, the variance of
$\hat{\beta}$ can be consistently estimated by the robust sandwich estimator [15, 12].
Traditionally, mediation analysis considers a single mediator. However, methods for addressing
multiple mediators have been developed [35]. For simplicity of presentation, we consider
in this paper the case of a single mediator $M$. First, consider the following conditional and
marginal mean models for $Y$, with respect to $M$:

$$E(Y \mid X, M, W) = g^{-1}(\beta_0 + \beta_1 X + \beta_2 M + \beta_3^T W) \tag{2}$$

$$E(Y \mid X, W) = g^{-1}(\beta_0^* + \beta_1^* X + \beta_3^{*T} W). \tag{3}$$

Let $B = (\beta_0, \beta_1, \beta_2, \beta_3)$ and $B^* = (\beta_0^*, \beta_1^*, \beta_3^*)$ be the vectors of conditional and marginal
regression model parameters, and denote by $\hat{B}$ and $\hat{B}^*$ their estimators, obtained by solving equation
(1) separately under models (2) and (3), respectively. When the two models (2) and (3) both
hold simultaneously, we say we have g-linkability.
Throughout this paper, we assume that, after adjusting for measured confounders, there is no
unmeasured confounding of the exposure-outcome relationship, the mediator-outcome
relationship or the exposure-mediator relationship. We also assume that confounders
of the mediator-outcome relationship are unaffected by the exposure. Under these assumptions,
and when models (2) and (3) hold, the total effect (TE) equals $\beta_1^*$ and the natural indirect effect
(NIE) equals $\beta_1^* - \beta_1$ for the identity and log link functions [19, 34], and if, in addition, the
outcome is rare, this is also true for the logit link function [36]. Therefore, the mediation
proportion, $p$, which is the ratio between the NIE and the TE, equals

$$p = \frac{\beta_1^* - \beta_1}{\beta_1^*} = 1 - \frac{\beta_1}{\beta_1^*}.$$
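As a small illustration of the difference method, the following Python sketch evaluates the ratio above for a pair of coefficient values. The function name and the numeric inputs are hypothetical and serve only to show the arithmetic; they are not estimates from this paper.

```python
def mediation_proportion(beta1: float, beta1_star: float) -> float:
    """Difference-method mediation proportion p = 1 - beta1 / beta1_star.

    beta1      : exposure coefficient from the conditional model (with M)
    beta1_star : exposure coefficient from the marginal model (without M)
    """
    if beta1_star == 0:
        raise ValueError("total effect is zero, so p is undefined")
    return 1.0 - beta1 / beta1_star


# Hypothetical coefficient values, for illustration only.
beta1_star = 0.40              # total effect (TE)
beta1 = 0.30                   # exposure effect after adjusting for the mediator
nie = beta1_star - beta1       # natural indirect effect (NIE)
p_hat = mediation_proportion(beta1, beta1_star)
print(round(p_hat, 3))
```

With these inputs one quarter of the total effect would be attributed to the mediator; a value outside [0, 1] would signal that the quantity should not be read as a mediation proportion.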
A necessary condition for $M$ to be interpreted as a mediator is that $p \in (0, 1]$. The situation
where $p = 0$ corresponds to $\beta_1 = \beta_1^*$, hence in this case $M$ does not mediate the effect of $X$ at
all. On the other hand, if $p = 1$ then the effect of $X$ is fully mediated by $M$. Finally, if $p \notin [0, 1]$,
it is either that the NDE and NIE are in opposite directions or $M$ is not a mediator at all, but
$\hat{\beta}_1$ and $\hat{\beta}_1^*$ are the appropriate components of $\hat{B}$ and $\hat{B}^*$. Under g-linkability, this estimator is
consistent by standard GEE theory and the general mapping theorem.
The question of mediation can also be investigated when the available data are survival data.
Lin et al. [16] considered this question for the Cox model in the context of the PTE. First, as
in [16], we define the following two models for the hazard function at time $t$, $h(t)$, conditionally
and marginally, with respect to $M$
where $X$, $M$ and $W$ are allowed to be time dependent and $\lambda_0(t)$ and $\lambda_0^*(t)$ are baseline hazard
functions. The authors of [16] have shown that these two models cannot hold at the same time.
However, they claimed that if either $\beta_3^*$ or $\Lambda_0^*(t) = \int_0^t \lambda_0^*(s)\,ds$ is small, then model (4) is
a good approximation to the true conditional model. The assumption that $\Lambda_0^*(t)$ is small is
the rare outcome assumption. They confirmed this claim using a small-scale simulation study.
When (4) holds approximately, the Cox model is approximately g-linkable. Thus, in addition
to GLMs, we investigate in this paper estimation and inference for $p$ in approximately g-linkable
Cox models.
3.1 Identity link function
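Under the identity link function, models (2) and (3) simplify to

$$E(Y \mid X, M, W) = \beta_0 + \beta_1 X + \beta_2 M + \beta_3^T W$$

$$E(Y \mid X, W) = \beta_0^* + \beta_1^* X + \beta_3^{*T} W.$$

We now show that g-linkability holds whenever $E(M \mid X, W)$ is a linear function of $X$ and $W$.
To see that, let $E(M \mid X, W) = a + b_1 X + b_3^T W$, for some $a$, $b_1$ and $b_3$. Then,

$$E(Y \mid X, W) = E(E(Y \mid X, M, W) \mid X, W) = \beta_0 + \beta_1 X + \beta_2 E(M \mid X, W) + \beta_3^T W = \beta_0^* + \beta_1^* X + \beta_3^{*T} W$$

and we have $\beta_0^* = \beta_0 + \beta_2 a$, $\beta_1^* = \beta_1 + \beta_2 b_1$ and $\beta_3^* = \beta_3 + \beta_2 b_3$.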
Therefore, in the log link case, g-linkability holds if the log of the moment generating function
of $M \mid X, W$ can be written as a linear function of $X$ and $W$. That is,
$\log E[\exp(\beta_2 M) \mid X, W] = a' + b_1' X + {b_3'}^T W$.
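where $\sigma^2_{\hat{\beta}_1} = Var(\hat{\beta}_1)$, $\sigma^2_{\hat{\beta}_1^*} = Var(\hat{\beta}_1^*)$ and $\sigma_{\hat{\beta}_1, \hat{\beta}_1^*} = Cov(\hat{\beta}_1, \hat{\beta}_1^*)$.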
with $z_{1-\alpha/2}$ being the appropriate quantile of the normal distribution. While this confidence
interval is asymptotically valid, in finite samples it may include negative values or values larger
than one. Since such values are outside the parameter space for $p$ if $M$ is indeed a mediator, they
should be excluded. One option is to trim the resulting confidence
interval so that it is contained in [0, 1]. Alternatively, a logit-based confidence interval can be
constructed, again using the delta method, and then back-transformed to give a confidence
interval which is, by construction, contained in [0, 1]. More formally,
let $\psi = \mathrm{logit}(p) = \log\{p/(1-p)\}$ and let $\hat{\psi} = \mathrm{logit}(\hat{p})$. By the delta method, $\hat{\psi}$ is consistent and
asymptotically normally distributed with variance

$$\sigma^2_{\hat{\psi}} = \frac{1}{p^2 (1-p)^2}\, \sigma^2_{\hat{p}},$$

which can be estimated by plugging in $\hat{p}$ and $\hat{\sigma}^2_{\hat{p}}$ in the above expression. Then, a $(1 - \alpha)$ level
confidence interval for the mediation proportion $p$ is obtained as

$$\left( \frac{\exp(\hat{\psi} - z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\psi}})}{1 + \exp(\hat{\psi} - z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\psi}})},\; \frac{\exp(\hat{\psi} + z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\psi}})}{1 + \exp(\hat{\psi} + z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\psi}})} \right) \tag{7}$$
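The two interval constructions can be sketched in a few lines of Python. This is a minimal illustration, not the %mediate macro: the variance of the estimated mediation proportion is computed here by a standard first-order delta-method expansion of 1 - β1/β1*, which is an assumption about the form of the variance expression, and all function names and numeric inputs are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def delta_var_p(b1, b1_star, var_b1, var_b1_star, cov):
    """First-order delta-method variance of p_hat = 1 - b1 / b1_star."""
    d_b1 = -1.0 / b1_star            # derivative of p with respect to beta1
    d_b1_star = b1 / b1_star ** 2    # derivative of p with respect to beta1*
    return (d_b1 ** 2 * var_b1 + d_b1_star ** 2 * var_b1_star
            + 2.0 * d_b1 * d_b1_star * cov)


def trimmed_ci(p_hat, se_p, alpha=0.05):
    """Untransformed normal interval, trimmed to lie inside [0, 1]."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return max(0.0, p_hat - z * se_p), min(1.0, p_hat + z * se_p)


def logit_ci(p_hat, se_p, alpha=0.05):
    """Interval built on the logit scale and back-transformed, as in (7)."""
    if not 0.0 < p_hat < 1.0:
        raise ValueError("the logit interval needs 0 < p_hat < 1")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    psi = log(p_hat / (1.0 - p_hat))
    se_psi = se_p / (p_hat * (1.0 - p_hat))   # sqrt of sigma^2_psi
    expit = lambda u: 1.0 / (1.0 + exp(-u))   # equals exp(u) / (1 + exp(u))
    return expit(psi - z * se_psi), expit(psi + z * se_psi)


# Hypothetical estimates and (co)variances, for illustration only.
b1, b1_star = 0.30, 0.40
var_p = delta_var_p(b1, b1_star, var_b1=0.010, var_b1_star=0.010, cov=0.009)
p_hat = 1.0 - b1 / b1_star
se_p = sqrt(var_p)
crude = trimmed_ci(p_hat, se_p)
transformed = logit_ci(p_hat, se_p)
print(crude, transformed)
```

The logit interval is undefined when the point estimate falls on or outside the boundary, which is why the sketch raises an error in that case rather than returning a value.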
mapping theorem, we then have

$$\lim_{n \to \infty} P(Z_p^+ \ge c) = \begin{cases} 1 & c \le 0, \\ 1 - \Phi(c) & c > 0, \end{cases}$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution, and the p-value
for testing $H_0: p = 0$ vs. $H_1: p > 0$ equals one if $\hat{\sigma}_{\hat{p}}^{-1} \hat{p} \le 0$ and $1 - \Phi(\hat{\sigma}_{\hat{p}}^{-1} \hat{p})$ if $\hat{p} > 0$.
However, if the unconstrained $\hat{p}$ is larger than one, then this is misleading. Indeed, $p$ may be
larger than zero in this case, but $p$ should not be interpreted as a mediation proportion, as noted
in [16]. In this case, $M$ is not a mediator, but a confounder. This demonstrates that
the described test should not be used without first considering the unconstrained value of $\hat{p}$ and
making sure it is within the parameter space [0, 1].
Consider the distribution of $(Z_p^+)^2 = [\max(0, Z_p)]^2$. This statistic equals zero if $Z_p < 0$,
which occurs with probability 0.5 since $Z_p$ is a standard normal variable. Therefore, for any
positive value $c^+$ we have

$$P[(Z_p^+)^2 < c^+] = P[Z_p \le 0] + P[0 < Z_p < \sqrt{c^+}] = 0.5 + 0.5 P[Z_p^2 < c^+] = 0.5 + 0.5 P(\chi^2_{(1)} < c^+)$$

since $P[0 < Z_p < \sqrt{c^+}] = 0.5 P[-\sqrt{c^+} < Z_p < \sqrt{c^+}]$, where $\chi^2_{(k)}$ is a $\chi^2$ variable with $k$ degrees
of freedom. Thus, the asymptotic distribution of $(Z_p^+)^2$ is a mixture of $\chi^2_{(0)}$ and $\chi^2_{(1)}$ random
variables, with mixture probability of 0.5, similar to what was previously shown [31].
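The equivalence between the one-sided normal test and the 50:50 chi-square mixture can be checked numerically. The Python sketch below is illustrative only; the function names are hypothetical, and it uses the identity P(χ²(1) ≥ c) = 2[1 − Φ(√c)] to avoid any dependence beyond the standard library.

```python
from math import sqrt
from statistics import NormalDist

_PHI = NormalDist().cdf   # standard normal cumulative distribution function


def one_sided_p_value(p_hat, se_p):
    """p-value for H0: p = 0 vs H1: p > 0 based on Z_p^+ = max(0, p_hat / se_p)."""
    z = p_hat / se_p
    return 1.0 if z <= 0 else 1.0 - _PHI(z)


def mixture_p_value(p_hat, se_p):
    """Same test via (Z_p^+)^2 and the 0.5 / 0.5 mixture of chi2(0) and chi2(1)."""
    c = max(0.0, p_hat / se_p) ** 2
    if c == 0.0:
        return 1.0
    chi2_1_tail = 2.0 * (1.0 - _PHI(sqrt(c)))   # P(chi2 with 1 df >= c)
    return 0.5 * chi2_1_tail                     # chi2(0) puts no mass above 0


# Hypothetical estimate and standard error, for illustration only.
print(one_sided_p_value(0.25, 0.10), mixture_p_value(0.25, 0.10))
```

Both functions return the same value for any input, and neither checks whether the unconstrained estimate exceeds one; as the text cautions, that check has to be made before the p-value is interpreted.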
An alternative test statistic is based upon a test for the difference between the effect estimates
in the marginal and the conditional models, that is, on $\hat{d} = \hat{\beta}_1^* - \hat{\beta}_1$. Under the assumptions
in this paper, $\hat{d}$ is a consistent estimate of the NIE. A test statistic based on $\hat{d}$ is
$Z_d = \sigma_{\hat{d}}^{-1} \hat{d}$, where

$$\sigma^2_{\hat{d}} = Var(\hat{d}) = \sigma^2_{\hat{\beta}_1} + \sigma^2_{\hat{\beta}_1^*} - 2 \sigma_{\hat{\beta}_1, \hat{\beta}_1^*}.$$
$$E(Y_{ij} \mid X_i, X_i^*, M_i, W_i, W_i^*) = g^{-1}(\beta_0 I\{j = 1\} + \beta_1 X_i + \beta_2 M_i + \beta_3^T W_i + \beta_0^* I\{j = 2\} + \beta_1^* X_i^* + \beta_3^{*T} W_i^*), \tag{8}$$

where $j = 1, 2$ are the rows created from duplicating each observation and are treated as repeated
measures. Model (8) implies that we can write $E(Y_{i1} \mid X_i, X_i^*, M_i, W_i, W_i^*) = E(Y_{i1} \mid X_i, M_i, W_i)$
and $E(Y_{i2} \mid X_i, X_i^*, M_i, W_i, W_i^*) = E(Y_{i2} \mid X_i^*, W_i^*)$. Let $R$ be a $2 \times 2$ working correlation matrix
and denote $B_i = \mathrm{diag}(v_{i1}, v_{i2})$, where $v_{ij} = Var(Y_{ij})$. Let also $V_i = B_i^{1/2} R B_i^{1/2}$ be a $2 \times 2$
working variance for the vector $(Y_{i1}, Y_{i2})$. Here, the GEE are defined as

$$U_{GEE}(B) = \sum_{i=1}^{n} (D_i, D_i^*)\, V_i^{-1} \begin{pmatrix} y_{i1} - E(Y_{i1} \mid X_i, M_i, W_i) \\ y_{i2} - E(Y_{i2} \mid X_i^*, W_i^*) \end{pmatrix}, \tag{9}$$
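where $D_i = \partial E(Y_{i1} \mid X_i, M_i, W_i)/\partial B$ and $D_i^* = \partial E(Y_{i2} \mid X_i^*, W_i^*)/\partial B^*$ are two column vectors. If
$R$ is taken to be the identity matrix, then $V_i = B_i$ and (9) simplifies to the following estimating
equations

$$U_{IEE}(B) = \begin{pmatrix} U^{(1)}_{IEE}(\beta) \\ U^{(2)}_{IEE}(\beta^*) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} D_i v_{i1}^{-1} [y_i - E(Y_{i1} \mid X_i, M_i, W_i)] \\ \sum_{i=1}^{n} D_i^* v_{i2}^{-1} [y_i - E(Y_{i2} \mid X_i^*, W_i^*)] \end{pmatrix} = 0. \tag{10}$$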
Then, the estimating equations given by (10) are identical to the estimating equations for fitting
models (2) and (3) separately, because $D_i$, $v_i$ and $Z_i$ in equation (1) are equal to $D_i$, $v_{i1}$
and $(X_i, M_i, W_i)$, respectively, under model (2), and they are equal to $D_i^*$, $v_{i2}$ and $(X_i^*, W_i^*)$,
respectively, under model (3). The major advantage of the data duplication algorithm is that
it provides an estimator for $\sigma_{\hat{\beta}_1, \hat{\beta}_1^*}$ in a straightforward manner. Taking a working correlation
matrix other than the identity may result in more efficient estimators of $B$, but would not have
the desirable property that the duplicated data estimating equations are identical to the two
separate estimating equations from the two separate models.
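To make the data duplication idea concrete, here is a minimal Python sketch for the identity link with no confounders and an independence working correlation. It is not the %mediate macro: it assumes NumPy is available, uses simulated data with hypothetical parameter values, and relies on the fact that, for the identity link with constant working variances, the stacked independence estimating equations reduce to least squares on the duplicated design.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 2000, 0.5
beta1, beta2 = 0.3, 0.4            # hypothetical conditional-model coefficients
# Implied marginal slope is beta1 + beta2 * rho = 0.5, so the true p is 0.4.

x = rng.standard_normal(n)
m = rho * x + np.sqrt(1 - rho ** 2) * rng.standard_normal(n)
y = 1.0 + beta1 * x + beta2 * m + rng.standard_normal(n)

# Duplicate every subject: row j = 1 carries (Intercept, X, M),
# row j = 2 carries (Intercept*, X*). Columns: Int, X, M, Int*, X*.
zeros, ones = np.zeros(n), np.ones(n)
row1 = np.column_stack([ones, x, m, zeros, zeros])
row2 = np.column_stack([zeros, zeros, zeros, ones, x])
design = np.vstack([row1, row2])
y_dup = np.concatenate([y, y])

# Independence working correlation with identity link: stacked least squares.
bread = np.linalg.inv(design.T @ design)
coef = bread @ design.T @ y_dup
resid = y_dup - design @ coef

# Robust sandwich covariance, treating the two rows of a subject as a cluster.
scores = design * resid[:, None]
cluster_scores = scores[:n] + scores[n:]
cov = bread @ (cluster_scores.T @ cluster_scores) @ bread

b1_hat, b1_star_hat = coef[1], coef[4]
p_hat = 1.0 - b1_hat / b1_star_hat
grad = np.zeros(5)
grad[1], grad[4] = -1.0 / b1_star_hat, b1_hat / b1_star_hat ** 2
se_p = float(np.sqrt(grad @ cov @ grad))   # delta-method standard error
print(round(p_hat, 3), round(se_p, 3), cov[1, 4])
```

The off-diagonal entry `cov[1, 4]` is the estimated covariance between the two exposure coefficients, which is the quantity that fitting the two models separately does not provide.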
5 Simulation study
In the simulation studies, we considered several issues regarding the performance of the method-
ology we presented throughout the paper. We first present results concerning g-linkability for
the logit link function and the Cox model. Then, we turn to the performance of the mediation
proportion estimator, studying its bias, the coverage rate of the accompanying confidence intervals,
and the type I error and the power of the statistical tests described in Section 4. For the
generalized linear models, we used the GEE data duplication method as described in the previ-
ous section. For the Cox model, estimates were calculated using the data duplication method
suggested by Lin et al. [16].
Throughout these simulation studies, we assume that there are no confounders in the model.
$X$ and $M$ were generated using a bivariate normal distribution with mean $(0, 0)^T$ and covariance matrix
$\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$. Then, we have that $\beta_2 = \frac{p}{\rho} \beta_1^*$ for the identity, log and logit link functions (the latter
under the rare outcome assumption); see Web Appendix A. In these scenarios, g-linkability
holds for all three link functions. The estimation and inference procedures apply to any bivariate
distribution of $X$ and $M$ that satisfies the simple moment conditions given in Section 3, and
here we used the bivariate normal distribution for generating the data merely for convenience.
The estimation and inference procedures do not use the bivariate normal distribution of $(X, M)$.
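The data-generating step can be sketched as follows. This is an illustrative Python fragment using only the standard library; the function names, seed and parameter values are hypothetical, and the check at the end uses the identity-link relation between the marginal and conditional slopes when E(M | X) = ρX.

```python
import math
import random


def draw_exposure_mediator(n, rho, rng):
    """Draw n pairs (X, M) from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        m = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((x, m))
    return pairs


def conditional_params(p, rho, beta1_star):
    """Conditional-model coefficients implied by a target mediation proportion p."""
    beta1 = (1.0 - p) * beta1_star
    beta2 = p * beta1_star / rho
    return beta1, beta2


rng = random.Random(2016)
rho, p, beta1_star = 0.5, 0.3, math.log(1.5)
pairs = draw_exposure_mediator(20000, rho, rng)
beta1, beta2 = conditional_params(p, rho, beta1_star)

# Identity link with E(M | X) = rho * X: the marginal slope is beta1 + beta2 * rho.
implied_beta1_star = beta1 + beta2 * rho

# Empirical correlation of the simulated pairs (means 0, variances 1 by design).
mean_xm = sum(x * m for x, m in pairs) / len(pairs)
var_x = sum(x * x for x, _ in pairs) / len(pairs)
var_m = sum(m * m for _, m in pairs) / len(pairs)
emp_rho = mean_xm / math.sqrt(var_x * var_m)
print(round(implied_beta1_star, 4), round(emp_rho, 3))
```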
5.1 g-linkability for the logit link function and of the Cox model
In order to assess the magnitude of the bias when assuming g-linkability of the logit link function
and the Cox model, we conducted a simulation study under various conditions and inspected
the resulting bias in $\hat{p}$, as estimated using the data duplication algorithm described in Section
4.2 while taking the working correlation matrix to be the identity. First we describe the logit
link function model. We simulated $Y$ under the logistic regression model

$$\mathrm{logit}(P(Y = 1 \mid X, M)) = \beta_0 + \beta_1 X + \beta_2 M.$$

We chose the model parameter values in the following way. First, we chose $\rho = corr(X, M)$, $p$
and $\beta_1^*$. Then, by definition we had $\beta_1 = (1 - p)\beta_1^*$, and we took $\beta_2$ as if g-linkability held
exactly, that is, $\beta_2 = \frac{p}{\rho} \beta_1^*$. Then, we fixed the unconditional case probability $P(Y = 1)$ and
found the appropriate $\beta_0$ value by solving for $\beta_0$ in the equation

$$P(Y = 1) = E(\mathrm{expit}(\beta_0 + \beta_1 X + \beta_2 M)),$$

where $\mathrm{expit}(u) = \exp(u)/(1 + \exp(u))$. Finally, the sample size was set to $n = E(N_{cases})/P(Y = 1)$,
where $E(N_{cases})$ is the expected number of cases. We considered the following values for the
parameters: $p = 0.1, 0.2, \ldots, 0.8$; $\rho = p, p + 0.1, \ldots, 0.8$, with $\rho \ge p$ so that $\beta_2 \le \beta_1^*$, or in words,
to ensure that the total effect of $X$ is no smaller than the effect of $M$; $\beta_1^* = \log(1.25), \log(1.5), \log(2)$;
$P(Y = 1) = 0.005, 0.01, 0.1, 0.25$; $E(N_{cases}) = 100, 500, 1000$. The number of simulation
iterations per scenario was 1000.
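One way to carry out the step of solving for the intercept is sketched below in Python. The text does not say which numerical method was used, so the trapezoid-rule integration and the bisection search here are assumptions chosen for simplicity, and the function names are hypothetical. The sketch uses the fact that, for standard bivariate normal (X, M), the linear predictor is normal with mean β0 and variance β1² + β2² + 2ρβ1β2.

```python
from math import exp, log, pi, sqrt


def expit(u):
    return 1.0 / (1.0 + exp(-u))


def prevalence(beta0, beta1, beta2, rho, grid=2001, lim=8.0):
    """E[expit(beta0 + beta1*X + beta2*M)] for standard bivariate normal (X, M)."""
    sd = sqrt(beta1 ** 2 + beta2 ** 2 + 2.0 * rho * beta1 * beta2)
    h = 2.0 * lim / (grid - 1)
    total = 0.0
    for k in range(grid):                      # trapezoid rule on [-lim, lim]
        z = -lim + k * h
        weight = h if 0 < k < grid - 1 else h / 2.0
        total += weight * expit(beta0 + sd * z) * exp(-z * z / 2.0) / sqrt(2.0 * pi)
    return total


def solve_beta0(target, beta1, beta2, rho, lo=-30.0, hi=30.0, tol=1e-10):
    """Bisection on beta0; the prevalence is increasing in beta0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prevalence(mid, beta1, beta2, rho) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# One scenario from the grid described above.
p, rho, beta1_star = 0.3, 0.5, log(1.5)
beta1, beta2 = (1.0 - p) * beta1_star, p * beta1_star / rho
target, expected_cases = 0.01, 500
beta0 = solve_beta0(target, beta1, beta2, rho)
n = round(expected_cases / target)             # sample size for this scenario
print(round(beta0, 4), n)
```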
For the Cox model, we simulated the data similarly to the logit link function simulations.
First, we simulated $X$ and $M$ as before. Then, given fixed $\rho$, $p$ and $\beta_1^*$, we set $\beta_2 = \frac{p}{\rho} \beta_1^*$. We took a
Weibull distribution for the baseline hazard and used an exponential distribution for the censoring
(mean = 50), with an additional cutoff at age 90. Given the desired proportion of cases in
the population, we used simulations to find the appropriate value for the Weibull distribution
shape parameter, while fixing the scale parameter at 200. As in the logit link case, we chose the
sample size as the expected number of cases ($E(N_{cases})$) divided by the expected proportion of
cases ($P(\delta = 1)$), where $\delta$ is the event indicator.
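A rough Python sketch of this survival data generation is given below. Several details are assumptions, since the text does not spell them out: the Weibull baseline is parametrized with cumulative hazard (t/scale)^shape, time is measured from zero, and event times are drawn by inverting the cumulative hazard. Function names, seeds and parameter values are hypothetical.

```python
import math
import random


def simulate_subject(lin_pred, shape, rng, scale=200.0, cens_mean=50.0, cutoff=90.0):
    """Return (observed time, event indicator delta) for one subject."""
    u = 1.0 - rng.random()                       # uniform on (0, 1]
    # Cox model with cumulative baseline hazard (t / scale) ** shape:
    # solve (T / scale) ** shape * exp(lin_pred) = -log(U) for T.
    t_event = scale * (-math.log(u) / math.exp(lin_pred)) ** (1.0 / shape)
    t_cens = min(rng.expovariate(1.0 / cens_mean), cutoff)
    return (t_event, 1) if t_event <= t_cens else (t_cens, 0)


def event_proportion(shape, beta1, beta2, rho, n, seed=0):
    """Monte Carlo estimate of P(delta = 1) for a given Weibull shape."""
    rng = random.Random(seed)
    events = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        m = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        _, delta = simulate_subject(beta1 * x + beta2 * m, shape, rng)
        events += delta
    return events / n


p, rho, beta1_star = 0.3, 0.5, math.log(1.5)
beta1, beta2 = (1.0 - p) * beta1_star, p * beta1_star / rho
prop_shape1 = event_proportion(1.0, beta1, beta2, rho, n=5000)
prop_shape3 = event_proportion(3.0, beta1, beta2, rho, n=5000)
print(prop_shape1, prop_shape3)
```

Under this parametrization, follow-up ends well before the scale value, so a larger shape lowers the event proportion; a search over the shape parameter could therefore be used to hit a target value of P(δ = 1).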
In order to assess g-linkability, and the finite sample performance of $\hat{p}$, we calculated the
relative bias, defined as $100 \times \left| \frac{\mathrm{mean}(\hat{p}) - p}{p} \right|$. Ideally, this quantity should be close to zero. We note
that bias may arise either because g-linkability fails to hold, or because the sample size is not large
enough. Figure 1 presents the bias for $\beta_1^* = \log(1.5)$ as a function of the parameters. First, it is of
note that whenever the overall prevalence or cumulative incidence of $Y$ was small, as in the rare
disease scenario, and the number of cases was sufficiently large, bias was minimal. Even when
the disease was not as rare, e.g., $P(Y = 1) = 0.25$, when there were enough cases, and when $p$
was large enough (e.g., $p > 0.2$ in this case), the bias was minimal. Considering the g-linkability
of the Cox model, presented for $\beta_1^* = \log(1.5)$ in Figure 2, the results were similar to the results
obtained for the logit link function. That is, when the outcome was rare ($P(\delta = 1)$ was small)
larger than p, neither of the methods produced confidence intervals with nominal coverage,
especially when the sample size was small. Comparing between the trimmed untransformed and
the transformed-based confidence intervals, the latter did not offer any clear advantage in terms
of performance. For small sample size, the transformation-based confidence interval tended to
be wider, especially for p = 0.1, without any substantial gain in terms of coverage rate. For a
larger sample size, the two confidence intervals were comparable in their performance.
Throughout this section, we presented in parallel results for the identity and logit link functions
and the Cox model. There was very strong agreement between the results for the logit link
function for binary data and the Cox model, as one might have expected given the close relationship
between the logistic regression model and the Cox model in epidemiology and public health
evaluations.
In addition to the scenarios we described above, we conducted simulations for the identity
link function with error distributions other than the normal one. We considered symmetric
distributions with tails heavier than the normal distribution as well as skewed distributions. As
predicted by GEE theory, the performance of the mediation proportion estimator, the statistical
tests and the confidence intervals changed only slightly. Details are given in Web Appendix
C.
6 Illustrative example
We illustrate the use of our methodology in the analysis of breast cancer data from the Nurses’
Health Studies (NHS and NHSII) [3, 38]. It was previously found that high mammographic
density (MD) is a risk factor for breast cancer [22]. The goal here is to investigate whether,
and to what extent, the effects of more distal risk factors for pre-menopausal breast cancer
are mediated by high MD. A detailed description of this study is given in [26]. In this nested
case-control study, controls were matched to cases by current age, menopausal status, current
hormone use, month, time of day and fasting status at blood collection, and
luteal day (for NHSII samples only). There were 559 pre-menopausal cases and 1727 controls.
Since the disease is rare, g-linkability should hold approximately, as shown in the previous section. The
mediator is percent MD. We conducted mediation analysis for all breast cancer risk factors with
significant total effects: personal history of benign breast disease (HBBD), family history of
breast cancer (FH), adolescent somatotype (ASM), body mass index at age 18 (BMI18), age at
first birth (AFB), age at menarche (AM) and height (HT). Results were adjusted for current
age, fasting status, blood collection time of the day, mammography batch (NHS batch 1, NHS
batch 2 or NHSII), current BMI, BMI18, ASM, HBBD, parity, AFB, and AM. Mediation
was assessed separately for a number of these variables, with most of the others treated
as confounders.
Table 5 presents the estimated mediation proportions, confidence intervals and p-values,
along with the estimated risk factor effects. Of note is that MD is significant as a mediator for
HBBD, ASM and BMI18, regardless of whether the test was based on $\hat{p}$ or $\hat{d}$, although p-values
corresponding to the latter test were much smaller. Confidence intervals were quite wide for
ASM and BMI18. This may be due to the moderate sample size and the relatively small effect.
holds for both the conditional model and the marginal model, where the latter is the model
without the mediator. We have shown that g-linkability holds for the identity and the log link
function when fairly general conditions are met. In addition, g-linkability holds for the logit
link function whenever the outcome is rare. When the outcome is not rare, one may fit the
log-binomial model instead, as noted in [33], which may be preferable anyway, as the odds ratio
is typically not the parameter of interest [32].
In conclusion, the general framework for mediation analysis in generalized linear models
developed in this paper along with the methodology established, will allow researchers to inves-
tigate mediation under various outcome scenarios and to quantify results based on rigorously
derived and empirically studied estimators and hypothesis tests.
Acknowledgments
This work was supported by National Institutes of Health grant DP1ES025459.
Appendix
One major goal of this paper is to produce statistical tools to be used in practice. The SAS macro
%mediate implements the data duplication algorithm and reports point and interval estimates for
the mediation proportion and the results for the mediation test using the difference method. It is
available on the last author’s website http://www.hsph.harvard.edu/donna-spiegelman/software/mediate.
Simulations were conducted using R code that can be obtained by request to the first author.
Both the SAS macro and the R code can be used for either GLMs or survival data analysis.
References
[1] Duane F Alwin and Robert M Hauser. The decomposition of effects in path analysis. American
sociological review, pages 37–47, 1975.
[2] Reuben M Baron and David A Kenny. The moderator–mediator variable distinction in social
psychological research: Conceptual, strategic, and statistical considerations. Journal of personality
and social psychology, 51(6):1173, 1986.
[3] Charlene F Belanger, Charles H Hennekens, Bernard Rosner, and Frank E Speizer. The nurses’
health study. The American Journal of Nursing, 78(6):1039–1040, 1978.
[4] Robert M Carney, William B Howells, James A Blumenthal, Kenneth E Freedland, Phyllis K Stein,
Lisa F Berkman, Lana L Watkins, Susan M Czajkowski, Brian Steinmeyer, Junichiro Hayano, et al.
Heart rate turbulence, depression, and survival after acute myocardial infarction. Psychosomatic
medicine, 69(1):4–9, 2007.
[5] Mary Kathryn Cowles. Bayesian estimation of the proportion of treatment effect captured by a
surrogate marker. Statistics in medicine, 21(6):811–834, 2002.
[6] David R Cox. Regression models and life-tables. Journal of the Royal Statistical Society. Series B
(Methodological), pages 187–220, 1972.
[7] Laurence S Freedman. Confidence intervals and statistical power of the validation ratio for surrogate
or intermediate endpoints. Journal of Statistical Planning and Inference, 96(1):143–153, 2001.
[28] Andrea L Roberts, Margaret Rosario, Heather L Corliss, Karestan C Koenen, and S Bryn Austin.
Childhood gender nonconformity: A risk indicator for childhood abuse and posttraumatic stress in
youth. Pediatrics, 129(3):410–417, 2012.
[29] Andrea L Roberts, Margaret Rosario, Heather L Corliss, Karestan C Koenen, and S Bryn Austin.
Elevated risk of posttraumatic stress in sexual minority youths: mediation by childhood abuse and
gender nonconformity. American journal of public health, 102(8):1587–1593, 2012.
[30] James M Robins and Sander Greenland. Identifiability and exchangeability for direct and indirect
effects. Epidemiology, pages 143–155, 1992.
[31] Steven G Self and Kung-Yee Liang. Asymptotic properties of maximum likelihood estimators and
likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association,
82(398):605–610, 1987.
[32] Donna Spiegelman and Ellen Hertzmark. Easy SAS calculations for risk or prevalence ratios and
differences. American journal of epidemiology, 162(3):199–200, 2005.
[33] Linda Valeri and Tyler J VanderWeele. Mediation analysis allowing for exposure–mediator
interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS
macros. Psychological methods, 18(2):137, 2013.
[34] Tyler VanderWeele. Explanation in causal inference: methods for mediation and interaction. Oxford
University Press, 2015.
[35] Tyler VanderWeele and Stijn Vansteelandt. Mediation analysis with multiple mediators. Epidemi-
ologic methods, 2(1):95–115, 2013.
[36] Tyler J VanderWeele and Stijn Vansteelandt. Odds ratios for mediation analysis for a dichotomous
outcome. American journal of epidemiology, 172(12):1339–1348, 2010.
[37] Wei Wang and Jeffrey M Albert. Estimation of mediation effects for zero-inflated regression models.
Statistics in medicine, 31(26):3118–3132, 2012.
[38] Anne M Wolf, David J Hunter, Graham A Colditz, Joann E Manson, Meir J Stampfer, Karen A
Corsano, Bernard Rosner, Andrea Kriska, and Walter C Willett. Reproducibility and validity of a
self-administered physical activity questionnaire. International journal of epidemiology, 23(5):991–
999, 1994.
i    j    Intercept   Intercept*   X     X*    M     W     W*    Y
1    1    1           0            x1    0     m1    w1    0     y1
1    2    0           1            0     x1    0     0     w1    y1
2    1    1           0            x2    0     m2    w2    0     y2
2    2    0           1            0     x2    0     0     w2    y2
...  ...  ...         ...          ...   ...   ...   ...   ...   ...
[Figure 1: panel rows E(Ncases) = 100, 500, 1000; horizontal axis p; vertical axis ρ; legend "relative bias": <=1%, 1%–5%, 5%–10%, 10%–25%, 25%–50%, >=50%]
Figure 1: Relative bias of the mediation proportion estimator under the logistic model as a function of
the mediation proportion ($p$), the correlation between the exposure and the mediator ($\rho$), the number of
expected cases ($E(N_{cases})$) and the outcome rate ($P(Y = 1)$). The value of $\beta_1^*$ was taken to be $\log(1.5)$.
[Figure 2: panel rows E(Ncases) = 100, 500, 1000; horizontal axis p; vertical axis ρ; legend "relative bias": <=1%, 1%–5%, 5%–10%, 10%–25%, 25%–50%, >=50%]
Figure 2: Relative bias of the mediation proportion estimator under the Cox model as a function of
the mediation proportion ($p$), the correlation between the exposure and the mediator ($\rho$), the number
of expected cases ($E(N_{cases})$) and the event rate ($P(\delta = 1)$). The value of $\beta_1^*$ was taken to be $\log(1.5)$.
Table 2: Relative bias in percentage of the mediation proportion estimator under the identity
and logit link functions and the Cox model. Coverage rates of 95% trimmed untransformed
confidence intervals are displayed in brackets. Nout is the mean proportion of simulations with
$\hat{p} \notin [0, 1]$, where the mean is taken over the rest of the columns.
Table 4: Coverage rates (CI-RATE) and lengths (CI-LEN) of trimmed untransformed and logit
transformed-based (Trans) confidence intervals for the mediation proportion under the identity
and logit link functions and the Cox model
BMI at age 18† (per 5 unit increase)        -0.23 (0.79)  0.02  -0.05 (0.95)  0.78  0.06–1.50   0.06–1.00  0.05–1.00  0.02  < 10^-8
Age at first birth‡ (per 5 year increase)    0.15 (1.17)  0.03   0.15 (1.16)  0.03  -0.09–0.15  0.00–0.15  0.00–0.66  0.31  0.30
Age at menarche (per 2 year increase)       -0.16 (0.86)  0.03  -0.18 (0.84)  N/A   N/A         N/A        N/A        N/A   N/A
Height (per 3 inch increase)                 0.13 (1.14)  0.03   0.14 (1.14)  N/A   N/A         N/A        N/A        N/A   N/A
Adjusted for age, fasting status, blood collection time of the day, mammography batch (NHS batch 1, NHS batch 2 or NHSII), current and at age 18 BMI,
adolescent somatotype, history of BBD, parity, age at first birth, and age at menarche
† Not adjusted for adolescent somatotype, BMI, current or at age 18
‡ Among parous women only (478 cases, 1499 controls)