Assignment 7 (Solution)
A problem of interest to health officials (and others) is to determine the effects of smoking during pregnancy on
infant health. One measure of infant health is birth weight; a birth weight that is too low can put an infant at risk for
contracting various illnesses. Since factors other than cigarette smoking that affect birth weight are likely to be
correlated with smoking, we should take those factors into account. For example, higher income generally results in
access to better prenatal care, as well as better nutrition for the mother. An equation that recognizes this is:
bwght = B0 + B1*cigs + B2*faminc + u
(i) What is the most likely sign for B2? (Positive: higher-income families can secure better health care and
nutrition for expectant mothers, which should result in healthier infants at birth.)
(ii) Do you think cigs and faminc are likely to be correlated? Explain why the correlation might be positive or
negative. (Yes. Higher faminc is indicative of higher-status, better-educated families, so family income
should be negatively correlated with smoking during pregnancy. Remember: cigs measures cigarettes
smoked during pregnancy, so do not answer this in terms of the ability to purchase more cigarettes.)
(iii) What STATA command will you use to confirm your answer for (ii)? (corr cigs faminc)
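For readers who want to see what the correlation in (iii) measures, here is a minimal Python sketch of the sample Pearson correlation. The function name pearson_corr and the five data points are made up for illustration; they are not the assignment's data, and this is not how Stata implements corr internally.

```python
from math import sqrt

def pearson_corr(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up illustrative values, NOT the assignment's data.
# faminc is in thousands of dollars.
cigs = [0, 0, 5, 10, 20]
faminc = [60.0, 45.0, 30.0, 25.0, 15.0]

r = pearson_corr(cigs, faminc)
print(f"corr(cigs, faminc) = {r:.3f}")
```

A negative value of r is what the answer to (ii) leads us to expect in the real data.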
(iv) A correlation matrix for all three variables is given below. In what direction will omitting faminc from a
regression of bwght on cigs bias your estimate of B1? Explain. (The estimate of B1 will be biased if and only if the
omitted variable is correlated with your variable of interest (cigs) and partly determines your outcome of
interest (bwght). The correlation matrix confirms that faminc is correlated with both cigs and bwght, so
the coefficient on cigs will be biased. Sign of the bias = (sign of the correlation between the omitted variable
and the outcome) times (sign of the correlation between the omitted variable and the included variable of
interest). Here that is (+) times (-) = (-), so the coefficient on cigs will be biased downward.)
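The sign rule in (iv) can be written out as the standard omitted-variable-bias result. The tilde notation and the symbol \delta_1 below are introduced here for illustration and do not appear in the assignment; \tilde{B}_1 denotes the slope from the short regression of bwght on cigs alone.

```latex
% Short regression (faminc omitted):  bwght = \tilde{B}_0 + \tilde{B}_1 \, cigs + v
% Auxiliary regression:               faminc = \delta_0 + \delta_1 \, cigs + e
\begin{align*}
E(\tilde{B}_1) &= B_1 + B_2 \, \delta_1 \\
\text{bias}    &= E(\tilde{B}_1) - B_1 = B_2 \, \delta_1
\end{align*}
% With B_2 > 0 (part i) and \delta_1 < 0 (part ii), the bias B_2 \delta_1 is negative.
```

Since \delta_1 has the same sign as the correlation between cigs and faminc, a positive B2 combined with a negative correlation gives a negative bias, which matches the downward bias stated above.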
(v) The regression output for your model is reported below. Discuss your results. Begin by forming your equation,
interpreting your slope estimates, reporting the significance of your estimates, and explaining the predictive power
of your model.
(Use the results to form your equation. Interpret all coefficients, and use the p-values and t-stats to determine
whether each variable has a significant impact on bwght at the 5% significance level. Finally, report how much
of the total variation in bwght is explained by your model, and whether information on cigs and faminc alone
is enough for you to confidently predict any infant’s bwght.)
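As a rough illustration of the significance check in (v), the sketch below forms a t-statistic as coefficient divided by standard error and compares it with 1.96, the large-sample two-sided critical value at the 5% level. The function name is_significant and the numbers passed to it are hypothetical placeholders, not the assignment's regression output; in practice you would read the t-stats and p-values directly from the output.

```python
def is_significant(coef, std_err, critical=1.96):
    """Two-sided test of H0: coefficient = 0.

    Returns the t-statistic and whether |t| exceeds the critical value.
    1.96 is the large-sample 5% two-sided critical value (an assumption
    that the sample is big enough for the normal approximation).
    """
    t_stat = coef / std_err
    return t_stat, abs(t_stat) > critical

# Hypothetical coefficient and standard error pairs, for illustration only.
t_a, sig_a = is_significant(coef=-0.5, std_err=0.1)
t_b, sig_b = is_significant(coef=0.1, std_err=0.1)

print(f"t = {t_a:.2f}, significant at 5%: {sig_a}")
print(f"t = {t_b:.2f}, significant at 5%: {sig_b}")
```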
(vi) What is the predicted weight of an infant whose mother smokes 5 cigarettes per day and whose family’s income
was $20,000? (Plug in 5 for cigs and 20 for faminc in your equation, not 20,000, because faminc is measured in
thousands of dollars.)
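The unit conversion in (vi) is easy to get wrong, so here is a small Python sketch of the plug-in calculation. The function name predict_bwght and the coefficient values are placeholders chosen for illustration; they are not the estimates from the assignment's regression output and should be replaced with those estimates.

```python
def predict_bwght(b0, b1, b2, cigs, faminc_dollars):
    """Fitted birth weight from bwght = b0 + b1*cigs + b2*faminc.

    faminc enters the regression in thousands of dollars, so the
    dollar amount is rescaled before it is plugged in.
    """
    faminc_thousands = faminc_dollars / 1000.0
    return b0 + b1 * cigs + b2 * faminc_thousands

# Placeholder coefficients for illustration only; replace them with the
# estimates from the regression output.
B0, B1, B2 = 100.0, -0.5, 0.1

predicted = predict_bwght(B0, B1, B2, cigs=5, faminc_dollars=20_000)
print(f"predicted bwght = {predicted:.2f}")
```

Passing the income in dollars and converting inside the function keeps the "20, not 20,000" step in one place.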