Business Analytics Module 2 Summary
© Copyright 2020 President and Fellows of Harvard College. All Rights Reserved.
o The mean of any single sample lies on the normally distributed Distribution
of Sample Means, so we can use the normal curve’s special properties to
draw conclusions from a single sample mean.
o The mean of the Distribution of Sample Means equals the mean of the
population distribution.
o The standard deviation of the Distribution of Sample Means equals the
standard deviation of the population distribution divided by the square root
of the sample size. Thus, increasing the sample size decreases the width
of the Distribution of Sample Means.
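The three properties above can be checked with a small simulation. This is only an illustrative sketch: the skewed population, the sample sizes, and the helper names `standard_error` and `simulate_sample_means` are assumptions made for the example, not part of the course material.

```python
import random
import statistics
from math import sqrt


def standard_error(population_sd: float, n: int) -> float:
    """Standard deviation of the Distribution of Sample Means: sd / sqrt(n)."""
    return population_sd / sqrt(n)


def simulate_sample_means(population, n, draws, seed=0):
    """Draw `draws` random samples of size n (with replacement); return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(draws)]


# Hypothetical population: 20,000 values from a skewed (exponential) distribution.
_rng = random.Random(1)
population = [_rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

means_25 = simulate_sample_means(population, n=25, draws=4_000)
means_100 = simulate_sample_means(population, n=100, draws=4_000)

# The mean of the sample means should sit close to the population mean.
print(round(pop_mean, 3), round(statistics.fmean(means_25), 3))
# The spread of the sample means should be close to sd / sqrt(n),
# and it should shrink as n grows from 25 to 100.
print(round(standard_error(pop_sd, 25), 3), round(statistics.stdev(means_25), 3))
print(round(standard_error(pop_sd, 100), 3), round(statistics.stdev(means_100), 3))
```

Quadrupling the sample size from 25 to 100 should roughly halve the width of the Distribution of Sample Means, because the width scales with one over the square root of n.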
• The sample mean is only a point estimate. Using the properties of the normal
distribution and the Central Limit Theorem, we can construct a range around the
sample mean, called a confidence interval, to estimate the range in which the
true population mean likely lies.
o The width of the confidence interval depends on the level of confidence,
our best estimate of the population standard deviation, and the sample
size. We can only control the level of confidence and the sample size.
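o For large samples (n ≥ 30), the lower and upper bounds are calculated
using the following equation: $\bar{x} \pm z \frac{s}{\sqrt{n}}$, where
$\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is
the sample size, and $z$ is the z-value for the chosen level of confidence.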
o The function CONFIDENCE.NORM calculates the margin of error, which
we add and subtract from the sample mean to find the confidence interval.
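o For small samples (n < 30), the lower and upper bounds are calculated
using the following equation: $\bar{x} \pm t \frac{s}{\sqrt{n}}$, where
$\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is
the sample size, and $t$ is the t-value for the chosen level of confidence
with $n - 1$ degrees of freedom.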
§ For small samples, we use a t-distribution, which has a lower peak
and wider tails than a normal distribution. The t-distribution
therefore gives a wider interval, that is, a more conservative
estimate of where the true population mean lies.
§ The function CONFIDENCE.T calculates the margin of error, which
we add and subtract from the sample mean to find the confidence
interval.
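As a rough sketch of how these intervals are computed, the Python below uses the standard library's `statistics.NormalDist` for the z-value. The function names and the sample figures are hypothetical. Because the Python standard library has no t-distribution, the t-value is passed in as an argument; the value 2.131 used here is the usual table value for 95% confidence with 15 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(critical_value: float, sd: float, n: int) -> float:
    """Critical value times the estimated standard deviation of the sample mean."""
    return critical_value * sd / sqrt(n)


def z_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample interval (n >= 30) based on the normal distribution."""
    alpha = 1 - confidence                      # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided z-value
    m = margin_of_error(z, sample_sd, n)
    return sample_mean - m, sample_mean + m


def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Small-sample interval (n < 30); t_critical must be looked up for n - 1 df."""
    m = margin_of_error(t_critical, sample_sd, n)
    return sample_mean - m, sample_mean + m


# Hypothetical large sample: n = 100, mean = 50, standard deviation = 10.
low, high = z_confidence_interval(50.0, 10.0, 100, confidence=0.95)
print(round(low, 2), round(high, 2))  # 48.04 51.96

# Hypothetical small sample: n = 16, so 15 degrees of freedom (t = 2.131 at 95%).
t_low, t_high = t_confidence_interval(50.0, 10.0, 16, t_critical=2.131)
z_low16, z_high16 = z_confidence_interval(50.0, 10.0, 16)
print(t_high - t_low > z_high16 - z_low16)  # True: the t-based interval is wider
```

The comparison at the end applies the normal-based formula to the small sample only to show the difference in width; for n < 30 the t-based interval is the one to report.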
• We can also calculate confidence intervals for proportions. To do so, we must
convert data to dummy (0, 1) variables.
o After that, we can proceed as we would with any other confidence interval.
§ When estimating the true population proportion, we should ensure
that the sample size is large enough by checking that both of the
following conditions are true: np ≥ 5 and n(1 − p) ≥ 5, where n is
the sample size and p is the sample proportion. If either condition
is not satisfied, we must collect a larger sample.
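The proportion workflow can be sketched in Python as follows. The survey data and the function names are hypothetical. Following the approach described above, the interval is built from the mean and standard deviation of the dummy variable, exactly as for any other sample.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def to_dummy(responses, success_value):
    """Convert raw responses to dummy (0, 1) variables."""
    return [1 if r == success_value else 0 for r in responses]


def sample_size_ok(n: int, p: float) -> bool:
    """Check both conditions: n*p >= 5 and n*(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def proportion_confidence_interval(dummies, confidence=0.95):
    """Interval for a proportion, treating the dummy column like any other sample."""
    n = len(dummies)
    p = fmean(dummies)                 # the sample proportion is the mean of the dummies
    if not sample_size_ok(n, p):
        raise ValueError("Sample too small: collect a larger sample.")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev(dummies) / sqrt(n)
    return p - margin, p + margin


# Hypothetical survey: 60 "yes" answers out of 200 responses.
responses = ["yes"] * 60 + ["no"] * 140
dummies = to_dummy(responses, "yes")
p_low, p_high = proportion_confidence_interval(dummies)
print(round(p_low, 3), round(p_high, 3))
```

With only 20 responses and a proportion of 0.1, `sample_size_ok(20, 0.1)` is false because n·p is 2, so the sketch refuses to build an interval rather than return a misleading one.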
EXCEL SUMMARY
Recall the Excel functions and analyses covered in this course and make sure to
familiarize yourself with all of the necessary steps, syntax, and arguments. We have
provided some additional information for the more complex functions listed below. As
usual, the arguments shown in square brackets are optional. The functions whose
names include “S” use the standard normal distribution.
• =RAND()
• =NORM.DIST(x, mean, standard_dev, cumulative)
o When cumulative is set to “TRUE”, NORM.DIST finds the cumulative
probability, that is, the probability of being less than or equal to the
specified value x, for a normal distribution with the specified mean and
standard deviation. (Inserting the value “FALSE” provides the height of the
normal distribution at the value x, which is not covered in this course.)
• =NORM.S.DIST(z, cumulative)
o When cumulative is set to “TRUE”, NORM.S.DIST finds the cumulative
probability, that is, the probability of being less than or equal to the
specified value z for a standard normal distribution.
• =NORM.INV(probability, mean, standard_dev)
o Returns the corresponding x-value on a normal distribution for the
specified mean, standard deviation, and cumulative probability.
• =CONFIDENCE.NORM(alpha, standard_dev, size)
o Returns the margin of error using a normal distribution for a specified
alpha, standard_dev, and size. Alpha is the significance level, which
equals one minus the confidence level (for example, a 95% confidence
interval would correspond to the significance level 0.05).
• =CONFIDENCE.T(alpha, standard_dev, size)
o Returns the margin of error using a t-distribution for a specified alpha,
standard_dev, and size.
• =IF(logical_test, [value_if_true], [value_if_false])
o Returns value_if_true if the specified condition is met, and returns
value_if_false if the condition is not met.
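For readers who want to cross-check spreadsheet results outside Excel, the sketch below mirrors several of the functions above using Python's `statistics.NormalDist`. The Python function names are hypothetical stand-ins chosen to echo the Excel names; they are not Excel's implementation, and CONFIDENCE.T is omitted because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import NormalDist


def norm_dist(x, mean, standard_dev, cumulative):
    """Like NORM.DIST: cumulative probability if cumulative is True, else curve height."""
    dist = NormalDist(mean, standard_dev)
    return dist.cdf(x) if cumulative else dist.pdf(x)


def norm_s_dist(z, cumulative):
    """Like NORM.S.DIST: the same calculation with mean 0 and standard deviation 1."""
    return norm_dist(z, 0.0, 1.0, cumulative)


def norm_inv(probability, mean, standard_dev):
    """Like NORM.INV: the x-value with the given cumulative probability."""
    return NormalDist(mean, standard_dev).inv_cdf(probability)


def confidence_norm(alpha, standard_dev, size):
    """Like CONFIDENCE.NORM: margin of error for significance level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)


print(round(norm_s_dist(1.96, True), 3))          # 0.975
print(round(norm_dist(110, 100, 10, True), 4))    # 0.8413
print(round(norm_inv(0.975, 0, 1), 2))            # 1.96
print(round(confidence_norm(0.05, 10, 100), 2))   # 1.96
```

The conditional expression inside `norm_dist` plays the same role as Excel's IF: it returns one value when the test is true and another when it is false.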