Ch03 ISM
3.5 Half of the Wired readers have an income of no more than $97,661 while half of the Wired.com
readers have an income of no more than $87,333.
3.7 (a) Half of the new houses were sold at a price no higher than $323,100.
(b) On average, the sales price of houses was $370,800.
(c) The sales price of new houses in 2016 is right-skewed because the mean ($370,800) is
greater than the median ($323,100).
3.9 (a)
Total passengers % Change
(b)
Variance 2.55137E+14 0.001
Standard Deviation 15972998.61 0.038
Range 63,810,186 0.200
Coefficient of Variation 27.8% 69.3%
The Z scores start as follows: 2.914, 2.405, 1.933, etc., for the total number of passengers and
-1.522, -1.052, and -0.006 for the percentage change.
In the case of the total number of passengers, the mean is somewhat larger than the median
and there seem to be one or more outliers with Z scores above 2. The percentage change is
close to symmetric.
(b) Because the coefficient of variation is 57.32% for one-year CDs and 38.29% for
five-year CDs, one concludes that, relative to the mean, one-year CDs are much
more variable than five-year CDs.
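The comparison in (b) rests on the coefficient of variation, the sample standard deviation divided by the sample mean and expressed as a percentage. A minimal Python sketch follows; the function name and the sample yields are hypothetical and are not the textbook's CD data.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample coefficient of variation as a percentage."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return std_dev / mean * 100

# Hypothetical yields for illustration only, NOT the textbook's CD data.
one_year = [0.5, 1.0, 1.5, 2.0]
five_year = [2.0, 2.2, 2.4, 2.6]

cv_one = coefficient_of_variation(one_year)
cv_five = coefficient_of_variation(five_year)
print(f"one-year CV: {cv_one:.1f}%  five-year CV: {cv_five:.1f}%")
```

Because the coefficient of variation is unit-free, it allows a comparison of variability between data sets whose means differ considerably.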
3.17 (a)
(c) The three-year return is higher for low-risk funds than for average-risk or high-risk funds for both
the growth and the value funds. However, this pattern changes when market cap categories are
considered. For example, the three-year return percentage for growth funds with average risk is
much higher for large-cap funds than for midcap or small-cap funds. Also, for value
funds with average risk, the three-year return for midcap funds is higher than the return for large-cap
funds. The standard deviation for high-risk funds is higher than for average-risk or low-risk
funds for both the growth and value funds. The standard deviation for high-risk large-cap
growth funds is much larger than for any other category.
3.19 (a)
(c) The mean three-year return for large-cap funds is much higher than mid-cap or
small-cap funds. In all risk categories except large-cap average risk, five-star
funds have the highest mean three-year return. Large-cap, five-star, high-risk
funds have the highest mean three-year return and the lowest standard deviation.
The highest standard deviation is found in mid-cap, average-risk, one-star funds.
The distribution is left-skewed.
(d) Answers are the same.
3.27
3.29 (a)
(b) For the five-number summaries, see Min, Q1, Median, Q3, and Max in part (a).
(c)
3.35 (a) The mean dollar price of a Big Mac was $3.47 in July 2018. The standard deviation is
about $1.05, which means that the typical distance between the prices of the
individual countries and the population mean price is about $1.05.
(b) Based on the descriptive statistics and the boxplot, the distribution is right-skewed.
(b) On average, the market capitalization for this population of 30 companies is $185.8
billion. The typical distance between the market capitalization and the mean market
capitalization for this population of 30 companies is $133.7 billion.
3.39 (a) The study suggests that the perceived usefulness of smartphones in an educational setting
and the number of times students used their smartphone to send or read email for class
purposes are positively correlated.
(b) There could be a cause and effect relationship between perceived usefulness of
smartphones and the number of times students used their smartphone to send or read
email for class purposes. The more a student uses their smartphone for class the
more they may feel it is useful in an educational setting.
(c) The correlation coefficient is more valuable for expressing the relationship because it
does not depend on the units used.
(d) There is a strong positive linear relationship between U.S. gross and worldwide gross,
between first weekend gross and worldwide gross, and between first weekend gross and U.S. gross.
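The point in part (c), that the correlation coefficient does not depend on the units used while the covariance does, can be illustrated with a short Python sketch. The helper names and the dollar figures are hypothetical and are not the movie data from the exercise.

```python
import math

def sample_covariance(x, y):
    """Sample covariance with an n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Coefficient of correlation: covariance divided by both standard deviations."""
    return sample_covariance(x, y) / math.sqrt(
        sample_covariance(x, x) * sample_covariance(y, y)
    )

# Hypothetical values in millions of dollars, NOT the exercise's data.
x_millions = [10.0, 20.0, 30.0, 40.0]
y_millions = [25.0, 45.0, 70.0, 80.0]

# The same values expressed in thousands of dollars.
x_thousands = [v * 1000 for v in x_millions]
y_thousands = [v * 1000 for v in y_millions]

cov_m = sample_covariance(x_millions, y_millions)
cov_k = sample_covariance(x_thousands, y_thousands)
r_m = correlation(x_millions, y_millions)
r_k = correlation(x_thousands, y_thousands)

print(f"covariance: {cov_m:.2f} (millions) vs {cov_k:.2f} (thousands)")
print(f"correlation: {r_m:.4f} (millions) vs {r_k:.4f} (thousands)")
```

Changing the units multiplies the covariance by a factor of one million here, while the correlation stays the same and always lies between -1 and +1.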
3.43 (a), (b)
Data Set                    GlobalInternetUsage  GlobalSocialMedia
Covariance                  152808.40            23400.0847
Coefficient of Correlation  0.86269              0.300218
(c) There is a strong positive linear relationship between the percentage of adults polled who
use the Internet at least occasionally and GDP. There is a very weak positive linear
relationship between social media usage and GDP.
3.45 Central tendency or location refers to the fact that most sets of data show a distinct tendency to
group or cluster about a certain central point.
3.47 The first quartile is the value below which ¼ of the total ranked observations will fall, the median
is the value that divides the total ranked observations into two equal halves and the third quartile
is the observation above which ¼ of the total ranked observations will fall.
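As an illustration of these definitions, the sketch below computes the three quartiles with Python's standard library. The data values are hypothetical, and `statistics.quantiles` interpolates between ranked values, so its results may differ slightly from the rounding rules given in the text.

```python
import statistics

# Hypothetical sample of seven observations, for illustration only.
data = [14, 3, 9, 21, 5, 12, 17]

# n=4 asks for the three cut points that split the ranked data into quarters.
# The default "exclusive" method locates Q1 at the (n + 1) / 4 ranked value.
q1, median, q3 = statistics.quantiles(data, n=4)

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}")
```

With seven ranked values (3, 5, 9, 12, 14, 17, 21), Q1 is the second ranked value, the median is the fourth, and Q3 is the sixth.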
3.49 The Z score measures how many standard deviations an observation in a data set is away from the
mean.
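In symbols, Z = (X - mean) / standard deviation. A minimal Python sketch of the calculation follows; the function name and the sample values are hypothetical.

```python
import statistics

def z_scores(values):
    """Return the Z score of each value: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(v - mean) / std_dev for v in values]

# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(sample)

for value, z in zip(sample, scores):
    print(f"X = {value}: Z = {z:+.3f}")
```

A value equal to the mean has a Z score of 0, and values with Z scores below -3 or above +3 are commonly treated as candidate outliers.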
3.51 The empirical rule relates the mean and standard deviation to the percentage of values that will
fall within a certain number of standard deviations of the mean.
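For bell-shaped data, the rule is usually stated as approximately 68%, 95%, and 99.7% of values falling within 1, 2, and 3 standard deviations of the mean. The sketch below recovers those percentages from an idealized normal distribution; using `statistics.NormalDist` for this is an illustrative choice, not something the text prescribes.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Real data sets only approximate these percentages, and the rule should not be applied to heavily skewed data.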
3.53 Shape is the manner in which the data are distributed. The shape of a data set can be symmetrical
or asymmetrical (skewed).
3.55 For symmetrical distributions, the boxplot is also symmetrical, with the median splitting the box
in half and whiskers of equal length. For left-skewed distributions, the boxplot's left whisker will
be longer and the median will be located in the right half of the box. For right-skewed
distributions, the boxplot's right whisker will be longer and the median will be located in the
left half of the box.
(c)
3.59
(a) Mean = 43.04 Median = 28.5 Q1 = 14 Q3 = 54
(b) Range = 164 Interquartile range = 40 Variance = 1,757.79
Standard deviation = 41.926 Coefficient of variation = 97.41%
(c) Box-and-whisker plot for Days to Resolve Complaints
[Figure: box-and-whisker plot; horizontal axis: Days, 0 to 150]
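The derived measures reported in (b) can be cross-checked against the other reported values. The sketch below is only a consistency check in Python; it works from the rounded figures printed in the solution, not from the raw complaint data, and the variable names are illustrative.

```python
# Values as reported in parts (a) and (b) of the solution.
mean = 43.04
q1, q3 = 14, 54
std_dev = 41.926

# Interquartile range: the spread of the middle 50% of the data.
iqr = q3 - q1

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv = std_dev / mean * 100

print(f"IQR = {iqr}, variance = {variance:.2f}, CV = {cv:.2f}%")
```

The recomputed values agree with the reported interquartile range of 40, variance of 1,757.79, and coefficient of variation of 97.41%.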
Minitab Output:
The mean call duration is 232.8 seconds. The median (middle-ranked) call duration is 228 seconds.
The data are symmetric apart from one high outlier. The range, calculated as the difference
between the largest and smallest call durations, is 1076 seconds. The standard deviation,
the typical distance between a call duration and the mean call duration for this sample
of 50 customers, is 158.7 seconds.
(b) Using the formulas in the text with n = 50, Q1 = 139 and Q3 = 273.
Therefore, the five-number summary is 65, 139, 228, 273, 1141.
(c) Disregarding the outlier, the distribution is left skewed.
(d) 75% of the call durations are less than 273 seconds. The target call duration of less than
240 seconds is only met for approximately 60% of calls. One could conclude the target is
not met.
* Note: Minitab uses a slightly different formula to calculate the quartiles. The five-number
summary is min, Q1, median, Q3, max.
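The outlier and skewness statements above can be checked against the five-number summary from (b). The sketch below uses the 1.5 × IQR fence rule, a common boxplot convention that is an assumption here and is not stated in the solution.

```python
# Five-number summary from part (b): min, Q1, median, Q3, max (seconds).
minimum, q1, median, q3, maximum = 65, 139, 228, 273, 1141

iqr = q3 - q1

# Assumed convention: values beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

max_is_outlier = maximum > upper_fence
min_is_outlier = minimum < lower_fence

# Within the box, a longer lower half (median - Q1) than upper half (Q3 - median)
# points toward left skewness once the outlier is set aside.
lower_half = median - q1
upper_half = q3 - median

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"max is outlier: {max_is_outlier}, min is outlier: {min_is_outlier}")
print(f"median - Q1 = {lower_half}, Q3 - median = {upper_half}")
```

Under this convention the maximum of 1141 seconds lies far above the upper fence, while the minimum of 65 seconds is inside the lower fence.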
(b)
There is a positive correlation between the cost of a meal and the summated rating. The higher
priced restaurants tend to receive higher ratings than the lower priced restaurants.
(d) The median cost of a meal in the center city is $52 while the median cost of a meal in the
metro area is $42.50. The range in costs of meals in the center city is greater than the
range in costs of meals in the metro area.
(c) The mean prices of the two-star, three-star, and four-star hotels are 103.66, 148.35, and
206.38 Canadian dollars, respectively, while the median prices are 106, 146, and 208
Canadian dollars, respectively. The ranges (the difference between the highest and lowest
prices) for the two-star, three-star, and four-star hotels are 159, 249, and 262 Canadian
dollars, respectively, while the interquartile ranges (the spread of the middle 50% of
prices) are 68, 75, and 81.75 Canadian dollars, respectively. The standard deviations (the
typical distance of the prices from the mean) for the two-star, three-star, and four-star
hotels are 42.98, 58.85, and 64.48 Canadian dollars, respectively. The coefficients of
variation (the spread around the mean relative to the mean price) for the two-star,
three-star, and four-star hotels are 41.1%, 39.3%, and 31%, respectively.
(d) The prices of the two-star and four-star hotels are roughly symmetric while the three-star
hotels are slightly right-skewed.
(e) Covariance Matrix
(c)
(d) The correlation between mean connection speed and mean peak connection speed is
0.768. The correlation between the percent of time the speed is above 4 Mbps and the percent of
time the connection speed is above 10 Mbps is 0.800.
(e) The average of the average connection speeds for the various countries surveyed is
9.028 Mbps. Half of the countries surveyed have average connection speeds below 8.7
Mbps. One-quarter of the countries surveyed have average connection speeds below 4.65
Mbps while another one-quarter have average connection speeds above 12.55 Mbps.
The range of average connection speeds is 19 Mbps. The middle 50% of the countries
have average connection speed spread over 7.9 Mbps. The typical spread of the average
connection speed around the mean is 4.745 Mbps.
The average of the average peak connection speeds for the various countries surveyed is
47.473 Mbps. Half of the countries surveyed have average peak connection speeds
below 47.90 Mbps. One-quarter of the countries surveyed have average peak connection
speeds below 29.550 Mbps while another one-quarter have average peak connection
speeds above 58.1 Mbps. The range of average peak connection speeds is 123.6 Mbps.
The middle 50% of the countries have average peak connection speeds spread over 28.550
Mbps. The typical spread of the average peak connection speed around the mean is
23.134 Mbps.
For the countries surveyed, the average of the percent of time the connection speed is
above 4 Mbps is 67.655% while the average of the percent of time the connection speed is
above 10 Mbps is 25.865%.
Because the coefficient of variation is 52.1% for average connection speed, 48.3% for
average peak connection speed, 42.4% for percent of time above 4 Mbps, and 83.8% for
percent of time above 10 Mbps, one concludes that, relative to the mean, percent of time
above 10 Mbps is much more variable than the other measures.
(f) There is a positive linear relationship between mean connection speed and mean peak
connection speed and there is also a positive linear relationship between percent of time
connection speed is above 4 Mbps and the percent of time connection speed is above 10
Mbps.
(b)
Range 19.01
IQR 3.14
Variance 10.97
Standard Deviation 3.31
Coefficient of Variation 20.66%
(c) Boxplot
Students will need to create the boxplot. The average commuting distance seems to be
somewhat right-skewed based on the boxplot and the mean is slightly larger than the
median. This indicates the average commuting distance in some regions is quite long.
The correlation coefficient is 0.719.
(d) The mean of the average weekly commuting times is 16.04 minutes. Half of the
average weekly commuting times are less than 15.66 minutes. The range of average weekly
commuting times is 19.01 minutes. The middle 50% of the average weekly commuting
times spread over 3.14 minutes. The typical spread of average weekly commuting time
around the mean is 3.31 minutes.
3.73 The variables “gender” and “major” are categorical and cannot be summarized with boxplots
because boxplots are created using the data from numerical variables. Similarly, the mean is a
statistic computed on numerical variables, so it is not appropriate for the categorical variables “gender”
or “major”. Pie charts are used for categorical variables, so they should not be created using data
from the numerical variables “grade point average” and “height”.