Stats Week 2 ALA
This page is not graded; it is for practicing using the Histogram tool.
Follow the instructions for creating a histogram on pages 96-99 of Salkind, using the data
set given here. You will need to type out the bins that you want in a separate "Bins"
column. Each number in the Bins column is the upper limit of a bin, meaning the largest
value that bin will hold (Salkind is not super clear on this). For example, if you give the bins as
20
25
30
35
then Excel will create five bins:
the first one will have all values 20 or less
the second one will have values greater than 20 but less than or equal to 25
the third one will have values greater than 25 but less than or equal to 30
the fourth one will have values greater than 30 but less than or equal to 35
the fifth one will be called "More" and will have all values greater than 35.
Be careful with your highest bin entry. Excel always creates a bin called "More" for any
scores that are higher than the highest one you gave. To avoid an empty "More" bin taking
up space, don't enter any number that is greater than or equal to your highest data point
as a bin.
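To make the bin rule concrete, here is a minimal Python sketch of the same logic. It is only an illustration of the rule described above, not Excel's actual code. The function name `excel_style_bins` and the sample values are made up for this example.

```python
from bisect import bisect_left

def excel_style_bins(values, bin_maxima):
    """Count values per bin, treating each bin number as an inclusive upper limit."""
    edges = sorted(bin_maxima)
    counts = [0] * (len(edges) + 1)  # the extra last slot plays the role of "More"
    for v in values:
        # bisect_left gives the index of the first edge that is >= v,
        # or len(edges) when v is above every edge (the "More" bin).
        counts[bisect_left(edges, v)] += 1
    labels = [str(e) for e in edges] + ["More"]
    return list(zip(labels, counts))

# Hypothetical values, chosen only so that every bin gets exercised.
sample = [18, 20, 21, 25, 26, 30, 33, 35, 36, 40]
result = excel_style_bins(sample, [20, 25, 30, 35])
for label, count in result:
    print(label, count)
```

Notice that 20 lands in the first bin and 25 in the second, because each bin number is included in its own bin.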
You should experiment with different bin sizes. Too small, and there will only be 1 or 2 data
points in each bin. Too big, and everything will end up in two or three bins and you won't
be able to see much detail. For the data set on this page, making a new bin at every integer
works fairly well. You should always make all the bins the same size (i.e., type in bin
numbers that are equally spaced).
For histograms, you should not check the Labels box. Don't include any labels in your Input
Range. Also, remember that you need to check the "Chart Output" box at the bottom, or
else it won't give you a graph.
For some reason, Excel creates a chart that doesn't look like a histogram. It looks like a
column chart, because the columns are separated from each other. To make it look like a
real histogram, right-click on one of the columns in the chart and click Format Data Series.
Then reduce the Gap Width to 0%.
Number of years at a company before retirement Bins
25.2
27.6
31.2
33.6
28.8
28.8
31.2
27.6
24.0
22.8
25.2
28.8
30.0
32.4
28.8
30.0
27.6
27.6
31.2
33.6
26.4
26.4
25.2
27.6
31.2
33.6
28.8
28.8
31.2
27.6
24.0
22.8
31.2
33.6
28.8
28.8
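If you want to compare bin sizes outside of Excel, the sketch below counts the values above using equally spaced bins of a chosen width. It is a rough stand-in for the Histogram tool, assuming the same "upper limit" rule for bins. The helper names `equal_width_bins` and `count_per_bin` are invented for this example.

```python
from bisect import bisect_left
import math

# The 36 values from the table above.
years = [
    25.2, 27.6, 31.2, 33.6, 28.8, 28.8, 31.2, 27.6, 24.0, 22.8, 25.2, 28.8,
    30.0, 32.4, 28.8, 30.0, 27.6, 27.6, 31.2, 33.6, 26.4, 26.4, 25.2, 27.6,
    31.2, 33.6, 28.8, 28.8, 31.2, 27.6, 24.0, 22.8, 31.2, 33.6, 28.8, 28.8,
]

def equal_width_bins(values, width):
    """Equally spaced bin maxima that stay below the highest data point."""
    edge = math.ceil(min(values) / width) * width
    edges = []
    while edge < max(values):  # never add a bin >= the highest value
        edges.append(edge)
        edge += width
    return edges

def count_per_bin(values, edges):
    """Counts for each bin, plus a final "More" count."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_left(edges, v)] += 1
    return counts

for width in (1, 3):
    edges = equal_width_bins(years, width)
    counts = count_per_bin(years, edges)
    print("width", width)
    for label, count in zip([str(e) for e in edges] + ["More"], counts):
        print(f"  {label:>4} {'*' * count}")
```

Trying a few widths this way shows the trade-off: narrow bins leave gaps, while wide bins hide the shape.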
This is a data set of 50 comprehension scores.
a. Create a histogram for this data set, choosing bins that give a clear, informative visual
representation. Remember, don't enter a bin that is greater than or equal to your highest
data point; Excel will create a "More" bin to handle the highest scores.
b. Is the skewness of this distribution positive or negative? What does that tell you about the
distribution?
Comp Score
12 36 49 29
15 34 45 54
11 33 45 56
16 38 47 57
21 42 43 59
25 44 31 54
21 47 12 56
8 54 14 43
6 55 15 44
2 51 16 41
22 56 22 42
26 53 29 7
27 57
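For part b, it can help to see how the sign of skewness relates to the tail of a distribution. The sketch below uses the adjusted sample skewness formula (the one documented for Excel's SKEW function) on two small made-up lists, not on the scores above. The function name `sample_skewness` is just for this example.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

# Hypothetical data: one list with a long right tail, and its mirror image.
right_tail = [1, 2, 2, 3, 3, 3, 4, 12]
left_tail = [-x for x in right_tail]

print("right tail:", sample_skewness(right_tail))  # positive
print("left tail: ", sample_skewness(left_tail))   # negative
```

A long tail toward the high values gives positive skewness, and a long tail toward the low values gives negative skewness.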
This data set compares a set of athletes in two categories: speed (measured in feet per
second for a 50-yard swim) and strength (number of pounds bench-pressed). The two
numbers in the same row belong to the same athlete.
a. Use the Correlation tool to calculate the correlation coefficient. Is the correlation direct or
indirect? How strong is it, based on Table 6.3 on page 135 of Salkind?
b. Find the coefficient of determination. What percent of the variability in speed is explained
by the variability in strength?
c. Why do you think the correlation was not stronger than this?
Speed Strength
6.94 135
6.41 213
5.66 243
5.88 167
7.21 120
7.69 134
7.18 209
8.02 176
5.03 156
5.23 177
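As a cross-check on the Correlation tool, here is a minimal sketch of the Pearson correlation coefficient and the coefficient of determination, written out by hand. It assumes the standard Pearson formula is what the tool reports. The function name `pearson_r` is made up for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Values from the table above; each pair belongs to one athlete.
speed = [6.94, 6.41, 5.66, 5.88, 7.21, 7.69, 7.18, 8.02, 5.03, 5.23]
strength = [135, 213, 243, 167, 120, 134, 209, 176, 156, 177]

r = pearson_r(speed, strength)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

Multiplying r squared by 100 gives the percent of variability in one variable that is explained by the other.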
No Data Analysis Tools needed on this page.
b. Do these two people make a reliable interviewing team? Why or why not?
Candidate Evaluation 1 Evaluation 2
1 1 1
2 1 1
3 1 1
4 1 1
5 0 0
6 0 0
7 1 0
8 0 1
9 0 1
10 1 0
11 1 0
12 0 0
13 0 0
14 1 0
15 1 0
16 1 1
17 0 1
18 0 0
19 1 0
20 1 1
21 1 1
22 1 0
23 0 0
24 0 1
25 0 0
26 1 1
27 1 0
28 0 1
29 1 0
30 0 1
31 1 0
32 0 0
33 1 1
34 0 1
35 0 0
36 1 0
37 1 1
38 1 0
39 0 1
40 0 0
41 0 1
42 1 1
43 0 1
44 1 0
45 1 0
46 1 0
47 1 1
48 0 0
49 0 1
50 0 0
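Part a of this exercise is not shown on this page. Assuming it asks for interrater reliability as the proportion of candidates on which the two evaluations agree, the calculation could be sketched as follows. The function name `percent_agreement` and the five sample ratings are hypothetical, not taken from the table above.

```python
def percent_agreement(ratings_a, ratings_b):
    """Number of agreements divided by the number of possible agreements."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of the same length")
    agreements = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return agreements / len(ratings_a)

# Hypothetical ratings for five candidates (1 and 0 are the two possible ratings).
evaluator_1 = [1, 1, 0, 1, 0]
evaluator_2 = [1, 0, 0, 1, 1]

agreement = percent_agreement(evaluator_1, evaluator_2)
print(f"agreement = {agreement:.2f}")
```

The closer this value is to 1, the more consistently the two evaluators rate the same candidates the same way.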