Chapter 2 Describing Data Using Tables and Graphs
Total Home Runs    Tally        f
135-156            |||| ||||    10
157-178            |||          3
179-200            |||| ||      7
201-222            |||| |       6
223-244            ||||         4
                                Σf = 30
Relative frequency and percentage distributions
\[ \text{Relative frequency of a class} = \frac{\text{frequency of that class}}{\text{sum of all frequencies}} = \frac{f}{\sum f} \]
\[ \text{Percentage} = (\text{relative frequency}) \times 100\% \]
Example 2.6
Calculate the relative frequency and percentage distributions for the data in Example 2.5.
Solution
Total Home Runs    Class Boundaries    Relative Frequency    Percentage
135-156            134.5 - 156.5       0.333                 33.3
157-178            156.5 - 178.5       0.100                 10.0
179-200            178.5 - 200.5       0.233                 23.3
201-222            200.5 - 222.5       0.200                 20.0
223-244            222.5 - 244.5       0.133                 13.3
                                       Sum = 0.999           Sum = 99.9%
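The calculation in this solution can be checked with a short script. This is only a minimal sketch: the variable names are illustrative, and the class frequencies are copied from the tally table for total home runs.

```python
# Class frequencies copied from the tally table for total home runs.
frequencies = {
    "135-156": 10,
    "157-178": 3,
    "179-200": 7,
    "201-222": 6,
    "223-244": 4,
}

total = sum(frequencies.values())  # sum of all frequencies

# Relative frequency = class frequency / sum of all frequencies.
relative = {c: f / total for c, f in frequencies.items()}
# Percentage = relative frequency * 100.
percentage = {c: 100 * r for c, r in relative.items()}

for c in frequencies:
    print(f"{c}  {relative[c]:.3f}  {percentage[c]:.1f}")
```

The unrounded relative frequencies add up to exactly 1. The sums 0.999 and 99.9% in the table arise only because each entry is rounded to three decimal places before adding.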
Graphing Grouped Data
Grouped (quantitative) data can be displayed in a histogram or a polygon.
Histogram
Three types of histogram
1. Frequency histogram
2. Relative frequency histogram
3. Percentage histogram
A frequency histogram consists of a set of rectangles having
a) bases on the horizontal axis, with centres at the class marks and lengths equal to the
class interval sizes;
b) areas proportional to the class frequencies.
If the class intervals all have equal size, the heights of the rectangles are proportional to
the class frequencies. Otherwise, the heights of the rectangles must be adjusted so that the
areas stay proportional to the class frequencies.
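One common way to make the adjustment is to plot frequency divided by class width, so that area (height times width) equals the frequency. The sketch below assumes that convention. Its last class is a hypothetical merge of 201-222 and 223-244, introduced only to create unequal widths; it is not part of the original example.

```python
# (lower boundary, upper boundary, frequency).
# The last class is a HYPOTHETICAL merge of 201-222 and 223-244,
# used only to illustrate unequal class widths.
classes = [
    (134.5, 156.5, 10),
    (156.5, 178.5, 3),
    (178.5, 200.5, 7),
    (200.5, 244.5, 10),
]

heights = []
for lower, upper, f in classes:
    width = upper - lower
    heights.append(f / width)  # frequency per unit of class width

# The area of each rectangle (height * width) recovers the class frequency.
areas = [h * (upper - lower) for h, (lower, upper, _) in zip(heights, classes)]

for (lower, upper, f), h in zip(classes, heights):
    print(f"{lower} - {upper}: frequency {f}, adjusted height {h:.3f}")
```

With this adjustment the merged class, which is twice as wide, is drawn at half the height its raw frequency would suggest.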
Procedure to draw a histogram:
1. Mark the class boundaries of each interval on the horizontal axis.
2. Mark the frequencies (or relative frequencies or percentages) on the vertical axis.
3. Draw a bar for each class so that its height represents the frequency (or relative
frequency or percentage) of that class. Leave no gaps between the bars.
4. Label the histogram.
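The procedure above can be imitated in plain text, with one row per class and one `#` per observation. This is only a rough sketch of a frequency histogram turned on its side, using the class boundaries and frequencies of the home-run data; a real histogram would be drawn with a plotting tool.

```python
# Class boundaries and frequencies for the home-run data.
boundaries = [134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
frequencies = [10, 3, 7, 6, 4]

lines = []
for i, f in enumerate(frequencies):
    # Step 1: each class is identified by its two boundaries.
    label = f"{boundaries[i]:>5} - {boundaries[i + 1]:<5}"
    # Steps 2-3: the bar length represents the class frequency.
    lines.append(f"{label} | {'#' * f} {f}")

# Step 4: label the display.
print("Total home runs (class boundaries) | frequency")
print("\n".join(lines))
```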
Polygon
A line graph formed by joining, with straight lines, the midpoints of the tops of successive
bars in a histogram.
To close the polygon, we mark two more classes (with zero frequencies), one at each end, and
join their midpoints to the graph.
Three types of polygon:
1. Frequency polygon
2. Relative frequency polygon
3. Percentage polygon
Table 2.4: Frequency distribution for Table 2.3
Total Home Runs    Class mark    Frequency
135-156            145.5         10
157-178            167.5         3
179-200            189.5         7
201-222            211.5         6
223-244            233.5         4
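As a check, each class mark can be computed as the midpoint of the class limits, and the vertices of the frequency polygon follow once a zero-frequency class is added at each end. The sketch below assumes equal class widths, as in this table; the variable names are illustrative.

```python
class_limits = [(135, 156), (157, 178), (179, 200), (201, 222), (223, 244)]
frequencies = [10, 3, 7, 6, 4]

# Class mark (midpoint) = (lower limit + upper limit) / 2.
class_marks = [(lo + hi) / 2 for lo, hi in class_limits]

# Distance between consecutive class marks (the classes have equal width).
width = class_marks[1] - class_marks[0]

# Frequency polygon vertices: one extra zero-frequency class at each end.
polygon = (
    [(class_marks[0] - width, 0)]
    + list(zip(class_marks, frequencies))
    + [(class_marks[-1] + width, 0)]
)

print(class_marks)  # [145.5, 167.5, 189.5, 211.5, 233.5]
print(polygon)
```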
Figure 2.1: Frequency histogram for Table 2.4
Figure 2.2: Relative frequency histogram for Table 2.4
Figure 2.3: Frequency polygon for Table 2.4
For a very large data set, as the number of classes is increased (and the width of classes is
decreased), the frequency polygon eventually becomes a smooth curve. Such a curve is called
a frequency distribution curve or simply a frequency curve. Figure 2.4 shows the frequency
curve for a large data set with a large number of classes.
Figure 2.4: Frequency distribution curve
Single-Valued Classes
Single-valued classes (classes made up of single values rather than intervals) are used when
the observations in a data set assume only a few distinct values.
They are useful for discrete data with only a few possible values.
Example 2.7
A sample of 40 randomly selected households from a city produced the following data on the
number of vehicles owned:
5 1 2 4 1 3 1 2 1 3
2 1 2 0 2 1 0 2 1 2
1 5 2 1 1 1 2 1 2 2
1 4 1 3 1 1 1 4 1 3
Construct a frequency distribution table for these data.
Solution
Vehicles owned    Number of households (f)
0                 2
1                 18
2                 11
3                 4
4                 3
5                 2
The frequency distribution can be displayed in a bar graph.
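The counting in this example can be reproduced with a few lines of code. This is a minimal sketch using Python's standard `collections.Counter`; the variable names are illustrative, and the text bars are only a stand-in for a proper bar graph.

```python
from collections import Counter

# Number of vehicles owned by each of the 40 households.
vehicles = [
    5, 1, 2, 4, 1, 3, 1, 2, 1, 3,
    2, 1, 2, 0, 2, 1, 0, 2, 1, 2,
    1, 5, 2, 1, 1, 1, 2, 1, 2, 2,
    1, 4, 1, 3, 1, 1, 1, 4, 1, 3,
]

counts = Counter(vehicles)  # each distinct value is its own class
table = [(value, counts[value]) for value in sorted(counts)]

for value, f in table:
    print(f"{value} | {'#' * f} {f}")  # one '#' per household
```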
2.4 Shapes of Histograms
Symmetric
- identical on both sides of its central point.
Skewed
- the tail on one side is longer than the tail on the other side.
Uniform or rectangular
- has the same frequency for each class.
(a) and (b) Symmetric frequency curves.
(c) Frequency curve skewed to the right.
(d) Frequency curve skewed to the left.
2.5 Cumulative frequency distribution
A table that presents the total number of values that fall below the upper boundary of
each class.
It is constructed for quantitative data only.
\[ \text{Cumulative relative frequency} = \frac{\text{cumulative frequency of a class}}{\text{sum of all frequencies in the data set}} \]
\[ \text{Cumulative percentage} = (\text{cumulative relative frequency}) \times 100\% \]
Example 2.8:
Using the frequency distribution of Table 2.4, reproduced here, prepare a cumulative
frequency distribution for the total home runs.
Total Home Runs Frequency
135-156 10
157-178 3
179-200 7
201-222 6
223-244 4
Table 2.5: Cumulative frequency distribution for Table 2.4
Total Home Runs Cumulative frequency
<156.5 10
<178.5 10+3=13
<200.5 10+3+7=20
<222.5 10+3+7+6=26
<244.5 10+3+7+6+4=30
Table 2.6: Cumulative relative frequency and cumulative percentage for Table 2.4
Total Home Runs    Cumulative relative frequency    Cumulative percentage
<156.5             10/30 = 0.333                    33.3
<178.5             13/30 = 0.433                    43.3
<200.5             20/30 = 0.667                    66.7
<222.5             26/30 = 0.867                    86.7
<244.5             30/30 = 1.000                    100.0
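The running totals in these tables can be produced with `itertools.accumulate` from the Python standard library. A minimal sketch, with illustrative variable names and the frequencies of Table 2.4:

```python
from itertools import accumulate

upper_boundaries = [156.5, 178.5, 200.5, 222.5, 244.5]
frequencies = [10, 3, 7, 6, 4]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]  # sum of all frequencies in the data set

# Cumulative relative frequency and cumulative percentage.
cumulative_relative = [c / total for c in cumulative]
cumulative_percentage = [100 * r for r in cumulative_relative]

for b, c, r, p in zip(upper_boundaries, cumulative,
                      cumulative_relative, cumulative_percentage):
    print(f"<{b}  {c:>2}  {r:.3f}  {p:.1f}")
```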
Ogive / Cumulative frequency curve
A curve drawn for the cumulative frequency distribution by joining with straight lines the
dots marked above the upper boundaries of classes at heights equal to the cumulative
frequencies of respective classes.
Note:
1. The ogive starts at the lower boundary of the first class and ends at the upper
boundary of the last class.
2. If relative cumulative frequency is used in place of cumulative frequency, the graph
is called a relative cumulative frequency curve or percentage ogive.
Figure 2.5: Ogive for the cumulative frequency distribution of Table 2.4
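The points joined by the ogive can be listed explicitly, and a value can be read off the curve by following the straight line between two neighbouring points. The sketch below uses a hypothetical helper, `values_below`, which is not part of the notes. Its result is only an approximation, because it assumes the values are spread evenly within each class.

```python
from itertools import accumulate

boundaries = [134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
frequencies = [10, 3, 7, 6, 4]

# The ogive starts at height 0 at the lower boundary of the first class,
# then rises to each cumulative frequency at the upper class boundaries.
ogive_points = list(zip(boundaries, [0] + list(accumulate(frequencies))))


def values_below(x):
    """Approximate number of values below x, read from the ogive."""
    if x <= ogive_points[0][0]:
        return 0
    if x >= ogive_points[-1][0]:
        return ogive_points[-1][1]
    for (x0, y0), (x1, y1) in zip(ogive_points, ogive_points[1:]):
        if x0 <= x <= x1:
            # Linear interpolation along the segment joining the two dots.
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


print(ogive_points)
print(values_below(167.5))  # halfway through the second class
```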
2.6 Stem-and-leaf displays
Each value is divided into two portions - a stem and a leaf. The leaves for each stem are
shown separately in a display.
Note:
1. It is constructed only for quantitative data.
2. Its advantage over a frequency distribution is that we do not lose information on
individual observations.
Example 2.9
The following are the scores of 30 college students on a statistics test.
75 69 83 52 72 84 80 81 77 96
61 64 65 76 71 79 86 87 71 79
72 87 68 92 93 50 57 95 92 98
Construct a stem-and-leaf display.
Figure 2.6: Stem-and-leaf display of test scores
Figure 2.7: Ranked stem-and-leaf display of test scores
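A stem-and-leaf display for these scores can be built by splitting each score into its tens digit (stem) and units digit (leaf). The sketch below is one possible way to do this; the function name `stem_and_leaf` and its `ranked` flag are illustrative choices, not part of the notes.

```python
scores = [
    75, 69, 83, 52, 72, 84, 80, 81, 77, 96,
    61, 64, 65, 76, 71, 79, 86, 87, 71, 79,
    72, 87, 68, 92, 93, 50, 57, 95, 92, 98,
]


def stem_and_leaf(values, ranked=False):
    """Return {stem: [leaves]} with stem = tens digit, leaf = units digit."""
    display = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        display.setdefault(stem, []).append(leaf)
    if ranked:
        # A ranked display lists the leaves of each stem in increasing order.
        display = {s: sorted(leaves) for s, leaves in display.items()}
    return dict(sorted(display.items()))


unranked = stem_and_leaf(scores)
ranked = stem_and_leaf(scores, ranked=True)

for stem, leaves in ranked.items():
    print(stem, "|", " ".join(map(str, leaves)))
```

Because every leaf is kept, the original scores can be recovered from the display, which is not possible from a grouped frequency table.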
Example 2.10
The following data are the monthly rents paid by a sample of 30 households selected from a
city.
880 1081 721 1075 1023 775 1235 750 965 960
1210 985 1231 932 850 825 1000 915 1191 1035
1151 630 1175 952 1100 1140 750 1140 1370 1280
Construct a stem-and-leaf display for these data.
Solution
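One possible split for these data, assumed here because the original display is not reproduced, is to take the hundreds as the stem (6 to 13) and the last two digits as the leaf. The sketch below builds that display; the variable names are illustrative.

```python
rents = [
    880, 1081, 721, 1075, 1023, 775, 1235, 750, 965, 960,
    1210, 985, 1231, 932, 850, 825, 1000, 915, 1191, 1035,
    1151, 630, 1175, 952, 1100, 1140, 750, 1140, 1370, 1280,
]

# ASSUMPTION: stem = hundreds (e.g. 10 for 1081), leaf = last two digits (81).
display = {}
for rent in rents:
    stem, leaf = divmod(rent, 100)
    display.setdefault(stem, []).append(leaf)
display = dict(sorted(display.items()))

for stem, leaves in display.items():
    # Two-digit leaves are zero-padded so that 1000 shows leaf 00.
    print(f"{stem:>2} | " + " ".join(f"{leaf:02d}" for leaf in leaves))
```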
Example 2.11
The following stem-and-leaf display is prepared for the number of hours that 25 students
spent working on computers during the past month.
0 6
1 1 7 9
2 2 6
3 2 4 7 8
4 1 5 6 9 9
5 3 6 8
6 2 4 4 5 7
7
8 5 6
Prepare a new stem-and-leaf display by grouping the stems.
Note:
The leaves for each stem of a group are separated by an asterisk (*).
If the stem does not contain a leaf, indicate the leaf by two consecutive asterisks.
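The grouping rule in the note can be sketched as follows. The group size of three stems (0-2, 3-5, 6-8) is an assumption made for illustration, since the notes do not state how the stems are to be grouped.

```python
# Stem -> leaves, copied from the display above (stem 7 has no leaves).
display = {
    0: [6],
    1: [1, 7, 9],
    2: [2, 6],
    3: [2, 4, 7, 8],
    4: [1, 5, 6, 9, 9],
    5: [3, 6, 8],
    6: [2, 4, 4, 5, 7],
    7: [],
    8: [5, 6],
}

GROUP_SIZE = 3  # ASSUMPTION: three stems per group

stems = sorted(display)
grouped = {}
for i in range(0, len(stems), GROUP_SIZE):
    group = stems[i:i + GROUP_SIZE]
    parts = [" ".join(map(str, display[s])) for s in group]
    # Separate the leaves of each stem with an asterisk. An empty stem
    # contributes nothing, which leaves two asterisks next to each other;
    # split/join only tidies the spacing around them.
    text = " * ".join(parts)
    grouped[f"{group[0]}-{group[-1]}"] = " ".join(text.split())

for label, leaves in grouped.items():
    print(label, "|", leaves)
```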
Dotplot display
Displays the data of a sample by representing each piece of data with a dot positioned
along a scale (horizontal scale or vertical scale).
The frequency of the values is represented along the other scale.
Example 2.5
A sample of 19 exam grades was randomly selected from a large class:
76 74 82 96 66 76 78 72 52 68
86 84 62 76 78 92 82 74 88
Construct a dotplot of these data.
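A dotplot for these grades can be sketched in plain text by placing the distinct grades along a vertical scale and printing one dot per occurrence. This is only an illustration with made-up variable names. Unlike a true dotplot, the rows here are not spaced in proportion to the grade values.

```python
from collections import Counter

grades = [76, 74, 82, 96, 66, 76, 78, 72, 52, 68,
          86, 84, 62, 76, 78, 92, 82, 74, 88]

counts = Counter(grades)  # frequency of each distinct grade

# One row per distinct grade; the number of dots is its frequency.
rows = [f"{g:>3} | {'.' * counts[g]}" for g in sorted(counts)]
print("\n".join(rows))
```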