Module 1 - Descriptive Statistics
Statistics
Prepared by:
Ezrha C. Godilano
BSIE, MSIE, CIE
Learning Objectives
At the end of this exercise, students should be familiar with:
• Statistical terms
• Measures of variation
– Measure the extent to which data are dispersed or spread out.
– Range, variance, and standard deviation are used to measure
the variability of a set of data.
Terminologies
Measures of Location
• Arithmetic mean
– also referred to simply as the mean or average
– sum of all values divided by the number of values
• Median
– middle value that separates the greater and lesser halves of a sorted data set
• Mode
– the value or element that occurs most often in a data set or a
probability distribution
Comparison
Type              Example                  Result
Arithmetic mean   (1+2+2+3+4+7+9) / 7      4
Median            1, 2, 2, 3, 4, 7, 9      3
Mode              1, 2, 2, 3, 4, 7, 9      2
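The three results in the comparison can be checked with a short script. This is a minimal sketch using Python's standard statistics module; the variable names are illustrative and are not part of the original material.

```python
import statistics

# Data set from the comparison table
data = [1, 2, 2, 3, 4, 7, 9]

mean_value = statistics.mean(data)      # sum of values / number of values
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

print("Mean:  ", mean_value)
print("Median:", median_value)
print("Mode:  ", mode_value)
```

The mean is pulled upward by the larger values 7 and 9, while the median and mode are not, which is why the three measures differ for the same data.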
Terminologies
Measures of Variation
Range
◦ the length of the smallest interval that contains all the data
◦ calculated by subtracting the smallest observation (sample minimum) from the
largest (sample maximum); provides a simple indication of statistical dispersion
Variance
◦ describes how far values lie from the mean: the average of the squared
deviations from the mean
Standard Deviation
◦ the standard deviation of a data set, statistical population, or probability
distribution is the square root of its variance
◦ shows how much variation there is from the “average” (the mean)
◦ Low standard deviation: the data points tend to be very close to the
mean
◦ High standard deviation: the data are spread out over a large range of
values
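The measures of variation can be illustrated on the data set 1, 2, 2, 3, 4, 7, 9 from the comparison of mean, median, and mode. The slides do not say whether a data set should be treated as a population or a sample, so this sketch computes both versions; the variable names are assumptions made for illustration.

```python
import statistics

data = [1, 2, 2, 3, 4, 7, 9]

# Range: largest observation minus smallest observation
data_range = max(data) - min(data)

# Population variance divides the sum of squared deviations by n
pop_variance = statistics.pvariance(data)
# Sample variance divides the sum of squared deviations by n - 1
sample_variance = statistics.variance(data)

# Standard deviation is the square root of the corresponding variance
pop_std = statistics.pstdev(data)
sample_std = statistics.stdev(data)

print("Range:              ", data_range)
print("Population variance:", pop_variance)
print("Sample variance:    ", sample_variance)
print("Population std dev: ", pop_std)
print("Sample std dev:     ", sample_std)
```

The sample versions are always slightly larger than the population versions because they divide by n - 1 instead of n.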
Inductive statistics
• Used to draw conclusions about a much larger
amount of data (population) from a limited
amount of data (sample).
– Day 1: 44 38 41 50 36 36 43 42 49 48
– Day 2: 35 40 37 41 43 50 45 45 39 38
– Day 3: 50 41 47 36 35 40 42 43 48 33
Required
1. Find the Mean, Median, Mode and Standard
Deviation of:
a) Day 1 to 3
• Mean
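One possible way to check hand or spreadsheet results for the three days is sketched below in Python. Two assumptions are made here: "Day 1 to 3" is read as both each day separately and all three days combined, and the sample standard deviation (dividing by n - 1) is used because the data are treated as a sample. The helper name describe is hypothetical.

```python
import statistics

days = {
    "Day 1": [44, 38, 41, 50, 36, 36, 43, 42, 49, 48],
    "Day 2": [35, 40, 37, 41, 43, 50, 45, 45, 39, 38],
    "Day 3": [50, 41, 47, 36, 35, 40, 42, 43, 48, 33],
}

def describe(values):
    """Return mean, median, mode(s), and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # multimode lists every value tied for most frequent
        "modes": statistics.multimode(values),
        "stdev": statistics.stdev(values),  # sample version (n - 1)
    }

summary = {name: describe(values) for name, values in days.items()}

# Combined statistics for all three days taken together
all_days = [v for values in days.values() for v in values]
summary["Day 1 to 3"] = describe(all_days)

for name, stats in summary.items():
    print(name, stats)
```

When no value repeats, as on Day 3, multimode returns every value, which signals that the data set has no single mode.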
Results and Interpretation
                 Range    Std. Deviation
Alternative 1
Alternative 2
Alternative 3
Answer the following questions
1. Which alternative has the greatest variability?
5. Conclusion
Problem 1-C
• The following sample data concern the
characteristics of drivers under 26 years old
who were involved in automobile accidents that
caused property damage and in which the driver
had an illegal blood alcohol level. Other costs and
relevant characteristics associated with these
accidents are included in this data set,
collected by Sloppy Research, Inc.
Required
Use Excel functions to calculate:
1. Arrange by gender and determine the descriptive
statistics for each gender using a chart for each and
not functions. Analyze the results.
2. Use Excel functions to calculate the following
statistics for the property damage variable of this
population.
a) Count
b) Largest(1)
c) Smallest(1)
d) Confidence Level(95.0%)
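For readers who want to cross-check the spreadsheet output, the four statistics listed above can be sketched in Python. The property damage values below are HYPOTHETICAL placeholders, since the data set itself is not reproduced here. The confidence figure is computed as a normal-approximation margin of error for the mean, in the spirit of Excel's CONFIDENCE.NORM; this is an assumption, and a t-based margin would be somewhat wider for a small sample.

```python
import math
import statistics

# HYPOTHETICAL property damage amounts, for illustration only
property_damage = [1200.0, 850.0, 2300.0, 1500.0, 990.0, 1750.0]

count = len(property_damage)     # a) Count
largest = max(property_damage)   # b) Largest(1)
smallest = min(property_damage)  # c) Smallest(1)

# d) Confidence Level (95.0%): margin of error around the sample mean,
#    using the normal critical value (assumption; not a t-based margin)
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * statistics.stdev(property_damage) / math.sqrt(count)

mean_damage = statistics.mean(property_damage)
print("Count:   ", count)
print("Largest: ", largest)
print("Smallest:", smallest)
print("95% interval for the mean:", mean_damage - margin, "to", mean_damage + margin)
```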
Problem 1-D
Twenty-five randomly selected students were asked
the number of movies they watched the previous
week. The results are as follows:
Number of movies   Frequency
0                  5
1                  9
2                  6
3                  4
4                  1
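The slides do not list which statistics are required for this frequency table, so the sketch below assumes the usual measures of location. It shows how a mean is computed directly from frequencies and how the table can be expanded into raw observations for the median; all variable names are illustrative.

```python
import statistics

# Number of movies watched -> number of students (from the table)
frequency = {0: 5, 1: 9, 2: 6, 3: 4, 4: 1}

n = sum(frequency.values())  # total number of students

# Mean from a frequency table: sum of (value * frequency) / n
weighted_mean = sum(value * f for value, f in frequency.items()) / n

# Expand the table into one entry per student to find the median
raw = [value for value, f in frequency.items() for _ in range(f)]
median_value = statistics.median(raw)

# Mode: the value with the highest frequency
mode_value = max(frequency, key=frequency.get)

print("n:     ", n)
print("Mean:  ", weighted_mean)
print("Median:", median_value)
print("Mode:  ", mode_value)
```

The frequencies sum to 25, which matches the twenty-five students in the problem statement.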
177; 205; 210; 210; 232; 205; 185; 185; 178; 210;
206; 212; 184; 174; 185; 242; 188; 212; 215; 247;
241; 223; 220; 260; 245; 259; 278; 270; 280; 295;
275; 285; 290; 272; 273; 280; 285; 286; 200; 215;
185; 230; 250; 241; 190; 260; 250; 302; 265; 290;
276; 228; 265
Required
1. Organize the data from smallest to largest value.
2. Find the mean.
3. Find the median.
4. Find the mode.
5. If our population were all professional football
players, would the above data be a sample of
weights or the population of weights? Why?
6. If our population were the San Francisco 49ers,
would the above data be a sample of weights or the
population of weights? Why?
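Steps 1 to 4 can be checked with a short script. This is a minimal sketch that assumes the listed values are the complete data set, typed exactly as shown; the variable names are illustrative.

```python
import statistics

weights = [
    177, 205, 210, 210, 232, 205, 185, 185, 178, 210,
    206, 212, 184, 174, 185, 242, 188, 212, 215, 247,
    241, 223, 220, 260, 245, 259, 278, 270, 280, 295,
    275, 285, 290, 272, 273, 280, 285, 286, 200, 215,
    185, 230, 250, 241, 190, 260, 250, 302, 265, 290,
    276, 228, 265,
]

# 1. Organize the data from smallest to largest value
ordered = sorted(weights)

# 2. Mean: sum of all weights divided by the number of weights
mean_weight = statistics.mean(weights)

# 3. Median: middle value of the ordered data
median_weight = statistics.median(weights)

# 4. Mode(s): the most frequently occurring weight(s)
modes = statistics.multimode(weights)

print("Ordered:", ordered)
print("Mean:   ", mean_weight)
print("Median: ", median_weight)
print("Mode(s):", modes)
```

Because the list holds an odd number of values, the median is a value that actually appears in the data rather than an average of two middle values.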
References
• http://www.mathcracker.com/solved_statistics_problems.php
• Esmeria, G.J., IE Computer Applications I. 2001
Revised Edition.