Module 1 - Descriptive Statistics


Descriptive Statistics

Prepared by:
Ezrha C. Godilano
BSIE, MSIE, CIE
Learning Objectives
At the end of this exercise, students should be:

• Able to analyze data based on statistical
descriptions

• Familiar with the concepts and uses of Data
Analysis
Assumptions
Students are familiar with:

• Statistical terms

• Basic features of Microsoft Excel


Statistics
• Defined as the science that deals with the
collection, tabulation, analysis, interpretation,
and presentation of quantitative data.

• Especially useful in drawing general
conclusions about a set of data from a sample
of the data
Descriptive Statistics
• Used to describe and analyze a subject or group.

• Descriptions of statistical data can be quite brief
or elaborate.

• Provide simple summaries about the sample and
the measures.

• Simply describe what is or what the data shows.
2 kinds of descriptive measures
• Measures of location
– Consider those measures which somehow describe the
center or middle of a set of data.
– Arithmetic mean, median, mode

• Measures of variation
– Measure the extent to which data are dispersed or spread
out.
– Range, variance, standard deviation are used to measure
the variability of a set of data.
Terminologies
Measures of Location
• Arithmetic mean
– referred to as simply the mean or average
– total sum divided by number of values

• Median
– middle value that separates the greater and lesser halves of a data set

• Mode
– is the value or element that occurs the most often in a data set or a
probability distribution
Comparison
Type              Example                  Result
Arithmetic mean   (1+2+2+3+4+7+9) / 7      4
Median            1, 2, 2, 3, 4, 7, 9      3
Mode              1, 2, 2, 3, 4, 7, 9      2
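The three results in the comparison above can be cross-checked outside Excel. The following is a minimal sketch that assumes Python's standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Data set from the comparison table
data = [1, 2, 2, 3, 4, 7, 9]

mean = statistics.mean(data)      # total sum divided by the number of values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # value that occurs most often

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

The printed values should agree with the Result column: 4, 3, and 2.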
Terminologies
Measures of Variation
• Range
◦ is the length of the smallest interval which contains all the data
◦ calculated by subtracting the smallest observation (sample minimum) from the
greatest (sample maximum) and provides an indication of statistical dispersion.

• Variance
◦ describes how far values lie from the mean

• Standard Deviation
◦ the std. deviation of a data set, statistical population or a probability
distribution is the square root of its variance
◦ shows how much variation there is from the “average”
◦ Low std deviation - indicates that the data points tend to be very close to the
mean
◦ High std deviation - indicates that the data is spread out over a large range of
values
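To illustrate how a low and a high standard deviation differ, the sketch below compares two small data sets that share the same mean. Both data sets are hypothetical and chosen only for illustration, the helper name `describe_spread` is an assumption, and the sample (n - 1) form of the variance is assumed.

```python
import statistics

# Hypothetical data sets (not from the slides): same mean of 10, different spread
tight = [9, 10, 10, 11]
wide = [1, 5, 15, 19]

def describe_spread(values):
    """Return (range, sample variance, sample standard deviation)."""
    data_range = max(values) - min(values)   # sample maximum minus sample minimum
    variance = statistics.variance(values)   # sample variance (n - 1 divisor)
    std_dev = statistics.stdev(values)       # square root of the sample variance
    return data_range, variance, std_dev

tight_spread = describe_spread(tight)
wide_spread = describe_spread(wide)

print("tight:", tight_spread)
print("wide: ", wide_spread)
```

Although both sets average 10, the second set has the larger range, variance, and standard deviation, which is what "spread out over a large range of values" means in practice.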
Inductive statistics
• Used to determine from a limited amount of
data (sample) an important conclusion about
a much larger amount of data (population).

• Conclusions or inferences cannot be stated
with absolute certainty

• Language of probability is often used


Note:
• When doing statistical descriptions, it is
necessary to define the type of data.

• A set of data can be classified as a population
or a sample.
Note:
• Population - a set of data that consists of all
conceivably possible (or hypothetically
possible) observations of a certain
phenomenon.

• Sample - a set of data that contains only a
part of these observations
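The population/sample distinction matters in practice because the variance and standard deviation are computed with different divisors: n for a population and n - 1 for a sample. A minimal sketch, assuming Python's standard `statistics` module and reusing the data set 1, 2, 2, 3, 4, 7, 9 from the comparison of mean, median, and mode:

```python
import statistics

data = [1, 2, 2, 3, 4, 7, 9]

# Treat the data as the entire population: divide by n
pop_var = statistics.pvariance(data)
pop_std = statistics.pstdev(data)

# Treat the data as a sample drawn from a larger population: divide by n - 1
samp_var = statistics.variance(data)
samp_std = statistics.stdev(data)

print("population variance / std dev:", pop_var, pop_std)
print("sample variance / std dev:    ", samp_var, samp_std)
```

In recent versions of Excel, the corresponding worksheet functions are VAR.P and STDEV.P for a population, and VAR.S and STDEV.S for a sample.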
Numerical example
• Cut-out syrup density in canned fruits
– Is the percentage or degree by weight of sugar in the syrup
solution when a can is opened

– e.g., examination of samples of 8 cans each of three grades
of dried prunes yielded the following results:
– Grade A: 44, 46, 43, 43, 46, 41, 44, 45
– Grade B: 35, 38, 36, 41, 41, 39, 39, 41
– Grade C: 34, 32, 29, 29, 31, 35, 31, 31
Required
• Calculate the mean, median, and (if it exists)
the mode of each of the three samples.

• Determine other statistical descriptions of
the data using Excel.
Solution
1. Open a new worksheet in Excel
2. Enter data in columns.
3. Click Tools (or the Data menu, depending on
the Excel version) and select Data Analysis.
4. Highlight Descriptive Statistics, press OK.
5. Final solution shown.
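As a cross-check on the Excel output, the same measures can be computed for the three grades in a few lines of code. This is only a sketch, assuming Python 3.8 or later and its standard `statistics` module; the dictionary keys are illustrative names, and the standard deviation is the sample (n - 1) version.

```python
import statistics

# Cut-out syrup density data from the numerical example
grades = {
    "Grade A": [44, 46, 43, 43, 46, 41, 44, 45],
    "Grade B": [35, 38, 36, 41, 41, 39, 39, 41],
    "Grade C": [34, 32, 29, 29, 31, 35, 31, 31],
}

summary = {}
for name, values in grades.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # multimode lists every value tied for most frequent
        "modes": statistics.multimode(values),
        "std_dev": statistics.stdev(values),   # sample standard deviation
        "range": max(values) - min(values),
        "count": len(values),
    }

for name, stats in summary.items():
    print(name, stats)
```

Listing every tied value is a deliberate choice here: in Grade A, three values (43, 44, and 46) each occur twice, so the sample has no single mode.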
Exercise 1
Descriptive Statistics
Instructions:
Save all your Excel sheets in ONE Excel file. Label the
sheets as Problem 1-A, Problem 1-B, etc.
The file name should be in the format: LAST NAME_Exer1
Problem 1-A
• The state police, using radar, checked the
speeds (in mph) of 30 passing motorists at a
checkpoint. The results are listed below:

– Day 1: 44 38 41 50 36 36 43 42 49 48
– Day 2: 35 40 37 41 43 50 45 45 39 38
– Day 3: 50 41 47 36 35 40 42 43 48 33
Required
1. Find the Mean, Median, Mode and Standard
Deviation of:

a) Days 1 to 3 combined

b) Days 1, 2, and 3 individually


Problem 1-B
• A production supervisor of XYZ company is evaluating
three alternative procedures for processing a job
order. The delay times (in seconds) in processing the
job order under the three alternatives for the 10 trials
are as follows:

• Alternative 1: 400, 175, 480, 640, 135, 700, 155, 220,
200, 136
• Alternative 2: 178, 1300, 146, 410, 138, 210, 145,
1420, 180, 184
• Alternative 3: 720, 500, 175, 165, 130, 168, 800, 618,
188, 164
Required
• Determine the mean, range, and standard
deviation of each of these alternatives.

Results and Interpretation
                Mean    Range    Std. Deviation
Alternative 1
Alternative 2
Alternative 3
Answer the following questions
1. Which alternative has the greatest variability?

2. Which alternative would you recommend to
process the job order? Why?

3. Subtract 10 seconds from each data value.
Recalculate the mean, range, and standard
deviation, and compare the results with those
obtained in the preceding problem.
4. Results after subtracting 10 seconds
                Mean    Range    Std. Deviation
Alternative 1
Alternative 2
Alternative 3

5. Conclusion
Problem 1-C
• The following sample data concerns the
characteristics of drivers under 26 years old
who got into automobile accidents for which
there was property damage and the driver
had an illegal blood alcohol level. Other costs and
relevant characteristics associated with these
accidents have been included in this data set
collected by Sloppy Research, Inc.
Required
1. Arrange the data by gender and determine the descriptive
statistics for each gender using a chart for each and
not functions. Analyze the results.
2. Use Excel functions to calculate the following
statistics for the property damage variable of this
population.
a) Count
b) Largest(1)
c) Smallest(1)
d) Confidence Level(95.0%)
Problem 1-D
• Twenty-five randomly selected students were asked
the number of movies they watched the previous
week. The results are as follows:

# of movies   Frequency
0             5
1             9
2             6
3             4
4             1

• Find the sample mean, median, and standard deviation,
and analyze the results.
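Because the data are given as a frequency table, each value has to be counted as many times as its frequency before the statistics are computed. One way to approach this, sketched here with Python's standard `statistics` module and illustrative variable names, is to expand the table into one observation per student:

```python
import statistics

# Frequency table from Problem 1-D: number of movies -> number of students
frequency = {0: 5, 1: 9, 2: 6, 3: 4, 4: 1}

# Expand the table into one observation per student
observations = [
    movies
    for movies, count in frequency.items()
    for _ in range(count)
]

sample_mean = statistics.mean(observations)
sample_median = statistics.median(observations)
sample_std = statistics.stdev(observations)  # sample (n - 1) standard deviation

print("n =", len(observations))
print("mean =", sample_mean)
print("median =", sample_median)
print("std dev =", sample_std)
```

In Excel, the same expansion can be done by listing each value in a column as many times as its frequency and then running the Descriptive Statistics tool on that column.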
Problem 1-E
• Following are the published weights (in pounds)
of all of the team members of the San Francisco
49ers from a previous year (Source: San Jose
Mercury News)

 177; 205; 210; 210; 232; 205; 185; 185; 178; 210;
206; 212; 184; 174; 185; 242; 188; 212; 215; 247;
241; 223; 220; 260; 245; 259; 278; 270; 280; 295;
275; 285; 290; 272; 273; 280; 285; 286; 200; 215;
185; 230; 250; 241; 190; 260; 250; 302; 265; 290;
276; 228; 265
Required
1. Organize the data from smallest to largest value.
2. Find the mean.
3. Find the median.
4. Find the mode.
5. If our population were all professional football
players, would the above data be a sample of
weights or the population of weights? Why?
6. If our population were the San Francisco 49ers,
would the above data be a sample of weights or the
population of weights? Why?
References
• http://www.mathcracker.com/solved_statistics_problems.php
• Esmeria, G.J., IE Computer Applications I. 2001
Revised Edition.