Unit 3


Lesson 3: Descriptive Statistics
Objectives:
By the end of the lesson, you will be able to:
• Apply various measures of central tendency, including the mean,
median, and mode, to a set of ungrouped data.
• Apply various measures of variability, including the range,
interquartile range, mean absolute deviation, variance, and standard
deviation, to a set of ungrouped data.
• Describe a data distribution statistically and graphically using
skewness, kurtosis, and box-and-whisker plots.
Topics:
• Lesson 3.1: Measures of Central Tendency: Ungrouped Data
• Lesson 3.2: Measures of Variability: Ungrouped Data
• Lesson 3.3: Measures of Shape
Measures of Central Tendency: Ungrouped Data
• Measures of central tendency yield information about the
center, or middle part, of a group of numbers.
• Measures of central tendency do not focus on the span of
the data set or how far values are from the middle numbers.
Mode
• The mode is the most frequently occurring value in a set of
data.
• Organizing the data into an ordered array (an ordering of the
numbers from smallest to largest) helps to locate the mode.
Example:
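As an illustration, the mode can be located in Python by counting how often each value occurs. This is a minimal sketch: the data values and the helper name `modes` are made up for this example and do not come from the slide. The helper returns a list because a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 4, 9, 2]

# The ordered array makes repeated values easy to spot by eye.
print(sorted(data))
print(modes(data))
```

Here 4 occurs three times, more often than any other value, so it is the mode.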
Median
• The median is the middle value in an ordered array of
numbers.
• For an array with an odd number of terms, the median is the
middle number.
• For an array with an even number of terms, the median is
the average of the two middle numbers.
Steps for getting the median
• STEP 1. Arrange the observations in an ordered data array.
• STEP 2. For an odd number of terms, find the middle term of
the ordered array. It is the median.
• STEP 3. For an even number of terms, find the average of the
middle two terms. This average is the median.
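The three steps above can be sketched as a short Python function. The function name `median` and the two small test arrays are assumptions made for this illustration.

```python
def median(values):
    """Median of ungrouped data, following the three steps."""
    if not values:
        raise ValueError("median of empty data is undefined")
    ordered = sorted(values)              # STEP 1: ordered array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]               # STEP 2: odd count, middle term
    # STEP 3: even count, average of the two middle terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))       # odd number of terms
print(median([8, 2, 4, 6]))    # even number of terms
```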
Median
• Suppose a business researcher wants to determine the
median for the following numbers.
15 11 14 3 21 17 22 16 19 16 5 7 19 8 9 20 4
• The researcher arranges the numbers in an ordered array.
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
• Because the array contains 17 terms (an odd number), the
median is the middle (9th) term, 15.
• The median is unaffected by the magnitude of extreme
values. This characteristic is an advantage, because large and
small values do not inordinately influence the median.
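A quick way to see this is to replace the largest of the 17 numbers from the median example with a much larger value and compare the median and the mean before and after. The replacement value 2200 is a hypothetical extreme chosen for this sketch.

```python
import statistics

# The 17 numbers from the median example.
data = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]

# Hypothetical extreme value: swap the largest number, 22, for 2200.
with_extreme = [2200 if x == 22 else x for x in data]

median_before = statistics.median(data)
median_after = statistics.median(with_extreme)
mean_before = statistics.mean(data)
mean_after = statistics.mean(with_extreme)

print(median_before, median_after)   # the median stays at 15
print(mean_before, mean_after)       # the mean is pulled upward
```

The middle position of the ordered array is unchanged, so the median does not move, while the mean rises sharply.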
Mean
• The arithmetic mean is the average of a group of numbers and
is computed by summing all the numbers and dividing by the
number of numbers.
• The population mean is represented by the Greek letter mu
(μ).
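• The sample mean is represented by $\bar{x}$.
• The formulas for computing the population mean and the sample
mean are:
Population mean: $\mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + \dots + x_N}{N}$
Sample mean: $\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$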
• The capital Greek letter sigma (Σ) is commonly used in
mathematics to represent a summation of all the numbers in
a grouping.
• N is the number of terms in the population, and n is the
number of terms in the sample.
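In code, the summation and the division map directly onto `sum` and `len`. The sketch below reuses the 17 numbers from the median example; the function name `mean` is an assumption for this illustration. The arithmetic is the same whether the values are treated as a population (N terms) or a sample (n terms).

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of empty data is undefined")
    return sum(values) / len(values)


# The 17 numbers from the median example.
data = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]

total = sum(data)        # the summation, 226
count = len(data)        # the number of terms, 17
print(total, count, mean(data))
```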
Example
Solution:
• Mode: 9,000
• Median: With 13 different companies in this group, N=13.
The median is located at the (13+1)/2 = 7th position. Because
the data are already ordered, the 7th term is 20,000, which is
the median.
• Mean: The total number of cars in service is $\sum x = 1{,}791{,}000$, so
$\mu = \frac{\sum x}{N} = \frac{1{,}791{,}000}{13} \approx 137{,}769.23$.
Lesson 3.2
Measures of Variability
• Range
• Variance
• Standard Deviation
Range
• The range is the difference between the largest value of a
data set and the smallest value of the set.
• It is a crude measure of variability, describing the distance to
the outer bounds of the data set. It reflects those extreme
values because it is constructed from them.
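A minimal sketch of the range in Python, reusing the 17 numbers from the median example in Lesson 3.1. The function name `data_range` is an assumption for this illustration.

```python
def data_range(values):
    """Range: largest value minus smallest value."""
    if not values:
        raise ValueError("range of empty data is undefined")
    return max(values) - min(values)


# The 17 numbers from the median example.
data = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]

print(data_range(data))   # 22 - 3 = 19
```

Only the two extreme values enter the calculation, which is why the range says nothing about how the remaining values are spread.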
Variance
• The variance is the average of the squared
deviations about the arithmetic mean for a set of
numbers. The population variance is denoted by $\sigma^2$.
Population Variance
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
The table on the next slide shows the original production numbers for
the computer company, the deviations from the mean, and the
squared deviations from the mean.
• The sum of the squared deviations about the mean of a set of
values, called the sum of squares of x and sometimes abbreviated
as SSx, is used throughout statistics. For the computer company,
this value is 130. Dividing it by the number of data values
(5 weeks) yields the variance for computer production:
130 / 5 = 26.
• Because the variance is computed from squared deviations,
the final result is expressed in terms of squared units of
measurement. Statistics measured in squared units are
problematic to interpret. Therefore, when used as a
descriptive measure, variance can be considered as an
intermediate calculation in the process of obtaining the
standard deviation.
Standard Deviation
• The standard deviation is a popular measure of variability. It
is used both as a separate entity and as a part of other
analyses, such as computing confidence intervals and testing
hypotheses.
Population Standard Deviation
The standard deviation is the square root of the variance. The
population standard deviation is denoted by $\sigma$.
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
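The population variance and standard deviation can be sketched in Python as follows. The five weekly production values are assumed for this illustration; they were chosen so that the sum of squared deviations equals 130 over 5 weeks, as stated for the computer company. The function names are also assumptions.

```python
import math


def sum_of_squares(values):
    """Sum of squared deviations about the mean (SSx)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values)


def population_variance(values):
    """Average squared deviation: SSx divided by N."""
    return sum_of_squares(values) / len(values)


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Assumed weekly production figures (illustrative only).
production = [5, 9, 16, 17, 18]

print(sum_of_squares(production))        # 130.0
print(population_variance(production))   # 26.0
print(population_std_dev(production))    # about 5.1
```

The variance is in squared units, while the standard deviation is back in the original units of production.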
Sample Variance and Standard Deviation
• The sample variance is denoted by $s^2$ and the sample standard
deviation by s. The main use for sample variances and
standard deviations is as estimators of population variances
and standard deviations. Because of this, computation of the
sample variance and standard deviation differs slightly from
computation of the population variance and standard
deviation.
• Both the sample variance and sample standard deviation use
$n - 1$ in the denominator instead of n because using n in the
denominator of a sample variance results in a statistic that
tends to underestimate the population variance.
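Sample Variance:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Sample Standard Deviation:
$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$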
Example
• Shown here is a sample of six of the largest accounting firms in
the United States and the number of partners associated with each
firm, as reported by the Public Accounting Report.
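A sample variance and standard deviation for six observations can be sketched as below. The six values are hypothetical and are not the partner counts from the Public Accounting Report; the function names are assumptions for this illustration. The standard library's `statistics.variance` also divides by n - 1 and is used here as a cross-check.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical sample of six values (illustrative only).
sample = [10, 20, 30, 40, 50, 60]

print(sample_variance(sample))          # 1750 / 5 = 350.0
print(sample_std_dev(sample))
print(statistics.variance(sample))      # library cross-check
print(statistics.pvariance(sample))     # dividing by n gives a smaller value
```

Dividing by n instead of n - 1 gives the smaller figure in the last line, which is the underestimate the sample formula is meant to avoid.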
Lesson 3.3:
Measures of Shape
Measures of shape are tools that can be used to describe the
shape of a distribution of data. In this section, we examine
two measures of shape, skewness and kurtosis. We also look
at box-and-whisker plots.
Skewness
A distribution of data in which the right half is a mirror image
of the left half is said to be symmetrical. One example of a
symmetrical distribution is the normal distribution, or bell
curve, shown in Figure 3.8.
Skewness
Skewness describes a distribution that is asymmetrical, that is, one
that lacks symmetry. The distribution in Figure 3.8 has no skewness
because it is symmetric. Figure 3.9 shows a distribution that is
skewed left, or negatively skewed, and Figure 3.10 shows a
distribution that is skewed right, or positively skewed.
• The skewed portion is the long, thin part of the curve.
• Many researchers use the term “skewed distribution” to denote that the
data are sparse at one end of the distribution and piled up at
the other end.
• Instructors sometimes refer to a grade distribution as
skewed, meaning that few students scored at one end of the
grading scale, and many students scored at the other end.
Skewness and the relationship of the Mean,
Median and the Mode
• The concept of skewness helps to understand the relationship of the mean,
median, and mode. In a unimodal distribution (distribution with a single peak or
mode) that is skewed, the mode is the apex (high point) of the curve and the
median is the middle value. The mean tends to be located toward the tail of the
distribution, because the mean is particularly affected by the extreme values.
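The ordering of the three measures can be checked on a small data set. The values below are hypothetical: `right_skewed` has a long tail of large values, and `left_skewed` is its mirror image.

```python
import statistics

# Hypothetical data with a long right tail (positively skewed).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror image: long left tail (negatively skewed).
left_skewed = [20 - x for x in right_skewed]

for name, data in (("right", right_skewed), ("left", left_skewed)):
    print(
        name,
        "mode =", statistics.mode(data),
        "median =", statistics.median(data),
        "mean =", statistics.mean(data),
    )
```

For the right-skewed data the order is mode, median, mean (2, 3, 5); for the left-skewed data it is reversed (18, 17, 15). In both cases the mean sits closest to the tail.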
• A bell-shaped or normal distribution with the mean, median, and
mode all at the center of the distribution has no skewness. Figure
3.11 displays the relationship of the mean, median, and mode for
different types of skewness.
Formula for Skewness:

If the value of Sk is negative, the distribution is negatively skewed. If the
value of Sk is positive, the distribution is positively skewed. The greater
the magnitude of Sk, the more skewed is the distribution.
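One widely used coefficient with this sign behavior is the Pearsonian coefficient of skewness, $S_k = \frac{3(\mu - M_d)}{\sigma}$, where $M_d$ is the median. Treating this as the formula intended here is an assumption. The sketch below computes it with the population standard deviation; the function name and the data sets are hypothetical.

```python
import statistics


def pearson_skewness(values):
    """Pearsonian coefficient: 3 * (mean - median) / standard deviation.

    Assumption: population standard deviation (divide by N) is used.
    """
    mu = statistics.fmean(values)
    md = statistics.median(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mu - md) / sigma


# Hypothetical data sets (illustrative only).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
left_skewed = [20 - x for x in right_skewed]
symmetric = [1, 2, 3, 4, 5]

print(pearson_skewness(right_skewed))   # positive
print(pearson_skewness(left_skewed))    # negative
print(pearson_skewness(symmetric))      # zero
```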
Kurtosis
Kurtosis describes the amount of peakedness of a distribution.
Distributions that are high and thin are referred to as leptokurtic
distributions.
Distributions that are flat and spread out are referred to as platykurtic
distributions.
Between these two types are distributions that are more “normal” in
shape, referred to as mesokurtic distributions. These three types of
kurtosis are illustrated in Figure 3.12.
Formula:

S = standard deviation
n = total number of observations
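As a rough illustration, one common moment-based measure of kurtosis divides the average fourth-power deviation by the fourth power of the standard deviation. Whether this matches the formula above is an assumption, as is the choice to compute S with n in the denominator. Under this definition a normal distribution scores about 3, flatter distributions score lower, and more peaked, heavier-tailed distributions score higher. The function name and data are hypothetical.

```python
def kurtosis(values):
    """Moment-based kurtosis: mean 4th-power deviation divided by S**4.

    Assumption: S is computed with n in the denominator.
    """
    n = len(values)
    if n == 0:
        raise ValueError("kurtosis of empty data is undefined")
    x_bar = sum(values) / n
    m2 = sum((x - x_bar) ** 2 for x in values) / n   # S squared
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    m4 = sum((x - x_bar) ** 4 for x in values) / n
    return m4 / m2 ** 2


# Hypothetical data sets (illustrative only).
flat = list(range(1, 10))                      # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]        # tall center, thin tails

print(kurtosis(flat))     # below 3: platykurtic
print(kurtosis(peaked))   # above 3: leptokurtic
```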
