C22 P04 Statistical Averages


Statistics:

Is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data.
CENTRAL VALUE (AVERAGE):
It is the value around which all other values are dispersed or distributed.

USES:
1. To find whether the normal values lie close to it, or whether values that are too small or too large are present at both ends.
2. To find which group is better off by comparing the groups.
   e.g.: Incubation period of cholera < typhoid
AVERAGES
Arithmetic mean:
It is obtained by summing all the observations and dividing by the number of observations.
It is denoted by X̄ (X-bar).
Mean (Arithmetic Mean)
Mean (arithmetic mean) of data values

Sample mean (n = sample size):
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$

Population mean (N = population size):
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Mean (Arithmetic Mean)
• The most common measure of central tendency
• Acts as a 'balance point'
• Affected by extreme values (outliers)
[Figure: two number lines, one marked 0 to 10 and one marked 0 to 14, illustrating the effect of an extreme value on the mean]
ADVANTAGES:
1. There is only one mean for a data set.
2. Uses all the information in the data.
3. Easy to manipulate mathematically.
4. Easy to understand.

DISADVANTAGES:
1. Influenced by extreme values.
   (e.g. if Bill Gates moved into a neighborhood, the average family income there would rise dramatically beyond what it was previously)
2. May sometimes give a value that cannot occur in practice.
   (e.g. average number of children = 4.76)
TWO TYPES:
1. UNGROUPED SERIES.
2. GROUPED SERIES.
UNGROUPED SERIES
Used when the number of observations is small.

Two methods:
1. Direct.
2. Indirect.

The choice depends on the size of the observations.

DIRECT:
Mean = (summation of observations) / (no. of observations)
E.g.: The tuberculin test reactions of 10 boys are given in ascending order; find the mean size of reaction.
3, 5, 7, 7, 8, 8, 9, 10, 11, 12
Ans: $\bar{X} = \frac{3+5+7+7+8+8+9+10+11+12}{10} = \frac{80}{10} = 8$
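The direct method can be sketched in Python as below. This is a minimal illustration rather than part of the original slides; the function name `direct_mean` is chosen here only for the example.

```python
def direct_mean(observations):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not observations:
        raise ValueError("at least one observation is required")
    return sum(observations) / len(observations)


# Tuberculin reaction sizes of the 10 boys from the example above
reactions = [3, 5, 7, 7, 8, 8, 9, 10, 11, 12]
print(direct_mean(reactions))  # 8.0
```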
INDIRECT:
Used when the size of the observations is large.
An arbitrary (assumed) mean or working origin (w) is chosen, and the mean of the deviations from it is computed:
$\bar{x} = \frac{\sum (X - w)}{n}$
$\bar{X} = w + \bar{x}$

e.g.: Mean incubation period of 9 polio cases, taking w = 20:

Case    X      X - w      x
1       23     23 - 20    +3
2       22     22 - 20    +2
3       20     20 - 20     0
4       24     24 - 20    +4
5       16     16 - 20    -4
6       17     17 - 20    -3
7       18     18 - 20    -2
8       19     19 - 20    -1
9       21     21 - 20    +1
Total   180               +10 - 10 = 0

By direct method:
$\bar{X} = \frac{\sum X}{n} = \frac{180}{9} = 20$

By assumed mean:
$\bar{x} = \frac{\sum (X - w)}{n} = \frac{10 - 10}{9} = 0$
So, $\bar{X} = w + \bar{x} = 20 + 0 = 20$.
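The assumed-mean calculation can be sketched as follows. This is an illustrative sketch, not the slides' own code; the function name `assumed_mean` is hypothetical. It also shows that a different working origin (18 is an arbitrary choice made here) gives the same answer.

```python
def assumed_mean(observations, w):
    """Mean via a working origin w: w + sum(X - w) / n."""
    deviations = [x - w for x in observations]
    mean_of_deviations = sum(deviations) / len(observations)
    return w + mean_of_deviations


# Incubation periods of the 9 polio cases from the example above
incubation = [23, 22, 20, 24, 16, 17, 18, 19, 21]
print(assumed_mean(incubation, 20))  # 20.0
print(assumed_mean(incubation, 18))  # 20.0 (the choice of w does not change the result)
```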
GROUPED SERIES
Used when the number of observations is large.
Procedure:
1. Arrange the data in groups and prepare a frequency distribution table first.
2. Find the mid value of each group separately.
3. Multiply the mid value of each group by its frequency.
4. Add up these products and divide by the total number of observations.
This is the weighted mean, grand mean, or mean of means.
Grouped Data Arithmetic Mean
Example:
To find the arithmetic mean of 1, 2, 3, 1, 2, 3, 2.

The arithmetic mean = (1+2+3+1+2+3+2)/7 = 14/7 = 2

In this case there are two 1's, three 2's and two 3's. The number of times each number occurs is called its frequency. This can be shown clearly in a table as below.

X value   Frequency (f)   fX
1         2               1*2 = 2
2         3               2*3 = 6
3         2               3*2 = 6

Step 1: Find Σf.
Σf = 7

Step 2: Now, find ΣfX.
ΣfX = (1*2) + (2*3) + (3*2) = 14

Step 3: Now, substitute in the formula.
Arithmetic mean = ΣfX/Σf = 14/7 = 2
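A short Python sketch of the same frequency-table calculation is given below. It is an illustration only; the function name `frequency_mean` is chosen for this example, and `collections.Counter` from the standard library is used to build the frequency table.

```python
from collections import Counter


def frequency_mean(values):
    """Mean computed from a frequency table: sum(f * X) / sum(f)."""
    freq = Counter(values)                        # X value -> frequency f
    sum_f = sum(freq.values())                    # Σf
    sum_fx = sum(x * f for x, f in freq.items())  # ΣfX
    return sum_fx / sum_f


data = [1, 2, 3, 1, 2, 3, 2]
print(sorted(Counter(data).items()))  # [(1, 2), (2, 3), (3, 2)]
print(frequency_mean(data))           # 2.0
```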
Class Interval Arithmetic Mean Definition:
A class interval is a range of values of a variable, used to divide the scale of the variable for the purpose of tabulating the frequency distribution of a sample. In other words, it is an individual group of scores in a grouped frequency distribution.

Formula:
Class Interval Arithmetic Mean:
Arithmetic Mean = ΣfX/Σf
where
X = midpoint of the class interval
f = frequency of the class interval
Class Interval Arithmetic Mean Example:
To find the arithmetic mean of:

Intervals   Frequency (f)
10 - 20     3
20 - 30     9
30 - 40     5

Step 1: Find Σf.
Σf = 3 + 9 + 5 = 17

Step 2: Then, find the midpoint of each class interval.
Midpoint (X) = (10+20)/2, (20+30)/2, (30+40)/2 = 15, 25, 35

Step 3: Now, find ΣfX.
ΣfX = (3*15) + (9*25) + (5*35) = 45 + 225 + 175 = 445

Step 4: Now, substitute in the formula given above.
Arithmetic mean = ΣfX/Σf = 445/17 = 26.1765

Intervals   Frequency (f)   Midpoint (X)        fX
10 - 20     3               (10 + 20)/2 = 15    3 * 15 = 45
20 - 30     9               (20 + 30)/2 = 25    9 * 25 = 225
30 - 40     5               (30 + 40)/2 = 35    5 * 35 = 175
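The class-interval calculation can be sketched in Python as below. This is a minimal sketch under the assumption that each class is given as a (lower limit, upper limit, frequency) tuple; that layout and the function name `class_interval_mean` are choices made for this illustration.

```python
def class_interval_mean(intervals):
    """Mean of grouped data: sum(f * midpoint) / sum(f).

    intervals: list of (lower, upper, frequency) tuples.
    """
    sum_f = sum(f for _, _, f in intervals)
    sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in intervals)
    return sum_fx / sum_f


# Intervals 10-20, 20-30, 30-40 with frequencies 3, 9, 5 (Σf = 17, ΣfX = 445)
table = [(10, 20, 3), (20, 30, 9), (30, 40, 5)]
print(round(class_interval_mean(table), 4))  # 26.1765
```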
MEDIAN
It is the value of middle observation after
placing the observations in either ascending or
descending order.

Half the values lie above it and half below it.


Median
• Robust measure of central tendency
• Not affected by extreme values
[Figure: two number lines, one marked 0 to 10 and one marked 0 to 14, illustrating that the median is not affected by an extreme value]
• In an ordered array, the median is the “middle” number.
  If n or N is odd, the median is the middle number.
  If n or N is even, the median is the average of the two middle numbers.
TWO TYPES:
1.UNGROUPED SERIES.
2.GROUPED SERIES.
UNGROUPED SERIES

If the number of observations (n) is
odd, then the median of the data will be the ((n+1)/2)th observation.
even, then the median of the data will be the average of the (n/2)th and ((n/2)+1)th observations.
Example 1: To find the median of 4,5,7,2,1 [ODD].
Step 1: Count the total numbers given.
There are 5 elements or numbers in the distribution.

Step 2: Arrange the numbers in ascending order.
1,2,4,5,7

Step 3: The total number of elements in the distribution (5) is odd.
The middle position can be calculated using the formula (n+1)/2.
So the middle position is (5+1)/2 = 6/2 = 3
The number at the 3rd position is the median = 4
Example 2: To find the median of 5,7,2,1,6,4 [EVEN].
Step 1: Count the total numbers given.
There are 6 numbers in the distribution.

Step 2: Arrange the numbers in ascending order.
1,2,4,5,6,7

Step 3: The total number of elements in the distribution (6) is even.
So the average of the two numbers in positions n/2 and (n/2)+1, that is, positions 3 and 4, will be the median of the given data.
Median = (4+5)/2 = 4.5
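The odd and even rules can be sketched in one Python function, shown below as an illustration (the function name `median` is chosen here; the standard library's `statistics.median` does the same job). For the even case 1,2,4,5,6,7 the two middle values, in positions 3 and 4, are 4 and 5.

```python
def median(observations):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not observations:
        raise ValueError("at least one observation is required")
    ordered = sorted(observations)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th observation, which is index n // 2 (0-based)
        return ordered[mid]
    # Even n: average of the (n/2)th and ((n/2)+1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([4, 5, 7, 2, 1]))     # 4
print(median([5, 7, 2, 1, 6, 4]))  # 4.5
```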
GROUPED SERIES
Simply divide the total number of observations by 2.
• If the number of observations is 200, the median will be approximately the 100th observation (strictly, the average of the 100th and 101st).
• If the number of observations is 201, the median will be the 101st observation.
ADVANTAGES:
1. There is only one median for a given data set.
2. It is unaffected by extreme values.

DISADVANTAGES:
1. It does not take all the observations into consideration.
The median is better than the mean in cases where there are extreme values in the given data.

e.g.: The incomes of 5 individuals in a survey are 50, 100, 100, 150, 2000.
Here the mean is 480 and the median is 100.
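The effect of the extreme value on the two measures can be checked with Python's standard `statistics` module. The sketch below is illustrative; dropping the last income to form `without_extreme` is a step added here to make the comparison visible.

```python
import statistics

incomes = [50, 100, 100, 150, 2000]
print(f"mean = {statistics.mean(incomes):.1f}")      # mean = 480.0
print(f"median = {statistics.median(incomes):.1f}")  # median = 100.0

# Same data without the extreme value of 2000 (comparison added for illustration)
without_extreme = incomes[:-1]
print(f"mean = {statistics.mean(without_extreme):.1f}")      # mean = 100.0
print(f"median = {statistics.median(without_extreme):.1f}")  # median = 100.0
```

The mean moves from 100 to 480 when the single extreme income is included, while the median stays at 100.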
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical
data
• There may be no mode or several modes
[Figure: two dot plots, one with Mode = 9 and one with no mode]
PROPERTIES:
1. There could be more than one mode for a given
data.
2. It is unaffected by extreme values.
3. It does not use all the observations in the given
data.
Example: To find the mode of 11,3,5,11,7,3,11
Step 1: Arrange the numbers in ascending order.
3,3,5,7,11,11,11

Step 2: In the above distribution,
Number 11 occurs 3 times,
Number 3 occurs 2 times,
Number 5 occurs 1 time,
Number 7 occurs 1 time.
So the number with the most occurrences is 11, and it is the mode of this distribution.
Mode = 11
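Counting occurrences can be sketched with `collections.Counter`, as below. The function name `modes` is hypothetical, and treating data in which every value occurs exactly once as having no mode is a convention assumed for this sketch. The second and third data sets are made up here to show the "several modes" and "no mode" cases.

```python
from collections import Counter


def modes(observations):
    """Return all values with the highest frequency, in ascending order.

    Assumed convention: if every value occurs exactly once, there is no mode
    and an empty list is returned.
    """
    counts = Counter(observations)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([11, 3, 5, 11, 7, 3, 11]))  # [11]
print(modes([1, 1, 2, 2, 3]))           # [1, 2]  (several modes)
print(modes([1, 2, 3]))                 # []      (no mode)
```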
RANGE:
Range is the difference between the highest and the lowest values in a frequency distribution.

Example: To find the range of 3,5,7,3,11
Step 1: Arrange the numbers in ascending order.
3,3,5,7,11

Step 2: Identify the extremes.
The largest value is 11.
The smallest value is 3.
Range = largest value - smallest value
Range = 11 - 3 = 8
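As a brief illustration, the range can be computed in Python without sorting first, since `max` and `min` find the extremes directly. The function name `value_range` is chosen here only for the example.

```python
def value_range(observations):
    """Range: largest value minus smallest value."""
    return max(observations) - min(observations)


print(value_range([3, 5, 7, 3, 11]))  # 8
```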
Mean, Median, Mode: Which is Best?

1. The mean is generally the first choice.
2. The median is the best in the following scenarios:
   a. there are extreme observations
   b. the rank of a particular value relative to the data set is to be determined
3. The mode is rarely the best measure.


Relationship: Mean, Median, Mode

• Symmetric distribution (simple histogram): Mean, Median and Mode are the same.
• Skewed to the right: Mean is larger than the Median.
• Skewed to the left: Mean is smaller than the Median.
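These relationships can be checked on small data sets with the sketch below. The three data sets are made up for illustration and are not from the slides, and the helper name `compare` is hypothetical. The pattern is a typical tendency of skewed data, not a guarantee for every data set.

```python
import statistics


def compare(observations):
    """Report how the mean relates to the median for the given data."""
    mean = statistics.mean(observations)
    median = statistics.median(observations)
    if mean > median:
        return "mean > median"
    if mean < median:
        return "mean < median"
    return "mean = median"


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # illustrative symmetric data
right_skewed = [1, 2, 2, 2, 3, 3, 4, 9]  # long tail of large values
left_skewed = [1, 6, 7, 7, 8, 8, 8, 9]   # long tail of small values

print(compare(symmetric))     # mean = median
print(compare(right_skewed))  # mean > median
print(compare(left_skewed))   # mean < median
```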
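If a distribution is symmetrical, the mean, median and mode may coincide.
[Figure: symmetric distribution with the mean, median and mode marked]

A positively skewed distribution (“skewed to the right”)
[Figure: positively skewed distribution with the mode, median and mean marked]
Note: The median is not as sensitive to skewness as the mean.

A negatively skewed distribution (“skewed to the left”)
[Figure: negatively skewed distribution with the mean, median and mode marked]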
