BS3. Statistics


LIFE UNIVERSITY

Department of Foundation Year

Subject: Basic statistics

Instructor: Chrin Mac

Tel: 012 525 988

Email: [email protected]
Chapter 3: Describing Data: Measures
of Location and Dispersion
OBJECTIVES
- Measures of Location in Data Sets
- The Arithmetic Mean
- The Weighted Mean
- The Median
- The Mode
- Relationship Between Mode, Mean and Median
- Quartiles, Deciles, and Percentiles
Measures of Location in
Data Sets
- A measure of location is a value that is calculated for a group of data and that is used to describe the data in some way.
- In statistics, measures of location focus on the average of a data set.
- An average is a measure of central tendency for a collection of values.
The Arithmetic Mean
(In Group of Data)
- The arithmetic mean, or arithmetic average, is defined as the sum of the values in the data group divided by the number of values.
- The arithmetic mean is defined by the formulas:

For a population:
\[ \mu = \frac{\sum X}{N} \]
For a sample:
\[ \bar{X} = \frac{\sum X}{n} \]
Example

The number of foundation year students who visited the Life University Library on each day of one week was:

Day          No. of Students
Monday       4
Tuesday      6
Wednesday    3
Thursday     5
Friday       8

Compute the mean.

Solution
- Compute the mean using the arithmetic mean formula:

\[ \bar{X} = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + X_4 + X_5}{N} \]

- where \(X_1 = 4\), \(X_2 = 6\), \(X_3 = 3\), \(X_4 = 5\), \(X_5 = 8\), and \(N = 5\):

\[ \bar{X} = \frac{4 + 6 + 3 + 5 + 8}{5} = \frac{26}{5} = 5.2 \approx 5 \]

- Therefore, the mean is 5.2, or about 5.
- This means that, on average, about 5 students went to the library per day during that week.
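
The arithmetic above can be checked with a minimal Python sketch. The names `arithmetic_mean` and `visits` are illustrative and not part of the course material.

```python
# Daily library visits from the example (Monday to Friday).
visits = [4, 6, 3, 5, 8]


def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("cannot take the mean of an empty data set")
    return sum(values) / len(values)


mean_visits = arithmetic_mean(visits)
print(mean_visits)         # 5.2
print(round(mean_visits))  # 5
```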
Practices
1. Find the arithmetic mean of the following numbers: 8, 6, 10, 11, 12, 6, 6, 8, 14, 15, 14, 14, 10, 12, 6, 11. (Ans: 163/16 = 10.19)
2. Eight automobiles were priced at $10,499; $11,988; $7,444; $5,995; $14,999; $6,492; $10,750; and $7,937. What is the arithmetic mean of the prices?
3. Mrs. Smith purchased ten dozen rolls as follows: 2 dozen at $1.19, 1 dozen at $1.88, 3 dozen at $0.94, and 4 dozen at $1.28. What average price per dozen did she pay for the ten dozen rolls?
The Weighted Mean
- The weighted mean, or weighted average, is an arithmetic mean in which each value is weighted according to its importance in the overall group.
- The weighted mean of a set of numbers \(X_1, X_2, X_3, \ldots, X_n\) with corresponding weights \(W_1, W_2, W_3, \ldots, W_n\) is computed from the following formulas:

For a population:
\[ \mu_W = \frac{W_1 X_1 + W_2 X_2 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} \]
For a sample:
\[ \bar{X}_W = \frac{W_1 X_1 + W_2 X_2 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} \]


Example

During a one-hour period on a hot Saturday afternoon, a boy named Chris served fifty drinks. He sold five drinks for $0.50, fifteen for $0.75, fifteen for $0.90, and fifteen for $1.10. Compute the weighted mean of the price of the drinks. (Ans: $0.875)
Solution
- Compute the weighted mean of the price of the drinks using the weighted mean formula:

\[ \bar{X}_W = \frac{W_1 X_1 + W_2 X_2 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} \]

- Where: \(W_1 = 5\), \(W_2 = 15\), \(W_3 = 15\), and \(W_4 = 15\);
  \(X_1 = 0.50\), \(X_2 = 0.75\), \(X_3 = 0.90\), and \(X_4 = 1.10\).

\[ \bar{X}_W = \frac{5(0.50) + 15(0.75) + 15(0.90) + 15(1.10)}{5 + 15 + 15 + 15} = \frac{43.75}{50} = 0.875 \]

- Therefore, the weighted mean is $0.875.
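
As a hedged illustration of the weighted mean formula, the drink example can be reproduced in a few lines of Python. The function name `weighted_mean` and the variable names are assumptions made for this sketch.

```python
def weighted_mean(values, weights):
    """Return sum(W_i * X_i) / sum(W_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Prices (X) and number of drinks sold at each price (W) from the example.
prices = [0.50, 0.75, 0.90, 1.10]
drinks_sold = [5, 15, 15, 15]

print(round(weighted_mean(prices, drinks_sold), 3))  # 0.875
```

The same function applies to any table of values and counts, with the counts playing the role of the weights.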
Practices
The monthly salaries of garment workers at one garment factory in Preah Sihanouk Province are:

Monthly Salary (USD)    No. of Garment Workers
150 (= X1)              250 (= W1)
200                     150
250                     100
300                     50
350                     20

Compute the weighted mean of the monthly salary of the garment workers. (Ans: 114,500/570 = $200.88; N = 570)


The Median (In Group of Data)

- If the sample data are arranged in increasing order, the median is
(i) the middle value if n is an odd number, or
(ii) midway between the two middle values if n is an even number.
Example 1 – n is odd
The reordered systolic blood pressure data seen earlier are:

113, 124, 124, 132, 146, 151, and 170.

The median is the middle value of the ordered data, i.e. 132.
Example 2 – n is even
Six men with high cholesterol participated in a study to investigate the effects of diet on cholesterol level. At the beginning of the study, their cholesterol levels (mg/dL) were as follows:
366, 327, 274, 292, 274 and 230.
Rearrange the data in numerical order as follows:

230, 274, 274, 292, 327 and 366.

The median is halfway between the middle two readings, i.e. (274 + 292) ÷ 2 = 283.
Another Way of Computing
the Median
- To compute the median, we first find its position in the ordered data. There are two cases:
1. If n is odd, the median is the value at position \(\frac{n+1}{2}\).
2. If n is even, take position \(\frac{n+2}{2}\); the median is the average of the value at that position and the value just before it.
Examples

1. Find the median for the following two data sets.
Data set 1: 10, 12, 15, 15, 18, 20.
Data set 2: 12, 15, 15, 24, 26, 28, 28.
Solution
In data set 1, n = 6 is even, so the position of the median is
\[ \frac{n+2}{2} = \frac{6+2}{2} = 4 \]
Data set 1: 10, 12, 15, 15, 18, 20.
The median is the average of the 3rd and 4th values:
\[ \frac{15 + 15}{2} = 15 \]
Therefore, the median is 15.
Solution (Con’t)

In data set 2: 12, 15, 15, 24, 26, 28, 28.

Here n = 7 is odd, so the position of the median is
\[ \frac{n+1}{2} = \frac{7+1}{2} = 4 \]
Counting to the 4th value in data set 2:
12, 15, 15, 24, 26, 28, 28.
Therefore, the median is 24.
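
The position rules above can be sketched in Python as follows. This is a minimal illustration that uses the slides' positions (n+1)/2 for odd n and (n+2)/2 for even n; the function name `median_by_position` is hypothetical.

```python
def median_by_position(values):
    """Median of ungrouped data using 1-based positions in the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("cannot take the median of an empty data set")
    if n % 2 == 1:
        position = (n + 1) // 2          # middle value
        return ordered[position - 1]
    position = (n + 2) // 2              # upper of the two middle values
    return (ordered[position - 2] + ordered[position - 1]) / 2


print(median_by_position([10, 12, 15, 15, 18, 20]))         # 15.0
print(median_by_position([12, 15, 15, 24, 26, 28, 28]))     # 24
print(median_by_position([366, 327, 274, 292, 274, 230]))   # 283.0
```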
Practice

A bank auditor selects 11 checking accounts and records the amount in each of the accounts. The 11 observations in increasing order are as follows (in dollars):
150.25 175.35 195.00 200.00
235.00 240.45 250.55 256.00
275.50 290.10 300.55
Compute the median of the data.
The Mode (In Group of Data)

- The mode is the value that occurs most frequently in a set of values.

- Example: For a sample of 15 students at an elementary school snack bar, the following sales amounts, arranged in ascending order of magnitude, are observed: $0.10, 0.10, 0.25, 0.25, 0.25, 0.35, 0.40, 0.53, 0.90, 1.25, 1.35, 2.45, 2.71, 3.09, 4.10. Determine the mode for these sales amounts.
Solution

Determine the mode for these sales amounts.
Data:
$0.10, 0.10, 0.25, 0.25, 0.25, 0.35, 0.40, 0.53, 0.90, 1.25, 1.35, 2.45, 2.71, 3.09, 4.10
In the data set, the value that occurs most frequently is $0.25 (three times).
- Therefore, the mode is $0.25.
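
Counting how often each value occurs can be sketched with Python's standard `collections.Counter`. The helper name `modes` is an assumption of this sketch; it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        raise ValueError("cannot take the mode of an empty data set")
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


sales = [0.10, 0.10, 0.25, 0.25, 0.25, 0.35, 0.40, 0.53, 0.90,
         1.25, 1.35, 2.45, 2.71, 3.09, 4.10]
print(modes(sales))  # [0.25]
```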
The Mean, Median and Mode
(In Class of Data)
Mean
The mean of grouped (class) data is computed with the formulas:

For a population:
\[ \mu = \frac{\sum fX}{N} \]
For a sample:
\[ \bar{X} = \frac{\sum fX}{n} \]

Where: X is the midpoint of each class; f is the frequency of each class; and n or N is the total number of observations.
Example
The prices of the 80 vehicles sold last month at Whitner Autoplex are as follows:

Selling Price ($ Thousands)    Frequency
12 up to 15                    8
15 up to 18                    23
18 up to 21                    17
21 up to 24                    18
24 up to 27                    8
27 up to 30                    4
30 up to 33                    2
Total                          80

Compute the mean.
Solution
- Compute the mean using the formula:
\[ \bar{X} = \frac{\sum fX}{n} \]

Selling Price      Frequency    Midpoint    fX
($ Thousands)      (f)          (X)
12 up to 15        8            13.5        108
15 up to 18        23           16.5        379.5
18 up to 21        17           19.5        331.5
21 up to 24        18           22.5        405
24 up to 27        8            25.5        204
27 up to 30        4            28.5        114
30 up to 33        2            31.5        63
Total              80                       \(\sum fX = 1605\)
Solution (Con’t)

- Where n = 80:
\[ \bar{X} = \frac{\sum fX}{n} = \frac{1605}{80} = 20.0625 \]
- Therefore, the mean is 20.0625 thousand dollars, i.e. $20,062.50.
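
The grouped mean can be recomputed with a short Python sketch, which also helps to check each fX entry (for the class 24 up to 27, fX = 8 × 25.5 = 204, giving a total of 1605). The tuple layout `(lower, upper, frequency)` and the name `grouped_mean` are assumptions made for illustration.

```python
# (lower limit, upper limit, frequency) for each class, in $ thousands.
vehicle_classes = [
    (12, 15, 8), (15, 18, 23), (18, 21, 17), (21, 24, 18),
    (24, 27, 8), (27, 30, 4), (30, 33, 2),
]


def grouped_mean(classes):
    """Mean of grouped data: sum(f * midpoint) / sum(f)."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / n


print(grouped_mean(vehicle_classes))  # 20.0625
```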
Practice
The frequency distribution of monthly apartment rental rates for 200 apartments is:

Rental Rate (USD)    Number of Apartments
350-379              3
380-409              8
410-439              10
440-469              13
470-499              33
500-529              40
530-559              35
560-589              30
590-619              16
620-649              12
Total                n = 200

Find the mean of the data. (Ans: with class midpoints 364.5, 394.5, ..., 634.5, \(\sum fX = 104{,}550\) and the mean is $522.75)
The Median (In Class of Data)
- The median of grouped (class) data is computed with the formula:

\[ \text{Median} = L + \frac{\frac{n}{2} - CF}{f}\,(i) \]

Where: L is the lower limit of the median class; n is the total number of observations; f is the frequency of the median class; CF is the cumulative frequency of the class before the median class; and i is the class interval of the median class.
Example
- The prices of the 80 vehicles sold last month at Whitner Autoplex are as follows:

Selling Price ($ Thousands)    Frequency
12 up to 15                    8
15 up to 18                    23
18 up to 21                    17
21 up to 24                    18
24 up to 27                    8
27 up to 30                    4
30 up to 33                    2
Total                          80

Compute the median of the data.
Solution
- Compute the median of the data.
- Data:

Selling Price      Frequency (f)    Cumulative
($ Thousands)                       Frequency (CF)
12 up to 15        8                8
15 up to 18        23               31
18 up to 21        17               48 (median class)
21 up to 24        18               66
24 up to 27        8                74
27 up to 30        4                78
30 up to 33        2                80
Total              80
Solution (Con’t)
- Where n = 80 is even, the position of the median is
\[ \frac{n+2}{2} = \frac{80+2}{2} = 41 \]
- So the median is in the class 18 up to 21, because 41 lies between the cumulative frequencies 31 and 48.
- With \(f = 17\), \(CF = 31\), \(L = 18000\), and \(i = 3000\):
\[ \text{Median} = 18000 + \frac{\frac{80}{2} - 31}{17}\,(3000) \approx 19{,}588.24 \]
Therefore, the median is about 19,588.24 USD.
- Mean: \(\bar{X} = \frac{\sum fX}{n} = \frac{1605}{80}\) thousand = 20,062.50 USD
- Median ≈ 19,588.24 USD
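
The grouped median formula can be sketched in Python as below. This is an illustration under stated assumptions: classes are given as hypothetical `(lower, upper, frequency)` tuples in increasing order, and the median class is taken to be the first class whose cumulative frequency reaches n/2, which picks the same class as the slides' position rule for this data.

```python
def grouped_median(classes):
    """Median = L + ((n/2 - CF) / f) * i for the median class."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cf + f >= half:
            return lower + (half - cf) / f * (upper - lower)
        cf += f
    raise ValueError("median class not found")


# Vehicle prices in $ thousands.
vehicles = [(12, 15, 8), (15, 18, 23), (18, 21, 17), (21, 24, 18),
            (24, 27, 8), (27, 30, 4), (30, 33, 2)]
print(round(grouped_median(vehicles), 3))  # 19.588
```

For the vehicle data this gives about 19.588 thousand dollars, i.e. 18 + (40 − 31)/17 × 3.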
Practice

Ages of a sample of applicants for a training program:

Age            No. of Applicants
18 up to 20    5
20 up to 22    18
22 up to 24    10
24 up to 26    6
26 up to 28    6
28 up to 30    5
30 up to 32    2
Total          52

Compute the median of the data. (Ans: Median = 22.6)
The Mode (In Class of Data)
- The mode of grouped (class) data is the midpoint of the class with the highest frequency.
- Example: Ages of a sample of applicants for a training program. Find the mode of the data.

Age            No. of Applicants
18 up to 20    5
20 up to 22    18
22 up to 24    10
24 up to 26    6
26 up to 28    6
28 up to 30    5
30 up to 32    2
Total          52

Solution
The mode is in the class 20 up to 22, because it has the highest frequency (18).
\[ \text{Mode} = \frac{20 + 22}{2} = 21 \]
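
A minimal Python sketch of this rule: pick the class with the highest frequency and return its midpoint. The `(lower, upper, frequency)` tuples and the name `grouped_mode` are illustrative assumptions; if two classes tie, this sketch simply returns the first one.

```python
def grouped_mode(classes):
    """Midpoint of the class with the highest frequency."""
    if not classes:
        raise ValueError("no classes given")
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower + upper) / 2


applicant_ages = [(18, 20, 5), (20, 22, 18), (22, 24, 10), (24, 26, 6),
                  (26, 28, 6), (28, 30, 5), (30, 32, 2)]
print(grouped_mode(applicant_ages))  # 21.0
```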
Practice

Lifetime of cutting tools in an industrial process:

Hours before replacement    No. of Tools
18 up to 20                 5
20 up to 22                 18
22 up to 24                 10
24 up to 26                 6
26 up to 28                 6
28 up to 30                 5
30 up to 32                 2
Total                       52

Compute the mean, median, and mode of the data.
Relationship Between Mean
Mode and Median
- Large sample values tend to inflate the mean. This will happen if the histogram of the data is right-skewed.

- The median is not influenced by large sample values and is a better measure of centrality if the distribution is skewed.

- Note: if mean = median = mode, then the data are said to be symmetrical.

- E.g. in the CK measurement study, the sample mean = 98.28 and the median = 94.5, i.e. the mean is larger than the median, indicating that the mean is inflated by the two large data values 201 and 203.
Relationship Between Mean
Mode and Median(Con’t)
 The relationship between Mean, Mode,
and median is:

Mean  Median  Mode


Mean  Median  Mode
Median  Mean  Mode
Mode  Mean  Median
Quartiles, Deciles, and
Percentiles
Quartiles
- Quartiles are measures of position that describe how the data are spread.
- Quartiles divide the ordered data into 4 equal parts, using three values: Q1, Q2, and Q3.
Where: Q1 is the first quartile; Q2 is the second quartile; and Q3 is the third quartile.
Figure: Quartile divisions. Q1 = 25% (first quartile), Q2 = 50% (second quartile), Q3 = 75% (third quartile).
Quartile with Discrete Variables
- Case 1: the number of observations n is odd.
- The quartiles of a discrete variable are found at the following positions in the ordered data:
- The first quartile \(Q_1\) is at position \(\frac{1}{4}(n+1)\)
- The second quartile \(Q_2\) is at position \(\frac{1}{2}(n+1)\)
- The third quartile \(Q_3\) is at position \(\frac{3}{4}(n+1)\)
Example
- Find \(Q_1\), \(Q_2\), and \(Q_3\) for the data set:

30 30 50 60 80 90 120 140 190 200 240.
Solution
- Find \(Q_1\), \(Q_2\), and \(Q_3\) for the data set.
- We observe that n = 11 is odd.
Data set: 30 30 50 60 80 90 120 140 190 200 240
Solution (Con’t)
- \(Q_1\) is the value at position \(\frac{1}{4}(n+1) = \frac{1}{4}(11+1) = 3\).
Therefore, \(Q_1 = 50\).
- \(Q_2\) is the value at position \(\frac{1}{2}(n+1) = \frac{1}{2}(11+1) = 6\).
Therefore, \(Q_2 = 90\).
- \(Q_3\) is the value at position \(\frac{3}{4}(n+1) = \frac{3}{4}(11+1) = 9\).
Therefore, \(Q_3 = 190\).

Therefore, \(Q_1 = 50\), \(Q_2 = 90\), and \(Q_3 = 190\).
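
The position formulas for odd n can be sketched in Python as below. The function name `quartiles_by_position` is hypothetical, and the sketch only handles the case where all three positions are whole numbers, as in the example; it deliberately does not guess an interpolation rule for other cases.

```python
def quartiles_by_position(values):
    """Q1, Q2, Q3 at 1-based positions (n+1)/4, (n+1)/2, 3(n+1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    positions = [(n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4]
    if any(p != int(p) for p in positions):
        raise ValueError("positions are not whole numbers for this n")
    return tuple(ordered[int(p) - 1] for p in positions)


data = [30, 30, 50, 60, 80, 90, 120, 140, 190, 200, 240]
print(quartiles_by_position(data))  # (50, 90, 190)
```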


Practice (Homework)
- The statistics examination results of foundation year students are:
60 70 60 75 65 60 70 70
80 80 90 75 60 70 89 80
70 75 60 65 65 80 90 75
70 60 65 75 75 80 80 70
90.
- Compute \(Q_1\), \(Q_2\), and \(Q_3\).
Case 2: n is even
- When n is even, the quartiles are found at the following positions:
- The second quartile \(Q_2\) is at position \(\frac{n+1}{2}\); it is the median.
- The first quartile \(Q_1\) is at position \(\frac{(n+1)/2}{2}\), counted to the left of \(Q_2\).
- The third quartile \(Q_3\) is at position \(\frac{(n+1)/2}{2}\), counted to the right of \(Q_2\).
Example

- Compute \(Q_1\), \(Q_2\), and \(Q_3\) for the data set: 20 23 23 26 27 28.
Solution
- Compute \(Q_1\), \(Q_2\), and \(Q_3\) for the data set: 20 23 23 26 27 28.
- Where: n = 6 is even.
Solution (Con’t)

- \(Q_2\) is the median. Its position in the data set is
\[ \frac{n+1}{2} = \frac{6+1}{2} = 3.5 \]
- So \(Q_2 = \frac{23 + 26}{2} = 24.5\).
- Then
\[ \frac{(n+1)/2}{2} = \frac{(6+1)/2}{2} = 1.75 \approx 2 \]
counted to the left of \(Q_2\), so \(Q_1 = 23\).
- Moreover, the same position 2 counted to the right of \(Q_2\) gives \(Q_3 = 27\).

- Therefore, \(Q_1 = 23\), \(Q_2 = 24.5\), and \(Q_3 = 27\).


Practice

- Compute \(Q_1\), \(Q_2\), and \(Q_3\) for the data set: 50 50 60 60 70 70.
Quartile of Continuous Variable

- The quartiles of a continuous (grouped) variable are computed with:

\[ Q_1 = L + \frac{\left(\frac{n}{4} - CF\right) \cdot i}{f} \]
\[ Q_2 = L + \frac{\left(\frac{n}{2} - CF\right) \cdot i}{f} \]
\[ Q_3 = L + \frac{\left(\frac{3n}{4} - CF\right) \cdot i}{f} \]

Where:
- L: the lower bound of the quartile class.
- n: the total number of observations.
- CF: the cumulative frequency of the classes before the quartile class.
- f: the frequency of the quartile class.
- i: the class interval of the quartile class.
Practice
The monthly salaries of Life University staff are:

Monthly Salary    Frequency
100-150           67
150-200           550
200-250           454
250-300           418
300-350           376
350-400           375
400-450           328

Compute \(Q_1\), \(Q_2\), and \(Q_3\).
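Solution

Compute \(Q_1\), \(Q_2\), and \(Q_3\) using:

\[ Q_1 = L + \frac{\left(\frac{n}{4} - CF\right) \cdot i}{f} \]
\[ Q_2 = L + \frac{\left(\frac{n}{2} - CF\right) \cdot i}{f} \]
\[ Q_3 = L + \frac{\left(\frac{3n}{4} - CF\right) \cdot i}{f} \]

- Where: n = 2568 (even)
- Position of \(Q_1\): \(\frac{1}{4} n = \frac{1}{4}(2568) = 642\)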
Solution (Con’t)

The cumulative frequencies are:

Monthly Salary ($)    Frequency    CF
100-150               67           67
150-200               550          617
200-250               454          1071
250-300               418          1489
300-350               376          1865
350-400               375          2240
400-450               328          2568
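Solution (Con’t)
- The position of \(Q_1\) (642) lies between the cumulative frequencies 617 and 1071.
- So \(Q_1\) is in the class 200-250.
- Where: \(L = 200\); \(\frac{n}{4} = 642\); \(CF = 617\); \(i = 250 - 200 = 50\); and \(f = 1071 - 617 = 454\).
\[ Q_1 = L + \frac{\left(\frac{n}{4} - CF\right) \cdot i}{f} = 200 + \frac{(642 - 617) \cdot 50}{454} = 202.75 \]
- Therefore, \(Q_1 = \$202.75\): one quarter (1/4) of LU staff have a monthly salary of less than $202.75.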
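Solution (Con’t)

- \(Q_2\) is the median; its class is 250-300, because \(\frac{n}{2} = 1284\) lies between the cumulative frequencies 1071 and 1489.
- Where: \(L = 250\); \(\frac{n}{2} = \frac{2568}{2} = 1284\); \(f = 1489 - 1071 = 418\); \(CF = 1071\); \(i = 50\).
\[ Q_2 = L + \frac{\left(\frac{n}{2} - CF\right) \cdot i}{f} = 250 + \frac{(1284 - 1071) \cdot 50}{418} \approx 275.48 \]
- Therefore, \(Q_2 \approx \$275.48\).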
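Solution (Con’t)
- Compute \(Q_3\).
- \(Q_3\) is in the class 350-400, because \(\frac{3n}{4} = 1926\) lies between the cumulative frequencies 1865 and 2240.
- Where: \(L = 350\); \(\frac{3n}{4} = \frac{3(2568)}{4} = 1926\); \(f = 2240 - 1865 = 375\); \(CF = 1865\); \(i = 50\).
\[ Q_3 = L + \frac{\left(\frac{3n}{4} - CF\right) \cdot i}{f} = 350 + \frac{(1926 - 1865) \cdot 50}{375} = 358.13 \]
Therefore, \(Q_3 = \$358.13\).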
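
The three grouped quartile formulas share one pattern, which can be sketched as a single Python function. The name `grouped_quantile`, the `fraction` parameter (0.25, 0.5, 0.75), and the `(lower, upper, frequency)` tuples are assumptions made for this illustration.

```python
def grouped_quantile(classes, fraction):
    """L + ((fraction * n - CF) * i) / f for the class containing the target."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    target = fraction * n
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cf + f >= target:
            return lower + (target - cf) * (upper - lower) / f
        cf += f
    raise ValueError("quantile class not found")


salaries = [(100, 150, 67), (150, 200, 550), (200, 250, 454), (250, 300, 418),
            (300, 350, 376), (350, 400, 375), (400, 450, 328)]
for name, fraction in [("Q1", 0.25), ("Q2", 0.50), ("Q3", 0.75)]:
    print(name, round(grouped_quantile(salaries, fraction), 2))
# Q1 202.75
# Q2 275.48
# Q3 358.13
```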
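Deciles

- Deciles divide the ordered data into ten equal parts of 10% each.

- To compute the position of each decile, we use:

\[ D_1 = \frac{n}{10} + \frac{1}{2};\quad D_2 = \frac{2n}{10} + \frac{1}{2};\quad D_3 = \frac{3n}{10} + \frac{1}{2};\quad D_4 = \frac{4n}{10} + \frac{1}{2};\quad D_5 = \frac{5n}{10} + \frac{1}{2} \]
\[ D_6 = \frac{6n}{10} + \frac{1}{2};\quad D_7 = \frac{7n}{10} + \frac{1}{2};\quad D_8 = \frac{8n}{10} + \frac{1}{2};\quad \text{and } D_9 = \frac{9n}{10} + \frac{1}{2} \]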
Example
- Compute \(D_3\), \(D_5\), and \(D_7\) for the data set:
10 10 15 15 20 20 25 15 25 25 10 10 30 30 30
Solution
Compute \(D_3\), \(D_5\), and \(D_7\).
Data set in increasing order: 10 10 10 10 15 15 15 20 20 25 25 25 30 30 30
Using the position formulas:
\[ D_3 = \frac{3n}{10} + \frac{1}{2};\quad D_5 = \frac{5n}{10} + \frac{1}{2};\quad \text{and } D_7 = \frac{7n}{10} + \frac{1}{2} \]
Solution (Con’t)
- The position of \(D_3\) is \(\frac{3n}{10} + \frac{1}{2} = \frac{3(15)}{10} + \frac{1}{2} = 5\).
- Therefore, \(D_3 = 15\).
- The position of \(D_5\) is \(\frac{5n}{10} + \frac{1}{2} = \frac{5(15)}{10} + \frac{1}{2} = 8\).
- Therefore, \(D_5 = 20\).
- The position of \(D_7\) is \(\frac{7n}{10} + \frac{1}{2} = \frac{7(15)}{10} + \frac{1}{2} = 11\).
- Therefore, \(D_7 = 25\).
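
The decile position formula can be sketched in Python as follows. The function name `decile` is hypothetical, and rounding a fractional position to the nearest whole position is an assumption of this sketch; in this example all three positions are already whole numbers.

```python
import math


def decile(values, k):
    """Value at 1-based position k*n/10 + 1/2 in the sorted data (k = 1..9)."""
    if not 1 <= k <= 9:
        raise ValueError("k must be between 1 and 9")
    ordered = sorted(values)
    n = len(ordered)
    position = k * n / 10 + 0.5
    index = math.floor(position + 0.5)   # assumption: round half up
    index = min(max(index, 1), n)        # keep the position inside the data
    return ordered[index - 1]


data = [10, 10, 15, 15, 20, 20, 25, 15, 25, 25, 10, 10, 30, 30, 30]
print(decile(data, 3), decile(data, 5), decile(data, 7))  # 15 20 25
```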
Practice

- Compute \(D_1\) up to \(D_{10}\) for the data set: 257 278 269 257 456 389 466 389 465 578 579 670
- Where n = 12.
- Position of \(D_1 = \frac{n}{10} + \frac{1}{2} = \frac{12}{10} + \frac{1}{2} = 1.7 \approx 2\).
Percentiles
- Percentiles divide the ordered data into 100 equal parts.
- We compute a percentile by first finding its position with the formulas:

\[ P_1 \text{ (first)} = \frac{n}{100} + \frac{1}{2};\ \ldots\ P_{35} \text{ (thirty-fifth)} = \frac{35n}{100} + \frac{1}{2};\ \ldots \]
\[ P_{50} \text{ (fiftieth)} = \frac{50n}{100} + \frac{1}{2};\ \ldots\ P_{70} \text{ (seventieth)} = \frac{70n}{100} + \frac{1}{2};\ \ldots \]
\[ \text{and } P_{100} \text{ (hundredth)} = \frac{100n}{100} + \frac{1}{2} \]
Example

- Compute \(P_{35}\), \(P_{50}\), and \(P_{90}\) for the data set: 10 10 15 15 20 20 25 15 25 25 10 10 30 30 30.
Solution
Compute \(P_{35}\), \(P_{50}\), and \(P_{90}\).
- The positions are:
\[ P_{35} = \frac{35n}{100} + \frac{1}{2};\quad P_{50} = \frac{50n}{100} + \frac{1}{2};\quad \text{and } P_{90} = \frac{90n}{100} + \frac{1}{2} \]
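Solution (Con’t)
- Data set in increasing order: 10 10 10 10 15 15 15 20 20 25 25 25 30 30 30
\[ P_{35} = \frac{35(15)}{100} + \frac{1}{2} = 5.75 \approx 6 \]
\[ P_{50} = \frac{50(15)}{100} + \frac{1}{2} = 8 \]
\[ P_{90} = \frac{90(15)}{100} + \frac{1}{2} = 14 \]
- Therefore, \(P_{35} = 15\), \(P_{50} = 20\), and \(P_{90} = 30\).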
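
A short Python sketch of the percentile position rule, returning both the rounded position and the value found there. The name `percentile_by_position` is hypothetical, and rounding to the nearest whole position (5.75 becomes 6) is taken from the worked example rather than from a stated rule.

```python
import math


def percentile_by_position(values, k):
    """Return (position, value) for position k*n/100 + 1/2 in sorted data."""
    if not 1 <= k <= 100:
        raise ValueError("k must be between 1 and 100")
    ordered = sorted(values)
    n = len(ordered)
    position = k * n / 100 + 0.5
    index = math.floor(position + 0.5)   # assumption: round half up
    index = min(max(index, 1), n)        # keep the position inside the data
    return index, ordered[index - 1]


data = [10, 10, 15, 15, 20, 20, 25, 15, 25, 25, 10, 10, 30, 30, 30]
for k in (35, 50, 90):
    print(k, percentile_by_position(data, k))
# 35 (6, 15)
# 50 (8, 20)
# 90 (14, 30)
```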
Practice

- Domestic disturbance calls per 24-hour period, for the data:
19 22 23 25 27 29 29 30 30 34 36 37 37 38 38 40 40 40 44 45 46 46 47 47 49 50 50 56 56 58.
- Find the values of:
\(Q_1\), \(Q_2\), and \(Q_3\);
\(D_4\), \(D_6\), and \(D_8\);
\(P_{15}\), \(P_{25}\), \(P_{35}\), \(P_{45}\), \(P_{55}\), \(P_{75}\), \(P_{85}\), and \(P_{95}\).
Thank You
