Lesson 14.

Analysis of Variance

Analysis of Variance (ANOVA) is a test used to determine whether the means of several normal
populations differ significantly.

The means of three or more normally distributed populations can be compared simultaneously in a
single application of this test. The test is therefore a generalization of the z and t tests for
two normal population means. It was developed by Sir Ronald Fisher (1890 - 1962).

The following assumptions should be met when using ANOVA:

1. The various groups are assumed to be normal populations.

2. The variances of the different groups are assumed to be equal.

3. The random samples in the groups should be independent.

When all three assumptions are met, the results of the analysis of variance will be valid.

A. One-Way Analysis of Variance

The following formulas are used:


$$SST = \sum x^2 - \frac{\left(\sum x\right)^2}{N}$$

Where:
SST = sum of squares total
x = individual values in each column
N = total sample size

$$SSB = \sum_{c} \frac{\left(\sum x_c\right)^2}{n_c} - \frac{\left(\sum x\right)^2}{N}$$

Where:
SSB = sum of squares between columns
∑x_c = sum of individual values in column c
n_c = size of sample in column c

$$SSW = SST - SSB$$

Where:

SSW = sum of squares within columns
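
The three sums of squares can be sketched in Python as follows. This is a minimal illustration rather than part of the original lesson: the function name `one_way_sums_of_squares` and the small data set are hypothetical, chosen only to show the arithmetic.

```python
def one_way_sums_of_squares(columns):
    """Return (SST, SSB, SSW) for a list of columns (groups) of numbers."""
    all_values = [x for column in columns for x in column]
    n_total = len(all_values)                     # N
    correction = sum(all_values) ** 2 / n_total   # (sum of x)^2 / N

    sst = sum(x * x for x in all_values) - correction
    ssb = sum(sum(column) ** 2 / len(column) for column in columns) - correction
    ssw = sst - ssb
    return sst, ssb, ssw


# Hypothetical data: two columns of three values each.
sst, ssb, ssw = one_way_sums_of_squares([[1, 2, 3], [4, 5, 6]])
print(sst, ssb, ssw)
```

Because each column total is divided by its own sample size, the same function also handles columns of unequal length.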


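Analysis of Variance for One-Way Classification: Table Summary

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | SSB | k - 1 | MSB = SSB / (k - 1) | F = MSB / MSW |
| Within Col. | SSW | k(n - 1) | MSW = SSW / [k(n - 1)] | |
| Total | SST | N - 1 | | |

Here k is the number of columns and n is the sample size per column. When the sample sizes are unequal, the within-column degrees of freedom are N - k.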
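
As a rough sketch of how the table is filled in once the sums of squares are known, the Python function below computes the degrees of freedom, mean squares and F-value. The function name `one_way_anova_table` and the input numbers are hypothetical and serve only as an illustration.

```python
def one_way_anova_table(ssb, sst, k, n_total):
    """Fill in the one-way ANOVA table from SSB, SST, k columns and N values."""
    ssw = sst - ssb
    df_between = k - 1
    df_within = n_total - k    # equals k(n - 1) when every column has n values
    df_total = n_total - 1
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "Between Col.": (ssb, df_between, msb, msb / msw),
        "Within Col.": (ssw, df_within, msw, None),
        "Total": (sst, df_total, None, None),
    }


# Hypothetical sums of squares for k = 2 columns and N = 6 values.
table = one_way_anova_table(ssb=13.5, sst=17.5, k=2, n_total=6)
for source, row in table.items():
    print(source, row)
```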

Example 1: Equal Sample Size

A safety engineer is testing four different types of smoke alarm systems. After installing five of each
type in a smoke chamber, he introduced smoke to a uniform level, electrically connected the alarms,
and observed the reaction time in seconds. Is there a significant difference in the reaction time of the
four types?

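| Observation | Alarm Type 1 | Alarm Type 2 | Alarm Type 3 | Alarm Type 4 |
|---|---|---|---|---|
| 1 | 5.2 | 7.4 | 3.9 | 12.3 |
| 2 | 6.3 | 8.1 | 6.4 | 9.4 |
| 3 | 4.9 | 5.9 | 7.9 | 7.8 |
| 4 | 3.2 | 6.5 | 9.3 | 10.8 |
| 5 | 6.8 | 4.9 | 4.1 | 8.5 |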

Solution:
Step 1. Make assumptions.
- Samples are randomly selected
- Distribution is normal

Step 2. State the hypothesis.


Ho: µ1 = µ2 = µ3 = µ4
All population means are equal.
Ha: At least two of the means are not equal.

Step 3. Set a critical region.


Sampling distribution = F distribution.
Level of significance = 0.05 (right-tailed)
df numerator = k - 1 = 4 - 1 = 3
df denominator = k(n- 1) = 4(5 - 1) = 16
F tabular = 3.24

Step 4. Compute the test statistic


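| Observation | Alarm Type 1 | Alarm Type 2 | Alarm Type 3 | Alarm Type 4 | Total |
|---|---|---|---|---|---|
| 1 | 5.2 | 7.4 | 3.9 | 12.3 | |
| 2 | 6.3 | 8.1 | 6.4 | 9.4 | |
| 3 | 4.9 | 5.9 | 7.9 | 7.8 | |
| 4 | 3.2 | 6.5 | 9.3 | 10.8 | |
| 5 | 6.8 | 4.9 | 4.1 | 8.5 | |
| Total | 26.4 | 32.8 | 31.5 | 48.8 | 139.5 |
| Mean | 5.28 | 6.56 | 6.30 | 9.76 | 6.975 |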

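$$SST = \left[5.2^2 + 6.3^2 + 4.9^2 + 3.2^2 + 6.8^2 + 7.4^2 + 8.1^2 + 5.9^2 + 6.5^2 + 4.9^2 + 3.9^2 + 6.4^2 + 7.9^2 + 9.3^2 + 4.1^2 + 12.3^2 + 9.4^2 + 7.8^2 + 10.8^2 + 8.5^2\right] - \frac{(139.5)^2}{20}$$

$$SST = \left[27.04 + 39.69 + 24.01 + 10.24 + 46.24 + 54.76 + 65.61 + 34.81 + 42.25 + 24.01 + 15.21 + 40.96 + 62.41 + 86.49 + 16.81 + 151.29 + 88.36 + 60.84 + 116.64 + 72.25\right] - 973.0125$$

$$SST = 1079.92 - 973.0125$$

$$SST = 106.91$$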

$$SSB = \left[\frac{(26.4)^2}{5} + \frac{(32.8)^2}{5} + \frac{(31.5)^2}{5} + \frac{(48.8)^2}{5}\right] - \frac{(139.5)^2}{20}$$

$$SSB = 1029.30 - 973.0125$$

$$SSB = 56.29$$

$$SSW = SST - SSB$$

$$SSW = 106.91 - 56.29$$

$$SSW = 50.62$$

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 56.29 | 3 | MSB = 56.29 / 3 = 18.76 | F = 18.76 / 3.16 = 5.94 |
| Within Col. | 50.62 | 16 | MSW = 50.62 / 16 = 3.16 | |
| Total | 106.91 | 19 | | |

Step 5. Make a decision.


Since the computed F value of 5.94 is greater than the tabular F value of 3.24 with df (3, 16) at the
0.05 level of significance, Ho is rejected. There is a significant difference in reaction time among the
four smoke alarm types.

Example 2: Unequal Sample Size

The following are the growth measurements (cm) of a certain plant after the application of 4 different
concentrations of a certain chemical over a specified period of time.

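| Concentration 1 | Concentration 2 | Concentration 3 | Concentration 4 |
|---|---|---|---|
| 8.1 | 7.6 | 6.8 | 6.7 |
| 8.6 | 8.3 | 5.7 | 7.2 |
| 9.3 | 8.5 | 7.1 | 6.2 |
| 9.1 | 8.0 | 6.7 | 6.8 |
| | 7.9 | 7.3 | 7.0 |
| | | 6.0 | |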

Is there a significant difference in the average growth of these plants for the different concentrations
of the chemical? Use 0.05 level of significance.

Solutions

Step 1. Make assumptions.

- Samples are randomly selected

- Distribution is normal

Step 2. State the hypothesis.

Ho: µ1 = µ2 = µ3 = µ4

All population means are equal.

Ha: At least two of the means are not equal.

Step 3. Set a critical region.

Sampling distribution = F distribution.

Level of significance = 0.05 (right-tailed)

df numerator = k - 1 = 4 - 1 = 3

df denominator = N - k = 20 - 4 = 16

F tabular = 3.24

Step 4. Compute the test statistic.

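| | Concentration 1 | Concentration 2 | Concentration 3 | Concentration 4 | Total |
|---|---|---|---|---|---|
| | 8.1 | 7.6 | 6.8 | 6.7 | |
| | 8.6 | 8.3 | 5.7 | 7.2 | |
| | 9.3 | 8.5 | 7.1 | 6.2 | |
| | 9.1 | 8.0 | 6.7 | 6.8 | |
| | | 7.9 | 7.3 | 7.0 | |
| | | | 6.0 | | |
| Total | 35.1 | 40.3 | 39.6 | 33.9 | 148.9 |
| Sample size | n1 = 4 | n2 = 5 | n3 = 6 | n4 = 5 | N = 20 |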

$$SST = 8.1^2 + 8.6^2 + \dots + 6.8^2 + 7.0^2 - \frac{(148.9)^2}{20}$$

$$SST = 1127.91 - 1108.56$$

$$SST = 19.35$$

$$SSB = \frac{(35.1)^2}{4} + \frac{(40.3)^2}{5} + \frac{(39.6)^2}{6} + \frac{(33.9)^2}{5} - 1108.56$$

$$SSB = 1124.02 - 1108.56$$

$$SSB = 15.46$$

$$SSW = SST - SSB$$

$$SSW = 19.35 - 15.46$$

$$SSW = 3.89$$

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 15.46 | 3 | MSB = 15.46 / 3 = 5.15 | F = 5.15 / 0.24 = 21.46 |
| Within Col. | 3.89 | 16 | MSW = 3.89 / 16 = 0.24 | |
| Total | 19.35 | 19 | | |

Step 5. Make a Decision.

Since the computed F value of 21.46 is much greater than the tabular F value of 3.24, Ho is rejected.
There is a significant difference in growth among the 4 different concentrations.
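
As an independent check on Example 2, the sums of squares can also be computed from deviations about the means instead of the shortcut formulas used in the lesson. The sketch below takes that alternative route (it is not the method shown above) and should agree with the shortcut results up to rounding.

```python
from statistics import mean

# Growth (cm) from Example 2; the columns have unequal sizes (4, 5, 6, 5).
growth = [
    [8.1, 8.6, 9.3, 9.1],
    [7.6, 8.3, 8.5, 8.0, 7.9],
    [6.8, 5.7, 7.1, 6.7, 7.3, 6.0],
    [6.7, 7.2, 6.2, 6.8, 7.0],
]

values = [x for column in growth for x in column]
grand_mean = mean(values)

# Definitional (deviation) forms of the sums of squares.
ssb = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in growth)
ssw = sum((x - mean(c)) ** 2 for c in growth for x in c)
sst = sum((x - grand_mean) ** 2 for x in values)

k, n_total = len(growth), len(values)
f_value = (ssb / (k - 1)) / (ssw / (n_total - k))
print(f"SSB={ssb:.2f} SSW={ssw:.2f} SST={sst:.2f} F={f_value:.2f}")
```

Carried at full precision, F is about 21.21; the hand value of 21.46 comes from rounding MSW to 0.24 before dividing. Either way the value is far above the tabular 3.24.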

B. Two - Way Analysis of Variance

The following formulas are used:


$$SST = \sum x^2 - \frac{\left(\sum x\right)^2}{N}$$

Where:
SST = sum of squares total
x = individual values
N = total sample size

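$$SSR = \sum_{r} \frac{\left(\sum x_r\right)^2}{n_r} - \frac{\left(\sum x\right)^2}{N}$$

Where:
SSR = sum of squares for row means
∑x_r = sum of individual values in row r
n_r = number of values in row r

$$SSC = \sum_{c} \frac{\left(\sum x_c\right)^2}{n_c} - \frac{\left(\sum x\right)^2}{N}$$

Where:
SSC = sum of squares for column means
∑x_c = sum of individual values in column c
n_c = number of values in column c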

$$SSE = SST - SSR - SSC$$

Where:

SSE = sum of squares error
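
A minimal Python sketch of these four sums of squares follows. It assumes one observation per cell, so every row total is divided by the number of columns and every column total by the number of rows. The function name `two_way_sums_of_squares` and the 2 x 2 data are hypothetical.

```python
def two_way_sums_of_squares(table):
    """Return (SST, SSR, SSC, SSE) for a table given as a list of rows.

    Assumes one observation per cell and rows of equal length.
    """
    r, c = len(table), len(table[0])
    values = [x for row in table for x in row]
    correction = sum(values) ** 2 / (r * c)  # (sum of x)^2 / N

    sst = sum(x * x for x in values) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    sse = sst - ssr - ssc
    return sst, ssr, ssc, sse


# Hypothetical 2 x 2 table, used only to show the arithmetic.
sst, ssr, ssc, sse = two_way_sums_of_squares([[1, 2], [3, 5]])
print(sst, ssr, ssc, sse)
```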

Two-Way ANOVA Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Column Mean (treatment) | SSC | c - 1 | MSC = SSC / (c - 1) | F1 = MSC / MSE |
| Row Mean (blocks) | SSR | r - 1 | MSR = SSR / (r - 1) | F2 = MSR / MSE |
| Error | SSE | (r - 1)(c - 1) | MSE = SSE / [(r - 1)(c - 1)] | |
| Total | SST | rc - 1 | | |

Example:

In order to compare three word processors, A, B, and C, Monique was timed on preparing a certain
kind of report on each of the machines on four consecutive days. The results (in minutes) are shown
in the following table.

| Day | Machine A | Machine B | Machine C |
|---|---|---|---|
| 1 | 17 | 22 | 20 |
| 2 | 18 | 20 | 21 |
| 3 | 21 | 24 | 23 |
| 4 | 18 | 23 | 17 |

Test the following hypotheses at the 0.05 level of significance.

a) There is no significant difference in the output of the three machines.

b) The machines perform equally well on different days.

Solution:

Step 1. Make assumptions.

- Samples are randomly selected

- Distribution is normal

Step 2. State the hypothesis.

Ho (treatments): µA = µB = µC

The machines perform equally well.

Ho (blocks): β1 = β2 = β3 = β4

The machines perform equally well on the different days.

Ha: In each case, at least two of the means are not equal.

Step 3. Set a critical region.

Sampling distribution = F distribution; Two-Way ANOVA

Level of significance = 0.05

For treatment:

df numerator = c - 1 = 3 - 1 = 2

df denominator = (r -1) (c-1) = (4 – 1) ( 3 -1) = (3) (2) = 6

df for F1 = (2, 6)

F tabular = 5.14

For Blocks:

df numerator = r - 1 = 4 - 1 = 3
df denominator = (r -1) (c-1) = (4 - 1) (3 -1) = (3)(2) = 6

df for F2 = (3, 6)

F tabular = 4.76

Step 4. Compute the test statistic.

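| Day | Machine A | Machine B | Machine C | Total |
|---|---|---|---|---|
| 1 | 17 | 22 | 20 | 59 |
| 2 | 18 | 20 | 21 | 59 |
| 3 | 21 | 24 | 23 | 68 |
| 4 | 18 | 23 | 17 | 58 |
| Total | 74 | 89 | 81 | 244 |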
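
$$\begin{aligned}
SST &= 289 + 324 + 441 + 324 + 484 + 400 + 576 + 529 + 400 + 441 + 529 + 289 - \frac{(244)^2}{12} \\
&= 5026 - 4961.33 \\
&= 64.67
\end{aligned}$$

$$\begin{aligned}
SSC &= \frac{74^2 + 89^2 + 81^2}{4} - 4961.33 \\
&= \frac{5476 + 7921 + 6561}{4} - 4961.33 \\
&= \frac{19{,}958}{4} - 4961.33 \\
&= 4989.50 - 4961.33 \\
&= 28.17
\end{aligned}$$

$$\begin{aligned}
SSR &= \frac{59^2 + 59^2 + 68^2 + 58^2}{3} - 4961.33 \\
&= \frac{3481 + 3481 + 4624 + 3364}{3} - 4961.33 \\
&= \frac{14{,}950}{3} - 4961.33 \\
&= 4983.33 - 4961.33 \\
&= 22.00
\end{aligned}$$

$$SSE = 64.67 - 28.17 - 22.00 = 14.50$$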

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Column Mean | SSC = 28.17 | 3 - 1 = 2 | MSC = 28.17 / 2 = 14.08 | F1 = 14.08 / 2.42 = 5.82 |
| Row Mean | SSR = 22 | 4 - 1 = 3 | MSR = 22 / 3 = 7.33 | F2 = 7.33 / 2.42 = 3.03 |
| Error | SSE = 14.50 | (2)(3) = 6 | MSE = 14.50 / 6 = 2.42 | |
| Total | SST = 64.67 | 11 | | |

Step 5. Decision
For Treatments:
Since the computed F1 = 5.82 exceeds the tabular value of 5.14, Ho is rejected.
For Blocks:
Since the computed F2 = 3.03 does not exceed the tabular value of 4.76, Ho is not rejected.

In other words, we conclude that there is a significant difference among the three machines, and that
there is no significant difference in performance from day to day.
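
The two F-values for this example can be re-derived with a short Python sketch. The variable names are illustrative, and the tabular values are copied from Step 3 rather than computed.

```python
# Times (minutes) from the example: one row per day, columns are machines A, B, C.
times = [
    [17, 22, 20],
    [18, 20, 21],
    [21, 24, 23],
    [18, 23, 17],
]
F1_TABULAR, F2_TABULAR = 5.14, 4.76  # tabular values from Step 3

r, c = len(times), len(times[0])
grand_total = sum(map(sum, times))
correction = grand_total ** 2 / (r * c)  # (sum of x)^2 / N

sst = sum(x * x for row in times for x in row) - correction
ssc = sum(sum(col) ** 2 for col in zip(*times)) / r - correction
ssr = sum(sum(row) ** 2 for row in times) / c - correction
sse = sst - ssc - ssr

mse = sse / ((r - 1) * (c - 1))
f1 = (ssc / (c - 1)) / mse  # treatments (machines)
f2 = (ssr / (r - 1)) / mse  # blocks (days)

print(f"F1={f1:.2f}: {'reject' if f1 > F1_TABULAR else 'do not reject'} Ho")
print(f"F2={f2:.2f}: {'reject' if f2 > F2_TABULAR else 'do not reject'} Ho")
```

At full precision F1 is about 5.83 and F2 about 3.03, so the decisions match the hand computation.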

Exercise 14.

Solve the following problems:

1. Consider Ho: µ1 = µ2 = µ3 = µ4 =µ5 =µ6

A). Complete the following ANOVA table.

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 250 | | | |
| Within Col. | | | | |
| Total | 400 | 30 | | |

B.) Using 5% significance level, would you reject Ho?

2. Consider Ho: µ1 = µ2 = µ3 = µ4 = µ5

a). Complete the following ANOVA table.

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 128 | | | |
| Within Col. | 160 | | 16 | |
| Total | | | | |

b). Using 5% significance level, would you reject Ho?

3. Consider Ho: µ1 = µ2 = µ3; sample sizes are n1 = 26; n2 = 6

A). Complete the following ANOVA table.

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 100 | | | |
| Within Col. | | | | |
| Total | 260 | | | |

B) Using 1% significance level, would you reject Ho?

4. Consider Ho: µ1 = µ2 = µ3 = µ4; sample sizes are n1 = n2 = 5; n3 = 6; n4 = 8.

A). Complete the following ANOVA table.

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Computed F-value |
|---|---|---|---|---|
| Between Col. | 54 | | | |
| Within Col. | 240 | | | |
| Total | | | | |

B). Using 1% significance level, would you reject Ho?

5. Monique, owner of a large company, wanted to compare the mean daily output of a particular item
for four plants. For each plant, a random sample of 4 days gave the data listed in the following table.
Do the sample data indicate a difference in the population means for the four plants? Use a 5% level of
significance.

| A | B | C | D |
|---|---|---|---|
| 28 | 21 | 23 | 16 |
| 17 | 16 | 14 | 24 |
| 17 | 11 | 12 | 12 |
| 18 | 13 | 10 | 14 |

6. Solve the following using one-way ANOVA at the 5% level of significance.

A B C
1.9 2.3 2.8
2.3 2.7 2.8
2.8 3.2 2.9
2.4 2.8 3.5
2.5 2.9 3
2.5 2.9
