MAT 3 14th Week
PROBABILITY
ENGR. OSCAR H.
HALAMANI, JR.
Correlation & Regression Analysis
Pearson’s Product-Moment
Correlation (Pearson’s r)
Hypothesis Testing On Pearson’s
Product Moment Correlation
Correlation Analysis
a statistical method used to determine whether a
relationship between variables exists.
Regression Analysis
a statistical method used to describe the nature of the
relationship between variables, that is, whether it is
positive or negative, linear or nonlinear.
Positive Relationship
exists when both variables increase at the same time
or both decrease at the same time.
Negative Relationship
exists when one variable increases as the other variable
decreases, or vice versa.
Scatter Diagram
a useful tool for checking the assumption in a
regression analysis.
Pearson Product-Moment Correlation
is the most widely used measure in statistics of the
degree of relationship between linearly related
variables.
Correlation
refers to the departure of two random variables from
independence.
Correlation Coefficient
defined as the covariance of the two variables divided by
the product of their standard deviations.
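The definition above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own computation: the function name and the sample data are assumptions, and the sample (n - 1) divisor is chosen arbitrarily, since the divisor cancels in the ratio.

```python
from math import sqrt

def pearson_r_from_definition(xs, ys):
    """Correlation coefficient as covariance / (std dev of x * std dev of y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (divisor n - 1).
    # Using n instead would give the same ratio, because the divisor cancels.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov_xy / (sd_x * sd_y)

# Illustrative data (not from the lecture): y rises exactly in step with x,
# so the result should be very close to +1.
print(pearson_r_from_definition([1, 2, 3, 4], [2, 4, 6, 8]))
```

Reversing the order of the second list gives a value very close to -1, which matches the description of a negative relationship.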
Pearson Product-Moment Correlation Coefficient
(Pearson’s r)
is the measure of the strength of the linear
association between the two variables.
varies between +1 and -1
r                    Verbal Interpretation
0.00                 No Correlation
± 0.01 to ± 0.20     Slight Correlation
± 0.21 to ± 0.40     Low Correlation
± 0.41 to ± 0.60     Moderate Correlation
± 0.61 to ± 0.80     High Correlation
± 0.81 to ± 0.99     Very High Correlation
± 1.00               Perfect Correlation
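The interpretation table can be turned into a small lookup, sketched below. The function name `interpret_r` is hypothetical, and rounding r to two decimal places before the lookup is an assumption made here because the table is stated to two decimals.

```python
def interpret_r(r):
    """Return the verbal interpretation of r from the table above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    # Assumption: round to two decimals, matching the precision of the table.
    size = round(abs(r), 2)
    if size == 0.00:
        return "No Correlation"
    if size <= 0.20:
        return "Slight Correlation"
    if size <= 0.40:
        return "Low Correlation"
    if size <= 0.60:
        return "Moderate Correlation"
    if size <= 0.80:
        return "High Correlation"
    if size <= 0.99:
        return "Very High Correlation"
    return "Perfect Correlation"

print(interpret_r(-0.5))   # Moderate Correlation
```

The sign of r is ignored in the lookup because the table uses the same labels for positive and negative values; the direction of the relationship is read from the sign separately.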
Test of Significance
used to determine whether the relationship between the
variables is statistically significant.
$$ t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}} $$
Where:
t = t-test statistic for the correlation coefficient
r = correlation coefficient
N = number of paired samples
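As a rough sketch, the test statistic can be written as a short Python function. The function name and the values r = 0.5, N = 11 are illustrative assumptions, not figures from the lecture.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t statistic for a correlation coefficient r from n paired samples."""
    if n <= 2:
        raise ValueError("need more than two pairs so that df = n - 2 > 0")
    if abs(r) >= 1:
        raise ValueError("the formula is undefined when r is exactly +1 or -1")
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Illustrative values: 0.5 * sqrt(9) / sqrt(0.75)
print(round(t_for_correlation(0.5, 11), 2))   # 1.73
```

For a fixed r, the statistic grows with N, so the same correlation is more likely to be significant in a larger sample.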
Steps in Conducting Hypothesis Testing of
Pearson Product-Moment Correlation
1. State the null hypothesis (Ho) and the alternative
hypothesis (Ha).
2. Determine the level of significance (α), whether the test is
one-tailed or two-tailed, the degrees of freedom (df = N - 2),
and the critical value of t (refer to the given table).
3. Compute Pearson’s r and calculate the t-value
(tcomputed).
4. Make a statistical decision.
5. State the conclusion.
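The five steps above can be sketched end to end in Python. This is only an outline under stated assumptions: the function names and the small data set are hypothetical, r is computed with the usual raw-score formula, and the critical value is passed in by hand because it is read from a t table rather than computed.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the raw-score (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def correlation_t_test(xs, ys, t_critical):
    """Two-tailed test of Ho: no linear correlation, given a table critical value."""
    n = len(xs)
    r = pearson_r(xs, ys)                      # Step 3: compute r
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)     # Step 3: compute t (undefined if |r| = 1)
    reject = abs(t) > t_critical               # Step 4: statistical decision
    return {"r": r, "df": n - 2, "t": t, "reject_h0": reject}

# Illustrative data, not from the lecture. For a two-tailed test at
# alpha = 0.05 with df = 3, the table critical value is 3.182.
result = correlation_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], t_critical=3.182)
print(result)
```

Steps 1, 2, and 5 stay with the analyst: the hypotheses, α, and the critical value are chosen before the computation, and the conclusion is worded from `reject_h0`.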
Pearson Product-Moment Correlation
For a product-moment correlation, the null hypothesis
states that the population correlation coefficient is equal
to a hypothesized value (usually 0 indicating no linear
correlation), against the alternative hypothesis that it is
not equal (or less than, or greater than) the hypothesized
value.
Ho: r = 0
Ha: r ≠ 0 (or Ha: r > 0, or Ha: r < 0)
Note:
in these examples, the test is always conducted as a two-tailed test
Example
1. AUS rural bank is studying the relationship between the
mean account balance and the number of transactions per
month for individual accounts. A sample of eight accounts
revealed the data below. Find the coefficient of correlation.
Determine at α = 0.01 whether the correlation in the
population is greater than zero.
Customer            1    2    3    4    5    6    7    8
Mean Balance (P)   50   32   53   24   15   10   16   27
# Transactions      5    2    7    9   10    2    4   11
Step 1: State the hypotheses
Ho: r = 0
Ha: r ≠ 0
Step 2: Determine α, df, tailed-test and the critical value.
α = 0.01, df = 8 - 2 = 6, two-tailed test
Critical Value of t (tcritical) = ± 3.707
Step 3: Compute the Pearson’s r & the t-value
Given: N = 8       ΣX² = 8,219
       ΣX = 227    ΣY² = 400
       ΣY = 50     ΣXY = 1,432
Customer     X     Y      X²     Y²      XY
1           50     5   2,500     25     250
2           32     2   1,024      4      64
3           53     7   2,809     49     371
4           24     9     576     81     216
5           15    10     225    100     150
6           10     2     100      4      20
7           16     4     256     16      64
8           27    11     729    121     297
Total      227    50   8,219    400   1,432
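$$ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2]\,[N\sum Y^2 - (\sum Y)^2]}} $$
$$ r = \frac{8(1{,}432) - (227)(50)}{\sqrt{[8(8{,}219) - (227)^2]\,[8(400) - (50)^2]}} $$
r = 0.03 (Slight Positive Correlation)
$$ t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}} = \frac{0.03\sqrt{8-2}}{\sqrt{1-(0.03)^2}} $$
t = 0.07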
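The arithmetic for Example 1 can be checked with a short script that plugs the table totals into the two formulas. This is a verification sketch with assumed variable names. It rounds r to two decimals before computing t, which is how the worked solution reaches t = 0.07; carrying the unrounded r forward gives a slightly larger t of roughly 0.08, which does not change the decision.

```python
from math import sqrt

# Totals taken from the computation table for Example 1.
n = 8
sum_x, sum_y = 227, 50
sum_x2, sum_y2, sum_xy = 8219, 400, 1432

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

# Round r to two decimals first, as in the worked solution.
r_rounded = round(r, 2)
t = r_rounded * sqrt(n - 2) / sqrt(1 - r_rounded ** 2)

print(round(r, 2), round(t, 2))   # 0.03 0.07
```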
Step 4: Decision Rule
if |tcomputed| > tcritical ; Reject Ho
if |tcomputed| ≤ tcritical ; Do not reject Ho
Since: 0.07 < 3.707 ; Do not reject Ho
Step 5: Conclusion
There is not enough evidence to show a significant
relationship between the mean account balance for
individual accounts and the number of transactions per
month.
Example
2. SJS company has been selling to retail customers in the MM
area. They advertise extensively on radio, in print ads, and on
the internet. The owner would like to review the relationship
between the amount spent on advertising and sales.
Find the coefficient of correlation. Determine at α = 0.10
whether the correlation in the population is greater than zero.
Month           Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep
Expenses (KP)    10    14    12     9    13    15     8    13    16
Sales (KP)      180   170   190   220   235   208   215   175   250
Step 1: State the hypotheses
Ho: r = 0
Ha: r ≠ 0
Step 2: Determine α, df, tailed-test and the critical value
α = 0.10, df = 9 - 2 = 7, two-tailed test
Critical Value of t (tcritical) = ± 1.895
Step 3: Compute the Pearson’s r & the t-value.
Given: N = 9        ΣX² = 1,404
       ΣX = 110     ΣY² = 383,639
       ΣY = 1,843   ΣXY = 22,610
Month      X       Y      X²        Y²       XY
Jan       10     180     100    32,400    1,800
Feb       14     170     196    28,900    2,380
Mar       12     190     144    36,100    2,280
Apr        9     220      81    48,400    1,980
May       13     235     169    55,225    3,055
Jun       15     208     225    43,264    3,120
Jul        8     215      64    46,225    1,720
Aug       13     175     169    30,625    2,275
Sep       16     250     256    62,500    4,000
Total    110   1,843   1,404   383,639   22,610
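$$ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2]\,[N\sum Y^2 - (\sum Y)^2]}} $$
$$ r = \frac{9(22{,}610) - (110)(1{,}843)}{\sqrt{[9(1{,}404) - (110)^2]\,[9(383{,}639) - (1{,}843)^2]}} $$
r = 0.14 (Slight Positive Correlation)
$$ t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}} = \frac{0.14\sqrt{9-2}}{\sqrt{1-(0.14)^2}} $$
t = 0.37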
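For Example 2, the computation table itself can be rebuilt from the raw monthly figures. The sketch below uses assumed variable names: it derives the X², Y², and XY columns, totals them, and then applies the same two formulas.

```python
from math import sqrt

# Raw data from Example 2 (Jan through Sep).
expenses = [10, 14, 12, 9, 13, 15, 8, 13, 16]
sales = [180, 170, 190, 220, 235, 208, 215, 175, 250]

n = len(expenses)
x2 = [x * x for x in expenses]                    # X² column
y2 = [y * y for y in sales]                       # Y² column
xy = [x * y for x, y in zip(expenses, sales)]     # XY column

totals = {
    "sum_x": sum(expenses),
    "sum_y": sum(sales),
    "sum_x2": sum(x2),
    "sum_y2": sum(y2),
    "sum_xy": sum(xy),
}

numerator = n * totals["sum_xy"] - totals["sum_x"] * totals["sum_y"]
denominator = sqrt(
    (n * totals["sum_x2"] - totals["sum_x"] ** 2)
    * (n * totals["sum_y2"] - totals["sum_y"] ** 2)
)
r = numerator / denominator
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)

print(totals)
print(round(r, 2), round(t, 2))   # 0.14 0.37
```

Building the columns in code is a convenient cross-check on a hand-computed table, since a single mistyped entry shows up as a mismatch in the totals.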
Step 4: Decision Rule
if |tcomputed| > tcritical ; Reject Ho
if |tcomputed| ≤ tcritical ; Do not reject Ho
Since: 0.37 < 1.895 ; Do not reject Ho
Step 5: Conclusion
There is not enough evidence to show a significant
relationship between the advertising expenses and the
sales per month.
Example
3. The owner of a chain of fruit shake stores would like
to study the correlation between atmospheric temperature
and sales during the summer season. A random sample of 12
days is selected, with the results given as follows:
Day             1     2     3     4     5     6     7     8     9    10    11    12
Temp (°F)      79    76    78    84    90    83    93    94    97    85    88    82
Total Sales   147   143   147   168   206   155   192   211   209   187   200   150
Regression Analysis