D1UA401B Research Methodology-UNIT-4 Pazhanisamy-BBA IV Semester Section19
Understand . . .
How to test regression models for linearity and whether the equation is effective in fitting the data.
Nonparametric measures of association and the alternatives they offer when key assumptions and requirements for parametric techniques cannot be met.
UNIVARIATE ANALYSIS
Pearson correlation coefficient: for continuous, linearly related variables
Correlation ratio (eta): for nonlinear data, or for relating a main effect to a continuous dependent variable
Inferential Statistics
Descriptive Statistics
As Abacus states in this ad, when researchers "sift through the chaos" and "find what matters," they experience the "ah-ha!" moment.
Classical statistics:
Objective view of probability
Established hypothesis is rejected or fails to be rejected
Analysis based on sample data

Bayesian statistics:
Extension of the classical approach
Analysis based on sample data
Also considers established subjective probability estimates
Null
H0: µ = 50 mpg
H0: µ ≤ 50 mpg
H0: µ ≥ 50 mpg
Alternate
HA: µ ≠ 50 mpg
HA: µ > 50 mpg
HA: µ < 50 mpg
True value of parameter
Alpha level selected
One- or two-tailed test used
Sample standard deviation
Sample size
Stages:
1. State null hypothesis
2. Choose statistical test
3. Select level of significance
4. Compute difference value
5. Obtain critical test value
6. Interpret the test
Parametric Nonparametric
Independent observations
Normal distribution
Equal variances
Interval or ratio scales
Easy to understand and use
Usable with nominal data
Appropriate for ordinal data
Appropriate for non-normal population distributions
How many samples are involved?
k-Sample Tests
Two-Sample Tests
Is there a difference between observed
frequencies and the frequencies we would
expect?
Is there a difference between observed and
expected proportions?
Is there a significant difference between some
measures of central tendency and the
population parameter?
Z-test t-test
Null: Ho: µ = 50 mpg
Statistical test: t-test
Significance level: .05, n = 100
Calculated value: 1.786
Critical test value: 1.66 (from Appendix C, Exhibit C-2)
The Z-test and the chi-square test are statistical tests used to compare groups or test hypotheses. A Z-test is used when the sample size is large and the population standard deviation is known; it tests hypotheses about the mean of a normal population and can compare two groups by comparing their population proportions.

The chi-square test is used to test hypotheses about the distribution of a categorical variable, including when the sample size is small. It can compare population proportions between two or more groups, or compare one group to a fixed value. A chi-square test for the equality of two proportions is exactly equivalent to a z-test.
Living Arrangement                 Intend to Join   Number Interviewed   Percent (no. interviewed/200)   Expected Frequencies (percent x 60)
Dorm/fraternity                    16               90                   45                              27
Apartment/rooming house, nearby    13               40                   20                              12
Apartment/rooming house, distant   16               40                   20                              12
Live at home                       15               30                   15                              9
Total                              60               200                  100                             60
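The chi-square goodness-of-fit statistic for this table can be computed directly from the observed "intend to join" counts and the expected frequencies. A sketch using scipy:

```python
from scipy.stats import chisquare, chi2

# Observed counts and expected frequencies from the table above
observed = [16, 13, 16, 15]
expected = [27, 12, 12, 9]

stat, p = chisquare(observed, f_exp=expected)
print(round(stat, 2))      # 9.9

# Critical value at alpha = .05 with k - 1 = 3 degrees of freedom
critical = chi2.ppf(0.95, df=3)
print(round(critical, 3))  # 7.815
print(stat > critical)     # True -> reject Ho: O = E
```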
Null Ho: O = E (observed frequencies equal the expected frequencies)
There was no significant relationship between handedness
and nationality, Χ2 (1, N = 428) = 0.44, p = .505.
Use a Z test when you need to compare group means. Use the one-sample analysis to determine whether a population mean is different from a hypothesized value, or use the two-sample version to determine whether two population means differ.
A Z test is a form of inferential statistics: it uses samples to draw conclusions about populations.
For example:
One sample: Do employees in a training program have an average IQ score different from a hypothesized value of 100?
Two sample: Do two IQ-boosting programs produce different mean scores for employees?
(A Z test requires the population standard deviation to be known.)
This analysis uses sample data to evaluate hypotheses that refer to
population means (µ). The hypotheses depend on whether you’re
assessing one or two samples.
One-Sample Z Test Hypotheses
Null hypothesis (H0): The population mean equals a hypothesized
value (µ = µ0).
Alternative hypothesis (HA): The population mean does not
equal a hypothesized value (µ ≠ µ0).
When the p-value is less than or equal to your significance level
(e.g., 0.05), reject the null hypothesis. Your sample data support
the notion that the population mean does not equal the hypothesized
value.
Two-Sample Z Test Hypotheses
Null hypothesis (H0): Two population means are
equal (µ1 = µ2).
Alternative hypothesis (HA): Two population means
are not equal (µ1 ≠ µ2).
Again, when the p-value is less than or equal to
your significance level, reject the null hypothesis.
The difference between the two means is
statistically significant. Your sample data support
the idea that the two population means are
different.
Example, one-sample t-test (population standard deviation unknown; the sample standard deviation is used):
A farming company wants to know whether a new fertilizer has improved crop yield. Historic data show the average yield of the farm is 20 tonnes per acre. The company tests a new organic fertilizer on a smaller sample of farms and observes a mean yield of 20.175 tonnes per acre with a standard deviation of 3.02 tonnes across 12 different farms.
Did the new fertilizer work?
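The fertilizer example can be worked through directly from the summary statistics. A sketch using scipy:

```python
import math
from scipy import stats

# One-sample t-test for the fertilizer example
mu0 = 20        # historic mean yield (tonnes per acre)
xbar = 20.175   # sample mean yield
s = 3.02        # sample standard deviation
n = 12          # number of farms

t_stat = (xbar - mu0) / (s / math.sqrt(n))
print(round(t_stat, 3))    # 0.201

# One-tailed critical value at alpha = .05, df = 11
critical = stats.t.ppf(0.95, df=n - 1)
print(round(critical, 3))  # 1.796

# t_stat < critical -> fail to reject Ho: no evidence of improvement
print(t_stat > critical)   # False
```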
Suppose the IQ levels among individuals in two different cities are
known to be normally distributed each with population standard
deviations of 15.
A scientist wants to know whether the mean IQ levels of individuals in city A and city B differ, so she selects a simple random sample of 20 individuals from each city and records their IQ levels.
To test this, she performs a two-sample z-test at significance level α = 0.05 using the following values:
x̄1 (sample 1 mean IQ) = 100.65
n1 (sample 1 size) = 20
x̄2 (sample 2 mean IQ) = 108.8
n2 (sample 2 size) = 20
(Since the p-value (0.0858) is not less than the significance level (.05), the scientist fails to reject the null hypothesis.)
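The z statistic and p-value for this example follow from the standard two-sample formula with known population standard deviations. A sketch using scipy:

```python
import math
from scipy.stats import norm

# Two-sample z-test for the city IQ example
x1, n1 = 100.65, 20
x2, n2 = 108.8, 20
sigma = 15  # known population standard deviation in both cities

se = math.sqrt(sigma**2 / n1 + sigma**2 / n2)
z = (x1 - x2) / se
p_value = 2 * norm.cdf(-abs(z))   # two-tailed p-value

print(round(z, 3))        # -1.718
print(round(p_value, 4))  # 0.0858
print(p_value < 0.05)     # False -> fail to reject Ho
```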
T-Test: Formula and solved examples (collegedunia.com)
Null: Ho: A sales = B sales
Statistical test: t-test
Significance level: .05 (one-tailed)
Calculated value: 1.97, d.f. = 20
Critical test value: 1.725 (from Appendix C, Exhibit C-2)
The F test is generally used to compare two variances. (In this example, Ho cannot be rejected.)
The purpose of a one-way ANOVA test is to
determine the existence of a statistically significant
difference among several group means.
From the output table we see that the F test statistic is 2.358 and
the corresponding p-value is 0.11385.
Since this p-value is not less than 0.05, we fail to reject the null
hypothesis.
This means we don't have sufficient evidence to say that there is
a statistically significant difference between the mean exam
scores of the three groups.
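The slide's exam-score data are not reproduced here, so a minimal one-way ANOVA sketch with three small hypothetical groups shows the procedure:

```python
from scipy.stats import f_oneway

# Hypothetical exam scores for three groups (assumed for illustration)
group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [3, 4, 5]

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(round(f_stat, 2))   # 3.0
print(round(p_value, 3))  # 0.125
print(p_value < 0.05)     # False -> fail to reject Ho
```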
SSE = 21.4 + 10 + 5.4 + 10.6 = 47.4
Suppose we have the following dataset with one response
variable y and two predictor variables X1 and X2:
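The dataset itself is not reproduced on the slide, so a minimal multiple-regression sketch with hypothetical values (generated from y = 2 + 3·X1 + X2) shows how the coefficients are estimated by ordinary least squares:

```python
import numpy as np

# Hypothetical predictors and response (assumed for illustration)
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([0.0, 1.0, 0.0, 1.0])
y = 2 + 3 * x1 + x2          # [5, 9, 11, 15]

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares: solve for b0, b1, b2
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 6))     # [2. 3. 1.]
```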
Phi: chi-square-based measure for 2x2 tables
Is there a relationship between X and Y?
X causes Y
Y causes X
X and Y are activated by one or more other variables
X and Y influence each other reciprocally
A coefficient is not remarkable simply
because it is statistically significant!
It must be practically meaningful.
X: Average Temperature (Celsius)   Y: Price per Case (FF)
12                                 2,000
16                                 3,000
20                                 4,000
24                                 5,000
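A least-squares line can be fitted to the temperature/price data above; because the points fall exactly on a line, the fit recovers the slope and intercept exactly:

```python
import numpy as np

# Temperature (X) and price per case (Y) from the table above
x = np.array([12.0, 16.0, 20.0, 24.0])
y = np.array([2000.0, 3000.0, 4000.0, 5000.0])

slope, intercept = np.polyfit(x, y, 1)
print(round(slope, 6), round(intercept, 6))  # 250.0 -1000.0

# Simple prediction for a new temperature, e.g. 18 degrees
pred = slope * 18 + intercept
print(round(pred, 2))  # 3500.0
```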
Y is completely unrelated to X
and no systematic pattern is evident
Total proportion of variance in Y explained by X
Desired r2: 80% or more
Artifact correlations
Bivariate correlation analysis
Bivariate normal distribution
Chi-square-based measures
Contingency coefficient C
Cramer's V
Phi
Coefficient of determination (r2)
Concordant
Correlation matrix
Discordant
Error term
Goodness of fit
lambda
• Linearity
• Method of least squares
• Ordinal measures
• Gamma
• Somers's d
• Proportional reduction in error (PRE)
• Spearman's rho
• Regression analysis
• tau b
• Regression coefficients
• tau c
• Pearson correlation coefficient
• Prediction and confidence bands
• Intercept
• Slope
• Residual
• Scatterplot
• Simple prediction
• tau
Dependency Interdependency
Multiple Regression
Discriminant Analysis
MANOVA
Structural Equation Modeling (SEM)
Conjoint Analysis
Develop self-weighting estimating equation to predict values for a DV
Control for confounding variables
Test and explain causal theories
Forward
Backward
Stepwise
Collinearity Statistics (VIF): 1.000, 2.289, 2.289, 2.748, 3.025, 3.067
Remedies for collinearity:
Choose one of the variables and delete the other, or
Create a new variable that is a composite of the others
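The VIF for a predictor is computed by regressing it on the remaining predictors and taking 1 / (1 − R²). A sketch with two hypothetical, nearly collinear predictors (the slide's data are not reproduced):

```python
import numpy as np

# Hypothetical predictors, chosen to be nearly collinear
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 2.0, 3.0, 5.0])

# Regress x2 on x1 (with intercept) and compute R^2
X = np.column_stack([np.ones_like(x1), x1])
coef, *_ = np.linalg.lstsq(X, x2, rcond=None)
resid = x2 - X @ coef
ss_res = np.sum(resid**2)
ss_tot = np.sum((x2 - x2.mean())**2)
r_squared = 1 - ss_res / ss_tot

vif = 1 / (1 - r_squared)
print(round(vif, 2))  # 29.17 -> far above the usual cutoff of 10
```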
A. Predicted Success
Actual Group       Number of Cases   0             1
Unsuccessful (0)   15                13 (86.70%)   2 (13.30%)
Successful (1)     15                3 (20.00%)    12 (80.00%)
Note: Percent of "grouped" cases correctly classified: 83.33%

B.         Unstandardized   Standardized
X1         .36084           .65927
X2         2.61192          .57958
X3         .53028           .97505
Constant   12.89685
Factor Analysis
Cluster Analysis
Multidimensional Scaling
If X is sales and Y is profit in the business, then for any value of X (sales), Y (profit) can be estimated.
19-104
Select sample to cluster
Define variables
Compute similarities
Select mutually exclusive clusters
Compare and validate clusters
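The clustering steps above can be sketched with scipy's hierarchical clustering, using the average linkage method mentioned in the key terms below; the sample points are hypothetical:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical sample: two well-separated groups of points
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])

# Compute similarities (distances) and build clusters with
# the average linkage method
Z = linkage(X, method="average")

# Cut the tree into two mutually exclusive clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 2 2]: the first two points group together
```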
• Average linkage method
• Backward elimination
• Beta weights
• Centroid
• Cluster analysis
• Collinearity
• Communality
• Confirmatory factor analysis
• Conjoint analysis
• Dependency techniques
• Discriminant analysis
• Dummy variable
• Eigenvalue
• Factor analysis
• Factors
• Forward selection
• Holdout sample
• Interdependency techniques
• Loadings
• Metric measures
• Multicollinearity
• Multidimensional scaling (MDS)
• Multiple regression
• Multivariate analysis
• Multivariate analysis of variance (MANOVA)
• Nonmetric measures
• Path analysis
• Path diagram
• Principal components analysis
• Rotation
• Specification error
• Standardized coefficients
• Stepwise selection
• Stress index
• Structural equation modeling
• Utility score