INTRODUCTION ........................................ 2
CHAPTER ONE ......................................... 8
CHAPTER THREE ...................................... 37
ADVANCED DIAGNOSTICS FOR MULTIPLE REGRESSION ANALYSIS ... 68
CHAPTER FIVE ....................................... 71
CHAPTER SIX ........................................ 91
CHAPTER SEVEN ..................................... 102
CHAPTER EIGHT ..................................... 118
CHAPTER NINE ...................................... 129
CHAPTER TEN ....................................... 144
CHAPTER ELEVEN .................................... 154
CHAPTER TWELVE .................................... 169
SAMPLE MULTIPLE CHOICE QUESTIONS .................. 177
INTRODUCTION
This manual has been designed to provide teachers using Multivariate Data Analysis, 6th edition, with supplementary teaching aids. The course suggestions made here are the result of years of experience teaching the basic content of this text in several universities. Obviously, the contents may be modified to suit the level of the students and the length of the term. Multivariate data analysis is an interesting and challenging subject to teach. As an instructor, your objective is to direct your students' energies and interests so that they can learn the concepts and principles underlying the various techniques. You will also want to help your students learn to apply the techniques. Through years of teaching multivariate analysis, we have learned that the most effective approach to teaching the techniques is to provide the students with real-world data and have them manipulate the variables using several different programs and techniques. The text is designed to facilitate this approach, making available several data sets for analysis. Moreover, accompanying sample output and control cards are provided to supplement the analyses discussed in the text.
A third major change to the text is a substantial expansion in coverage of structural equation modeling. We now have three chapters on this increasingly important technique. Chapter 10 provides an overview of structural equation modeling, Chapter 11 focuses on confirmatory factor analysis, and Chapter 12 covers issues in estimating structural
models. These three chapters provide a comprehensive introduction to this technique. Each chapter has been revised to incorporate advances in technology, and several chapters have undergone more extensive change. Chapter 2, Examining the Data, has an expanded section on missing data assessment, including a flowchart depicting the series of decisions involved in identifying and then accommodating missing data. Chapter 5, Multiple Discriminant Analysis and Logistic Regression, provides complete coverage of the analysis of categorical dependent variables, including both discriminant analysis and logistic regression; an expanded discussion of logistic regression includes an illustrative example using the HBAT database. Chapter 7, Conjoint Analysis, has a revised examination of research design issues that focuses on the development of the conjoint stimuli in a concise and straightforward manner. An important development is the continuation of the Web site Great Ideas in Teaching Multivariate Statistics at www.mvstats.com, which can also be accessed as the Companion Web site at www.prenhall.com/hair. This Web site acts as a resource center for the textbook as well as for everyone interested in multivariate analysis, providing links to resources for each technique as well as a forum for identifying new topics or statistical methods. In this way we can provide more timely feedback to researchers than if they were to wait for a new edition of the book. We also plan for the Web site to be a clearinghouse for materials on teaching multivariate statistics, providing exercises, datasets, and project ideas.
these otherwise complex statistical tools. At the end of each chapter, we review the Learning Objectives to provide the student with an overview of what has been covered in the chapter in relation to those major concepts or issues defined in the Learning Objectives. Finally, a series of questions is provided to stimulate the student to evaluate what has been read and to translate this material into a workable knowledge base for use in future applications.
among techniques as well as the techniques themselves. The HBAT dataset has three forms utilized throughout the text:
HBAT: the primary database described in the text, which has multiple metric and nonmetric variables allowing for use in most of the multivariate techniques.
HBAT200: an expanded dataset, comparable to HBAT except for 200 rather than 100 respondents, that allows for multiple independent variables in MANOVA while still maintaining adequate sample size in the individual cells.
HBAT_MISSING: a reduced dataset with 70 respondents and missing data among the variables. It is utilized to illustrate the techniques for diagnosis and remedy of missing data described in Chapter 2.
In addition to these datasets, there are several others used with specific techniques, including conjoint analysis, multidimensional scaling and structural equation modeling. These datasets include:
HBAT_CPLAN and HBAT_CONJOINT: the datafiles needed to perform the full profile conjoint analysis available in SPSS.
HBAT_MDS and HBAT_CORRESP: the datafiles used in performing the multidimensional scaling and correspondence analyses described in the text.
HBAT_SEM: the original data responses from 400 individuals which are the basis for the structural equation analyses of Chapters 10, 11 and 12.
HBAT_COV: the covariance matrix derived from HBAT_SEM that is used as input to structural equation programs such as LISREL, EQS or AMOS.
Finally, two additional datasets are provided to allow students access to data other than the HBAT datafiles described in the textbook:
HATCO: this dataset has been utilized in past editions of the textbook and provides a simplified set of variables amenable to all of the basic multivariate techniques.
SALES: this dataset concerns sales training and comprises 80 respondents, representing a portion of data collected by academic researchers. A copy of the sales training questionnaire is also provided.
Given the widespread interchangeability of data formats among statistical programs, all of the datasets are provided in two formats. The first is the .SAV format used in SPSS, which allows for documentation of variable descriptions in a standard format. All of the datasets are also contained in an Excel worksheet, amenable to input to any statistical program. Between these two formats the student or faculty member should be able to employ the data with any available statistical software program.
Control Card Files: To assist the instructor in performing the analyses illustrated in the text, control card files containing program syntax commands are provided for the statistical software programs SPSS (Statistical Package for the Social Sciences, SPSS Inc.) and LISREL (Scientific Software, Inc.).
Computer Outputs: If computer access is not available for a particular technique, files of the actual outputs (from SPSS for Chapters 2-9 and LISREL for Chapters 10-12) for each analysis are also provided. This provides faculty and students with the complete computer outputs even without actually running the programs.
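For instructors or students working in Python rather than SPSS, either distributed format can be read directly; the following is a minimal sketch, assuming files named HBAT.sav and HBAT.xls in the working directory (the exact filenames on the distribution media may differ) and the optional pyreadstat dependency for reading SPSS files.

    import pandas as pd

    # Read the SPSS version (requires the pyreadstat package); value labels
    # defined in the .SAV file are converted to categorical values by default.
    hbat_spss = pd.read_spss("HBAT.sav")       # hypothetical filename

    # Alternatively, read the Excel worksheet version of the same data.
    hbat_excel = pd.read_excel("HBAT.xls")     # hypothetical filename

    # Quick check that both formats yield the same 100 observations.
    print(hbat_spss.shape, hbat_excel.shape)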
Acknowledgements
The authors wish to express their thanks to the following colleagues for their assistance in preparation of the current and previous versions of the Instructor's Manual and other course supplements: Rick Andrews, Scott Roach, Barbara Ross, Sujay Dutta, Bruce Alford, Neil Stern, Laura Williams, Jill Attaway, Randy Russ, Alan Bush, Sandeep Bhowmick and Ron Roullier.
3. There are primary and secondary uses of each of the multivariate techniques, as well as cases in which they are not applicable. Given that you will be introduced to each one in the text chapters, perhaps it would be helpful to show the "fit" of each to the four abilities just enumerated. Figure 1 at the end of this section is a reference in this respect. In the Figure, "P" indicates a primary ability of the technique, while "S" represents a secondary ability. The "NA" means that the technique is not applicable.
4. While multivariate analyses are descended from univariate techniques, the extension to multiple variables or variates introduces a number of issues which must be understood before examining each technique in detail.
The Role of the Variate: Multivariate techniques differ from univariate techniques in that they employ a variate.
o Definition: a linear combination of variables with empirically determined weights.
o Variables are specified by the researcher, and the weights are assigned by the technique.
o Single value: when summed, the variate produces a single value that represents the entire set of variables. This single value, in addition to the specific contribution of each variable to the variate, is important in all multivariate techniques.
Specification of measurement scales: Each multivariate technique assumes the use of a certain type of data. For this reason, the type of data held by the researcher is instrumental in the selection of the appropriate multivariate technique. The researcher must make sure that he or she has the appropriate type of data to employ the chosen multivariate technique.
o Metric vs. nonmetric: Data, in the form of measurement scales, can be metric or nonmetric. Nonmetric data are categorical variables that describe differences in type or kind by indicating the presence or absence of a characteristic or property. Metric data are continuous measures that reflect differences in amount or degree in a relative quantity or distance.
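A brief numeric sketch of the variate defined above may help in class; the weights and variable values below are invented purely for illustration.

    import numpy as np

    # Hypothetical observation measured on three variables (X1, X2, X3)
    x = np.array([4.0, 7.0, 2.5])

    # Weights as they might be empirically determined by a multivariate
    # technique (e.g., regression coefficients or discriminant weights)
    w = np.array([0.6, 0.3, -0.2])

    # The variate collapses the set of variables into a single value
    variate_value = np.dot(w, x)   # 0.6*4.0 + 0.3*7.0 - 0.2*2.5 = 4.0
    print(variate_value)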
Identification of measurement error o Definition: the degree to which the observed values in the sample are not representative of the true values in the population. o Sources: several sources are possible, including data entry errors and the use of inappropriate gradations in a measurement scale. o All variables in a multivariate analysis have some degree of measurement error, which adds residual error to the observed variables. o Need for assessment of both the validity and the reliability of measures. Validity is the degree to which a measure accurately represents what it is supposed to. Reliability is the degree to which the observed variable measures the true value and is error-free.
Statistical significance and statistical power. Almost all of the multivariate techniques discussed in this text are based on statistical inference of a population's values from a randomly drawn sample of the population. o Specifying statistical error levels. Interpretation of statistical inferences requires the specification of acceptable levels of error. Type I error (alpha) is the probability of rejecting the null hypothesis when it is actually true. Type II error (beta) is the probability of failing to reject the null hypothesis when it is actually false. Alpha and beta are inversely related. o Power: the probability (1-beta) of correctly rejecting the null hypothesis when it should be rejected. Specifying Type II error (beta) also specifies the power of the statistical inference test. Power is determined by three factors: its inverse relationship with alpha, the effect size, and the sample size.
When planning research, the researcher should: estimate the expected effect size, and then select the sample size and the alpha level to achieve the desired power level. Upon completion of the analyses, the actual power should be calculated in order to properly interpret the results.
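As an illustration of this planning step, the sketch below uses the statsmodels power routines to find the sample size per group for a two-group comparison and then computes achieved power after the fact; the effect size, alpha, and power values are arbitrary choices for the example, not recommendations from the text.

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Solve for the per-group sample size required to detect a medium
    # effect (d = 0.5) at alpha = .05 with power = .80.
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
    print(round(n_per_group))   # roughly 64 observations per group

    # After the study, the achieved power can be computed from the actual
    # sample size and the observed effect size.
    achieved = analysis.power(effect_size=0.4, nobs1=80, alpha=0.05)
    print(round(achieved, 2))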
set." It is traditional to envision a data set as being comprised of rows and columns. The rows pertain to each observation, such as each person or each completed questionnaire in a large survey. The columns, on the other hand, pertain to each variable, such as a response to a question or an observed characteristic for each person. Data sets can be immense; a single study may have a sample size of 1,000 or more respondents, each of whom answers 100 questions. Hence the data set would be 1,000 by 100, or 100,000 cells of data. Obviously, the need for parsimony is very evident. Multivariate techniques aid the researcher in obtaining parsimony by reducing the number of computations necessary to complete statistical tests. For example, when completing univariate tests, if an average were computed for each variable, 100 means would result, and if a correlation were computed between each variable and every other variable, there would be close to 5,000 separate values computed. In sharp contrast, multivariate analysis would require many less computations. For example, factor analysis might result in ten factors. Cluster analysis might yield five clusters. Multiple regression could identify six significant predictor variables. MANOVA might reveal 12 cases of significant differences. Multiple discriminant analysis perhaps would find seven significant variables. It should be evident that parsimony can be achieved by using multivariate techniques when analyzing most data sets.
Step 3: Evaluate the Assumptions Underlying the Multivariate Technique
Once the data are collected, the researcher must evaluate the underlying assumptions of the chosen multivariate technique. All techniques have conceptual and statistical assumptions which must be met before the analysis can proceed.
Step 4: Estimate the Multivariate Model and Assess Overall Model Fit
Following a testing of the assumptions, a multivariate model is estimated with the objective of meeting specific characteristics. After the model is estimated, the overall model fit is compared to specified criteria. The model may be respecified for better fit if necessary.
Step 5: Interpret the Variate
Once an acceptable model is achieved, the nature of the multivariate relationship is investigated. Coefficients of the variables are examined. Interpretation may lead to model respecification. The objective is to identify findings which can be generalized to the population.
Step 6: Apply the Diagnostics to the Results
Finally, the researcher must assess whether the results are unduly affected by a single observation or a small set of observations and determine the degree of generalizability of the results by validation methods. These two actions provide the researcher with support for the research findings.
Know Your Data: Multivariate analyses require rigorous examination of data. Diagnostic measures are available to evaluate the nature of sets of multivariate variables.
Strive for Model Parsimony: The researcher should evaluate those variables chosen for inclusion in the analysis. The objective is to create a parsimonious model which includes all relevant variables and excludes all irrelevant variables. Specification error (omission of relevant variables) and high multicollinearity (inclusion of irrelevant variables) can substantially impact analysis results.
Look at Your Errors: Often, the first model estimation does not provide the best model fit. Thus, the researcher should analyze the prediction errors and determine potential changes to the model. Errors serve as diagnostics for achieving better model fit.
Validate Your Results: The researcher should always validate the results. Validation procedures ensure the analyst that the results are not merely specific to the sample, but are representative and generalizable to the population.
FIGURE 1  MULTIVARIATE TECHNIQUES AND THEIR ABILITIES

Technique                        Describe   Explain   Predict   Control
Multiple Regression                 S          S         P        NA
Multiple Discriminant               S          S         P        NA
MANOVA                              NA         S         S        P
Canonical Correlation               P          S         S        NA
Factor Analysis                     P          S         NA       NA
Cluster Analysis                    P          S         NA       NA
Multidimensional Scaling            S          P         S        NA
Conjoint Analysis                   S          P         S        NA
Structural Equation Modeling        S          P         P        NA

Legend: P = Primary ability, S = Secondary ability, NA = Not applicable
(3)
LIST AND DESCRIBE THE MULTIVARIATE DATA ANALYSIS TECHNIQUES DESCRIBED IN THIS CHAPTER. CITE EXAMPLES FOR WHICH EACH TECHNIQUE IS APPROPRIATE.
Answer
a. DEPENDENCE TECHNIQUES - variables are divided into dependent and independent.
(1) Multiple Regression (MR) - the objective of MR is to predict changes in a single metric dependent variable in response to changes in several metric independent variables. A related technique is multiple correlation.
(2) Multiple Discriminant Analysis (MDA) - the objective of MDA is to predict group membership for a single nonmetric dependent variable using several metric independent variables.
(3) Multivariate Analysis of Variance (MANOVA) - simultaneously analyzes the relationship of two or more metric dependent variables and several nonmetric independent variables. A related procedure is multivariate analysis of covariance (MANCOVA), which can be used to control factors other than the included independent variables.
(4) Canonical Correlation Analysis (CCA) - simultaneously correlates several metric dependent variables and several metric independent variables. Note that this procedure can be considered an extension of MR, where there is only one metric dependent variable.
(5) Conjoint Analysis - used to transform nonmetric scale responses into metric form. It is concerned with the joint effect of two or more nonmetric independent variables on the ordering of a single dependent variable.
(6) Structural Equation Modeling - simultaneously analyzes several dependence relationships (e.g., several regression equations) while also having the ability to account for measurement error in the process of estimating coefficients for each independent variable.
b. INTERDEPENDENCE TECHNIQUES - all variables are analyzed simultaneously, with none being designated as either dependent or independent.
(1) Factor Analysis (FA) - used to analyze the interrelationships among a large number of variables and then explain these variables in terms of their common, underlying dimensions. The two major approaches are component analysis and common factor analysis.
(2) Cluster Analysis - used to classify a sample into several mutually exclusive groups based on similarities and differences among the sample components.
(3) Multidimensional Scaling (MDS) - a technique used to transform similarity scaling into distances in a multidimensional space.
(4)
EXPLAIN WHY AND HOW THE VARIOUS MULTIVARIATE METHODS CAN BE VIEWED AS A FAMILY OF TECHNIQUES. Answer The multivariate techniques can be viewed as a "family" of techniques in that they are all based upon constructing composite linear relationships among variables or sets of variables. The family members complement one another by accommodating unique combinations of input and output requirements so that an exhaustive array of capabilities can be brought to bear on complex problems.
(5)
WHY IS KNOWLEDGE OF MEASUREMENT SCALES IMPORTANT TO AN UNDERSTANDING OF MULTIVARIATE DATA ANALYSIS? Answer Knowledge and understanding of measurement scales is a must before the proper multivariate technique can be chosen. Inadequate understanding of the type of data to be used can cause the selection of an improper technique, which makes any results invalid. Measurement scales must be understood so that questionnaires can be properly designed and data adequately analyzed.
(6)
WHAT ARE THE DIFFERENCES BETWEEN STATISTICAL AND PRACTICAL SIGNIFICANCE? IS ONE A PREREQUISITE FOR THE OTHER? Answer Statistical significance is a means of assessing whether the results are due to chance. Practical significance assesses whether the result is useful or substantial enough to warrant action. Statistical significance would be a prerequisite of practical significance.
(7)
WHAT ARE THE IMPLICATIONS OF LOW STATISTICAL POWER? HOW CAN THE POWER BE IMPROVED IF IT IS DEEMED TOO LOW? Answer The implication of low power is that the researcher may fail to find significance when it actually exists. Power may be improved by increasing the alpha level (accepting a higher Type I error rate) or by increasing the sample size.
(8)
DETAIL THE MODEL-BUILDING APPROACH TO MULTIVARIATE ANALYSIS, FOCUSING ON THE MAJOR ISSUES AT EACH STEP. Answer Stage One: Define the Research Problem, Objectives, and Multivariate Technique to Be Used The starting point for any analysis is to define the research problem and objectives in conceptual terms before specifying any variables or measures. This will lead to an understanding of the appropriate type of technique, dependence or interdependence, needed to achieve the desired objectives. Then, based on the nature of the variables involved, a specific technique may be chosen. Stage Two: Develop the Analysis Plan A plan must be developed that addresses the particular needs of the chosen multivariate technique. These issues include: (1) sample size, (2) type of variables (metric vs. nonmetric), and (3) special characteristics of the technique. Stage Three: Evaluate the Assumptions Underlying the Multivariate Technique All techniques have underlying assumptions, both conceptual and empirical, that impact their ability to represent multivariate relationships. Techniques based on statistical inference must meet the assumptions of multivariate normality, linearity, independence of error terms, and equality of variances. Each technique must be considered individually for meeting these and other assumptions.
Stage Four: Estimate the Multivariate Model and Assess Overall Model Fit With assumptions met, a model is estimated considering the specific characteristics of the data. After the model is estimated, the overall model fit is evaluated to determine whether it achieves acceptable levels of statistical criteria, identifies proposed relationships, and achieves practical significance. At this stage the influence of outlier observations is also assessed. Stage Five: Interpret the Variate With acceptable model fit, interpretation of the model reveals the nature of the multivariate relationship. Stage Six: Validate the Multivariate Model The attempts to validate the model are directed toward demonstrating the generalizability of the results. Each technique has its own ways of validating the model.
3. To analyze the impacts of uncertainties inherent in data collection, including controllable and uncontrollable factors which may influence the data set.
Controllable factors: controlled by the researcher or analyst, such as the input of data. No matter how carefully the data are input, some errors will occur. For example, errors may result from incorrect coding or the misinterpretation of codes. Data examination provides the analyst an overview of the data, which will call attention to any impossible or improbable values which require further attention.
Uncontrollable factors: characteristic of the respondent or the data collection instrument, these may also be detected via data examination. For example, cases with a large number of missing values may be identified. In addition, outliers, or extreme cases, are designated in data examination techniques.
2. Relationships between two or more variables may be examined by graphical plots.
Scatterplot: the most common form of graphical display for examining the bivariate relationships among variables. The scatterplot is a graph of data points, where the horizontal axis is one variable and the vertical axis is another variable. The plotted values can be of many kinds, including actual values, expected values, and residuals. The patterns of the data points represent the relationship between the two variables (e.g., linear, curvilinear).
Scatterplot matrices: scatterplots computed for all combinations of variables. The diagonal of the matrix contains the histograms for each variable.
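A minimal sketch of producing such a matrix with pandas and matplotlib follows; the dataframe and the column names X6 through X8 are placeholders standing in for any few metric HBAT-style variables.

    import numpy as np
    import pandas as pd
    from pandas.plotting import scatter_matrix
    import matplotlib.pyplot as plt

    # Placeholder data standing in for a few metric variables
    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["X6", "X7", "X8"])

    # Scatterplot matrix: every pairwise scatterplot, histograms on the diagonal
    scatter_matrix(df, diagonal="hist", figsize=(6, 6))
    plt.show()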
3. Testing for group differences requires examination of (1) how the values are distributed for each group, (2) whether outliers are present in the groups, and (3) whether or not the groups are different from one another.
Box plot: a pictorial representation of the data distribution of each group. Each group is represented by a box, with the upper and lower boundaries of the box marking the upper and lower quartiles of the data distribution.
o Box length is the distance between the 25th percentile and the 75th percentile, such that the box contains the middle 50% of the values. The asterisk inside the box identifies the median.
o Lines or whiskers extending from each box represent the distance to the smallest and the largest observations that are less than one quartile range from the box (also marked by an X).
o Outliers (marked O) are observations which range between 1.0 and 1.5 quartiles away from the box.
o Extreme values are marked E and represent those observations which are greater than 1.5 quartiles away from the end of the box.
4. When the analyst wishes to graphically examine more than two variables, one of three types of multivariate profiles is appropriate.
Glyphs or metroglyphs: some form of circle with radii that correspond to a data value, or a multivariate profile which portrays a bar-like profile for each observation.
Mathematical transformation: transformation of the original data into a mathematical relationship which can be displayed graphically.
Iconic representation: pictorially represents each variable as a component of a whole picture. The most common form is a face, with each variable representing a different feature.
Unidentifiable: When the missing data are due to an action of the respondent, they are often unidentifiable and cannot be accommodated in the research design. In this case, the researcher evaluates the pattern of the missing data and determines the potential for remedy.
4. Assessing the degree of randomness will identify one of two types: missing at random (MAR) and missing completely at random (MCAR). Missing at random (MAR): When the missing values of Y depend on X, but not on Y. This occurs when X biases the randomness of the observed Y values, such that the observed Y values do not represent a true random sample of all actual Y values in the population. Missing completely at random (MCAR): When the observed values of Y are truly a random sample of all Y values. Approaches for diagnosing the randomness of the missing data process o Significance tests for a single variable: Form two groups, one group being those observations with missing data and another group being those observations with valid values, and test for significant differences between the two groups on any other variables of interest. If significant differences are found, a nonrandom missing data process is present, meaning that the missing data should be classified as MAR. o Dichotomized correlations for a pair of variables: For each of the two variables, replace each valid value with a value of one and each missing value with a value of zero, then compute correlations for the missing values of each variable. The correlations indicate the degree of association between the missing data on each variable pair. Low correlations denote randomness in the pair of variables. If all variable pairs have low correlations, the missing data can be classified as MCAR. o Overall test of randomness: Analyze the pattern of missing data on all variables and compare it to the pattern expected for a random missing data process. If no significant differences are found, the missing data can be classified as MCAR.
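To make the first diagnostic approach concrete, here is a small sketch assuming a pandas dataframe with a variable X1 that has missing values and another variable of interest X2; it simply compares X2 between cases missing and not missing X1 with a t test. The data and variable names are invented for the example.

    import numpy as np
    import pandas as pd
    from scipy import stats

    # Placeholder data: X1 has missing values, X2 is complete
    rng = np.random.default_rng(1)
    df = pd.DataFrame({"X1": rng.normal(size=100), "X2": rng.normal(size=100)})
    df.loc[df.sample(frac=0.2, random_state=1).index, "X1"] = np.nan

    # Split X2 by whether X1 is missing and test for a group difference
    missing_mask = df["X1"].isna()
    t_stat, p_value = stats.ttest_ind(df.loc[missing_mask, "X2"],
                                      df.loc[~missing_mask, "X2"],
                                      equal_var=False)

    # A significant difference suggests a nonrandom (MAR) missing data
    # process on X1; a nonsignificant result is consistent with MCAR.
    print(t_stat, p_value)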
5. Approaches are available for dealing with missing data that are selected based on the randomness of the missing data process.
Use of only observations with complete data. When conducting the analysis, the researcher would include only those observations with complete data.
o Default in many statistical programs.
o Used only if the missing data are missing completely at random (MCAR); when used with data which are missing at random (MAR), the results are not generalizable to the population.
Delete case(s) and/or variable(s). The researcher would delete the case(s) and/or variable(s) which exceed a specified level of missing data from the analysis.
o Most effective for data which are not missing at random, but is an alternative which can be used if the data are MAR or MCAR.
Imputation methods. Imputation methods replace missing values with estimates based on the valid values of other variables and/or cases in the sample. Imputation methods should be used only if the data are MCAR.
o Selecting values or observations to be used in the imputation process: the complete case approach uses only data from observations that have no missing data.
The all-available approach uses all available valid observations to estimate missing data, maximizing pairwise information.
o Five imputation methods are available:
Case substitution: observations with missing data are replaced by choosing another nonsampled observation.
Mean substitution: missing values for a single variable are replaced with the mean value of that variable based on all valid responses.
Cold deck imputation: missing values are replaced with a
constant value derived from external sources or previous research. Regression imputation: missing values are replaced with predicted estimates from a regression analysis. Estimated values are based on their relationship with other variables in the data set. Multiple imputation: a combination of several methods, two or more methods of imputation are used to derive a composite estimate for the missing value.
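As a rough sketch of two of these imputation methods in Python, the example below applies mean substitution and a regression-style imputation with scikit-learn; the dataframe and column names are placeholders, and the use of IterativeImputer is an assumption of the example rather than a method prescribed by the text.

    import numpy as np
    import pandas as pd
    from sklearn.impute import SimpleImputer
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    # Placeholder data with some missing values
    rng = np.random.default_rng(2)
    df = pd.DataFrame(rng.normal(size=(70, 3)), columns=["X1", "X2", "X3"])
    df.iloc[::7, 0] = np.nan   # sprinkle missing values into X1

    # Mean substitution: each missing value replaced by the variable's mean
    mean_imputed = SimpleImputer(strategy="mean").fit_transform(df)

    # Regression-style imputation: missing values predicted from the
    # other variables in the data set
    reg_imputed = IterativeImputer(random_state=0).fit_transform(df)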
Model-based procedures. Model-based procedures incorporate missing data into the analysis, either through a process specifically designed for missing data estimation or as an integral portion of the standard multivariate analysis.
2. Outliers can be classified into four categories.
Outliers arising from a procedural error. These outliers result from data entry errors or mistakes in coding. They should be identified and eliminated during data cleaning.
Outliers resulting from an extraordinary event with an explanation. These outliers can be explained. If found to be representative of the population, they should be kept in the data set.
Outliers resulting from an extraordinary event with no explanation. These outliers cannot be explained. Often, these observations are deleted from the data set.
Ordinary values which become unique when combined with other variables. While these values cannot be distinguished individually, they become very noticeable when combined with other values across other variables. In these cases, the observations should be retained unless specific evidence to the contrary is found. 3. Identification can be made from any of three perspectives: univariate, bivariate, or multivariate. If possible, multiple perspectives should be utilized to triangulate the identification of outliers. Univariate detection: examine the distribution of observations for a variable and select as outliers those values which fall at the outer ranges of the distribution. o When standardized, data values which are greater than 2.5 may be potential outliers. (For large sample sizes, the value may increase to 3 or 4.) Bivariate detection: examine scatterplots of variable pairs and select as outliers those values which fall markedly outside the range of the other observations. o Ellipse representing a confidence interval may be drawn around the expected range of observations on the scatterplot. Those values falling outside the range are potential outliers. Influence plot, where the point varies in size in proportion to its influence on the relationship and the largest points are potential outliers. Multivariate detection: assess each observation across a set of variables and select as outliers those values which fall outside a specified range specific to the statistical test employed. o Mahalanobis D2 is commonly used in multivariate analyses to identify outliers. It is a measure of the distance in multidimensional space of each observation from the mean center of the observations. o Conservative values (i.e. .001) for the statistical tests should be set for identification of potential outliers.
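For the multivariate perspective, a minimal sketch of computing Mahalanobis D2 and flagging observations against a conservative chi-square threshold is given below; the .001 level follows the guideline above, while the sample and number of variables are invented for illustration.

    import numpy as np
    from scipy.stats import chi2

    # Placeholder data: 100 observations on 4 metric variables
    rng = np.random.default_rng(3)
    X = rng.normal(size=(100, 4))

    # Mahalanobis D2: distance of each observation from the centroid,
    # scaled by the inverse covariance matrix
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

    # Conservative threshold: chi-square critical value at p = .001
    # with degrees of freedom equal to the number of variables
    threshold = chi2.ppf(0.999, df=X.shape[1])
    outliers = np.where(d2 > threshold)[0]
    print(outliers)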
4. Only observations which are truly unique from the population should be designated as outliers. The researcher should be careful of identifying too many observations as outliers. Profiles on each outlier should be generated and the data should be examined for the variable(s) responsible for generating the outlier. Multivariate techniques may also be employed to trace the underlying causes of outliers. The researcher should be able to classify the outlier in one of the four categories discussed above. Unnecessary deletion of outliers will limit the generalizability of the analysis. Outliers should be deleted from the analysis only if they are proven to be not representative of the population.
Transformations: When a distribution is found to be nonnormal, data transformations should be computed.
Skewness: z values for skewness exceeding +2.58 or -2.58 indicate a nonnormal distribution at the .01 significance level. Other statistical tests are available in specific statistical software programs.
3. Homoscedasticity: dependent variables should exhibit equal levels of variance across the range of predictor variables. Common sources: Most problems with unequal variances stem from either the type of variables included in the model or from a skewed distribution. Impact: Violation of this assumption will cause hypothesis tests to be either too conservative or too sensitive. Identification: graphical versus statistical. o Graphical plot of residuals will reveal violations of this assumption. o Statistical tests for equal variance dispersion relate to the variances within groups formed by nonmetric variables. The most common test is the Levene test, which is used to assess if the variances of a single metric variable are equal across any number of groups. When more than one variable is being tested, the Box's M test should be used. Remedies: Heteroscedastic variables can be remedied through data transformations.
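A short sketch of the Levene test in Python follows; the two groups are simulated placeholders and only illustrate checking equality of variances for one metric variable across groups.

    import numpy as np
    from scipy import stats

    # Placeholder: one metric variable measured in two groups
    rng = np.random.default_rng(4)
    group_a = rng.normal(loc=5.0, scale=1.0, size=50)
    group_b = rng.normal(loc=5.0, scale=2.0, size=50)

    # Levene test: null hypothesis of equal variances across the groups
    stat, p_value = stats.levene(group_a, group_b)

    # A small p-value indicates heteroscedasticity (unequal variances)
    print(stat, p_value)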
4. Linearity: variables should be linearly related. Identification: Scatterplots of variable pairs are most commonly used to identify departures from linearity. Examination of the residuals in a simple regression analysis may also be used as a diagnostic method. Nonlinearity: If a nonlinear relationship is detected, the most direct approach is to transform one or both of the variables. Other than a transformation, a new variable which represents the nonlinear relationship can be created.
5. Prediction errors should not be correlated. Patterns in the error terms reflect an underlying systematic bias in the relationship. Residual plots should not contain any recognizable pattern.
Violations of this assumption often result from problems in the data collection process.
6. Data transformations enable the researcher to modify variables to correct violations of the assumptions of normality, homoscedasticity, and linearity and to improve the relationships between variables.
Basis: Transformations can be based on theoretical or empirical reasons.
Distribution shape: The shape of the distribution provides the basis for selecting the appropriate transformation.
o Flat distribution: the most common transformation is the inverse.
o Positively skewed distributions: transformed by taking logarithms.
o Negatively skewed distributions: transformed by taking the square root.
o Cone-shaped distribution which opens to the right: should be transformed using an inverse. A cone-shaped distribution which opens to the left should be transformed by taking the square root.
o Nonlinear transformations can take many forms, including squaring the variable and adding additional variables termed polynomials.
General guidelines for performing data transformations:
o A transformation will have a noticeable effect only when the ratio of a variable's mean to its standard deviation is less than 4.0.
o When the transformation can be applied to either of two variables, select the variable with the smallest ratio.
o Transformations should be applied to the independent variables except in the case of heteroscedasticity.
o Heteroscedasticity can only be remedied by transformation of the dependent variable in a dependence relationship. If a heteroscedasticity relationship is also nonlinear, the dependent and perhaps the independent variables must be transformed. o Transformations may change the interpretation of the variables.
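A brief sketch of applying these transformations and checking their effect on skewness is shown below; the simulated positively skewed variable is just an illustration.

    import numpy as np
    from scipy.stats import skew

    # Placeholder: a positively skewed variable (e.g., an income-like measure)
    rng = np.random.default_rng(5)
    x = rng.lognormal(mean=2.0, sigma=0.8, size=200)

    # Common remedies for non-normality, chosen by distribution shape
    log_x = np.log(x)       # positively skewed -> logarithm
    sqrt_x = np.sqrt(x)     # negatively skewed distributions use the square root
    inv_x = 1.0 / x         # flat or cone-shaped distributions -> inverse

    # Compare skewness before and after each transformation
    print(skew(x), skew(log_x), skew(sqrt_x), skew(inv_x))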
(3)
DISTINGUISH BETWEEN DATA WHICH ARE MISSING AT RANDOM (MAR) AND MISSING COMPLETELY AT RANDOM (MCAR). EXPLAIN HOW EACH TYPE WILL IMPACT THE ANALYSIS OF MISSING DATA.
Answer
a. Missing at Random (MAR): If the missing values of Y depend on X, but not on Y, the missing data are missing at random. This occurs when X biases the randomness of the observed Y values, such that the observed Y values do not represent a true random sample of all actual Y values in the population.
b. Missing Completely at Random (MCAR): When the observed values of Y are truly a random sample of all Y values.
c. When the missing data are missing at random (MAR), the analyst should use only a modeling-based approach which accounts for the underlying processes of the missing data. When the missing data are missing completely at random (MCAR), the analyst may use any of the suggested approaches for dealing with missing data, such as using only observations with complete data, deleting case(s) or variable(s), or employing an imputation method.
(4)
DESCRIBE THE CONDITIONS UNDER WHICH A RESEARCHER WOULD DELETE A CASE WITH MISSING DATA VERSUS THE CONDITIONS UNDER WHICH A RESEARCHER WOULD USE AN IMPUTATION METHOD.
Answer
The researcher must first evaluate the randomness of the missing data process. If the data are missing at random, deleting a case is the only acceptable alternative of the two. Data which are missing at random cannot employ an imputation method, as it would introduce bias into the results. Only cases with data which are missing completely at random would utilize an imputation method.
If the data are missing completely at random, the choice of case deletion versus imputation method should be based on theoretical and empirical considerations. If the sample size is sufficiently large, the analyst may wish to consider deletion of cases with a great degree of missing data. Cases with missing data are good candidates for deletion if they represent a small subset of the sample and if their absence does not otherwise distort the data set.
For instance, cases with missing dependent variable values are often deleted. If the sample size is small, the analyst may wish to use an imputation method to fill in missing data. The analyst should, however, consider the amount of missing data when selecting this option. The degree of missing data will influence the researcher's choice of information used in the imputation (i.e., complete case vs. all-available approaches) and the researcher's choice of imputation method (i.e., case substitution, mean substitution, cold deck imputation, regression imputation, or multiple imputation).
(5)
EVALUATE THE FOLLOWING STATEMENT: "IN ORDER TO RUN MOST MULTIVARIATE ANALYSES, IT IS NOT NECESSARY TO MEET ALL OF THE ASSUMPTIONS OF NORMALITY, LINEARITY, HOMOSCEDASTICITY, AND INDEPENDENCE."
Answer
As will be shown in each of the following chapter outlines, each multivariate technique has a set of underlying assumptions which must be met. The degree to which a violation of any of the four above assumptions will distort data analyses is dependent on the specific multivariate technique. For example, multiple regression analysis is sensitive to violations of all four of the assumptions, whereas multiple discriminant analysis is primarily sensitive to violations of multivariate normality.
(6)
DISCUSS THE FOLLOWING STATEMENT: "MULTIVARIATE ANALYSES CAN BE RUN ON ANY DATA SET, AS LONG AS THE SAMPLE SIZE IS ADEQUATE."
Answer
False. Although sample size is an important consideration in multivariate analyses, it is not the only consideration. Analysts must also consider the degree of missing data present in the data set and examine the variables for violations of the assumptions of the intended techniques.
1. Factor Analysis has three primary objectives. Identification of the structure of relationships among either variables or respondents. Identification of representative variables from a much larger set of variables for use in subsequent multivariate analyses. Creation of an entirely new set of variables, which are much smaller in number in order to partially or completely replace the original set of variables for inclusion in subsequent multivariate techniques.
4. Sample size is an important consideration in factor analyses. The sample size should be 100 or larger. Sample sizes between 50 and 100 may be analyzed but with extreme caution. The ratio of observations to variables should be at least 5 to 1 in order to provide the most stable results.
1. The most basic assumption is that the set of variables analyzed are related. Variables must be interrelated in some way since factor analysis seeks the underlying common dimensions among the variables. If the variables are not related, then each variable will be its own factor. o Example: if you had 20 unrelated variables, you would have 20 different factors. When the variables are unrelated, factor analysis has no common dimensions with which to create factors. Thus, some underlying structure or relationship among the variables must exist. The sample should be homogenous with respect to some underlying factor structure.
2. Factor analysis assumes the use of metric data. Metric variables are assumed, although dummy variables may be used (coded 0-1). Factor analysis does not require multivariate normality.
Multivariate normality is necessary if the researcher wishes to apply statistical tests for significance of factors.
3. The data matrix must have sufficient correlations to justify the use of factor analysis. Rule of thumb: a substantial number of correlations greater than .30 are needed. Tests of appropriateness: anti-image correlation matrix of partial correlations, the Bartlett test of sphericity, and the measure of sampling adequacy (MSA greater than .50).
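These appropriateness tests can also be demonstrated in Python with the third-party factor_analyzer package, as in the hedged sketch below; the simulated dataframe of metric variables is a placeholder.

    import numpy as np
    import pandas as pd
    from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                                 calculate_kmo)

    # Placeholder: correlated metric variables standing in for survey items
    rng = np.random.default_rng(6)
    base = rng.normal(size=(100, 1))
    df = pd.DataFrame(base + 0.5 * rng.normal(size=(100, 5)),
                      columns=[f"X{i}" for i in range(1, 6)])

    # Bartlett test of sphericity: a significant result supports factorability
    chi_square, p_value = calculate_bartlett_sphericity(df)

    # Measure of sampling adequacy (MSA/KMO): overall value should exceed .50
    kmo_per_item, kmo_overall = calculate_kmo(df)
    print(chi_square, p_value, kmo_overall)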
A Priori criterion: The researcher may know how many factors should be extracted. This criterion is useful if the researcher is testing previous research or specific hypotheses.
Percentage of Variance criterion: As factor analysis extracts each factor, the cumulative percent of variance explained is used to determine the number of factors. The researcher may desire to specify a necessary percentage of variance explained by the solution.
Scree Test criterion: A scree test is derived by plotting the eigenvalues of each factor relative to the number of factors in order of extraction.
o Earlier extracted factors have the most common variance, causing a rapid decline in the amount of variance explained as additional factors are extracted.
o At some point, the amount of specific variance begins to overtake the common variance in the factors. A scree plot reveals this by a rapid flattening of the plot line.
o While the factors contain mostly common variance the plot line will continue to decline sharply, but once specific variance becomes too large the plot line will become horizontal.
o The point where the line becomes horizontal is the appropriate number of factors.
o The scree test almost always suggests more factors than the latent root criterion.
Heterogeneity of the Respondents: In a heterogeneous sample, the first factors extracted are those which are more homogeneous across the entire sample. Those factors which best discriminate among subgroups in the sample will be extracted later in the analysis.
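To complement these criteria, the sketch below computes the eigenvalues of a correlation matrix, applies the latent root (eigenvalue greater than 1) criterion, and draws a scree line; the simulated data are placeholders.

    import numpy as np
    import matplotlib.pyplot as plt

    # Placeholder: 100 observations on 8 partially correlated variables
    rng = np.random.default_rng(7)
    common = rng.normal(size=(100, 2))
    X = common @ rng.normal(size=(2, 8)) + rng.normal(size=(100, 8))

    # Eigenvalues of the correlation matrix, largest first
    eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

    # Latent root criterion: retain factors with eigenvalues greater than 1
    n_factors = int(np.sum(eigenvalues > 1.0))
    print(eigenvalues.round(2), n_factors)

    # Scree plot: look for the point where the line flattens out
    plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
    plt.xlabel("Factor number")
    plt.ylabel("Eigenvalue")
    plt.show()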
Oblique rotation methods such as OBLIMIN (SPSS) and PROMAX (SAS) allow correlated factors.
2. Criteria for Practical and Statistical Significance of Factor Loadings
Magnitude for practical significance: Factor loadings can be classified based on their magnitude:
o Greater than ±.30: minimum consideration level.
o ±.40: more important.
o ±.50: practically significant (the factor accounts for 25% of the variance in the variable).
Power and statistical significance: Given the sample size, the researcher may determine the level of factor loadings necessary to be significant at a predetermined level of power. For example, in a sample of 100 at an 80% power level, factor loadings of .55 and above are significant. The loading level necessary for significance varies due to several factors:
o Increases in the number of variables being analyzed decrease the level necessary to consider a loading significant.
o Increases in the sample size decrease the level necessary to consider a loading significant.
o Increases in the number of factors extracted increase the level necessary to consider a loading significant.
3. Interpreting a Factor Matrix: Look for a clear factor structure indicated by significant loadings on a single factor and high communalities. Variables that load across factors or that have low loadings or communalities may be candidates for deletion. Naming the factor is based on an interpretation of the factor loadings.
o Significant loadings: The variables that most significantly load on each factor should be used in naming the factors. The variables' magnitude and strength provide meaning to the factors.
o Impact of the Rotation: The selection of a rotation method affects the interpretation of the loadings. With orthogonal rotation, each variable's loading on each factor is independent of its loading on another factor. With oblique rotation, independence of the loadings is not preserved and interpretation then becomes more complex.
4. Respecification should always be considered. Some methods are: Deletion of a variable(s) from the analysis Employing a different rotational method for interpretation Extraction of a different number of factors
Beyond the interpretation and understanding of the relationship among the variables, the researcher may wish to use the factor analysis results in subsequent analysis. Factor analysis may be used to reduce the data for further use by (1) the selection of a surrogate variable, (2) creation of a new variable with a summated scale, or (3) replacement of the factor with a factor score. 1. A surrogate variable that is representative of the factor may be selected as the variable with the highest loading.
2. All the variables loading highly on a factor may be combined (the sum or the average) to form a replacement variable. Advantages of the summated scale:
o Measurement Error is reduced by multiple measures. o Taps all aspects or domains of a concept with highly related multiple indicators. Basic Issues of Scale Construction:
o A conceptual definition is the starting point for creating a scale. The scale must appropriately measure what it purports to measure to assure content or face validity.
o A scale must be unidimensional, meaning that all items are strongly associated with each other and represent a single concept.
o Reliability of the scale is essential. Reliability is the degree of consistency between multiple measurements of a variable. Test-retest reliability is one form of reliability. Another form of reliability is the internal consistency of the items in a scale. Measures of internal consistency include item-to-total correlation, inter-item correlation, and the reliability coefficient.
o Once content or face validity, unidimensionality, and reliability are established, other forms of scale validity should be assessed. Discriminant validity is the extent to which two measures of similar but different concepts are distinct. Nomological validity refers to the degree that the scale makes accurate predictions of other concepts.
3. Factor scores, computed using all variables loading on a factor, may also be used as a composite replacement for the original variable. Factor scores are computed using all variables that load on a factor. Factor scores may not be easy to replicate.
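Since the reliability coefficient (typically Cronbach's alpha) is central to assessing internal consistency, here is a small sketch computing it directly from its definition; the item responses are simulated placeholders, and the commonly cited guideline of an alpha of at least .70 is a general rule of thumb rather than a figure from this outline.

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for a (respondents x items) array."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Placeholder: 100 respondents answering 4 related scale items
    rng = np.random.default_rng(8)
    trait = rng.normal(size=(100, 1))
    items = trait + 0.7 * rng.normal(size=(100, 4))

    print(round(cronbach_alpha(items), 2))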
(3)
WHAT GUIDELINES CAN YOU USE TO DETERMINE THE NUMBER OF FACTORS TO EXTRACT? EXPLAIN EACH BRIEFLY.
Answer
The appropriate guidelines utilized depend to some extent upon the research question and what is known about the number of factors that should be present in the data. If the researcher knows the number of factors that should be present, then the number to extract may be specified at the beginning of the analysis by the a priori criterion. If the research question is largely to explain a minimum amount of variance, then the percentage of variance criterion may be most important.
When the objective of the research is to determine the number of latent factors underlying a set of variables, a combination of criteria, possibly including the a priori and percentage of variance criteria, may be used in selecting the final number of factors. The latent root criterion is the most commonly used technique. This technique is to extract the number of factors having eigenvalues greater than 1. The rationale is that a factor should explain at least as much variance as a single variable. A related technique is the scree test criterion. To develop this test, the latent roots (eigenvalues) are plotted against the number of factors in their order of extraction. The resulting plot shows an elbow in the sloped line where the unique variance begins to dominate common variance. The scree test criterion usually indicates more factors than the latent root rule.
One of these four criteria for the initial number of factors to be extracted should be specified. Then an initial solution and several trial solutions are calculated. These solutions are rotated and the factor structure is examined for meaning. The factor structure that best represents the data and explains an acceptable amount of variance is retained as the final solution.
(4)
HOW DO YOU USE THE FACTOR-LOADING MATRIX TO INTERPRET THE MEANING OF FACTORS?
Answer
The first step in interpreting the factor-loading matrix is to identify the largest significant loading of each variable on a factor. This is done by moving horizontally across the factor matrix and underlining the highest significant loading for each variable. Once completed for each variable, the researcher continues to look for other significant loadings. If there is simple structure, only single significant loadings for each variable, then the factors are labeled. Variables with high factor loadings are considered more important than variables with lower factor loadings in the interpretation phase. In general, factor names will be assigned in such a way as to express the variables which load most significantly on the factor.
(5)
HOW AND WHEN SHOULD YOU USE FACTOR SCORES IN CONJUNCTION WITH OTHER MULTIVARIATE STATISTICAL TECHNIQUES?
Answer
When the analyst is interested in creating an entirely new set of a smaller number of composite variables to replace either in part or completely the original set of variables, then the analyst would compute factor scores for use as such composite variables. Factor scores are composite measures for each factor representing each subject. The original raw data measurements and the factor analysis results are utilized to compute factor scores for each individual. Factor scores may not replicate as easily as a summated scale, and this must be considered in their use.
(6)
WHAT ARE THE DIFFERENCES BETWEEN FACTOR SCORES AND SUMMATED SCALES? WHEN IS EACH MOST APPROPRIATE?
Answer
The key difference between the two is that the factor score is computed based on the factor loadings of all variables loading on a factor, whereas the summated scale is calculated by combining only selected variables. Thus, the factor score is characterized not only by the variables that load highly on a factor, but also by those that have lower loadings. The summated scale represents only those variables that load highly on the factor.
Although both summated scales and factor scores are composite measures, there are differences that lead to certain advantages and disadvantages for each method. Factor scores have the advantage of representing a composite of all variables loading on a factor. This is also a disadvantage in that it makes interpretation and replication more difficult. Also, factor scores can retain orthogonality, whereas summated scales may not remain orthogonal. The key advantage of summated scales is that, by including only those variables that load highly on a factor, their use makes interpretation and replication easier. Therefore, the decision rule would be that if data are used only in the original sample or orthogonality must be maintained, factor scores are suitable. If generalizability or transferability is desired, then summated scales are preferred.
(7)
WHAT IS THE DIFFERENCE BETWEEN Q-TYPE FACTOR ANALYSIS AND CLUSTER ANALYSIS? Answer Both Q-Type factor analysis and cluster analysis compare a series of responses to a number of variables and place the respondents into several groups. The difference is that the resulting groups for a Q-type factor analysis would be based on the intercorrelations between the means and standard deviations of the respondents. In a typical cluster analysis approach, groupings would be based on a distance measure between the respondents' scores on the variables being analyzed.
(8)
WHEN WOULD THE RESEARCHER USE AN OBLIQUE ROTATION INSTEAD OF AN ORTHOGONAL ROTATION? WHAT ARE THE BASIC DIFFERENCES BETWEEN THEM? Answer In an orthogonal factor rotation, the correlation between the factor axes is arbitrarily set at zero and the factors are assumed to be independent. This simplifies the mathematical procedures. In oblique factor rotation, the angles between axes are allowed to seek their own values, which depend on the density of variable clusterings. Thus, oblique rotation is more flexible and more realistic (it allows for correlation of underlying dimensions) than orthogonal rotation although it is more demanding mathematically. In fact, there is yet no consensus on a best technique for oblique rotation. When the objective is to utilize the factor results in a subsequent statistical analysis, the analyst may wish to select an orthogonal rotation procedure. This is because the factors are orthogonal (independent) and therefore eliminate collinearity. However, if the analyst is simply interested in obtaining theoretically meaningful constructs or dimensions, the oblique factor rotation may be more desirable because it is theoretically and empirically more realistic.
3. This is accomplished by a statistical procedure called ordinary least squares which minimizes the sum of squared prediction errors (residuals) in the equation.
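To show what minimizing the sum of squared prediction errors looks like in code, here is a minimal sketch using numpy's least-squares solver on simulated data; the variables and coefficients are invented for the example.

    import numpy as np

    # Simulated data: y depends on two metric independent variables plus error
    rng = np.random.default_rng(9)
    X = rng.normal(size=(100, 2))
    y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=100)

    # Add an intercept column and solve the ordinary least squares problem,
    # i.e., find the coefficients b that minimize sum((y - Xb)**2)
    X_design = np.column_stack([np.ones(len(y)), X])
    b, residual_ss, rank, _ = np.linalg.lstsq(X_design, y, rcond=None)

    print(b.round(2))      # estimated intercept and coefficients
    print(residual_ss)     # sum of squared residuals (prediction errors)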
Prediction predict a dependent variable with a set of independent variables. As such, two objectives are associated with prediction: o Maximization of the overall predictive power of the independent variables in the variate. o Comparison of competing models made up of two or more sets of independent variables to assess the predictive power of each.
Explanation explain the degree and character of the relationship between dependent and independent variables. As such, three objectives are associated with explanation: o Determination of the relative importance of each independent variable in the prediction of the dependent variable. o Assessment of the nature of the relationships between the predictors and the dependent variable. (i.e. linearity) o Insight into the interrelationships among the independent variables and the dependent variable. (i.e. correlations)
2. Multiple regression analysis is appropriate for statistical relationships, not functional relationships. Statistical relationships assume that more than one value of the dependent value will be observed for any value of the independent variables. An average value is estimated and error is expected in prediction. Functional relationships assume that a single value of the dependent value will be observed for any value of the independent variables. An exact estimate is made, with no error.
3. The selection of dependent and independent variables for multiple regression analysis should be based primarily on theoretical or conceptual meaning.
Selecting the dependent variable: dictated by the research problem, with concern for measurement error, or whether the variable is an accurate and consistent measure of the concept being studied.
Selecting the independent variables: The inclusion of an independent variable must be guided by the theoretical foundation of the regression model and its managerial implications. A variable that by chance happens to achieve statistical significance, but has no theoretical or managerial relationship with the dependent variable, is of no use to the researcher in explaining the phenomena under observation. Researchers must be concerned with specification error, the inclusion of irrelevant variables or the omission of relevant variables.
4. The researcher must seek parsimony in the regression model. The fewest independent variables with the greatest contribution to the variance explained should be selected.
1. The sample size used will impact the statistical power and generalizability of the multiple regression analysis. The power (probability of detecting statistically significant relationships) at specified significance levels is related to sample size. o Small samples (fewer than 20 observations) will detect only very strong relationships with any degree of certainty. o Large samples (1,000 or more observations) will find almost any relationship statistically significant due to the oversensitivity of the test. The generalizability of the results is directly affected by the ratio of observations to independent variables. o Minimum level is 5 to 1 (i.e., 5 observations per independent variable in the variate).
o Desired level is 15 to 20 observations for each independent variable. 2. Most regression models for survey data are random effects models. In a random effects model, the levels of the predictor are selected at random and a portion of the random error comes from the sampling of the predictors. 3. When a nonlinear relationship exists between the dependent and the independent variables or when the analyst wishes to include nonmetric independent variables in the regression model, transformations of the data should be computed. Nonlinear relationships: o Arithmetic transformations (e.g., square root or logarithm) and polynomials are most often used to represent nonlinear relationships. Moderator effects: o reflect the changing nature of one independent variable's relationship with the dependent variable as a function of another independent variable. o represented as a compound variable in the regression equation. o moderators change the interpretation of the regression coefficients. To determine the total effect of an independent variable, the separate and the moderated effects must be combined. Nonmetric variable inclusion: o Dichotomous variables, also known as dummy variables, may be used to replace nonmetric independent variables. o The resulting coefficients represent the differences in group means from the comparison group and are in the same units as the dependent variable.
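A brief sketch of how a dummy-coded nonmetric predictor and a moderator (interaction) term enter the variate; the data, variable names, and coefficient values here are hypothetical, not from the HBAT examples:

# Sketch: dummy variable and moderator (interaction) term in a regression variate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "income": rng.normal(50, 10, n),
    "region": rng.choice(["north", "south"], n),   # nonmetric predictor
})
# dummy coding: 'south' becomes the comparison (omitted) group
df["region_north"] = (df["region"] == "north").astype(int)
df["spend"] = (5 + 0.4 * df["income"] + 3 * df["region_north"]
               + 0.2 * df["income"] * df["region_north"]
               + rng.normal(0, 2, n))

# moderator entered as a compound variable: income * region_north
model = smf.ols("spend ~ income + region_north + income:region_north", data=df).fit()
print(model.summary())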
3. Uncorrelated error terms. Any errors encountered in multiple regression analysis are expected to be completely random in nature and not systematically related. The researcher should expect the same chance of random error at each level of prediction. 4. Normality: the dependent and independent variables are normally distributed. The data distribution for all variables is assumed to be the normal distribution. A normal distribution is necessary for the use of the F and t statistics, since sufficiently large violations of this assumption invalidate the use of these statistics. Diagnostics for normality are: o histograms of the residuals (bell-shaped curve) o normal probability plots (diagonal line) o skewness o Shapiro-Wilk test o Kolmogorov-Smirnov test If violations of normality are found, there are data transformations that may restore normality and allow the use of the variable(s) in the regression equation.
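These normality diagnostics might be run on the residuals as follows; this is a sketch with simulated data using the standard scipy implementations, not output from the text's examples:

# Sketch: residual normality diagnostics (hypothetical data).
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(150, 3)))
y = X @ np.array([1.0, 0.5, -0.3, 0.8]) + rng.normal(size=150)

resid = sm.OLS(y, X).fit().resid

print("skewness:", stats.skew(resid))
print("Shapiro-Wilk:", stats.shapiro(resid))                 # best for small samples
print("Kolmogorov-Smirnov:", stats.kstest(
    (resid - resid.mean()) / resid.std(ddof=1), "norm"))     # standardized residuals vs N(0,1)
# sm.qqplot(resid, line="45") would give the normal probability plot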
5. Multicollinearity: multiple regression works best with no collinearity among the independent variables. The presence of collinearity or multicollinearity suppresses the R2 and confounds the analysis of the variate. This correlation among the predictor variables prohibits assessment of the contribution of each independent variable. Tolerance value or the variance inflation factor (VIF) can be used to detect multicollinearity. Once detected, the analyst may choose one of four options:
o omit one or more of the highly collinear independent variables, o use the model for prediction only, o assess the predictor-dependent variable relationship with simple correlations, or o use a different method such as Bayesian regression
1. Model selection: accomplished by any of several methods available to aid the researcher in selecting or estimating the best regression model. Confirmatory specification o The analyst specifies the complete set of independent variables. Thus, the analyst has total control over variable selection. Sequential search approaches o Sequential approaches estimate a regression equation with a set of variables, adding or deleting variables until some overall criterion measure is achieved. Variable entry may be done in a forward, backward, or stepwise manner. Forward method begins with no variables in the equation and then adds variables that satisfy the F-to-enter test. Then the equation is estimated again and the F-to-enter of the remaining variables is calculated. This is repeated until the F-to-enter test finds no variables to enter. Backward elimination begins with all variables in the regression equation and then eliminates variables using the F-to-remove test. The same repetition of estimation is performed as with forward estimation.
Stepwise estimation is a combination of forward and backward methods. It begins with no variables in the equation as with forward estimation and then adds variables that satisfy the F test. The equation is estimated again and additional variables that satisfy the F test are entered. At each re-estimation stage, however, the variables already in the equation are also examined for removal by the appropriate F test. This repetition continues until both F tests are not satisfied by any of the variables either in or out of the regression equation.
Combinatorial Methods o The combinatorial approach estimates regression equations for all subset combinations of the independent variables. The most common procedure is known as all-possible-subsets regression. o Combinatorial methods become impractical for very large sets of independent variables. For example, for even 10 independent variables, one would have to estimate 1024 regression equations.
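A rough forward-selection sketch is shown below. It uses the p-value of each candidate variable as the entry criterion (for a single added variable this is equivalent to an F-to-enter test); the data, the .05 threshold, and the function name are illustrative assumptions rather than the text's procedure:

# Rough forward-selection sketch (hypothetical data and threshold).
import numpy as np
import statsmodels.api as sm

def forward_select(X, y, alpha_enter=0.05):
    remaining = list(range(X.shape[1]))
    chosen = []
    while remaining:
        pvals = {}
        for j in remaining:
            cols = chosen + [j]
            fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            pvals[j] = fit.pvalues[-1]          # p-value of the candidate variable
        best = min(pvals, key=pvals.get)
        if pvals[best] < alpha_enter:
            chosen.append(best)
            remaining.remove(best)
        else:
            break
    return chosen

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 6))
y = 1 + 2 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=120)
print("variables entered:", forward_select(X, y))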
2. The variate must meet the assumptions of linearity, constant variance, independence, and normality along with the individual variables in the analysis. 3. Assessment of the regression model fit is done in two parts: examining overall fit and analyzing the variate. Examine the overall model fit. o Examine the variate's ability to predict the criterion variable and assess how well the independent variables predict the dependent variable. o Several statistics exist for the evaluation of overall model fit. Coefficient of determination (R2): a measure of the amount of variance in the dependent variable explained by the independent variable(s). A value of one (1) means perfect explanation and is not encountered in reality due to ever-present error. A value of .91 means that 91% of the variance in the dependent variable is explained by the independent variables.
The amount of variation explained by the regression model should be more than the variation explained by the average. Thus, R2 should be greater than zero. R2 is impacted by two facets of the data: o the number of independent variables relative to the sample size (see the sample size discussion earlier). For this reason, analysts should use the adjusted coefficient of determination, which adjusts for inflation in R2 from overfitting the data. o the number of independent variables included in the analysis. As you increase the number of independent variables in the model, you increase the R2 automatically because the sum of squares explained by the regression begins to approach the total sum of squares about the average.
Standard error of the estimate: another measure of the accuracy of our predictions. It represents an estimate of the standard deviation of the actual dependent values around the regression line. Since this is a measure of variation about the regression line, the smaller the standard error, the better.
F-test: the F-test reported with the R2 is a significance test of the R2. This test indicates whether a significant amount of variance (significantly different from zero) was explained by the model.
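For reference, these fit measures can be written compactly (standard formulas, stated here rather than reproduced from the text, with n = sample size and k = number of independent variables):

\[
R^2 = 1 - \frac{SS_{\text{error}}}{SS_{\text{total}}}, \qquad
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}, \qquad
F = \frac{R^2 / k}{\left(1 - R^2\right)/(n - k - 1)}
\]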
Analyze the variate. o The variate is the linear combination of independent variables used to predict the dependent variables. Analysis of the variate relates the respective contribution of each independent variable in
the variate to the regression model. The researcher is informed as to which independent variable contributes the most to the variance explained and may make relative judgments between/among independent variables (using standardized coefficients only).
o Regression coefficients are tested for statistical significance. The intercept (or constant term) should be tested for appropriateness for the predictive model. If the constant is not significantly different from zero, it cannot be used for predictive purposes. The estimated coefficients should be tested to ensure that across all possible samples, the coefficient would be different from zero. The size of the sample will impact the stability of the regression coefficients. The larger the sample size, the more generalizable the estimated coefficients will be. An F-test may be used to test the appropriateness of the intercept and the regression coefficients.
4. The researcher should examine the data for influential observations. Influential observations, leverage points, and outliers all have an effect on the regression results. The objective of this analysis is to determine the extent and type of effect. One of four conditions gives rise to influential observations: o an error in observation or data entry, o a valid but exceptional observation, which is explainable by an extraordinary situation, o an exceptional observation with no likely explanation, or o an observation that is ordinary on its individual characteristics, but exceptional in its combination of characteristics. Detection of problem cases is accomplished by: o examining the residuals: studentized residuals greater than ±2.0
o identifying leverage points: with two predictor variables, plot the two variables as the axes of a two-dimensional plot; with three or more variables, use the hat matrix. Leverage thresholds: if p > 10 and n > 50, use 2p/n; if p < 10 or n < 50, use 3p/n. o examining single case diagnostics, such as DFBETA, Cook's distance, and DFFIT.
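The residual and leverage checks above can be pulled directly from statsmodels' influence diagnostics. The sketch below uses simulated data and treats p as the number of model terms including the constant, which is an assumption about how the rule of thumb is applied:

# Sketch: studentized residuals and hat-matrix leverage values (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(60, 3)))
y = X @ np.array([1.0, 0.8, -0.5, 0.3]) + rng.normal(size=60)

influence = sm.OLS(y, X).fit().get_influence()
student_resid = influence.resid_studentized_external   # studentized (deleted) residuals
leverage = influence.hat_matrix_diag                    # diagonal of the hat matrix

n, p = X.shape                                          # p includes the constant here
threshold = 2 * p / n if (p > 10 and n > 50) else 3 * p / n
print("outliers on Y:", np.where(np.abs(student_resid) > 2)[0])
print("leverage points:", np.where(leverage > threshold)[0])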
'Part Correlation': removes the effect of the remaining independent variables from the independent side of the equation. 'Partial Correlation': removes their effect from both sides of the regression equation. This correlation provides the researcher with a more pure correlation between the dependent and independent variables.
o T-values indicate the significance of the partial correlation of each variable. This may be compared against the researcher's a priori standard for significance. 2. Multicollinearity is a data problem that can adversely impact regression interpretation by limiting the size of the R-squared and confounding the contribution of independent variables. For this reason, two measures, tolerance and VIF, are used to assess the degree of collinearity among independent variables. Tolerance is a measure of collinearity between two independent variables or multicollinearity among three or more independent variables. It is the proportion of variance in one independent variable that is not explained by the remaining independent variables. o Each independent variable will have a tolerance measure and each measure should be close to 1. A tolerance of less than .5 indicates a collinearity or multicollinearity problem. Variance inflation factor (VIF) is the reciprocal of the tolerance value.
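Tolerance and VIF follow directly from these definitions and can be computed in a few lines; the data below are simulated and deliberately collinear, so the numbers are illustrative only:

# Sketch: tolerance and VIF for each independent variable.
# Tolerance_j = 1 - R^2 of variable j regressed on the other predictors; VIF_j = 1 / Tolerance_j.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=100)   # deliberately collinear with x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2_j = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared
    tolerance = 1 - r2_j
    print(f"x{j + 1}: tolerance = {tolerance:.3f}, VIF = {1 / tolerance:.2f}")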
o The regression model or equation may be retested on a new or split sample. o No regression model should be assumed to be the final or absolute model of the population. Calculation of the PRESS statistic o Assesses the overall predictive accuracy of the regression by a series of iterations, whereby one observation is omitted in the estimation of the regression model and the omitted observation is predicted with the estimated model. The prediction residuals for the omitted observations are then squared and summed to provide an overall measure of predictive fit. Comparison of regression models o The adjusted R-square is compared across different estimated regression models to assess the model with the best prediction. 2. Before applying an estimated regression model to a new set of independent variable values, the researcher should consider the following: Predictions have multiple sampling variations, not only the sampling variations from the original sample, but also those of the newly drawn sample. Conditions and relationships should not have changed materially from their measurement at the time the original sample was taken. Do not use the model to estimate beyond the range of the independent variables in the sample. The regression equation is only valid for prediction purposes within the original range of magnitude for the prediction variables. Results cannot be extrapolated beyond the original range of variables measured since the form of the relationship may change. o Example: If a predictor variable is the number of tires sold and in the original data the range of this variable was from 30 to 120 tires per month, then prediction of the dependent variable when 200 tires per month are sold is invalid. We are outside the original range of magnitude; we do not know the form of the relationship between the predictor variable and the criterion variable. At 200 tires per month the relational form may become curvilinear or
quadratic.
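Under standard linear regression assumptions, the PRESS statistic described above has a convenient closed form based on the hat-matrix diagonals, which avoids refitting the model once per observation. The sketch below uses simulated data and is illustrative rather than a reproduction of the text's output:

# Sketch: PRESS statistic via the leave-one-out shortcut
# PRESS = sum( (e_i / (1 - h_ii))^2 ), using hat-matrix diagonals.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
X = sm.add_constant(rng.normal(size=(80, 3)))
y = X @ np.array([2.0, 1.0, -0.5, 0.7]) + rng.normal(size=80)

fit = sm.OLS(y, X).fit()
h = fit.get_influence().hat_matrix_diag
press = np.sum((fit.resid / (1 - h)) ** 2)
print("PRESS:", press, " SSE:", np.sum(fit.resid ** 2))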
(3)
HOW CAN NONLINEARITY BE CORRECTED OR ACCOUNTED FOR IN THE REGRESSION EQUATION? Answer Nonlinearity may be corrected or accounted for in the regression equation by three general methods. One way is through a direct data transformation of the original variable as discussed in Chapter 2. Two additional ways are to explicitly model the nonlinear relationship in the regression equation through the use of polynomials and/or interaction terms. Polynomials are power transformations that may be used to represent quadratic, cubic, or higher order components in the regression equation. The advantage of polynomials over direct data transformations is that polynomials allow testing of the type of nonlinear relationship. Another method of representing nonlinear relationships is through the use of an interaction or moderator term for two independent variables. Inclusion of this type of term in the regression equation allows for the slope of the relationship of one independent variable to change across values of a second independent variable.
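A polynomial term can be tested in exactly this spirit; the sketch below (simulated data, illustrative variable names) compares a linear and a quadratic specification and checks the significance of the squared term:

# Sketch: testing a quadratic relationship by adding a polynomial term.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
df = pd.DataFrame({"x": rng.uniform(0, 10, 150)})
df["y"] = 3 + 1.2 * df["x"] - 0.15 * df["x"] ** 2 + rng.normal(0, 1, 150)

linear = smf.ols("y ~ x", data=df).fit()
quadratic = smf.ols("y ~ x + I(x ** 2)", data=df).fit()

# a significant squared term indicates the quadratic form adds explanation
print("R2 linear vs quadratic:", linear.rsquared, quadratic.rsquared)
print("p-value of squared term:", quadratic.pvalues.iloc[-1])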
(4)
COULD YOU FIND A REGRESSION EQUATION THAT WOULD BE ACCEPTABLE AS STATISTICALLY SIGNIFICANT AND YET OFFER NO ACCEPTABLE INTERPRETATIONAL VALUE TO MANAGEMENT? Answer Yes. For example, with a sufficiently large sample size you could obtain a significant relationship, but a very small coefficient of determination, too small to be of value. In addition, there are some basic assumptions associated with the use of the regression model, which if violated, could make any obtained results at best spurious. One of the assumptions is that the conditions and relationships existing when sample data were obtained remain unchanged. If changes have occurred they should be accommodated before any new inferences are made. Another is that there is a "relevant range" for any regression model. This range is determined by the predictor variable values used to construct the model. In using the model, predictor values should fall within this relevant range. Finally, there are statistical considerations. For example, the effect of multicollinearity among predictor variables is one such consideration.
(5)
WHAT IS THE DIFFERENCE IN INTERPRETATION BETWEEN THE REGRESSION COEFFICIENTS ASSOCIATED WITH INTERVAL SCALED PREDICTOR VARIABLES AS OPPOSED TO DUMMY (0,1) PREDICTOR VARIABLES? Answer The use of dummy variables in regression analysis is structured so that there are (n-1) dummy variables included in the equation (where n = the number of categories being considered). In the dichotomous case, then, since n = 2, there is one variable in the equation. This variable has a value of one or zero depending on the category being expressed (e.g., male = 0, female = 1). In the equation, the dichotomous variable will be included when its value is one and omitted when its value is zero. When dichotomous predictor variables are used, the intercept (constant) coefficient (b0) estimates the average effect of the omitted dichotomous variables. The other coefficients, b1 through bk, represent the average differences between the omitted dichotomous variables and the included dichotomous variables. These coefficients (b1 through bk), then, represent the average importance of the two categories in predicting the dependent variable. Coefficients b0 through bk serve a different function when metric predictors are used. With metric predictors, the intercept (b0) serves to locate the point where the regression equation crosses the Y axis, and the other coefficients (b1 through bk) indicate the effect of the predictor variable(s) on the criterion variable (if any).
(6)
WHAT ARE THE DIFFERENCES BETWEEN INTERACTIVE AND CORRELATED PREDICTOR VARIABLES? DO ANY OF THESE DIFFERENCES AFFECT YOUR INTERPRETATION OF THE REGRESSION EQUATION? Answer The term interactive predictor variable is used to describe a situation where two predictor variables' functions intersect within the relevant range of the problem. The effect of this interaction is that over part of the relevant range one predictor variable may be considerably more important than the other; but over another part of the relevant range the second predictor variable may become the more important. When interactive effects are encountered, the coefficients actually represent averages of effects across values of the predictors rather than a constant level of effect. Thus, discrete ranges of influence can be misinterpreted as continuous effects.
When predictor variables are highly correlated, there can be no real gain in adding both of the variables to the predictor equation. In this case, the predictor with the highest simple correlation to the criterion variable would be used in the predictive equation. Since the direction and magnitude of change is highly related for the two predictors, the addition of the second predictor will produce little, if any, gain in predictive power. When correlated predictors exist, the coefficients of the predictors are a function of their correlation. In this case, little value can be associated with the coefficients since we are speaking of two simultaneous changes.
(7)
ARE INFLUENTIAL CASES ALWAYS TO BE OMITTED? GIVE EXAMPLES OF WHEN THEY SHOULD AND SHOULD NOT BE OMITTED? Answer The principal reason for identifying influential observations is to address one question: Are the influential observations valid representations of the population of interest? Influential observations, whether they be "good" or "bad," can occur because of one of four reasons. Omission or correction is easily decided upon in one case, the case of an observation with some form of error (e.g., data entry). However, with the other causes, the answer is not so obvious. A valid but exceptional observation may be excluded if it is the result of an extraordinary situation. The researcher must decide if the situation is one which can occur among the population, thus a representative observation. In the remaining two instances (an ordinary observation exceptional in its combination of characteristics or an exceptional observation with no likely explanation), the researcher has no absolute guidelines. The objective is to assess the likelihood of the observation occurring in the population. Theoretical or conceptual justification is much preferable to a decision based solely on empirical considerations.
Overview
This material reviews additional diagnostic procedures available to the research analyst to assess the underlying assumptions of multiple regression analysis. The following discussion includes the assessment of multicollinearity and the identification of influential observations. Although these concepts are particularly important in multiple regression analysis, they will also be referred to in later chapters.
Assessing Multicollinearity
Multicollinearity can be diagnosed in a two-step process: Step one: Identify all condition indices above 30. The condition index represents the collinearity of combinations of variables in the data set. Step two: For all condition indices above 30, identify variables with variance proportions above .50. o The regression coefficient variance-decomposition matrix shows the proportion of variance for each regression coefficient attributable to each eigenvalue (condition index). o A collinearity problem is present when a condition index identified in part one accounts for a substantial proportion of variance (.90 or above) for two or more coefficients.
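These condition indices and variance-decomposition proportions can be computed from a singular value decomposition of the column-scaled design matrix (a Belsley-type diagnostic). The sketch below uses simulated, deliberately collinear data; it is illustrative rather than a reproduction of the SPSS or SAS collinearity output discussed in the text:

# Sketch: condition indices and variance-decomposition proportions.
import numpy as np

rng = np.random.default_rng(8)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=100)
X = np.column_stack([np.ones(100), x1, x2])

Xs = X / np.linalg.norm(X, axis=0)          # scale each column to unit length
U, d, Vt = np.linalg.svd(Xs, full_matrices=False)

cond_index = d.max() / d                     # one condition index per dimension
phi = (Vt.T ** 2) / d ** 2                   # rows: coefficients, columns: dimensions
var_prop = phi / phi.sum(axis=1, keepdims=True)

print("condition indices:", np.round(cond_index, 1))
print("variance proportions (rows = coefficients):")
print(np.round(var_prop, 2))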
Step 1: Examine the residuals. o Studentized residuals identify observations which are outliers on the dependent variable. With large sample sizes, studentized residuals of greater than + 2.0 are substantial. o Dummy-variable regression may also be used to identify outliers on the dependent variable when the researcher has at least N + 1 (where N is the sample size) degrees of freedom available. A dummy variable is added for each observation, the model is reestimated, and those dummy variables with significant coefficients are outliers. o Partial regression plots visually portray all individual cases. Outliers can be identified as those cases which impact the regression slope and the corresponding regression equation coefficients.
Step 2: Identify leverage points from the predictors. o Hat matrix represents the combined effects of all independent variables for each case and enables the researcher to identify multivariate leverage points. Diagonal of the matrix represents the distance of the observation from the mean center of all other observations on the independent variables. Interpretation: Large values indicate that the observations carry disproportionate weight in determining the dependent variable. Threshold values: When the number of predictors is greater than 10 and the sample size exceeds 50, a value is large if it exceeds 2p/n (where p is the number of predictors and n is the sample size); otherwise, use 3p/n. o Mahalanobis distance, as discussed in Chapter 2, is also used to identify outliers.
Step 3: Identify single case influential observations. o Studentized deleted residual (the studentized residual for observation i when observation i is deleted from calculation of the regression equation) may be used to identify extremely influential, individual observations. Values greater than + 2.0 indicate an influential case.
o DFBETA reflects the relative change in the regression coefficient when an observation is deleted. For small or medium sample sizes, values greater than 1.0 may be considered influential; for large sample sizes, values exceeding 2 divided by the square root of n are influential. o COVRATIO is similar to DFBETA, but is different in that it considers all coefficients collectively rather than individually. Values of COVRATIO - 1 which exceed ± 3p/n are indicative of influential observations. o Cook's distance measures the influence of an observation based on the size of changes in the predicted values when the case is omitted (residuals) and the observation's distance from the other observations (leverage). Values greater than 1.0 should be considered influential. o DFFIT measures the degree to which fitted values change when the case is deleted. Values exceeding 2 * square root of (p/n) are indicative of influential observations. Step 4: Select and accommodate influential observations. o Use of multiple measures: The above steps should converge on the identification of observations which may adversely affect the analyses. The researcher should never classify an observation based on a single measure, but should always conduct a number of diagnostics to identify truly influential observations. o Remedy: Once identified, influential observations may be deleted where justified. If the researcher is unable to delete the observations, a more robust estimation technique should be used.
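These single-case measures are available from statsmodels' influence diagnostics; the sketch below (simulated data with one case deliberately perturbed) flags observations using the thresholds just listed, as one possible way to combine multiple measures:

# Sketch: single-case influence measures (Cook's distance, DFFITS, DFBETAS).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
X = sm.add_constant(rng.normal(size=(60, 2)))
y = X @ np.array([1.0, 0.5, -0.4]) + rng.normal(size=60)
y[0] += 8                                      # plant one influential case

infl = sm.OLS(y, X).fit().get_influence()
summary = infl.summary_frame()                 # includes cooks_d, dffits, hat_diag, ...

n, p = X.shape
flagged = (
    (summary["cooks_d"] > 1.0)
    | (np.abs(summary["dffits"]) > 2 * np.sqrt(p / n))
    | (np.abs(infl.dfbetas) > 2 / np.sqrt(n)).any(axis=1)
)
print("potentially influential observations:", np.where(flagged)[0])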
5. Discriminant analysis is also sensitive to the sample sizes of each group. At minimum, smallest group size must exceed the number of independent variables. Each group should have at least 20 observations.
Groups of similar relative size avoid adverse effects on estimation and classification. 6. The sample should be divided into an estimation sample and a validation sample. Data used to derive the discriminant function cannot also be used to validate the function. If the researcher uses all of the available data to derive the discriminant function, then the validation of that function on the same data is upwardly biased. By dividing the sample into two parts, the validation sample is not biased and also possesses the same properties as the analysis sample. Division of the sample is most often equal (i.e. 50% estimation / 50% validation); however, there is no rule specifying a certain division. A proportional stratified sampling procedure is used to select the observations in the validation sample. Thus, the sizes of the groups will be proportionate to the total sample distribution.
2. Discriminant analysis assumes multivariate normality of the independent variables. The data distribution for the independent variables is assumed to be the normal distribution, a requirement for use of the F test.
3. Discriminant analysis assumes unknown, but equal dispersion and covariance structures (matrices) for the groups as defined by the dependent variable. This assumption is necessary for the maximization of the ratio of variance between groups to the variance within groups. Unequal covariance matrices can have an adverse effect on classification. Equality is assessed by the Box's M test for homogeneity of dispersion matrices. Based on the problem's origin, remedies for violations of this assumption may be: o increasing sample size, o computing group-specific covariance matrices, o using quadratic classification techniques. 4. Discriminant analysis is adversely affected by collinearity of the independent variables. When selecting independent variables with a stepwise method, multicollinearity may impede inclusion of certain variables in the variate and impact interpretation.
5. Discriminant analysis implicitly assumes that all relationships are linear. Nonlinear relationships can be remedied with the appropriate transformations.
6. Outliers adversely impact the classification accuracy of discriminant analysis. The analyst should complete diagnostics for influential observations, and outliers which are not representative of the population should be deleted.
1. To derive discriminant functions, the researcher must choose a computational method. Two methods from which to choose: simultaneous and stepwise methods. Simultaneous: all of the independent variables, regardless of discriminatory power, are used to compute the discriminant function(s). o Best use: when the researcher has some theoretical reason to include all of the independent variables. Stepwise: independent variables are chosen one at a time on the basis of their discriminating power. Through a series of "step" iterations similar to stepwise regression, only those variables that provide unique discriminatory power will be included in the analysis. o Best use: when the researcher has a large number of independent variables and wishes to select the best combination of predictor variables. 2. After the discriminant function is computed, the researcher must assess its level of statistical significance. Criteria: Wilks' lambda, Hotelling's trace, Pillai's criterion, Roy's greatest characteristic root, Mahalanobis' distance, and Rao's V measures. Significance level: The conventional criterion of .05 or beyond is most often used. 3 or more groups: If the dependent variable has three or more groups, analysts must: o carefully evaluate the statistical significance of each discriminant function, and o recognize that some discriminant functions may not contribute significantly to overall discriminatory power.
3. Overall fit of the discriminant function may be assessed by constructing classification matrices. Group sizes: When developing a classification matrix, the analyst must decide whether or not the observed group sizes in the sample are representative of the group sizes in the population. The default is to assume that the population group sizes have an equal chance of occurring. Cutting Score: Used to construct a classification matrix. The optimal cutting score (critical Z value) depends on whether the sizes of the groups are equal or unequal. Individual discriminant scores for the observations in the validation, or holdout, sample are compared to the cutting score and thereby classified into a group. Interpretation:
o The diagonal values of the classification matrix represent the number of respondents correctly classified. o Off-diagonal values represent the incorrect classifications. o The percentage correctly classified is shown for each group and for the overall sample (the hit ratio); the overall percentage correctly classified is shown at the bottom. 4. The acceptable level of predictive, or classification, accuracy of the discriminant function may be assessed via comparison to several criteria. Chance criteria: the hit ratio may be compared to the percentage of respondents who would be correctly classified by chance, with the following chance criteria available: o Equal chance criterion: When the sample sizes of the groups are equal, the percentage of correct classification by chance is equal to 1 divided by the number of groups. o Maximum chance criterion: The percentage of correct classification by chance is based on the sample size of the largest group. To accurately predict based on this criterion, the hit ratio should exceed the percentage equal to the proportional size of the largest group in the sample.
o Proportional chance criterion: When the sample sizes of the groups are unequal, the percentage of correct classification by chance is equal to the sum of the squared proportions of respondents in each group. o Classification accuracy should be at least one fourth greater than that achieved by chance. Press's Q statistic: a statistical test that computes a value based on the number of correct classifications, the total sample size, and the number of groups, and compares this value to a critical value (the chi-square value for 1 degree of freedom at the desired confidence level). o Sensitive to sample size. Large samples are more likely to show significance than small samples with the same classification rate.
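For a hypothetical two-group classification matrix, the chance criteria and Press's Q could be computed as follows; the confusion-matrix values are invented for illustration, and Press's Q is taken as [N - nK]^2 / [N(K - 1)], with n = number correctly classified and K = number of groups:

# Sketch: hit ratio, chance criteria, and Press's Q for a hypothetical matrix.
import numpy as np
from scipy import stats

confusion = np.array([[42, 8],     # rows = actual group, cols = predicted group
                      [10, 40]])
N = confusion.sum()
n_correct = np.trace(confusion)
K = confusion.shape[0]
group_props = confusion.sum(axis=1) / N

hit_ratio = n_correct / N
proportional_chance = np.sum(group_props ** 2)
maximum_chance = group_props.max()
press_q = (N - n_correct * K) ** 2 / (N * (K - 1))

print(f"hit ratio: {hit_ratio:.2%}")
print(f"proportional chance criterion (x1.25): {1.25 * proportional_chance:.2%}")
print(f"maximum chance criterion: {maximum_chance:.2%}")
print(f"Press's Q = {press_q:.1f} vs chi-square(1, .05) = {stats.chi2.ppf(0.95, 1):.2f}")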
2. Interpretation of two or more discriminant functions is somewhat more complicated. The analyst must now determine the relative importance of the independent variables across all the discriminant functions. Rotation: Simplifies the profiling of each discriminant function. Rotations, similar to those done in factor analysis, do not change the structure of the solution, but make the functions easier to interpret. Variable Importance Across Multiple Functions: The relative importance of each independent variable across all significant discriminant functions can be determined with: o Potency Index is a composite or summary measure that indicates which independent variables are most discriminating across all discriminant functions. o Stretching the vectors is one approach used to identify the relative importance of independent variables. Vectors are created by drawing a line from the origin to a point representing each discriminant loading multiplied by its respective univariate F value. The length of the vector is indicative of the relative importance of each variable in discriminating among the groups.
Logistic Regression: Regression with a Binary Dependent Variable What is logistic regression?
1. Logistic regression is another multivariate technique that forms a variate (a linear combination of metric independent variables) used to predict the classification of a categorical dependent variable. 2. It is similar to discriminant analysis in its classification approach, producing predictions of group membership for each observation. 3. It is different, however, in that it can only accommodate a binary or two-group categorical dependent variable. Logistic regression is thus comparable to a two-group discriminant analysis. 4. It operates in a manner quite similar to multiple regression, although it utilizes a maximum likelihood estimation procedure rather than the ordinary least squares approach used in multiple regression. The coefficients in the variate can be interpreted in a manner quite similar to regression.
Logistic regression differs from multiple regression in several areas. The following discussions highlight these differences: Unique nature of the dependent variable as represented by the logistic curve Estimating the model using the odds and logit values Differing measures of model estimation fit Interpreting the different types of estimated logistic coefficients
1. How does logistic regression represent the dependent measure in a form amenable to model estimation in a manner similar to regression? Logistic regression uses the logistic curve, an S-shaped curve that levels off as it approaches zero and one, to portray the predicted value for any observation. In this manner, it avoids the problems found in multiple regression where the predicted values of a binary measure can go lower than zero and higher than one. The logistic curve represents the probability that an observation is in one group or another (let's call them groups 0 and 1). If the probability is less than 50% then the observation is classified in group 0, and if it is greater than 50% it is classified in group 1. The objective is to estimate a logistic variate that predicts low probability values for all the observations in group 0 and high probability values for all the observations in group 1. Using the logistic curve also addresses the assumptions of normally distributed error terms and homoscedasticity that a binary measure would otherwise violate. In doing so, it allows for a regression-like procedure to be used rather than the means difference approach seen in discriminant analysis.
2. What are the odds ratio and logit values used in model estimation? Since logistic regression uses the logistic curve as the predicted value of the logistic regression variate, it must find a way to transform the probability value so that the predicted value can never fall below zero or above one. This transformation is achieved by expressing the probability as odds: the ratio of the probability of being in one group to the probability of being in the other group. In this form, any probability is stated in a metric form that can be directly estimated. Likewise, any estimated odds value can be converted back to a probability value.
o Example: Assume that the probability of being in Group 1 was 80%, making the probability of being in Group 0 equal to 20%. If we are expressing the odds of being in Group 1, then the odds are 4.0 (i.e., 80% / 20%), or it is four times as likely to be in Group 1. If we were expressing the odds of being in Group 0, then they would be .25 (i.e., 20% / 80%), or there is one-fourth the chance of being in Group 0 compared to Group 1. But there is one additional problem: How do we keep the odds value from going below zero, which is the theoretical minimum? This is solved by taking the logarithm of the odds and creating the logit value. Odds less than 1.0 will have negative logit values and those above 1.0 will have positive logit values. No matter how low the logit value goes, it can still be transformed back to an acceptable odds value by taking the antilog. As will be seen, the estimated logistic coefficients can be stated in terms of estimating the odds value or the logit value, providing two forms of the coefficients. In the last section we will examine how to interpret each form of the coefficient and what diagnostic information is best derived from each.
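The example works out as follows in a few lines of code (the probability value is the same 80% used above):

# Sketch: converting a probability to odds and a logit value, and back.
import numpy as np

p = 0.80                          # probability of membership in group 1
odds = p / (1 - p)                # 4.0 -> four times as likely to be in group 1
logit = np.log(odds)              # logit value (log odds)

odds_back = np.exp(logit)         # antilog recovers the odds
p_back = odds_back / (1 + odds_back)
print(odds, logit, p_back)        # 4.0  1.386...  0.8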
3. Since logistic regression uses maximum likelihood rather than the ordinary least squares used in multiple regression, how is overall model fit evaluated? The maximum likelihood method fits the likelihood value, and the measure of fit is -2 times the log of the likelihood value, often expressed as -2LL. The minimum value is 0, corresponding to perfect fit. Model comparisons are made by comparing the difference in -2LL. A typical comparison is: o Estimate a null model: acting as the baseline, it is generally estimated as a model with no independent variables. As such, any improvements in fit can be attributed to the independent variables added to the variate. o Estimate the proposed model: add the independent variables in the specified model. o Assess the -2LL difference: If the -2LL difference is statistically significant, then we can assume that the overall model fit is statistically significant.
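This comparison might look like the following sketch, which fits an intercept-only baseline and a proposed model on simulated data and tests the -2LL difference against a chi-square distribution (the data and number of variables are illustrative assumptions):

# Sketch: comparing -2LL of a null and a proposed logistic model.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(10)
X = rng.normal(size=(300, 2))
p = 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1])))
y = rng.binomial(1, p)

null_fit = sm.Logit(y, np.ones((300, 1))).fit(disp=0)      # intercept-only baseline
full_fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)     # proposed model

diff = -2 * null_fit.llf - (-2 * full_fit.llf)             # -2LL difference
df = 2                                                      # variables added
print("-2LL difference:", diff, " p =", stats.chi2.sf(diff, df))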
There are also pseudo R2 measures which are calculated to represent the percent of variation explained, similar to R2 in multiple regression. Predictive accuracy is assessed in the same manner as discriminant analysis, using classification ratios and evaluating the hit ratio, which is the percentage correctly classified. 4. How do the two forms of logistic coefficients represent the relationship between the independent and dependent variables? The two forms of logistic coefficients relate to the two ways in which the dependent variable can be represented: as odds or as the logit value. As such, it is essential to understand what each form of coefficient represents in terms of interpretation of the variate: o Logistic coefficient: estimated in the original model form where the logit value acts as the dependent variable. o Exponentiated logistic coefficient: a transformation of the logistic coefficient (the antilog of the logistic coefficient) that reflects changes in the odds value. While each form of coefficient can be used to assess both directionality and magnitude of the relationship, it is easiest to use separate forms of the logistic coefficient for each assessment: o Directionality of the relationship can be determined directly from the logistic coefficients, where the signs (positive or negative) represent the type of relationship between independent and dependent variable. o Magnitude of the relationship: This is best determined with the exponentiated coefficient, where the percentage change in the dependent variable (the odds value) is shown by the calculation (exponentiated coefficient - 1.0) * 100. Since the relationship between independent and dependent variables is nonlinear (i.e., the S-shaped logistic curve), the amount of change in the dependent variable for a unit change in the independent variable depends on where on the logistic curve the value of the independent variable occurs. o If the independent variable value falls close to the 50%
probability value, then one can expect a larger impact on probability for a change in the independent variable. o But if the independent variable value falls in the high or low probability ranges, then the impact is less. o For example, if the value of the independent variable is associated with a 90% probability value, then it is much harder to increase the probability value since this is in the flatter portion of the curve. But if the probability is close to 50%, it is much easier to increase or decrease the probability since the curve is much steeper in this area.
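The two interpretations can be carried out with a couple of lines; the coefficient values below are invented purely to illustrate the (exponentiated coefficient - 1.0) * 100 calculation:

# Sketch: directionality from logistic coefficients, magnitude from the
# exponentiated coefficients (illustrative values only).
import numpy as np

b = np.array([0.47, 1.18, -0.83])            # illustrative estimated coefficients
exp_b = np.exp(b)                            # exponentiated (odds) coefficients
pct_change_in_odds = (exp_b - 1.0) * 100
print(exp_b)
print(pct_change_in_odds)                    # % change in odds per unit change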
(3)
WHAT CRITERIA COULD YOU USE IN DECIDING WHETHER TO STOP A DISCRIMINANT ANALYSIS AFTER ESTIMATING THE DISCRIMINANT FUNCTION(S)? AFTER THE INTERPRETATION STAGE? Answer a. Criterion for stopping after derivation. The level of significance must be assessed. If the function is not significant at a predetermined level (e.g., .05), then there is little justification for going further. This is because there is little likelihood that the function will classify more accurately than would be expected by randomly classifying individuals into groups (i.e., by chance).
b. Criterion for stopping after interpretation. Comparison of the "hit ratio" to some criterion. The minimum acceptable percentage of correct classifications usually is predetermined.
(4)
WHAT PROCEDURE WOULD YOU FOLLOW IN DIVIDING YOUR SAMPLE INTO ANALYSIS AND HOLDOUT GROUPS? HOW WOULD YOU CHANGE THIS PROCEDURE IF YOUR SAMPLE CONSISTED OF FEWER THAN 100 INDIVIDUALS OR OBJECTS? Answer When selecting individuals for analysis and holdout groups, a proportionately stratified sampling procedure is usually followed. The split in the sample typically is arbitrary (e.g., 50-50 analysis/hold-out, 60-40, or 75-25) so long as each "half" is proportionate to the entire sample. There is no minimum sample size required for a sample split, but a cut-off value of 100 units is often used. Many researchers would use the entire sample for analysis and validation if the sample size were less than 100. The result is an upward bias in statistical significance which should be recognized in analysis and interpretation.
(5)
HOW DO YOU DETERMINE THE OPTIMUM CUTTING SCORE? Answer a. For equal group sizes, the optimum cutting score is defined by:
Z_CE = (Z_A + Z_B) / 2
where Z_CE = critical cutting score value for equal size groups, Z_A = centroid for group A, and Z_B = centroid for Group B.
b. For unequal group sizes, the optimum cutting score is defined by:
Z_CU = (N_A * Z_A + N_B * Z_B) / (N_A + N_B)
where Z_CU = critical cutting score value for unequal size groups, N_A = sample size for group A, and N_B = sample size for Group B.
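A quick numerical illustration of these formulas (the centroid and group-size values are invented):

# Sketch: computing both optimal cutting scores from illustrative values.
z_a, z_b = -1.2, 0.9          # group centroids on the discriminant function
n_a, n_b = 60, 40             # group sample sizes

z_ce = (z_a + z_b) / 2                              # equal group sizes
z_cu = (n_a * z_a + n_b * z_b) / (n_a + n_b)        # unequal group sizes (weighted)
print(z_ce, z_cu)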
(6)
HOW WOULD YOU DETERMINE WHETHER OR NOT THE CLASSIFICATION ACCURACY OF THE DISCRIMINANT FUNCTION IS SUFFICIENTLY HIGH RELATIVE TO CHANCE CLASSIFICATION? Answer Some chance criterion must be established. This is usually a fairly straightforward function of the classifications used in the model and of the sample size. The authors suggest the following criterion: the classification accuracy (hit ratio) should be at least 25 percent greater than by chance. Another test would be to use a test of proportions to examine for significance between the chance criterion proportion and the obtained hit-ratio proportion.
(7)
HOW DOES A TWO-GROUP DISCRIMINANT ANALYSIS DIFFER FROM A THREE-GROUP ANALYSIS? Answer In many cases, the dependent variable consists of two groups or classifications, for example, male versus female. In other instances, more than two groups are involved, such as a three-group classification involving low, medium, and high classifications. Discriminant analysis is capable of handling either two groups or multiple groups (three or more). When two classifications are involved, the technique is referred to as two-group discriminant analysis. When three or more classifications are identified, the technique is referred to as multiple discriminant analysis.
(8)
WHY SHOULD A RESEARCHER STRETCH THE LOADINGS AND CENTROID DATA IN PLOTTING A DISCRIMINANT ANALYSIS SOLUTION? Answer Plots are used to illustrate the results of a multiple discriminant analysis. By using the statistically significant discriminant functions, the group centroids can be plotted in the reduced discriminant function space so as to show the separation of the groups. Plots are usually produced for the first two significant functions. Frequently, plots are less than satisfactory in illustrating how the groups differ on certain variables of interest to the researcher. In this case stretching the discriminant loadings and centroid data, prior to plotting the discriminant function, aids in detecting and interpreting differences between groups. Stretching the discriminant loadings by considering the variance contributed by a variable to the respective discriminant function gives the researcher an indication of the relative importance of the variable in discriminating among the groups. Group centroids can be stretched by multiplying by the approximate F-value associated with each of the discriminant functions. This stretches the group centroids along the axis in the discriminant plot that provides more of the accounted-for variation.
(9)
HOW DO LOGISTIC REGRESSION AND DISCRIMINANT ANALYSES EACH HANDLE THE RELATIONSHIP OF THE DEPENDENT AND INDEPENDENT VARIABLES? Answer Discriminant analysis derives a variate, the linear combination of two or more independent variables that will discriminate best between the dependent variable groups. Discrimination is achieved by setting variate weights for each variable to maximize between group variance. A discriminant (z) score is then calculated for each observation. Group means (centroids) are calculated and a test of discrimination is the distance between group centroids. Logistic regression forms a single variate more similar to multiple regression. It differs from multiple regression in that it directly predicts the probability of an event occurring. To define the probability, logistic regression assumes the relationship between the independent and dependent variables resembles an S-shaped curve. At very low levels of the independent variables, the probability approaches zero. As the independent variable increases, the probability increases. Logistic regression uses a maximum likelihood procedure to fit the observed data to the curve.
(10)
WHAT ARE THE DIFFERENCES IN ESTIMATION AND INTERPRETATION BETWEEN LOGISTIC REGRESSION AND DISCRIMINANT ANALYSIS? Answer Estimation of the discriminant variate is based on maximizing between-group variance. Logistic regression is estimated using a maximum likelihood technique to fit the data to a logistic curve. Both techniques produce a variate that gives information about which variables explain the dependent variable or group membership. Logistic regression may be more comfortable for many researchers to interpret in that it resembles the more commonly seen regression analysis.
(11)
EXPLAIN THE CONCEPT OF ODDS AND WHY IT IS USED IN PREDICTING PROBABILITY IN A LOGISTIC REGRESSION PROCEDURE. Answer One of the primary problems in using any predictive model to estimate probability is that it is difficult to constrain the predicted values to the appropriate range. Probability values should never be lower than zero or higher than one. Yet we would like a straightforward method of estimating the probability values without having to utilize some form of nonlinear estimation. The odds ratio is a way to express any probability value in a metric value which does not have inherent upper and lower limits. The odds value is simply the ratio of the probability of being in one of the groups divided by the probability of being in the other group. Since we only use logistic regression for two-group situations, we can always calculate the odds ratio knowing just one of the probabilities (since the other probability is just 1 minus that probability). The odds value provides a convenient transformation of a probability value into a form more conducive to model estimation.
1. The following three types of questions are appropriate objectives for MANOVA Multiple Univariate Research Questions: The researcher identifies a number of separate dependent variables that are to be analyzed separately, but needs some control over the experimentwide error rate. MANOVA is used to assess whether an overall difference is found between groups; then the separate univariate tests are employed to address each dependent variable. Structured Multivariate Research Questions: The researcher gathers data which have two or more dependent measures that have specific relationships between them. A common type of structured question would be a repeated measures design. Intrinsically Multivariate Research Questions: The researcher wishes to address how a set of dependent measures differ as a whole across groups. The collective effect of several variables is of interest, not the individual effects. 2. Only those dependent variables which have a sound conceptual or theoretical basis should be selected for inclusion in the analysis. Inclusion of irrelevant variables may adversely affect the resulting conclusions of an analysis. This is especially so when the researcher's objective is to learn about the collective effect.
acceptable) and the effects of each treatment may be described. Significant disordinal interactions require the redesign of the study; the main effects cannot be interpreted. 4. The researcher must decide on the use of a covariate. A covariate is a metric independent variable that is regressed on the dependent variables to eliminate its effect before using the dependent variable in the analysis. Thus, covariates remove effects on the dependent variable before assessing any main effects from the independent variables. Covariates are not entered into MANOVA as independent variables (factors) because they are metric, and we would lose too much information if they were made categorical. A covariate is used to either:
o eliminate some systematic error outside the control of the researcher, which may bias the results, or o account for differences in the responses due to unique characteristics of the respondents. Nature of covariate: Highly correlated with the dependent variable, but not correlated with the independent variables. Number of covariates included in the analysis should be less than (.10 * sample size) - (number of groups - 1). Requirements for Use of a Covariate:
o Must have some relationship with the dependent measures. o Must have a homogeneity of regression effect, meaning that they have equal effects on the dependent variable across groups.
2. The variance-covariance matrices must be equal for all treatment groups. Equality of variance is assumed across the dependent variables for each group. The Box's M test may be used to test the equality of the covariance matrices.
Violation of this assumption has minimal impact if the groups are of approximately equal size. 3. The observations must be independent. Each observation or subject's response must be independent from all others. If any situation arises in which some connection is made between observations and not accounted for in the procedures, significant biases can occur. Violation of this assumption is the most serious. If violations are detected, the researcher can combine observations within a group and analyze the group's average score instead of the scores of separate respondents. In addition, if a violation occurs, the researcher may employ a covariate to account for the dependence.
Impact of large samples: The equal variance-covariance matrices test most likely will be violated with a very large sample size. As an alternative, examine the F statistic and chi-square for additional information. As a rule of thumb, there should be at least one chi-square for each degree of freedom. While this is only a rule of thumb, ratios in this area will normally perform well with MANOVA. 5. Dependent variables must be linearly related and exhibit low multicollinearity.
Nonlinear relationship: if detected among the dependent variables, an appropriate transformation should be conducted. Multicollinearity among the dependent variables indicates redundancy and decreases statistical efficiency.
1. Criteria to assess multivariate differences across groups: Roy's greatest characteristic root, Wilks' lambda, Hotelling's trace, and Pillai's criterion. Pillai's criterion or Wilks' lambda are the most immune to violations of the assumptions and maintain the greatest power, while Roy's gcr is most powerful if all assumptions are met. Pillai's criterion is more robust if the sample size is small, unequal cell sizes are present, or homogeneity of covariances is violated.
2. The level of power of a statistical test is based on the alpha level, the effect size of the treatment and the sample size of the groups.
Power is inversely related to alpha. If the alpha level is set too conservatively, the power of the test may be too low for valid results to be identified. Effect size is directly related to the power of the statistical test for a given sample size. The larger the effect size, the greater the power of the test (i.e., the greater the standardized differences between groups, the more probable that the statistical test will identify a treatment's effect if it exists). Increasing the sample size increases the power by reducing sampling error. However, sample sizes greater than 150 per group do not contribute greatly to increasing the power of the test. In fact, with very large sample sizes, the test may become overly sensitive, identifying almost any difference as significant.
o A priori tests. The analyst specifies which group comparisons are to be made instead of testing all possible combinations. Thus, a priori tests are more powerful than post hoc tests. Context for use: A priori tests are most appropriate when the analyst has conceptual bases which support the selection of specific comparisons. A priori comparisons should not be used as an exploratory technique. Identification of differences between groups by post hoc or a priori statistical tests: o Single dependent variable contribution assessment has the potential to inflate Type I error when running several consecutive a priori tests, such as univariate tests. o Adjustment for potential Type I error inflation involves the use of the Bonferroni inequality or a step-down analysis.
1. Replication is the primary means of validation of MANOVA results. Exact replication may be difficult in certain research contexts (such as survey research). Covariate usage is dictated when the researcher is knowledgeable of characteristics of the population which may affect the dependent variables. 2. Significant MANOVA results do not necessarily support causation. Several criteria must be met before the researcher can suggest causation. Causation can never be proved.
In a way, MANOVA and discriminant analysis are mirror images. The dependent variables in MANOVA (a set of metric variables) are the independent variables in discriminant analysis. The single nonmetric dependent variable of discriminant analysis becomes an independent variable in MANOVA. Moreover, both use similar methods in forming the variates and assessing statistical significance between groups. Use of one technique over the other primarily depends upon the research objective. Discriminant analysis employs a single nonmetric variable as the dependent variable. The independent metric variables are used to form variates that maximally differentiate between the groups formed by the dependent variable. The objective is to determine the independent variables that discriminate between groups. In MANOVA, the set of metric variables now act as dependent variables and the objective becomes finding groups of respondents that exhibit differences on the set of dependent variables.
(2)
DESIGN A TWO-WAY MANOVA EXPERIMENT. WHAT ARE THE DIFFERENT SOURCES OF VARIANCE IN YOUR EXPERIMENT? WHAT WOULD THE INTERACTION TEST TELL YOU?
Answer a. Requirements for two-way MANOVA: 1) Two (or more) metric dependent variables. 2) Two (or more) nonmetric experimental (treatment) variables. The experimental design is a 2 x 2 (n x n) matrix of independent nonmetric variables. 3) Subjects are assigned at random, but in equal numbers to each of the cells. 4) Statistics are calculated for each cell: a. totals for both (all) dependent variables b. sums of squares for both (all) dependent variables c. sums of products of dependent variables
5) Marginals are computed. b. There are four sources of variance: 1) between columns (treatments) 2) between rows (factors) 3) interactions between factors and treatments 4) residual error c. In factorial designs (n x n) the interaction test would aid in discovering an interaction effect, in other words, the joint effect of the treatment variables in addition to the individual main effects on the dependent variable(s).
(3)
BESIDES THE OVERALL, OR GLOBAL, SIGNIFICANCE, THERE ARE AT LEAST THREE APPROACHES TO DOING FOLLOW-UP TESTS: (A) USE OF SCHEFFE' CONTRAST PROCEDURES; (B) STEP-DOWN ANALYSIS, WHICH IS SIMILAR TO STEPWISE REGRESSION IN THAT EACH SUCCESSIVE F-STATISTIC IS COMPUTED AFTER ELIMINATING THE EFFECTS OF THE PREVIOUS DEPENDENT VARIABLES; AND (C) EXAMINATION OF THE DISCRIMINANT FUNCTION(S). NAME THE PRACTICAL ADVANTAGES AND DISADVANTAGES OF EACH OF THESE APPROACHES.
Answer a. Scheffe' contrast procedures: tests for differences between groups on any dependent variable.
These procedures ensure that the probability of any Type I error across all comparisons will be held to α = .05 (or at the level specified by the researcher). A disadvantage in using the Scheffe' test is that it requires the use of the gcr distribution. If the Scheffe' test is to be used, then the most appropriate overall test would be the gcr statistic in MANOVA. b. Step-down analysis: similar to F-tests but allows for correlation among dependent variables. Analogous to step-wise regression in concept. May overlook a significant dependent (independent) variable due to its high correlation with another dependent (independent) variable. c. Multiple discriminant analysis of the SSCP matrix: the relative importance of each independent variable can be identified by deriving correlations between each original dependent variable and the discriminant function. Major areas of differences between groups can be identified.
(4)
HOW IS STATISTICAL POWER AFFECTED BY STATISTICAL AND RESEARCH DESIGN DECISIONS? HOW WOULD YOU DESIGN A STUDY TO ENSURE ADEQUATE POWER?
Answer The primary factors affecting power can be assessed prior to a study: estimated effect size, desired alpha level, the number of dependent variables, and sample size. To ensure adequate power, the researcher should estimate the effect size and the sample size needed to achieve the desired level of power given the required alpha. In the design of the study, the researcher should consider using as few dependent variables as possible, especially if they are correlated.
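To make the design guidance above concrete, the following is a minimal sketch of a power calculation, assuming the Python statsmodels package and a univariate ANOVA approximation applied to one dependent variable at a time (true MANOVA power also depends on the number and intercorrelation of dependent variables); the effect size, alpha, and group count shown are hypothetical planning values.

# Hypothetical planning example: total sample size needed to detect a medium
# effect (Cohen's f = 0.25) across 3 treatment groups at alpha = .05 with
# power = .80, using a univariate ANOVA approximation for a single
# dependent variable.
from statsmodels.stats.power import FTestAnovaPower

analysis = FTestAnovaPower()
n_total = analysis.solve_power(effect_size=0.25, alpha=0.05, power=0.80,
                               k_groups=3)
print(f"Approximate total sample size required: {n_total:.0f}")

# Conversely, estimate the power achieved with a fixed total sample of 120.
power = analysis.solve_power(effect_size=0.25, alpha=0.05, nobs=120,
                             k_groups=3)
print(f"Power with N = 120: {power:.2f}")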
(5) DESCRIBE SOME DATA ANALYSIS SITUATIONS IN WHICH MANOVA AND MANCOVA WOULD BE APPROPRIATE IN YOUR AREAS OF INTEREST. WHAT TYPES OF UNCONTROLLED VARIABLES OR COVARIATES MIGHT BE OPERATING IN EACH OF THESE SITUATIONS?
Answer There are a wide variety of applications possible in the areas of psychology and education. Examples of the use of these techniques in these two fields may be found in the selected readings at the end of the chapter. A wide variety of applications are also possible in the area of marketing. One type of experiment which might be carried out in advertising research would be to test the effects of two broadcast communications media at three different times of the day on consumer knowledge and intention to buy simultaneously. Covariates in such an experiment might include the sex, age, or education level of the respondents. These could be controlled for after the experiment if these variables did indeed have an effect on the outcome of the test. Another type of experiment might be to test the effect of a point-of-purchase display (present or absent) against newspaper advertising. Two cities could be selected which possess similar demographic profiles. The local newspaper in only one city would carry ads about the specific product. Some stores in each city would be selected for the point-of-purchase displays and others selected for observation without the displays. Dependent variables to be observed might include levels of traffic on the aisles containing the product and the proportion of purchases containing the item of interest. Covariates might include frequency of shopping trips and readership of both newspapers. Similar problems of interest might occur in any discipline where experimental design is of concern.
Use estimates of purchaser or customer judgments to predict market shares among objects with differing sets of features (other things held constant). Isolate groups of potential customers who place differing importance on the features in order to define high and low potential segments. Identify marketing opportunities by exploring the market potential for feature combinations not currently available.
All positive and negative factors/attributes which impact (add to or detract from) the overall worth of the product / service should be included in the model. Limited to making statements pertaining to the variables and levels used in the analysis. We cannot interpolate between variables or levels of variables. In its most general form of part-worth utilities, the conjoint procedure uses categorical relationships between variables so there is no assumption of a linear relationship. Assumption is that the model contains all the needed dimensions to make the choice (i.e., inclusion of all determinant attributes), so the researcher must ensure that the specified attributes define the total worth of the products. 3. Determinant factors can be specified and are limited to what we specify as the basis for decision. Implicit is that the researcher can specify the dimensions or variables upon which a decision is based, and even further that the dimensions we specify are the only dimensions used. Conjoint analysis requires some a priori basis for selection of variables. The justification may be theoretical or derived from other research, such as a survey to determine the appropriate variables to include.
Factors must be precise and perceptually distinct. In other words, the variables must be singular, concrete attributes that elicit the same interpretation from all respondents. Descriptions of factors must be easily understood by respondents. The dimensions upon which choice is made should be stated in very tangible terms. Factors must be capable of being verbalized or written in order to be operationalized in conjoint analysis.
3. Dimensionality and number of attributes. Unidimensionality is required of all variables used in conjoint analysis. Variables that are multidimensional may lead to interpretation problems: one respondent may rate the variable with low importance, while another considers the same variable to have high importance, because the two respondents were considering two different dimensions of the same variable. The number of variables used in the analysis must balance the needs of complexity and specificity. Including too many variables results in too complex a design, which requires consumers to make endless hypothetical preference judgments and researchers to complete intricate analyses. However, too few variables in the design will not provide the level of specificity needed to bring any validity to the model.
4. Number and range of levels. Reasonability: The number of levels for each variable should reflect the most reasonable levels expected by the consumer. Believability: Levels outside the range of believability only weaken the model and provide spurious results. Balance in number of levels: The researcher should balance the number of levels across variables. Unequal levels across variables may adversely impact the consumer's perception of the relative importance of the variables. Complexity of design: As the number of variables and the levels of each variable increase, the design rapidly becomes very complex. The researcher should use only the necessary variables and the required levels for each variable.
Just as found in other multivariate techniques, the independent variables should not have any substantial degree of collinearity. Collinearity typically results from the basic character of the variable itself, not from the levels of a variable. Interattribute correlation has the tendency to result in unbelievable or otherwise unacceptable stimuli. The solutions are:
o eliminate one of the attributes, but this impacts validity when the deleted attribute affects the utility of the object
o create "super attributes" that are combinations of the correlated variables (e.g., miles per gallon and acceleration are combined into a new attribute termed "performance")
o construct the set of stimuli to exclude the unacceptable stimuli, typically known as prohibited pairs. This may impact the orthogonality of the stimulus design as well as design efficiency.
o constrain the estimation process to be sure part-worths correspond to the prespecified relationship.
6. The researcher must determine whether an additive or an interactive composition rule is appropriate. An additive model with no interactions is normally assumed. This type of model is widely used in consumer research and simplifies the implementation process. All consumers are assumed to use the same choice rule. Conjoint does not make allowances for multiple choice rules (composition rules) in the same data set. When a conjoint analysis is to be performed, the researcher must establish a priori the choice rule to be used for the entire group.
7. The researcher must determine how the levels of a factor are related. Each type of relationship can be specified separately; however, such a choice may produce less efficient and less reliable estimates. The researcher must consider the trade-offs between selecting the type of relationship that is most like the preference formation of consumers and producing reliable estimates.
8. The researcher must select a presentation method. The three most popular forms of presentation are:
Full Profile: By far the most widely used, full profile presents to the respondent a series of hypothetical objects, each derived from a combination of a level from each specified independent variable. It has the advantage of being more representative of the actual decision-making process followed by respondents.
Trade-off: Trade-off considers each pair of attributes and asks respondents to indicate the preference order for each combination of attribute levels. While attempting to approximate the consumer's true nature, it too often presents an artificial decision-making context.
Pairwise comparison: Pairwise comparison of profiles with a complete or reduced set of attributes allows the respondent to make simple judgments about the profiles.
9. In data collection, the researcher must define the set of stimuli to present to the respondent. The two options are:
Factorial design: all combinations of levels are utilized. This design is often impractical unless the researcher is interested in a very small number of variables and levels.
Fractional factorial design: a subset of the combinations is employed. This design is used most often and is the chosen method for researchers interested in larger numbers of variables and levels. (A brief enumeration sketch follows item 11 below.)
The number of stimuli is dependent on the composition rule. The stimuli should always be examined for the presence of unacceptable stimuli, which can be the result of:
o Unbelievable combinations of levels
o Obvious combinations (e.g., all the most preferable) which are less useful in choice decisions
o Violations of constraints on combinations of levels
10. The researcher must select a measure of preference. It is dependent on the choice of presentation method.
o Trade-off method: employs only ranking data
o Pairwise comparison method and the full-profile method: either rating or ranking.
11. The researcher must choose a means to administer the stimuli. Administration: successfully performed in person, by mail, or by telephone with the proper planning and technical support.
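As noted in item 9 above, the following is a brief enumeration sketch, assuming Python and hypothetical attributes and levels. The full factorial is generated directly; a true orthogonal fractional factorial would normally be produced by specialized conjoint design software, so the simple subset taken here is only a placeholder for that step.

# Enumerate the full factorial set of conjoint stimuli for three
# hypothetical attributes: 3 x 3 x 2 = 18 profiles.
from itertools import product

attributes = {
    "price":    ["$199", "$249", "$299"],
    "warranty": ["1 year", "2 years", "3 years"],
    "delivery": ["standard", "express"],
}

full_factorial = list(product(*attributes.values()))
print(len(full_factorial), "profiles in the full factorial design")

# A fractional factorial design retains only a subset of these combinations,
# chosen so attribute levels remain balanced and uncorrelated (orthogonal);
# taking every other profile here merely illustrates the idea of a subset.
fractional = full_factorial[::2]
print(len(fractional), "profiles retained in the illustrative fraction")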
1. Estimation technique must be appropriate for the type of data collected. Rank order data requires the use of a modified analysis of variance technique which is designed for ordinal data. Metric ratings data can be analyzed by ordinary least squares approaches. 2. Traditional estimation techniques (MONANOVA for rankings and regression-based methods for metric ratings) are being supplemented by Bayesian estimation. Primary advantage is that model specifications that must be estimated at only the aggregate level with traditional techniques can now be estimated at individual level. 3. Should assess the overall fit of the model at both the individual and the aggregate levels. Goodness of fit involves comparison of actual preference measures (rankings or ratings) with predicted values from estimated model.
Individual or aggregate level assessments: The correlations between a person's actual response and his / her predicted response should be tested for statistical significance. Can also be combined across individuals for aggregate assessment. Validation profiles: In order to test for overfitting of the model, researchers should always plan for a validation sample of stimuli. To do so, the researcher employs more stimuli than necessary to fit the model and uses the extra data to test model accuracy.
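The following is a minimal sketch of the regression-based estimation and fit assessment described above, assuming metric (ratings) data, the Python pandas and statsmodels packages, and one hypothetical respondent's ratings of eight profiles; it is illustrative only and is not the procedure of any particular conjoint package.

# Estimate part-worths for one respondent from metric ratings of eight
# hypothetical profiles using dummy-coded OLS regression.
import pandas as pd
import statsmodels.formula.api as smf

profiles = pd.DataFrame({
    "price":    ["low", "low", "low", "low", "high", "high", "high", "high"],
    "warranty": ["1yr", "1yr", "3yr", "3yr", "1yr", "1yr", "3yr", "3yr"],
    "delivery": ["std", "exp", "std", "exp", "std", "exp", "std", "exp"],
    "rating":   [4, 5, 7, 9, 2, 3, 5, 6],   # the respondent's preference ratings
})

# C() treats each attribute as categorical, so the coefficients are part-worth
# deviations from the reference level of each attribute.
fit = smf.ols("rating ~ C(price) + C(warranty) + C(delivery)",
              data=profiles).fit()
print(fit.params)                             # estimated part-worths

# Goodness of fit: compare actual and predicted preferences.
print("R-squared:", round(fit.rsquared, 3))
print("Actual vs. predicted correlation:",
      round(profiles["rating"].corr(fit.fittedvalues), 3))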
1. Analysis begins with a comparison of each variable (factor or attribute) and then an examination of the levels of each variable. The results of conjoint analysis provide information pertaining to each factor as a whole and to each level of each factor. A comparison between factors may be performed first, and then an examination of the levels of each factor allows for understanding of their relative influences. Assess theoretical consistency and avoid reversals, i.e., patterns of part-worths for a factor which are contrary to acceptable relationships. Examples include higher preference for lower-quality objects or less convenient stores.
o Any number of issues contribute to reversals: inadequate respondent effort, data collection failures, and research context limits.
o Identifying reversals requires researcher judgment, as there are no absolute criteria. It usually involves examination of the part-worth patterns for each respondent.
o Typically, one of three remedies for reversals is used: Do nothing: a small number of reversals can be ignored, especially if aggregate results are the primary interest; Apply constraints: estimation techniques allow for specification of allowable part-worth patterns; or
Delete respondents: identify and delete respondents with an inappropriate or large number of reversals. 2. In most cases, disaggregate analysis should be used to interpret conjoint results. Level of analysis: All of the measures mentioned above are provided for each respondent in the sample (disaggregate) and also for the sample as a whole (aggregate). Unless the researcher has reason to assume the population is homogeneous with respect to the factors being measured, disaggregate analysis is most appropriate. 3. Next, interpret the relative importance of each attribute and each level of each attribute. Importance of variable/factor: For each respondent, conjoint analysis determines an importance value for each variable used in the analysis. o Percentage value based on a range of zero to 100 percent. When summed, the importance values for all variables will total 100 percent. Importance of level: There is also a utility value for each level of each variable, providing a measure of the influence of each level of each variable. o Expressed in raw form (utility) with a sign indicating the relationship (positive or negative) with the dependent variable. A large negative value means that this level of the variable is associated with lower levels of preference, while a large positive value is associated with higher levels of preference.
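A small sketch of the range-based importance calculation described above, assuming Python and hypothetical part-worth estimates for a single respondent.

# Relative importance of each attribute = range of its part-worths divided by
# the sum of the ranges across all attributes, expressed as a percentage.
part_worths = {                    # hypothetical estimates for one respondent
    "price":    {"low": 1.5, "high": -1.5},
    "warranty": {"1yr": -1.0, "3yr": 1.0},
    "delivery": {"std": -0.25, "exp": 0.25},
}

ranges = {attr: max(levels.values()) - min(levels.values())
          for attr, levels in part_worths.items()}
total_range = sum(ranges.values())

for attr, rng in ranges.items():
    print(f"{attr}: {100 * rng / total_range:.1f}% importance")
# The importance values sum to 100 percent for the respondent.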
1. The researcher should internally and externally validate conjoint analysis results.
Internal validation: the researcher confirms that the choice of composition rule is most appropriate. This is usually completed in a pretest. External validation: corresponds to a test of sample representativeness. The sample should always be evaluated for population representation. 1. The three most common applications of conjoint results are: Segmentation is the grouping of individuals with similar part-worths or importance values. A marginal profitability analysis aids in the product design process by predicting the viability of each hypothetical product, given the cost of each product and its expected market share and sales volume. Choice simulators enable the researcher to predict consumer response to market questions. For example, the market shares among any set of products can be estimated. Any number of product sets can be evaluated, varying in both the type and number of products in a set. In any application, the researcher provides the market stimuli and the simulator predicts consumer response.
Given the comparison task for the respondent, typically no more than six attributes are included in the analysis. All of the issues discussed earlier in terms of factor and level characteristics and design considerations are still applicable, along with the need for validation tasks independent of the estimation procedure. Until the recent development of Bayesian estimation, choice-based analyses were typically estimated only at the aggregate level, since each respondent could not provide enough responses for individual-level part-worth estimates. Now both individual and aggregate estimates are possible. Widespread adoption has increased the availability of software for choice-based designs, which can now be completely designed and estimated, even with Bayesian methods, by any researcher.
(2) USING EITHER THE DIFFERENCES MODEL OR A CONJOINT PROGRAM, ANALYZE THE DATA FROM THE PRECEDING EXPERIMENT. EXAMINE THE INTERACTIONS.
Answer You must look for interactions on a respondent-by-respondent basis. When stress = 0 or R-square = 1, you need not look. However, when stress is not 0 and R-square is not 1, it does not mean that you necessarily have an interaction. Unfortunately, with some respondents, lack of attention or of consistent evaluation procedures produces poor fits that cannot be explained as interaction. A crude but effective way to look for interactions is the method shown in the text. An example follows for a 3-level factor (B) combined with a 2-level factor (A), as in problem 1.

No Interaction
        B1        B2        B3
A1    1+4=5     2+5=7     3+6=9
A2    7+10=17   8+11=19   9+12=21

Interaction
        B1        B2        B3
A1    1+4=5     2+5=7     3+6=9
A2    9+12=21   8+11=19   7+10=17
(3)
DESIGN A CONJOINT ANALYSIS EXPERIMENT WITH AT LEAST FOUR VARIABLES AND TWO LEVELS OF EACH VARIABLE THAT IS APPROPRIATE TO A MARKETING DECISION. IN DOING SO, DEFINE THE COMPOSITIONAL RULE YOU WILL USE, THE EXPERIMENTAL DESIGN FOR CREATING STIMULI, AND THE ANALYSIS METHOD. USE AT LEAST FIVE RESPONDENTS TO SUPPORT YOUR LOGIC.
Answer The student can use any number of rules found in the literature (threshold, multiplicative, etc.) but will typically find that the simple linear additive model gives a good starting point. In addition to its simplicity, it lends itself to classical experimental designs for administration and interpretation. The student should quickly see (as pointed out in the example problems in this manual) that with rank order data, the number of solutions is bounded and easily estimated. For those respondents for whom the model fits, the analysis task equates to simply classifying the respondents into the appropriate pattern of coefficients. If the student uses a rating scale for obtaining evaluations in the experiment, then the rank order assumptions of MONANOVA are not necessary, as the data can be assumed to represent more than a mere ordering of choices. If inspection of the data from a design based on a linear model suggests another decision model was used by the respondent, it is usually easier to augment the original design than to start over. The linear model is a good starting point to suggest the direction for augmentation (which may not be obvious before the initial linear experiment). (4) WHAT ARE THE PRACTICAL LIMITS OF CONJOINT ANALYSIS IN TERMS OF VARIABLES OR TYPES OF VALUES FOR EACH VARIABLE? WHAT TYPE OF CHOICE PROBLEMS ARE BEST SUITED TO ANALYSIS WITH CONJOINT ANALYSIS? WHICH ARE LEAST WELL SERVED BY THE USE OF CONJOINT ANALYSIS?
Answer Conjoint analysis is limited in terms of both the type and number of attributes that can be used to describe the choice objects. Perhaps more limiting is the fact that only tangible and easily communicated attributes are feasible, since other attributes are not easily accommodated in either of the presentation methods. Moreover, the number of attributes is usually limited to less than ten, such that a choice object must be characterized on a small number of dimensions. Conjoint analysis is best suited to examining the choice of hypothetical objects which have easily quantifiable characteristics. Moreover, the product must be viewed as comprised of separate attributes and not really valued by "the whole is greater than the sum of its parts" axiom.
It is ill-suited to examine existing objects (since it is hard to describe them in simple terms) and objects which have intangible attributes (e.g., sensory-based attributes or "images" which convey an emotional appeal). 5. HOW WOULD YOU ADVISE A MARKET RESEARCHER TO CHOOSE AMONG THE THREE TYPES OF CONJOINT METHODOLOGIES? WHAT ARE THE MOST IMPORTANT ISSUES TO CONSIDER, ALONG WITH EACH METHODOLOGY'S STRENGTHS AND WEAKNESSES?
The choice of a conjoint methodology revolves around three basic characteristics of the proposed research: (1) the number of attributes, (2) the level of analysis, and (3) the permitted model form. Traditional conjoint analysis is characterized by a simple additive model containing up to nine factors for each individual. The adaptive conjoint method, also an additive model, can accommodate up to 30 factors for each individual. A choice-based conjoint method employs a unique form of presenting stimuli in sets rather than one by one. It also differs in that it directly includes interactions and must be estimated at the aggregate level. Choice of a method should be made based on the number of factors and the need to represent interaction effects.
The variable should be excluded if the researcher cannot identify why it should be included in the analysis. 2. Cluster analysis is very sensitive to outliers in the dataset; therefore, the researcher should conduct a preliminary screening of the data. Outliers are either observations which are truly nonrepresentative of the population or observations which are representative of an undersampling of an actual group in the population. A graphic profile diagram may be used to identify outliers.
Outliers should be assessed for their representativeness of the population and deleted if they are unrepresentative.
3. The researcher must specify the interobject similarity measure and the characteristics which will define similarity among the objects clustered. Correlational measures represent similarity by analyzing patterns across the variables. These measures do not consider the magnitude of the variable values, only the patterns, and thus are rarely used. Distance measures represent similarity as the proximity of observations to each other across the variables. These measures focus on the magnitude of the values, classifying as similar those cases which are close to each other.
o Euclidean distance, which is the length of the hypotenuse of the right triangle formed between the points, is the most commonly used measure.
o Standardization: used when the range or scale of one variable is much larger than or different from the range of the others.
o Presence of intercorrelation among clustering variables: the preferred measure is Mahalanobis distance, which standardizes the data and also accounts for the pooled within-group variance-covariance, compensating for intercorrelation among the variables.
Association measures are used to represent similarity among objects measured in nonmetric terms (nominal or ordinal measurement). Often simple association measures are used to determine the degree of agreement or disagreement between a pair of cases. 1. Data may be metric, nonmetric, or a combination of both. All scales of measurement may be used, but note that the use of a combination of data types will make the interpretation of the cluster analysis very tentative. The researcher should be cautious when interpreting these conditions. 2. Cluster analysis assumes that the sample is truly representative of the population. Outliers which are not representative of the population should be deleted. 3. Multicollinearity among the variables may have adverse effects on the analysis. Multicollinearity causes the related variables to be weighted more heavily, thereby receiving improper emphasis in the analysis.
One or more of the highly collinear variables should be deleted, or a distance measure, such as Mahalanobis distance, which compensates for this correlation, should be used. 4. Naturally-occurring groups must be present in the data. Cluster analysis assumes that partitions of observations into mutually exclusive groupings do exist in the sample and population. Cluster analysis cannot confirm the validity of these groupings. This role must be performed by the researcher by:
o ensuring that theoretical justification exists for the cluster analysis, and
o performing follow-up procedures of profiling and discriminating among groups
1. Hierarchical clustering has two approaches: agglomerative or divisive methods.
Complete linkage: based on the maximum distance between objects
Average linkage: based on the average distance between objects
Ward's method: based on the sum of squares between the two clusters summed over all variables
Centroid method: based on the distance between cluster centroids. The centroid method requires metric data and is the only method to do so. 3. Nonhierarchical clustering assigns all objects within a set distance of the cluster seed to that cluster instead of following the tree-building process of hierarchical clustering. Nonhierarchical clustering has three approaches:
sequential threshold: one cluster seed is selected at a time and membership in that cluster is fulfilled before another seed is selected
parallel threshold: based on simultaneous cluster seed selection, with the membership threshold distance adjusted to include more or fewer objects in the clusters
optimizing: same as the others except that it allows for reassignment of objects to another cluster based on some optimizing criterion
4. While there is no set rule as to which type of clustering to use, it is suggested that both hierarchical and nonhierarchical clustering algorithms be used (a brief sketch of this two-stage approach follows this list). First stage: a hierarchical cluster analysis is used to generate and profile the clusters. Second stage: a nonhierarchical cluster analysis is used to fine-tune the cluster membership with its switching ability. In this case, the centroids from hierarchical clustering are used as the seeds for nonhierarchical clustering. 5. There is no generally accepted procedure for determining the number of clusters to extract. This decision should be guided by theory and the practicality of the results. Several items in the output are available to help the analyst determine how many clusters to extract. Some of the most common methods used include the following:
clustering coefficient: a measure of the distance between the two objects being combined. The actual values will depend on the clustering method and measure of similarity used.
o Coefficient size indicates the homogeneity of the objects being merged. A small coefficient indicates that fairly homogeneous objects are being
merged, while a large coefficient is the result of very different objects being combined.
o A large increase (absolute or percentage) in the clustering coefficient is an indication of the joining of two diverse clusters, which suggests that a possible "natural grouping" existed before the clusters were joined. This then becomes one potential cluster solution.
o The researcher must then examine the possible solutions identified from the results and select the one that best supports the research objectives. The solution's appropriateness must be confirmed with additional analyses.
dendrogram: a pictorial representation of the clustering process which identifies how the observations are combined into each cluster. As the lines joining clusters become longer, the clusters being joined are increasingly dissimilar.
vertical icicle: pictorially represents the number of objects across the top and the number of clusters down the side. The blanks represent clusters and the X's indicate the members of each cluster.
6. When a cluster solution is reached, examine the structure of each cluster and determine whether or not the solution should be respecified. 7. Respecification may be needed if widely varying cluster sizes or clusters with only one or two observations are found.
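A brief sketch of the two-stage approach and the use of agglomeration (clustering) coefficients discussed above, assuming the Python scipy and scikit-learn packages and synthetic data; the choice of three clusters is purely illustrative and would normally follow from the coefficient jumps and theory.

# Stage 1: Ward's hierarchical clustering to suggest the number of clusters
# and provide seed centroids. Stage 2: nonhierarchical (k-means) clustering,
# seeded with those centroids, to fine-tune membership.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 1.0, size=(50, 3)) for loc in (0, 4, 8)])
X = StandardScaler().fit_transform(X)      # standardize the clustering variables

Z = linkage(X, method="ward")              # agglomerative, Ward's method
# Large jumps in the agglomeration coefficient (third column of Z) suggest a
# stopping point, i.e., a candidate number of clusters.
print("Last five agglomeration coefficients:", Z[-5:, 2].round(2))

k = 3                                      # chosen from the coefficient jump
hier_labels = fcluster(Z, t=k, criterion="maxclust")
seeds = np.vstack([X[hier_labels == c].mean(axis=0) for c in range(1, k + 1)])

kmeans = KMeans(n_clusters=k, init=seeds, n_init=1, random_state=0).fit(X)
print("Cluster sizes after fine-tuning:", np.bincount(kmeans.labels_))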
Split the sample into two groups and cluster analyze each separately. Obtain cluster centers from one group and use them with the other group to define clusters. 2. Profiling involves assessing how each cluster differs from the other clusters on relevant descriptive dimensions. Only variables not used in the cluster analysis are used in profiling. Often, the variables used in this step are demographics, psychographics, or consumption patterns. Discriminant analysis is a technique often used for this purpose.
3. Predictive or criterion validity of the clusters may be tested by selecting a criterion variable that is not used in the cluster analysis and testing for its expected variability across clusters.
Answer Partitioning - the process of determining if and how clusters may be developed. Interpretation - the process of understanding the characteristics of each cluster and developing a name or label that appropriately defines its nature. Profiling - stage involving a description of the characteristics of each cluster to explain how they may differ on relevant dimensions. (2) WHAT IS THE PURPOSE OF CLUSTER ANALYSIS AND WHEN SHOULD IT BE USED INSTEAD OF FACTOR ANALYSIS?
Answer Cluster analysis is a data reduction technique whose primary purpose is to identify similar entities based on the characteristics they possess. Cluster analysis identifies and classifies objects or variables so that each object is very similar to the others in its cluster with respect to some predetermined selection criteria. As you may recall, factor analysis is also a data reduction technique and can be used to combine or condense large numbers of people into distinctly different groups within a larger population (Q factor analysis). Factor analytic approaches to clustering respondents are based on the intercorrelations between respondents' response profiles, resulting in groups of individuals demonstrating a similar response pattern on the variables included in the analysis. In a typical cluster analysis approach, groupings are devised based on a distance measure between the respondents' scores on the variables being analyzed. Cluster analysis should therefore be employed when the researcher is interested in grouping respondents based on their similarity/dissimilarity on the variables being analyzed rather than obtaining clusters of individuals who have similar response patterns.
(3)
WHAT SHOULD THE RESEARCHER CONSIDER WHEN SELECTING A SIMILARITY MEASURE TO USE IN CLUSTER ANALYSIS?
Answer The analyst should remember that in most situations, different distance measures lead to different cluster solutions; and it is advisable to use several measures and compare the results to theoretical or known patterns. Also, when the variables have different units, one should standardize the data before performing the cluster analysis. Finally, when the variables are intercorrelated (either positively or negatively), the Mahalanobis distance measure is likely to be the most appropriate because it adjusts for intercorrelations and weighs all variables equally. (4) HOW DOES THE RESEARCHER KNOW WHETHER TO USE HIERARCHICAL OR NONHIERARCHICAL CLUSTER TECHNIQUES? UNDER WHICH CONDITIONS WOULD EACH APPROACH BE USED?
Answer The choice of a hierarchical or nonhierarchical technique often depends on the research problem at hand. In the past, hierarchical clustering techniques were more popular, with Ward's method and average linkage probably being the best available. Hierarchical procedures do have the advantage of being fast and taking less computer time, but they can be misleading because undesirable early combinations may persist throughout the analysis and lead to artificial results. To reduce this possibility, the analyst may wish to cluster analyze the data several times after deleting problem observations or outliers. However, the K-means procedure appears to be more robust than any of the hierarchical methods with respect to the presence of outliers, error disturbances of the distance measure, and the choice of a distance measure. The choice of the clustering algorithm and solution characteristics appears to be critical to the successful use of cluster analysis. If a practical, objective, and theoretically sound approach can be developed to select the seeds or leaders, then a nonhierarchical method can be used. If the analyst is concerned with the cost of the analysis and has a priori knowledge as to initial starting values or the number of clusters, then a hierarchical method should be employed.
Punj and Stewart (1983) suggest a two-stage procedure to deal with the problem of selecting initial starting values and clusters. The first step entails using one of the hierarchical methods to obtain a first approximation of a solution. Then select candidate number of clusters based on the initial cluster solution, obtain centroids, and eliminate outliers. Finally, use an iterative partitioning algorithm using cluster centroids of preliminary analysis as starting points (excluding outliers) to obtain a final solution. Punj, Girish and David Stewart, "Cluster Analysis in Marketing Research: Review and Suggestions for Application," Journal of Marketing Research, 20 (May 1983), pp. 134-148. (5) HOW CAN YOU DECIDE HOW MANY CLUSTERS TO HAVE IN YOUR SOLUTION?
Answer Although no standard objective selection procedure exists for determining the number of clusters, the analyst may use the distances between clusters at successive steps as a guideline. In using this method, the analyst may choose to stop when this distance exceeds a specified value or when the successive distances between steps make a sudden jump. Also, some intuitive conceptual or theoretical relationship may suggest a natural number of clusters. In the final analysis, however, it is probably best to compute solutions for several different numbers of clusters and then to decide among the alternative solutions based upon a priori criteria, practical judgment, common sense, or theoretical foundation. (6) WHAT IS THE DIFFERENCE BETWEEN THE INTERPRETATION STAGE AND THE PROFILING STAGE?
Answer The interpretation stage involves examining the statements that were used to develop the clusters in order to name or assign a label that accurately describes the nature of the clusters. The profiling stage involves describing the characteristics of each cluster in order to explain how they may differ on relevant dimensions. Profile analysis focuses on describing not what directly determines the clusters but the characteristics of the clusters after they are identified. The emphasis is on the characteristics that differ significantly across the clusters, and in fact could be used to predict membership in a particular attitude cluster.
(7)
Answer The hierarchical clustering process may be represented graphically in several ways: nested groupings, a vertical icicle diagram, or a dendrogram. The researcher would use these graphical portrayals to better understand the nature of the clustering process. Specifically, the graphics might provide additional information about the number of clusters that should be formed, as well as information about outlier values that resist joining a group.
Obtain comparative evaluations of objects when the specific bases of comparison are unknown or indefinable. 2. Multidimensional scaling is defined through three decisions: selection of the objects to be evaluated, choice of similarity or preference data, and choice of individual or group level analysis.
Selection of objects: All relevant objects must be included in the analysis. The omission of relevant objects or the inclusion of irrelevant objects will greatly influence the results. Relevancy is determined by the research questions.
Choice of similarity or preference data: the researcher must evaluate the research question and decide whether he or she is interested in respondents' evaluations of how similar one object is to another or of how a respondent feels (like/dislike) about one object compared to another.
Aggregate versus disaggregate analysis: the researcher must decide whether he or she is interested in producing output on a per-subject basis or on a group basis.
o Aggregate analysis creates a single map for the group, resulting in an analysis of the "average respondent." o Disaggregate analysis examines each respondent separately, creating a separate perceptual map for each respondent. o Recommendation: the disaggregate method of analysis, because an aggregate "average respondent" map may not accurately represent the perceptions of any individual respondent.
1. Assessing similarity is the most fundamental decision in perceptual mapping, with two approaches available: the decompositional (attribute-free) and compositional (attribute-based) approaches. Decompositional: measures the overall impression or evaluation of an object, or a global measure of similarity, and then derives spatial positions in multidimensional space to reflect these perceptions. o Advantages: Requires only that respondents give overall perceptions of objects; they are not required to detail the attributes used in evaluation. Each respondent gives a full assessment of similarities among all objects; therefore, maps can be developed for each respondent or aggregated to form a composite map. o Disadvantages: The researcher has no objective basis provided by the respondent which identifies the dimensions of evaluation. Solutions require substantial researcher judgment. Compositional: measures an impression or evaluation for each combination of specific attributes, combines the set of specified attributes in a linear combination, and derives evaluative dimensions for object positioning. o Advantages: Explicit descriptions of the dimensions underlying the perceptual space.
o Disadvantages: The similarity between objects is limited to only the attributes which are rated by the respondents. The researcher must assume some method of combining these attributes to represent overall similarity, and this chosen method may or may not represent the respondent's thinking. The data collection effort is substantial. Results are not available for the individual respondent. 2. The analyst must ensure that the objects selected for analysis do have some basis of comparison. Just asking respondents for comparative responses between objects does not mean that underlying evaluative dimensions exist. 3. The number of objects must be determined while balancing two issues: a greater number of objects to ensure adequate information for higher dimensional solutions versus the increased effort demanded of the respondent as the number of objects increases. Rule of thumb: more than four objects for each derived evaluative dimension. Thus, at least five objects are required for a one-dimensional perceptual map. Violating the rule of thumb: having fewer than the suggested number of objects for a given dimensionality causes an inflated estimate of fit and may adversely impact the validity of the resulting perceptual maps. 4. Multidimensional scaling produces metric output for both metric and nonmetric input. Input measures of similarity may be metric or nonmetric. The results from both types are very similar, meaning that the researcher's choice depends mostly on the preferred mode of data collection. 5. Choice of using either similarity or preference data is based on the research objectives.
Similarity data represent perceptions of attribute similarities of the specified objects, but do not offer any direct insights into the determinants of choice among the objects.
o Comparability of objects: Since similarities data investigate the question of which stimuli are the most similar and which are the most dissimilar, the analyst must be able to assume that all pairs of stimuli may be compared by the respondents.
o Three procedures for data collection: comparison of paired objects (rank or rate the similarity of all object pair combinations), confusion data (subjective clustering of objects), and derived measures (scores given to stimuli by respondents).
Preference data reflect the preference order among the set of objects, but are not directly related to attributes, since we are not able to demonstrate the correspondence between attributes and choice.
o Nature of preference data: arrange the stimuli in terms of dominance relationships. The stimuli are ordered according to the preference for some property of the stimuli.
o Data collection modes: direct ranking (objects ranked from most preferred to least preferred) and paired comparisons (when presented with all possible pair combinations, the most preferred object in each is chosen).
Combination approach: Methods are available for combining the two approaches, but in each instance the analyst must assume the inference can be made between attributes and preference without its being directly assessed.
2. Not all respondents will attach the same level of importance to a dimension, even if all respondents perceive the dimension. Although two consumers may be aware of an attribute of a product, they may not both attach equal importance to this dimension. Thus, consumers may view different dimensions as important.
3. Respondents' dimensions and levels of importance will change over time. While it would be very convenient for marketers if consumers always used the same decision process with the same stimulus dimensions, this is not the case. Over time, consumers will assign different levels of importance to the same dimensions of a stimulus or may even completely change the dimensions of the stimuli that they evaluate. Changes in consumers' lives are reflected in their evaluation of stimuli.
NOTE: Given the wide number of techniques encompassed under the general technique of perceptual mapping, we are unable to discuss each separately. The following discussion will center on general issues in perceptual mapping. 1. How does MDS determine the optimal positioning of objects in perceptual space? Five-step procedure: Most MDS programs follow a five-step procedure which involves selection of a configuration, comparison to fit measures, and reduction of dimensionality. Primary criterion for determining an optimal position is the preservation of the ordered relationship between the original rank data and the derived distances between points. Degenerate solutions: The researcher should be aware of degenerate solutions, which are inaccurate perceptual maps. Degenerate solutions may be identified by a circular pattern of objects or a clustered pattern of objects at two ends of a single dimension. 2. How are the number of dimensions to be included in the perceptual map determined?
Trade-off of best fit with the smallest number of dimensions possible. Interpretation of more than three dimensions is difficult. Three approaches for determining the number of dimensions:
o Subjective evaluation: The researcher evaluates the spatial maps and determines whether or not the resulting configuration looks reasonable.
o Stress measurement: measures the proportion of variance in the data that is not accounted for by the model. It is the opposite of the fit index. A low stress index is desired, since stress is minimized when the objects are placed in a configuration such that the distances between the objects best match the original distances. Scree plot of the stress index: The stress measure for models with varying numbers of dimensions may be plotted to form a scree plot, as in factor analysis. The interpretation of this plot is the same, where the analyst looks for the bend in the plot line.
o Overall fit index: a squared correlation index which indicates the amount of variance in the data that can be accounted for by the model. This is a measure of how well the model fits the data. Desired levels of the fit index are similar to those desired when using regression's R2.
Parsimony: Parsimony should be sought in selecting the number of dimensions. The stress measure and the overall fit index react much the same as R2 in regression. As you add dimensions, the fit index always improves and stress always decreases. Thus, the analyst must make a trade-off between the fit of the solution and difficulty of interpretation due to the number of dimensions. 3. With preference data, three additional issues are 1) estimation of the ideal point explicitly or implicitly, 2) use of an internal or an external analysis, and 3) portrayal of the ideal point. Ideal point estimation: ideal points (preferred combination of perceived attributes) may be determined by explicit or implicit estimation procedures. o Explicit estimation: respondents are asked to identify or rate a hypothetical ideal combination of attributes
o Implicit estimation: an ideal combination of attributes is empirically determined from respondents' responses to preference measure questions
Internal versus external analysis
o Internal analysis develops spatial maps solely from preference data.
o External analysis fits ideal points based on preference data to a stimulus space developed from similarities data.
o Recommendation: external analysis should be performed due to computational difficulties with internal analysis and the fact that perceptual space (preference) and evaluative space (similarities) may not contain the same dimensions with the same salience.
Vector or point representation of the ideal point: the ideal point is the most preferred combination of dimensions.
o Point representation is the location of the most preferred combination of dimensions from the consumer's standpoint.
o Vectors are lines extended from the origin of the graph toward the point which represents the combination of dimensions specified as ideal.
o Difference in representation: with a point representation, deviation in any direction leads to a less preferred object, while with a vector, less preferred objects are those located in the opposite direction from which the vector is pointing.
Objective procedures: formal methods, such as PROFIT, are used to empirically derive underlying dimensionality from attribute ratings. 2. When using compositional methods, the analyst should compare the perceptual map against other measures of perception for interpretation. Perceptual map positions are totally defined by the attributes specified by the researcher.
1. Validation will help ensure generalizability across objects and to the population. Split-samples or multi-samples may be utilized to compare MDS results. Only the relative positions of objects can be compared across MDS analyses. Underlying dimensions across analyses cannot be compared. Bases of comparisons across analyses: visual or based on a simple correlation of coordinates. Multi-approach method: applying both decompositional and compositional methods to the same sample and looking for convergence.
CORRESPONDENCE ANALYSIS
Correspondence analysis (CA) is another form of perceptual mapping that involves the use of contingency or cross-tabulation data. Its application is becoming widespread within many areas, both practitioner and academic. The following sections detail some of the unique aspects of correspondence analysis.
Contingency tables are used to transform nonmetric data to metric form. Dimensional reduction is performed in a manner similar to factor analysis. Performs perceptual mapping, where categories may be represented in multidimensional space.
Stage 2: Research Design of Correspondence Analysis
1. Correspondence analysis requires only a rectangular data matrix of nonnegative data. Rows and columns do not have predefined meanings, but represent responses to categorical variables. The categories for a row or column may be a single variable or a set of variables.
1. The degree of similarity among
Answer Multidimensional scaling (MDS) is a family of techniques which helps the analyst to identify key dimensions underlying respondents' evaluations of objects. MDS techniques enable the researcher to represent respondents' perceptions spatially; that is, to create visual displays that represent the dimensions perceived by the respondents when evaluating stimuli (e.g., brands, objects). MDS differs from cluster and factor analysis in that it provides a visual representation of individual and group respondents' perceptions of the object(s), while cluster and factor analysis provide a classification of objects or variables so that each object is very similar to others in its cluster. (2) WHAT IS THE DIFFERENCE BETWEEN PREFERENCE DATA AND SIMILARITIES DATA, AND HOW DOES IT IMPACT THE RESULTS OF MDS PROCEDURES?
Answer In obtaining preference data from respondents, the stimuli are ordered in terms of the preference for some property. When collecting similarities data, the researcher is trying to determine which items are most similar to each other and which are most dissimilar. Preference data allow the researcher to view the location of objects in a spatial map when distance implies differences in preference. The choice of input data is important to the researcher when using MDS since an individual's perception of objects in a preference context may be different from that in a similarity context. For instance, a particular dimension may be very useful in describing the differences between two objects but is of no consequence in determining preferences. Therefore, two objects could be perceived as different in a similarities-based map but similar in a preference-based spatial map. In employing MDS the researcher should identify which type of output information is needed before deciding on the form of input data. For example, if a company is interested in determining how similar/dissimilar their product is to all competing products, then similarities data would be required. However, if the researcher is interested in respondent's preferences for one brand over another, then preference data is required.
(3)
Answer Ideal points may be used to represent the most preferred combination of perceived attributes (i.e., an ideal product), or may be used to define relative preference so that products further from the ideal should be less preferred. In determining ideal points, the researcher may use either explicit or implicit estimation. Explicit estimation involves having the respondents rate a hypothetical ideal on the same attributes on which the other stimuli were rated. There are several procedures for implicitly positioning ideal points. One method is to locate the ideal point as close as possible to the most preferred object and as distant as possible from the least preferred object. Implicit estimation positions an ideal point in a defined perceptual space such that the distance from the ideal conveys changes in preference. (4) HOW DO METRIC AND NONMETRIC MDS PROCEDURES DIFFER?
Answer Nonmetric methods assume ordinal input and metric output; the distances output by the MDS procedure may be assumed to be approximately intervally scaled. Metric methods assume that input as well as output are metric, which allows the researcher to strengthen the relationship between the final output dimensionality and the input data. (5) HOW CAN THE ANALYST DETERMINE WHEN THE "BEST" MDS SOLUTION HAS BEEN OBTAINED?
The objective of the analyst should be to obtain the best fit with the smallest number of dimensions, which requires a trade-off between the fit of the solution and the number of dimensions. Interpretation of solutions derived in more than three dimensions is extremely difficult and is usually not worth the improvement in fit. The analyst may also use an index of fit to determine the number of dimensions. The index of fit (or R-square) is a squared correlation index that can be interpreted as indicating the proportion of variance of the disparities that can be accounted for by the MDS procedure. Measures of .60 or better are considered acceptable; the higher the R-square, the better the fit. A third approach is to use a measure of stress. Stress measures the proportion of the variance of the disparities that is not accounted for by the MDS model.
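A minimal sketch of this trade-off, assuming the Python scikit-learn and scipy packages and a hypothetical set of eight objects; Kruskal's Stress-1 and the squared correlation (R-square) fit index are computed for solutions in one to three dimensions.

# Fit MDS solutions of increasing dimensionality to a hypothetical
# dissimilarity matrix and compare stress and fit.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(1)
true_coords = rng.normal(size=(8, 2))      # 8 hypothetical objects
D = squareform(pdist(true_coords))         # observed dissimilarities

for dims in (1, 2, 3):
    coords = MDS(n_components=dims, dissimilarity="precomputed",
                 random_state=0).fit_transform(D)
    d_hat = squareform(pdist(coords))      # distances in the derived solution
    stress1 = np.sqrt(((D - d_hat) ** 2).sum() / (D ** 2).sum())  # lower is better
    r2 = np.corrcoef(squareform(D), squareform(d_hat))[0, 1] ** 2
    print(f"{dims} dimension(s): Stress-1 = {stress1:.3f}, R-square = {r2:.3f}")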
(6)
HOW DOES THE RESEARCHER GO ABOUT IDENTIFYING THE DIMENSIONS IN MDS? COMPARE THIS PROCEDURE WITH THAT FOR FACTOR ANALYSIS.
Answer 1. The analyst can adopt several procedures for identifying the underlying dimensions: Respondents may be asked to interpret the dimensionality subjectively by inspecting the maps. If the data were obtained directly, the respondents may be asked (after stating the similarities and/or preferences) to identify the characteristics most important to them in stating these values. The set of characteristics can then be screened for values that match the relationships portrayed in the maps. The subjects may be asked to evaluate the stimuli on the basis of research-determined criteria (usually objective values) and researcher perceived subjective values. These evaluations can be compared to the stimuli distances on a dimension-by-dimension basis for labeling the dimensions. Attribute or ratings data may be collected in the original research to assist in labeling the dimensions. Specific programs, such as PROFIT (PROPerty FITting), are available for this specific purpose. 2. The procedure for labeling the dimensions in MDS is much more subjective and requires additional skill and experience in analyzing the results. While the labeling of factors is also met with its share of subjectivity, the researcher may rely on "rules of thumb" for use in determining which items load on which factor, the number of factors to extract, and the naming of factors. (7) COMPARE AND CONTRAST CORRESPONDENCE ANALYSIS TO THE MDS TECHNIQUES.
Correspondence analysis is a compositional perceptual mapping technique which relies on the association among nominally scaled variables. Measures of similarity are based on the chi-square metric derived from a cross tabulation table. It has the unique feature of spatially representing both objects and attributes on the same spatial map.
(8)
Correspondence analysis allows the representation of the rows and columns of a contingency table in joint space. Using the totals for each category an expected value is calculated for each cell. Then the difference between the expected and actual is calculated. Using this value, a chi-square statistic is formed for each cell as the squared difference divided by the expected value. The chi-square values can be converted to similarity measures by applying the opposite sign of their difference. The similarity measure provides a standardized measure of association that can be plotted in an appropriate number of dimensions (number of rows or columns minus one).
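A small numerical sketch of the cell-level calculation described above, assuming Python/numpy and a hypothetical 3 x 4 cross-tabulation; the sign convention simply follows the description in the preceding paragraph.

# Expected counts, cell chi-square values, and signed similarity values for a
# hypothetical contingency table of 3 objects by 4 attributes.
import numpy as np

observed = np.array([[20, 15,  5, 10],
                     [10, 25, 15,  5],
                     [ 5, 10, 20, 15]], dtype=float)

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()

expected = row_totals @ col_totals / grand_total      # expected cell counts
chi_sq_cells = (observed - expected) ** 2 / expected  # cell chi-square values

# Convert to similarity values by applying the opposite sign of the
# (observed - expected) difference, as described above.
similarity = -np.sign(observed - expected) * chi_sq_cells
print(similarity.round(2))
print("Maximum number of dimensions:", min(observed.shape) - 1)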
The distinguishing features of SEM include:
(1) The estimation of multiple and interrelated dependence relationships in a single analysis.
(2) An ability to represent unobserved concepts in these relationships and correct for measurement error in the estimation process, thus providing for more accurate relationships.
(3) A focus on explaining the covariance among the measured items. This allows an assessment of fit and provides a better tool for assessing the construct validity of a set of measures. The assessment of fit allows a better examination of the accuracy of a model.
Pages 711-718 provide an introductory overview. The overview emphasizes a few key points that help understand SEM. One point involves the distinction between exogenous and endogenous constructs.
Exogenous constructs are the latent, multi-item equivalent of independent variables. As such, they use a variate of measures to represent the construct, which acts as an independent variable in the model. They are determined by factors outside of the model (i.e., they are not explained by any other construct or variable in the model). Exogenous constructs are represented without error; the error lies in the measured variables that indicate the construct. Endogenous constructs are the latent, multi-item equivalent of dependent variables (i.e., a variate of individual dependent variables). They are constructs that are theoretically determined by factors within the model. Endogenous constructs must have an error term associated with each. In addition, the measured variables that indicate the construct also have error terms.
The relationships are represented by parameters in a set of structural equations. That is, the equations are interrelated. The equations for the measurement model contain parameters used in the equations for the structural parameters. SEM is a particularly useful technique because it allows researchers to do more than simply test the significance of relationships. It allows researchers to assess the overall validity of a proposed model by assessing its fit. SEM does not seek only to explain variance; it explains covariance. Therefore, fit is assessed by how well the structural equations can be used to reproduce the observed covariance among measured items. The closer the estimated covariances come to the observed covariances, the better the fit. Above all, whether testing the structural model or the measurement model, it is a good idea to emphasize that SEM is a confirmatory technique. This means it is useful in testing a proposed theory. It is not the most appropriate technique for simply assessing relationships empirically.
Confirmatory modeling strategy - The most direct application of structural equation modeling is a confirmatory modeling strategy. The researcher specifies a single model (set of relationships), and SEM is used to assess how well the model fits the data based on a comparison of the observed covariance matrix (S) with the estimated covariance matrix (Σk). Competing models strategy - As a means of evaluating the estimated model against alternative models, overall model comparisons can be performed in a competing models strategy. The strongest test of a proposed model is to identify and test competing models that represent truly different hypothetical structural relationships. When comparing these models, the researcher comes much closer to a test of competing theories, which is a much stronger test than just a slight modification of a single theory.
SEM is also inappropriate when sample size considerations are not met. In recent years, research has suggested that simple cutoffs, such as requiring a sample of 300, are too simplistic. Based on the discussion of sample size, the following suggestions are offered: SEM models containing five or fewer constructs, each with more than three items (observed variables), and with high item communalities (.6 or higher), can be adequately estimated with samples as small as 100 to 150. If any communalities are modest (.45 to .55), or the model contains constructs with fewer than three items, then the required sample size is more on the order of 200. If the communalities are lower or the model includes multiple underidentified (fewer than three items) constructs, then minimum sample sizes of 300 or more are needed to be able to recover population parameters. When the number of factors is larger than six, some of which use fewer than three measured items as indicators, and multiple low communalities are present, sample size requirements may exceed 500.
1. Defining individual constructs
a. Here, it is important to stress the importance of face validity, or the fact that the definitions of constructs match the content of the item indicators.
2. Developing the overall measurement model
a. Here, it is important to consider the number of indicators used for each construct and whether or not the measured variables indicate a latent construct or form some factor. This will be discussed more in the next chapter.
3. Designing a study to produce empirical results
4. Assessing the measurement model validity
a. A more detailed discussion of fit is included. Basically, once a researcher specifies a model, an SEM program such as LISREL, EQS or AMOS estimates parameters and provides an assessment of overall model fit. A researcher's theory is used to specify a model, and the model fit compares the theory to reality as represented by the data. If a researcher's theory were perfect, the estimated covariance matrix and the actual observed covariance matrix would be equal. Thus, the estimated covariance matrix (Σk) is compared mathematically to the actual observed covariance matrix (S) to provide an estimate of model fit. The closer the values of these two matrices are to each other, the better the model is said to fit.
b. The difference in the covariance matrices (S - Σk) is a key value in SEM. Indeed, SEM estimation procedures like maximum likelihood produce parameter estimates that mathematically minimize this difference for a specified model. A chi-square (χ2) test provides a statistical test of the resulting difference.
c. It is also important to point out the differences in the types of fit indices:
i. Absolute fit indices
ii. Relative fit indices
iii. Parsimony fit indices
5. Specifying the structural model
a. Figure 10-8 demonstrates a complete structural path model including both the measurement parameters and the structural parameters.
b. Each hypothesis means that a path must be freed between constructs.
6. Assessing structural model validity
a. The fit must be assessed again, just as it was done for the measurement model.
b. In addition, the size and significance of the parameters representing hypotheses must be examined.
4.
HOW IS STRUCTURAL EQUATION MODELING SIMILAR TO THE OTHER MULTIVARIATE TECHNIQUES DISCUSSED IN THE EARLIER CHAPTERS? SEM shares much similarity with multiple regression and factor analysis. Multiple regression is also used to test dependence relationships. However, it cannot assess relationships for multiple dependent variables simultaneously. The equations for SEM and a multiple regression analysis are similar in form. SEM also shares similarity with exploratory factor analysis in that each is capable of representing latent factors by creating variates using measured variables. Unlike EFA, however, the researcher must specify the number of factors and the variables that load on each prior to conducting a CFA.
5.
WHAT IS A THEORY? HOW IS A THEORY REPRESENTED IN A SEM FRAMEWORK? Theory can be defined as a systematic set of relationships providing a consistent and comprehensive explanation of a phenomenon. From this definition, we see that theory is not the exclusive domain of academia but can be rooted in experience and practice obtained by observation of real-world behavior. Theories are represented by models. The models specify relationships. The relationships correspond to parameter estimates that are represented in equations for each construct or measured variable. The measured variables form variates that create the constructs.
6.
WHAT IS A SPURIOUS CORRELATION? HOW MIGHT IT BE REVEALED USING SEM? A spurious relationship is one that is false or misleading. One way a relationship can be spurious is when another event actually explains both the cause and the effect. Using SEM, a model including the cause and effect can be modified by adding the third variable/construct as an additional predictor of both the cause and effect. If the relationship between the cause and effect becomes insignificant, there is evidence of spuriousness in the relationship.
7.
WHAT IS FIT? Fit indicates how well a specified model reproduces the covariance matrix among the measured items. An assessment of fit provides an assessment of the accuracy of some theoretical model.
8.
WHAT IS THE DIFFERENCE BETWEEN AN ABSOLUTE AND A RELATIVE FIT INDEX? Absolute fit indices are based on how well a specified model reproduces the observed covariance matrix. They do not explicitly compare the fit of a specified model to any other model as do other types of fit indices. Relative fit indices are incremental fit indices. Incremental fit indices are a group of statistical fit indices that assess how well a specified model fits relative to some alternative baseline model. Most commonly, the baseline model is a null model specifying that all measured variables are unrelated to each other.
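As one concrete example of an incremental (relative) index, the CFI is commonly computed from the chi-square values and degrees of freedom of the specified model and the null (baseline) model:

$$ CFI = 1 - \frac{\max\!\left(\chi^{2}_{model} - df_{model},\, 0\right)}{\max\!\left(\chi^{2}_{null} - df_{null},\; \chi^{2}_{model} - df_{model},\, 0\right)} $$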
9.
HOW DOES SAMPLE SIZE AFFECT STRUCTURAL EQUATION MODELING? Sample size affects structural equation modeling in several ways. Sample size influences whether or not SEM is appropriate. The chapter provides guidelines for assessing whether the sample size is sufficient to get reliable SEM results. The factors that influence the appropriateness include the number of constructs and measured variables in a model and the communalities of the latent constructs. Additionally, sample size affects SEM through its impact on fit. Sample size is included in the equation for many fit indices, including the chi-square likelihood test. Therefore, with large samples, these fit indices increase in a way that may not reflect inaccuracy in the model.
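The role of sample size is easy to see in a badness-of-fit index such as the RMSEA; one common form of its formula (some programs use N rather than N - 1) is:

$$ RMSEA = \sqrt{\frac{\max\!\left(\chi^{2} - df,\, 0\right)}{df\,(N-1)}} $$

This illustrates the chapter's point that N enters fit computations directly, both through the chi-square value itself and, here, through the denominator.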
10.
WHY ARE THERE NO MAGIC VALUES THAT DESIGNATE GOOD FIT FROM POOR FIT ACROSS ALL SITUATIONS? Because there are too many factors that influence fit. These include the sample size, the number of constructs, the number of measured items and the number of measured items representing each construct. So, for example, a CFI of .97, which may be acceptable for a model with 24 measured variables, five constructs and a sample size over 250, may be inappropriate for a model with only four constructs and 12 measured variables and a sample size of 125.
11.
DRAW A PATH DIAGRAM WITH TWO EXOGENOUS CONSTRUCTS AND ONE ENDOGENOUS CONSTRUCT. THE EXOGENOUS CONSTRUCTS ARE EACH MEASURED BY 5 ITEMS AND THE ENDOGENOUS CONSTRUCT IS MEASURED BY 4 ITEMS. BOTH EXOGENOUS CONSTRUCTS ARE EXPECTED TO BE RELATED NEGATIVELY TO THE ENDOGENOUS CONSTRUCT. It should look something like this (the student can fill in the names of the constructs as he/she wishes):
[Path diagram: two exogenous constructs (labeled here as Price and Atmosphere), each drawn as an oval with five reflective indicators and their error terms (X1-X5 for Price, X6-X10 for Atmosphere), and a negatively signed path from each exogenous construct to a single endogenous construct measured by four indicators (Y1-Y4), each with its own error term.]
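In equation form, the structural portion of this diagram could be written in conventional LISREL-style notation as shown below; the negative signs mirror the hypothesized relationships, while the specific symbols are simply the customary notation and are not taken from the manual itself:

$$ \eta = \gamma_{1}\,\xi_{1} + \gamma_{2}\,\xi_{2} + \zeta, \qquad \gamma_{1} < 0,\; \gamma_{2} < 0 $$

with measurement equations of the form, for example:

$$ x_{1} = \lambda_{x1}\,\xi_{1} + \delta_{1}, \qquad y_{1} = \lambda_{y1}\,\eta + \varepsilon_{1} $$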
Construct validity is the extent to which a set of measured items actually reflects the theoretical latent construct they are designed to measure. Thus, it deals with the accuracy of measurement. Evidence of construct validity provides confidence that item measures taken from a sample represent the actual true score that exists in the population. Construct validity is made up of four important components:
1. Convergent validity is the extent to which items that are indicators of a specific construct converge or share a high proportion of variance in common. CFA provides evidence in the form of measurement parameter coefficients or factor loadings, which can be used to compute variance extracted estimates.
o Standardized factor loadings should be at least .5 or greater but preferably .7 or greater.
o Variance extracted estimates for a construct should be .5 or greater.
2. Discriminant validity is the extent to which a construct is truly distinct from other constructs. Thus, high discriminant validity provides evidence that a construct is unique and captures some phenomena other measures do not. CFA provides evidence in two ways. First, the items making up two constructs could just as well make up only one construct. So, competing CFA models could be set up comparing the fit of a CFA assuming the items make up one construct with that of a CFA assuming they make up two constructs. If the fit of the two-construct model is not significantly better than that of the one-construct model, then there is insufficient discriminant validity. A more conservative test is to compare the variance extracted percentages for any two constructs with the square of the correlation estimate between these two constructs. The variance extracted estimates should be greater than the squared correlation estimate. (A small computational sketch of these two checks follows this list.)
3. Nomological validity is tested by examining whether or not the correlations between the constructs in the measurement theory make sense. The matrix of construct correlations can be useful in this assessment. In other words, the construct should fit with other theoretical concepts as theory would suggest that it does.
155
Things that are expected to be unrelated should show no correlation. Things that are opposites should produce negative correlations. Things that coincide to some degree should show positive correlations.
4. Face validity is sometimes called content validity. It is the extent to which the content of the items matches the construct definition. CFA does not provide direct evidence of face validity. Informed judgment is used to provide evidence of face validity (common sense logic). This can be and should be examined prior to conducting a CFA.
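The following Python sketch shows the arithmetic behind the convergent and discriminant validity checks described above. The loadings and the inter-construct correlation are made-up illustrative numbers, and the variance extracted formula follows the common use of completely standardized loadings (the average of the squared loadings):

```python
import numpy as np

# Hypothetical completely standardized loadings for two constructs (illustrative values).
loadings_A = np.array([0.78, 0.82, 0.71, 0.75])
loadings_B = np.array([0.69, 0.74, 0.80])

# Variance extracted per construct: average of the squared standardized loadings.
ve_A = np.mean(loadings_A ** 2)
ve_B = np.mean(loadings_B ** 2)

# Hypothetical estimated correlation between the two constructs.
phi = 0.55

# Convergent validity check: variance extracted should be .5 or greater.
print(f"VE(A) = {ve_A:.2f}, VE(B) = {ve_B:.2f}")

# Conservative discriminant validity check: each VE should exceed the squared correlation.
print(f"phi^2 = {phi ** 2:.2f}",
      "-> discriminant validity supported" if min(ve_A, ve_B) > phi ** 2
      else "-> discriminant validity questionable")
```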
o Over-identified models are those with more unique covariance and variance terms than parameters to be estimated.
o A just-identified model has just enough degrees of freedom to estimate all parameters required for a unique solution. They fit perfectly by definition and thus are not very interesting. CFA is also not diagnostic in this case.
CFA is used prior to testing a theoretical model proposing relationships between constructs. In other words, prior to conducting a structural model.
CFA is not used to examine measurement properties if constructs are represented by single items or by composite indicators.
The concept of unidimensionality is discussed. Unidimensionality means that a set of observed variables (indicators) has only one underlying construct. Figure 11-2 illustrates the concepts of between-construct error variance and within-construct error variance. Both are threats to the unidimensionality of a CFA model.
Good practice suggests that a measurement model have congeneric properties. When a measurement model hypothesizes no covariance between or within construct error variances, meaning they are all fixed at zero, the measurement model is said to be congeneric. Congeneric measurement models are considered to be sufficiently constrained to represent good measurement properties.
The appropriate number of items per construct is discussed. Largely as a result of the identification issue, good practice suggests four or more items per construct. Three items can be acceptable for a given construct if other constructs have more than three item indicators.
Reflective constructs must be treated differently from formative constructs.
o A reflective measurement theory is modeled based on the idea that latent constructs cause the measured variables and that the error results in an inability to fully explain these measures. Thus, the arrows are drawn from latent constructs to measured variables. As such, reflective measures are consistent with classical test theory.
o A formative measurement theory is modeled based on the assumption that the measured variables cause the construct. The error in formative measurement models is an inability to fully explain the construct. A key assumption is that formative constructs are not considered latent. The arrows are drawn from the measured variable to the construct.
1. The measurement model must be specified in this stage. One aspect in doing this is setting the scale for each construct. Two ways to do this are discussed. Most commonly, the value of one loading estimate per construct is set to one.
2. The researcher should design a study that will minimize the likelihood of identification problems. Two conditions that help establish identification are:
The order condition refers to the requirement discussed earlier that the net degrees of freedom for a model be greater than zero. That is, the number of unique covariance and variance terms less the number of free parameter estimates must be positive. The rank condition can be difficult to verify. It requires that each parameter estimated be algebraically defined. This can be much more difficult to establish. When an SEM program provides a message about linear dependence, it is likely due to a problem with the rank condition.
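A quick way to illustrate the order condition is with the degrees-of-freedom count; the numbers below are illustrative only (12 measured variables and 27 free parameters):

$$ df = \frac{p(p+1)}{2} - k = \frac{12(13)}{2} - 27 = 78 - 27 = 51 > 0 $$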
o Is there a theoretical reason to expect conceptual layers of constructs (each layer reflecting a different level of abstraction)?
o Are all first-order factors expected to influence other nomologically related constructs in the same way?
o Are the higher-order factors going to be used to predict other constructs of the same level of abstraction?
o Are the minimum conditions for identification and good measurement practice present in both the first-order and higher-order layers of the measurement theory?
2. Multiple Group Analysis
From a CFA standpoint, this often means that some type of cross-validation is in order. Cross-validation is an attempt to reproduce the results found in one sample using data from a different sample.
o Refer to the different types of cross-validation on pages 820-821. Loose cross-validation is usually sufficient to support the same measurement model in each group. Factor-loading equivalence is sufficient to support comparisons of relationships between groups, including partial metric invariance (equivalence and invariance can be used interchangeably). Scalar invariance is sufficient to support comparisons of construct means between groups, including partial scalar invariance (at least two intercept terms per construct equal across the groups).
o Also, Table 11-8 contrasts different tests of invariance and the results that go along with each.
Measurement Bias
o Constant methods bias implies that the covariance among measured items is caused in part by the fact that a common type of scale is used across a set of items.
o CFA can be used to test this by creating methods factors for each scale type or type of measurement device.
2. WHAT DOES FIT REFER TO WHEN USING CFA? Technically, fit means that the measurement equations conforming to the model specifications for each construct can be used to compute an estimated covariance matrix that can be compared to the actual observed covariance matrix of all measured items. The closer together they become, the better the fit. More importantly, however, measurement model fit goes along with assessing how valid the measures, taken together, are. If a measurement model fits well, its overall fit statistics are consistent with those discussed in Chapter 10. However, good fit is also suggestive of construct validity. So, overall, a measurement model that fits well also displays the empirical characteristics of construct validity suggested in the chapter. If the measurement model fits well, the researcher can proceed to examine a structural model representing specific theoretical relationships among the constructs.
3.
WHAT IS THE DIFFERENCE BETWEEN A FIXED AND A FREE PARAMETER IN CFA? A free parameter is one for which a value is estimated using SEM procedures. A fixed parameter is one for which a value is specified prior to estimating a model using SEM procedures. Those parameters that are assumed to be zero, such as cross-construct loading estimates, are also considered fixed.
4.
LIST AND DEFINE THE COMPONENTS OF CONSTRUCT VALIDITY.
Convergent validity is the extent to which items that are indicators of a specific construct converge or share a high proportion of variance in common.
Discriminant validity is the extent to which a construct is truly distinct from other constructs. Thus, high discriminant validity provides evidence that a construct is unique and captures some phenomena other measures do not.
Nomological validity is tested by examining whether or not the correlations between the constructs in the measurement theory make sense.
Face validity is sometimes called content validity. It is the extent to which the content of the items matches the construct definition.
5.
WHAT ARE THE STEPS IN DEVELOPING A NEW CONSTRUCT MEASURE?
Define the construct theoretically.
Develop scale items that match the definition.
Judge the items for content validity.
Conduct a pretest to evaluate the items. Scale modifications are made if necessary.
A confirmatory test of the measurement model is conducted using CFA.
6.
WHAT ARE THE PROPERTIES OF A CONGENERIC MEASUREMENT MODEL? WHY DO THEY REPRESENT THE PROPERTIES OF GOOD MEASUREMENT? Congeneric measurement models have the following properties: o All cross-construct loading estimates are fixed to zero (assumed not to exist) o This means that every measured item loads on exactly one latent construct o All between-construct error covariance terms are fixed to zero o All within-construct error covariance terms are fixed to zero Congeneric measurement models are considered to be sufficiently constrained to represent good measurement properties because a congeneric measurement model meets requirements associated with construct validity.
7.
WHY IS A FOUR-ITEM FACTOR OVER-IDENTIFIED? A four-item factor is over-identified because it has net positive degrees of freedom after estimation. A four-item congeneric measurement model requires eight parameter estimates, which use eight of the ten ([4 * 5]/2) unique variance and covariance terms available, leaving two degrees of freedom.
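Written out, the count for the four-item congeneric factor is:

$$ df = \frac{p(p+1)}{2} - k = \frac{4(5)}{2} - 8 = 10 - 8 = 2 $$

The two leftover degrees of freedom are what make the model over-identified and therefore testable.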
8.
WHAT ARE THE CONSIDERATIONS IN DETERMINING WHETHER OR NOT INDICATORS SHOULD BE MODELED AS FORMATIVE OR REFLECTIVE? What is the direction of causality between the multiple indicators and the factor (construct)? o Reflective items are caused by the factor. o Formative items cause the factor.
What is the nature of the covariance among the indicator items?
o If the items are expected to covary highly with each other, then the reflective model is more appropriate. If an indicator should not be highly related to the others, you probably should delete it. Thus, a key point is that with reflective models, all of the indicators will tend to move together, meaning that changes in one will be associated with changes in the others. High inter-item covariance provides evidence consistent with reflective indicators.
o Formative indicators of a factor are not expected to show high covariance. Thus, an index may be comprised of numerous measures for which there is no common basis. As a result, formative indicator items are not expected to move together.
Is there high duplication in the content of the items?
o If all of the indicator items share a common conceptual basis, meaning they all indicate the same thing, then the measurement model is best considered reflective. Since all the items represent the same concept, dropping an item does not materially change a construct's meaning.
o Formative items need not share a common conceptual basis. Therefore, if it appears the indicators cause the formative construct, but they share nothing in common conceptually, then they are still acceptable as formative indicators.
o With formative indicator models, dropping an item produces a material change in the construct.
How do the indicators relate to other variables?
o All of the indicators of a single construct relate to other variables in a similar way in a reflective measurement model.
o The indicators of a formative construct need not relate to other variables in a similar way. For formative measurement models, the researcher would expect one indicator to produce a different pattern of relationships with an outside variable than would another indicator.
9.
WHAT IS A HEYWOOD CASE AND HOW IS IT TREATED USING SEM? The term Heywood case refers to a factor solution that produces an error variance estimate of less than 0 (a negative error variance). Heywood cases are particularly problematic in CFA models with small samples or when the three-indicator rule is not followed. So, avoiding small samples and constructs with only one or two indicators is a way to avoid Heywood cases. If a Heywood case occurs, with the resulting identification issues, the offending variable, if it can be isolated, can be deleted. Alternatively, an added constraint such as fixing factor loadings to be equal, as in tau-equivalence, may enable a CFA solution.
10.
WHAT IS THE DIFFERENCE BETWEEN A GOODNESS OF FIT AND A BADNESS OF FIT INDEX? A goodness of fit index, like the CFI, works in a way such that higher values represent better fit. In this case, the closer the values are to one, the better the fit. A badness of fit index, like the RMSEA, works in a way such that smaller values represent better fit. For the RMSEA, values that are progressively closer to zero indicate better fit.
11.
IS IT POSSIBLE TO ESTABLISH PRECISE CUT-OFFS FOR CFA FIT INDICES? EXPLAIN YOUR ANSWER. No, it isn't possible to establish precise cut-offs. The same fit criteria apply for CFA as for SEM in general. Recalling from the previous chapter, there are too many things that can affect the various indices from situation to situation to expect a one-size-fits-all fit index value. In particular, the number of constructs, measured items and the sample size involved in a CFA all can affect common fit indices.
12.
DESCRIBE THE STEPS IN A SPECIFICATION SEARCH. A specification search is an empirical trial-and-error approach that may lead to sequential changes in the model based on key model diagnostics. The steps involved in conducting one are:
o Examine the factor loadings: are any too low?
o Examine the residuals: are any too high?
o Examine the modification indices: do any suggest a significant improvement in fit if freed?
o Estimate the model after making the changes suggested through these stages: is the fit significantly better? If the fit is now acceptable, stop.
o Otherwise, repeat the process until an acceptable fit is found.
13.
WHAT CONDITIONS MAKE A SECOND ORDER FACTOR MODEL APPROPRIATE?
o Is there a theoretical reason to expect that two conceptual layers of a construct exist?
o Are all the first-order factors expected to influence other nomologically related constructs in the same way?
o Are the higher-order factors going to be used to predict other constructs of the same general level of abstraction (i.e., global personality predicting global attitudes)?
o Are the minimum conditions for identification and good measurement practice present in both the first-order and higher-order layers of the measurement theory?
If the answer to all of these questions is yes, a second order factor model can be justified.
14.
WHAT CONDITIONS MUST BE SATISFIED IN ORDER TO DRAW VALID CONCLUSIONS ABOUT DIFFERENCES IN RELATIONSHIPS AND DIFFERENCES IN MEANS BETWEEN THREE DIFFERENT GROUPS OF RESPONDENTS: ONE FROM CANADA, ONE FROM ITALY AND ONE FROM JAPAN? EXPLAIN YOUR RESPONSE. Measurement Invariance Must Exist in Two Forms:
o Metric Invariance: the loading estimates do not vary across the groups (Canada, Italy, and Japan). Valid relationship comparisons can be made if at least partial metric invariance exists. This means at least two loading estimates per construct have to be the same across the groups.
o Scalar Invariance: the measured variable intercept terms must be the same across the Canadian, Italian, and Japanese sampling units. To make valid mean comparisons, at least partial scalar invariance must exist. This means at least two intercept terms per construct have to be the same across the groups.
15. AN INTERVIEWER COLLECTS DATA ON AUTOMOBILE SATISFACTION. TEN QUESTIONS ARE COLLECTED VIA A PERSONAL INTERVIEW. THEN, THE RESPONDENT RESPONDS TO ANOTHER 20 ITEMS BY MARKING THE ITEMS USING A PENCIL. HOW CAN CFA BE USED TO TEST WHETHER OR NOT THE QUESTION FORMAT HAS BIASED THE RESULTS? Two additional factors can be introduced to the CFA. One should represent the influence of the paper and pencil instrument. The other should represent the personal interview format. If the introduction of the two method factors changes the parameter estimates and fit of the original automobile satisfaction model significantly, then the methods are biasing the results.
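A small Python sketch of the chi-square difference test that underlies these invariance comparisons is shown below; the fit statistics are invented for illustration, and scipy is assumed to be available:

```python
from scipy.stats import chi2

# Hypothetical fit statistics (made-up numbers) for an invariance test:
# the configural (unconstrained) model vs. the metric (equal-loadings) model.
chisq_configural, df_configural = 412.6, 240
chisq_metric,     df_metric     = 428.1, 252

# Chi-square difference test: constrained model minus unconstrained model.
delta_chisq = chisq_metric - chisq_configural
delta_df = df_metric - df_configural
p_value = chi2.sf(delta_chisq, delta_df)

# A nonsignificant difference suggests the constrained estimates can be treated as invariant.
print(f"delta chi-square = {delta_chisq:.1f}, delta df = {delta_df}, p = {p_value:.3f}")
```

A nonsignificant difference when loadings (or intercepts) are constrained equal across the groups is the evidence of metric (or scalar) invariance described above.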
3. Models can be either recursive or nonrecursive. A model is considered recursive if the paths between constructs all proceed only from the predictor (antecedent) construct to the dependent or outcome construct (consequences). A nonrecursive model contains feedback loops. A feedback loop exists when a construct is seen as both a predictor and an outcome of another single construct. The feedback loop can either involve a direct or an indirect relationship.
3. The CFA fit provides a useful baseline to assess the structural or theoretical fit. Since a recursive structural model cannot fit any better (have a lower χ2) than the overall CFA, one can conclude that the structural theory lacks validity if the structural model fit is substantially worse than the CFA model fit.
4. Theoretical validity increases to the extent that the parameter estimates are: Statistically significant and in the predicted direction. That is, they are greater than zero for a positive relationship and less than zero for a negative relationship. Nontrivial. This can be checked using the completely standardized loading estimates. The guideline here is the same as in other multivariate techniques. Recall that SEM provides an excellent tool for testing theory. Therefore, any relationship revealed in a post hoc analysis provides only empirical evidence not theoretical support. For this reason, post hoc identified relationships should not be relied upon in the same way as the original theoretical relationships.
2. Multi-Group Analysis
Multiple group analysis for structural models is an extension of the multiple group CFA case. The interest now focuses on similarities and differences between structural parameters, indicating differences in relationships between the groups. Researchers often develop a theory that predicts that one or more structural relationships vary between groups. This frequently involves a test of moderation.
2.
HOW CAN A MEASURED VARIABLE REPRESENTED WITH A SINGLE ITEM BE INCORPORATED INTO AN SEM MODEL? The relationship between the variable and the latent construct is set (fixed) to the square root of the estimated reliability. The corresponding error term is set to 1 minus the reliability estimate.
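For a standardized measured variable, the rule works out numerically as follows (the .81 reliability is an illustrative value, not one from the manual):

$$ \lambda = \sqrt{.81} = .90, \qquad \theta_{\varepsilon} = 1 - .81 = .19 $$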
3.
WHAT IS THE DISTINGUISHING CHARACTERISTIC OF A NONRECURSIVE MODEL? One construct can be thought of as causing another construct and, at the same time, be considered as an outcome of that same construct. Visually, an arrow can be drawn from one construct to another and a separate arrow can be drawn in the opposite direction.
4.
HOW IS THE VALIDITY OF AN SEM MODEL ESTIMATED? By assessing the overall fit of the model. By assessing the statistical significance, size and direction of the relationships between constructs. Are they consistent with predictions and nontrivial?
5.
WHAT IS A MAJOR CONCERN WHEN USING SEM TECHNIQUES WITH LONGITUDINAL DATA? One of the key issues in modeling longitudinal data with SEM involves added sources of covariance associated with taking measures on the same units over time.
6.
WHAT IS PLS AND HOW IS IT DIFFERENT FROM SEM? PLS treats the factors as individual composite scores. In other words, it does not try to recreate the covariance between measured item scores.
Degrees of freedom do not have a meaningful role in PLS as they do in SEM.
PLS does not generally rely on optimization procedures as does SEM.
PLS models have fewer problems with statistical identification and fatal errors that prevent solutions.
PLS finds solutions based on minimizing the unexplained (residual) variance in endogenous constructs; SEM attempts to reproduce the observed covariance between measured items.
PLS cannot distinguish formative and reflective indicators.
PLS does not require the characteristics of good measurement to produce results.
PLS is less sensitive to sample size considerations.
7.
DRAW A STRUCTURAL MODEL HYPOTHESIZING THAT 3 EXOGENOUS CONSTRUCTS, X, Y AND Z, EACH AFFECT A MEDIATING CONSTRUCT, M, WHICH IN TURN DETERMINES TWO OTHER OUTCOMES, P AND R.
[Structural diagram: X, Y, and Z each have a path to the mediating construct M; M has paths to the two outcomes P and R.]
8.
HOW CAN SEM TEST FOR A MODERATING EFFECT? Multi-group SEM is often used to test moderating effects. A multi-group SEM model is conducted as described in the previous chapter for multi-group CFA. The procedures that are used for testing moderation in this manner follow closely along with the tests of invariance performed with CFA. That is, the same SEM model structure is used with both groups. The model fit can be compared to an alternative model containing the same pattern of free and fixed elements, with the exception that the relationship representing the moderating effect is made equal in both groups. If the model holding the relationship equal is not significantly worse, then evidence suggests there is no moderating effect. Alternatively, continuous variable interactions can be computed and used with SEM. This is the less preferred application in most research contexts.
9.
WHY IS IT SO IMPORTANT TO FIRST EXAMINE THE RESULTS OF A MEASUREMENT MODEL BEFORE PROCEEDING TO TEST A STRUCTURAL MODEL? To assess the construct validity of a set of measures. Significant structural relationships between constructs that lack validity would have little meaning.
Chapter 1: Introduction
Circle the letter of the item which best answers the question.
1. Multivariate analysis is difficult to define. Which of the below statements most adequately defines this type of analysis?
a. Examining relationships between or among more than two variables.
b. Simultaneously analyzing multiple measurements using statistical methods.
c. An analysis in which all variables are interrelated in such ways that their different effects cannot be easily studied separately.
d. Multivariate analysis is best defined as an extension of univariate analysis and/or bivariate analysis.
2. Multivariate techniques can be classified as either dependence or interdependence methods. From the list below pick the pair of techniques that can both be classified under the dependence methods.
a. Factor analysis and cluster analysis
b. Multiple discriminant analysis and multivariate analysis of variance
c. Multiple regression analysis and multidimensional scaling
d. Cluster analysis and conjoint analysis
3. Multiple discriminant analysis is useful in situations where:
a. The total sample can be divided into groups based on descriptive variables.
b. The total sample can be divided into groups based on independent variables.
c. The total sample can be divided into groups based on a dependent variable.
d. The total sample can be divided into groups based on a combination of independent and dependent variables.
4. The successful categorization of dependence methods requires two characteristics:
a. The number of independent and dependent variables.
b. The number and measurement scale of the independent variables.
c. The number and measurement scale of the dependent variables.
d. The number of dependent variables and the measurement scale of both independent and dependent variables.
5. Multiple regression is best described as a:
a. Nonmetric dependence method.
b. Nonmetric independence method.
c. Metric independence method.
d. Metric dependence method.
6. Data analysis involves the basic nature of:
a. Testing, evaluating, and gathering information.
b. Assigning, labeling, and manipulating information.
c. Partitioning, identifying, and measuring information.
7. Cluster analysis is used to:
a. Transform consumer judgments of similarity or preference into distances represented in multidimensional space.
b. Analyze the interrelationships among a large number of variables.
c. Explore simultaneously the relationship among several independent variables.
d. Develop meaningful subgroups of individuals or objects.
8. Measurements with a nominal scale involve:
a. The most precise measurements available in data analysis.
b. Showing the relation to the amount of the attribute possessed.
c. Assigning numbers which are used to label subjects or objects.
d. An absolute zero point.
9. If you had nonmetric or qualitative data, one would be using:
a. nominal scale.
b. interval scale.
c. ratio scale.
d. ordinal scale.
e. d or a
10. An example of a multivariate technique that would be appropriate for use with a nonmetric dependent variable and a metric independent variable is:
a. multiple discriminant analysis.
b. multiple regression analysis.
c. multivariate analysis of variance.
d. canonical correlation analysis.
7. Outliers are not the result of:
a. Procedural errors.
b. Extraordinary observations associated with a specific event.
c. An unrepresentative sample.
d. An ordinary value which is unique when combined with other variables.
8. The most fundamental assumption in multivariate analyses is:
a. Independence of error terms.
b. Linearity.
c. Normality.
d. Homoscedasticity.
9. The Box's M test is used to test for the assumption of:
a. Independence of error terms.
b. Linearity.
c. Normality.
d. Homoscedasticity.
10. All of the following statements are true, except:
a. Heteroscedasticity can only be remedied by transformation of the independent variable.
b. To conduct a transformation, the ratio of a variable's mean divided by its standard deviation should be less than 4.0.
c. Simple transformations change the interpretation of the variables.
d. When choosing between the transformation of two variables, the variable with the smallest ratio of a variable's mean divided by its standard deviation should be chosen.
7. Two ways to rotate factors in factor analysis are:
a. oblique rotation, orthogonal rotation.
b. factor rotation, main rotation.
c. orthogonal rotation, centroid rotation.
d. linear rotation, parameter rotation.
8. Unrotated factor solutions always achieve the objective of:
a. data reduction.
b. data summarization.
c. bivariate reduction.
d. treatment reduction.
9. The most commonly used technique in factor extraction is called the:
a. a priori criterion.
b. percentage of variance criterion.
c. scree test criterion.
d. latent root criterion.
10. In testing the significance of factor loadings, the larger the sample size:
a. the larger the loading to be considered significant.
b. the smaller the loading to be considered significant.
c. the larger the residual variance.
d. the smaller the residual variance.
5. The coefficient of determination is used:
a. to assess the relationship between the dependent and the independent variables.
b. as a guide to the relative importance of the predictor variables.
c. as a prediction in estimating the size of the confidence interval.
d. to test the different coefficients of each independent variable.
6. The two most common approaches to regression analysis are:
a. backward elimination, parameter elimination.
b. stepwise forward estimation, backward elimination.
c. the interval scale, ratio scale.
d. multiple estimation, background elimination.
7. The coefficients resulting from standardized data are called:
a. alpha coefficients.
b. variable coefficients.
c. beta coefficients.
d. freedom coefficients.
8. The method of elimination of the variables from the regression model is done through a:
a. correlation matrix.
b. confidence level measurement.
c. residual plot.
d. determination of the standard deviation of the data.
9. In testing the normality of error term distribution one can use three procedures. The simplest method is:
a. testing the criterion variables for slack.
b. construction of histograms.
c. measure the variance of the plot of residuals.
d. looking at the appropriateness of the F-statistic.
10. Backward elimination involves:
a. looking at each variable for consideration in the model prior to the inclusion of the variable into the developing equation.
b. computing a regression equation with all the variables, and then going back and deleting those independent variables which are nonsignificant.
c. examining the collinearity diagnostics.
d. examining the t-value for the original variables in the equation.
4. One assumption in deriving discriminant functions is the:
a. abnormality of the distributions and a positive centroid.
b. unequal costs of misclassifications.
c. normality of the distributions and unknown dispersion and covariance structures.
d. normality of the distributions and known high-low ratio.
5. One objective for applying discriminant analysis is:
a. to determine if statistically significant differences exist between the high ratios of the defined a priori groups.
b. to determine which independent variables account for the most difference in the average score profiles of the groups being analyzed.
c. to determine the model which possesses the property of additivity and homogeneity, with the highest R2 value.
d. to determine the correlations between a single dependent variable and several independent variables.
6. The three stages of discriminant analysis include:
a. rotation, correlation, and application.
b. extraction, derivation, and projection.
c. derivation, calibration, and interpretation.
d. analyzing, predicting, and rotation.
7. The simultaneous method is:
a. a computational method utilized in deriving discriminant functions.
b. useful when the analyst wants to consider a relatively large number of independent variables for inclusion.
c. necessary in order to clarify the usefulness of the Mahalanobis D statistic.
d. useful in developing a classification matrix that discriminates significantly.
8. The maximum chance criterion should be used when the sole objective:
a. is to estimate the hold-out samples.
b. is to maximize the percentage correctly classified.
c. is to interpret the magnitude of the standardized discriminant weights.
d. is to measure the linear correlation between independent variables.
9. Discriminant loadings are most commonly used to:
a. examine the sign and magnitude of the standardized discriminant weights.
b. develop a classification matrix to assess the predictions of the function.
c. evaluate the group differences.
d. measure the simple linear correlation between each independent variable and the discriminant function.
10. The interpretation phase of discriminant analysis involves the three methods of:
a. standardized discriminant weights, discriminant structure correlations, and partial F-values.
b. chance models, cutting score determination, and stepwise method.
c. simultaneous method, discriminant structure, and partial F-values.
d. deviation, validation, and standardization.
5. The primary function of an experimental design is to serve as:
a. a hypothesis for the experiment in general.
b. an analysis sample taken from the whole population.
c. the dummy variables used in the regression analysis.
d. a control mechanism to provide more confidence in your relationships among variables.
6. In "analysis of variance" designs:
a. metric dependent variables are used with metric independent variables.
b. metric independent variables are used with nonmetric independent variables.
c. nonmetric independent variables are used with nonmetric dependent variables.
d. nonmetric dependent variables are used with nonmetric independent variables.
7. Random effects designs assume the groups being studied are a random sample from a larger population with a:
a. known mean and variance.
b. unknown standard deviation and mean.
c. unknown mean and known variance.
d. unknown mass and variance.
8. When one is testing the significance of difference among three or more treatment groups, he must use:
a. only the Hotelling's T2 statistic.
b. both the Hotelling's T2 statistic and the Mahalanobis D2 statistic.
c. only the Mahalanobis D2 statistic in conjunction with Wilk's lambda.
d. the Wilk's lambda.
9. When one has two or more factors, each at two or more levels, he has what is known as a:
a. full factorial design.
b. bivariate factorial design.
c. fixed-effect factorial design.
d. random effect factorial design.
10. Residual or error variance should be:
a. not normally distributed, with unequal error variance among the cells.
b. normally distributed, with dependent variables.
c. normally distributed, with equal error variance among the cells.
d. bimodally distributed, with dependent variables.
6. The ________ method of presentation of the objects is presumed to be most realistic.
a. trade-off
b. full profile
c. both are equally realistic
d. Realism is not a consideration in presentation methods.
7. The part-worth relationship which is most similar to the relationships found in past multivariate techniques is:
a. separate.
b. quadratic.
c. linear.
d. nonlinear.
e. conjoint not comparable in this regard
8. The most appropriate use of a conjoint analysis choice simulator is for:
a. assessing the importance of new attributes.
b. incorporating new choice objects into the estimation of the part-worths.
c. assessing preferences for a specified set of objects.
d. defining segments of consumers with similar part-worth profiles.
e. All of the above are appropriate uses for choice simulators.
9. Conjoint analysis is most closely like which of these multivariate techniques?
a. factor analysis
b. analysis of variance (ANOVA)
c. cluster analysis
d. discriminant analysis
e. multidimensional scaling
10. Which of the following is not a feature which distinguishes conjoint analysis from other multivariate techniques?
a. can predict nonlinear as well as linear relationships
b. predicts relationships for each respondent
c. can be estimated in which each level of a variable has no relationship to other levels
d. a and b only
e. a and c only
6. Cluster analysis is best seen as a data reduction technique when it:
a. defines the sample in a small number of clusters.
b. represents similarity in a single multivariate measure.
c. identifies the most distinguishing characteristics of the differences among clusters.
d. a and c
e. None of the above.
7. In the absence of absolute guidelines, the number of clusters selected to represent the "natural groupings" of the sample are determined by:
a. theoretical or conceptual guidelines.
b. measures of internal consistency.
c. practical considerations.
d. the decision can only be made based on all of these measures.
8. The validation stage involves:
a. ensuring that the cluster solution is representative of the population.
b. assessing the accuracy of the similarity measure used.
c. determining that significant differences do exist between the clusters based on the characteristics used to define similarity.
d. All of the above.
9. The linkage technique that minimizes the effects of extreme observations is:
a. complete linkage.
b. Ward's method.
c. single linkage.
d. average linkage.
10. Which of the following is of most concern when using a nonhierarchical technique?
a. the measure of similarity
b. the linkage method
c. the specification of seed points
d. construction of the dendrogram
6. In MDS, nonmetric methods assume the use of:
a. ordinal input.
b. metric output.
c. nominal input.
d. nonmetric output.
e. c, d only
f. a, b only
7. In MDS, metric methods assume the use of:
a. metric input and metric output.
b. nonmetric input and nonmetric output.
c. nominal input and ordinal output.
d. factor input and trace output.
8. One procedure for gathering preference data is to rate each stimulus on a(n):
a. implicit scale.
b. explicit scale.
c. disjoint scale.
d. factor scale.
9. ________ is the perceptual mapping technique based on the association of nominal data.
a. Correspondence analysis
b. Individual difference analysis (INDSCAL)
c. Property fitting (PROFIT)
d. Internal preference analysis
e. External preference analysis
10. Which of the following is not an assumption of MDS?
a. all respondents will perceive objects in the same dimensionality
b. respondents place different levels of importance on the dimensions of an object
c. perceptions are not stable over time
d. all of the above are assumptions of MDS
4) How is an exogenous construct represented in an SEM path model?
a. The construct is represented by a box with arrows going into it.
b. The construct is represented by an oval with arrows going into it.
c. The construct is represented by a box with no arrows.
d. The construct is represented by an oval with no arrows going into it.
e. The construct is represented by an oval with arrows going into it from other constructs (ovals).
5) SEM algorithms are designed to:
a. maximize the chi-square likelihood function.
b. minimize the difference between the observed and estimated covariance matrices.
c. minimize the covariances between constructs.
d. minimize the RMSEA.
e. not a, b, c or d
6) Which is not an absolute fit index?
a. CFI
b. SRMR
c. RMSEA
d. chi-square
e. GFI
7) Which of the following is the accepted cut-off value for the TLI and CFI that distinguishes a good from a poor fit?
a. .07
b. .90
c. .95
d. .97
e. There is no single accepted cut-off value.
8)
6. What conditions below are sufficient to justify deleting a scale item from analysis?
a. a standardized residual of 3.0
b. a standardized factor loading estimate of .60
c. theoretical inspection suggesting it does not match the definition as well as do other items
d. a and b
e. a, b and c are each sufficient
7. Suppose a researcher is interested in comparing measurement theory results from respondents in South Korea with respondents from Australia. The researcher is specifically interested in whether or not the same scales can be used to capture the same constructs in each country. What type of cross-validation is sufficient to draw this conclusion?
a. loose cross-validation
b. full metric invariance
c. strong factorial invariance
d. factor loading equivalence
e. partial scalar invariance
8. Which statistic is most useful for testing invariance and drawing conclusions about differences between groups?
a. χ2
b. SRMR
c. R2
d. AGFI
e. GFI
5. A researcher tests a structural model of sales performance using a sample of 760 professional salespeople. The researcher determines that a significant relationship exists between salesperson intelligence and sales volume. The standardized path estimate is significant and equal to .06. Management dismisses the finding. What is a reason?
a. The relationship is trivial.
b. The sample size is probably too small.
c. The sample size is probably too large.
d. The model fit is poor.
e. A and C
6. ________ occurs when a third variable or construct changes the relationship between two related variables/constructs.
a. An indirect effect
b. A mediating effect
c. A moderating effect
d. A direct effect
e. Correlation
7. ________ occurs when the relationship between a predictor and an outcome is reduced but remains significant when a mediator is also entered as an additional predictor.
a. Moderation
b. Mediation
c. Partial moderation
d. Partial mediation
e. Parsimony
8. ________ refers to after-the-fact tests of relationships for which no hypothesis was theorized; in other words, a path is tested where the original theory did not indicate a path.
a. Moderation
b. Mediation
c. Partial mediation
d. Post hoc analysis
e. Structural theory test
9. When a theory hypothesizes that two constructs in a six construct model are unrelated to each other, the path between those two constructs should:
a. be free.
b. be fixed.
c. be drawn as a single headed arrow.
d. be assigned a value of 1.
e. be drawn as a double headed arrow.