Factor Analysis Presentation
Factor Analysis
George J. Knafl, PhD Professor & Senior Scientist [email protected]
Purpose
to describe and demonstrate factor analysis of survey instrument data
primarily for assessment of established scales with some discussion of the development of new scales
using the Statistical Package for the Social Sciences (SPSS) and the Statistical Analysis System (SAS)
a PDF copy of the slides is available on the Internet at
http://www.ohsu.edu/son/faculty/knafl/factoranalysis.html
Overview
1. examples of established scales
2. principal component analysis vs. factor analysis
terminology and some primary factor analysis methods
3. factor extraction
survey of alternative methods
4. factor rotation
interpreting the results in terms of scales
A Simple Example
subjects undergoing radiotherapy were measured on 6 dimensions [1, p. 33]
number of symptoms
amount of activity
amount of sleep
amount of food consumed
appetite
skin reaction
can these be grouped into sets of related measures to obtain a more parsimonious description of what they represent?
perhaps there are really only 2 distinct dimensions for these 6 variables?
Survey Instruments
survey instruments consist of items
with discrete ranges of values, e.g., 1, 2, …
Example 1 - SDS
symptom distress scale [2]
symptom assessment for adults with cancer 13 items scored 1,2,3,4,5 measuring distress experience related to severity of 11 symptoms
nausea, appetite, insomnia, pain, fatigue, bowel pattern, concentration, appearance, outlook, breathing, cough and the frequency as well for nausea and pain
1 total scale
sum of the 13 items with none reverse coded higher scores indicate higher levels of symptom distress
Example 2 - CDI
Children's Depression Inventory [3]
27 items scored 0,1,2 assessing aspects of depressive symptoms for children and adolescents 1 total scale
sum of the 27 items after reverse coding 13 of them higher scores indicate higher depressive symptom levels
Example 3 FACES II
Family Adaptability & Cohesion Scales [4]
has several versions, will consider version II 30 items scored 1,2,3,4,5 2 scales
family adaptability
family's ability to alter its role relationships and power structure sum of 14 of the items after reverse coding 2 of them higher scores indicate higher family adaptability
family cohesion
the emotional bonding within the family sum of the other 16 of the items after reverse coding 6 of them higher scores indicate higher family cohesion
2 scales are typically used separately, but are sometimes summed to obtain a total FACES scale
Example 4 - DQOLY
sum of 23 of the items after reverse coding 1 of them higher scores indicate higher negative impact (worse QOL)
diabetes-related worries
sum of 11 other items with none reverse coded higher scores indicate more worries (worse QOL)
the 3 scales are typically used separately and not usually combined into a total scale
the youth version of the scale is appropriate for children 13-17 years old
also has a school age version for children 8-12 years old and a parent version
Example 5 - FACT
Functional Assessment of Cancer Therapy [6]
27 general (G) items scored 0-4 4 subscales
physical, social/family, emotional, functional subscales sums of 6-7 of the general items with some reverse coded
1 scale
the FACT-G total scale: the sum of the 4 subscales; higher scores indicate better levels of quality of life
Example 7 - FMSS
Family Management Style Survey
a survey instrument currently under development parents of children having a chronic illness are being interviewed on how their families manage their child's chronic illness
as many parents as are willing to participate
Scale Development/Assessment
as part of scale development, an initial set of items is reduced to a final set of items which are then combined into one or more scales and possibly also subscales
established scales, when used in novel settings, need to be assessed for their applicability to those settings
such issues can be addressed in part using factor analysis techniques
will address these using data for the CDI, FACES II, DQOLY, and FMSS instruments
starting with a popular approach related to principal component analysis (PCA)
associated with the z's are an equal # of principal components (PC's) each PC can be expressed as a weighted sum of z's
this is how they are defined and used for a standard PCA
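as a concrete illustration (a numpy sketch with simulated data, not from the slides; the variable names are mine), the PC's can be computed as weighted sums of the standardized items z, with the weights coming from the eigenvectors of the correlation matrix R:

```python
import numpy as np

# Illustrative sketch with simulated data (not from the slides):
# PC's computed as weighted sums of the standardized items z.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))                 # 50 cases, 6 items
Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize -> the z's
R = np.corrcoef(Z, rowvar=False)             # 6 x 6 correlation matrix

evals, evecs = np.linalg.eigh(R)             # eigendecomposition of R
order = np.argsort(evals)[::-1]              # largest eigenvalue first
evals, evecs = evals[order], evecs[:, order]

PC = Z @ evecs                               # each PC is a weighted sum of z's
# the variance of the j-th PC equals its eigenvalue
```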
Variable Reduction
PCA can be used to reduce the # of variables one such use is to simplify a regression analysis by reducing the # of predictor variables
predict a dependent variable using the first few PC's determined from the predictors, not all predictors
diminishing returns to using more factors (or PC's), but hopefully there is a natural separation point
Radiotherapy Data
can we model the correlation matrix R as if its 6 dimensions were determined by 2 factors?
skin reaction is related to none of the others while appetite is related to the other 4 variables
Correlations (Pearson r, with 2-tailed significance in parentheses; N = 10 for all pairs; lower triangle shown)

                              NS             AA             AS            AF            AP            SR
Number of Symptoms (NS)       1
Amount of Activity (AA)       .842** (.002)  1
Amount of Sleep (AS)          .322 (.364)    .451 (.191)    1
Amount of Food Consumed (AF)  .412 (.237)    .610 (.061)    .466 (.174)   1
Appetite (AP)                 .766** (.010)  .843** (.002)  .641* (.046)  .811** (.004) 1
Skin Reaction (SR)            .348 (.325)    -.116 (.749)   .005 (.989)   .067 (.854)   .102 (.778)   1
**. Correlation is significant at the 0.01 level (2-tailed). *. Correlation is significant at the 0.05 level (2-tailed).
same approach used with any factor extraction method
since the same k factors F are used with each z, they are called common factors
but different loadings L are used with each z
different or unique errors u are also used with each z
hence they are called the unique (or specific) factors
loadings are combined as entries in a matrix called the factor (pattern) matrix
1 row for each standardized item z
each containing loadings on all k factors for that standardized item
loadings are usually rotated and ordered to be better able to allocate them to factors
Component Matrix(a)

                          Component 1   Component 2
Number of Symptoms        .827          .361
Amount of Activity        .903          -.152
Amount of Sleep           .659          -.230
Amount of Food Consumed   .790          -.128
Appetite                  .977          -.037
Skin Reaction             .134          .955

Extraction Method: Principal Component Analysis.
a. 2 components extracted.
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 3 iterations.
Communalities
part of each z is explained by the common factors
z = L(1)·F(1) + L(2)·F(2) + … + L(k)·F(k) + u
the communality for z is the amount of its variance explained by the common factors (hence its name)
1 = VAR[z] = VAR[L(1)·F(1) + L(2)·F(2) + … + L(k)·F(k)] + VAR[u]
variances add up due to independence assumptions
the variance of the unique factor u is called the uniqueness
1 = VAR[z] = communality + uniqueness
so the communality is between 0 and 1
u is also called the specific factor for z and then its variance is called the specificity
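the communality/uniqueness identity can be checked numerically; this sketch uses the 2-component loadings reported on the earlier radiotherapy slide (an illustrative numpy sketch, not SPSS/SAS output):

```python
import numpy as np

# Sketch of the communality/uniqueness identity, using the 2-component
# loadings reported on the earlier radiotherapy slide (illustrative only).
L = np.array([
    [.827,  .361],   # number of symptoms
    [.903, -.152],   # amount of activity
    [.659, -.230],   # amount of sleep
    [.790, -.128],   # amount of food consumed
    [.977, -.037],   # appetite
    [.134,  .955],   # skin reaction
])
communality = (L ** 2).sum(axis=1)   # variance explained by the common factors
uniqueness = 1 - communality         # variance left to the unique factor u
# for each standardized item z: communality + uniqueness = 1
```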
then subtracted from 1 to get the uniqueness for z but need initial values for the communalities to start the computations
The PC Method
start by setting all communalities equal to 1
they stay that way if all the factor scores are used
but they were re-estimated based on loadings for the 2 extracted factors
the new values are < 1 as they should be when the # of factors < the # of items
Communalities

                          Initial   Extraction
Number of Symptoms        1.000     .814
Amount of Activity        1.000     .838
Amount of Sleep           1.000     .488
Amount of Food Consumed   1.000     .641
Appetite                  1.000     .956
Skin Reaction             1.000     .930
Initial Communalities
the principal component (PC) method
all communalities start out as 1 and are then recomputed from the extracted factors
for both of these, can stop after the first step or iterate the process until the communalities do not change much
a problem occurs when communalities come out larger than 1 though
random settings
generate random numbers between 0 and 1
PC-Based Alternatives
1-step principal component (PC) method
set communalities all to an initial value of 1
compute loadings and factor scores
re-estimate the communalities from these and stop
iterated version available in SAS but not in SPSS
Eigenvalues
each factor F (or PC or FS) has an associated eigenvalue EV
also called a characteristic root since by definition it is a solution to the so-called characteristic equation for the correlation matrix R
the sum of the eigenvalues over all factors equals the total variance
sum of the EV's = total variance = # of items
so an eigenvalue measures how much of the total variance of the z's is accounted for by its associated factor (or PC)
in other words, factors with larger eigenvalues contribute more towards explaining the total variance of the z's
the eigenvalue-one (EV-ONE) rule says to use the factors with eigenvalues > 1 and discard the rest
an eigenvalue > 1 means its factor contributes more to the total variance than a single z
since each z has variance 1 and so contributes 1 to the total variance
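the rule can be sketched numerically by applying it to the radiotherapy correlation matrix transcribed from the earlier slide (illustrative numpy sketch, not from the slides):

```python
import numpy as np

# Sketch of the EV-ONE rule applied to the radiotherapy correlation
# matrix transcribed from the earlier slide (illustrative only).
R = np.array([
    [1.000,  .842,  .322,  .412,  .766,  .348],  # number of symptoms
    [ .842, 1.000,  .451,  .610,  .843, -.116],  # amount of activity
    [ .322,  .451, 1.000,  .466,  .641,  .005],  # amount of sleep
    [ .412,  .610,  .466, 1.000,  .811,  .067],  # amount of food consumed
    [ .766,  .843,  .641,  .811, 1.000,  .102],  # appetite
    [ .348, -.116,  .005,  .067,  .102, 1.000],  # skin reaction
])
evals = np.linalg.eigvalsh(R)[::-1]   # eigenvalues, largest first
n_factors = int((evals > 1).sum())    # EV-ONE: keep eigenvalues > 1
```

consistent with the earlier slides, this selects 2 factors for the 6 radiotherapy items.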
[Scree plot: eigenvalue vs. component number for the 6 radiotherapy items]
look for the point on the x-axis separating the "cliff" from the "debris" at its bottom, i.e., a large change in slope
the sum of the squared loadings over all z is the portion of the total variance explained by F
so this sum equals the eigenvalue EV for F
the correlation between any 2 z's is the sum of the products of their loadings on each of the factors
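these two facts can be verified numerically for a full PC solution (simulated data; an illustrative sketch only):

```python
import numpy as np

# Sketch (simulated data) verifying the two facts above for a full PC
# solution: squared loadings on a factor sum to its eigenvalue, and with
# ALL factors kept the loadings reproduce R exactly; with k < # of items
# the reproduction is only approximate.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
R = np.corrcoef(X, rowvar=False)

evals, evecs = np.linalg.eigh(R)
L = evecs * np.sqrt(evals)   # full loading matrix: columns scaled by sqrt(EV)
```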
for the 103 adolescents with type 1 diabetes who responded at baseline to all the items of all 3 of these instruments
88.0% of the 117 subjects providing some baseline data
from Adolescents Benefit from Control (ABCs) of Diabetes Study (Yale School of Nursing, PI Margaret Grey) [9]
Communalities
Communalities

Item      Initial   Extraction
FACES1    1.000     .504
FACES2    1.000     .375
FACES3    1.000     .426
FACES4    1.000     .214
FACES5    1.000     .305
FACES6    1.000     .236
FACES7    1.000     .458
FACES8    1.000     .623
FACES9    1.000     .258
FACES10   1.000     .378
FACES11   1.000     .211
FACES12   1.000     .122
FACES13   1.000     .128
FACES14   1.000     .214
FACES15   1.000     .394
FACES16   1.000     .458
FACES17   1.000     .430
FACES18   1.000     .550
FACES19   1.000     .473
FACES20   1.000     .357
FACES21   1.000     .342
FACES22   1.000     .461
FACES23   1.000     .494
FACES24   1.000     .200
FACES25   1.000     .599
FACES26   1.000     .542
FACES27   1.000     .309
FACES28   1.000     .225
FACES29   1.000     .383
FACES30   1.000     .466
the initial communalities are all set to 1 for the PC method
they are then recomputed (in the "Extraction" column) based on the 2 extracted factors
all the recomputed communalities are < 1 as they should be for a factor analysis with k < 30
if 30 factors had been extracted (i.e., a standard PCA), the communalities would have all stayed 1
Loadings

the matrix of loadings
called the component matrix in SPSS for the PC method
30 rows, 1 for each item z; 2 columns, 1 for each factor F

Component Matrix(a)

Item      Component 1   Component 2
FACES1    .702          .110
FACES2    .611          -.037
FACES3    -.327         .565
FACES4    .454          .093
FACES5    .550          .048
FACES6    .356          .330
FACES7    .677          -.004
FACES8    .789          -.027
FACES9    -.318         .396
FACES10   .338          .513
FACES11   .387          .246
FACES12   -.335         .102
FACES13   .303          .191
FACES14   .185          .424
FACES15   -.231         .584
FACES16   .630          .247
FACES17   .655          -.031
FACES18   .689          .273
FACES19   -.487         .486
FACES20   .565          .195
FACES21   .564          .155
FACES22   .673          .088
FACES23   .686          -.151
FACES24   -.315         .317
FACES25   -.532         .562
FACES26   .731          .089
FACES27   .527          .178
FACES28   -.239         .410
FACES29   -.495         .371
FACES30   .647          .217
FACES1 loads much more highly on the first factor than on the second factor
since .702 is much larger than .110 and so FACES1 is said to be a marker item (or salient) for factor 1
Eigenvalues
Total Variance Explained

           Initial Eigenvalues                  Extraction Sums of Squared Loadings
Component  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1          8.360   27.867         27.867        8.360   27.867         27.867
2          2.777   9.255          37.122        2.777   9.255          37.122
3          1.804   6.012          43.134
4          1.593   5.309          48.443
5          1.413   4.712          53.155
6          1.305   4.350          57.505
7          1.266   4.221          61.726
8          1.150   3.835          65.560
9          .984    3.279          68.839
10         .898    2.992          71.831
11         .818    2.726          74.557
12         .770    2.567          77.124
13         .708    2.359          79.484
14         .681    2.268          81.752
15         .583    1.945          83.697
16         .563    1.876          85.573
17         .519    1.731          87.304
18         .481    1.604          88.908
19         .453    1.509          90.417
20         .407    1.357          91.774
21         .381    1.270          93.043
22         .361    1.204          94.248
23         .310    1.035          95.282
24         .280    .933           96.215
25         .251    .836           97.051
26         .226    .752           97.803
27         .209    .697           98.500
28         .192    .641           99.141
29         .155    .516           99.657
30         .103    .343           100.000
the "Total" column gives the eigenvalues in decreasing order the first 2 factors explain about 28% and 9% individually of the total variance
total variance = 30 since items are standardized
[Scree plot: eigenvalue vs. component number for the 30 FACES items]
Communalities
Communalities

Item      Initial   Extraction
FACES1    .650      .478
FACES2    .538      .343
FACES3    .615      .352
FACES4    .575      .182
FACES5    .594      .272
FACES6    .405      .178
FACES7    .582      .427
FACES8    .702      .609
FACES9    .501      .182
FACES10   .501      .308
FACES11   .511      .161
FACES12   .379      .101
FACES13   .410      .102
FACES14   .427      .129
FACES15   .462      .288
FACES16   .619      .428
FACES17   .699      .399
FACES18   .708      .531
FACES19   .574      .429
FACES20   .504      .321
FACES21   .617      .307
FACES22   .663      .433
FACES23   .723      .462
FACES24   .361      .150
FACES25   .665      .589
FACES26   .586      .522
FACES27   .534      .268
FACES28   .515      .154
FACES29   .489      .331
FACES30   .582      .437
the initial communalities are all estimated using associated squared multiple correlations
they are then recomputed based on the 2 extracted factors
all the initial and recomputed communalities are < 1 as they should be for a factor analysis with k < 30
Factor Matrix(a)

Loadings

Item      Factor 1   Factor 2
FACES1    .683       .107
FACES2    .585       -.035
FACES3    -.314      .503
FACES4    .423       .055
FACES5    .521       .031
FACES6    .332       .259
FACES7    .653       .001
FACES8    .780       -.025
FACES9    -.299      .304
FACES10   .322       .452
FACES11   .361       .176
FACES12   -.310      .070
FACES13   .281       .152
FACES14   .170       .316
FACES15   -.219      .490
FACES16   .610       .238
FACES17   .631       -.034
FACES18   .676       .272
FACES19   -.471      .455
FACES20   .538       .175
FACES21   .537       .137
FACES22   .651       .096
FACES23   .667       -.133
FACES24   -.294      .253
FACES25   -.525      .560
FACES26   .715       .100
FACES27   .498       .142
FACES28   -.223      .324
FACES29   -.473      .328
FACES30   .627       .208
PC vs. PF Methods
the use of the PC method vs. the PF method is thought to usually have little impact on the results
"one draws almost identical inferences from either approach in most analyses" [11, p. 535]
so far there seems to be only a minor impact to the choice of factor extraction method on the loadings for the FACES data
we will continue to consider this issue
Eigenvalues
[Total Variance Explained table: identical to the one shown earlier for the PC method]
exactly the same as for the PC method
in SPSS, eigenvalues are always computed using the PC method
even if a different factor extraction method is used
so always get the same choice for the # of factors with the EV-ONE rule and other related rules
but the factor loadings will change
Communalities
[Communalities table: initial values only, estimated by squared multiple correlations and identical to those in the earlier PF table; no "Extraction" column is produced]
the initial communalities are all estimated using associated squared multiple correlations
and so they are the same as before
but communalities based on the extraction, as well as the factor matrix, are not produced
the procedure did not converge because communalities over 1 were generated
suggests that the EV-ONE rule is of questionable value for the ABC FACES items
Factor Matrix(a)
a. Attempted to extract 8 factors. In iteration 25, the communality of a variable exceeded 1.0. Extraction was terminated.
Communality Anomalies
communalities are by definition between 0 & 1
but factor extraction methods can generate communalities > 1
Heywood case: when a communality = 1
ultra-Heywood case: when a communality > 1
SAS has an option that changes any communalities > 1 to 1, allowing the iteration process to continue and so avoiding the convergence problems reported for SPSS
the EV-ONE rule selects 10 factors
PAF did not converge in the default # of 25 iterations
but the # of iterations can be increased
in "Extraction..." change "Maximum Iterations for Convergence:" to 200 (it did not converge at 100)
after more iterations, extraction is terminated because some communalities exceed 1
again the EV-ONE rule appears to be of questionable value
[Scree plot: eigenvalue vs. factor number for the 27 CDI items]
but the scree plot suggests that 1 may be a reasonable choice for the # of factors
which is the recommended # of scales for CDI
or maybe 4
converges in 14 iterations
but the EV-ONE rule selects 15 factors, which seems like far too many
[Scree plot: eigenvalue vs. factor number for the 51 DQOLY items]
the scree plot, though, suggests that 3 may be a reasonable choice for the # of factors
which is the recommended # of scales for DQOLY
also rules based on % explained variance can generate much different choices for the # of factors
"basically inapplicable as a device to determine the # of factors" [11, p. 483]
in SAS, eigenvalue-based rules can generate different choices for the # of factors when applied to different factor extraction methods
4 factors are generated in this case for the FACES items instead of 8 as in SPSS
SPSS Code
SPSS is primarily a menu-driven system
statistical analyses are readily requested using its point and click user interface
equivalent code for a menu-driven analysis can be generated using the "paste" button
here is code for the most recent analysis
FACTOR
  /VARIABLES DQOLY1 TO DQOLY51
  /MISSING LISTWISE
  /ANALYSIS DQOLY1 TO DQOLY51
  /PRINT INITIAL EXTRACTION
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(200)
  /EXTRACTION PAF
  /ROTATION NOROTATE
  /METHOD=CORRELATION .
SAS also has a feature called Analyst for conducting menu-driven statistical analyses
click on Solutions/Analysis/Analyst to enter it
to request the 1-step PC method, use "METHOD=PRINCIPAL" with "PRIORS=ONE" (i.e., set initial/prior communalities to 1)
to request the EV-ONE rule, use "MINEIGEN=1"
to request a specific # f of factors, replace "MINEIGEN=1" with "NFACTORS=f"
to request the 1-step PF method, change to "PRIORS=SMC" (i.e., estimate the initial/prior communalities using the squared multiple correlations)
to iterate either of the above, change to "METHOD=PRINIT"
can use "MAXITER=m" to request more than the default of 30 iterations
adding "HEYWOOD" can avoid convergence problems
or choose "Number of factors:" and provide a specific integer f (no more than the # of items)
alpha factoring
maximizing the reliability (i.e., Cronbach's alpha) for the factors
maximum likelihood
treating the standardized items as multivariate normally distributed with factor analysis correlation structure
image factoring
Kaiser's image analysis of the image covariance matrix
a matrix computed from the correlation matrix R and the diagonal elements of its inverse matrix; related to the anti-image covariance matrix
the results for some methods can be affected by how the initial communalities are estimated
this covers the more commonly used methods [1,12]
will not demonstrate other available methods
described as lesser-used in [13, p. 362]
Cronbach's Alpha (α)
a measure of internal consistency reliability
is computed for each scale of an instrument separately
after reverse coding items when appropriate
is often the only quantity used to assess established scales, and so it seems desirable for scales to have maximum α
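a minimal sketch of the standard formula for α (not tied to SPSS or SAS; the function name is mine):

```python
import numpy as np

# Minimal sketch of the standard Cronbach's alpha formula (the function
# name is mine; alpha is computed per scale, after any reverse coding).
def cronbach_alpha(items):
    """items: cases x items array for a single scale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)         # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

perfectly consistent (parallel) items give α = 1, while unrelated items give α near 0.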
Factor Matrix(a)

Loadings

Item      Factor 1   Factor 2
FACES1    .672       .075
FACES2    .582       -.016
FACES3    -.289      .423
FACES4    .465       .164
FACES5    .526       .079
FACES6    .335       .279
FACES7    .683       -.012
FACES8    .794       -.022
FACES9    -.265      .367
FACES10   .312       .384
FACES11   .364       .276
FACES12   -.292      .123
FACES13   .268       .130
FACES14   .204       .406
FACES15   -.224      .566
FACES16   .592       .162
FACES17   .652       -.015
FACES18   .676       .215
FACES19   -.458      .402
FACES20   .546       .130
FACES21   .518       .112
FACES22   .649       .004
FACES23   .663       -.172
FACES24   -.298      .236
FACES25   -.514      .504
FACES26   .705       .017
FACES27   .521       .190
FACES28   -.245      .352
FACES29   -.474      .302
FACES30   .610       .152
the matrix of loadings
FACES1 once again loads much more highly on the first factor
since .672 is much larger than .075
once again the loadings have changed only a little
from .702 and .110 for the PC method
for the CDI items, it does not converge for 1, 2, or 3 factors
for DQOLY, it does not converge for 1 or 3 factors, but does converge for 2 factors
the alpha factoring method seems very unreliable
even when it works, its optimal properties are lost following rotation [11, p. 482]
estimates the correlation matrix R using its most likely value given the observed data assuming R has factor analysis structure and that item values are normally distributed or at least approximately so [1]
Loadings

Factor Matrix(a)

Item      Factor 1   Factor 2
FACES1    .692       .114
FACES2    .590       -.043
FACES3    -.328      .491
FACES4    .406       -.015
FACES5    .512       .003
FACES6    .321       .226
FACES7    .643       .026
FACES8    .769       -.004
FACES9    -.317      .230
FACES10   .314       .472
FACES11   .354       .091
FACES12   -.314      .033
FACES13   .279       .173
FACES14   .150       .240
FACES15   -.222      .426
FACES16   .614       .282
FACES17   .632       -.064
FACES18   .678       .316
FACES19   -.484      .488
FACES20   .536       .191
FACES21   .537       .157
FACES22   .644       .147
FACES23   .670       -.096
FACES24   -.298      .241
FACES25   -.538      .596
FACES26   .720       .151
FACES27   .486       .125
FACES28   -.218      .300
FACES29   -.482      .330
FACES30   .634       .235
the matrix of loadings
FACES1 once again loads much more highly on the first factor
since .692 is much larger than .114
the loadings have changed, but only a little
from .702 and .110 for the PC method
can search for the first # of factors for which this test becomes nonsignificant
significant for 7 factors, nonsignificant for 8 factors
but this is not close to the recommended # of 2 factors

Goodness-of-fit Test
Chi-Square    df     Sig.
290.767       246    .026
in any case, this test tends to generate "more factors than are practical" [11, p. 479]
the AIC option in SPSS syntax requests display of the anti-image covariance matrix
an AIC (BIC) value does not mean anything by itself
it needs to be compared to AIC (BIC) values for other models
so the total variance is now the sum of the variances for all the items and the EV-ONE rule should not be used
only works with some factor extraction methods
Factor Matrix(a)

Loadings

          Raw Factor            Rescaled Factor
Item      1         2           1         2
FACES1    .585      .094        .679      .109
FACES2    .610      -.035       .585      -.034
FACES3    -.381     .611        -.319     .511
FACES4    .515      .087        .445      .075
FACES5    .654      .063        .535      .051
FACES6    .447      .363        .334      .271
FACES7    .700      .004        .659      .004
FACES8    .878      -.020       .787      -.018
FACES9    -.349     .372        -.295     .315
FACES10   .400      .567        .318      .451
FACES11   .426      .220        .369      .191
FACES12   -.313     .068        -.304     .066
FACES13   .311      .158        .278      .141
FACES14   .185      .353        .173      .330
FACES15   -.250     .533        -.226     .483
FACES16   .640      .254        .604      .240
FACES17   .654      -.033       .625      -.031
FACES18   .704      .271        .659      .253
FACES19   -.580     .551        -.471     .447
FACES20   .579      .201        .532      .184
FACES21   .492      .124        .531      .134
FACES22   .710      .108        .650      .099
FACES23   .660      -.141       .666      -.143
FACES24   -.366     .294        -.304     .243
FACES25   -.556     .583        -.530     .555
FACES26   .739      .103        .707      .098
FACES27   .585      .172        .508      .150
FACES28   -.250     .363        -.217     .314
FACES29   -.500     .332        -.474     .315
FACES30   .673      .231        .618      .212
FACES1 once again loads much more highly on the first factor
since .679 is much larger than .109
the loadings have changed, but only a little
from .683 and .107 for the PAF method applied to the correlation matrix
there does not appear to be much of an impact from factoring the covariance matrix vs. the correlation matrix
then use these variables as predictors in regression models of appropriate outcome variables
Correlation Residuals
how much correlations generated by the factor analysis model differ from standard estimates of the correlations
measures how well the model fits correlations between items
when the covariance matrix is factored, covariance residuals are generated instead
to generate correlation residuals in SAS, add the "RESIDUALS" option to the PROC FACTOR statement to generate listings of these residuals
further adding the "OUTSTAT=" option gives a name to an output data set containing, among other things, the correlation residuals for further analysis
in SPSS, use "Reproduced" for the "Correlation matrix" option of "Descriptives..." to generate a listing of residuals
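an illustrative numpy sketch of correlation residuals for the radiotherapy example, using the R and 2-component loadings transcribed from earlier slides (not SPSS/SAS output):

```python
import numpy as np

# Sketch of correlation residuals: observed R minus the correlations
# reproduced from the k = 2 component loadings (radiotherapy values
# transcribed from earlier slides; illustrative only).
R = np.array([
    [1.000,  .842,  .322,  .412,  .766,  .348],
    [ .842, 1.000,  .451,  .610,  .843, -.116],
    [ .322,  .451, 1.000,  .466,  .641,  .005],
    [ .412,  .610,  .466, 1.000,  .811,  .067],
    [ .766,  .843,  .641,  .811, 1.000,  .102],
    [ .348, -.116,  .005,  .067,  .102, 1.000],
])
L = np.array([
    [.827,  .361], [.903, -.152], [.659, -.230],
    [.790, -.128], [.977, -.037], [.134,  .955],
])
residuals = R - L @ L.T              # observed minus reproduced correlations
np.fill_diagonal(residuals, 0.0)     # diagonal is carried by the uniquenesses
max_abs_residual = np.abs(residuals).max()
```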
these do not directly address the issue of whether the values for the items are reasonably treated as close to normally distributed or if any are outlying
item residuals address this issue
such results are reported later
for the ABC data, there are only 3.8, 3.4, and 2.0 observations per item for the CDI, FACES, and DQOLY items, respectively
relatively low values especially for DQOLY
H0: the standardized items are independent (0 factor model) Ha: they are not (i.e., there is at least 1 factor)
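a sketch of the usual chi-square approximation for Bartlett's test of sphericity (the standard formula, not the slides' output; the function name is mine):

```python
import numpy as np

# Sketch of the usual chi-square approximation for Bartlett's test of
# sphericity (standard formula; the function name is mine).
def bartlett_sphericity(R, n):
    """R: p x p correlation matrix; n: number of cases."""
    p = R.shape[0]
    chi2 = -((n - 1) - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# df depends only on the number of items, e.g., 27 CDI items give
# df = 27*26/2 = 351, matching the SPSS output for the CDI.
```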
CDI
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .725
Bartlett's Test of Sphericity: Approx. Chi-Square = 920.324, df = 351, Sig. = .000
DQOLY
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .699
Bartlett's Test of Sphericity: Approx. Chi-Square = 2911.235, df = 1275, Sig. = .000
Missing Values
by default, SPSS (SAS) deletes any cases (observations) with missing values for any of the items
SPSS supports
"Exclude cases listwise", the default option
"Exclude cases pairwise"
calculating correlations between pairs of items using all cases with non-missing values for both items
can generate very unreliable estimates, so best not to use
as long as there aren't too many items with missing values for that case
e.g., if at least 50% or 70% of the item values are not missing
if some factors have small #'s of marker items, the # of factors may have been set too high
at least 2 [11] or 3 [8,13] items per factor is desirable
Item-Scale Allocation
when developing scales for a new instrument, the items are usually separated into disjoint sets consisting of the marker items for each factor and used to compute associated scales
marker items represent distinct aspects of associated factors and are the basis for assigning scales meaningful names
items that have high absolute loadings on more than one factor are usually discarded [8]
they do not represent distinct aspects of only one factor
items that have low absolute loadings on all factors should then also be discarded
they do not represent distinct aspects of any factor
most authors ignore this issue, but it does happen quite often in practice
group factors are those with associated subsets of items loading on them
this is the basis for item-scale allocation rules
"not everyone agrees that general factors are undesirable" [11, p. 503]
instruments which partition their items into disjoint sets corresponding to marker items are assuming that all the factors are distinct group factors
instruments that use all items to compute all the scales are assuming the factors are all general factors
e.g., the PCS and MCS scales of the MOS SF-36 are computed from all 35 items used in scale construction
but these items are first partitioned into disjoint groups and used to compute associated subscales
Rotation
the interpretation of factors through their marker items can be difficult if based on the loadings generated directly by factor extraction
rotated loadings are typically used instead
these are thought to be more readily interpretable
same process for any rotation scheme but using a different transformation matrix
Factor Transformation Matrix

          Factor 1   Factor 2
Factor 1  .844       .536
Factor 2  -.536      .844
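an illustrative numpy sketch (not SPSS output): rotation multiplies the loading matrix by an orthogonal matrix and leaves communalities unchanged; the matrix below is the varimax transformation above, with its columns arranged so that unrotated-loadings @ T reproduces the rotated FACES1 loadings reported on the following slides:

```python
import numpy as np

# Illustrative sketch (not SPSS output): rotation multiplies the loading
# matrix by an orthogonal matrix and leaves communalities unchanged.
# T is the varimax transformation reported on this slide, with its columns
# arranged so that unrotated @ T reproduces the rotated FACES1 loadings.
T = np.array([[0.844, -0.536],
              [0.536,  0.844]])
unrotated = np.array([0.692, 0.114])   # FACES1 unrotated ML loadings
rotated = unrotated @ T                # approximately (.645, -.275)

comm_before = (unrotated ** 2).sum()
comm_after = (rotated ** 2).sum()      # equal up to rounding in T
```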
Rotated Factor Matrix(a)

Item      Factor 1   Factor 2
FACES1    .645       -.274
FACES2    .475       -.353
FACES3    -.014      .591
FACES4    .335       -.230
FACES5    .434       -.272
FACES6    .392       .018
FACES7    .557       -.322
FACES8    .647       -.415
FACES9    -.144      .364
FACES10   .518       .230
FACES11   .347       -.113
FACES12   -.247      .196
FACES13   .329       -.004
FACES14   .255       .123
FACES15   .041       .478
FACES16   .669       -.091
FACES17   .499       -.393
FACES18   .742       -.096
FACES19   -.146      .671
FACES20   .555       -.126
FACES21   .538       -.156
FACES22   .623       -.221
FACES23   .514       -.440
FACES24   -.122      .363
FACES25   -.134      .792
FACES26   .688       -.259
FACES27   .477       -.155
FACES28   -.024      .370
FACES29   -.230      .536
FACES30   .661       -.142
Extraction Method: Maximum Likelihood. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 3 iterations.
the same percentage of total variance (32.814%) is explained using rotated loadings as with unrotated loadings but it is allocated differently to the factors
factor 2's contribution has increased from 6.937% to 12.107% while factor 1's contribution has decreased from 25.877% to 20.707%
Total Variance Explained

         Extraction Sums of Squared Loadings     Rotation Sums of Squared Loadings
Factor   Total    % of Variance   Cum. %         Total    % of Variance   Cum. %
1        7.763    25.877          25.877         6.212    20.707          20.707
2        2.081    6.937           32.814         3.632    12.107          32.814

(initial eigenvalues for all 30 factors are as in the earlier Total Variance Explained table)
Item      Factor 1   Factor 2
FACES18   .742       -.096
FACES26   .688       -.259
FACES16   .669       -.091
FACES30   .661       -.142
FACES8    .647       -.415
FACES1    .645       -.274
FACES22   .623       -.221
FACES7    .557       -.322
FACES20   .555       -.126
FACES21   .538       -.156
FACES10   .518       .230
FACES23   .514       -.440
FACES17   .499       -.393
FACES27   .477       -.155
FACES2    .475       -.353
FACES5    .434       -.272
FACES6    .392       .018
FACES11   .347       -.113
FACES4    .335       -.230
FACES13   .329       -.004
FACES14   .255       .123
FACES12   -.247      .196
FACES25   -.134      .792
FACES19   -.146      .671
FACES3    -.014      .591
FACES29   -.230      .536
FACES15   .041       .478
FACES28   -.024      .370
FACES9    -.144      .364
FACES24   -.122      .363
column 1 values decrease in absolute value from FACES18 to FACES12 while remaining larger in absolute value than column 2 values
so 22 load more on factor 1: 18, 26, …, 12
after that column 2 values decrease in absolute value while remaining larger in absolute value than column 1 values
other 8 load more on factor 2: 25, 19, 3, 29, 15, 28, 9, 24
need to know what the items are in order to interpret these results
item 12 is the only item whose maximum absolute loading is negative
suggesting it will need reverse coding
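The allocation rule used above (assign each item to the factor with its maximum absolute loading, and flag items whose maximal loading is negative for reverse coding) can be sketched as follows; the three items shown are a subset of the matrix above:

```python
# loadings (factor 1, factor 2) for a few FACES items from the rotated matrix
loadings = {
    "FACES18": ( .742, -.096),
    "FACES12": (-.247,  .196),
    "FACES25": (-.134,  .792),
}

# allocate each item to the factor with the larger absolute loading
allocation = {item: 1 + int(abs(l2) > abs(l1))
              for item, (l1, l2) in loadings.items()}

# flag items whose maximum-|loading| value is negative (candidates for reverse coding)
needs_reverse = [item for item, (l1, l2) in loadings.items()
                 if (l1 if abs(l1) >= abs(l2) else l2) < 0]
```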
Extraction Method: Maximum Likelihood. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 3 iterations.
Discarding Items
using 0.3 as cutoff for low/high loadings
(rotated factor matrix as on the previous slide)
items 8, 7, 23, 17, 2 have both absolute loadings > 0.3
items 14, 12 have both absolute loadings < 0.3
this suggests discarding these 7 items
Extraction Method: Maximum Likelihood. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 3 iterations.
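The 0.3-cutoff rule described above can be sketched in a few lines; the loadings shown are a small illustrative subset of the rotated factor matrix:

```python
# (factor 1, factor 2) loadings for three FACES items
two_factor = {
    "FACES18": ( .742, -.096),   # high on one factor only: keep
    "FACES8":  ( .647, -.415),   # high on both factors (ambiguous): discard
    "FACES14": ( .255,  .123),   # low on both factors (uninformative): discard
}

def discard(l1, l2, cutoff=0.3):
    """Discard an item loading high on both factors or low on both."""
    high1, high2 = abs(l1) > cutoff, abs(l2) > cutoff
    return (high1 and high2) or (not high1 and not high2)

discarded = [item for item, (l1, l2) in two_factor.items() if discard(l1, l2)]
```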
FACES30 loadings with standard errors and 95% confidence intervals:
           Estimate   StdErr    95% CI                 Coverage
Factor 1    0.66117   0.06204   (0.52184, 0.76614)     0[]
Factor 2   -0.14169   0.10117   (-0.33193, 0.05963)    [0]
"[0]" means 0 is in the 95% confidence interval for a loading "0[]" means it is not
FACES1 loads on both factors FACES30 loads on factor 1 but not factor 2
11 items load only on factor 1 and 7 only on factor 2 12 items load on both factors, so 40% of the items address both factors
adaptability and cohesion are likely to be highly correlated
identified factors appear to be distinctly different from the standard FACES constructs, with 23.3% (7/30) inconsistent items
will not consider rotations of CDI1-CDI27 since the recommended # of scales is 1, and so rotations are unnecessary
29 items load on exactly 1 factor, 19 items load on exactly 2 factors, 3 on all 3 factors
so 43% of the items address multiple factors
identified factors appear to be similar to the standard DQOLY constructs, with only 5.9% (3/51) inconsistent items
a substantial fraction, 14.8% (4/27), of the items appear to be of negligible value for the ABC subjects
Promax
starts with a varimax rotation, then raises the loadings to a power controlled by a parameter called kappa, with default value 4
it is multiplied on the right by the factor transformation matrix for a varimax rotation
with 2 rows and 2 columns to produce the varimax-rotated factor matrix
then this is multiplied on the right by another transformation matrix to generate the promax-rotated loadings
which will also have 30 rows and 2 columns
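This matrix algebra can be illustrated with a toy example; the loading matrix and rotation angle below are made up, not the FACES estimates:

```python
import numpy as np

# a small (4 x 2) "unrotated" loading matrix, purely for illustration
unrotated = np.array([[ .70, -.10],
                      [ .65, -.20],
                      [-.15,  .60],
                      [-.05,  .55]])

# a hypothetical 2 x 2 orthogonal transformation matrix (rotation by 20 degrees)
theta = np.deg2rad(20)
T_varimax = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])

# multiply on the right to get rotated loadings (still 4 x 2);
# an oblique (promax) step would apply a further, non-orthogonal transformation
varimax_rotated = unrotated @ T_varimax

# an orthogonal rotation preserves communalities (row sums of squared loadings)
comm_before = (unrotated ** 2).sum(axis=1)
comm_after = (varimax_rotated ** 2).sum(axis=1)
```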
Pattern Matrix
Item       Factor 1   Factor 2
FACES1       .643      -.102
FACES2       .429      -.244
FACES3       .161       .657
FACES4       .308      -.151
FACES5       .406      -.166
FACES6       .447       .145
FACES7       .530      -.184
FACES8       .603      -.259
FACES9      -.053       .362
FACES10      .651       .422
FACES11      .357      -.016
FACES12     -.220       .141
FACES13      .368       .100
FACES14      .323       .218
FACES15      .188       .548
FACES16      .725       .111
FACES17      .444      -.281
FACES18      .806       .128
FACES19      .036       .705
FACES20      .586       .035
FACES21      .558      -.003
FACES22      .634      -.049
FACES23      .447      -.329
FACES24     -.029       .367
FACES25      .085       .843
FACES26      .697      -.071
FACES27      .491      -.022
FACES28      .084       .406
FACES29     -.098       .527
FACES30      .701       .052
Extraction Method: Maximum Likelihood. Rotation Method: Promax with Kaiser Normalization. a. Rotation converged in 3 iterations.
Inter-Factor Correlations
since promax is an oblique rotation, the associated factors are correlated; the factor correlation matrix contains those correlations
only 1 in this case because there are only 2 factors
the 2 factors for this case are distinctly inversely related with an estimated correlation of -.511
Factor Correlation Matrix
Factor      1        2
1         1.000    -.511
2         -.511    1.000
Extraction Method: Maximum Likelihood. Rotation Method: Promax with Kaiser Normalization.
SAS generates a reference structure matrix for use in place of the pattern matrix when it has such anomalous values
it is interpreted in the same way as a pattern matrix; it is not generated in SPSS
however, if such problems occur, maybe that is an indication that the rotation approach needs to be changed
after that factor 2 loadings decrease in absolute value while remaining larger in absolute value than factor 1 loadings
8 load more on factor 2: 25, 19, 3, 29, 15, 28, 9, 24
"orthogonal rotations usually lead one to essentially the same major groupings as oblique rotations" [11, p. 536]
Impact of Rotations
considered 10 rotations plus no rotation [10]
4 orthogonal
varimax, quartimax, equamax, parsimax
6 oblique
Harris-Kaiser, plus promax starting from each of the other 5
with the default parameter POWER=3
Impact of Rotations
for the FACES items
all 10 rotations generated the same allocation
rotating the loadings appears to have a distinct impact on the results compared to not rotating them, but the choice of the rotation may not have much of an impact on those results
Factor Matrix (1 factor)
Item    Loading    Item     Loading    Item     Loading
cdi1     .611      cdi10    -.641      cdi19     .373
cdi2    -.671      cdi11    -.795      cdi20     .532
cdi3     .644      cdi12     .283      cdi21    -.421
cdi4     .353      cdi13    -.313      cdi22     .551
cdi5    -.209      cdi14     .655      cdi23     .155
cdi6     .539      cdi15    -.258      cdi24    -.463
cdi7    -.741      cdi16    -.255      cdi25    -.055
cdi8    -.295      cdi17     .211      cdi26     .128
cdi9     .172      cdi18    -.239      cdi27     .271
CDI has 27 items scored from 0-2 these are summed to produce its one scale measuring the amount of depressive symptoms after reverse coding 13 of the items
items 2, 5, 7, 8, 10, 11, 13, 15, 16, 18, 21, 24, 25 are reverse coded: replace an item value y by 2 - y
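A minimal sketch of this reverse coding and total-scale computation (not the instrument's official scoring code):

```python
# CDI items scored 0-2; reverse coding replaces a value y with 2 - y
# (in general, min + max - y) for the 13 items listed above
REVERSE_ITEMS = {2, 5, 7, 8, 10, 11, 13, 15, 16, 18, 21, 24, 25}

def cdi_total(responses):
    """Total scale from a dict {item number: value in 0..2} for items 1-27."""
    total = 0
    for item, y in responses.items():
        total += (2 - y) if item in REVERSE_ITEMS else y
    return total

# with all-zero responses, each of the 13 reverse-coded items contributes 2
example = {i: 0 for i in range(1, 28)}
```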
Factor Matrix (1 factor)
Item      Loading    Item       Loading    Item       Loading
FACES1     .690      FACES11     .351      FACES21     .547
FACES2     .587      FACES12    -.311      FACES22     .655
FACES3    -.291      FACES13     .296      FACES23     .666
FACES4     .407      FACES14     .167      FACES24    -.284
FACES5     .518      FACES15    -.196      FACES25    -.481
FACES6     .338      FACES16     .625      FACES26     .725
FACES7     .651      FACES17     .621      FACES27     .497
FACES8     .773      FACES18     .688      FACES28    -.202
FACES9    -.311      FACES19    -.440      FACES29    -.454
FACES10    .336      FACES20     .543      FACES30     .640
family cohesion is computed by summing the odd items plus item 30, with 6 items (3, 9, 15, 19, 25, 29) reverse coded
Factor Matrix (1 factor)
Item      Loading    Item       Loading    Item       Loading    Item       Loading
dqoly1     -.328     dqoly14    -.195      dqoly27    -.262      dqoly40     .569
dqoly2     -.288     dqoly15    -.299      dqoly28    -.115      dqoly41     .415
dqoly3     -.378     dqoly16    -.287      dqoly29    -.392      dqoly42     .663
dqoly4     -.207     dqoly17    -.391      dqoly30    -.281      dqoly43     .671
dqoly5     -.243     dqoly18    -.328      dqoly31    -.386      dqoly44     .610
dqoly6     -.333     dqoly19    -.300      dqoly32    -.281      dqoly45     .719
dqoly7      .398     dqoly20    -.365      dqoly33    -.288      dqoly46     .581
dqoly8     -.229     dqoly21    -.244      dqoly34    -.487      dqoly47     .677
dqoly9     -.305     dqoly22    -.206      dqoly35     .589      dqoly48     .704
dqoly10    -.381     dqoly23    -.105      dqoly36     .464      dqoly49     .461
dqoly11    -.264     dqoly24    -.268      dqoly37     .572      dqoly50     .670
dqoly12    -.085     dqoly25    -.248      dqoly38     .577      dqoly51     .597
dqoly13    -.403     dqoly26    -.296      dqoly39     .477
Extraction Method: Maximum Likelihood. a. 1 factor extracted; 8 iterations required.
for CDI and DQOLY, items were separated into those usually reverse coded vs. those usually not for FACES, items were separated into those usually reverse coded plus item 12 vs. the others usually not
12 is the only item in the 2-factor solution with maximum absolute loading at a negative value
perhaps, for the ABC subjects, more clearly defined family rules allowed them more flexibility to adapt in ways that do not violate those rules
varimax-based item-scale allocations for FACES were quite different from the recommended allocation
the recommended FACES scales might be inappropriate to use for families with adolescents having type 1 diabetes
for both FACES and DQOLY, varimax rotation separated off the items with reverse orientation from the others
does it really identify sets of items associated with different latent constructs or just having different orientations?
based on likelihoods for data in folds using parameter estimates computed from data outside of the folds
using the multivariate normal likelihood as in ML factor extraction; multiply these deleted-fold likelihoods together and normalize by the # of item responses to get the LCV score
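A rough sketch of this fold-deleted likelihood computation; for simplicity it plugs in a saturated covariance estimate where the factor-model covariance would be used, so it is illustrative only:

```python
import numpy as np
from scipy.stats import multivariate_normal

def lcv_score(data, k=5, seed=0):
    """Likelihood cross-validation sketch: k-fold deleted multivariate
    normal likelihood, normalized to the number of item responses."""
    n, p = data.shape
    rng = np.random.default_rng(seed)
    folds = rng.integers(0, k, size=n)
    loglik = 0.0
    for f in range(k):
        train, test = data[folds != f], data[folds == f]
        mu = train.mean(axis=0)
        # a saturated covariance stands in for the factor-model covariance
        sigma = np.cov(train, rowvar=False) + 1e-6 * np.eye(p)
        loglik += multivariate_normal.logpdf(test, mu, sigma).sum()
    return np.exp(loglik / (n * p))  # normalized per item response

# simulated responses stand in for real item data
data = np.random.default_rng(1).normal(size=(100, 4))
score = lcv_score(data)
```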
LCV seems to be a reasonable way to assess how many factors to extract also considered a variety of other approaches
including rules based on eigenvalues and penalized likelihood criteria
the only other approach with somewhat acceptable results was the minimum BIC approach
which chose 1 for CDI, 2 for FACES, and 2 for DQOLY 129
for FACES
3 factors has a score within 1% of the best; 1 factor has a score just over 1% below the best
for DQOLY
2, 4, and 5 factors have scores within 1% of best
different choices for the # of factors can be competitive alternatives to the choice with the best score
a range of #'s of factors can have about the same effect part of why choosing the # of factors is a difficult problem
for all these methods, the recommended # of factors is chosen for all 3 sets of items using LCV
1 for CDI, 2 for FACES, and 3 for DQOLY there was also very little difference in maximum LCV scores for all of these methods
there does not seem to be much of an impact from the choice of factor extraction procedure
Evaluation of Rotations
rotations do not change the correlation structure of the EFA model and so cannot be directly evaluated by LCV
but they do suggest summated scales, with loadings changed to 1 or 0, which change the correlation structure
so rotations can be evaluated using LCV by evaluating CFA models based on rotation-suggested scales
considered a variety of CFA models for FACES/DQOLY
based on rotation-suggested scales vs. on recommended scales
with unit (1) loadings vs. with estimated loadings
with all scales dependent vs. with all independent vs. with any subset independent and the rest dependent
[Path diagram: items I1-I4 with error terms U1-U4 and unique factors V1-V4; I1 and I2 load on factor F1 with loadings L1_1 and L2_1; I3 and I4 load on factor F2 with loadings L3_2 and L4_2; C1_2 is the inter-factor correlation between F1 and F2]
items I3 and I4 load on factor F2 with loadings L3_2 and L4_2 and with errors U3 and U4; loadings L3_1 and L4_1 are 0
Comparison of Scales
treating scales as dependent was always better
so subsequent reported results use dependent scales not so surprising since scales from the same instrument measure related latent constructs
varimax-suggested scales with estimated loadings were best overall for both FACES and DQOLY
other rotations were as good or at least almost as good and a little better than EFA-based scales
so treating factors as grouped rather than as general is reasonable
on the other hand, summated scales based on unrotated loadings were not competitive for either FACES or DQOLY
the common practice of basing scales on a rotation appears much better than basing them on unrotated loadings
is the assumption of normality reasonable? are there any outlying item values?
need item residuals for this
to assess this, standardized the item residuals to be independent and standard normally distributed
for the 27×103 = 2,781 item values without reverse coding
evaluated the EFA model with the better LCV score
estimated the 27×27 covariance matrix using all the data
to reduce the effort, computed standardized residuals for item responses from subjects in each fold separately rather than for all item responses of all subjects combined
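The standardization step can be sketched with a Cholesky factor: if Sigma = L L^T, then L^{-1}(y - mu) has identity covariance. A small illustrative example (not the actual 27-item computation):

```python
import numpy as np

def standardized_residuals(y, mu, sigma):
    """Transform a response vector y so its entries are independent
    standard normal under the model with mean mu and covariance sigma."""
    L = np.linalg.cholesky(sigma)
    return np.linalg.solve(L, y - mu)

# a toy 2-item model, for illustration only
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
mu = np.array([1.0, 1.0])

rng = np.random.default_rng(0)
y = rng.multivariate_normal(mu, sigma)
z = standardized_residuals(y, mu, sigma)
# entries of z can be compared to standard normal quantiles to spot outliers
```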
for a value of item 25 with meanings:
0: nobody really loves me
1: I am not sure if anybody loves me
2: I am sure that somebody loves me
a value of 2 occurs 101 times; values of 0 and 1 each occur 1 time
the one value of 0 is the outlier
almost all of these adolescents felt loved, so item 25 contributes little distinguishing information
its loading was also found not to be significantly different from zero
with residuals tending to be more outlying the closer the mean is to the extremes
this might be why normality is questionable perhaps this will often hold when the range of item values is so limited
observed FACES item means are all well away from the extremes of 1 and 5
number of item responses: 30×103 = 3,090
observed DQOLY item means are all away from the extremes of 1 and 5
number of item responses: 51×103 = 5,253
the construct validity of the new scales needs to be assessed after the factor analysis
do the new scales predict related quantities as expected?
so have data for 379 mothers and 149 fathers complicated by the need to account for the correlation between responses of parents from the same family
but can analyze data from mothers/fathers separately
5 FMSs
thriving, accommodating, enduring, struggling, floundering reflecting a continuum of difficulty for managing childhood chronic illness and the extent to which family members' experiences were similar or discrepant
only 280 of the 379 mothers provided values of 1-5 for all of items 1-57, or 4.9 subjects per item
very important to adjust for missing data as well as for inter-parental correlation to avoid losing so much data
correlations between test and retest responses for each of items 1-65
assess the consistency of responses to items over time computed for mothers separately from fathers used Spearman correlations since the range 1-5 for item values was limited for mothers, correlations were significantly nonzero for all items for fathers, correlations were nonsignificant for 8 of the items
items 36(p=.057), 22 (p=.077), and 7,8,18,19,29,60 (p>.10)
responses for mothers were reasonably consistent across time while fathers changed responses fairly often to quite a few items (8/65 or 12.3%)
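A sketch of one such test-retest check with scipy; the response vectors are invented for illustration:

```python
from scipy.stats import spearmanr

# hypothetical time-1 and time-2 responses (1-5 scale) to one item
time1 = [1, 2, 2, 3, 4, 5, 5, 3, 2, 4, 1, 3, 5, 4, 2]
time2 = [1, 2, 3, 3, 4, 5, 4, 3, 2, 5, 2, 3, 5, 4, 1]

# Spearman rank correlation handles the limited 1-5 range better than Pearson
rho, p = spearmanr(time1, time2)
consistent = p < 0.05  # significantly nonzero test-retest correlation
```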
for 4 items (23,30,39,52), the middle value of 3 was over 2 standard deviations from the item mean
these may be distinctly problematic
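The midpoint check described above can be sketched as follows, with an invented, heavily skewed response distribution:

```python
import statistics

# hypothetical responses on a 1-5 scale, heavily concentrated at 1
responses = [1] * 90 + [2] * 8 + [4, 5]

mean = statistics.mean(responses)
sd = statistics.stdev(responses)

# how many standard deviations the scale midpoint (3) lies from the item mean
midpoint_z = (3 - mean) / sd
flag = abs(midpoint_z) > 2  # over 2 SDs: possibly problematic item
```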
can assess if factoring provides a benefit over treating items as independent using LCV scores
Reverse Coding
extracting 1 factor using the ML method
signs of the loadings suggest that 25 of the 57 items need reverse coding relative to the other 32 items
items 1,3,8,9,15,17-20,23,24,26,28,30,31,36,37,39,40,46, 48,50,52,53,56
all of these items except item 36 were considered to have been worded positively
item 36 was considered to have been worded neutrally
all of the others were originally considered to have been worded negatively except for 5 items
items 5, 6, 7, 45 were considered to have been worded positively; item 43 was considered to have been worded neutrally
need to check on these potential inconsistencies
# of Factors
scree plot indicates 1 to about 7 factors using ML factor extraction
BIC is minimized at 3 factors LCV is maximized at 8 factors but LCV scores for 3-13 factors are all within 1% of best
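A sketch of the minimum-BIC approach using scikit-learn's ML factor analysis on simulated single-factor data (illustrative only; parameter count here is the simple loadings-plus-uniquenesses tally, ignoring rotational indeterminacy):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# simulate 6 items driven by 1 underlying factor plus noise
rng = np.random.default_rng(0)
n, p = 300, 6
f = rng.normal(size=(n, 1))
data = f @ rng.uniform(0.6, 0.9, size=(1, p)) + 0.5 * rng.normal(size=(n, p))

# BIC = -2 * loglik + (# parameters) * log(n) for each candidate factor count
bics = {}
for k in range(1, 4):
    fa = FactorAnalysis(n_components=k).fit(data)
    loglik = fa.score(data) * n       # score() is mean log-likelihood per sample
    n_params = p * k + p              # loadings plus unique variances
    bics[k] = -2 * loglik + n_params * np.log(n)

best_k = min(bics, key=bics.get)      # expect 1 for this simulated data
```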
Scree Plot
[Scree plot: eigenvalues (0-12) by component number (1-57) for the FMSS items]
Comparison of Scales
using the 8-factor solution with the best LCV score, the independent errors model had a 9.8% lower score than the associated CFA model for varimax-suggested scales with estimated loadings, treating factors as grouped with items loading only on one of the factors
so there is a distinct benefit to factoring the items
Comparison of Scales
using unit loadings (i.e. summated scales)
the LCV score decreased by a little over 2%; there can be a tangible penalty to using summated scales
using varimax-suggested scales with estimated loadings for 8 factors, the normality assumption is somewhat questionable
the normal plot is curved at the low end; residuals can be skewed, but more to the low end than the high end
lower negative values are due to larger means, i.e., a tendency to respond as strongly disagree more often
[Normal probability plot of standardized item residuals]
Item Removal
removing either item 55 or item 57 imposes a tangible penalty in reduced LCV score
removing either generates a 1.1% decrease in LCV; while they may generate large residuals, they still have value
still need to assess the impact of removal of the other items
Item Boxplots
items 23,30,39,52 are highly skewed at the low end
responses are primarily strongly disagree, with responses close to strongly agree outlying
Acknowledgements
collection and analysis of the ABC data was supported in part by NIH/NINR Grant # R01 NR04009, PI Margaret Grey, and NIH/NIAID Grant # R01 AI057043, PI George Knafl
collection and analysis of the FMSS data was supported in part by NIH/NINR Grant # R01 NR08048, PI Kathleen Knafl
Jean O'Malley assisted in the preparation of these lecture notes and in organizing the background literature
References
1. Johnson RA, Wichern DW. Applied multivariate statistical analysis. Prentice-Hall, 1992.
2. McCorkle R, Young K. Development of a symptom distress scale. Cancer Nursing 1978; 1: 373-378.
3. Kovacs M. The children's depression inventory (CDI). Psychopharmacology Bulletin 1985; 21: 995-998.
4. Olson DH, McCubbin HI, Barnes H, Larsen A, Muxen M, Wilson M. Family inventories. Family Social Science, 1982.
5. Ingersoll GM, Marrero DG. A modified quality of life measure for youths: psychometric properties. The Diabetes Educator 1991; 17: 114-118.
6. Cella DF, Tulsky DS, Gray G, Sarafian B, Linn E, Bonomi A, Silberman M, Yellen SB, Winicour P, Brannon J. The Functional Assessment of Cancer Therapy scale: development and validation of the general measure. Journal of Clinical Oncology 1993; 11: 570-579.
7. McHorney CA, Ware JE Jr., Raczek AE. The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Medical Care 1993; 31: 247-263.
8. Hatcher L. A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. SAS Institute, 1994.
9. Grey M, Davidson M, Boland EA, Tamborlane WV. Clinical and psychosocial factors associated with achievement of treatment goals in adolescents with diabetes mellitus. Journal of Adolescent Health 2001; 28: 377-385.
10. Knafl GJ, Grey M. Factor analysis model evaluation using likelihood cross-validation. Statistical Methods in Medical Research, in press.
11. Nunnally JC, Bernstein IH. Psychometric theory. McGraw-Hill, 1994.
12. Ferketich S, Muller M. Factor analysis revisited. Nursing Research 1990; 39: 59-62.
13. Polit DF. Data analysis and statistics for nursing research. Appleton & Lange, 1996. (see pp. 373-377 on presenting results for factor analysis)
14. SAS Institute Inc. SAS/STAT 9.1 user's guide. SAS Institute, 2004.
15. Spector PE. Summated rating scale construction: an introduction. Sage, 1992.
16. DeVellis RF. Scale development: theory and applications. Sage, 1991.
17. Knafl K, Breitmayer B, Gallo A, Zoeller L. Family response to childhood chronic illness: description of management styles. Journal of Pediatric Nursing 1996; 11: 315-326.