
Organization Science
Vol. 21, No. 2, March–April 2010, pp. 540–553
doi 10.1287/orsc.1090.0464
ISSN 1047-7039 | EISSN 1526-5455 | © 2010 INFORMS

A General Framework for Estimating Multidimensional Contingency Fit

Simon C. Parker
Richard Ivey School of Business, University of Western Ontario, London, Ontario N6A 3K7, Canada, [email protected]

Arjen van Witteloostuijn
Department of Management, University of Antwerpen, Prinsstraat 13, 2000 Antwerpen, Belgium, [email protected]

This paper develops a framework for estimating multidimensional fit. In the context of contingency thinking and the resource-based view of the firm, there is a clear need for quantitative approaches that integrate fit-as-deviation, fit-as-moderation, and fit-as-system perspectives, implying that the impact on organizational performance of series of bivariate (mis)fits and bundles of multiple (mis)fits are estimated in an integrated fashion. Our approach offers opportunities to do precisely this. Moreover, we suggest summary statistics that can be applied to test for the (non)significance of fit linkages at both the disaggregated level of individual bivariate interactions, as well as the aggregated level of groups of multivariate interactions. We systematically compare our approach with extant alternatives using simulations, including the fit-as-mediation alternative. We find that our approach outperforms these established alternatives by including fit-as-moderation and fit-as-deviation as special cases, by being better able to capture the nature of the underlying fit structure in the data, and by being relatively robust to mismeasurements, small sample sizes, and collinearity. We conclude by discussing our method's advantages and disadvantages.
Key words: contingency theory; multidimensional fit; organizational performance
History: Published online in Articles in Advance July 28, 2009.

Introduction

The notion of alignment, congruence, fit, or match (fit, for short) has been very popular in management studies ever since the classic contingency-type studies of the 1960s. In the early contributions, the key logic was that particular internal features of an organization, such as its processes, structures, or technologies, are best suited to particular types of environment or contingencies such as complexity, dynamism, and uncertainty. The underlying idea was that if the organization could indeed benefit from such a "fit," above-normal or superior organizational performance would be within reach. This argument has been extended in different directions in the past three decades. For example, with the emergence of the strategic choice perspective (Child 1972), characteristics of strategies and strategy-makers were introduced. Related to this, ideal "configurational" types were developed, prominent examples being those of Miles and Snow (1978) and Porter (1980). Additionally, this contingency or fit logic has been applied to a wide array of functional domains, varying from top management team studies (Boone et al. 2004) to international business (Luo and Park 2001). Contingency logic has penetrated into other subdisciplines of management as well, such as accounting (e.g., Hyatt and Prawitt 2001) and marketing (e.g., Vorhies and Morgan 2003).

The underlying theoretical rationale is appealing (Donaldson 2001). In their theoretical essay, Milgrom and Roberts (1995) argue that complementarities across organizational features are associated with above-normal performance; they apply ideas from the economic theories of complementarity and supermodularity (for empirical work in this tradition, see Athey and Stern 1998, Mohnen and Röller 2005). In a similar vein, Rivkin (2000) develops a model of complexity that reveals how the number of organizational elements and their interactions are positively linked to the emergence and sustainability of competitive advantages. Additionally, he analyzes the downside of fit: organizational inertia (see Wright and Snell 1998). In a way, this implies the argument that a static fit may turn into a dynamic misfit if the environment changes such that organizational adaptation—and hence flexibility—is required (see Zajac et al. 2000). This type of theoretical work is closely aligned with the resource-based view of the firm because the complex and subtle interaction among sets of organizational resources (Dierickx and Cool 1989, Donaldson 2000) is likely to produce a competitive advantage—and hence above-normal performance—that is difficult to imitate or compete away (Barney 1991). Thus, modern organization theories confirm the early contingency logic that multivariate configurational approaches are needed to explain performance differences, in and over time.

The current paper adds to the large contingency-type literature by proposing an econometric approach
for estimating a comprehensive, flexible, robust, and multidimensional notion of fit. Although hundreds of empirical studies have been carried out in the fit-related tradition, the multidimensional conception of fit is so complex that empirical estimation is still anything but easy, a situation that leads Burton et al. (2002, p. 1480) to the conclusion that "[The concept of fit] is less well-developed in terms of operational statements and empirical tests." Our paper's contribution is threefold. First, we develop a general and flexible multivariate contingency model of organizational performance that includes different conceptions of fit as special cases. In doing so, we integrate fit-as-deviation, fit-as-moderation, and fit-as-system perspectives in a quantitative framework, including interaction (explicitly) and distance (implicitly) measures of fit in a general model specification, both individually and as bundles.

Second, we suggest summary statistics derived from the model that retain in their construction information about the sources of fit and that can distinguish "genuine" fit from random noise. Our approach can be applied to test for the (non)significance of fit linkages at both the disaggregated level of individual bivariate interactions as well as the aggregated level of groups of multivariate interactions. In this way, the impact on organizational performance of series of bivariate (mis)fits and bundles of multiple (mis)fits are estimated in an integrated fashion.

Third, we systematically compare our method with extant alternatives using Monte Carlo simulations. We believe a comparative approach is especially valuable given the plethora of different fit methods now available. As part of this approach, we also test the fit-as-mediation alternative. We find that our method outperforms these established alternatives by including fit-as-moderation and fit-as-deviation as special cases; by being better able to capture the nature of the underlying fit structure in the data; and by being relatively robust to mismeasurements, small sample sizes, and multicollinearity. Of course, this does not imply that our approach is always superior or that our method does not come at any cost. Particularly, the number of covariates increases rapidly if the number of contingencies is expanded. Then, theory is critical for the selection of a workable number of contingencies. The key is, we believe, a finding from our simulation exercise, which shows that our approach is the only one able to identify the underlying nature of fit. More broadly, we will reflect on the advantages and disadvantages of our method in the appraisal.

The structure of the paper is as follows. The next section briefly reviews fit studies in order to position our contribution in the context of the extant literature. The one after develops our general model of organizational performance, followed by the introduction of our measure of fit. The penultimate section presents the results of comparative Monte Carlo simulations. In the final section, we offer an appraisal and conclusion, briefly reflecting upon how our general framework for measuring fit relates to the broader literature, including alternative approaches and the different traditions, and the implications for organizational scientists.

Background Literature

A classic conceptual contribution to the empirical fit literature is Venkatraman (1989). He discusses three criterion-specific perspectives of fit: fit as moderation, fit as mediation, and fit as profile deviation. The mediation fit approach differs from the moderation one in that the former, contrary to the latter, is based upon a system of (generally two) equations. For example, the mediation approach first models strategy as a function of structure and, subsequently, performance as a function of strategy (see, e.g., Boone et al. 1996). In the moderation approach, using the same example, performance is modeled as a function of strategy and structure in a single equation. The moderation fit concept relates to interaction effects. The argument is that if variables x and y jointly produce a positive fit, their product term x·y will affect performance positively (and vice versa, in the case of a misfit). Strictly speaking, there can be any number of variables in the interaction term; but in practice, by far the majority of fit studies are limited to bivariate interactions, with an occasional exception of a limited number of three-way interactions (see, e.g., Garg et al. 2003). The profile deviation fit notion relates to bundles of multiple variables. Here, the logic is that for a configuration of variables (x_1, …, x_n, y_1, …, y_n), a set of their levels is linked to organizational performance (or any other criterion variable, for that matter) as a set rather than as a series of separate bivariate relationships. In this tradition, a popular method is to calculate deviation measures that capture the distance of a firm from an ideal-type configuration, the hypothesis being that this distance is negatively associated with organizational performance (see, e.g., Vorhies and Morgan 2003).

From this brief overview, we conclude that two aspects are key to distinguishing one criterion-specific fit approach from another. The first one contrasts fit as moderation with fit as deviation (see Donaldson 2001). Fit as moderation captures (mis)fit in the form of product terms of the contingency variables—say, x·y; fit as deviation measures misfit by calculating the distance of the actual value of a contingency variable from its ideal type—say, (x − x*)², where x* is the ideal type's value of x (e.g., the value of x that is associated with maximum performance). Second, fit approaches may focus on the individual contribution to performance of each and every contingency variable, or on the joint contribution of a bundle of contingency variables. In the terminology of Drazin and Van de Ven (1985), the latter may be referred to as the fit-as-system approach.
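To make the two constructions concrete, the following minimal sketch (ours, not from the paper) computes a moderation product term and a deviation score for a toy data set; the column names and the ideal-type value x_star are hypothetical.

    # Illustrative sketch: the two fit measures contrasted above for toy data.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "strategy": rng.uniform(0, 10, 200),   # x
        "structure": rng.uniform(0, 10, 200),  # y
    })

    # Fit as moderation: the product term x*y enters the performance regression.
    df["moderation_term"] = df["strategy"] * df["structure"]

    # Fit as deviation: squared distance of x from an assumed ideal-type value x*.
    x_star = 6.0  # hypothetical ideal-type value
    df["deviation_term"] = (df["strategy"] - x_star) ** 2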
In the current paper, we develop the so-called general interaction (GI) approach that integrates all three conceptions of fit—i.e., fit as moderation, fit as deviation, and fit as system. To explore this, we compare the GI approach to four typical examples from the literature, each representing a relatively "pure" example of an empirical approach to measure fit as moderation, deviation, mediation, and system. First, many fit-as-moderation studies introduce a series of (generally bivariate) interaction terms. Our typical example is Skaggs and Ross Huffman (2003). Second, deviation fit studies tend to calculate the distance of a focal firm vis-à-vis an ideal-type configuration by computing multivariate (squared) Euclidian distance measures. We take Vorhies and Morgan (2003) as a typical example. Third, in the advanced fit-as-mediation tradition, (standardized) residuals from a first-step regression are entered into a second-stage performance equation. We focus on Zajac et al. (2000) as a sophisticated case in point. Fourth, system fit studies often introduce individual and sum dummy measures of (mis)fit. Our benchmark is Burton et al. (2002, 2003).1

A Model of Organizational Performance

Let I, E, and S denote three sets of variables, namely internal (organizational), external (environmental), and strategy variables, respectively.2 Each set contains at least one variable. There are n_I variables in I, where each variable in I is denoted by x_i; i.e., I = {x_1, x_2, …, x_{n_I}}. Each variable in E is denoted by y_j, and each in S by z_k, with E = {y_j}_{j=1}^{n_E} and S = {z_k}_{k=1}^{n_S}. Without losing generality, all variables are measured in natural logarithms: This makes the scale of measurement of the performance and explanatory variables irrelevant and enables the use of the general translog function defined below. It is also assumed that the researcher has already chosen these variables for their potential usefulness for explaining variations in a given measure of firm performance, p (also measured in logs). Hereafter we refer to all variables apart from p as "explanatory variables."

At the outset, we define "fit" in terms of the extent to which the explanatory variables as pairs or as bundles affect p. A fit may be either beneficial or harmful for firms. For example, phrased in the fit-as-interaction tradition, a positive fit between two variables involves them interacting in a way that enhances performance, whereas a negative fit has the opposite impact. In the literature, negative fit is often referred to as "misfit." An absence of fit is also possible, arising (in a heuristic sense) if there is zero covariance between performance and a given pair or bundle of explanatory variables. After describing our model of performance, we will define fit more precisely, also showing how distance-based measures of fit are nested in our approach.

Below we use an extension of the general translog (TL) function to propose a GI model of (log) firm performance, p. The TL function is well known for its generality and flexibility. It approximates any functional relationship between firm performance on the one hand and a given set of explanatory variables on the other. Formally, it provides a second-order Taylor approximation to any functional relationship, which means it approximates any performance equation as a series of terms involving increasing powers of the variables, up to the second order.3 The practical importance of this point is that any measure of fit will only be appropriate if it is based on a suitable underlying model. Because the researcher rarely if ever knows the true relationship between performance and its determinants, it is therefore desirable to utilize a specification that is general enough to encompass (at least to a second-order approximation) any that may exist. This flexibility is a key advantage of the GI approach. Hence our use of the TL function. Moreover, as we will show below, the TL specification offers a way to integrate the fit as moderation (FAM) and the common symmetric approaches to fit as deviation (FAD) into a single to-be-estimated equation.

The GI model can be written as

    p = α + Σ_{i=1}^{n_I} (β_{iI} x_i + γ_{iI} x_i²) + Σ_{j=1}^{n_E} (β_{jE} y_j + γ_{jE} y_j²) + Σ_{k=1}^{n_S} (β_{kS} z_k + γ_{kS} z_k²)
        + Σ_{i′≠i} δ_{ii′I} x_i x_{i′} + Σ_{j′≠j} δ_{jj′E} y_j y_{j′} + Σ_{k′≠k} δ_{kk′S} z_k z_{k′}
        + Σ_{i=1}^{n_I} Σ_{j=1}^{n_E} δ_{ijIE} x_i y_j + Σ_{i=1}^{n_I} Σ_{k=1}^{n_S} δ_{ikIS} x_i z_k + Σ_{j=1}^{n_E} Σ_{k=1}^{n_S} δ_{jkES} y_j z_k
        + Σ_{i=1}^{n_I} Σ_{j=1}^{n_E} Σ_{k=1}^{n_S} φ_{ijk} x_i y_j z_k,    (1)

where the β's, γ's, δ's, and φ's are coefficients that will be interpreted in the remainder of this section. This will include an explanation of how (1) is an extension of the conventional TL function.4

Apart from an intercept α—which takes care of scaling issues—performance may be affected by three broad categories of factors. One category comprises the first three sums of (1). We will call the components of these sums individual determinants of performance. Note that these terms also happen to capture the commonly applied symmetric FAD specification of fit because the FAD specification p = θ_0 + θ_1(x − x*)² + θ_2(y − y*)² + θ_3(z − z*)² can be rewritten as5

    p = α + β_I x + γ_I x² + β_E y + γ_E y² + β_S z + γ_S z²,    (2)

which is evidently a special case of (1). Viewed this way, the GI model implies an integrated method for estimating both FAD and FAM types of fit.
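As an illustration of how such a specification can be operationalized, the sketch below (our own construction, not the authors' code) assembles the regressor set of Equation (1) for arbitrary I, E, and S groups of logged variables; the function name gi_design_matrix is ours.

    # A minimal sketch of the regressor set of Equation (1): main effects, squares,
    # within-group duals, between-group duals, and I x E x S triple interactions.
    from itertools import combinations, product
    import pandas as pd

    def gi_design_matrix(I: pd.DataFrame, E: pd.DataFrame, S: pd.DataFrame) -> pd.DataFrame:
        """All input variables are assumed to be measured in natural logarithms."""
        X = pd.concat([I, E, S], axis=1).copy()
        # squared terms
        for col in list(X.columns):
            X[f"{col}^2"] = X[col] ** 2
        # within-group dual interactions
        for group in (I, E, S):
            for a, b in combinations(group.columns, 2):
                X[f"{a}*{b}"] = group[a] * group[b]
        # between-group dual interactions
        for G1, G2 in ((I, E), (I, S), (E, S)):
            for a, b in product(G1.columns, G2.columns):
                X[f"{a}*{b}"] = G1[a] * G2[b]
        # multiple (I x E x S) interactions
        for a, b, c in product(I.columns, E.columns, S.columns):
            X[f"{a}*{b}*{c}"] = I[a] * E[b] * S[c]
        return X

With n_I = 2 and n_E = n_S = 1, this produces 16 columns, which together with an intercept gives the 17 terms counted later in Table 6.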
Below, the focus of interest, as in the FAM approach, relates to the remaining terms of (1), i.e., to the interactions among variables. This is not to say that the individual (main and squared) effects are not important—they are. This is not only because they implicitly capture symmetric FAD but also because in order to understand the effect of a bundle of variables one needs to understand the individual effects as well (Cappelli and Neumark 2001). There are two categories of such interaction terms in model (1), which we will refer to as within-group fit and (dual and multiple) between-group fit determinants of performance. Next we identify each of these two categories with the relevant terms of (1) and explain their nomenclature and rationale. Table 1 provides a summary for clarity and ease of reference.

The fourth, fifth, and sixth sums of (1) capture within-group fit measures. Thus, the δ_{ii′I} coefficients capture all possible dual (bivariate) interactions among variables within the set of internal firm variables, I. For example, if an owner's experience (x_1, say) is more productive in a firm with a centralized decision-making structure (x_2, say), then one might expect performance to be positively related to x_1 x_2: i.e., we would expect to see δ_{12I} > 0.6 This would be indicative of a positive fit. A similar interpretation applies to the other two sums, which relate to interactions among variables within the external variable set and within the strategy variable set.

The δ coefficients in the seventh, eighth, and ninth sums of (1) capture all possible dual interactions between variables in different variable groups. For example, consider an incumbent firm that faces a more dynamic, competitive environment as a result of a rival entering the market with a novel differentiated product. Suppose variable y_1 captures this aspect of the new competitive environment. At the same time, the incumbent firm pursues a strategy z_1 of low-cost competition. The variables y_1 and z_1 come from different variable groups (E and S, respectively), so between-group fit is suggested. In this example, we might expect δ_{11ES} (in the first sum on the last line of (1)) to be negative, reflecting a deterioration in the incumbent firm's performance. This would be indicative of a negative fit or misfit between this strategy and this feature of the external environment. On the other hand, let z_2 be the strategy of "enter now" for a new entrant and let y_2 be a variable capturing the environmental condition of "less advertising by a rival." Then we might expect these two variables to have a positive fit impact on the performance of a new entrant, implying δ_{22ES} > 0 for such firms.

In contrast, the coefficients φ_{ijk} [in the final sum of (1)] capture the importance of any multiple interactions between members of the three groups of variables. These terms actually constitute an extension to the conventional TL function commonly used in the economics literature. In a mathematical sense, the terms capture part of a third-order approximation to any general performance function; but in the present context, their rationale is to represent multiple interactions across the three groups of variables, as explained below. Something akin to these three-way interactions recently appeared in an important paper by Cassiman and Veugelers (2006).

An example might help at this point. Imagine, for instance, that following a particular strategy (e.g., z_3) in the face of a particular environmental opportunity (e.g., y_3) only translates into better performance if the firm has in place a particular aspect of internal organization (x_3, for example). Then the dual fit between y_3 and z_3 is ineffective as far as performance is concerned, i.e., δ_{33ES} = 0, whereas in conjunction with x_3, performance is enhanced, so φ_{333} > 0. Thus, the multiple between-group fit terms capture any performance-enhancing interactions that require internal, external, and strategic elements to be present simultaneously.

Table 1  Summary of Determinants of Performance

  Name                 Terms of GI model                                              Description

  A. Relating to fit as deviation
  Individual effects   Σ_{i=1}^{n_I} (β_{iI} x_i + γ_{iI} x_i²)                       Individual effects on performance
                       Σ_{j=1}^{n_E} (β_{jE} y_j + γ_{jE} y_j²)
                       Σ_{k=1}^{n_S} (β_{kS} z_k + γ_{kS} z_k²)

  B. Relating to fit as moderation
  Within-group fit     Σ_{i′≠i} δ_{ii′I} x_i x_{i′}                                   Dual interactions between variables within the groups
                       Σ_{j′≠j} δ_{jj′E} y_j y_{j′}
                       Σ_{k′≠k} δ_{kk′S} z_k z_{k′}
  Between-group fit    Σ_{i=1}^{n_I} Σ_{j=1}^{n_E} δ_{ijIE} x_i y_j                   Dual interactions between variables across groups
                       Σ_{i=1}^{n_I} Σ_{k=1}^{n_S} δ_{ikIS} x_i z_k
                       Σ_{j=1}^{n_E} Σ_{k=1}^{n_S} δ_{jkES} y_j z_k
                       Σ_{i=1}^{n_I} Σ_{j=1}^{n_E} Σ_{k=1}^{n_S} φ_{ijk} x_i y_j z_k  Multi-interactions across groups

Measuring Fit

The foregoing has talked loosely of "fit" within and between components of I, E, and S. We now propose a more formal way of measuring fit. Specifically, we will propose a statistic that captures fit through bundles of interaction terms. In so doing, we integrate the fit-as-system approach into the GI method. Our starting point is the observation that, in practice, all models of performance estimated from sample data are subject to sampling error, so it is important to have measures of fit that are robust to this problem. In addition, we seek to quantify significance and importance irrespective of the sign of its effects (because fit can have positive and misfit negative effects on performance) and in a way that can be assessed for statistical robustness.

A convenient measure of fit that satisfies these two properties is the statistical concept of the incremental
contribution (IC) of the various sums in (1). As will be explained more formally below, IC is the extent to which a bundle of variables as a group contributes statistically to explaining cross-firm variations in performance. In this way, the significance of any individual contribution of bivariate (mis)fits and the overall contribution of bundles of (mis)fits can be estimated in an integrated framework (see Burton et al. 2002, p. 1479). Not only does this enable the researcher to distinguish genuine fit from random noise such as measurement error, but also it enables fit to be evaluated at different levels of aggregation. The researcher might, for example, be interested in checking not only whether the individual sums composing "within-group fit" are important aspects of fit but also whether their aggregate sum is, too. The latter indicates whether within-group fit as a whole is a coherent aggregate measure of fit. In a similar manner, we might be interested in assessing whether aggregate "between-group fit" is also a coherent entity, by combining all four sums making up the between-group fit entry in Table 1.

Statistical Framework and Incremental Contribution

In order to define IC formally, we require a statistical model of performance, p_t, where t (t = 1, …, T) indexes the particular firm t from the total of T firms in the sample. For convenience, write the sample equivalent of the right-hand side (RHS) of (1) in compact matrix form as

    p_t = W_t b + u_t,    (3)

where p_t is a column vector of log firm performance for all T firms in the sample; W_t is a (k × T) matrix containing unity and all of the (k − 1) variables on the RHS of (1) (i.e., 1, {x_it}_{i=1}^{n_I}, {y_jt}_{j=1}^{n_E}, {z_kt}_{k=1}^{n_S}, and their interactions); b is the vector of associated coefficients (i.e., α and every β, γ, δ, and φ coefficient); and u_t is a vector of errors, with mean zero and variance σ².

This is a standard regression model, which can be estimated by ordinary least squares (OLS). Furthermore, it is easy to test linear restrictions. All modern software packages compute F statistics of r linear restrictions, denoted by F(r, T − k). For example, suppose that one wanted to test whether the dual interactions of the I variables were jointly insignificant. This implies r = n_I(n_I − 1)/2. Then the well-known F statistic F(r, T − k) forms the basis of our measures of IC. Consider the following statistics:

    W = [T/(T − k)] r F(r, T − k),    (4)

    LR = T ln[1 + r F(r, T − k)/(T − k)],    (5)

    LM = T · r F(r, T − k) / {(T − k)[1 + r F(r, T − k)/(T − k)]}.    (6)

These are three large-sample tests based on the F statistic, called the Wald (W), likelihood ratio (LR), and Lagrange multiplier (LM) tests, respectively. They all follow a χ²(r) distribution under the null hypothesis that the dual interactions of the I variables are jointly insignificant. Asymptotically, i.e., as T → ∞, values of W, LR, and LM converge. In small samples, we have the ordering W ≥ LR ≥ LM (see, e.g., Greene 2003, for a proof).

Each of these three statistics is a valid measure of IC. They can be used to measure fit for any of the sums in Table 1. All that is needed is to impose the zero restrictions on the terms whose IC is being tested and then to compute the above statistics with r defined accordingly. Hence these statistics are easy to compute in practice.
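The following sketch, assuming the F statistic of the relevant joint restriction has already been obtained from any regression package, computes statistics (4)–(6) and the corresponding chi-square critical value; it illustrates the formulas above rather than reproducing the authors' implementation.

    # A small sketch of statistics (4)-(6) from the F statistic of r linear
    # restrictions in a regression with T observations and k regressors.
    import numpy as np
    from scipy import stats

    def ic_statistics(F: float, r: int, T: int, k: int) -> dict:
        """Wald, likelihood-ratio, and Lagrange-multiplier versions of the
        incremental contribution, each compared with a chi-square(r) critical value."""
        rF = r * F
        W = T * rF / (T - k)                      # Equation (4)
        LR = T * np.log(1.0 + rF / (T - k))       # Equation (5)
        LM = T * rF / ((T - k) + rF)              # Equation (6)
        crit = stats.chi2.ppf(0.95, df=r)         # 5% critical value
        return {"W": W, "LR": LR, "LM": LM, "crit_5pct": crit}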
Panel A of Table 2 summarizes the IC statistics this procedure gives rise to, where subscripts on the χ² statistics indicate the type of fit being measured. Higher values of the IC statistics indicate a greater statistical impact on performance from the specified interactions, irrespective of whether its individual components affect performance positively or negatively. Values that exceed the critical value of the χ²(r) distribution at significance level a suggest that one can be 100 × (1 − a) percent confident that the aspect of fit being tested is not just random noise.

Table 2  Summary of GI Fit Statistics

  Measure                                   Degrees of freedom                                        Description

  A. Separate fit statistics
  χ²_I                                      n_I(n_I − 1)/2                                            Fit within I variables
  χ²_E                                      n_E(n_E − 1)/2                                            Fit within E variables
  χ²_S                                      n_S(n_S − 1)/2                                            Fit within S variables
  χ²_IE                                     n_I n_E                                                   Dual fit between I and E variables
  χ²_IS                                     n_I n_S                                                   Dual fit between I and S variables
  χ²_ES                                     n_E n_S                                                   Dual fit between E and S variables
  χ²_IES                                    n_I n_E n_S                                               Multiple fit between the variables

  B. Aggregate fit statistics
  χ²_I + χ²_E + χ²_S                        [n_I(n_I − 1) + n_E(n_E − 1) + n_S(n_S − 1)]/2            Total within-group fit
  χ²_IE + χ²_IS + χ²_ES + χ²_IES            n_I(n_E + n_S) + n_E n_S(1 + n_I)                         Total between-group fit
  χ²_I + χ²_E + χ²_S + χ²_IE                [n_I(n_I − 1) + n_E(n_E − 1) + n_S(n_S − 1)]/2            Grand overall fit
    + χ²_IS + χ²_ES + χ²_IES                  + n_I(n_E + n_S) + n_E n_S(1 + n_I)

Notice that the IC statistics defined above do not depend on the signs of the coefficients, but instead upon their statistical ability to affect performance; they are symmetrical in this respect. Hence the IC statistics can detect fit or misfit—i.e., whether interactions are most or least beneficial for performance. In practice, the estimated coefficients of (1) should also be cited alongside the IC statistics because they measure the economic
(rather than the statistical) significance of interactions. Both will be cited in our numerical illustration below.

Two questions arise at this point. Why use the three statistics (4), (5), and (6) instead of F(r, T − k)? And which of the three statistics should be used in practice? These questions are addressed in the two subsections that follow. As we will see next, the answer to the first question is based on a convenient additive property of the χ² distribution, which permits easy aggregation and disaggregation of IC fit measures across categories. Then we show that the answer to the second question is an empirical one, informed by Monte Carlo simulations relating to size and power of the statistics.

Aggregation and Disaggregation

As well as computing the seven IC fit measures corresponding to the seven sums in Panel B of Table 1, obtaining more aggregated measures of fit might also be of interest. This is easily done by performing joint significance tests on every sum we are interested in. By the additive property of χ² distributions, an asymptotically equivalent (and much simpler) procedure is to add up the individual χ² IC statistics corresponding to these sums. See Panel B of Table 2 for a summary of these aggregate fit measures and their degrees of freedom.

For illustrative purposes, let us return to the case of the first sum of Panel B in Table 1. Suppose one were to find a mixture of positive and negative components in the sum and wanted to measure separately fit attributable to positive components and misfit attributable to negative components. Our approach can also accomplish this by exploiting the additive property of χ² distributions. To see this, let v < n_I(n_I − 1)/2 be the number of positive interactions and w ≤ [n_I(n_I − 1)/2] − v be the number of negative ones [n_I(n_I − 1)/2 − (v + w) ≥ 0 is the number of zero interactions]. Then calculate the IC measure for just positive interactions, say. This is distributed as χ² with v degrees of freedom. The IC measure for negative ones only is distributed as χ² with w degrees of freedom. Clearly, any of the sums in Table 1 can be disaggregated in this fashion.
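A minimal sketch of this aggregation step, assuming the component chi-square IC statistics and their degrees of freedom are already available; the numerical values in the example call are placeholders, not results from the paper.

    # Aggregate IC fit statistics by summing component chi-square statistics
    # and their degrees of freedom, as described above.
    from scipy import stats

    def aggregate_ic(components):
        """components: iterable of (chi2_value, degrees_of_freedom) pairs."""
        total_stat = sum(c for c, _ in components)
        total_df = sum(d for _, d in components)
        p_value = stats.chi2.sf(total_stat, df=total_df)
        return total_stat, total_df, p_value

    # e.g., total within-group fit from the three within-group IC statistics
    stat, df, p = aggregate_ic([(4.2, 1), (0.7, 1), (2.9, 1)])  # placeholder values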
Monte Carlo Simulations on the IC Statistics

Because W ≥ LR ≥ LM, it follows that if LM indicates statistically significant fit, or if W indicates insignificant fit, there will be no conflict between the different IC fit measures. So when assessing whether fit is significant, LM can be regarded as the most "conservative" in the sense that this statistic will indicate significant fit less frequently than the other two. Inference becomes less clear, however, if the three statistics give differing results. In our examples below, that problem does not arise; to address this issue more generally, we performed Monte Carlo simulations to evaluate the performance of the statistics in order to improve our ability to discriminate between them and make secure inferences in cases when the three IC statistics disagree (results available upon request). The simulation results were based on 10,000 bootstrapped draws from the residuals of the translog performance model, where J irrelevant variables drawn from independent standard normal distributions were used to test for size. This does not "predetermine" our results to mirror those from our example, as the bootstrapping does no more than to sample randomly from some (nonparametric) distributed random variable. Several simulations were performed for various combinations of J and the sample size, N. All three χ² statistics were reasonably well sized, even in small samples, although LM has the best size properties overall. Notably, the LM does especially well for the case of larger numbers of restrictions on the irrelevant variables, J. But before reaching any conclusions about the choice of a preferred statistic in cases where they disagree, we also need to evaluate the power properties of the statistics.

With the simulation results, we checked the ability of the three IC statistics to correctly reject a false null hypothesis that J truly relevant explanatory variables are actually irrelevant. Again, the Monte Carlo simulations were performed based on bootstrapped residuals from the fit model. The results show that there is little to choose between the three statistics in terms of power. Reasonable power is obtained for all of the restrictions considered given a sample size of 100 or more. As J increases, power goes to 1.00 as we move up the power curve; i.e., for any given significance level, the wrong null hypothesis is increasingly untenable. In general, it is well known that absolute power values are less interesting than relative values based on comparisons between the performance of the statistics. Overall, based on these simulation results, we recommend that researchers either cite all three IC statistics or, if only one is to be used, select the LM statistic. The LM has the best size properties, and good power properties. Moreover, as the most conservative statistic it is also the least likely to overstate occurrences of genuine fit in practice.

Comparison of the Approaches by Simulation

To provide an illustrative comparison between our GI fit approach and others proposed in the literature, the following variables were defined: two I variables x_1 and x_2, one E variable y, and one S variable z. This restriction to a set of four variables is done for the sake of convenience, without affecting the validity of the comparative analysis that follows. Define the vector v ≡ (x_1, x_2, y, z), where each variable is measured in natural logarithms. For each variable, 100 observations were randomly generated from a uniform distribution with support [0, 10]. The results below are insensitive to this (essentially arbitrary) choice of random number generator.
In principle, the GI method belongs to the fit-as-moderation approach, although our specification (1) also implicitly captures FAD elements based on symmetric distance measures, as explained above. The key difference between the GI approach and distance-based FAD alternatives is that the latter explicitly calculate deviations from ideal types, whereas such ideal-type deviations (when expressed symmetrically, at least) are implicitly nested in our GI specification. As noted in the literature review, this explicit FAD conceptualization is the principal alternative approach to the FAM notion. The explicit FAD approach measures misfit as some function of the squared deviation of v from an "ideal" profile, which is denoted by v*. Common FAD functions include the identity function (used below for simplicity) and the square root function (giving rise to the Euclidean distance). The results below do not depend on which function is chosen.

In what follows, we will consider various fit measures that are common versions of the FAM or the FAD approach. To enable a fair comparison to be made between different fit approaches, two data generation processes (DGPs) are used, one corresponding to the FAM and one to the FAD approach. A DGP is an equation that generates data on performance outcomes as a mixture of deterministic and stochastic influences. The use of two different DGPs enables one to assess the sensitivity of different approaches to the underlying performance data-generation process.

Log performance is denoted by p; its values were computed using one or other of the following DGPs:

    p = −(1/2) x_1 x_2 − x_2 y z + ε,    (7)

    p = −(1/4) (v − v*)′(v − v*) + ε,    (8)

where ε is a vector of 100 independent random draws from the normal distribution with mean 0 and variance 3. There are no special implications of using this particular distribution of ε, although we note that a higher variance would inject greater noise into the DGPs, which makes it harder for any model to fit the data well. Key is that Equation (7) reflects a FAM data-generation process and Equation (8) a FAD one. By working with both (7) and (8), we can explore the comparative strengths and weaknesses of FAM- and FAD-based estimation approaches, or any other, in dealing with both types of underlying data.
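For concreteness, here is a sketch of how data conforming to the two DGPs can be generated, following the description above; the seed and variable names are arbitrary, and the ideal profile v_star anticipates the draw (4, 8, 9, 1) reported later in the text.

    # A sketch of the two data-generation processes (7) and (8).
    import numpy as np

    rng = np.random.default_rng(123)
    N = 100
    x1, x2, y, z = (rng.uniform(0, 10, N) for _ in range(4))
    v = np.column_stack([x1, x2, y, z])
    eps = rng.normal(0.0, np.sqrt(3.0), N)       # noise with mean 0 and variance 3

    # FAM-type DGP, Equation (7)
    p_fam = -0.5 * x1 * x2 - x2 * y * z + eps

    # FAD-type DGP, Equation (8), for an ideal profile v_star
    v_star = np.array([4.0, 8.0, 9.0, 1.0])
    p_fad = -0.25 * np.sum((v - v_star) ** 2, axis=1) + eps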
More specifically, the key question explored in this section is the robustness of fit approaches to four different problems that can arise in real-world applications. One question is the robustness of the approaches to a different DGP than the one typically assumed. For instance, does the GI approach capture FAD sources of fit, as generated by (8), and how does this compare to explicit FAD distance-based estimation approaches prominent in the literature? A second question relates to the robustness of the approaches to mismeasurement of the variables in v. For example, is the GI approach more or less sensitive to variable mismeasurement than the alternative approaches? A third question is how robust the various approaches are to the use of small sample sizes, in conjunction with the number of contingencies included. After all, the GI approach does seem to be demanding in this respect by including numerous interaction terms. Finally, a fourth question involves robustness to multicollinearity. This is particularly pertinent for the GI approach (and other FAM approaches, for that matter) because it might include numerous highly correlated main and interaction terms in a single equation. Robustness of inference to these four problems is taken to be a desirable property of fit approaches in what follows. To the best of our knowledge, researchers have not yet explored comparative robustness properties of different fit approaches. Yet in practice, some or all of these problems may be commonplace.

The next subsection considers the case where performance is determined by the GI DGP (7). We first illustrate the GI approach in practice and show that it accurately represents the DGP when random noise ε is present. To address the first robustness issue, we then go on to explore the performance of alternative fit approaches using the same DGP. The question here is whether alternative approaches can also track the data well and detect the structure of fit or misfit when the data are generated by DGP (7). To provide a fair basis for comparisons, the second subsection treats the case where performance is determined by the FAD DGP (8). We ask how robust the GI approach is to the different DGP. The third subsection treats the three remaining robustness issues for both FAM and FAD approaches, namely, robustness of the different approaches to mismeasurement of the variables in v, to the use of small sample sizes, and to instances of collinearity. The final subsection assesses the costs and benefits of the various approaches.

We compare our GI approach to four typical examples of alternative approaches, as explained in the literature review section above. The first is Vorhies and Morgan (2003). They use an explicit FAD approach, applying a two-step procedure by, in the first step, estimating ideal types on the basis of average values for the contingency variables for the top 10% of performers and, in the second step, including Euclidian distance measures in a performance equation estimated for the remaining 90% of the sample. The second alternative is from Skaggs and Ross Huffman (2003). They apply the standard FAM method, including a limited number of bivariate interaction terms in their performance model. Third, Burton et al. (2002) suggest a fit-as-system method. They define a series of dummy misfit variables that make up the key explanatory variables in a performance regression.
Fourth, Zajac et al. (2000) develop a fit-as-mediation methodology. They suggest a two-equation estimation procedure, where the residuals of the first (mediation) regression are the key explanatory variables in the second (performance) model.

Comparisons Based on a GI DGP

The regression results from estimating the GI model using DGP (7) are presented in column 1 of Table 3. IC statistics measuring fit are reported in Panel A of Table 4. Recall from (7) that all and only the within-x term x_1 x_2 and the multiple-fit term x_2 y z should be statistically significant. Because the DGP follows the GI structure, it is no surprise that the results show this to be the case, with significant IC statistics for these interactions but none for the others.7 The coefficient estimates are also close to their "true" values of −0.50 and −1.00.

We next ask what happens when alternative fit approaches are applied to these data. We start with column 1 of Table 5, which presents the outcome of implementing the Vorhies and Morgan (2003) FAD approach to measuring fit. This involves, first, estimating v* as average values of v sampled from the top 10% of performers to obtain v̂*; second, computing a Euclidean distance measure defined as D = (v_j − v̂*)′(v_j − v̂*) for the remaining 90% of the sample; and third, regressing performance p on D using the specification

    p = θ_0 + θ_1 D + ξ,    (9)

where ξ is a disturbance term. According to Vorhies and Morgan (2003), misfit is detected when θ_1 is negative and significant.8 This prediction is borne out by the results presented in column 1 of Table 5, suggesting that the Vorhies and Morgan (2003) approach does indeed detect the existence of FAM-generated fit when it is present. Furthermore, the approach is superior to randomly choosing v* (corresponding perhaps to incorrect theoretical priors), as can be seen in simulations (not reported here) where no significant effects from D emerge. However, the Vorhies and Morgan method also suffers from important drawbacks. It cannot identify the nature of fit (i.e., FAM or FAD) and so cannot isolate the true determinants of what makes fit work. The adjusted R² in column 1 of Table 5 is also low compared with that of column 1, Table 3, reflecting the large amount of structure not being exploited by this method.
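The following sketch is one possible rendering of this two-step procedure (it is not Vorhies and Morgan's own code): the ideal profile is estimated from the top 10% of performers, and performance is then regressed on the squared distance D for the remaining observations, as in Equation (9).

    # A sketch of the two-step deviation procedure described above.
    import numpy as np
    import statsmodels.api as sm

    def vorhies_morgan_fit(v: np.ndarray, p: np.ndarray, top_share: float = 0.10):
        cutoff = np.quantile(p, 1.0 - top_share)
        top = p >= cutoff
        v_star_hat = v[top].mean(axis=0)                   # estimated ideal profile
        rest = ~top
        D = ((v[rest] - v_star_hat) ** 2).sum(axis=1)      # squared Euclidean distance
        res = sm.OLS(p[rest], sm.add_constant(D)).fit()    # Equation (9)
        return res  # misfit indicated by a negative, significant slope on D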
Consider next the bivariate interaction approach, which has dominated the FAM approach in the literature to date. A typical example is Skaggs and Ross Huffman (2003). Column 2 of Table 3 presents results of including only bivariate interactions and excluding all multiple interactions in (1). The corresponding IC statistics appear in Panel B of Table 4. The restrictions of only bivariate interactions are statistically inadmissible (F = 65.78; p < 0.0001). Reflecting this, the IC statistics in Panel B of Table 4 differ markedly from those in Panel A. At one level, this is not too surprising: by omitting important explanatory variables a regression cannot be expected to give accurate results. What these results therefore illustrate are the well-known consequences of mis-specification in the context of FAM methods.
Table 3  Estimates of the GI Performance Model

  Variable      (1)             (2)             (3)              (4)             (5)              (6)
  Constant      6.95 (1.33)     79.30* (13.04)  180.16* (3.50)   −14.53* (4.51)  −1.04 (0.22)     3.24 (0.21)
  x1            −0.71 (0.70)                    −12.42 (0.34)    −0.34* (2.04)   −0.70 (0.76)     −0.46 (0.12)
  x1²           −0.08 (1.34)                    −8.57 (0.84)     −0.04 (0.50)    0.01 (0.11)      −0.09 (0.42)
  x2            0.13 (0.16)                     −0.92 (0.08)     0.72* (3.33)    0.43 (0.58)      −0.43 (0.14)
  x2²           −0.02 (0.36)                    −1.45 (1.40)     0.05 (1.19)     −0.05 (0.86)     −0.14 (0.40)
  y             −1.33 (1.35)                    −14.14 (1.32)    0.87* (3.30)    0.59 (0.68)      0.39 (0.13)
  y²            0.02 (0.45)                     −0.24 (0.26)     0.02 (0.54)     −0.86 (1.81)     −0.10 (0.48)
  z             −0.44 (0.37)                    −24.74* (2.36)   −0.75* (2.61)   −0.51 (0.50)     −1.23 (0.46)
  z²            0.00 (0.05)                     0.34 (0.38)      −0.11 (0.70)    0.00 (0.09)      0.09 (0.61)
  x1·x2         −0.49* (8.40)   −0.84* (2.75)   35.98 (1.87)     −0.02 (0.28)    −3.06* (58.43)   −0.55* (4.10)
  x1·y          0.18 (1.51)     0.14 (0.44)     −6.96 (0.29)     0.06 (0.55)     0.05 (0.47)      0.17 (0.31)
  x2·y          0.05 (0.51)     −2.69* (7.42)   −30.63 (1.40)    −0.17 (1.86)    −0.01 (0.09)     −0.03 (0.10)
  x1·z          0.17 (1.26)     0.42 (1.06)     43.11* (2.10)    0.03 (0.28)     −0.00 (0.02)     0.23 (0.48)
  x2·z          −0.11 (0.81)    −3.01* (9.82)   −54.96* (2.28)   −0.12 (1.20)    0.12 (1.00)      −0.14 (0.39)
  y·z           0.03 (0.17)     −3.10* (9.01)   13.49 (0.49)     −0.11 (1.30)    0.07 (0.54)      0.02 (0.07)
  x1·y·z        −0.02 (0.88)                    13.06 (0.55)     0.00 (0.18)     0.01 (0.34)      −0.03 (0.37)
  x2·y·z        −0.99* (44.01)                  −95.74* (3.49)   −0.01 (0.27)    −15.65* (877.06) −0.97* (19.76)
  Adj. R² (%)   99.96           94.26           83.54            81.24           99.99            99.93

  Notes. Dependent variable is p_t. All variables defined in the text. All are continuous variables apart from the variables in the last 8 rows of column (3), which are dummy variable replacements. Method of estimation in all columns: OLS. Number of observations in all columns is 100 apart from (6), where it is 22. The columns are explained in the text. Absolute t-statistics are in parentheses.
  * Indicates significance at 5%.

Burton et al. (2002) proposed a third, theory-driven, approach to measuring FAM. They define a dummy variable SIT_MISF whose value equals one for an observation for which any of the contingency variables for that observation are in a state of (theoretically determined) misfit. Otherwise its value equals zero for that observation. Burton et al. also define additional dummy variables capturing misfit, which is not attributable to
each contingency variable in turn. So in the present context, a dummy variable NO_X1 takes a value of one for an observation for which SIT_MISF takes the value one but there is no misfit in x_1 for that observation (else it takes the value zero). Dummy variables NO_X2, NO_Y, and NO_Z can be defined analogously. Burton et al. predict that in a regression of p on the complete set of dummy variables (SIT_MISF, NO_X1, NO_X2, NO_Y, and NO_Z), the variable SIT_MISF should have a significant negative effect, reflecting performance-damaging misfit, whereas either the other dummy variables are zero or they are positive if there are any benefits from the absence of misfit in a single dimension.

Table 4  IC Fit Statistics

  IC statistic       LM        LR         W             5% crit. val.

  A. Column (1) of Table 3
  Within-x fit       45.94*    61.51*     81.98*        3.84
  Between-xy fit     4.33      4.43       4.53          5.99
  Between-xz fit     2.38      2.40       2.43          5.99
  Between-yz fit     0.04      0.04       0.04          3.84
  Multiple fit       96.66*    340*       2,897*        5.99

  B. Column (2) of Table 3
  Within-x fit       8.37*     8.74*      9.13*         3.84
  Between-xy fit     44.67*    59.19*     80.75*        5.99
  Between-xz fit     59.08*    89.36*     144.35*       5.99
  Between-yz fit     49.46*    68.24*     97.87*        3.84
  Multiple fit       —         —          —             —

  C. Column (3) of Table 3
  Within-x fit       4.05*     4.14*      4.22*         3.84
  Between-xy fit     2.55      2.58       2.61          5.99
  Between-xz fit     8.94*     9.36*      9.81*         5.99
  Between-yz fit     0.28      0.29       0.29          3.84
  Multiple fit       12.78*    13.67*     14.65*        5.99

  D. Column (4) of Table 3
  Within-x fit       0.10      0.10       0.10          3.84
  Between-xy fit     4.31      4.41       4.51          5.99
  Between-xz fit     2.67      2.71       2.75          5.99
  Between-yz fit     1.99      2.01       2.03          3.84
  Multiple fit       0.10      0.10       0.10          5.99

  E. Column (5) of Table 3
  Within-x fit       97.63*    374.08*    4,113.25*     3.84
  Between-xy fit     0.26      0.26       0.27          5.99
  Between-xz fit     2.15      2.17       2.19          5.99
  Between-yz fit     0.35      0.35       0.35          3.84
  Multiple fit       99.99*    932.91*    1.1 × 10⁶*    5.99

  F. Column (6) of Table 3
  Within-x fit       16.95*    32.37*     73.82*        3.84
  Between-xy fit     0.54      0.54       0.55          5.99
  Between-xz fit     1.12      1.15       1.18          5.99
  Between-yz fit     0.02      0.02       0.02          3.84
  Multiple fit       21.81*    104.17*    2,483.32*     5.99

We implemented the Burton et al. (2002) method to be as generous to it as possible by assuming that the researcher knows the true form of the DGP (7). Hence any values of x_1 x_2 or x_2 y z that are greater than their respective means have the greatest negative effects on performance and so are associated with misfit. After calculating the sample means of x_1 x_2 and x_2 y z in our simulation, we coded values of SIT_MISF as one for observations where either of these product variables have above-average values—and zero if neither do. Also following those authors, we coded NO_X1 as one for observations where SIT_MISF = 1 but where x_1 takes values less than or equal to its sample mean (implying the variable is causing least misfit)—and as zero if these conditions do not hold. The other dummy variables proposed by Burton et al. were defined analogously (e.g., we coded NO_X2 as equal to one for observations where SIT_MISF = 1 but where x_2 takes values less than or equal to its sample mean). This is generous to Burton et al. because it comes closest to mimicking the true DGP, recognizing that theory alone might not always correctly identify instances of fit.
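The sketch below reflects our reading of these codings; the function and variable names are ours, and the thresholds follow the sample-mean rule just described.

    # A sketch of the misfit dummy codings used in the comparison above.
    import pandas as pd

    def burton_dummies(x1, x2, y, z):
        m1, m2 = x1 * x2, x2 * y * z                       # the two (mis)fit products
        sit_misf = ((m1 > m1.mean()) | (m2 > m2.mean())).astype(int)
        d = pd.DataFrame({"SIT_MISF": sit_misf})
        for name, var in {"NO_X1": x1, "NO_X2": x2, "NO_Y": y, "NO_Z": z}.items():
            d[name] = ((sit_misf == 1) & (var <= var.mean())).astype(int)
        return d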
The results of implementing this approach are presented in column 2 of Table 5. Contrary to predictions, SIT_MISF is positive and insignificant, whereas the effects of the other dummy variables are mixed. Obviously, if the theory underlying the dummy variable coding is incorrectly specified, the Burton et al. approach is liable to yield even more misleading results. After all, even expert opinions about ideal types can differ (Doty et al. 1993), implying that some "expert" opinions are sometimes wrong.

Table 5  Regression Results for Alternative Fit Approaches

  Variable      (1)             (2)              (3)              (4)             (5)             (6)
  Constant      −4.44 (0.19)    −21.47* (17.08)  −131.76* (5.60)  −1.12 (1.03)    −7.25* (2.59)   −51.14* (2.39)
  D             −2.63* (4.49)                                     −0.41* (19.67)  −0.33* (45.22)  0.19 (0.56)
  SIT_MISF                      5.09 (1.33)
  NO_X1                         6.50 (1.75)
  NO_X2                         −7.61 (1.84)
  NO_Y                          −10.41* (2.15)
  NO_Z                          11.78* (2.88)
  û                                              −4.04 (1.01)
  ĉ                                              −3.67 (0.96)
  û·ĉ                                            0.02 (1.28)
  Adj. R² (%)   31.58           18.93            0.00             81.26           95.83           −0.03

  Note. Variables are defined in the text. See the notes to Table 3 for entries.

An anonymous referee wondered whether the GI model might outperform Burton et al. (2002) if dummy variables were used in place of all interactions in (1). This is a valuable exercise because it identifies the role
played by dummy variable codings per se as distinct from the specific use of them by Burton et al. We explored this possibility by replacing all interactions in (1) with dummy variables. For example, a dummy variable for x_1 x_2 took the value one for values of this variable that exceeded its sample mean and zero for values below its sample mean. Dummy variables were defined similarly for all of the dual and multiple between-group fit determinants of performance. These dummies replaced the last eight continuous variables of (1). The parameter estimates appear in column 3 of Table 3, and the IC statistics appear in Panel C of Table 4. Compared with column 1 of Table 3 and Panel A of Table 4, these results show that the dummy model performs adequately but not as well as the pure TL model. The parameter estimate on x_1 x_2 in column 3 of Table 3 takes the wrong sign and is not quite significant at 5%, although the IC statistics in Panel C of Table 4 correctly identify both types of fit as significant. Unfortunately, some spurious between-xz fit also appears to be significant. Essentially, replacing continuous variables with dummy variables discards valuable information, which is why the results based on dummy interactions identify the DGP inaccurately. Nevertheless, by correctly identifying at least some aspects of fit, the results appear to be superior to those of the Burton et al. approach discussed above.
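A minimal sketch of this dummy-replacement variant, assuming the continuous interaction terms have already been collected in a data frame; it simply recodes each column as an above-sample-mean indicator.

    # Replace each continuous interaction column by a 0/1 indicator for values
    # above that column's sample mean.
    import pandas as pd

    def dummy_interactions(X_interactions: pd.DataFrame) -> pd.DataFrame:
        return (X_interactions > X_interactions.mean()).astype(int)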
Finally, we analyze one more alternative approach to fit, suggested by Zajac et al. (2000). This is the "fit as mediation" two-step approach, which has the following structure:

    c = f(v) + u,    (10)

    p = λ_0 + λ_1 û + λ_2 ĉ + λ_3 (û·ĉ) + ξ.    (11)

The first equation (10) specifies some auxiliary variable c as some function f(·) of the vector of covariates v; the u are random regression errors. Predicted values of c are given by ĉ = f(v) and residuals are given by û = c − ĉ. In (11) these predicted values and residuals are taken to influence performance; misfit is detected by a negative coefficient λ_1. The rationale of Zajac et al. is that firms whose c values diverge from those of the predicted conditional mean given by the first term on the RHS of (10) will likely be in misfit and end up with lower performance.
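The two-step structure of (10) and (11) can be rendered as follows; this is a sketch under the assumption, used in the implementation described in the next paragraph, that f(·) is linear in the two product terms.

    # A sketch of the two-step mediation procedure in (10) and (11).
    import numpy as np
    import statsmodels.api as sm

    def zajac_two_step(x1, x2, y, z, c, p):
        # Step 1, Equation (10): regress the mediator c on the chosen covariates.
        F1 = sm.add_constant(np.column_stack([x1 * x2, x2 * y * z]))
        step1 = sm.OLS(c, F1).fit()
        c_hat, u_hat = step1.fittedvalues, step1.resid
        # Step 2, Equation (11): performance on residuals, fitted values, and product.
        F2 = sm.add_constant(np.column_stack([u_hat, c_hat, u_hat * c_hat]))
        return sm.OLS(p, F2).fit()   # misfit indicated by a negative coefficient on u_hat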
To be as generous as possible to the Zajac et al. method, it was implemented by (i) assuming that the true structure of f(·) is known: c = π_0 + π_1 x_1 x_2 + π_2 x_2 y z + u, and (ii) defining c as a genuine mediating variable: c = 0.5·p + ε, where ε was defined as N(0, 3) random noise as above. Column 3 of Table 5 presents the results. As is readily seen, this approach does indeed find a negative coefficient on û, but neither it nor any of the other coefficients are statistically significant. Hence the Zajac et al. method is silent about the existence and structure of fit or misfit, even though we know from the true DGP that fit is present. The Zajac et al. method performs poorly because the residuals do not exploit the underlying structure inherent in the fit relationship and so cannot adequately represent the fit that is present in the data.

In summary, the results in this section show that although one alternative method (Vorhies and Morgan 2003) can actually detect FAM fit when it is present, none of the alternative fit methods analyzed here uncover either the identity or full structure of fit for the FAM data-generation process given by (7). Hence the GI fit approach would seem to be a valuable addition to the set of fit approaches for at least some types of DGP. Of course, it remains to be seen how successful the GI approach is at mimicking alternative DGPs—i.e., whether the GI approach also satisfies the first robustness property above. This question is examined next.

Comparisons Based on a FAD DGP

Suppose now that the performance data are generated by (8). With this DGP, we might logically expect the explicit FAD approach described in the previous subsection to perform well, whereas the GI approach does not. To explore this possibility, we compare the GI approach to the one used by Vorhies and Morgan (2003). For illustrative purposes, four integers were randomly drawn from the uniform distribution with support [0, 10]. This gave v* = (4, 8, 9, 1). The FAD DGP involves substituting this v* into the RHS of (8) and adding randomly generated normal numbers generated from N(0, 3) to compute values of p. The Vorhies and Morgan (2003) method described above was then applied. That is, we calculated the average v for the 10 cases with the largest values of p, denoted by v̂*, and then computed values of D = (v − v̂*)′(v − v̂*) for the remaining 90 observations before running the regression p = θ_0 + θ_1 D. The Vorhies and Morgan method stipulates the regression estimate of θ_1 to be negative and statistically significant. This is indeed what is found in column 4 of Table 5.

We then applied the GI FAM approach to the same DGP (8). As noted in the previous section, expanding the quadratic form in (8) yields an equation, (2), that is a special case of (1), namely where only the "individual effects" are present, with zero coefficients for all of the interaction effects. Column 4 of Table 3 and Panel D of Table 4 show that the GI approach is able to identify this case by correctly finding significant individual effects and insignificant interaction effects. Interestingly, these findings clearly identify the nature of the fit, namely FAD (recall that, in contrast, the Vorhies and Morgan method is unable to do this when the DGP is FAM). Furthermore, the adjusted R² is almost identical to that of the Vorhies and Morgan estimates in column 4 of Table 5 (the slight reduction is attributable to the redundant interaction variables in this case). This suggests a useful, important, and novel
possibility: In practice, when the true DGP is rarely if ever known, the GI approach can nevertheless identify the type of fit underlying the data (e.g., FAM in column 1 and FAD in column 3). In an empirical sense, therefore, FAD is “nested” in the GI approach. This implies a response to Fiss’s (2007, p. 1183) critique of the FAD approach: “[A] deviation score approach allows the researcher only limited peeks into the black box of configurations. It often remains unclear which aspect of the misfit actually affects the outcome in question.” In this respect, the GI method appears to outperform FAD-based alternatives.

Mismeasurement, Small Sample Sizes, and Collinearity
A second desirable property of fit measures is robustness to mismeasurement of variables. Below, we focus on comparing the GI approach with the explicit FAD approach used in Vorhies and Morgan (2003). To explore this issue for both the FAM and FAD fit approaches, the true data continue to be associated with v, whereas the error-prone actual values available to the researcher are taken to be 2.5 × v. If fit approaches are robust to mismeasurement of variables, their indications of fit should be unaffected. This prediction is tested using DGP (7) for the GI approach and using DGP (8) for the Vorhies and Morgan approach. The results appear in column 5 of Table 3 and Panel E of Table 4 for the GI approach, and in column 5 of Table 5 for the Vorhies and Morgan approach. The results in these tables show that each method still correctly identifies fit, so both methods appear to be robust to this type of mismeasurement.
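The intuition behind this robustness can be illustrated outside the paper's own simulation design: multiplying every observed variable by a common factor rescales the estimated coefficients but leaves the t- and F-statistics, and hence the interaction-based indications of fit, unchanged. A minimal sketch, with made-up variable names and coefficient values rather than the DGPs used above:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x, y = rng.normal(size=(2, n))                                   # "true" contingency variables
p = 1.0 + 0.5 * x + 0.5 * y + 0.8 * x * y + rng.normal(size=n)   # FAM-type performance equation

def t_stats(scale):
    # The researcher only observes scale * x and scale * y (multiplicative mismeasurement).
    xs, ys = scale * x, scale * y
    X = sm.add_constant(np.column_stack([xs, ys, xs * ys]))
    return sm.OLS(p, X).fit().tvalues

print(np.allclose(t_stats(1.0), t_stats(2.5)))  # True: inference about fit is unchanged
```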
The third desirable property of fit measures is robustness to a small sample size. To explore this issue, the number of observations of v and v∗ is reduced to 22. This decreases the degrees of freedom in the GI regression to a mere five. Yet as column 6 of Table 3 and Panel F of Table 4 reveal, the inference of the GI approach is remarkably robust to the very small number of degrees of freedom. In contrast, column 6 of Table 5 shows that the coefficient of the distance measure carries the incorrect sign and is statistically insignificant, implying no sources of misfit. This is, of course, the opposite of the true state of affairs.

So far, then, our findings suggest that the GI method is probably the most robust and informative of the fit approaches considered here. However, we have only applied the GI method using a small number of contingencies. At issue is the fact that the GI method proliferates parameters when there are numerous explanatory variables. Table 6 illustrates the problems that this can cause. The first row of Table 6 corresponds to the empirical analysis performed so far in this article, where n_I = 2 and n_E = n_S = 1. This model generates 17 regressors in (1), which according to our simulation analysis can be estimated using a minimum of 18 observations. Lower rows of Table 6 correspond to ever larger models. With 163 observations, for instance, the GI method can handle 12 contingency variables and the associated 155 regressors.⁹ Evidently, the number of explanatory variables grows rapidly as the number of contingency variables increases. In contrast to the alternative approaches considered here, the minimum sample size needed to operationalize the GI approach rapidly becomes quite large. This can in and of itself pose an important practical limitation if data are scarce. Clearly, although it is possible to create a large sample size in simulation exercises, it is often difficult or impossible when using real-world data.

Table 6  Numbers of Contingencies and Covariates in GI Models

n_I   n_E   n_S   No. of regressors   Min. sample size
 2     1     1            17                 18
 2     2     2            36                 39
 3     3     3            82                 90
 4     4     4           155                163
 5     5     5           261                n.a.
 6     6     6           406                n.a.

Notes. The number of regressors is 1 + 2(n_I + n_E + n_S) + (1/2)Σ_{j=I,E,S} n_j(n_j − 1) + n_I n_E + n_I n_S + n_E n_S + n_I n_E n_S; this is the number of terms in (1) in the text. Min. sample size is the minimum number of observations needed to apply the GI method in practice and obtain accurate IC statistics. n.a. = not available, i.e., the design matrix W_t′W_t is too large to invert to compute the OLS formula b̂ = (W_t′W_t)⁻¹W_t′p_t.
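For readers who want to verify these counts, the short function below (an illustrative sketch; the function and argument names are ours, not the paper's) reproduces the “No. of regressors” column of Table 6 from the formula in the table notes. The minimum sample sizes in the last column are not implied by this formula; they reflect the simulation evidence reported above on when the IC statistics remain accurate.

```python
def gi_regressor_count(n_i: int, n_e: int, n_s: int) -> int:
    """Number of terms in the GI performance equation (1) for n_i internal,
    n_e environmental, and n_s strategy contingency variables."""
    groups = (n_i, n_e, n_s)
    total = 1                                       # intercept
    total += 2 * sum(groups)                        # a linear and a squared term per variable
    total += sum(n * (n - 1) // 2 for n in groups)  # within-group pairwise products
    total += n_i * n_e + n_i * n_s + n_e * n_s      # between-group pairwise products
    total += n_i * n_e * n_s                        # three-way I x E x S products
    return total

print([gi_regressor_count(a, b, b) for a, b in [(2, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]])
# [17, 36, 82, 155, 261, 406] -- matches Table 6
```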
Another problem with the GI method is that when there are numerous contingencies, parameter estimation can become difficult or impossible. For example, some econometric programs place restrictions on the maximum number of covariates in the design matrix. Furthermore, it can be difficult to interpret coefficients in these cases, and collinearity can also become a problem. Although in general collinearity does not affect the F statistics (Gujarati 1995) on which the IC statistics are based—implying a broad degree of robustness in this sense—some software programs can break down when collinearity is very pronounced. And even though stepwise regression methods might be used to address these problems when degrees of freedom are scarce, stepwise regression is not without its limitations either. For example, the type of stepwise algorithm used (e.g., forward selection, backward elimination, or some combination), the order of parameter entry (or deletion), and the number of candidate parameters can all affect the selected model (e.g., Derksen and Keselman 1992). These problems can be particularly acute when the regressors are correlated.
Costs and Benefits
The GI approach promises several important benefits compared with existing methods. These include robustness to data generated by different DGPs, to misspecification of ideal fit profiles (because it makes no explicit assumptions or inferences about such profiles), to mismeasurement of variables, and to small sample sizes. The GI approach also identifies which dimensions of firm performance (if any) are associated with fit, which (if any) are associated with misfit, and even which DGP likely underlies the observed data. The latter is a particularly valuable property because it helps elucidate the underlying structure of fit, aiding interpretation and potentially helping to inform appropriate strategies. However, the GI approach is less well suited to handling very large numbers of contingencies. These costs and benefits need to be carefully weighed in future empirical analyses of fit. Our own view is that, for the reasons enumerated above, the GI approach should be used whenever feasible, for instance when the number of contingencies makes conventional estimation possible (even when degrees of freedom are limited). But in models where there are very large numbers of contingencies, alternative methods such as that of Vorhies and Morgan (2003) should be used—albeit at the cost of not being able to determine the true nature of the DGP underlying the observed fit.

More specifically, our Monte Carlo simulations revealed that the GI method is able to deal with relatively small sample sizes. Clearly, though, there is inevitably a sample size threshold below which estimation ceases to be feasible, depending upon the number of covariates. Then either the number of covariates has to be reduced or another method must be applied. In the first case, the researcher may decide, on the basis of theory, to restrict the number of control variables, main effects (linear and/or nonlinear), and product terms to be included in the GI specification. In the second case, if the sample size is still too small relative to the number of must-be-included covariates, another method must be preferred, such as that of Vorhies and Morgan (2003), Fiss's set-theoretic approach, nonparametric tests, or case studies. If the degrees-of-freedom condition is satisfied, however, the GI method can be flexibly applied within both the deductive and the inductive fit tradition. This, we believe, is one of our method's key benefits.

Here we would like to emphasize an aspect of our approach that we think is essential to further progress in contingency research. That is, we believe that to date too little attention has been paid to the limitations of fit measures that imply specific assumptions as to the nature of the underlying fit-generating process. Contingency researchers currently do not know the implications of assuming a particular theoretical conception of fit when in fact the data are generated by a totally different process. Our paper sheds light on this issue and proposes a method that avoids the problems of imposing an inappropriate theoretical structure on the data. We believe the GI approach is especially useful when researchers have little idea about the underlying nature of the fit-generating process. In this case, it can help to reveal the nature of the underlying process as well as the degree of fit.

A further advantage of our approach rests in the way that the GI method deals with ideal types. Explicit FAD-based approaches in the contingency literature need ideal types of configurations of variables to calculate distance terms or to identify (mis)fits. This implies either that a theory must be in place to specify such ideal types ex ante or that first-stage empirical analyses are required ex post as input in second-stage estimations. Both approaches imply very strong assumptions about the researcher's a priori theoretical knowledge or the quality of the post hoc first-stage empirical analyses.¹⁰ Our GI approach avoids such strong assumptions by implicitly inducing ideal types in a one-shot estimation procedure. Seen this way, the GI specification implies an inductive approach to distil ideal types from the data. This is not to say that the GI approach cannot be combined with deduction; it can. After all, deduction guides the selection of potential contingency variables and the formulation of fit hypotheses, which can subsequently be tested within the nested context of a comprehensive GI performance model. In the empirical analysis, such deductive reasoning does not impose any constraints on the estimation procedure.

Appraisal
This paper developed a general approach to measuring fit in the parametric performance-criterion tradition (Venkatraman 1989). In the large contingency literature, consensus on the way to measure and estimate fit is absent, implying that results across the literature are difficult to compare. Two widely used but different fit approaches are referred to as fit as moderation (FAM) and fit as deviation (FAD). The FAM approach is based on the assumption that contingency effects are reflected in interaction terms, whereas the FAD approach argues that this can only be achieved by distance terms. Our general interaction (GI) approach is agnostic in this respect because the GI specification of fit integrates both the FAM and the symmetric FAD notions as nested cases. Using simulations, we show that the GI approach indeed correctly identifies all FAM and FAD instances of fit, even if the underlying data-generation process is of a FAD nature.
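To see this nesting in its simplest form, consider a single contingency x with an (unknown) ideal level x∗; the symbols φ₀ and φ₁ below are generic placeholders rather than the paper's own notation. A symmetric FAD performance equation expands as

p = φ₀ + φ₁(x − x∗)² + u = (φ₀ + φ₁x∗²) − 2φ₁x∗·x + φ₁·x² + u,

with φ₁ < 0 when performance is maximized at x = x∗. This is simply a GI specification in which all interaction coefficients are zero, and the implied ideal level can be recovered from the estimated linear and quadratic coefficients as x∗ = −(linear coefficient)/(2 × quadratic coefficient).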
More generally, provided that the number of contingencies is not too great, our GI specification can be superior to a series of alternative ways to measure and estimate fit—i.e., fit as distance from an ideal type (e.g., Vorhies and Morgan 2003), fit as a dummy (e.g., Burton et al. 2002), and fit as mediation (e.g., Zajac et al. 2000). That is, the GI approach does capture the
“true” instances of fit, which the other approaches fail to do, whether produced by an underlying FAM or FAD data-generation process. This is the first, and most important, property of the GI approach. Beyond this, the GI approach has four additional desirable properties: (1) it moves far beyond the dominant bivariate approach by including a series of bivariate and higher-order interaction terms; (2) it is associated with test statistics (IC, or incremental contribution statistics) that are broadly insensitive to multicollinearity; (3) it is more robust to mismeasurement of variables; and (4) it can deal with small sample sizes. However, the GI approach comes at the cost of proliferating parameters in models with numerous contingencies. The issue of small sample sizes deserves further work in future research.

Future research might also try to extend the comparative analysis of different fit methods adopted here by paying greater attention to the alternative approaches of complementarity and supermodularity. Recently, using the complementarity framework of Milgrom and Roberts (1995), an interesting series of empirical studies has emerged that tries to empirically estimate complementarities (see, e.g., Cassiman and Veugelers 2006). This suggests that the application of the empirical supermodularity approach in the broader contingency tradition deserves further attention (Porter and Siggelkow 2008). This also holds true for the temporal fit approach of Perez-Nordtvedt et al. (2008). Basically, they argue that we need to test for the effect of temporal (mis)fit on firm performance. To do so, longitudinal research is needed. In our paper, we do not refer to a time dimension at all. Our method, in principle, can deal with longitudinal data and “dynamic” fit measures. A dynamic measure would be, e.g., that the change in strategy from t to t + 1 fits well (or not, of course) with the change in the environment from t to t + 1. It remains to be seen whether there exists an even more general fit approach that can encompass these frameworks as well as those discussed in this paper. We hope to explore this issue in future work.

We believe that the various measures proposed here, together with the full set of estimated coefficients in the underlying performance model, provide a rich source of information from which a full and rounded picture of fit and firm performance can be obtained. Our general framework is flexible in the sense that multivariate interactions can be easily introduced, as can series of bivariate product terms—thus introducing a system-type notion of fit. Moreover, we believe that the IC statistic is superior to the ones used in the extant literature, for the reasons adduced above. Additionally, the multi-interactions across groups jointly indicate configurations of complex bundles of complementarities that reflect ideal types of performance-maximizing sets of organizational features and environmental contingencies. In doing so, we offer a general framework for measuring fit that encompasses the fit-as-moderation or interaction, fit-as-deviation or congruence, and fit-as-bundle or system perspectives. We hope that the simplicity and power of the approach we propose will be of considerable practical use to future organizational scientists.

Acknowledgments
The authors thank the senior editor and three anonymous referees for helpful comments on earlier drafts; the usual disclaimer applies.

Endnotes
1. At the outset, it should be noted that some of the above examples already integrate different approaches to fit. In particular, Zajac et al. (2000) add interaction terms to their residual approach, and Burton et al. (2002, 2003) estimate the impact of both individual and aggregated (mis)fit dummies. However, neither these nor any other approach in the literature is as comprehensive as ours. Additionally, system-type fit studies often employ correlation and variance analyses to reveal patterns of associations (e.g., Doty et al. 1993). In the current paper, however, we focus on parametric regression methods.
2. Our approach can easily be generalized to more than three groups of variables. For instance, the I-group could be split into two sets of variables, one relating to “hard” internal issues of organization structure and another one capturing “soft” internal aspects of organizational culture. For our purposes, though, there is no need to complicate the calculus further. Note, moreover, that our three sets of variables cover the main thrust of the contingency-related literature.
3. See Christensen et al. (1973, 1975) for a proof of this property of the TL specification in the context of production and utility functions.
4. Equation (1) could also be derived using the “Expansion Method” popularized by Emilio Casetti in the 1970s (for a discussion, see, e.g., Casetti 1992).
5. Where α = φ₀ + φ₁x∗² + φ₂y∗² + φ₃z∗², β_I = −2φ₁x∗, γ_I = φ₁, β_E = −2φ₂y∗, γ_E = φ₂, β_S = −2φ₃z∗, and γ_S = φ₃. This specification remains valid and estimable also when x∗, y∗, and z∗ are somehow “known” or estimated externally, because φ₀, φ₁, φ₂, and φ₃ remain unknown parameters. Note that the GI model does not nest all possible FAD specifications. For instance, it does not nest asymmetric specifications that penalize deviations above an optimal level differently from deviations below an optimal level. We leave this issue for future research.
6. Positive coefficients on all interaction terms are consistent with different notions of fit known as “complementarity” and “supermodularity”: see, e.g., Athey and Stern (1998), Mohnen and Röller (2005), Cassiman and Veugelers (2006), and Porter and Siggelkow (2008).
7. In this case and all others considered below, all three IC statistics yield identical conclusions. Also, all of the aggregate fit statistics are highly significant, so issues relating to IC statistic type and aggregation are not discussed any further below.
8. This approach might be criticized as rather limited because it assumes that only a unique v∗ maximizes performance (“unifinality”). See, for example, Fiss (2007), who proposes a more sophisticated set-theoretic approach that can identify multiple v∗s that can all be associated with high performance (“equifinality”). Our approach can deal with equifinality, too.
For example, let a, b, c, and d be four dummy variables. Suppose firm 1 has a = b = 1 and c = d = 0, whereas firm 2 has the opposite: a = b = 0 and c = d = 1. Then, letting p denote performance, the FAM regression model p = ab + cd exhibits equifinality for the two firms. If cases of a = b = c = d = 1 are technically infeasible, then only equifinality is observed.
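A short check of this arithmetic (an illustrative script, not part of the original analysis):

```python
# Endnote 8 example: p = a*b + c*d for two hypothetical firms.
def performance(f):
    return f["a"] * f["b"] + f["c"] * f["d"]

firm1 = {"a": 1, "b": 1, "c": 0, "d": 0}
firm2 = {"a": 0, "b": 0, "c": 1, "d": 1}
print(performance(firm1), performance(firm2))  # 1 1: distinct configurations, equal performance
```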
9. Of course, this example serves an illustrative purpose only: we certainly do not suggest that researchers should increase the number of regressors up to the maximum possible limit.
10. As a prominent researcher on fit recently put it, “[I]deal types thus largely depend on just how the sample is composed, rather than on substantive theory about what an ideal configuration means and what makes it ideal. Furthermore, the obtained results may be quite sensitive to even minor errors in estimating the ‘ideal’ configurations, and the reliability of deviation scores will often be very low because it is the product of the reliabilities of the original variables” (Fiss 2007, p. 1183).
References
Athey, S., S. Stern. 1998. An empirical framework for testing theories about complementarity in organizational design. NBER Working Paper 6600, National Bureau of Economic Research, Cambridge, MA.
Barney, J. B. 1991. Firm resources and sustained competitive advantage. J. Management 17 99–120.
Boone, C., B. De Brabander, A. van Witteloostuijn. 1996. CEO locus of control and small firm performance: An integrative framework and empirical test. J. Management Stud. 33 667–699.
Boone, C., W. van Olffen, A. van Witteloostuijn, B. De Brabander. 2004. The genesis of top management team diversity: Selective turnover among top management teams in Dutch newspaper publishing, 1970–94. Acad. Management J. 47 633–656.
Burton, R. M., J. Lauridsen, B. Obel. 2002. Return on assets loss from situational and contingency misfits. Management Sci. 48 1461–1485.
Burton, R. M., J. Lauridsen, B. Obel. 2003. Erratum: Return on assets loss from situational and contingency misfits. Management Sci. 49 1119.
Cappelli, P., D. Neumark. 2001. Do “high-performance” work practices improve establishment-level outcomes? Indust. Labor Relations Rev. 54 737–775.
Casetti, E. 1992. The dual expansion method: An application for evaluating the effects of population growth on development. J. P. Jones, E. Casetti, eds. Applications of the Expansion Method. Routledge, New York, 8–31.
Cassiman, B., R. Veugelers. 2006. In search of complementarity in innovation strategy: Internal R&D and external knowledge acquisition. Management Sci. 52 68–82.
Child, J. 1972. Organizational structure, environment and performance: The role of strategic choice. Sociology 6 1–22.
Christensen, L. R., D. W. Jorgenson, L. J. Lau. 1973. Transcendental logarithmic production frontiers. Rev. Econom. Statist. 55 28–45.
Christensen, L. R., D. W. Jorgenson, L. J. Lau. 1975. Transcendental logarithmic utility functions. Amer. Econom. Rev. 65 367–383.
Derksen, S., H. J. Keselman. 1992. Backward, forward and stepwise automated subset selection algorithms: Frequency of obtaining authentic and noise variables. British J. Math. Statist. Psych. 45 265–282.
Dierickx, I., K. Cool. 1989. Asset stock accumulation and sustainability of competitive advantage. Management Sci. 35 1504–1511.
Donaldson, L. 2000. Organizational structures for competence-based management. R. Sanchez, A. Heene, eds. Theory Development for Competence-Based Management, Advances in Applied Business Strategy, Vol. 6a. JAI Press, Stanford, CA, 31–56.
Donaldson, L. 2001. The Contingency Theory of Organizations. Sage, Thousand Oaks, CA.
Doty, D. H., W. H. Glick, G. P. Huber. 1993. Fit, equifinality, and organizational effectiveness: A test of two configurational theories. Acad. Management J. 36 1196–1250.
Drazin, R., A. H. Van de Ven. 1985. Alternative forms of fit in contingency theory. Admin. Sci. Quart. 30 514–539.
Fiss, P. C. 2007. A set-theoretic approach to organizational configurations. Acad. Management Rev. 32 1180–1198.
Garg, V. K., B. A. Walters, R. L. Priem. 2003. Chief executive scanning emphases, environmental dynamism, and manufacturing firm performance. Strategic Management J. 24 725–744.
Greene, W. 2003. Econometric Analysis, 5th ed. Prentice-Hall, New York.
Gujarati, D. 1995. Basic Econometrics. McGraw-Hill, New York.
Hyatt, T. A., D. F. Prawitt. 2001. Does congruence between audit structure and auditors’ locus of control affect job performance? Accounting Rev. 76 263–274.
Luo, Y., S. H. Park. 2001. Strategic alignment and performance of market-seeking MNCs in China. Strategic Management J. 22 141–155.
Miles, R. E., C. C. Snow. 1978. Organizational Strategy, Structure and Process. McGraw-Hill, New York.
Milgrom, P., J. Roberts. 1995. Complementarities and fit: Strategy, structure, and organizational change in manufacturing. J. Accounting Econom. 19 179–208.
Mohnen, P., L.-H. Röller. 2005. Complementarities in innovation policy. Eur. Econom. Rev. 49 1431–1450.
Perez-Nordtvedt, L., G. T. Payne, J. C. Short, B. L. Kedia. 2008. An entrainment-based model of temporal organizational fit, misfit, and performance. Organ. Sci. 19 785–801.
Porter, M. E. 1980. Competitive Strategy: Techniques for Analyzing Industries and Competitors. Free Press, New York.
Porter, M. E., N. Siggelkow. 2008. Contextuality within activity systems and sustainability of competitive advantage. Acad. Management Perspect. 22(May) 34–56.
Rivkin, J. W. 2000. Imitation of complex strategies. Management Sci. 46 824–844.
Skaggs, B. C., T. Ross Huffman. 2003. A customer interaction approach to strategy and production complexity in service firms. Acad. Management J. 46 775–786.
Venkatraman, N. 1989. The concept of fit in strategy research: Toward verbal and statistical correspondence. Acad. Management Rev. 14 423–444.
Vorhies, D. W., N. A. Morgan. 2003. A configuration theory assessment of marketing organization fit with business strategy and its relationship with marketing performance. J. Marketing 67 100–115.
Wright, P. M., S. A. Snell. 1998. Toward a unifying framework for exploring fit and flexibility in strategic human resource management. Acad. Management Rev. 23 756–772.
Zajac, E. J., M. S. Kraatz, R. K. F. Bresser. 2000. Modeling the dynamics of strategic fit: A normative approach to strategic change. Strategic Management J. 21 429–453.