Inoua Simple Economic Complexity Version 2021

A Simple Measure of Economic Complexity
Preprint · January 2016

Source: arXiv
Sabiou M. Inoua
Chapman University
All content following this page was uploaded by Sabiou M. Inoua on 04 October 2021.

This version: 2021
Abstract. The conventional view on economic development simplifies a country’s
production to one aggregate variable, GDP. Yet product diversification matters for
economic development, as recent, data-driven, “economic complexity” research
suggests. A country’s product diversity reflects the country’s diversity of productive
knowhow, or “capabilities”. Researchers derive from algorithms (inspired by network
theory) metrics that measure the number of capabilities in an economy, notably the
Economic Complexity Index (ECI), argued to predict economic growth better than
traditional variables such as human capital, and the country Fitness index. This paper
offers an alternative economic complexity measure (founded on information theory)
that derives from a simple model of production as a combinatorial process whereby a
set of capabilities combine with some probability to transform raw materials into a
product. A country’s number of capabilities is given by the logarithm of its product
diversity, as the model predicts; the model also predicts a linear dependence between
log-diversity, ECI, and log-fitness. The model’s predictions fit the empirical data well; its
informational interpretation, we argue, is a natural theoretical framework for the
complexity view on economic development.
Keywords: economic growth, economic development, product diversification,
economic complexity metrics, entropy, information theory
1 Background
1.1 Product Diversification Matters for Economic Development
Contrary to a long tradition in economics according to which international
prosperity is achieved when national economies specialize, product diversity is
strongly correlated with economic development [1-6]. The “richest” countries make
almost all types of products, from the most rudimentary to the most sophisticated
ones; while the “poorest” countries make comparatively fewer and more
rudimentary products (Table 1).
Table 1. The World’s Most and Least Diversified Economies (2018, 4-digit HS).1

Ten Most Diversified Economies (2018)    | Ten Least Diversified Economies (2018)
Country          Diversification  Rank   | Country             Diversification  Rank
United States    1224             1      | Gambia              180              162
China            1221             2      | Maldives            178              163
India            1219             3      | Saint Lucia         169              164
Japan            1211             3      | Equat. Guinea       165              165
Germany          1223             5      | Bhutan              143              166
Russia           1204             6      | Central Afr. Rep.   130              167
Brazil           1196             7      | Chad                118              168
Indonesia        1168             8      | Comoros             105              169
United Kingdom   1221             9      | Guinea-Bissau       29               170
France           1220             10     | Small Islands       [3, 28]          171
At the lowest extreme of the economic complexity spectrum are small-population
islands, which mostly export natural products (naturally occurring goods: fruits,
vegetables, fish).2 Then come economies specialized in highly demanded raw
materials (notably oil); these countries have higher incomes despite their low
product diversity (Figure 1). All in all, about 80% of countries’ GDP ranking can be
explained by the mere product-diversity ranking (Figure 1) if we put aside natural
resources, which, as documented throughout, are the main source of bias in this
purely qualitative view on production. (Islands, which would be expected to appear as
even greater outliers than oil exporters, are not included in Figure 1, due to missing
GDP data.)
1 See data description and source in Subsection 4.1. The number of products a country makes
depends, of course, on the product nomenclature used, notably the SITC (Standard International
Trade Classification) and the HS (Harmonized System), and on its aggregation level (usually 2-, 4-,
5-, or 6-digit product codes). The results from the two product nomenclatures and at different
aggregation levels are very similar; hence we present no systematic comparison of the results based
on product nomenclature.
2 The islands are, e.g., Bouvet (BVT), Netherlands Antilles (ANT), Kiribati (KIR), Northern Mariana
(MNP), Micronesia (FSM), Pitcairn (PCN), South Georgia & Sandwich (SGS), Tuvalu (TUV), Wallis
& Futuna (WLF).
Figure 1. Countries’ GDP versus Diversification Rankings (4-digit HS).
Red countries are exporters with natural-resource rents
(averaged across years) of at least 10% of their GDP.3
These associations are not just correlations and can be explained by a basic
combinatorial model of production.4 The complexity of a country’s production (the
diversity and sophistication of its products) reveals a diversity of productive
knowledge in that economy, which combines to make various products. Qualitatively,
products differ precisely by the amount of knowhow involved in their production:
in theory, the knowhow content of products ranges from zero, for
naturally occurring goods (a natural resource sold in the raw, for example), to a
maximum value when all the available knowhow is involved in the making of
the product (consider an aircraft, for example). A product’s (technological)
sophistication or complexity can be defined by the amount of knowledge its
production requires; and the (technological) complexity of an economy, by the total
amount of knowledge involved in its output. But what precisely is an “amount of
productive knowledge” and how can we measure it?
3 Throughout, the straight lines are least-squares fits.
4 This paper’s combinatorial model of production differs by its simplicity from earlier ones [7-9].
1.2 Productive Knowledge in Conventional Growth Models
In conventional economic theory, a country’s productive knowledge is summarized
by an aggregate production function GDP = F(Capital, Labor), or the aggregate
output (or income) the country can produce from any combination of aggregate
labor and capital, where the function F is homogenous of degree 1, so that income
per capita is a function of the stock of capital per worker: GDP/Labor =
F(Capital/Labor, 1).5 But as Solow’s seminal contribution established [10, 11], capital
and labor accumulation cannot account for much of economic growth, which Solow
explains in terms of exogenous shifts of the production function F through a
multiplicative factor (denoted “A” and later named “Total Factor Productivity”),
whose growth rate is interpreted as measuring technological progress. Much of
later development of standard growth theory consisted of attaching theoretical
substance and identity to the “Solow residual” (in other words, to “endogenize” the
part of economic development not explained by the level of capital per worker), the
consensus being that the residual somehow captures productive knowledge or
“technology”, often identified with human capital (years of schooling), innovation,
or research and development [12-15].
5 Capital is traditionally denoted in economic theory as K, which we reserve for “knowledge”.

1.3 The Complexity View on Economic Development
In contrast to the aggregate production function approach to production is the
above-mentioned, data-driven, “complexity” approach to economic development
inspired by the empirical correlation between economic development and product
diversification. The diversity of a country’s production, as noted above, reflects the
diversity of its capabilities (elemental units of productive knowhow), which combine
to make more and more sophisticated products.
Figure 2. Network Model of Production. Countries make products using
capabilities: (A) The country-capability-product network. (B) The country-product
network. Capabilities are not directly observable: How to count them?
Thus, it should be possible to infer the amount of productive knowledge involved
in an economy from its product diversification data. Researchers framed this
problem in terms of network theory, modeling countries’ productions as a tripartite
network connecting countries to the products they make, products to the
capabilities their production requires, and countries to the capabilities they possess
(Figure 2). Thus, the core problem of the network approach to production was
conceived as one of reconstructing the partly unobservable country-product-
capability network from its empirically observed bipartite country-product
projection [16, 17]. Researchers conceived algorithms to that effect, notably the
Economic Complexity Index (ECI), which, as its authors argue, predicts economic
growth better than traditional variables such as human capital [16, 17]. The ECI is
jointly computed with the Product Complexity Index (PCI) by an algorithm akin to
that which the web search engine Google uses to rank webpages.6 Another
algorithm produces alternative country and product complexity measures [20],
named country Fitness (F) and product Quality (Q). Both algorithms will be
presented shortly. (Section 4 offers a step-by-step derivation of the metrics and the
basic logic underlying them, for the reader not familiar with this literature.)7
The primary data of the network view on production is, formally, the country-product
binary matrix $M = [M_{cp}]$ connecting countries to the products they make:
$M_{cp} = 1$ if country $c$ makes product $p$, and $M_{cp} = 0$ otherwise. This simple
product list data is not available, however; thus, one takes the countries’ export lists as a
proxy for their product lists. (More on the data description in Subsection 4.1.)
Given the matrix $M$, the product diversity of country $c$ (the number of its products)
and the ubiquity of product $p$ (the number of its producers) are respectively:8
\[ D_c = \sum_p M_{cp}, \tag{1} \]
\[ U_p = \sum_c M_{cp}. \tag{2} \]
The complexity metrics are (up to norming) the solutions to the equations:
\[ D_c\,\mathrm{ECI}_c = \sum_p M_{cp}\,\mathrm{PCI}_p, \tag{3} \]
\[ U_p\,\mathrm{PCI}_p = \sum_c M_{cp}\,\mathrm{ECI}_c, \tag{4} \]
\[ F_c = \sum_p M_{cp} Q_p, \tag{5} \]
\[ Q_p = \Big[ \sum_c M_{cp} F_c^{-1} \Big]^{-1}. \tag{6} \]
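As a minimal illustration of definitions (1)-(2), diversity and ubiquity are simply the row and column sums of the binary matrix. The sketch below uses a small made-up matrix (hypothetical, not trade data), and the function names are ours.

```python
from typing import List

Matrix = List[List[int]]


def diversity(M: Matrix) -> List[int]:
    """D_c in (1): number of products made by each country (row sums of M)."""
    return [sum(row) for row in M]


def ubiquity(M: Matrix) -> List[int]:
    """U_p in (2): number of countries making each product (column sums of M)."""
    return [sum(col) for col in zip(*M)]


# Hypothetical 3-country, 4-product matrix, for illustration only.
M_example = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
```

Here the first country makes every product, while the first product is made by every country.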
6 More precisely, the ECI-PCI algorithm is more similar in spirit to an algorithm developed by J.
Kleinberg [18, 19] and used by Ask.com. It is an eigenvector problem, as one can see from (3)-(4).
7 The complexity metrics are analyzed in various studies, some of which offer critiques, alternatives,
or refinements of the metrics, including the one presented here, in an earlier draft [21-27].
8 The natural concept is not ubiquity per se, but its inverse, which can be called product rarity.
2 Results and Discussions
2.1 Measuring an Economy’s Knowhow by Counting Its Products
A product is some transformed natural resources, some raw materials to which is
applied a set of knowhows to turn them into an economically valuable outcome; and
knowledge comes in discrete elementary units, or capabilities, that combine to make
more and more sophisticated knowledge. The results presented in this paper derive
from these definitions and two simple assumptions about the constraints on
knowledge sophistication and raw-material availability:
1. Any $S$ capabilities can be put together to transform raw materials into a valuable
product only with probability $\pi_1^S$ (uniform across countries and products).
2. A country finds the raw materials needed for making a product involving $S$
capabilities only with probability $\pi_2^S$ (uniform across countries and products).
The two assumptions imply that a product tends to appear in a country’s product
list with a probability that decays exponentially with the product’s sophistication,
namely $\pi^S$, where $\pi = \pi_1\pi_2$ is the joint probability.
Thus, a product’s sophistication can be measured by its log-likelihood of appearing:
\[ S = \frac{\log \operatorname{prob}(\text{product})}{\log(\pi)}. \tag{7} \]
Moreover, one can show (Subsection 4.2) that the total number of capabilities in an
economy that makes $D$ products is given by the country’s log-product-diversity:9
\[ K = \frac{\log(D)}{\log(1+\pi)}. \tag{8} \]
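A short numerical sketch of (7) and (8) may help fix ideas. The parameter name `pi` for the model's joint probability and the function names are our own labels, and the inverse relation between diversity and capabilities used in `product_diversity` is the one the model derives in Subsection 4.4.

```python
import math


def product_diversity(K: float, pi: float) -> float:
    """Number of products of a country with K capabilities: (1 + pi)**K."""
    return (1.0 + pi) ** K


def capabilities(D: float, pi: float) -> float:
    """Equation (8): K = log(D) / log(1 + pi)."""
    return math.log(D) / math.log(1.0 + pi)


def sophistication(prob_product: float, pi: float) -> float:
    """Equation (7): S = log(prob(product)) / log(pi)."""
    return math.log(prob_product) / math.log(pi)
```

With `pi = 1` (no constraint at all), `capabilities(1024, 1.0)` gives 10, the unconstrained case in which a country with K capabilities makes 2^K products.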
The model predicts the following relationships between knowhow, fitness, and ECI
that fit the data well up to the bias related to natural products (Figure 3-Figure 4):10
\[ \mathrm{ECI} = \frac{\log D - \operatorname{mean}(\log D)}{\operatorname{std}(\log D)}. \tag{9} \]
\[ \frac{\log F}{\operatorname{mean}(\log F)} = \frac{\log D}{\operatorname{mean}(\log D)}. \tag{10} \]
9 The derivation is straightforward if we assume away the model’s two constraints (1)-(2); then a
country possessing K capabilities makes $D = 2^K$ products, whose sophistications range from 0 (for
unprocessed natural resources sold) to K. Thus, K is given by logD (up to a scaling constant).
10 Notation: mean(X) denotes a cross-country average of X (average X across all countries); std(X)
means the cross-country standard deviation of X; later, we will use mean(X|c) to mean the average of
X in a country c; and in equations (16)-(17), we write mean(X|K) for the average of X in a country with
K capabilities. The metrics are systematically compared in standardized form (namely their z-scores)
in the figures below, unless otherwise indicated by the scale of the plot.
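Prediction (9) says that the ECI should coincide with the z-score of log-diversity. A minimal sketch of that computation follows; the use of the population standard deviation is our assumption (the text does not say which convention std denotes), and the function name is ours.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence


def predicted_eci(diversities: Sequence[float]) -> List[float]:
    """Model-predicted ECI, equation (9): standardized log-diversity.

    Assumes strictly positive diversities that are not all equal.
    """
    logs = [math.log(d) for d in diversities]
    mu = fmean(logs)
    sigma = pstdev(logs)  # population standard deviation (an assumption)
    return [(x - mu) / sigma for x in logs]
```

The result can then be compared with the ECI computed from the eigenvector algorithm, as in Figures 3 and 4.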
Figure 3. The Three Country Complexity Measures Related: Data
versus Model. (Top panel: 4-digit SITC; bottom panel: 2-digit HS.)
Figure 4. Log-Diversity vs ECI (top) vs Log-Fitness (bottom). (2-digit HS)
More specifically (Subsection 4.5), the model predicts the following relationships
between knowhow, diversity, complexity, sophistication, fitness, and quality,
assuming the two algorithms (3)-(6) measure accurately these variables:
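\[ D = (1+\pi)^K, \tag{11} \]
\[ F = 2^K F(0), \tag{12} \]
\[ Q = \pi^{-S} Q(0), \tag{13} \]
\[ \mathrm{ECI} = \frac{K - \operatorname{mean}(K)}{\operatorname{std}(K)}, \tag{14} \]
\[ \mathrm{PCI} = \frac{S - \operatorname{mean}(S)}{\operatorname{std}(S)}, \tag{15} \]
\[ \operatorname{mean}(S \mid K) = \Big(\frac{\pi}{1+\pi}\Big) K, \tag{16} \]
\[ \operatorname{mean}(Q \mid K) \propto \Big(\frac{2}{1+\pi}\Big)^K, \tag{17} \]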
where F(0) and Q(0) are merely normalizing constants, and the notation mean(X|K)
stands for the (conditional) average of X in a country with K capabilities. The last
two predictions (16)-(17) are particularly nontrivial and fit equally well the empirical
data (Figure 5). They offer a simple criterion for assessing the accuracy of each
complexity algorithm.11 The F-Q metrics are better fit by the model because this
algorithm better deals with the bias related to natural products, whose rarity is due
to natural reasons and not to knowhow, and which are mostly exported by island
countries, the least complex economies: it is indeed an essential aspect (and
motivation) of the F-Q algorithm to emphasize the lowest-complexity countries in
the estimation of Qp , namely (6). (See Subsection 4.3.) In contrast, island countries
are major outliers of the ECI versus log-diversity theoretical fit [Figure 3, bottom
panel, (a)].
11 In an earlier draft (arxiv.org, 2016) we suggested that it takes more than a regression between logD
and logF to assess the accuracy of the F-Q algorithm: because fitness is the average product
quality multiplied by diversity, a strong correlation between logF and logD might be fortuitous
(it holds even with random data, as Figure 5, bottom panel, confirms). The nontrivial prediction (17)
is the right criterion in this respect.
Figure 5. Countries’ Productive Knowhow (Log-Diversity) Can be Measured
by their Average PCI or by their Log Average Product Quality. (4-digit HS.)
The model’s two parameters (the probabilities $\pi_1$ and $\pi_2$ of assumptions 1 and 2)
come down effectively to one, the joint probability
\[ \pi = \pi_1 \pi_2. \tag{18} \]
Once this probability is known, all variables are determined (including the scales or
norming constants). This parameter is simply related to the slope between log-diversity
and log-fitness, by virtue of the predictions (11) and (12), which imply
\[ \log D = \frac{\log(1+\pi)}{\log 2}\,\log F - \frac{\log(1+\pi)}{\log 2}\,\log F(0). \tag{19} \]
Thus, we can estimate the model’s key probability parameter and the norming
constant F(0) through linear regression, which yields:12
\[ \pi = 2^{\operatorname{cov}(\log D,\,\log F)/\operatorname{var}(\log F)} - 1, \tag{20} \]
\[ F(0) = \exp\{ \operatorname{mean}(\log F) - [\log(1+\pi)]^{-1} \operatorname{mean}(\log D) \log 2 \}. \tag{21} \]
The number of capabilities in each country is then estimated through either one of
the three economic complexity measures:
\[ K = \frac{\log D}{\log(1+\pi)} = \frac{\log[F/F(0)]}{\log 2} = \operatorname{mean}\Big[\frac{\log D}{\log(1+\pi)}\Big] + \operatorname{std}\Big[\frac{\log D}{\log(1+\pi)}\Big]\,\mathrm{ECI}. \tag{22} \]
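The estimation step (20)-(21) amounts to an ordinary least-squares regression of log-diversity on log-fitness. The following is a minimal sketch under that reading; the function names and the symbol `pi` for the joint probability are our labels, and the inputs are assumed to be strictly positive with non-constant fitness.

```python
import math
from statistics import fmean
from typing import List, Sequence, Tuple


def estimate_pi_and_f0(D: Sequence[float], F: Sequence[float]) -> Tuple[float, float]:
    """Estimate the joint probability and F(0) as in equations (20)-(21)."""
    x = [math.log(f) for f in F]  # log-fitness (regressor)
    y = [math.log(d) for d in D]  # log-diversity (regressand)
    mx, my = fmean(x), fmean(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    var = fmean([(a - mx) ** 2 for a in x])
    slope = cov / var                      # equals log(1 + pi) / log 2 in the model
    pi = 2.0 ** slope - 1.0                # equation (20)
    f0 = math.exp(mx - my * math.log(2.0) / math.log(1.0 + pi))  # equation (21)
    return pi, f0


def capabilities_from_fitness(F: Sequence[float], f0: float) -> List[float]:
    """Second expression in equation (22): K = log(F / F(0)) / log 2."""
    return [math.log(f / f0) / math.log(2.0) for f in F]
```

On data generated exactly by (11)-(12), the regression recovers the parameters used to generate them; on real data the residual mainly reflects the raw-product bias discussed in the text.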
The probability $\pi$ and the spread of the distribution of $K$ (notably the maximum
knowhow $K_{\max}$) depend on the product nomenclature (Figure 6).
12 If one can manage to find F(0) explicitly in terms of the norming constants involved in the F-Q
algorithm (Subsection 4.3), then one can sharpen the regression equation by regressing logD on
log(F/F(0)), and split the error (or residual) term into a mean term, which would measure the bias due
to raw products, and a pure noise term.
Figure 6. World Distribution of Country Knowhow. [HS: 2-digit (Top)
versus 4-digit product (Bottom) categories.]
2.2 Measuring a Product’s Sophistication by Harmonically Counting Its Producers
The model’s two assumptions also amount effectively to one: the probability that a
country $c$ makes a product $p$, which we write simply $\operatorname{prob}(p \mid c)$, decays exponentially
with the product’s sophistication $S_p$. That is:
\[ \operatorname{prob}(p \mid c) = \frac{M_{cp}\,\pi^{S_p}}{D_c}. \tag{23} \]
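The (unconditional) probability of finding a product $p$ with sophistication $S_p$ in the
world economy (all $C$ countries combined) is (by the law of total probability):
\[ \operatorname{prob}(p) = \sum_{c=1}^{C} \operatorname{prob}(p \mid c)\operatorname{prob}(c) = \sum_{c=1}^{C} \operatorname{prob}(p \mid c)\,\frac{1}{C}. \]
Given (23), we get
\[ \operatorname{prob}(p) = \pi^{S_p}\,\frac{1}{C}\sum_{c=1}^{C} \frac{M_{cp}}{D_c}. \tag{24} \]
After rearranging the terms, we get:
\[ \pi^{-S_p} \propto \frac{1}{\dfrac{1}{C}\displaystyle\sum_{c=1}^{C} \dfrac{M_{cp}}{D_c}}. \tag{25} \]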
Thus, the model predicts that product sophistication is (up to norming) given by the
formula:
\[ S_p = \frac{1}{\log\frac{1}{\pi}}\,\log\Big[ \operatorname{mean}_H\{M_{cp} D_c\}\,\frac{C}{U_p} \Big], \tag{26} \]
where $\operatorname{mean}_H$ stands for the (cross-country) harmonic mean, taken over the producers
of $p$. By the same token, we have also established the following correspondence (up to norming):
\[ (F, Q) \leftrightarrow (D, \pi^{-S}). \tag{27} \]
That is, if we replace fitness by diversity in the F-Q algorithm, then the latter yields
$Q_p = Q(0)\,\pi^{-S_p}$. The three product complexity measures (S, PCI, and LogQ) are
strongly associated (Figure 7).
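Formula (26) can be evaluated directly from the binary matrix, since the harmonic mean of the producers' diversities times C/U_p equals C divided by the sum over countries of M_cp/D_c. The sketch below takes this route; the function name and the parameter name `pi` (the model's joint probability) are ours, and every country is assumed to make at least one product and every product to have at least one producer.

```python
import math
from typing import List


def product_sophistication(M: List[List[int]], pi: float) -> List[float]:
    """Sophistication S_p of each product, following equation (26).

    Uses mean_H{D_c over producers} * C / U_p == C / sum_c(M_cp / D_c).
    """
    C = len(M)
    D = [sum(row) for row in M]  # country diversities
    S = []
    for p in range(len(M[0])):
        inv_sum = sum(M[c][p] / D[c] for c in range(C))
        S.append(math.log(C / inv_sum) / math.log(1.0 / pi))
    return S
```

On a nested matrix, where each country makes a subset of what the next more diversified one makes, the rarest product (made only by the most diversified country) gets the highest sophistication.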
Figure 7. The three product sophistication measures related. (Top panel: 2-digit
HS; bottom panel: 4-digit HS: outlier products are mostly raw products.)
2.3 Diversity and Complexity are Orthogonal, but Not Independent
This subsection is an interlude suggested by a growing interpretation concerning the
orthogonality between complexity and diversity, a misapprehension that seems to
call into question the very dependence between these concepts (and hence
everything we said so far). Diversity and complexity are related notions intuitively.
The model predicts that product diversity is an exponential function of economic
complexity. One can show mathematically that the dependence cannot be a linear
one anyway, at least if complexity is measured by the ECI, or more precisely by its
non-standardized version (an eigenvector associated with the country-product
network: Subsection 4.2), which we denote $k_2$ below. This follows from a basic yet
elegant mathematical result [23], establishing orthogonality of $k_2$ and the product
diversity vector, which we denote $d = [D_c]$. By symmetry one can similarly establish
orthogonality between product ubiquity $u = [U_p]$ and the non-standardized PCI,
which we denote $s_2$.
Contrary to a spreading interpretation [23, 24, 27], however, orthogonality between
two vectors, it should be recalled, merely implies that the two vectors are not linearly
dependent (where linearity is to be taken in the strict mathematical sense, which
excludes affine dependence, or inclusion of an intercept). Orthogonality of
complexity and diversity is not incompatible with positive dependence between the
two vectors: to the contrary, the orthogonality combined with the positive
dependence merely puts a constraint on average complexity. Thus $k_2 \cdot d = 0$ and
$s_2 \cdot u = 0$ combined with $\operatorname{cov}(k_2, d) > 0$ and $\operatorname{cov}(s_2, u) < 0$ (which by now should be
taken as well-established both empirically and theoretically) simply imply that
\[ \operatorname{mean}(k_2) < 0, \tag{28} \]
\[ \operatorname{mean}(s_2) > 0. \tag{29} \]
Since the orthogonality is true mathematically, the sign conditions (28)-(29) simply
reflect the dependence between complexity and diversity, which is strong as we
already know, and which the sign conditions confirm (Figure 8).
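The step from orthogonality to a sign condition can be spelled out in one line. As a sketch, write $C$ for the number of countries and take the covariance and the mean as cross-country (population) moments, a normalization that is our assumption and does not affect signs:

```latex
\[
\operatorname{cov}(k_2, d)
  = \frac{1}{C}\, k_2 \cdot d \;-\; \operatorname{mean}(k_2)\operatorname{mean}(d)
  = -\,\operatorname{mean}(k_2)\operatorname{mean}(d)
  \qquad \text{whenever } k_2 \cdot d = 0 .
\]
```

Since every diversity is positive, $\operatorname{mean}(d) > 0$, so the covariance and $\operatorname{mean}(k_2)$ necessarily have opposite signs; the same argument applies to $s_2$ and $u$, with the number of products in place of $C$.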
Figure 8. Distribution of the (non-standardized) ECI and PCI. Top:
Country Complexity. Bottom: Product Complexity. (2-digit HS).
3 Conclusion: Towards an Information Theory of Economic Growth
The results, in the final analysis, suggest that both the empirical data and the
algorithms of the economic complexity literature can be simply rationalized by a
combinatorial model of production in which bits of knowhow (or capabilities)
combine, with some probability $\pi$, to make more and more sophisticated knowhow.
For simplicity we treated the model’s (effectively unique) parameter $\pi$ as uniform
across countries and across products; it is more accurate, however, to assume some
cross-country and cross-product variability of $\pi$ to account for the bias due to raw
products (or natural resources).
Fundamentally, the model rests entirely on the assumption that knowledge comes in
discrete units and that it expands combinatorially. This suggests a simple
informational interpretation of the model that seems to be the natural language for
the complexity view on economic development more generally. For our limited
purpose here, information theory can be summarized by an informational
interpretation of Boltzmann’s famous entropy formula:13
Information Content of System = Log (Effective Number of Basic Information States).
That is, the amount of information conveyed by an information source (a system to
whose states meaning can be associated) is measured by the logarithm of the system’s
effective number of basic states (those states that can carry information). A basic
information state can be the realization of an event, for example: the information
conveyed by the realization of an event is then measured by the log-inverse-probability
of the event, and the total information conveyed by the whole system (here a
probability space) is obtained by averaging the information contents conveyed by
the basic events (Shannon’s entropy formula); if the events are equally likely, then
the overall information content is measured by the logarithm of the total number of
possible events. Indeed, Shannon’s entropy formula [29] can be viewed as a special
case of the general information formula: in this sense, Shannon’s formula defines the
effective number of information states to be the exponential of Shannon’s entropy.
13 Information theory becomes more intuitive (compared to its usual formulation based on probability
as a primitive concept) if the combinatorial foundation of information is more explicitly emphasized,
or even taken as the primitive concept, as Kolmogorov seems to suggest [28]. If indeed the basic
nature of information is that it comes in discrete units and that it expands combinatorially (or
exponentially, in the simplest case), then it is natural to measure the amount of information of an
information system by the logarithm of its effective number of information states.
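The notion of an effective number of states can be illustrated numerically. The sketch below (function names ours, natural logarithms assumed) computes Shannon's entropy of a probability distribution and its exponential.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Average log-inverse-probability of the events (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def effective_number_of_states(probs: Sequence[float]) -> float:
    """Exponential of Shannon's entropy."""
    return math.exp(shannon_entropy(probs))
```

For equally likely events the effective number equals the actual number of events, and it is strictly smaller otherwise, which is why raw counts such as diversity and ubiquity are only rough measures.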
Think of a country as an information source revealing information about the
knowledge content of the products it makes; and think of the country’s products as
events or messages that reveal information about the country’s (unobserved)
productive knowledge endowment. Thus, the information (or knowledge) content of
a country’s output is measured by the country’s log-product-diversity. Similarly,
think of the set of all countries potentially producing a product as the information
source; then each producer reveals partial information about the knowledge content
(or sophistication) of the product: the product’s knowledge content is then obtained
as an average of the partial information revealed by the producers, and is roughly
measured by the product’s log-frequency (or log-ubiquity) among countries.
However, this is only a rough measure that implicitly assumes uniform probability
of basic states or events: hence the need for a generalized (or effective) diversity and
ubiquity measures, theoretically given by the model’s predictions (8) and (26). The
complexity algorithms, on the other hand, can be viewed as an empirical way of
correcting diversity and ubiquity mutually, since a country’s knowhow is reflected
in the products it makes, and vice versa.
A systematic analysis of the economic implications of the information theory of
economic development is beyond this paper’s scope, since we choose to center the
discussion entirely on the purely qualitative dimension of production (where the
question is whether a country can make a product or not): more generally, a country
can be considered to be rich either because of its productive knowhow (as reflected
in its product diversity) or because of the intensity of its production (or the average amount
of output the country is able to sell: the quantitative aspect of production determined
by shorter-term factors such as demand).14
14 A discussion of the economic implications of the model is postponed to a follow-up work, which
contains a development accounting in terms of the two dimensions of production (diversity versus
intensity of output), sketched in an earlier draft (arxiv.org, 2016) and since expanded and refined.
4 Method: Data and Model
4.1 The Data
In principle, the complexity view on growth requires very simple data (for each
country, the list of products it makes), which are not yet available, however; hence
one takes as proxy for countries’ product lists, their export lists. While there will
inevitably be some error in centering the analysis on export data (for lack of detailed
data on production), the bias has proved minor a posteriori, given the accuracy of
the results (apparently, a country’s export mix is representative of its total output’s
composition). The results presented throughout this paper are based on the proxy
matrix:
\[ M_{cp} = \begin{cases} 1 & \text{if } X_{cp} > 0, \\ 0 & \text{if } X_{cp} = 0, \end{cases} \tag{30} \]
where $X_{cp}$ is the amount country $c$ exported in product $p$, using the Comtrade data
in HS (revision 2007), available for the years 1995-2018 [30].15 We also use for
comparison the Comtrade data in SITC (revision 2) as compiled and corrected for
mistakes by Feenstra et al. and available for the years 1962-2000 [31].
Unlike in this paper, the standard practice in the economic complexity literature is to
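define the $M_{cp}$ matrix more restrictively as
\[ M_{cp} = \begin{cases} 1, & \mathrm{RCA}_{cp} \ge 1, \\ 0, & \mathrm{RCA}_{cp} < 1, \end{cases} \tag{31} \]
where $\mathrm{RCA}_{cp}$ is the revealed comparative advantage of a country $c$ in product $p$ and
is defined as
\[ \mathrm{RCA}_{cp} = \Big( X_{cp} \Big/ \sum_p X_{cp} \Big) \Big/ \Big( \sum_c X_{cp} \Big/ \sum_{c,p} X_{cp} \Big). \]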
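The two definitions of the country-product matrix, (30) and (31), can be compared on a toy export table. This is a minimal sketch with hypothetical function names; it assumes every country exports something and every product is exported by some country, and it takes the RCA threshold to be "at least 1".

```python
from typing import List

Table = List[List[float]]


def presence_matrix(X: Table) -> List[List[int]]:
    """Equation (30): M_cp = 1 if country c exports any amount of product p."""
    return [[1 if x > 0 else 0 for x in row] for row in X]


def rca_matrix(X: Table) -> Table:
    """Revealed comparative advantage: country share of p over world share of p."""
    row_tot = [sum(row) for row in X]        # total exports of each country
    col_tot = [sum(col) for col in zip(*X)]  # world exports of each product
    total = sum(row_tot)                     # world exports
    return [[(X[c][p] / row_tot[c]) / (col_tot[p] / total)
             for p in range(len(col_tot))] for c in range(len(X))]


def rca_presence_matrix(X: Table, threshold: float = 1.0) -> List[List[int]]:
    """Equation (31): M_cp = 1 if RCA_cp >= threshold (threshold 1 assumed)."""
    return [[1 if r >= threshold else 0 for r in row] for row in rca_matrix(X)]
```

The RCA-based matrix is never less restrictive than (30): a country can export a product without having a comparative advantage in it.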
15 The trade data are accessible through the Atlas of Economic Complexity Dataverse (Harvard
University): https://dataverse.harvard.edu/dataverse/atlas. The income data are countries’ GDP in
PPP (purchasing power parity) from the Penn World Table (PWT8); we use the RGDPO variable (an
output-oriented GDP estimate), though the other measures give very similar results. The PWT is
accessible through the GGDC (Groningen Growth and Development Centre, University of
Groningen): https://www.rug.nl/ggdc/productivity/pwt/.
4.2 The ECI-PCI Algorithm
The ECI-PCI algorithm [16, 17] assumes that an economy’s knowhow is proportional
to the average knowledge content of its products, and, vice versa, a product’s
knowledge content is proportional to the average knowhow of its producers. Thus, if
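$K_c$ measures the amount of knowhow in country $c$, and $S_p$, the knowledge content of
product $p$, then
\[ K_c = \alpha \sum_p W_{cp} S_p, \tag{32} \]
\[ S_p = \beta \sum_c W^*_{pc} K_c, \tag{33} \]
where $\alpha$ and $\beta$ are positive normalizing constants, and the weights are
\[ W_{cp} = \frac{M_{cp}}{\sum_{p'} M_{cp'}}, \tag{34} \]
\[ W^*_{pc} = \frac{M_{pc}}{\sum_{c'} M_{pc'}}, \tag{35} \]
with $M_{pc} = M_{cp}$ denoting the transposed entry.
Collecting the variables and weights into the vectors and matrices $k = [K_c]$, $s = [S_p]$,
$W = [W_{cp}]$, and $W^* = [W^*_{pc}]$, (32) and (33) become $k = \alpha W s$ and $s = \beta W^* k$. So we get
\[ (W W^*)\,k = (\alpha\beta)^{-1} k, \tag{36} \]
\[ (W^* W)\,s = (\alpha\beta)^{-1} s. \tag{37} \]
That is, the complexities of countries and products are given by eigenvectors of the
matrices $WW^*$ and $W^*W$, respectively, where the associated eigenvalue is $(\alpha\beta)^{-1}$.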
Because the averaging weights sum to 1, it is easy to see that any (positive) uniform
vectors $k = [K, \dots, K]^T$ and $s = [S, \dots, S]^T$ are solutions to this eigenvector problem; these
are the eigenvectors associated with the largest eigenvalue, which is 1 (by a known
linear algebra result, the Perron-Frobenius theorem). Thus, the authors of this
algorithm choose the eigenvectors associated with the second largest eigenvalue. Let
$k_2$ and $s_2$ be these eigenvectors: then ECI and PCI are (up to the sign) the elements of
the chosen eigenvectors given in standardized form:
\[ \mathrm{ECI} = \operatorname{sign}[\operatorname{corr}(k_2, d)]\,\frac{k_2 - \operatorname{mean}(k_2)}{\operatorname{std}(k_2)}, \tag{38} \]
\[ \mathrm{PCI} = \operatorname{sign}[\operatorname{corr}(s_2, u)]\,\frac{s_2 - \operatorname{mean}(s_2)}{\operatorname{std}(s_2)}. \tag{39} \]
We multiply by the sign of the correlation of the eigenvectors with the country
diversification vector $d$ and the product ubiquity vector $u$, respectively, to ensure the
signs are correct; this is simply because, the sense of an eigenvector being arbitrary,
the standardization specifies the metrics only up to the sign: for example, any chosen
eigenvector $k$ is equivalent to any nonzero multiple $\lambda k$, so that
\[ \frac{\lambda k - \operatorname{mean}(\lambda k)}{\operatorname{std}(\lambda k)} = \frac{\lambda}{|\lambda|}\,\frac{k - \operatorname{mean}(k)}{\operatorname{std}(k)}. \tag{40} \]
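For readers who wish to experiment, the eigenvector computation can be sketched in a few lines without any linear-algebra library. The code below is one possible implementation, not the one used for the paper's figures: it builds the matrix WW*, runs a power iteration, and removes the uniform (largest-eigenvalue) component at each step by using the fact that the diversity vector d is a left eigenvector of WW* for the eigenvalue 1. The function name, the iteration count, and the use of the population standard deviation are our assumptions; diversities are assumed positive and not all equal.

```python
from statistics import fmean, pstdev
from typing import List, Tuple


def eci_from_matrix(M: List[List[int]], n_iter: int = 500) -> Tuple[List[float], List[float]]:
    """Return (ECI, k2): the standardized index (38) and the raw eigenvector."""
    C, P = len(M), len(M[0])
    D = [sum(row) for row in M]        # diversities
    U = [sum(col) for col in zip(*M)]  # ubiquities
    # A = W W*, i.e. A[c][c2] = (1 / D_c) * sum_p M_cp * M_c2p / U_p
    A = [[sum(M[c][p] * M[c2][p] / U[p] for p in range(P)) / D[c]
          for c2 in range(C)] for c in range(C)]
    total_d = sum(D)

    def deflate(k: List[float]) -> List[float]:
        # Remove the uniform component so that d . k = 0 (orthogonality to d).
        shift = sum(d * x for d, x in zip(D, k)) / total_d
        return [x - shift for x in k]

    k = deflate([float(d) for d in D])
    for _ in range(n_iter):
        k = deflate([sum(a * x for a, x in zip(row, k)) for row in A])
        scale = max(abs(x) for x in k)
        k = [x / scale for x in k]     # rescale to avoid underflow

    mean_k, mean_d = fmean(k), fmean(D)
    cov = sum((x - mean_k) * (d - mean_d) for x, d in zip(k, D))
    sign = 1.0 if cov >= 0 else -1.0   # sign of corr(k2, d), as in (38)
    std_k = pstdev(k)
    return [sign * (x - mean_k) / std_k for x in k], k
```

On a perfectly nested 4-by-4 matrix the resulting index decreases with diversity, the raw eigenvector is orthogonal to the diversity vector, and its mean is negative.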
4.3 The Fitness-Quality Algorithm
In essence, this algorithm [20] measures the complexity of an economy by the total
complexity of its products; and the complexity of a product, by the product’s inverse
ubiquity, multiplied by the harmonic mean of the complexities of the producers.
That is, the two metrics are jointly computed recursively as follows:
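\[ F_c^{(n+1)} = \frac{1}{\operatorname{mean}(Q_p^{(n)})} \sum_p M_{cp}\, Q_p^{(n)}, \tag{41} \]
\[ Q_p^{(n+1)} = \frac{1}{\operatorname{mean}(F_c^{(n)})}\, \frac{1}{\sum_c M_{cp}\, \dfrac{1}{F_c^{(n)}}}. \tag{42} \]
The means are averages across all products and all countries, respectively, and the
initial conditions are unit complexities for all countries and all products. The
algorithm converges to a fixed point $(F^{(\infty)}, Q^{(\infty)})$, which, in normalized form, defines the
country Fitness and product Quality indices:
\[ F = \frac{F^{(\infty)}}{\operatorname{mean}(F^{(\infty)})}, \tag{43} \]
\[ Q = \frac{Q^{(\infty)}}{\operatorname{mean}(Q^{(\infty)})}. \tag{44} \]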
The crucial novelty of this algorithm is the following ingenious observation: if a
low-complexity country is among the producers of a product, this product is necessarily
a low-sophistication product; but knowing that a highly complex economy is among
the producers of a product barely reveals any information about the product’s
complexity (since such a country makes almost all product types). Thus, highly
complex economies should be discounted in the measure of product complexity,
which should be dominated by the more informative, lowest-complexity producer: this is
precisely what the harmonic mean does, as its known bounds show:16
\[ \min_{c:\,M_{cp}=1} F_c \;\le\; \operatorname{mean}_H(M_{cp} F_c) \;\le\; U_p \min_{c:\,M_{cp}=1} F_c. \]
We know from the theoretical model why the harmonic mean is the natural choice.
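The recursion (41)-(42) with the final normalization (43)-(44) can be sketched as follows. This is an illustrative implementation only: it runs a fixed number of iterations (`n_iter`, our parameter) instead of testing for convergence, and it assumes every country makes at least one product and every product has at least one producer.

```python
from statistics import fmean
from typing import List, Tuple


def fitness_quality(M: List[List[int]], n_iter: int = 50) -> Tuple[List[float], List[float]]:
    """Iterate (41)-(42) from unit complexities, then normalize as in (43)-(44)."""
    C, P = len(M), len(M[0])
    F = [1.0] * C
    Q = [1.0] * P
    for _ in range(n_iter):
        mean_F, mean_Q = fmean(F), fmean(Q)
        # (41): fitness is the total quality of the country's products.
        F_new = [sum(M[c][p] * Q[p] for p in range(P)) / mean_Q for c in range(C)]
        # (42): quality is the inverse of the summed inverse fitness of producers.
        Q_new = [1.0 / (mean_F * sum(M[c][p] / F[c] for c in range(C)))
                 for p in range(P)]
        F, Q = F_new, Q_new
    mean_F, mean_Q = fmean(F), fmean(Q)
    return [f / mean_F for f in F], [q / mean_Q for q in Q]
```

On a nested matrix, fitness decreases with diversity and quality increases with rarity, because the sum in (42) is dominated by the least fit producer of each product.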
4.4 The Model
For short, we refer to an S-sophisticated knowhow, an S-sophisticated product, and
a K-sophisticated economy respectively as S-knowhow, S-product, and K-country.
We combine the model’s two assumptions into one:
Assumption: Any random combination of $S$ knowhows among a country’s knowhow list
corresponds to a product with probability $\pi^S$, where $\pi$ is the joint probability (18).
That is, a K-country can explore up to $\binom{K}{S}$ possible S-collections of skillsets, among
which only a proportion given by $\pi^S$ are coherent productive skillsets. Thus, a K-country
makes $\binom{K}{S}\pi^S$ S-products, and, in total, it makes a number of products:
\[ D = \sum_{S=0}^{K} \binom{K}{S} \pi^S = (1+\pi)^K. \tag{45} \]
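The closed form in (45) is the binomial theorem, which is easy to confirm numerically. In the sketch below the function names are ours and `pi` stands for the model's joint probability.

```python
from math import comb
from typing import List


def products_by_sophistication(K: int, pi: float) -> List[float]:
    """Expected number of S-products for S = 0..K: C(K, S) * pi**S."""
    return [comb(K, S) * pi ** S for S in range(K + 1)]


def total_products(K: int, pi: float) -> float:
    """Left-hand side of (45): sum over S of C(K, S) * pi**S."""
    return sum(products_by_sophistication(K, pi))
```

For instance, with `pi = 1` a country with 5 capabilities makes 2^5 = 32 products, one per subset of its capabilities.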
4.5 Model’s Predictions about the Complexity Algorithms
4.5.1 Model’s Prediction about ECI
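As usual we index an empirical country and product by $c$ and $p$, and we index the
theoretical counterparts by $K$ and $S$, respectively, and refer to them as K-country and
S-product. A K-country makes $(1+\pi)^K$ products among which $\binom{K}{S}\pi^S$ are S-products.
Thus, the distribution of product sophistication in a K-country is
\[ \operatorname{prob}(S \mid K) = \frac{\binom{K}{S}\pi^S}{(1+\pi)^K}, \qquad S = 0, \dots, K. \tag{46} \]
16 See https://en.wikipedia.org/wiki/Harmonic_mean.
The average product sophistication in a K-country is:
\[
\begin{aligned}
\operatorname{mean}(S \mid K) &= \sum_{S=0}^{K} S \operatorname{prob}(S \mid K) && \text{(by definition)} \\
&= D^{-1} \sum_{S=1}^{K} S \binom{K}{S} \pi^S \\
&= D^{-1} \sum_{S=1}^{K} S\, \frac{K}{S} \binom{K-1}{S-1} \pi^S && \text{(by a known identity)} \\
&= D^{-1} K \pi \sum_{S=1}^{K} \binom{K-1}{S-1} \pi^{S-1} \\
&= D^{-1} K \pi \sum_{N=0}^{K-1} \binom{K-1}{N} \pi^{N} && \text{(by setting } N = S-1\text{)} \\
&= D^{-1} K \pi (1+\pi)^{K-1}.
\end{aligned}
\]
That is,
\[ \operatorname{mean}(S \mid K) = \frac{\pi}{1+\pi}\, K. \tag{47} \]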
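The result (47) can be checked by brute force, summing over the distribution (46) directly. The function names below are ours and `pi` again denotes the model's joint probability.

```python
from math import comb
from typing import List


def sophistication_distribution(K: int, pi: float) -> List[float]:
    """Equation (46): prob(S | K) for S = 0..K."""
    D = (1.0 + pi) ** K
    return [comb(K, S) * pi ** S / D for S in range(K + 1)]


def mean_sophistication(K: int, pi: float) -> float:
    """Average sophistication in a K-country, computed from the definition."""
    return sum(S * p for S, p in enumerate(sophistication_distribution(K, pi)))
```

The direct sum agrees with the closed form pi*K/(1+pi), which is just the mean of a binomial distribution with K trials and success probability pi/(1+pi).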
This theoretical result justifies the measurement of a country’s output complexity by
its average product complexity (up to scaling): it explains why ECI works as a
measure of knowhow. We can check the extent to which the ECI-PCI algorithm does
effectively estimate a country’s knowhow as follows. Let a country’s estimated
complexity, as measured by the ECI-PCI algorithm, be written (up to norming) as
\[ K_c^{(2)} = \operatorname{mean}(S^{(2)} \mid c), \tag{48} \]
where $K_c^{(2)}$ is the $c$th entry of the country complexity eigenvector $k_2$ and
$S^{(2)} \mid c$ is the restriction of the product complexity eigenvector $s_2$ to the
products made by country $c$. If product complexity $S$ is accurately measured by
$s_2$ (which it can do only up to the scale of measurement of sophistication, and an
error term that should average out), then
\[ S^{(2)} = \text{constant} \times S + \text{error}. \tag{49} \]
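And if, in addition, the combinatorial model of production is accurate, then the
ECI-PCI algorithm yields an $\mathrm{ECI}_c$ that is an estimate of the theoretical
counterpart
\[ \mathrm{ECI} = \frac{K - \langle K \rangle}{\operatorname{std}(K)} = \frac{\log D - \langle \log D \rangle}{\operatorname{std}(\log D)}. \tag{50} \]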
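The equality of the two standardized expressions in (50) follows from log D being proportional to K under the model. A minimal Python sketch of this point, using a hypothetical list of capability counts and an assumed probability parameter `pi`, is:

```python
from math import isclose, log
from statistics import mean, pstdev

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical capability counts for five countries, and an assumed pi.
pi = 0.3
capabilities = [3, 5, 8, 13, 21]

# Under the model, D = (1 + pi)**K, so log D = K * log(1 + pi).
log_diversity = [K * log(1 + pi) for K in capabilities]

eci_from_K = standardize(capabilities)
eci_from_logD = standardize(log_diversity)

# Standardization removes the positive scale factor log(1 + pi).
assert all(isclose(a, b, abs_tol=1e-9) for a, b in zip(eci_from_K, eci_from_logD))
```

The standardized scores coincide for any `pi > 0`, so the theoretical ECI does not depend on the unknown probability parameter.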
4.5.2 Model’s Prediction about Fitness

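Under the model’s predicted correspondence $(F, Q) \leftrightarrow (D, \pi^{-S})$, we have:
\[ Q(S) = Q(0)\,\pi^{-S}. \tag{51} \]
Thus $Q_p$ is an empirical version of the theoretical counterpart:
\[ Q(S) = Q(0)\,\pi^{-S}. \tag{52} \]
The model also predicts that the average product quality in a K-country is
\[
\begin{aligned}
\langle Q \mid K \rangle &= Q(0) \sum_{S=0}^{K} \pi^{-S} \operatorname{prob}(S \mid K) \\
&= D^{-1} Q(0) \sum_{S=0}^{K} \binom{K}{S} \pi^{S} \pi^{-S}.
\end{aligned}
\]
That is,
\[ \langle Q \mid K \rangle = D^{-1} Q(0)\, 2^{K} = Q(0) \left( \frac{2}{1+\pi} \right)^{K}. \tag{53} \]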

1
Therefore, the F-Q algorithm produces (up to norming) a fitness index
\[ F_c = \operatorname{mean}(M_{cp} Q_p)\, D_c, \]
which is an estimate of the theoretical counterpart
\[ F = F(0)\, 2^{K}. \tag{54} \]
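The step from (53) to (54) can be checked numerically. The sketch below rests on two assumptions: product quality takes the form `Q0 * pi**(-S)`, and fitness is diversity times average product quality. The function name and parameter values are hypothetical.

```python
from math import comb, isclose

def theoretical_fitness(K, pi, Q0=1.0):
    """Diversity times average product quality in a K-country."""
    D = (1 + pi) ** K
    # Sophistication distribution: prob(S | K) = C(K, S) * pi**S / D.
    probs = [comb(K, S) * pi**S / D for S in range(K + 1)]
    # Assumed quality of an S-product: Q0 * pi**(-S).
    mean_quality = sum(Q0 * pi ** (-S) * p for S, p in enumerate(probs))
    return D * mean_quality

# Illustrative values (assumptions).
K, pi = 9, 0.4
F = theoretical_fitness(K, pi)

# The pi-dependence cancels, leaving Q0 * 2**K.
assert isclose(F, 2 ** K)
```

Under these assumptions the probability parameter drops out of the fitness index, so log-fitness is linear in K with slope log 2, in line with the linear dependence between log-diversity and log-fitness that the model predicts.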
References
[1] F. Al-Marhubi, Export diversification and growth: an empirical investigation,
Applied Economics Letters, 7 (2000) 559-562.
[2] S. Lall, The Technological structure and performance of developing country
manufactured exports, 1985‐98, Oxford Development Studies, 28 (2000) 337-369.
[3] D. Herzer, F. Nowak-Lehmann D., What does export diversification do for
growth? An econometric analysis, Applied Economics, 38 (2006) 1825-1838.
[4] S. Lall, J. Weiss, J. Zhang, The “sophistication” of exports: A new trade measure,
World Development, 34 (2006) 222-237.
[5] R. Hausmann, J. Hwang, D. Rodrik, What you export matters, Journal of
Economic Growth, 12 (2007) 1-25.
[6] H. Hesse, Export diversification and economic growth, in, World Bank
Commission on Growth and Development, 2008.
[7] M.L. Weitzman, Recombinant growth, The Quarterly Journal of Economics, 113
(1998) 331-360.
[8] P. Auerswald, S. Kauffman, J. Lobo, K. Shell, The production recipes approach to
modeling technological innovation: An application to learning by doing, Journal of
Economic Dynamics and Control, 24 (2000) 389-450.
[9] R. Hausmann, C.A. Hidalgo, The network structure of economic output, Journal
of Economic Growth, 16 (2011) 309-342.
[10] R.M. Solow, A Contribution to the Theory of Economic Growth, The Quarterly
Journal of Economics, 70 (1956) 65-94.
[11] R.M. Solow, Technical change and the aggregate production function, The
Review of Economics and Statistics, 39 (1957) 312-320.
[12] P.M. Romer, Increasing returns and long-run growth, Journal of Political
Economy, 94 (1986) 1002-1037.
[13] R.E. Lucas Jr, On the mechanics of economic development, Journal of Monetary
Economics, 22 (1988) 3-42.
[14] P. Aghion, P. Howitt, A model of growth through creative destruction,
Econometrica, 60 (1992) 323-351.
[15] G.M. Grossman, E. Helpman, Quality ladders in the theory of growth, The
Review of Economic Studies, 58 (1991) 43-61.
[16] C.A. Hidalgo, R. Hausmann, The building blocks of economic complexity,
Proceedings of the National Academy of Sciences, 106 (2009) 10570-10575.
[17] R. Hausmann, C.A. Hidalgo, S. Bustos, M. Coscia, A. Simoes, The atlas of
economic complexity: Mapping paths to prosperity, MIT Press, Cambridge, MA,
2014.
[18] J.M. Kleinberg, Authoritative sources in a hyperlinked environment, in:
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms,
1998, pp. 668-677.
[19] J.M. Kleinberg, M. Newman, A.-L. Barabási, D.J. Watts, Authoritative sources in
a hyperlinked environment, Princeton University Press, 2011.
[20] A. Tacchella, M. Cristelli, G. Caldarelli, A. Gabrielli, L. Pietronero, A new
metrics for countries' fitness and products' complexity, Scientific Reports, 2 (2012).
[21] G. Caldarelli, M. Cristelli, A. Gabrielli, L. Pietronero, A. Scala, A. Tacchella, A
network analysis of countries’ export flows: firm grounds for the building blocks of
the economy, PLoS ONE, 7 (2012).
[22] F. Battiston, M. Cristelli, A. Tacchella, L. Pietronero, How metrics for economic
complexity are affected by noise, Complexity Economics, 3 (2014) 1-22.
[23] E. Kemp-Benedict, An interpretation and critique of the Method of Reflections,
(2014).
[24] P. Mealy, J.D. Farmer, A. Teytelboym, Interpreting economic complexity,
Science Advances, 5 (2019) eaau1705.
[25] C. Sciarra, G. Chiarotti, L. Ridolfi, F. Laio, Reconciling contrasting views on
economic complexity, Nature Communications, 11 (2020) 1-10.
[26] A. Van Dam, K. Frenken, Variety, complexity and economic development,
Research Policy, (2020).
[27] C.A. Hidalgo, Economic complexity theory and applications, Nature Reviews
Physics, (2021) 1-22.
[28] A.N. Kolmogorov, Combinatorial foundations of information theory and the
calculus of probabilities, Russian Mathematical Surveys, 38 (1983) 29-40.
[29] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical
Journal, 27 (1948) 379-423.
[30] G. Gaulier, S. Zignago, BACI: international trade database at the product-level
(the 1994-2007 version), (2010).
[31] R.C. Feenstra, R.E. Lipsey, H. Deng, A.C. Ma, H. Mo, World trade flows: 1962-
2000, in, National Bureau of Economic Research, 2005.