A Simple Measure of Economic Complexity

Preprint · January 2016


Source: arXiv

All content following this page was uploaded by Sabiou M. Inoua on 04 October 2021.


Sabiou Inoua

Chapman University

This version: 2021

Abstract. The conventional view on economic development simplifies a country’s production to one aggregate variable, GDP. Yet product diversification matters for economic development, as recent, data-driven, “economic complexity” research suggests. A country’s product diversity reflects the country’s diversity of productive knowhow, or “capabilities”. Researchers derive from algorithms (inspired by network theory) metrics that measure the number of capabilities in an economy, notably the Economic Complexity Index (ECI), argued to predict economic growth better than traditional variables such as human capital, and the country Fitness index. This paper offers an alternative economic complexity measure (founded on information theory) that derives from a simple model of production as a combinatorial process whereby a set of capabilities combine with some probability to transform raw materials into a product. As the model predicts, a country’s number of capabilities is given by the logarithm of its product diversity; the model also predicts a linear dependence between log-diversity, ECI, and log-fitness. The model’s predictions fit the empirical data well; its informational interpretation, we argue, is a natural theoretical framework for the complexity view on economic development.

Keywords: economic growth, economic development, product diversification, economic complexity metrics, entropy, information theory

1 Background

1.1 Product Diversification Matters for Economic Development

Contrary to a long tradition in economics according to which international prosperity is achieved when national economies specialize, product diversity is strongly correlated with economic development [1-6]. The “richest” countries make almost all types of products, from the most rudimentary to the most sophisticated ones, while the “poorest” countries make comparatively fewer and more rudimentary products (Table 1).

Table 1. The World’s Most and Least Diversified Economies (2018, 4-digit HS).1

Ten Most Diversified Economies (2018)        Ten Least Diversified Economies (2018)
Country         Diversification  Rank        Country            Diversification  Rank
United States   1224             1           Gambia             180              162
China           1221             2           Maldives           178              163
India           1219             3           Saint Lucia        169              164
Japan           1211             3           Equat. Guinea      165              165
Germany         1223             5           Bhutan             143              166
Russia          1204             6           Central Afr. Rep.  130              167
Brazil          1196             7           Chad               118              168
Indonesia       1168             8           Comoros            105              169
United Kingdom  1221             9           Guinea-Bissau      29               170
France          1220             10          Small Islands      [3, 28]          171

At the lowest extreme of the economic complexity spectrum are small-population islands, which mostly export natural products (naturally occurring goods: fruits, vegetables, fish).2 Then come economies specialized in highly demanded raw materials (notably oil); these countries have higher incomes despite their low product diversity (Figure 1). All in all, about 80% of countries’ GDP ranking can be explained by the mere product-diversity ranking (Figure 1) if we put aside natural resources, which, as documented throughout, are the main source of bias in this purely qualitative view on production. (Islands, which would be expected to appear as even greater outliers than oil exporters, are not included in Figure 1, due to missing GDP data.)

1 See data description and source in Subsection 4.1. The number of products a country makes depends of course on the product nomenclature used (usually 2, 4, 5, or 6-digit product codes), notably the SITC (Standard International Trade Classification) and the HS (Harmonized System). The results from the two product nomenclatures and at different aggregation levels are very similar; hence we present no systematic comparison of the results based on product nomenclature.
2 The islands are, e.g., Bouvet (BVT), Netherlands Antilles (ANT), Kiribati (KIR), Northern Mariana (MNP), Micronesia (FSM), Pitcairn (PCN), South Georgia & Sandwich (SGS), Tuvalu (TUV), Wallis & Futuna (WLF).

Figure 1. Countries’ GDP versus Diversification Rankings (4-digit HS). Countries in red are exporters whose natural-resource rents (averaged across years) are at least 10% of their GDP.3

These associations are not just correlations and can be explained from a basic combinatorial model of production.4 The complexity of a country’s production (the diversity and sophistication of its products) reveals a diversity of productive knowledge in that economy that combines to make various products. Qualitatively, products differ precisely by the amount of knowhow involved in their production: in theory, the spectrum of this knowhow content of products ranges from zero, for naturally occurring goods (a natural resource sold in the raw, for example), to a maximum value when all the available knowhows are involved in the making of the product (consider an aircraft, for example). A product’s (technological) sophistication or complexity can be defined by the amount of knowledge its production requires; and the (technological) complexity of an economy, by the total amount of knowledge involved in its output. But what precisely is an “amount of productive knowledge” and how can we measure it?

3 Throughout, the straight lines are least-squares fits.
4 This paper’s combinatorial model of production differs by its simplicity from earlier ones [7-9].

1.2 Productive Knowledge in Conventional Growth Models

In conventional economic theory, a country’s productive knowledge is summarized by an aggregate production function GDP = F(Capital, Labor), or the aggregate output (or income) the country can produce from any combination of aggregate labor and capital, where the function F is homogeneous of degree 1, so that income per capita is a function of the stock of capital per worker: GDP/Labor = F(Capital/Labor, 1).5 But as Solow’s seminal contribution established [10, 11], capital and labor accumulation cannot account for much of economic growth, which Solow explains in terms of exogenous shifts of the production function F through a multiplicative factor (denoted “A” and later named “Total Factor Productivity”), whose growth rate is interpreted as measuring technological progress. Much of the later development of standard growth theory consisted of attaching theoretical substance and identity to the “Solow residual” (in other words, of “endogenizing” the part of economic development not explained by the level of capital per worker), the consensus being that the residual somehow captures productive knowledge or “technology”, often identified with human capital (years of schooling), innovation, or research and development [12-15].

5 Capital is traditionally denoted in economic theory as K, which we reserve for “knowledge”.

1.3 The Complexity View on Economic Development

In contrast to the aggregate production function approach stands the above-mentioned, data-driven “complexity” approach to economic development, inspired by the empirical correlation between economic development and product diversification. The diversity of a country’s production, as noted above, reflects the diversity of its capabilities (elemental units of productive knowhow), which combine to make more and more sophisticated products.

Figure 2. Network Model of Production. Countries make products using capabilities: (A) The country-capability-product network. (B) The country-product network. Capabilities are not directly observable: How to count them?

Thus, it should be possible to infer the amount of productive knowledge involved in an economy from its product diversification data. Researchers framed this problem in terms of network theory, modeling countries’ productions as a tripartite network connecting countries to the products they make, products to the capabilities their production requires, and countries to the capabilities they possess (Figure 2). Thus, the core problem of the network approach to production was conceived as one of reconstructing the partly unobservable country-product-capability network from its empirically observed bipartite country-product projection [16, 17]. Researchers conceived algorithms to that effect, notably the Economic Complexity Index (ECI), which, as the authors argue, predicts economic growth better than traditional variables such as human capital [16, 17]. The ECI is jointly computed with the Product Complexity Index (PCI) by an algorithm akin to that which the web search engine Google uses to rank webpages.6 Another algorithm produces alternative country and product complexity measures [20], named country Fitness (F) and product Quality (Q). Both algorithms will be presented shortly. (Section 4 offers a step-by-step derivation of the metrics and the basic logic underlying them, for the reader not familiar with this literature.)7

The primary data of the network view on production is, formally, the country-product binary matrix $M = [M_{cp}]$ connecting countries to the products they make: $M_{cp} = 1$ if country $c$ makes product $p$, and $M_{cp} = 0$ otherwise. This simple product-list data is not available, however; thus, one takes countries’ export lists as a proxy for their product lists. (More on the data description in Subsection 4.1.) Given the matrix $M$, the product diversity of country $c$ (the number of its products) and the ubiquity of product $p$ (the number of its producers) are respectively:8

$$ D_c = \sum_p M_{cp}, \tag{1} $$

$$ U_p = \sum_c M_{cp}. \tag{2} $$

The complexity metrics are (up to norming) the solutions to the equations:

$$ D_c \, \mathrm{ECI}_c \propto \sum_p M_{cp} \, \mathrm{PCI}_p, \tag{3} $$

$$ U_p \, \mathrm{PCI}_p \propto \sum_c M_{cp} \, \mathrm{ECI}_c, \tag{4} $$

$$ F_c \propto \sum_p M_{cp} Q_p, \tag{5} $$

$$ Q_p \propto \Big[ \sum_c M_{cp} F_c^{-1} \Big]^{-1}. \tag{6} $$
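As a concrete illustration of definitions (1) and (2), the following minimal sketch computes diversity and ubiquity from a binary matrix stored as a list of rows. The function name and the toy matrix are hypothetical and chosen only for illustration; they are not part of the paper's data.

```python
def diversity_and_ubiquity(M):
    """Return (D, U) for a binary country-by-product matrix M (list of rows)."""
    D = [sum(row) for row in M]        # D_c = sum over p of M_cp, equation (1)
    U = [sum(col) for col in zip(*M)]  # U_p = sum over c of M_cp, equation (2)
    return D, U


# Hypothetical toy data: 4 countries (rows) by 4 products (columns), nested so
# that each country makes a subset of the products of the country above it.
M_toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
D_toy, U_toy = diversity_and_ubiquity(M_toy)
print("diversity:", D_toy)
print("ubiquity:", U_toy)
```

In this toy case the most diversified country is the only producer of the rarest product, which is the pattern the complexity metrics are designed to exploit.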

6 More precisely the ECI-PCI algorithm is more similar in spirit to an algorithm developed by J. Kleinberg [18, 19] and used by Ask.com. It is an eigenvector problem, as one can see from (3)-(4).
7 The complexity metrics are analyzed in various studies, some of which offer critiques, alternatives, or refinements of the metrics, including the one presented here, in an earlier draft [21-27].
8 The natural concept is not ubiquity per se, but its inverse, which can be called product rarity.

2 Results and Discussions

2.1 Measuring an Economy’s Knowhow by Counting Its Products

A product is some transformed natural resources, some raw materials to which a set of knowhows is applied to turn them into an economically valuable outcome; and knowledge comes in discrete elementary units, or capabilities, that combine to make more and more sophisticated knowledge. The results presented in this paper derive from these definitions and two simple assumptions about the constraints on knowledge sophistication and raw-material availability:

1. Any $S$ capabilities can be put together to transform raw materials into a valuable product only with probability $\alpha^S$ (uniform across countries and products).

2. A country finds the raw materials needed for making a product involving $S$ capabilities only with probability $\beta^S$ (uniform across countries and products).

The two assumptions imply that a product tends to appear in a country’s product list with a probability that decays exponentially with the product’s sophistication. Thus, a product’s sophistication can be measured by its log-likelihood of appearing:

$$ S = \frac{\log \mathrm{prob}(\text{product})}{\log(\alpha\beta)}. \tag{7} $$

Moreover, one can show (Subsection 4.4) that the total number of capabilities in an economy that makes $D$ products is given by the country’s log-product-diversity:9

$$ K = \frac{\log(D)}{\log(1 + \alpha\beta)}. \tag{8} $$

The model predicts the following relationships between knowhow, fitness, and ECI, which fit the data well up to the bias related to natural products (Figures 3 and 4):10

$$ \mathrm{ECI} = \frac{\log D - \mathrm{mean}(\log D)}{\mathrm{std}(\log D)}. \tag{9} $$

$$ \frac{\log F}{\mathrm{mean}(\log F)} = \frac{\log D}{\mathrm{mean}(\log D)}. \tag{10} $$

9 The derivation is straightforward if we assume away the model’s two constraints (1)-(2); then a country possessing K capabilities makes $D = 2^K$ products, whose sophistication ranges from 0 (for unprocessed natural resources sold) to K. Thus, K is given by log D (up to a scaling constant).
10 Notation: mean(X) denotes a cross-country average of X (average X across all countries); std(X) means the cross-country standard deviation of X; later, we will use mean(X|c) to mean the average of X in a country c; and in equations (16)-(17), we write mean(X|K) for the average of X in a country with K capabilities. The metrics are systematically compared in standardized form (namely their z-scores) in the figures below, unless otherwise indicated by the scale of the plot.

Figure 3. The Three Country Complexity Measures Related: Data versus Model. (Top panel: 4-digit SITC; Bottom Panel: 2-digit HS.)

Figure 4. Log-Diversity vs ECI (top) vs Log-Fitness (bottom). (2-digit HS)

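More specifically (Subsection 4.5), the model predicts the following relationships between knowhow, diversity, complexity, sophistication, fitness, and quality, assuming the two algorithms (3)-(6) measure these variables accurately:

$$ D = (1 + \alpha\beta)^K, \tag{11} $$

$$ F = 2^K F(0), \tag{12} $$

$$ Q = (\alpha\beta)^{-S} Q(0), \tag{13} $$

$$ \mathrm{ECI} = \frac{K - \mathrm{mean}(K)}{\mathrm{std}(K)}, \tag{14} $$

$$ \mathrm{PCI} = \frac{S - \mathrm{mean}(S)}{\mathrm{std}(S)}, \tag{15} $$

$$ \mathrm{mean}(S|K) = \Big( \frac{\alpha\beta}{1 + \alpha\beta} \Big) K, \tag{16} $$

$$ \mathrm{mean}(Q|K) \propto \Big( \frac{2}{1 + \alpha\beta} \Big)^K, \tag{17} $$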

where F(0) and Q(0) are merely normalizing constants, and the notation mean(X|K) stands for the (conditional) average of X in a country with K capabilities. The last two predictions (16)-(17) are particularly nontrivial and fit the empirical data equally well (Figure 5). They offer a simple criterion for assessing the accuracy of each complexity algorithm.11 The F-Q metrics are better fit by the model because this algorithm better deals with the bias related to natural products, whose rarity is due to natural reasons and not to knowhow, and which are mostly exported by island countries, the least complex economies: it is indeed an essential aspect (and motivation) of the F-Q algorithm to emphasize the lowest-complexity countries in the estimation of $Q_p$, namely (6). (See Subsection 4.3.) In contrast, island countries are major outliers of the ECI versus log-diversity theoretical fit [Figure 3, bottom panel, (a)].

11 In an earlier draft (arxiv.org, 2016) we suggested that it takes more than a regression between log D and log F to assess the accuracy of the F-Q algorithm: because fitness is the average product quality multiplied by diversity, a strong correlation between log F and log D might be a fortuitous one (that holds even with random data, as Figure 5, bottom panel, confirms). The nontrivial prediction (17) is the right criterion in this respect.

Figure 5. Countries’ Productive Knowhow (Log-Diversity) Can be Measured by their Average PCI or by their Log Average Product Quality. (4-digit HS.)

The model’s two parameters come down effectively to one, the joint probability

$$ \theta = \alpha\beta. \tag{18} $$

Once this probability is known, all variables are determined (including the scales or norming constants). This parameter is simply related to the slope between log-diversity and log-fitness, by virtue of the predictions (11) and (12), which imply

$$ \log D = \frac{\log(1 + \theta)}{\log 2} \log F - \frac{\log(1 + \theta)}{\log 2} \log F(0). \tag{19} $$

Thus, we can estimate the model’s key probability parameter and the norming constant F(0) through linear regression, which yields:12

$$ \theta = 2^{\,\mathrm{cov}(\log D, \log F) / \mathrm{var}(\log F)} - 1, \tag{20} $$

$$ F(0) = \exp\{ \mathrm{mean}(\log F) - [\log(1 + \theta)]^{-1} \, \mathrm{mean}(\log D) \log 2 \}. \tag{21} $$

The number of capabilities in each country is then estimated through any one of the three economic complexity measures:

$$ K = \frac{\log D}{\log(1 + \theta)} = \frac{\log[F / F(0)]}{\log 2} = \mathrm{mean}\Big[ \frac{\log D}{\log(1 + \theta)} \Big] + \mathrm{std}\Big[ \frac{\log D}{\log(1 + \theta)} \Big] \mathrm{ECI}. \tag{22} $$

The probability $\theta$ and the spread of the distribution of $K$ (notably the maximum knowhow $K_{\max}$) depend on the product nomenclature (Figure 6).

12 If one can manage to find F(0) explicitly in terms of the norming constants involved in the F-Q algorithm (Subsection 4.3), then one can sharpen the regression equation by regressing log D on log(F/F(0)), and split the error (or residual) term into a mean term, which would measure the bias due to raw products, and a pure noise term.
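The estimation steps (19)-(22) can be sketched as follows. This is a minimal illustration, not the author's code: the function name is hypothetical, `theta` stands for the joint probability of (18), population (divide-by-n) moments are assumed for the covariance and variance, and the input is synthetic data generated from the model itself so that the recovered values can be checked.

```python
import math


def estimate_model(D, F):
    """Estimate (theta, F0, K) from country diversities D and fitnesses F."""
    log_d = [math.log(x) for x in D]
    log_f = [math.log(x) for x in F]
    n = len(D)
    mean_d = sum(log_d) / n
    mean_f = sum(log_f) / n
    cov_df = sum((a - mean_d) * (b - mean_f) for a, b in zip(log_d, log_f)) / n
    var_f = sum((b - mean_f) ** 2 for b in log_f) / n
    slope = cov_df / var_f                    # slope of log D on log F, as in (19)
    theta = 2.0 ** slope - 1.0                # equation (20)
    f0 = math.exp(mean_f - mean_d * math.log(2) / math.log(1 + theta))  # (21)
    K = [x / math.log(1 + theta) for x in log_d]  # first expression in (22)
    return theta, f0, K


# Synthetic check (assumed values): theta = 0.5, F(0) = 0.1, K = 1..5.
true_K = [1, 2, 3, 4, 5]
D_syn = [1.5 ** k for k in true_K]        # D = (1 + theta)^K
F_syn = [0.1 * 2 ** k for k in true_K]    # F = 2^K F(0)
theta_hat, f0_hat, K_hat = estimate_model(D_syn, F_syn)
print(theta_hat, f0_hat, K_hat)
```

With real data the fit is not exact, so the three expressions for K in (22) would agree only up to the regression residual.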

Figure 6. World Distribution of Country Knowhow. [HS: 2-digit (Top) versus 4-digit product (Bottom) categories.]

2.2 Measuring a Product’s Sophistication by Harmonically Counting Its Producers

The model’s two assumptions also amount effectively to one: the probability that a country $c$ makes a product $p$, which we write simply prob(p|c), decays exponentially with the product’s sophistication $S_p$. That is:

$$ \mathrm{prob}(p|c) = \frac{M_{cp}}{D_c} = \theta^{S_p}. \tag{23} $$

The (unconditional) probability of finding a product $p$ with sophistication $S_p$ in the world economy (all countries combined) is (by the law of total probability):

$$ \mathrm{prob}(p) = \sum_{c=1}^{C} \mathrm{prob}(p|c) \, \mathrm{prob}(c) = \sum_{c=1}^{C} \mathrm{prob}(p|c) \, \frac{1}{C}. $$

Given (23), we get

$$ \mathrm{prob}(p) = \theta^{S_p} = \frac{1}{C} \sum_{c=1}^{C} \frac{M_{cp}}{D_c}. \tag{24} $$

After rearranging the terms, we get:

$$ \theta^{-S_p} = C \, \frac{1}{\sum_{c=1}^{C} \frac{M_{cp}}{D_c}}. \tag{25} $$

Thus, the model predicts that product sophistication is (up to norming) given by the formula:

$$ S_p = \frac{1}{\log(1/\theta)} \log\Big[ \frac{C}{U_p} \, \mathrm{mean}_H\{ M_{cp} D_c \} \Big], \tag{26} $$

where $\mathrm{mean}_H$ stands for the (cross-country) harmonic mean, taken over the $U_p$ producers of $p$. By the same token, we have also established the following correspondence (up to norming):

$$ (F, Q) \leftrightarrow (D, \theta^{-S}). \tag{27} $$

That is, if we replace fitness by diversity in the F-Q algorithm, then the latter yields $Q_p = Q(0) \, \theta^{-S_p}$. The three product complexity measures (S, PCI, and log Q) are strongly associated (Figure 7).
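The harmonic counting of producers can be sketched in a few lines: for each product, take the harmonic mean of its producers' diversities, scale by the number of countries over the ubiquity, and take the logarithm in base 1/theta. The function name and the toy matrix are hypothetical, and the value `theta = 0.5` is an arbitrary assumption used only to fix the scale (in the paper the parameter is estimated from data).

```python
import math
from statistics import harmonic_mean


def harmonic_sophistication(M, theta):
    """Sophistication of each product from the binary matrix M (list of rows)."""
    C = len(M)
    D = [sum(row) for row in M]  # country diversities
    S = []
    for p in range(len(M[0])):
        producer_diversities = [D[c] for c in range(C) if M[c][p] == 1]
        if not producer_diversities:
            S.append(float("nan"))  # a product nobody makes has no estimate
            continue
        U_p = len(producer_diversities)
        # (C / U_p) * harmonic mean of producers' diversities = C / sum_c(M_cp / D_c)
        ratio = C / U_p * harmonic_mean(producer_diversities)
        S.append(math.log(ratio) / math.log(1 / theta))
    return S


# Hypothetical nested toy matrix: 4 countries by 4 products.
M_toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
S_toy = harmonic_sophistication(M_toy, theta=0.5)
print(S_toy)
```

The product made only by the most diversified country gets the highest value, while the product made by every country, including the least diversified one, gets the lowest.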

Figure 7. The three product sophistication measures related. (Top panel: 2-digit HS; bottom panel: 4-digit HS: outlier products are mostly raw products.)

2.3 Diversity and Complexity are Orthogonal, but Not Independent

This subsection is an interlude prompted by an increasingly common interpretation of the orthogonality between complexity and diversity, a misapprehension that seems to call into question the very dependence between these concepts (and hence everything we have said so far). Diversity and complexity are intuitively related notions. The model predicts that product diversity is an exponential function of economic complexity. One can show mathematically that the dependence cannot be a linear one anyway, at least if complexity is measured by the ECI, or more precisely by its non-standardized version (an eigenvector associated with the country-product network: Subsection 4.2), which we denote $k_2$ below. This follows from a basic yet elegant mathematical result [23], establishing orthogonality of $k_2$ and the product diversity vector, which we denote $d = [D_c]$. By symmetry one can similarly establish orthogonality between product ubiquity $u = [U_p]$ and the non-standardized PCI, which we denote $s_2$.

Contrary to a spreading interpretation [23, 24, 27], however, orthogonality between two vectors, it should be recalled, merely implies that the two vectors are not linearly dependent (where linearity is to be taken in the strict mathematical sense, which excludes affine dependence, or inclusion of an intercept). Orthogonality of complexity and diversity is not incompatible with positive dependence between the two vectors: on the contrary, the orthogonality combined with the dependence merely puts a constraint on average complexity. Thus $k_2 \cdot d = 0$ and $s_2 \cdot u = 0$ combined with $\mathrm{cov}(k_2, d) > 0$ and $\mathrm{cov}(s_2, u) < 0$ (which by now should be taken as well-established both empirically and theoretically) simply imply that

$$ \mathrm{mean}(k_2) < 0, \tag{28} $$

$$ \mathrm{mean}(s_2) > 0. \tag{29} $$

Since the orthogonality is true mathematically, the sign conditions (28)-(29) simply reflect the dependence between complexity and diversity, which is strong as we already know, and which the sign conditions confirm (Figure 8).
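The step from orthogonality to a sign condition on the mean can be spelled out in one line. The sketch below assumes population covariances, writes C for the number of countries and P for the number of products (P is a symbol introduced here), and uses only the fact that diversities and ubiquities are positive:

```latex
\begin{align*}
\operatorname{cov}(k_2, d)
  &= \frac{k_2 \cdot d}{C} - \operatorname{mean}(k_2)\,\operatorname{mean}(d)
   = -\operatorname{mean}(k_2)\,\operatorname{mean}(d)
  && \text{since } k_2 \cdot d = 0, \\
\operatorname{cov}(s_2, u)
  &= \frac{s_2 \cdot u}{P} - \operatorname{mean}(s_2)\,\operatorname{mean}(u)
   = -\operatorname{mean}(s_2)\,\operatorname{mean}(u)
  && \text{since } s_2 \cdot u = 0.
\end{align*}
```

Because mean(d) and mean(u) are strictly positive, the mean of each eigenvector has the sign opposite to that of its covariance with diversity or ubiquity; a nonzero covariance is therefore fully compatible with exact orthogonality.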

Figure 8. Distribution of the (non-standardized) ECI and PCI. Top: Country Complexity. Bottom: Product Complexity. (2-digit HS).

3 Conclusion: Towards an Information Theory of Economic Growth

The results, in the final analysis, suggest that both the empirical data and the algorithms of the economic complexity literature can be simply rationalized by a combinatorial model of production in which bits of knowhow (or capabilities) combine, with some probability $\theta$, to make more and more sophisticated knowhow. For simplicity we treated the model’s (effectively unique) parameter as uniform across countries and across products; it is more accurate, however, to assume some cross-country and cross-product variability of $\theta$ to account for the bias due to raw products (or natural resources).

Fundamentally, the model rests entirely on the assumption that knowledge comes in discrete units and that it expands combinatorially. This suggests a simple informational interpretation of the model that seems to be the natural language for the complexity view on economic development more generally. For our limited purpose here, information theory can be summarized by an informational interpretation of Boltzmann’s famous entropy formula:13

Information Content of System = Log (Effective Number of Basic Information States).

That is, the amount of information that an information source conveys (a system to whose states meaning can be associated) is measured by a logarithm of the system’s effective number of basic states (those states that can carry information). A basic information state can be the realization of an event, for example: the information that the realization of an event conveys is then measured by the log-inverse-probability of the event, and the total information conveyed by the whole system (here a probability space) is obtained by averaging the information contents conveyed by the basic events (Shannon’s entropy formula); if the events are equally likely, then the overall information content is measured by the logarithm of the total number of possible events. Indeed Shannon’s entropy formula [29] can be viewed as a special case of the general information formula: in this sense, Shannon’s formula defines the effective number of information states to be the exponential of Shannon’s entropy.

13 Information theory becomes more intuitive (compared to its usual formulation based on probability as a primitive concept) if the combinatorial foundation of information is more explicitly emphasized, or even taken as the primitive concept, as Kolmogorov seems to suggest [28]. If indeed the basic nature of information is that it comes in discrete units and that it expands combinatorially (or exponentially, in the simplest case), then it is natural to measure the amount of information of an information system by the logarithm of its effective number of information states.
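The notion of an effective number of states can be made concrete with a short sketch: the exponential of Shannon's entropy equals the plain count of states when they are equally likely and is smaller otherwise. Function names and the example distributions are illustrative choices, and natural logarithms are assumed.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def effective_number_of_states(probs):
    """Exponential of the entropy: the effective number of information states."""
    return math.exp(shannon_entropy(probs))


uniform = [0.25, 0.25, 0.25, 0.25]  # four equally likely states
skewed = [0.5, 0.25, 0.25]          # three states, one of them dominant
n_uniform = effective_number_of_states(uniform)
n_skewed = effective_number_of_states(skewed)
print(n_uniform, n_skewed)
```

In the uniform case the information content is simply the logarithm of the number of states, which is the sense in which log-diversity measures a country's knowledge content when all products are treated alike.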

Think of a country as an information source revealing information about the knowledge content of the products it makes; and think of the country’s products as events or messages that reveal information about the country’s (unobserved) productive knowledge endowment. Thus, the information (or knowledge) content of a country’s output is measured by the country’s log-product-diversity. Similarly, think of the set of all countries potentially producing a product as the information source; then each producer reveals partial information about the knowledge content (or sophistication) of the product: the product’s knowledge content is then obtained as an average of the partial information revealed by the producers, and is roughly measured by the product’s log-frequency (or log-ubiquity) among countries. However, this is only a rough measure that implicitly assumes uniform probability of basic states or events: hence the need for generalized (or effective) diversity and ubiquity measures, theoretically given by the model’s predictions (8) and (26). The complexity algorithms, on the other hand, can be viewed as an empirical way of correcting diversity and ubiquity mutually, since a country’s knowhow is reflected in the products it makes, and vice versa.

A systematic analysis of the economic implications of the information theory of economic development is beyond this paper’s scope, since we choose to center the discussion entirely on the purely qualitative dimension of production (where the question is whether a country can make a product or not): more generally, a country can be considered to be rich either because of its productive knowhow (as reflected in its product diversity) or because of the intensity of its production (or the average amount of output the country is able to sell: the quantitative aspect of production determined by shorter-term factors such as demand).14

14 A discussion of the economic implications of the model is postponed to a follow-up work, which contains a development accounting in terms of the two dimensions of production (diversity versus intensity of output); this accounting was sketched in an earlier draft (arxiv.org, 2016) but has since expanded in subtlety.

4 Method: Data and Model

4.1 The Data

In principle, the complexity view on growth requires very simple data (for each country, the list of products it makes), which are not yet available, however; hence one takes countries’ export lists as a proxy for their product lists. While there will inevitably be some error in centering the analysis on export data (for lack of detailed data on production), the bias has proved minor a posteriori, given the accuracy of the results (apparently, a country’s export mix is representative of its total output’s composition). The results presented throughout this paper are based on the proxy matrix:

$$ M_{cp} = \begin{cases} 1 & \text{if } X_{cp} > 0, \\ 0 & \text{if } X_{cp} = 0, \end{cases} \tag{30} $$

where $X_{cp}$ is the amount country $c$ exported in product $p$, using the Comtrade data in HS (revision 2007), available for the years 1995-2018 [30].15 We also use for comparison the Comtrade data in SITC (revision 2) as compiled and corrected for mistakes by Feenstra et al. and available for the years 1962-2000 [31].

Unlike in this paper, the standard practice in the economic complexity literature is to define the $M_{cp}$ matrix more restrictively as

$$ M_{cp} = \begin{cases} 1, & \mathrm{RCA}_{cp} \ge 1, \\ 0, & \mathrm{RCA}_{cp} < 1, \end{cases} \tag{31} $$

where $\mathrm{RCA}_{cp}$ is the revealed comparative advantage of a country $c$ in product $p$ and is defined as $\mathrm{RCA}_{cp} = (X_{cp} / \sum_p X_{cp}) / (\sum_c X_{cp} / \sum_{c,p} X_{cp})$.
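The two definitions of the country-product matrix can be compared on a small example. The sketch below is illustrative only: the function names and the export values are hypothetical, and countries or products with zero total exports are simply assigned zeros under the RCA rule (a choice made here to avoid division by zero).

```python
def presence_matrix(X):
    """Definition (30): M_cp = 1 whenever country c exports any amount of p."""
    return [[1 if x > 0 else 0 for x in row] for row in X]


def rca_matrix(X):
    """Definition (31): M_cp = 1 whenever the revealed comparative advantage >= 1."""
    world_total = sum(sum(row) for row in X)
    country_total = [sum(row) for row in X]
    product_total = [sum(col) for col in zip(*X)]
    M = []
    for c, row in enumerate(X):
        M_row = []
        for p, x in enumerate(row):
            if country_total[c] == 0 or product_total[p] == 0:
                M_row.append(0)  # no exports to compare: treated as absent
                continue
            share_in_country = x / country_total[c]
            share_in_world = product_total[p] / world_total
            M_row.append(1 if share_in_country / share_in_world >= 1 else 0)
        M.append(M_row)
    return M


# Hypothetical export values: 2 countries (rows) by 3 products (columns).
X_toy = [
    [10, 0, 5],
    [1, 20, 0],
]
M_presence = presence_matrix(X_toy)
M_rca = rca_matrix(X_toy)
print(M_presence)
print(M_rca)
```

The second country's small export of the first product counts under (30) but not under (31), which is the sense in which the RCA-based definition is the more restrictive one.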

15 The trade data are accessible through the Atlas of Economic Complexity Dataverse (Harvard University): https://dataverse.harvard.edu/dataverse/atlas. The income data are countries’ GDP in PPP (purchasing power parity) from the Penn World Table (PWT8); we use the RGDPO variable (an output-oriented GDP estimate), though the other measures give very similar results. The PWT is accessible through the GGDC (Groningen Growth and Development Centre, University of Groningen): https://www.rug.nl/ggdc/productivity/pwt/.

4.2 The ECI-PCI Algorithm

The ECI-PCI algorithm [16, 17] assumes that an economy’s knowhow is proportional to the average knowledge content of its products, and, vice versa, a product’s knowledge content is proportional to the average knowhow of its producers. Thus, if $K_c$ measures the amount of knowhow in country $c$, and $S_p$ the knowledge content of product $p$, then

$$ K_c = \lambda \sum_p W_{cp} S_p, \tag{32} $$

$$ S_p = \mu \sum_c W^*_{pc} K_c, \tag{33} $$

where $\lambda$ and $\mu$ are positive normalizing constants, and the weights are

$$ W_{cp} = \frac{M_{cp}}{\sum_p M_{cp}}, \tag{34} $$

$$ W^*_{pc} = \frac{M_{cp}}{\sum_c M_{cp}}. \tag{35} $$

Collecting the variables and weights into the vectors and matrices $k = [K_c]$, $s = [S_p]$, $W = [W_{cp}]$, and $W^* = [W^*_{pc}]$, (32) and (33) become $k = \lambda W s$ and $s = \mu W^* k$. So we get

$$ (W W^*) k = (\lambda\mu)^{-1} k, \tag{36} $$

$$ (W^* W) s = (\lambda\mu)^{-1} s. \tag{37} $$

That is, the complexities of countries and products are given by eigenvectors of the matrices $W W^*$ and $W^* W$, respectively, where the associated eigenvalue is $(\lambda\mu)^{-1}$. Because the averaging weights sum to 1, it is easy to see that any (positive) uniform vectors $k = [K, \ldots, K]^T$ and $s = [S, \ldots, S]^T$ are solutions to this eigenvector problem; these are the eigenvectors associated with the largest eigenvalue, which is 1 (by a known linear algebra result, the Perron-Frobenius theorem). Thus, the authors of this algorithm choose the eigenvectors associated with the second largest eigenvalue. Let $k_2$ and $s_2$ be these eigenvectors: then ECI and PCI are (up to the sign) the elements of the chosen eigenvectors given in standardized form:

$$ \mathrm{ECI} = \mathrm{sign}[\mathrm{corr}(k_2, d)] \, \frac{k_2 - \mathrm{mean}(k_2)}{\mathrm{std}(k_2)}, \tag{38} $$

$$ \mathrm{PCI} = -\mathrm{sign}[\mathrm{corr}(s_2, u)] \, \frac{s_2 - \mathrm{mean}(s_2)}{\mathrm{std}(s_2)}. \tag{39} $$

We multiply by the sign of the correlation of $k_2$ with the country diversification vector $d$, and by minus the sign of the correlation of $s_2$ with the product ubiquity vector $u$ (complex products being the rarer ones), to ensure the signs are correct; this is simply because, the direction of an eigenvector being arbitrary, the standardization specifies the metrics only up to the sign: for example, any chosen eigenvector $k$ is equivalent to any nonzero multiple $\gamma k$, so that

$$ \frac{\gamma k - \mathrm{mean}(\gamma k)}{\mathrm{std}(\gamma k)} = \frac{\gamma}{|\gamma|} \, \frac{k - \mathrm{mean}(k)}{\mathrm{std}(k)}. \tag{40} $$
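A dependency-free way to obtain the second eigenvectors is power iteration with the trivial uniform eigenvector removed at each step. The sketch below is one possible implementation, not the one used in the literature (which typically calls a linear algebra library). It assumes every country makes at least one product and every product has at least one producer, uses a fixed number of iterations, relies on the orthogonality between the second eigenvector and the diversity vector to deflate, and ties the sign of PCI to that of ECI, which is a design choice made here.

```python
import math


def zscores(x):
    """Standardize a list using the population standard deviation."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / std for v in x]


def eci_pci(M, iterations=200):
    """Return (ECI, PCI) for a binary country-by-product matrix M."""
    C, P = len(M), len(M[0])
    d = [sum(row) for row in M]        # diversity
    u = [sum(col) for col in zip(*M)]  # ubiquity

    def product_average(k):  # s = W* k: average k over the producers of p
        return [sum(M[c][p] * k[c] for c in range(C)) / u[p] for p in range(P)]

    def country_average(s):  # k = W s: average s over the products of c
        return [sum(M[c][p] * s[p] for p in range(P)) / d[c] for c in range(C)]

    def deflate(k):  # remove the uniform eigenvector; the result satisfies k . d = 0
        shift = sum(dc * kc for dc, kc in zip(d, k)) / sum(d)
        return [kc - shift for kc in k]

    k = deflate([dc + 0.01 * c for c, dc in enumerate(d)])  # arbitrary start
    for _ in range(iterations):
        k = deflate(country_average(product_average(k)))
        scale = max(abs(v) for v in k)
        k = [v / scale for v in k]
    s = product_average(k)

    mean_k, mean_d = sum(k) / C, sum(d) / C
    cov_kd = sum((kc - mean_k) * (dc - mean_d) for kc, dc in zip(k, d))
    sign = 1.0 if cov_kd >= 0 else -1.0  # orient ECI along diversity
    return [sign * z for z in zscores(k)], [sign * z for z in zscores(s)]


# Hypothetical nested toy matrix: 4 countries by 4 products.
M_toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
eci_toy, pci_toy = eci_pci(M_toy)
print(eci_toy)
print(pci_toy)
```

On this toy matrix the ranking by ECI coincides with the ranking by diversity, and the product made by a single country receives the highest PCI.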

4.3 The Fitness-Quality Algorithm

In essence, this algorithm [20] measures the complexity of an economy by the total complexity of its products; and the complexity of a product, by the product’s inverse ubiquity, multiplied by the harmonic mean of the complexities of the producers. That is, the two metrics are jointly computed recursively as follows:

$$ F_c^{(n+1)} = \frac{1}{\mathrm{mean}(Q_p^{(n)})} \sum_p M_{cp} Q_p^{(n)}, \tag{41} $$

$$ Q_p^{(n+1)} = \frac{1}{\mathrm{mean}(F_c^{(n)})} \, \frac{1}{\sum_c M_{cp} \frac{1}{F_c^{(n)}}}. \tag{42} $$

The means are averages across all countries and all products, respectively, and the initial conditions are unit complexities for all countries and all products. The algorithm converges to a fixed point $(F^{(\infty)}, Q^{(\infty)})$, which, in normalized form, defines the country Fitness and product Quality indices:

$$ F = \frac{F^{(\infty)}}{\mathrm{mean}(F^{(\infty)})}, \tag{43} $$

$$ Q = \frac{Q^{(\infty)}}{\mathrm{mean}(Q^{(\infty)})}. \tag{44} $$

The crucial novelty of this algorithm is the following ingenious observation: if a low-complexity country is among the producers of a product, this product is necessarily a low-sophistication product; but to know that a highly complex economy is among the producers of a product barely reveals any information about the product’s complexity (since such a country makes almost all product types). Thus, highly complex economies should be discounted in the measure of product complexity, which should be dominated by the more informative, lowest-complexity producer: this is precisely what the harmonic mean does, whose following bounds are known (the minimum and the harmonic mean being taken over the producers of $p$):16

$$ \min\{ M_{cp} F_c \} \le \mathrm{mean}_H( M_{cp} F_c ) \le U_p \min\{ M_{cp} F_c \}. $$

We know from the theoretical model why the harmonic mean is the natural choice.
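The recursion (41)-(42) translates almost line by line into code. The sketch below is a minimal illustration with hypothetical names; it runs a fixed number of iterations instead of testing for convergence and assumes every country and every product has at least one link in the matrix.

```python
def fitness_quality(M, iterations=50):
    """Iterate the Fitness-Quality map on a binary matrix M; return (F, Q)."""
    C, P = len(M), len(M[0])
    F = [1.0] * C  # initial condition: unit complexity for every country
    Q = [1.0] * P  # and for every product
    for _ in range(iterations):
        mean_F = sum(F) / C
        mean_Q = sum(Q) / P
        # (41): total quality of the country's products, scaled by mean quality
        new_F = [sum(M[c][p] * Q[p] for p in range(P)) / mean_Q for c in range(C)]
        # (42): inverse of the summed inverse fitnesses of the producers
        new_Q = [
            1.0 / (mean_F * sum(M[c][p] / F[c] for c in range(C)))
            for p in range(P)
        ]
        F, Q = new_F, new_Q
    mean_F = sum(F) / C
    mean_Q = sum(Q) / P
    return [f / mean_F for f in F], [q / mean_Q for q in Q]  # (43)-(44)


# Hypothetical nested toy matrix: 4 countries by 4 products.
M_toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
F_toy, Q_toy = fitness_quality(M_toy)
print(F_toy)
print(Q_toy)
```

Because of the inverse sum in the quality update, a single low-fitness producer is enough to pull a product's quality down, which is the discounting of highly complex economies described above.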

4.4 The Model

For short, we refer to an S-sophisticated knowhow, an S-sophisticated product, and a K-sophisticated economy respectively as S-knowhow, S-product, and K-country. We combine the model’s two assumptions into one:

Assumption: Any random combination of $S$ knowhows among a country’s knowhow list corresponds to a product with probability $\theta^S$.

That is, a K-country can explore up to $\binom{K}{S}$ possible S-collections of skillsets, among which only a proportion given by $\theta^S$ are coherent productive skillsets. Thus, a K-country makes $\binom{K}{S} \theta^S$ S-products, and, in total, it makes a number of products:

$$ D = \sum_{S=0}^{K} \binom{K}{S} \theta^S = (1 + \theta)^K. \tag{45} $$
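The binomial sum behind the diversity formula, and its inversion into a count of capabilities, can be checked numerically. In this sketch `theta` denotes the probability parameter of the assumption above, and the function names and numerical values are illustrative.

```python
from math import comb, log


def products_by_sophistication(K, theta):
    """Expected number of S-products made by a K-country, for S = 0..K."""
    return [comb(K, S) * theta ** S for S in range(K + 1)]


def capabilities_from_diversity(D, theta):
    """Invert D = (1 + theta)^K to recover K from the diversity D."""
    return log(D) / log(1 + theta)


K_example, theta_example = 10, 0.5
counts = products_by_sophistication(K_example, theta_example)
D_example = sum(counts)
K_recovered = capabilities_from_diversity(D_example, theta_example)
print(D_example, K_recovered)
```

Setting `theta` to 1 removes the constraint altogether and gives the unconstrained count of 2^K products.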

4.5 Model’s Predictions about the Complexity Algorithms

4.5.1 Model’s Prediction about ECI

As usual, we index an empirical country and product by $c$ and $p$, and we index the theoretical counterparts by $K$ and $S$, respectively, and refer to them as K-country and S-product. A K-country makes $(1+\pi)^K$ products, among which $\binom{K}{S}\pi^S$ are S-products. Thus, the distribution of product sophistication in a K-country is

$$\mathrm{prob}(S \mid K) = \frac{\binom{K}{S}\pi^S}{(1+\pi)^K}, \quad S = 0, \ldots, K. \tag{46}$$

16 See https://en.wikipedia.org/wiki/Harmonic_mean.

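The average product sophistication in a K-country is:

$$
\begin{aligned}
\mathbb{E}(S \mid K) &= \sum_{S=0}^{K} S\,\mathrm{prob}(S \mid K) && \text{(by definition)} \\
&= D^{-1}\sum_{S=1}^{K} S\binom{K}{S}\pi^S \\
&= D^{-1}\sum_{S=1}^{K} S\,\frac{K}{S}\binom{K-1}{S-1}\pi^S && \text{(by a known identity)} \\
&= D^{-1}K\pi\sum_{S=1}^{K}\binom{K-1}{S-1}\pi^{S-1} \\
&= D^{-1}K\pi\sum_{N=0}^{K-1}\binom{K-1}{N}\pi^N && \text{(by setting } N = S-1\text{)} \\
&= D^{-1}K\pi(1+\pi)^{K-1}.
\end{aligned}
$$

That is,

$$\mathbb{E}(S \mid K) = \frac{\pi}{1+\pi}\,K. \tag{47}$$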
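
Result (47) can be checked by computing the mean of the sophistication distribution (46) directly. As before, `pi` is an assumed label for the model's combination probability, and the parameter values are hypothetical.

```python
from math import comb

def sophistication_distribution(K, pi):
    """prob(S | K) for S = 0..K: a binomial-type weight C(K, S) * pi**S / (1 + pi)**K."""
    D = (1 + pi) ** K
    return [comb(K, S) * pi**S / D for S in range(K + 1)]

K, pi = 12, 0.3  # hypothetical parameter values

probs = sophistication_distribution(K, pi)
total = sum(probs)                                  # should be 1
mean_S = sum(S * p for S, p in enumerate(probs))    # average sophistication
predicted = pi / (1 + pi) * K                       # right-hand side of (47)

print(total, mean_S, predicted)
```

The average sophistication is proportional to K, which is the sense in which average product complexity measures a country's knowhow up to scaling.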
This theoretical result justifies the measurement of a country's output complexity by its average product complexity (up to scaling): it explains why ECI works as a measure of knowhow. We can check the extent to which the ECI-PCI algorithm effectively estimates a country's knowhow as follows. Let the country's complexity as estimated by the ECI-PCI algorithm be written (up to norming) as

$$K_c^{(2)} = \mathrm{mean}(S^{(2)} \mid c), \tag{48}$$

where $K_c^{(2)}$ is the $c$th entry of the country complexity eigenvector $k_2$ and $S^{(2)} \mid c$ is the restriction of the product complexity eigenvector $s_2$ to the products made by country $c$. If product complexity $S$ is accurately measured by $s_2$ (which is possible only up to the scale of measurement of sophistication and an error term that should average out), then

$$S^{(2)} = \text{constant} \times S + \text{error}. \tag{49}$$



And if, in addition, the combinatorial model of production is accurate, then the ECI-PCI algorithm yields an $\mathrm{ECI}_c$ that is an estimate of the theoretical counterpart

$$\mathrm{ECI} = \frac{K - \mathbb{E}(K)}{\mathrm{std}(K)} = \frac{\log D - \mathbb{E}(\log D)}{\mathrm{std}(\log D)}. \tag{50}$$
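
The theoretical counterpart in (50) is a standardized log-diversity, so it can be computed from diversity counts alone. The sketch below uses hypothetical diversity values and an assumed value `pi` for the model's combination probability. It also shows that standardizing K = log D / log(1 + pi) gives the same index, since the unknown scale cancels.

```python
import numpy as np

def zscore(x):
    """Standardize a vector: subtract its mean, divide by its standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical product diversities of five countries.
diversity = np.array([5, 20, 80, 320, 1280], dtype=float)

eci_from_log_d = zscore(np.log(diversity))

pi = 0.3  # hypothetical value; it only rescales K
K_hat = np.log(diversity) / np.log(1 + pi)
eci_from_K = zscore(K_hat)

print(eci_from_log_d)
print(eci_from_K)
```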

4.5.2 Model’s Prediction about Fitness


Under the model's predicted correspondence $(F, Q) \leftrightarrow (D, \pi^{-S})$, we have:

$$Q(S) = Q(0)\,\pi^{-S}. \tag{51}$$

Thus $Q_p$ is an empirical version of the theoretical counterpart:

$$Q(S) = Q(0)\,\pi^{-S}. \tag{52}$$

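The model also predicts that the average product quality in a K-country is

$$
\begin{aligned}
\mathbb{E}(Q \mid K) &= Q(0)\sum_{S=0}^{K}\pi^{-S}\,\mathrm{prob}(S \mid K) \\
&= D^{-1}Q(0)\sum_{S=0}^{K}\binom{K}{S}\pi^{-S}\pi^{S}.
\end{aligned}
$$

That is,

$$\mathbb{E}(Q \mid K) = D^{-1}Q(0)\,2^K = Q(0)\left(\frac{2}{1+\pi}\right)^K. \tag{53}$$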

Therefore, the F-Q algorithm produces (up to norming) a fitness index

$$F_c = \mathrm{mean}(M_{cp}Q_p)\,D_c,$$

which is an estimate of the theoretical counterpart

$$F = F(0)\,2^K. \tag{54}$$
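
Predictions (53) and (54) can be checked together. The sketch below assumes that the quality of an S-product is Q(0) times `pi` to the power minus S, writes `pi` for the model's combination probability (an assumed label), and uses hypothetical parameter values.

```python
from math import comb

def mean_quality(K, pi, Q0=1.0):
    """Average of Q(S) = Q0 * pi**(-S) under prob(S | K) = C(K, S) * pi**S / (1 + pi)**K."""
    D = (1 + pi) ** K
    return sum(Q0 * pi ** (-S) * comb(K, S) * pi**S / D for S in range(K + 1))

K, pi, Q0 = 10, 0.3, 1.0  # hypothetical parameter values

D = (1 + pi) ** K
avg_Q = mean_quality(K, pi, Q0)
predicted_avg_Q = Q0 * (2 / (1 + pi)) ** K   # right-hand side of (53)

# Fitness as diversity times average quality, compared with Q0 * 2**K as in (54).
fitness = D * avg_Q
predicted_fitness = Q0 * 2**K

print(avg_Q, predicted_avg_Q, fitness, predicted_fitness)
```

Under these assumptions log-fitness is linear in K, and hence linear in log-diversity.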
References

[1] F. Al-Marhubi, Export diversification and growth: an empirical investigation,
Applied Economics Letters, 7 (2000) 559-562.
[2] S. Lall, The Technological structure and performance of developing country
manufactured exports, 1985‐98, Oxford Development Studies, 28 (2000) 337-369.
[3] D. Herzer, F. Nowak-Lehmann D., What does export diversification do for
growth? An econometric analysis, Applied Economics, 38 (2006) 1825-1838.
[4] S. Lall, J. Weiss, J. Zhang, The “sophistication” of exports: A new trade measure,
World Development, 34 (2006) 222-237.
[5] R. Hausmann, J. Hwang, D. Rodrik, What you export matters, Journal of
Economic Growth, 12 (2007) 1-25.
[6] H. Hesse, Export diversification and economic growth, in, World Bank
Commission on Growth and Development, 2008.
[7] M.L. Weitzman, Recombinant growth, The Quarterly Journal of Economics, 113
(1998) 331-360.
[8] P. Auerswald, S. Kauffman, J. Lobo, K. Shell, The production recipes approach to
modeling technological innovation: An application to learning by doing, Journal of
Economic Dynamics and Control, 24 (2000) 389-450.
[9] R. Hausmann, C.A. Hidalgo, The network structure of economic output, Journal
of Economic Growth, 16 (2011) 309-342.
[10] R.M. Solow, A Contribution to the Theory of Economic Growth, The Quarterly
Journal of Economics, 70 (1956) 65-94.
[11] R.M. Solow, Technical change and the aggregate production function, The
Review of Economics and Statistics, 39 (1957) 312-320.
[12] P.M. Romer, Increasing returns and long-run growth, Journal of Political
Economy, 94 (1986) 1002-1037.
[13] R.E. Lucas Jr, On the mechanics of economic development, Journal of Monetary
Economics, 22 (1988) 3-42.
[14] P. Aghion, P. Howitt, A model of growth through creative destruction,
Econometrica, 60 (1992) 323-351.
[15] G.M. Grossman, E. Helpman, Quality ladders in the theory of growth, The
Review of Economic Studies, 58 (1991) 43-61.
[16] C.A. Hidalgo, R. Hausmann, The building blocks of economic complexity,
Proceedings of the National Academy of Sciences, 106 (2009) 10570-10575.
[17] R. Hausmann, C.A. Hidalgo, S. Bustos, M. Coscia, A. Simoes, The atlas of
economic complexity: Mapping paths to prosperity, MIT Press, Cambridge, MA,
2014.
[18] J.M. Kleinberg, Authoritative sources in a hyperlinked environment, in:
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms,
1998, pp. 668-677.

[19] J.M. Kleinberg, M. Newman, A.-L. Barabási, D.J. Watts, Authoritative sources in
a hyperlinked environment, Princeton University Press, 2011.
[20] A. Tacchella, M. Cristelli, G. Caldarelli, A. Gabrielli, L. Pietronero, A new
metrics for countries' fitness and products' complexity, Scientific Reports, 2 (2012).
[21] G. Caldarelli, M. Cristelli, A. Gabrielli, L. Pietronero, A. Scala, A. Tacchella, A
network analysis of countries’ export flows: firm grounds for the building blocks of
the economy, PLoS ONE, 7 (2012).
[22] F. Battiston, M. Cristelli, A. Tacchella, L. Pietronero, How metrics for economic
complexity are affected by noise, Complexity Economics, 3 (2014) 1-22.
[23] E. Kemp-Benedict, An interpretation and critique of the Method of Reflections,
(2014).
[24] P. Mealy, J.D. Farmer, A. Teytelboym, Interpreting economic complexity,
Science Advances, 5 (2019) eaau1705.
[25] C. Sciarra, G. Chiarotti, L. Ridolfi, F. Laio, Reconciling contrasting views on
economic complexity, Nature Communications, 11 (2020) 1-10.
[26] A. Van Dam, K. Frenken, Variety, complexity and economic development,
Research Policy, (2020).
[27] C.A. Hidalgo, Economic complexity theory and applications, Nature Reviews
Physics, (2021) 1-22.
[28] A.N. Kolmogorov, Combinatorial foundations of information theory and the
calculus of probabilities, Russian Mathematical Surveys, 38 (1983) 29-40.
[29] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical
Journal, 27 (1948) 379-423.
[30] G. Gaulier, S. Zignago, Baci: international trade database at the product-level
(the 1994-2007 version), (2010).
[31] R.C. Feenstra, R.E. Lipsey, H. Deng, A.C. Ma, H. Mo, World trade flows: 1962-
2000, in, National Bureau of Economic Research, 2005.
