World Bank Group On Ethiopia May 2022 IDU08539


Public Disclosure Authorized

Policy Research Working Paper 10000



The Impact of Ethiopia’s Road Investment Program on Economic Development and Land Use
Evidence from Satellite Data

Simon Alder
Kevin Croke
Alice Duhaut
Robert Marty
Ariana Vaisey

Development Economics
Development Impact Evaluation Group
April 2022

Abstract
This paper studies the impacts of the large-scale Road Sector Development Program in Ethiopia between 1997 and 2016 on local economic activity and land cover (urbanization and cropland). It exploits spatial and temporal variation in road upgrades across Ethiopia, together with high-resolution panel data derived from satellite imagery. The findings show that road upgrades contributed to increases in local economic activity, as proxied by nighttime lights and urban land area. However, there is significant heterogeneity in the results across baseline levels of economic activity. Specifically, gains from road upgrades are concentrated in areas with moderate-to-high initial levels of economic activity. By contrast, there was little, or even negative, growth in areas with low levels of initial economic activity. Finally, the findings show that road upgrades contributed to a reduction in cropland in areas with medium-to-high baseline nighttime lights. The results suggest that Ethiopia’s ambitious road infrastructure development program overall increased local economic activity and urbanization, but that it also had important distributional implications that need to be taken into account when planning such infrastructure programs.

This paper is a product of the Development Impact Evaluation Group, Development Economics. It is part of a larger
effort by the World Bank to provide open access to its research and make a contribution to development policy discussions
around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The
authors may be contacted at [email protected].

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team


The Impact of Ethiopia’s Road Investment Program on Economic
Development and Land Use: Evidence from Satellite Data∗
Simon Alder1, Kevin Croke2, Alice Duhaut3, Robert Marty3 and Ariana Vaisey4
1 Swiss National Bank
2 Harvard T.H. Chan School of Public Health
3 World Bank
4 University of Chicago Law School

JEL codes: R4, R12, R14

∗ The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. The views expressed in this paper are those of the authors and do not necessarily reflect those of the SNB. In addition, they do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. This project benefited from UK aid from the UK government. We thank Michael Andrews for his research assistance. Computational reproducibility verified by DIME Analytics.
1 Motivation

Governments invest in roads to foster national integration, promote economic development, and facilitate

trade. Between 1995 and 2015, for example, World Bank-supported transport projects amounted to approximately

161 billion USD.1 Yet, although road investments constitute a major portion of both donor

and government capital expenditure, the economic impact of investments in large-scale road infrastructure

is debated: while the effects of large economic corridors and rural connectivity programs have mostly been

studied in dense, rapidly growing countries in Asia, the impacts of transport investments in lower-density

contexts are less studied. In particular, the potentially heterogeneous benefits of such programs to different

types of areas—and therefore the impact of such programs on growth as well as inequality—are less well

understood.

This paper provides evidence on the links between large-scale investments in roads of different types and

changes in local economic activity and land use (urbanization and cropland). To shed light on this issue, we

study a large program of economic corridors and rural road upgrades that has been ongoing in Ethiopia since

1997: the Road Sector Development Program (RSDP). The scale of the program and the context of Ethiopia

allow us to contribute to the literature in two ways. First, while most of the evidence about the development

impact of rural roads and economic corridors comes from rapidly developing or densely populated countries,

we study the impacts of these investments in a setting where both urbanization and industrialization are

low. Ethiopia’s urbanization rate of 17% is less than half of the Sub-Saharan Africa regional average of 37%

(World Bank, 2015), and at the program’s start, Ethiopia had one of the least-developed road networks and

lowest motor vehicle utilization rates in the world (Adamopoulos, 2018). Second, the transport program we

study includes both large economic corridor upgrades aimed at connecting cities, as well as smaller-scale

rural road improvements. This allows us to examine differential patterns of effects according to the type of

transport investment, as reflected in the second part of our analysis.

The large number of roads upgraded and localities connected over the program period allow us to study

the impact of the program as a whole, and of the different types of investments separately. The overall

objectives of the program were to support the rehabilitation, improvement, and construction of roads with

access to ports (particularly important given that Ethiopia is landlocked), roads that link to major economic

centers and resources, and roads connecting areas with potential for commercial agriculture (Ethiopian Roads

Authority, 2015a).

In this paper, we ask whether road upgrades promote growth in local economic activity and produce changes in land use, particularly urban land and cropland. We use three empirical approaches: (1) a differences-in-differences approach that dynamically uses areas not yet treated as a control group, (2) a long-difference approach that focuses on understanding the impact of road upgrades on incidentally connected localities, and (3) a market access approach.

1 Number extracted from the World Bank projects database, accessible here: https://projects.worldbank.org/en/projects-operations/projects-list?sectorcode_exact=TI

First, we leverage temporal and spatial variation of road upgrades to estimate a differences-in-differences

model, restricting our sample to units that received an upgrade at some point over the study period. In doing

so, we control for unobserved time-invariant heterogeneity in locations, as well as common temporal trends

throughout Ethiopia. Moreover, we recover average treatment effects by length of exposure to improved

roads. Treatment effects in post-treatment periods allow an understanding of how long impacts took to

materialize after improvements, whereas treatment effects in pre-treatment periods allow us to test whether

treatment and control units had parallel trends in outcome variables before treatment.
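To make the length-of-exposure idea concrete, the following is a minimal sketch, not the authors' code, of how event time could be coded for one unit-year observation. The function name `event_time_dummies`, the window of four leads and six lags, and the omitted reference period of -1 are all illustrative assumptions.

```python
def event_time_dummies(year, first_upgrade_year, leads=4, lags=6, reference=-1):
    """Return {relative_year: 0/1} indicators for one unit-year observation.

    Relative year = calendar year minus the unit's first upgrade year.
    Values outside [-leads, lags] are binned into the endpoints, and the
    reference period is omitted so that it serves as the comparison baseline.
    """
    rel = year - first_upgrade_year
    rel = max(-leads, min(lags, rel))  # bin the tails into the end points
    return {k: int(k == rel) for k in range(-leads, lags + 1) if k != reference}


# Hypothetical unit first upgraded in 2005, observed in three different years.
d_2003 = event_time_dummies(2003, 2005)  # two years before treatment
d_2004 = event_time_dummies(2004, 2005)  # reference period: all indicators zero
d_2008 = event_time_dummies(2008, 2005)  # three years after treatment
```

In a regression with unit and year fixed effects, coefficients on the negative relative years would correspond to the pre-trend test, and coefficients on the non-negative ones would trace how effects evolve with exposure.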

Our second empirical approach focuses on understanding the impact of RSDP on areas that incidentally

benefited from the program; that is, on localities that were not targeted by RSDP but were near an upgraded

road that connected two targeted localities. Localities that were incidentally connected (because they happen

to lie between two targeted localities) provide a possible source of exogenous variation in network connections.

We rely on a long-difference framework, comparing non-targeted localities near an upgraded road to those far

from an upgraded road. To further address endogenous road placements, we instrument road upgrades with

a hypothetical road network that represents what would have been constructed if connecting all targeted

locations in a least-cost manner was the chief concern.
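One common way to build a hypothetical least-cost network of this kind is a minimum spanning tree over the targeted localities. The sketch below uses straight-line distance as the cost and Kruskal's algorithm. Both choices, as well as the coordinates, are assumptions for illustration; the construction actually used in the paper may differ (for example, by using terrain-based costs).

```python
import math
from itertools import combinations


def minimum_spanning_tree(points):
    """Kruskal's algorithm on a complete graph with Euclidean edge costs.

    points: dict mapping locality name -> (x, y) coordinates.
    Returns a list of (cost, a, b) edges forming the least-cost tree.
    """
    parent = {name: name for name in points}

    def find(node):
        # Walk up to the root of the component, compressing the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    edges = sorted(
        (math.dist(points[a], points[b]), a, b)
        for a, b in combinations(sorted(points), 2)
    )
    tree = []
    for cost, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so no cycle
            parent[root_a] = root_b
            tree.append((cost, a, b))
    return tree


# Hypothetical targeted localities on a plane (arbitrary distance units).
targets = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 1)}
network = minimum_spanning_tree(targets)
total_cost = sum(cost for cost, _, _ in network)
```

Non-targeted localities lying close to an edge of such a tree would then be the ones predicted to be incidentally connected.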

For our third empirical approach, we construct a standard measure of market access that computes, for

each unit, the size of the total market that can be reached, weighted by the inverse transportation cost of

traveling to different markets. While the previous two approaches consider the treatment as an improvement

of any type, the market access measure allows us to use the changes in speed limits associated with road

upgrades. We use a long-difference strategy to understand how changes in market access are associated with

changes in outcome variables. To uncover the causal impact of market access on outcome variables, we control

for baseline levels of market access, local economic activity, and pre-trends in selected variables. In addition,

to further account for non-random road placement, we instrument market access with a version of market

access that excludes immediately surrounding localities when constructing the market access measure. As

a robustness check on these results, we also implement a panel data model that controls for time-invariant

characteristics.
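A market access measure of this type can be sketched as a sum of destination market sizes discounted by travel cost. In the illustration below, the exponent `theta`, the travel times, the populations, and the exclusion threshold `exclude_within` are placeholders chosen for readability, not values from the paper.

```python
def market_access(origin, travel_time, population, theta=1.0, exclude_within=0.0):
    """Sum of destination market sizes, weighted by inverse travel cost.

    travel_time: dict mapping (origin, destination) -> travel time
    population: dict mapping destination -> market size
    theta: assumed elasticity applied to the travel cost
    exclude_within: destinations closer than this travel time are dropped,
        mimicking an instrument that ignores immediately surrounding localities
    """
    total = 0.0
    for dest, size in population.items():
        if dest == origin:
            continue  # a unit's own market is not counted
        cost = travel_time[(origin, dest)]
        if cost < exclude_within:
            continue
        total += size * cost ** (-theta)
    return total


# Hypothetical three-locality example; times are in hours.
population = {"a": 100, "b": 200, "c": 400}
before = {("a", "b"): 2.0, ("a", "c"): 4.0}
after = {("a", "b"): 2.0, ("a", "c"): 2.0}  # upgrade halves travel time to c

ma_before = market_access("a", before, population)
ma_after = market_access("a", after, population)
ma_instrument = market_access("a", before, population, exclude_within=3.0)
```

Because an upgrade raises the speed on a segment, it lowers travel times along routes that use it, and so changes the measure even for units that are not directly on the upgraded road.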

These different empirical approaches each come with their own set of advantages and limitations. For

example, the difference-in-difference approach enables us to leverage the granular nature of both the satellite

imagery and the road data and to test whether locations were endogenously targeted based on pre-trends.

However, to temporally fix the time of “treatment,” this approach considers treatment to begin the first year

an area benefited from RSDP. Yet, many areas benefited from RSDP road improvements in multiple years.

Long-difference approaches alleviate this concern by examining aggregate changes in the road network, but

they require stronger assumptions for causal interpretations to be valid. Consequently, we emphasize findings

that appear consistent across the multiple approaches.

Ethiopia’s RSDP is now in its fifth phase, with each phase focusing on different priorities. The first three

phases (1997–2010) focused on improving trunk roads and regional roads, while the fourth and fifth phases

(2010–2020) focused on rural roads to support the development of commercial agriculture. The vast majority

of upgrades took place in the fourth and fifth phases (Figure 1). Using data on road upgrades from 1997

to 2016, our analysis focuses on RSDP phases I-IV. Aggregating across phases, we find that road upgrades

contributed to increases in local economic activity (proxied by nighttime lights) and urbanization. However,

the aggregate results mask heterogeneity in results across baseline levels of economic activity. Specifically,

while areas with moderate to high initial economic activity benefited the most from the upgrades, we have

mixed evidence for areas with low initial economic activity. The differences-in-differences and long-difference

models show that these areas either did not benefit or only saw small gains from upgrades, and market access

results show these areas either did not benefit or even saw a reduction in outcomes due to upgrades. Lastly,

we find that road upgrades contributed to a reduction in cropland in areas with moderate to high initial

economic activity; these cropland areas largely transitioned to urban land.

We contribute to multiple strands of the literature. First, we contribute to a young literature on the effects

of transportation infrastructure on economic development in Ethiopia. Several recent papers have examined

the effect of Ethiopia’s RSDP on firm-level outcomes. For example, Fiorini et al. (2021) focus on the role

of road infrastructure on the effect of trade. Using the Ethiopia manufacturing census from 1998 to 2009 to

construct measures of firm productivity, they find that better local road infrastructure increases the benefits

from lowering tariffs on imports and exports. In a related paper, Fiorini and Sanfilippo (2019) find that

road access via RSDP increases overall employment, reduces agricultural employment, and increases service

sector (although not manufacturing) employment. Using the same manufacturing census data, Shiferaw

et al. (2015) estimate the relationship between RSDP road investments and firm performance, finding that

road quality is linked to firm entry. Moneke (2019) studies the effects of roads and electricity and finds

that roads alone had smaller welfare effects compared to roads combined with electrification. He uses an

“inconsequential units” identification strategy based on colonial-era straight-line road plans, using units

that were not directly of interest for the colonizer to identify the impact of the program. Relying on land

use and economic activity as outcome variables, we similarly find notable spatial heterogeneity in outcomes.

Adamopoulos (2018) studies the impact of RSDP on agricultural productivity in Ethiopia, using a panel from

1996-2014 at the medium administrative level (Woreda) and estimated travel times to agricultural markets.

He calculates that RSDP reduced transport costs and increased Woreda-level average yields. Kebede (2021)

uses village-level data and finds that rural roads on average increased real agricultural income. Dercon et al.

(2009) study a panel of 15 agricultural villages over a decade (1994–2004) and find that access to all-weather

roads reduces poverty by 6.9 percentage points and also increases consumption growth markedly (by 16.3

percentage points). Atkin and Donaldson (2015) study trade costs in Ethiopia and Nigeria and find that

the effect of distance on trade costs is about four times larger than in the United States. We add to this

literature a focus on land use: beyond nighttime luminosity, we also use a recently released landcover panel

dataset that allows us to track landcover at an annual frequency.2 This allows us to go beyond the traditional

use of nighttime lights as proxy for economic activity and also focus on the quantity of land allocated to

agriculture or farm production (cropland area) versus urban uses throughout the country.

Second, we contribute to the broad literature on the effects of transportation infrastructure.3 Our

analysis is related to that of Mitnik et al. (2018), who use nighttime lights as an outcome with a two-way

fixed effects approach and find that road rehabilitation in Haiti from 2004 to 2013 leads to increases of

6–26% in luminosity (corresponding to 0.5–2.1% increases in GDP). Similarly, BenYishay et al. (2018)

evaluate road upgrades in the West Bank using panel data and find a 0.341 increase in absolute luminosity;

given that baseline nighttime lights was 2.6, a 0.341 increase represents a significant increase. Storeygard

(2016) constructs a panel of Sub-Saharan African cities using nighttime lights data and studies the effect

of trade costs on economic activity. Asher and Novosad (2020) find effects of rural roads on non-farm

employment but not on agricultural outcomes, income, or assets. Asher et al. (2020) estimate the effect

of road construction on deforestation using satellite-based measures of forest cover. Faber (2014) finds

that a national highway program in China aimed at connecting provincial capitals caused a reduction in

GDP growth in small peripheral counties that the highways passed through. We add to this literature

on transportation infrastructure investments by using satellite imagery to study development in a context

where urbanization rates are low and where we observe a large-scale national investment program that was

gradually rolled out over regions and time.

Third, we also contribute to the urban and regional economics literature on land use more generally.4 For example, Deng et al. (2008) use high-resolution satellite imagery data to study urban expansion in China. This literature often builds on the monocentric city model, where the land use in and around a central business district depends on transportation costs. Michaels et al. (2012) consider a model of urbanization and structural transformation where land is allocated to residential, agricultural, or non-agricultural land use. Motamed et al. (2014) show that the transition from rural to urban activity depends on trade costs as measured by access to water transportation. Christensen and McCord (2016) study geographic determinants of urbanization in China and find heterogeneous effects. Our contribution to this literature is to directly document the effect of transportation infrastructure investments on agricultural and urban land use based on high-resolution satellite imagery and detailed data on a national road expansion program.

2 The data is publicly available at https://www.esa-landcover-cci.org/?q=node/175.
3 See Redding and Turner (2014) and Donaldson (2015) for surveys of the literature.
4 See Duranton and Puga (2015) for an overview of theoretical and empirical literature on urban land use.

The paper is structured as follows. Section 2 provides background information on RSDP and growth in

Ethiopia. Section 3 presents the data. Section 4 discusses the empirical strategy. Section 5 presents the

results, and Section 6 concludes.

2 Background

2.1 The Road Sector Development Program

As of the late 1990s, the state of Ethiopia’s road network was poor. Most areas in the country were not

connected to economic centers, which isolated areas from markets and social services, and roads that existed

were deteriorating. The Government of Ethiopia recognized that the poor state of the network impeded

economic growth; consequently, it launched the RSDP in 1997.

This program has gone through four phases and is currently in its fifth phase. The different phases

had different objectives: RSDP I (1997–2002) focused mainly on rehabilitating and upgrading federal-level

trunk roads; RSDP II (2002–2007) and III (2007-2010) expanded the program’s focus to upgrading and

building new link and regional roads; RSDP IV (2010–2015) introduced the Universal Rural Road Access

Plan (URRAP), shifting the program’s focus to rural and community roads while maintaining investment

at other levels; and RSDP V (2015–present) shifted focus back to the federal level, prioritizing construction,

upgrading, and heavy maintenance of trunk and link roads (Ethiopian Roads Authority, 2015a). Phases I–III

saw investments in 32,693km of roads combined, Phase IV saw investments in 85,860km of roads, and the

first year of Phase V saw 9,917km of roads improved. Figure 1a shows improvements in the road network

over time; the bulk of road improvements were on 20km/h URRAP roads during Phase IV.

Figure 1: Improvement in Road Network. (a) Road network improvements by speed (km/h). (b) Improvement in road network by province.

Overall, in the first 19 years of RSDP (1997–2016), 17.4 billion USD was invested across 128,470km of roads (Ethiopian Roads Authority, 2015a). The road development was categorized across federal roads (74.7% of total expenditure), regional roads (12.1%), Woreda/community roads (12.9%), and urban roads (0.3%). Construction of new federal and regional roads comprised 40.8% of total expenditure. During this time, the road network grew 326%, increasing from 26,550km in 1997 to 113,066km in 2016. Of the roads constructed, 41.4% were built during RSDP IV, between 2010 and 2015 (Ethiopian Roads Authority, 2015a). Overall, the program has led to notable improvements in road access. The Ethiopian Roads Authority (ERA) reports that the average distance to an all-weather road dropped from 21km to 4.9km, the proportion of the road network deemed in good condition grew from 22% to 72%, and the percentage of Kebeles (Ethiopia’s fourth and smallest administrative division; there are over 15,000 Kebeles in Ethiopia) connected to an all-weather road grew from 40% to 76% (Ethiopian Roads Authority, 2015a). In addition, Adamopoulos (2018) shows that travel times from Woredas to grain markets fell by over two hours from 1996 to 2014.
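As a quick consistency check on the quantities quoted above, the short calculation below re-derives the total road length from the phase-level figures and the network growth rate from the 1997 and 2016 network lengths. It uses only numbers stated in the text; the variable names are illustrative.

```python
# Road lengths as reported in the text, in km.
phase_km = {"I-III": 32_693, "IV": 85_860, "V (first year)": 9_917}
total_km = sum(phase_km.values())

network_1997 = 26_550
network_2016 = 113_066
growth_pct = (network_2016 - network_1997) / network_1997 * 100

# Share of all improved road length that falls in Phase IV.
phase_iv_share_pct = phase_km["IV"] / total_km * 100

print(total_km)           # 128470
print(round(growth_pct))  # 326
```

The phase totals sum to the 128,470km reported for the program as a whole, and the implied network growth matches the 326% figure. On these numbers, Phase IV accounts for roughly two-thirds of the improved road length.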

The allocation of roads to provinces followed the priorities set for the different phases and the relative size

of the road network in different provinces, as shown in Figure 1b. For example, Ethiopia’s largest province

of Oromia, which surrounds the capital Addis Ababa, benefited greatly from Phase IV of the program, with

large investments in highways. The Southern Nations, Nationalities, and Peoples’ Region—the largely rural,

third-most populous province and home to up to 45% of Ethiopia’s coffee production—particularly benefited

from investments in the first two phases of RSDP. Amhara, home to a large share of Ethiopia’s cattle and

the second-largest province in terms of population, saw the same investment pattern. Figure 1b shows that

regions tended to benefit equally from RSDP relative to their size. In addition, Appendix S1 shows that

Woredas across different levels of baseline nighttime lights saw relatively similar trends in the length of road

improved, with the main difference being that Woredas with the highest baseline nighttime lights saw most

improvement from larger roads. Overall, Woredas across different levels of initial economic activity saw

notable road upgrades.

2.2 Economic and Urban Growth

Along with the continued development of the Ethiopian road network, the various phases of RSDP coincided

with rapid urban population and economic growth. Ethiopia’s economy has been one of the fastest-growing

in the region, experiencing an average of 8.1% annual growth between 1996 and 2015, compared to a 4.6%

annual growth rate for Sub-Saharan Africa. However, it has been challenged by high inflation rates between

2008 and 2010 that led to local currency depreciation, which in turn increased fuel and transportation costs

(World Bank Group, 2012). Spurred by an average of 2.2 billion USD in official development assistance

per year between 1996 and 2015, Ethiopia has focused its economic development strategy on education,

energy, and roads projects—such as RSDP (World Bank Group, 2012). Ozlu et al. (2015) note that despite

only 17.3% of Ethiopia’s population residing in urban areas in 2012, the urban population is one of the

fastest-growing in the world. This growth rate was estimated to be around 3.8% per year during the 2010s.

Schmidt et al. (2018) identified the drivers of this rapid urbanization as road infrastructure, secondary city

development, and rural-to-urban migration.

Addis Ababa has been the main destination of this rural-to-urban migration, attracting 40% of the total

rural-to-urban migration between 2008 and 2013 (Bundervoet, 2018). However, despite the high percentage

of migration flowing to Addis Ababa, Bundervoet (2018) notes that smaller cities and towns were also the

focus of rural migration. This demonstrates not only that major urban centers have rapidly urbanized during

the RSDP period, but that population is also growing in smaller cities and rural townships.

3 Data

This paper relies on three types of geospatial data for our treatment variables and the three outcomes of

interest (local economic activity, urban land, and cropland). First, we rely on digitized road improvement

data shared by the ERA to capture the evolution of RSDP. Second, we rely on nighttime lights to capture the

evening use of light and electricity as a proxy for local economic activity. Third, we rely on annual landcover

data from the European Space Agency (ESA) GlobCover program to capture the evolution of the surface

area of urban land and cropland. The following four sections describe the data sources (Section 3.1); how

data are aggregated to different units of analysis (Section 3.2); how units are separated into different groups

of economic activity for exploring heterogeneity of results (Section 3.3); and, lastly, descriptive

statistics for the area and period of interest (Section 3.4).

3.1 Data Sources

The road data covers the stock of existing roads and their improvement between 1996 and 2016 annually,

capturing RSDP phases I-IV. Three sources of data went into constructing the dataset (created and provided

by the ERA): (1) federal road data from ERA, which includes dates of budget disbursements towards

RSDP; (2) regional road network studies implemented at various times within selected regions that captured

the inventory and conditions of roads; and (3) a nationwide road network inventory and condition survey

implemented in 2014 and 2015. Speed limits were assigned to each road segment based on the pavement

type and condition of the road, and road improvements are indicated by an increase in the speed limit (see

Appendix S2 for a table showing ERA’s classification for road types, conditions, and speeds).

Our first outcome variable is nighttime lights, which has been shown to be a strong predictor of local

economic activity (Donaldson and Storeygard, 2016; Sutton and Costanza, 2002; Doll et al., 2006; Ghosh

et al., 2013), local GDP (Henderson et al., 2012), and measures of welfare such as local levels of asset

wealth (Weidmann and Schutte, 2017). Nighttime lights data come from two data sources: the Defense

Meteorological Satellite Program Operational Linescan System (DMSP-OLS) and the Visible Infrared Imaging

Radiometer Suite (VIIRS). Both sources are made available by the National Oceanic and Atmospheric

Administration (NOAA). We rely on both sources to capture the full study period from 1996 to 2016;

DMSP-OLS is available from 1992 to 2013 at a roughly 1km resolution, and VIIRS is available from 2012 to

the present at a roughly 750m resolution. Two challenges must be addressed to construct a consistent time

series based on the two nighttime lights data sources. First, DMSP-OLS data is captured over time using

six different satellites; differences in the satellites, including the sensors and time of orbit, have resulted in

significant differences in light values captured from different satellites across years (Wu et al., 2013; Zhang

et al., 2016). Second, there are notable differences between DMSP-OLS and VIIRS data: VIIRS has a

finer resolution, can capture nighttime lights at both lower and higher magnitudes, and suffers less from

“blooming,” whereby lights in one pixel impact surrounding pixels (Elvidge et al., 2013). We therefore

rely on a dataset from Li et al. (2020), who address both issues to develop a harmonized global nighttime

lights dataset from 1992 to 2018; Li et al. (2020) use a stepwise calibration approach to create a consistent

DMSP-OLS dataset.5

One key difference between the calibrated DMSP-OLS data and the simulated DMSP-OLS-like data (i.e., VIIRS data simulated to be like DMSP-OLS) is that the DMSP-OLS-like data captures small values of nighttime lights where the DMSP-OLS data observes no light (see Appendix S3 for a comparison of DMSP-OLS data and DMSP-OLS-like data). This difference results from VIIRS’s ability to capture low-level light that DMSP-OLS cannot capture; when simulating DMSP-OLS-like data from VIIRS data, some of the low-level light is retained. However, these differences should not have a large impact on results. First, these differences are concentrated in low-lit regions; Appendix S4 shows that average nighttime light trends are relatively consistent across DMSP-OLS and DMSP-OLS-like data. Second, as the empirical strategy section later describes, we control for factors common to all units in a given time period, such as a difference in the satellite used to capture nighttime lights.

5 The stepwise calibration approach refers to a four-step process, where DMSP-OLS satellites are individually calibrated to match earlier years; the process starts with earlier satellites and moves sequentially to later satellites. The method produces nighttime lights trends that more accurately track trends in electricity consumption compared to DMSP-OLS calibration methods developed by Zhang et al. (2016) and Elvidge et al. (2014). See Li and Zhou (2017) for details on how the dataset is created.

While nighttime lights are now a staple of the economic literature on roads programs, including in Africa

(Donaldson and Storeygard, 2016), the GlobCover data is a more recent addition to the toolbox. This dataset

is produced by the European Space Agency (ESA), and is available annually from 1992 to 2018 at 300m

resolution (European Space Agency, 2017, 2019). It classifies each pixel into one of the 36 land cover classes

defined in the United Nations Land Cover Classification System. From this dataset, we use two land cover

classes. First, we use land classified as urban, defined as 300m pixels covered mostly (>50%) by artificial

surfaces and associated areas. Second, we create a cropland category, defined as 300m pixels where over 50%

of the pixel is cropland (this aggregates three cropland categories in the underlying data).

3.2 Aggregating Data

We rely on Kebeles (Ethiopia’s smallest administrative unit) as our primary unit of analysis. We use a dataset

that includes the boundaries of Kebeles for all regions in Ethiopia except Somali Region (15,670 Kebeles).6

For Somali Region, we rely on boundaries of administrative units one level above Kebeles: Woredas (N=44).

However, for simplicity, we refer to the units of the combined dataset as “Kebeles” throughout the paper.

Kebeles are small enough to capture granular changes in the road network; 15,714 Kebeles cover Ethiopia,

and the average size of a Kebele is 70km² (the median size is 24km²). Appendix S5 shows a map of Kebeles.

When aggregating data to Kebeles, we take the average values of nighttime lights and the total number of

urban and cropland pixels.
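The aggregation rule described above (mean nighttime lights, and counts of urban and cropland pixels, per Kebele) can be sketched as follows. The input layout, with each pixel already tagged with the Kebele it falls in, and all names are assumptions made for illustration; in practice the pixel-to-Kebele assignment would come from a spatial overlay.

```python
from collections import defaultdict


def aggregate_to_kebeles(ntl_pixels, landcover_pixels):
    """Collapse pixel-level records to one row per Kebele.

    ntl_pixels: iterable of (kebele_id, nighttime_lights_value)
    landcover_pixels: iterable of (kebele_id, label), where label is
        'urban', 'cropland', or anything else for other land cover
    """
    ntl_sum = defaultdict(float)
    ntl_n = defaultdict(int)
    counts = defaultdict(lambda: {"urban": 0, "cropland": 0})

    for kebele, value in ntl_pixels:
        ntl_sum[kebele] += value
        ntl_n[kebele] += 1
    for kebele, label in landcover_pixels:
        entry = counts[kebele]  # creates the entry even if the label is 'other'
        if label in entry:
            entry[label] += 1

    result = {}
    for kebele in set(ntl_n) | set(counts):
        n = ntl_n[kebele]
        result[kebele] = {
            "ntl_mean": ntl_sum[kebele] / n if n else None,
            "urban_pixels": counts[kebele]["urban"],
            "cropland_pixels": counts[kebele]["cropland"],
        }
    return result


# Hypothetical pixels for two Kebeles.
ntl = [("K1", 0.0), ("K1", 6.0), ("K2", 0.0)]
landcover = [("K1", "urban"), ("K1", "cropland"), ("K1", "cropland"), ("K2", "other")]
kebeles = aggregate_to_kebeles(ntl, landcover)
```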

To test the sensitivity of results to the unit of analysis, when relevant, we also estimate models using 1x1km pixels, the original resolution of DMSP-OLS data (1.1 million 1x1km pixels cover Ethiopia). As GlobCover landcover data are captured at a 300m resolution, we aggregate cropland and urban to the 1x1km pixels by creating binary variables indicating whether any portion of the 1x1km pixel contains urban or cropland area. However, we use Kebeles as the primary unit of analysis because they represent a more policy-relevant unit than pixels.

6 The Kebele shapefile is available at https://ethiopia.africageoportal.com/datasets/africageoportal::kebeles-level-4/about

For continuous outcome variables (nighttime lights at the 1km grid and Kebele level, and cropland

and urban land at the Kebele level), we transform the variables using the inverse hyperbolic sine (IHS)

transformation, which is defined as $Y_i = \log\left(y_i + \sqrt{y_i^2 + 1}\right)$. The interpretation is equivalent to that of

logarithms. The transformation is defined at zero and provides a way to use a transformation that is suited

for cases where there are a large number of zeros (Mitnik et al., 2018).
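The two properties mentioned above, that the IHS transformation is defined at zero and behaves like a logarithm, can be checked numerically. The sketch below writes the transformation out explicitly and compares it with Python's built-in `math.asinh`; the function name `ihs` is illustrative.

```python
import math


def ihs(y):
    """Inverse hyperbolic sine, written out as log(y + sqrt(y^2 + 1))."""
    return math.log(y + math.sqrt(y * y + 1.0))


at_zero = ihs(0.0)                                # defined at zero, unlike log(0)
matches_builtin = abs(ihs(5.0) - math.asinh(5.0)) # same function as math.asinh
gap_to_log = ihs(1000.0) - math.log(2 * 1000.0)   # approaches log(2y) for large y
```

For large values the transformation differs from the logarithm only by an approximately constant shift of log(2), which is why differences in IHS-transformed outcomes read like log differences. The approximation is weaker for values close to zero.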

3.3 Separating Units by Baseline Economic Activity

Across analyses, we explore heterogeneity of results according to baseline levels of economic activity. We

group units (1x1km pixels and Kebeles) according to the baseline level of nighttime lights of the Woreda in

which the unit is located. We divide Woredas into four groups. The first group consists of Woredas that

registered no nighttime lights at baseline. We then divide Woredas that registered nighttime lights into

three similarly sized groups using 33rd and 66th percentile cut-offs, based on the maximum nighttime lights

value recorded within the Woreda. There are 557 dark Woredas, 74 Woredas with low light (max. 1–5 NTL

value), 78 Woredas with medium light (max. 6–8 NTL value), and 71 Woredas with high light (max. > 8

NTL value). Appendix S4, which presents trends in dependent variables for Woredas in each category, shows

that Woredas in each category experienced growth in nighttime lights and urbanization. Using the baseline

value of nighttime lights at the Woreda level—not the level of the individual unit (e.g., Kebele)—ensures

that two nearby units from the same town or city are not separated into different analyses. For example, the

approach ensures that a brightly lit pixel in a city’s center and a dimly lit pixel in the same city’s periphery

are included in the same group. To test the sensitivity of these groupings, we also divide Woredas into two

groups: those that had zero baseline nighttime lights, and those that had some positive baseline lights.
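
The grouping rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Woreda names, light values, the Kebele-to-Woreda mapping, and the choice of `method="inclusive"` for the percentile cut-offs are all hypothetical.

```python
from statistics import quantiles

def assign_baseline_groups(max_ntl_by_woreda: dict[str, float]) -> dict[str, str]:
    """Label each Woreda dark/low/medium/high from its maximum baseline NTL."""
    lit_values = [v for v in max_ntl_by_woreda.values() if v > 0]
    # Two cut points that split the lit Woredas into three similarly sized groups.
    low_cut, high_cut = quantiles(lit_values, n=3, method="inclusive")
    groups = {}
    for woreda, value in max_ntl_by_woreda.items():
        if value == 0:
            groups[woreda] = "dark"
        elif value <= low_cut:
            groups[woreda] = "low"
        elif value <= high_cut:
            groups[woreda] = "medium"
        else:
            groups[woreda] = "high"
    return groups

# Hypothetical Woredas and maximum baseline nighttime light values.
max_ntl = {"A": 0, "B": 0, "C": 2, "D": 4, "E": 6, "F": 7, "G": 9, "H": 30}
woreda_group = assign_baseline_groups(max_ntl)

# Each unit (Kebele or pixel) inherits the group of the Woreda containing it,
# so a bright city center and its dim periphery land in the same group.
kebele_to_woreda = {"k1": "H", "k2": "H", "k3": "A"}
kebele_group = {k: woreda_group[w] for k, w in kebele_to_woreda.items()}
```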

3.4 Descriptive Statistics

Table 1 shows descriptive statistics of outcome variables at the Kebele level, and Figure 2 shows maps

of dependent variables and road upgrades. Nighttime lights and urban land are heavily concentrated in

Ethiopia; most of the country is neither lit nor settled. However, during the study period, Ethiopia

experienced significant growth in both urban area and the intensity of lit pixels. The number of Kebeles with

some positive nighttime lights increased 103% from 1996 to 2013 and further increased by 406% from 2013

to 2016 (from 1,278 to 6,472; see Table 1).⁷ In addition, the number of Kebeles with some urban land grew by 96% (from 542 to 1,062 out

of 15,714). In contrast to nighttime lights and urban area, much of Ethiopia is classified as cropland, and

cropland experienced less change over time. For example, 15,081 Kebeles (95.9%) contained some cropland

in 1996 and 15,045 (95.7%) contained some cropland in 2016, a 0.2 percentage point change.

7 Much of the increase from 2013 to 2016 is likely driven by a change in data source: the 2013 values use DMSP-OLS data, whereas the 2016 values use DMSP-OLS-like data, which are derived from VIIRS and better capture low-level light.

When considering only areas near (within 5km of) an improved road, urban land and luminosity are

also heavily concentrated. Table 1 shows that out of the 14,933 Kebeles near an improved road, 626 (4%)

registered positive nighttime light values at baseline, and this value grew to 1,267 by 2013 (a 102% increase)

and further grew to 6,214 by 2016 (a 390% increase from 2013); however, much of that increase may be due to

differences in underlying nighttime lights data (2013 uses DMSP-OLS data and 2016 uses DMSP-OLS-like

data, which better captures low values of nighttime lights). Urban area saw similar growth over time, from

520 Kebeles with some urban area to 1,035 (a 99% increase). While a small proportion of Kebeles near an

improved road contain lit or urban cells, a high proportion of lit or urban cells are near an improved road.

Table 1: Summary Statistics of Outcome Variables Across Kebeles

                                         All                      Treated Areas                 Control Areas
                                                           (Within 5km of Improved Road)       (>5km of Improved Road)
Variable                          1996      2013      2016      1996      2013      2016      1996      2013      2016
Average across Kebeles
Avg. NTL                          0.35      0.57      1.88      0.37       0.6       1.9      0.01      0.02      1.45
Avg. Urban Area (km²)            0.072     0.156     0.189     0.066     0.156     0.189     0.159     0.207     0.225
Avg. Cropland Area (km²)        40.365    40.854    40.557     40.86    41.334    41.031    30.918    31.647    31.512
N Kebeles with Positive Value
NTL                                628     1,278     6,472       626     1,267     6,214         2        11       258
Urban                              542       910     1,062       520       884     1,035        22        26        27
Cropland                        15,081    15,044    15,045    14,368    14,333    14,332       713       711       713
N                                         15,714                        14,933                           781

Nighttime lights data come from Li et al. (2020), who provide an intercalibrated nighttime lights dataset derived from
DMSP-OLS and VIIRS from 1992 to 2018; DMSP-OLS data are used for 1992–2013, and VIIRS is used to simulate
DMSP-OLS-like data from 2014 to 2018. Urban and cropland categories come from the ESA Globcover dataset. The table
shows values in 1996 (first year of road data available), 2013 (last year of DMSP-OLS data available), and 2016 (last year of
road data available, and where VIIRS is used to simulate DMSP-OLS-like data).

4 Empirical Strategy

As shown in the previous two sections, Ethiopia experienced substantial changes in its road network (Figure

1) and in economic activity (Table 1). In this section, we exploit the temporal and spatial variation in roads

and economic activity to estimate the effect of road upgrades.

We rely on three complementary empirical approaches to estimate the impact of road upgrades on

economic activity and changes in land cover (urban land and cropland). First, we rely on a difference-in-

difference design that focuses on the impact of road upgrades on surrounding areas. Second, we rely on a

long-difference approach that focuses on the benefits of the program for incidentally treated areas compared

to the areas that did not benefit from the program. Third, we examine how improvements in market access

due to road upgrades contributed to changes in economic activity and land cover.

Figure 2: Nighttime lights, urban area, cropland area, and road improvements

4.1 Differences-in-Differences

First, we implement a differences-in-differences approach to look at the short-term effects of the road

upgrades. We use the method proposed by Callaway and Sant’Anna (2021), which is designed for applications

with multiple time periods and where the timing of treatment varies across units. Here, the control group

changes dynamically and is composed of not-yet-treated units. Average treatment effects are computed for

each group and each time period; in our setting, the treated “group” corresponds to units that were near

roads improved in the same year, while the control groups are units yet to be treated. Group-time average

treatment effects are estimated using a doubly robust approach, which uses weighted least squares for

estimating regressions and inverse probability tilting for estimating propensity scores (Callaway and Sant’Anna,

2021; Sant’Anna and Zhao, 2020). We rely on an unconditional difference-in-differences approach.
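
To make the group-time comparison concrete, the sketch below computes an unconditional group-time average treatment effect with not-yet-treated controls on a tiny hypothetical panel. It is a simplified illustration of the idea, not the Callaway and Sant’Anna estimator as implemented in any package: it omits the doubly robust weighting, inference, and pre-treatment periods. The unit names, years, and outcome values are invented, and treating never-treated units as part of the not-yet-treated pool is an assumption of the sketch.

```python
from statistics import mean

# Hypothetical panel: unit -> (first treatment year or None, {year: outcome}).
panel = {
    "u1": (2001, {2000: 1.0, 2001: 1.5, 2002: 2.0, 2003: 2.5}),
    "u2": (2001, {2000: 2.0, 2001: 2.7, 2002: 3.0, 2003: 3.5}),
    "u3": (2003, {2000: 1.0, 2001: 1.1, 2002: 1.2, 2003: 2.0}),
    "u4": (None, {2000: 0.5, 2001: 0.7, 2002: 0.9, 2003: 1.1}),
}

def att_gt(panel, g, t):
    """Unconditional ATT for the cohort first treated in year g, at year t >= g.

    Controls are units not yet treated by year t (treated later or never).
    Both groups are differenced against the last pre-treatment year, g - 1.
    """
    base = g - 1
    treated = [y[t] - y[base] for first, y in panel.values() if first == g]
    controls = [
        y[t] - y[base]
        for first, y in panel.values()
        if first is None or first > t
    ]
    return mean(treated) - mean(controls)

# Effects for the 2001 cohort by years of exposure (0, 1, 2).
effects_by_exposure = {t - 2001: att_gt(panel, 2001, t) for t in (2001, 2002, 2003)}
```

Note how the control pool shrinks over time: unit `u3` serves as a control for the 2001 cohort in 2001 and 2002 but drops out in 2003, when it is itself treated.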

Following the method proposed by Callaway and Sant’Anna (2021), we recover average treatment effects

by length of exposure to improved roads, which generates coefficients for both before and after treatment.

Treatment effects in post-treatment periods allow an understanding of how long impacts took to materialize

after improvements, while treatment effects in pre-treatment periods allow testing for whether treatment

and control units had parallel trends in outcome variables before treatment. This strategy is valid as long

as the trends are parallel for each pair of control and treated groups and

the timing of road construction or upgrades is uncorrelated with the outcomes of

interest. To assess potential violations of these assumptions, we examine the pre-trends in our data

in the results section. Specifically, a primary threat to our identification would arise if other factors affected localities

near upgraded roads with the same timing as RSDP. However, given the significant variation in upgraded

roads across time and geographic areas, this risk is relatively small. To this point, we find that another

large program—a national electrification program—that was implemented at a similar time as RSDP was

rolled out differently. In 2005, Ethiopia initiated the Universal Electricity Access Program (UEAP). At the

start of the program, approximately 6% of the population was connected to electricity, and only about 15%

of the population lived in electrified areas (i.e., areas with some form of electricity supply for residences

and businesses). The program, as of 2015, reportedly added connections to over 5,000 towns and villages,

connecting 60% of towns and villages to the grid (MoWIE, 2019). The program added a large share of

these connections in urban areas, mainly Addis Ababa (World Bank Group, Independent Evaluation Group,

2015). The timing of UEAP overlapped with part of RSDP. However, areas targeted by the program differed

(see Appendix S6, which compares roads improved through RSDP and new grid lines from the Electricity

Access Rural Expansion Project—which was a key component of UEAP).

Moreover, we find that road projects were often affected by idiosyncratic delays in road construction.

Consequently, the timing of actual road completion at the regional level within a phase should be as good as

random. Delays in planned road construction were reported in RSDP and the Universal Rural Roads Access

Program (URRAP) due to gaps in project handover and quality issues in locally assembled construction

equipment. These delays are reflected in the road construction accomplishment rates, per RSDP.

Our base model includes all road types and units. We explore heterogeneity along two dimensions: initial

level of economic activity and road type. To examine differences across initial levels of economic activity, we

separate units according to baseline nighttime lights. To examine differences across road type, we separate

roads by whether the road’s speed limit was above or below 50km/h after improvement (50km/h and above

corresponds to federal roads, highways, and expressways) and estimate the baseline model separately for each

case. While this approach is indicative of the differential effects of different types of investments, different

road types are likely to be affected by selection into treatment, such that these results are mainly indicative

of correlation. To explore sensitivity to Addis Ababa-driven effects, we also estimate models excluding units

within 100km of Addis Ababa. To check the robustness of results using this specification, we also estimate

results using a two-way fixed effects model (see appendix S7).

4.2 Long Difference Estimation

Second, we estimate the effects of RSDP from its inception to its end, comparing areas that were included in

the program and those that were not. For this analysis, we account for selection issues, which arise from the

fact that RSDP was targeted to areas based on specific criteria. For Phases I–III, regional states submitted

proposals to ERA, which then evaluated the proposals against five selection criteria and funding constraints

(Shiferaw et al., 2015). Selection criteria for new roads placed a 40% weight on economic potential (20%

for areas with economic development potential and another 20% for areas with surplus food and cash crop

orientation), and another 40% focused on reducing inequities in road accessibility.⁸

To address the selection issue, we focus on areas that were incidentally connected, which provides a

source of exogenous variation in network connections. This approach is common in the literature on transport

networks (Michaels, 2008; Datta, 2012; Faber, 2014; Banerjee et al., 2020). Our strategy largely follows Faber

(2014), who examines the impact of a highway system in China designed to connect provincial capitals with

larger cities. The key idea is that areas that happened to lie between targeted cities were incidentally, rather

than purposefully, connected, generating quasi-random variation in road access in the sample that excludes

the targeted cities.


8 See Shiferaw et al. (2015) for a list of selection criteria and the process for identifying road projects.

For this analysis, we focus only on RSDP I–III, which concentrated on trunk and link roads. In RSDP

IV, Ethiopia began URRAP, which had the objective of connecting all of Ethiopia’s Kebeles to the nearest

all-weather road (Ethiopian Roads Authority, 2015b; Iimi et al., 2018). As of 2016, the latest year in our

roads dataset during RSDP IV, not all Kebeles were connected. However, it does not make sense to include

RSDP IV when using this instrumental variable strategy, as all Kebeles were targeted for treatment as part

of URRAP, and this analysis focuses only on non-targeted treated areas.

We rely on RSDP road improvement data to identify areas targeted by RSDP. The dataset is organized

by road link and lists the names of cities at each end of the link. We use this list of cities and Ethiopia’s

nine regional capitals as the targeted areas. This process yields 919 targeted locations (Appendix S8 shows

sample sizes after removing targeted locations and the number of units near an RSDP road).

We estimate the following long-difference model, using Kebeles as the unit of analysis in our main specification. The same model is

estimated using the 1x1km grid to check the sensitivity of our main results:

$$\Delta y_i = \beta_0 + \beta_1 \, \text{Near Improved Road}_i + \epsilon_i, \qquad (1)$$

where we exclude units within 5km of targeted areas. $\Delta y_i$ is the change in the outcome variable from baseline

(1996) to endline (2009, the last year of RSDP III), and $\text{Near Improved Road}_i$ is a binary variable that

indicates whether a unit is near a road improved under RSDP. To explore heterogeneity in impacts, we

estimate alternative versions of equation 1 where we interact $\text{Near Improved Road}_i$ with variables indicating

whether the unit was in a Woreda with zero, low, medium, or high baseline nighttime lights.
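
A brief sketch of equation 1 on invented data follows. With a single binary regressor, the OLS slope equals the difference in mean outcome changes between units near and not near an improved road. The data values and the helper name `ols_intercept_slope` are hypothetical, and the sketch omits clustered standard errors and the additional control used in the IV specification.

```python
def ols_intercept_slope(x, y):
    """Closed-form simple OLS: returns (beta0, beta1) for y = beta0 + beta1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Hypothetical units: 1 = near an improved road, 0 = not near one.
near_improved_road = [1, 1, 1, 0, 0, 0]
# Hypothetical change in the IHS-transformed outcome between baseline and endline.
delta_y = [0.3, 0.2, 0.4, 0.1, 0.0, 0.2]

beta0, beta1 = ols_intercept_slope(near_improved_road, delta_y)
# beta0 is the mean change among units not near a road;
# beta1 is the additional mean change among units near a road.
```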

Identifying the impact of RSDP from OLS (model 1) assumes that non-targeted areas that were connected

to road projects were as good as randomly selected. However, this assumption is violated if regional states

also choose to connect politically or economically important areas between the primarily targeted locations.

To this point, non-targeted treated and control areas are not balanced across baseline levels and pre-trends

of outcome variables (see Appendix S9).

To address this concern, we follow Faber (2014) and construct two minimum spanning trees (MSTs) to use as

instruments for road placement. The MSTs are hypothetical road networks developed under the sole goal of

connecting all targeted destinations subject to cost minimization. The first MST connects targeted areas in

a way that minimizes the total Euclidean distance of the network, and the second MST minimizes the total

construction cost of the network. Construction costs are proxied using elevation and land cover (additional

construction costs are assigned to areas that are built-up or that contain wetlands or water; see Appendix

S10, which provides details on how construction costs are computed and how the MSTs are constructed).

We compute two variations of the Euclidean distance and least-cost MSTs. First, we construct MSTs that

connect all targeted locations. Second, we construct MSTs separately for each regional state, connecting the

targeted areas within each state and then appending the regional MSTs. This second approach assumes that

regional states may give preference to proposing road projects that connect economically important areas

within their own regions. Figure 3 shows maps of the MSTs that connect all targeted areas, and Appendix

S10 shows maps of the MSTs that append regional MSTs. Appendix S8 shows the number of units near

each MST.
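
For intuition, the Euclidean-distance MST can be sketched with Kruskal's algorithm on the complete graph of straight-line distances between targeted locations. This is an illustrative sketch, not the procedure documented in Appendix S10: the town names and coordinates are hypothetical, and the least-cost variant would replace straight-line distance with a construction-cost-weighted path length, which is not shown here.

```python
import math
from itertools import combinations

def euclidean_mst(points: dict[str, tuple[float, float]]) -> list[tuple[str, str, float]]:
    """Return MST edges (a, b, length) via Kruskal's algorithm with union-find."""
    edges = sorted(
        (math.dist(points[a], points[b]), a, b) for a, b in combinations(points, 2)
    )
    parent = {name: name for name in points}

    def find(x: str) -> str:
        # Follow parent pointers to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for length, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, length))
    return tree

# Hypothetical targeted locations with planar coordinates in km.
targeted = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0), "D": (10.0, 4.0)}
mst_edges = euclidean_mst(targeted)
total_length = sum(length for _, _, length in mst_edges)
```

Units that happen to lie near one of the resulting edges, without being targeted locations themselves, are the ones for which the hypothetical network predicts incidental road access.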

One last concern is that non-targeted areas connected to RSDP roads are mechanically more likely to

be near an MST than unconnected areas. This is problematic if distance to targeted areas correlates with

characteristics that also affect growth in local economic activity. To address this concern, following Faber

(2014), we control for the log distance to the nearest targeted area.

Figure 3: Minimum spanning trees and improved roads using RSDP I-III

4.3 Estimating Impact of Improved Market Access

The previous two empirical approaches (difference-in-difference and long difference estimation) focus on

whether road improvements benefited the areas immediately surrounding improved roads. Here, we ask

whether changes in connectivity to more distant markets led to changes in nighttime lights, urban land, or

cropland. For this, we construct a measure of market access that captures, for each administrative unit, how

well the unit is connected (in terms of travel time) to other units (markets) in Ethiopia. The size of other

markets is measured by their population at baseline, and we calculate changes in market access that result

from improved transport infrastructure over time.⁹ This is akin to creating a continuous treatment, as it uses

changes in road speed limits (higher speeds reduce travel times). An advantage of this approach is that it

does not require us to make the sharp distinction between treated and control groups. This is an advantage

in the context of transportation infrastructure, because such investments might also affect locations further

away. The market access measure captures such spillovers. As with our previous empirical approaches, we

rely on Kebeles as our primary unit; however, to test the sensitivity of results, we also repeat the analysis

using Woredas (N = 780). The 1x1km pixels are less suitable for market access analysis.

9 For population, we rely on the Gridded Population of the World (v4) dataset, which captures population as of 2000 at an
approximately 1km resolution: https://sedac.ciesin.columbia.edu/data/collection/gpw-v4.

The measure of market access is constructed using the following equation (Donaldson and Hornbeck,

2016):

$$MA_{i,t} = \sum_{j,\, j \neq i} pop_j \times tt_{ij,t}^{-\theta}, \qquad (2)$$

where the population of the other markets ($pop_j$) is held constant at 2000 levels and where we measure the

travel time $tt_{ij,t}$ between the origin unit $i$ and the destination $j$ at time $t$ (see Appendix S11 for details on

estimating travel time using the RSDP road network data). We use the most populous pixel within each unit

as the location for estimating travel times. In our primary models, we use an elasticity (θ) of 3.8, following

Donaldson and Hornbeck (2016); however, we test the robustness of our results to alternate measures.
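
The market access measure in equation 2 can be sketched as below. The unit names, populations, and travel times are hypothetical; only the functional form and the elasticity of 3.8 come from the text.

```python
import math

THETA = 3.8  # elasticity used in the paper's primary models

def market_access(origin, population, travel_time, theta=THETA):
    """Sum over destinations j != origin of pop_j * tt_ij ** (-theta)."""
    return sum(
        pop * travel_time[(origin, dest)] ** (-theta)
        for dest, pop in population.items()
        if dest != origin
    )

# Hypothetical baseline populations (held fixed) and travel times in hours.
population = {"A": 1_000, "B": 2_000, "C": 500}
tt_before = {("A", "B"): 2.0, ("A", "C"): 4.0}
tt_after = {("A", "B"): 1.0, ("A", "C"): 4.0}  # the A-B road is upgraded

ma_before = market_access("A", population, tt_before)
ma_after = market_access("A", population, tt_after)

# Change in log market access, the regressor used in the long-difference model.
delta_log_ma = math.log(ma_after) - math.log(ma_before)
```

Because populations are held at baseline levels, any change in the measure comes only from changes in travel times, and the large exponent means nearby markets dominate the sum.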

We aim to understand whether market access growth is associated with changes in economic activity,

urbanization, or cropland. As our primary model, we use a long-difference framework adopted from Alder

(2017), who estimates how changes in market access are associated with changes in nighttime lights in India.

We regress changes in our outcome variables on changes in market access, as specified in the following

equation:

$$\Delta \mathrm{IHS}(y_i) = \beta_0 + \beta_1 \Delta\log(MA)_i + \beta_N \Delta\log(MA)_i \times N_i + \beta_C C_i + \gamma_z + \epsilon_{iw}, \qquad (3)$$

where $\Delta \mathrm{IHS}(y_i)$ represents changes in outcome variables as specified in the above section, and we take

the inverse hyperbolic sine transformation before differencing; $MA_i$ is the measure of market access; $N_i$

is a vector of indicator variables for different levels of baseline nighttime lights; $C_i$ is a vector of controls,

including initial market access, average nighttime lights at baseline, pre-trends in luminosity, and pre-trends

in the number of urban pixels; and $\gamma_z$ are Zone fixed effects (where these represent Zone-year fixed effects in

a panel framework). In computing the long difference, baseline values are from 1996 and endline values are

from 2016. In all models, we cluster standard errors on Woredas (when estimating the model using Woredas

as the unit of analysis, we cluster using Zones, Ethiopia’s second-level administrative division, of which there

are 64). We explore heterogeneity in the impact of market access by interacting market access with indicator

variables ($N_i$) for different baseline levels of nighttime lights.

An identification challenge is that market access may be endogenous. For example, road upgrades may

have been targeted to areas that would have experienced growth anyway. Following Alder et al. (2018), we

address this endogeneity concern by controlling for pre-trends in urban growth and luminosity. In addition,

following Donaldson and Hornbeck (2016), Blankespoor et al. (2017), and Jedwab and Storeygard (2016),

we implement a doughnut instrumental variables (IV) strategy. Here, we instrument market access with an

alternative version of market access that excludes units within a certain buffer. Specifically, when calculating

market access for each unit, units within a specified buffer are excluded. For our primary model, we use a

50km buffer, but we also test 20km and 100km buffers. This instrument helps address endogeneity concerns

by excluding units more prone to non-random road placement (Blankespoor et al., 2017). When including

interaction terms with market access, we instrument using the interaction of the “doughnut” market access

variable and the interaction terms. For example, we instrument $MA$ and $MA \times \text{Distance Addis}$ with

$MA_{\text{doughnut}}$ and $MA_{\text{doughnut}} \times \text{Distance Addis}$.
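
The doughnut version of market access can be illustrated with a small variation on the market access sum: destinations inside the buffer are dropped before summing. This is a sketch under assumptions rather than the authors' code. In particular, measuring the buffer by a straight-line distance lookup (`distance_km`) is an assumption, and the unit names and values are hypothetical.

```python
def doughnut_market_access(origin, population, travel_time, distance_km,
                           buffer_km=50.0, theta=3.8):
    """Market access that ignores destinations within buffer_km of the origin."""
    return sum(
        pop * travel_time[(origin, dest)] ** (-theta)
        for dest, pop in population.items()
        if dest != origin and distance_km[(origin, dest)] > buffer_km
    )

# Hypothetical populations, travel times (hours), and distances (km) from unit A.
population = {"A": 1_000, "B": 2_000, "C": 500}
travel_time = {("A", "B"): 1.0, ("A", "C"): 4.0}
distance_km = {("A", "B"): 30.0, ("A", "C"): 120.0}

# With a 50km buffer, nearby market B is excluded and only C contributes.
ma_doughnut = doughnut_market_access("A", population, travel_time, distance_km)
# With a zero buffer, the measure reduces to ordinary market access.
ma_full = doughnut_market_access("A", population, travel_time, distance_km, buffer_km=0.0)
```

The instrument therefore varies only with changes in travel times to distant markets, which are less likely to reflect road placement decisions made with the origin unit in mind.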

To take full advantage of the panel nature of our data, we test the robustness of the long-difference

results using a two-way fixed effects model. This approach follows Blankespoor et al. (2017), who measure

the impact of road improvements in Mexico. The methodology is discussed in more detail in Appendix S12.

5 Results

5.1 Difference-in-Differences

Figures 4 and 5 show results after recovering the average treatment effect by length of exposure to road

improvements using Kebeles as the unit of analysis. We find that road upgrades contributed to growth in

both nighttime lights and urban land, and a decrease in cropland. Urban areas grow immediately after the

upgrades, while nighttime lights growth takes place five years after the upgrades. Most of the impact is

driven by proximity to roads with higher speed limits, but this might be the result of selection of locations

into treatment. In addition, pre-treatment coefficients generally appear flat throughout all results, suggesting

no-to-minimal pre-trends.¹⁰

Comparing places with different initial luminosity levels, coefficients on road improvements for nighttime

lights show relatively similar magnitudes in areas with low and high baseline nighttime lights; road

improvements lead to approximately 10–20% growth in nighttime lights. In models with urban land as the dependent

variable, the coefficient on road improvements is notably larger in areas with high initial nighttime lights

compared to areas with low initial lights. Results using only two groups of baseline nighttime lights show

similar results, with gains concentrated in Woredas with positive nighttime lights (see Appendix S13).

10 Across most regressions, the p-value of the Wald pre-test of the parallel trends assumption is generally near zero; however, only
a small number of pre-treatment coefficients appear significant in any regression, and the coefficients are small in magnitude.

The results thus suggest that road upgrades may have contributed to a transition from cropland to urban

land. Examining the land cover classes that transitioned to urban supports this point; of all pixels that

transitioned to urban, 52% were previously cropland (see Appendix S14, which shows the distribution of

land cover classes that transitioned to urban).

As robustness checks, we also implement the traditional two-way fixed effects estimator and change the

unit of analysis to 1x1km pixels. When estimating models using 1x1km pixels, we remove pixels within 1km

of improved roads; doing so helps to remove impacts that may be driven by simply the addition of new

roadside lights. Our results do not change significantly using two-way fixed effects, although the coefficients

on leads are significant for higher-speed roads for cropland. However, in areas with high baseline nighttime

lights, we observe a reduction in cropland. This is consistent with the finding that road improvements are

strongly correlated with growth in urban land (see Appendix S7). Results using 1x1km pixels as the unit of

analysis generally show similar patterns (see Appendix S13).

Figure 4: Association of road improvements to nighttime lights, urbanization, and cropland using Kebeles as
the unit of analysis. Dark, Low, Medium, and High are different groups of baseline nighttime lights. Low-High groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
Dark indicates Woredas where the maximum value of nighttime lights at baseline was 0. Nighttime Lights is the average
nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within
Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs.

Figure 5: Association of road improvements to nighttime lights, urbanization, and cropland using Kebeles as
the unit of analysis. The road type considers the speed limit of the road after the road was upgraded. Nighttime Lights is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation
as logs.

5.2 Long Difference and Instrumental Variable Results

Table 2 shows long-difference OLS (columns 1–6) and instrumental variable (columns 7–12) results on areas

incidentally connected to RSDP I–III.¹¹ Roads are associated with growth in nighttime lights and urban

land, primarily in areas with medium-to-high levels of baseline nighttime lights. First-stage results of the

MSTs are all highly significant. IV results show roughly similar coefficients compared to the OLS model and,

similar to OLS results, show the largest gains from roads occurring in areas with high baseline nighttime

lights. Focusing on the IV results, road upgrades are associated with a 17% increase in nighttime lights

in Kebeles with high initial light and an 18% increase in Kebeles with medium initial nighttime lights. In

addition, road upgrades are associated with a 27% increase in urban land in Kebeles with high initial light.

Results using different versions of the MST instrument (least-distance MSTs and MSTs constructed using all

targeted areas at once—not first within regions, then appended together) show similar results (see Appendix

S8). In addition, results using two groups of baseline nighttime lights show gains concentrated in Woredas

with positive nighttime lights.

Table 2: Long Difference, OLS and IV Results - RSDP I - III

                             NTL              Urban            Cropland            NTL              Urban            Cropland
                            (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)     (12)
Imp Rd.                  0.03**    0.003    0.01*   -0.001    0.005     0.01   0.08**     0.04     0.01    -0.01     0.04     0.05
                         (0.01)   (0.01)   (0.01)  (0.005)   (0.01)   (0.01)   (0.04)   (0.03)   (0.02)   (0.02)   (0.04)   (0.04)
Imp Rd. × NTL96 Low                 0.02              0.02             -0.01              0.02              0.02             -0.06
                                  (0.02)            (0.01)            (0.03)            (0.03)            (0.02)            (0.06)
Imp Rd. × NTL96 Med               0.13**              0.02              0.01            0.18**              0.04            -0.004
                                  (0.06)            (0.02)            (0.02)            (0.09)            (0.03)            (0.02)
Imp Rd. × NTL96 High              0.22**            0.20**            -0.04*             0.17*            0.27**            -0.05*
                                  (0.10)            (0.08)            (0.02)            (0.09)            (0.13)            (0.03)
Constant                  -0.02    -0.01  -0.10**  -0.09**    -0.02    -0.02    -0.18    -0.16    -0.09    -0.07    -0.15    -0.16
                         (0.06)   (0.06)   (0.05)   (0.04)   (0.07)   (0.07)   (0.13)   (0.13)   (0.08)   (0.07)   (0.15)   (0.15)
Type                        OLS      OLS      OLS      OLS      OLS      OLS       IV       IV       IV       IV       IV       IV
1st Stage F-Stat            N/A      N/A      N/A      N/A      N/A      N/A     84.5     25.4     84.5     25.4     84.5     25.4
Observations             10,220   10,220   10,220   10,220   10,220   10,220   10,220   10,220   10,220   10,220   10,220   10,220
Adjusted R²               0.003     0.02    0.001     0.02  -0.0001  -0.0001    -0.01     0.02    0.001     0.02   -0.003   -0.004

Unit of analysis is Kebeles. Standard errors are clustered on Woredas. For models (7)–(12), the least-cost-distance MST is
used, where the MST is first constructed for each region, then the regional MSTs are appended. The models include a control
for the log distance to the nearest targeted area. “Imp Rd.” indicates that a Kebele was near (within 5km of) an improved
road. NTL96 Low, Med, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
light within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. * p<0.1;
** p<0.05; *** p<0.01

11 The table shows IV results using the least-cost MST, where MSTs are constructed at the region level, then appended together.
Appendix S8 shows results using alternate constructions of the MST.

5.3 Impact of Market Access

Tables 3 and 4 show the correlation between changes in market access and changes in outcome variables using

Kebeles and Woredas as units of analysis, respectively. Both tables show OLS results (columns 1–6) and

results when instrumenting market access with the 50km doughnut market access variable (columns 7–12).

Focusing on IV results using Kebeles as the unit of analysis, results show that market access is primarily

associated with gains in nighttime lights across non-dark Kebeles (i.e., Kebeles with some positive nighttime

light pixels at baseline), with gains largest in areas with the highest baseline nighttime lights. Nighttime

light results are consistent across OLS results and results using Woredas as the unit of analysis.

Results using urban land as the dependent variable tend to be sensitive to using OLS or IV models and

to the unit of analysis. Results using Kebeles as the unit of analysis show that gains in market access are

associated with a small reduction in urban land in areas with low initial light but no significant association

between market access and urban land in areas with medium or high initial light. Results using Woredas

as the unit of analysis—particularly the IV models—show a positive association between market access and

urban land in areas with higher initial light.

The finding that gains from road upgrades tend to be concentrated in areas with higher baseline

nighttime lights is consistent with our results on the impacts of road upgrades on the immediate areas

surrounding the upgrades. Results examining the impact on areas incidentally connected to RSDP similarly

show the greatest benefits to areas with the highest initial light. In addition, difference-in-difference results

do show benefits to areas with below-median initial nighttime lights, but coefficients tend to be larger in

areas with higher initial lights.

Results using different values of θ when calculating market access, different doughnut market access

sizes (using 20km and 100km), and two groups of baseline nighttime lights (Woredas with no baseline light

versus Woredas with some baseline light) all show similar results (see Appendix S15). Results using a panel,

two-way fixed effects approach also show similar results; however, here results show a significant association

between market access and urban land using both Kebeles and Woredas as the unit of analysis (see Appendix

S12).

Overall, the results using market access show that the gains in terms of nighttime lights from road

upgrades are concentrated in areas with moderate-to-high initial levels of economic activity. This finding

is consistent with the results based on long-differences specifications. Similarly, the areas with higher

initial light show a positive association between market access and urban land, which is consistent with the

difference-in-difference and long-differences results.

Table 3: Association of changes in market access with changes in outcome variables using a long difference, OLS and IV results [Kebeles]

                                      NTL                Urban              Cropland              NTL                Urban              Cropland
                                     (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)      (10)      (11)      (12)
MA                              -0.04***  -0.05***  -0.003**  -0.004**     0.003     0.002  -0.04***  -0.05***     0.01*    0.004*     0.002     0.001
                                  (0.01)    (0.01)   (0.002)   (0.002)   (0.002)   (0.002)    (0.01)    (0.01)   (0.003)   (0.002)   (0.003)   (0.004)
MA × NTL96 Low                             0.04***            -0.01***               0.004              0.05**             -0.01**               0.002
                                            (0.01)             (0.003)             (0.003)              (0.02)             (0.004)             (0.004)
MA × NTL96 Med                             0.07***              -0.003              -0.001             0.08***              -0.002               0.002
                                            (0.01)             (0.004)             (0.004)              (0.02)              (0.01)             (0.004)
MA × NTL96 High                            0.15***                0.02                0.01             0.16***                0.04                0.01
                                            (0.02)              (0.02)              (0.01)              (0.03)              (0.03)              (0.01)
MA, 1996                         0.03***   0.03***   0.01***   0.01***  -0.002**  -0.002**   0.07***   0.06***   0.01***   0.01***   -0.004*  -0.004**
                                 (0.003)   (0.003)   (0.001)   (0.001)   (0.001)   (0.001)    (0.01)    (0.01)   (0.001)   (0.001)   (0.002)   (0.002)
Log mean light, 1996               -0.01  -0.16***   0.45***   0.43***    -0.05*   -0.05**      0.02  -0.14***   0.46***   0.42***    -0.05*   -0.06**
                                  (0.05)    (0.05)    (0.12)    (0.13)    (0.02)    (0.03)    (0.05)    (0.05)    (0.12)    (0.13)    (0.02)    (0.03)
Pre-trend: log mean light           0.06     0.13*      0.15      0.17    -0.001      0.01      0.08    0.16**      0.16      0.19    -0.002      0.01
                                  (0.08)    (0.08)    (0.15)    (0.15)    (0.03)    (0.03)    (0.08)    (0.08)    (0.15)    (0.16)    (0.03)    (0.03)
Pre-trend: log N urban pixels     0.34**   0.34***   0.56***   0.56***     -0.04     -0.04    0.35**   0.35***   0.57***   0.57***     -0.04     -0.04
                                  (0.13)    (0.13)    (0.22)    (0.21)    (0.12)    (0.12)    (0.14)    (0.13)    (0.21)    (0.21)    (0.12)    (0.12)
Model                                OLS       OLS       OLS       OLS       OLS       OLS        IV        IV        IV        IV        IV        IV
Observations                      15,714    15,714    15,714    15,714    15,714    15,714    15,714    15,714    15,714    15,714    15,714    15,714
Adjusted R²                         0.18      0.19      0.22      0.23      0.14      0.14      0.17      0.18      0.22      0.22      0.14      0.14

The unit of analysis is Kebeles. Standard errors are clustered on Woredas and all models include Zone fixed effects. MA is the
logged difference of MA from 1996 to 2016. IV indicates that MA is instrumented with a 50km MA Doughnut variable; that
is, in calculating market access, Kebeles within 50km are excluded. N T L96 Low, Medium, and High are dummy variables for
different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of nighttime lights
within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum value of
nighttime lights at baseline was 0. NTL is the average nighttime lights within Kebeles, Urban is the total urban area within
Kebeles, and Cropland is the total cropland area within Kebeles; we use the inverse hyperbolic sine transformation on all
outcome variables, which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table 4: Association between changes in market access and changes in outcome variables using a long difference, OLS and IV results [Woredas]

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.01 −0.03 −0.02 −0.03∗ −0.004 −0.004 −0.05 −0.07 0.03 0.01 −0.002 −0.003
(0.02) (0.02) (0.01) (0.02) (0.004) (0.004) (0.04) (0.04) (0.03) (0.03) (0.01) (0.01)
MA×N T L96 Low 0.06∗∗ 0.002 −0.002 0.12∗∗∗ 0.04∗ 0.003
(0.03) (0.02) (0.004) (0.04) (0.02) (0.01)
MA×N T L96 Med 0.13∗∗∗ 0.04 0.001 0.20∗∗∗ 0.12∗∗∗ 0.01∗
(0.03) (0.03) (0.005) (0.05) (0.03) (0.01)
MA×N T L96 High 0.20∗∗∗ 0.20∗ 0.02 0.32∗∗∗ 0.40∗∗∗ 0.05∗∗
(0.04) (0.11) (0.01) (0.06) (0.11) (0.02)
MA, 1996 0.02∗∗∗ 0.01∗ 0.01∗∗∗ 0.01∗∗∗ 0.001 0.001 0.04∗∗∗ 0.02 0.01 −0.01 −0.001 −0.003
(0.004) (0.01) (0.003) (0.002) (0.001) (0.001) (0.01) (0.01) (0.01) (0.01) (0.002) (0.003)
Log mean light, 1996 −0.15∗ −0.29∗∗∗ 0.29 0.15 −0.12 −0.13 −0.11 −0.35∗∗∗ 0.32∗ 0.03 −0.11 −0.15
(0.08) (0.07) (0.19) (0.16) (0.10) (0.10) (0.07) (0.09) (0.19) (0.19) (0.10) (0.11)
Pre-trend: log mean light 0.11 0.10 −0.20 −0.19 0.12 0.13 0.11 0.08 −0.20 −0.18 0.13 0.13∗
(0.13) (0.11) (0.23) (0.21) (0.08) (0.08) (0.13) (0.10) (0.23) (0.21) (0.08) (0.08)
Pre-trend: log N urban pixels −0.24∗∗ −0.22∗∗ 0.33∗ 0.34∗ −0.05∗∗ −0.05∗ −0.26∗∗ −0.23∗∗ 0.36∗ 0.40∗ −0.05∗ −0.04∗
(0.10) (0.10) (0.19) (0.20) (0.02) (0.02) (0.10) (0.10) (0.21) (0.23) (0.02) (0.02)
Model OLS OLS OLS OLS OLS OLS IV IV IV IV IV IV
Observations 780 780 780 780 780 780 780 780 780 780 780 780
Adjusted R2 0.33 0.36 0.28 0.31 0.37 0.37 0.33 0.35 0.27 0.26 0.37 0.36

The unit of analysis is Woredas. Standard errors are clustered on Zones and all models include Zone fixed effects. MA is the
logged difference of MA from 1996 to 2016. IV indicates that MA is instrumented with a 50km MA Doughnut variable; that
is, in calculating market access, Woredas within 50km are excluded. N T L96 Low, Medium, and High are dummy variables for
different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of nighttime lights
within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum value of
nighttime lights at baseline was 0. NTL is the average nighttime lights within Woredas, Urban is the total urban area within
Woredas, and Cropland is the total cropland area within Woredas; we use the inverse hyperbolic sine transformation on all
outcome variables, which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01
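The table notes state that all outcome variables are transformed with the inverse hyperbolic sine, which is defined at zero and behaves like a logarithm for larger values. The short sketch below illustrates that property numerically; the helper name `ihs` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); well defined at x = 0."""
    return np.arcsinh(x)

values = np.array([0.0, 1.0, 10.0, 100.0])
transformed = ihs(values)

# For large x, asinh(x) is close to log(2x), so a difference in transformed
# values is close to a log difference (an approximate growth rate).
gap_to_log = transformed[-1] - np.log(2 * values[-1])
doubling = ihs(200.0) - ihs(100.0)   # close to log(2)
```

The approximation is poor near zero, which matters for units with very low nighttime lights; coefficients for those units should be read with that in mind.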

6 Conclusion

Since 1997, Ethiopia has implemented RSDP, a countrywide program of road construction and improvements.

As in other contexts, a primary purpose of these transportation investments is to accelerate economic growth.

For example, a key objective of RSDP is to lower transport costs to support the production and distribution

of goods across the country and for export. From 1997 to 2016, 17.4 billion USD was invested across 128,470

km of roads. This paper evaluates the impact of RSDP on satellite-based measures of local economic activity

and land use.

Identifying the impact of transportation investments is difficult, as road placements may be endogenously

related to outcome variables of interest (e.g., roads targeted to areas already experiencing growth in economic

activity or urban land). We therefore rely on several complementary empirical approaches to uncover impacts

of road upgrades. First, we exploit temporal and spatial variation of road upgrades; we employ a difference-

in-difference model that controls for unobserved time-invariant heterogeneity in locations as well as common

temporal trends throughout Ethiopia. Second, we focus the analysis on localities incidentally connected

to RSDP roads as a potential source of exogenous variation in network connections; here, following Faber

(2014), we instrument road upgrades with a hypothetical road network that would have been constructed

if connecting areas targeted by RSDP in the least-cost way were the only priority. Third, we examine the

impact of gains in market access, rather than whether a locality was near an upgraded road; we follow Alder

(2017) and use a long-difference approach to understand how changes in market access are associated with

changes in satellite-based outcome variables, controlling for baseline-level characteristics and pre-trends in

outcome variables. To check the robustness of these results, we take advantage of the panel structure of

the data and, following Blankespoor et al. (2017), also employ a two-way fixed effects model.

Altogether, we find that gains from road upgrades in nighttime lights and urbanization were concentrated

in areas with higher levels of initial economic activity. In examining impacts of upgrades on surrounding

areas, the difference-in-difference models show impacts on local economic activity and urbanization across

all areas, but impacts were largest in areas with above-median levels of nighttime lights at baseline. Long

difference IV results show some positive impacts in areas with the least initial light; however, the coefficients

are small in magnitude for the darkest areas and are significantly larger for areas with the highest baseline

nighttime lights. Similarly, market access models tend to show the strongest impacts among areas with

medium-to-high levels of baseline nighttime lights. In addition, the results show either no positive impact of

upgrades in the darkest areas or that upgrades resulted in a reduction of economic activity or urbanization

in these areas.

The result showing that gains from road improvements are concentrated in larger markets is in line

with existing literature. For example, BenYishay et al. (2018) find that road improvements in the West

Bank shifted nighttime lights from darker areas to brighter areas; road improvements led to a reduction

in nighttime lights in areas with the darkest baseline nighttime lights but led to gains in nighttime lights

for other targeted areas. Faber (2014) finds that highways connecting major metropolitan areas caused a

reduction in GDP growth in peripheral counties also connected to the new highways. In addition, Mitnik

et al. (2018) find that poor communities do not benefit from road improvements, but they also find that the

richest communities do not benefit either; namely, gains are concentrated in communities in the middle of a

ranking according to unsatisfied basic needs.

Road upgrades primarily benefiting larger markets also fits within the context of new migration trends in

Ethiopia. In particular, since the early 2000s, Ethiopia has experienced increased rural-to-urban migration

and a decrease in rural-to-rural migration (OECD and Policy Studies Institute, 2020).¹² Moreover, most

migration has taken place within regions. Consequently, the size and importance of regional market centers

are growing. Increased growth due to road improvements in larger markets could have made them more

attractive places to migrate to, and improved connections could have facilitated migration. While our

study cannot confirm this mechanism, existing literature has shown that transportation investments can

encourage migration. For example, Morten and Oliveira (2018) find that new highways in Brazil intended

to connect Brasilia with other state capitals led to increased migration between states that became better

connected. They argue that improved connectivity encourages migration because it lowers both the costs of

migration itself and the utility costs of being away from friends and relatives (improved connectivity reduces

costs of travelling back to visit the original location). However, while improved connectivity may lower the costs of

migration, Castaing Gachassin (2013) finds that road upgrades improving local conditions reduced migration

in Tanzania.

Our findings also show a trade-off between urban land and cropland. About half of all pixels that

transitioned to urban land were previously cropland. In addition, difference-in-difference models show that

larger roads in areas with above-median baseline nighttime lights contributed in particular to a reduction

in cropland area. This finding is in line with Fiorini and Sanfilippo (2019), who show that RSDP increased

employment in the service sector but reduced agricultural employment.

References

Adamopoulos, T. (2018). Spatial integration, agricultural productivity, and development: A quantitative

analysis of Ethiopia’s road expansion program. International Growth Center Working Paper.
¹² This is also consistent with the effects of rural roads on employment shifts out of the agricultural sector that Asher and
Novosad (2020) document for India.

Alder, S. (2017). Chinese roads in India: The effect of transport infrastructure on economic development.

Working paper.

Alder, S., Roberts, M., and Tewari, M. (2018). The effects of transport infrastructure on India’s urban and

rural development. Working paper.

Asher, S., Garg, T., and Novosad, P. (2020). The ecological impact of transportation infrastructure. The

Economic Journal, 130(629):1173–1199.

Asher, S. and Novosad, P. (2020). Rural roads and local economic development. American Economic Review,

110(3):797–823.

Atkin, D. and Donaldson, D. (2015). Who’s getting globalized? The size and implications of intra-national

trade costs. Working paper.

Banerjee, A., Duflo, E., and Qian, N. (2020). On the Road: Transportation Infrastructure and Economic

Growth in China. Journal of Development Economics, 145.

BenYishay, A., Trichler, R., Runfola, D., and Goodman, S. (2018). Final Report: Evaluation of the

Infrastructure Needs Program II, A geospatial impact evaluation of the effect of INP II road improvements on

economic development.

Blankespoor, B., Bougna, T., Garduno-Rivera, R., and Selod, H. (2017). Roads and the geography of

economic activities in Mexico. Working paper.

Bundervoet, T. (2018). Internal Migration in Ethiopia: Evidence from a Quantitative and Qualitative

Research Study. Other Social Protection Study. World Bank.

Callaway, B. and Sant’Anna, P. H. (2021). Difference-in-differences with multiple time periods. Journal of

Econometrics, 225(2):200–230. Themed Issue: Treatment Effect 1.

Castaing Gachassin, M. (2013). Should I Stay or Should I Go? The Role of Roads in Migration Decisions.

Journal of African Economies, 22(5):796–826.

Christensen, P. and McCord, G. C. (2016). Geographic determinants of China’s urbanization. Regional

Science and Urban Economics, 59:90–102.

Datta, S. (2012). The impact of improved highways on Indian firms. Journal of Development Economics,

99(1):46–57.

Deng, X., Huang, J., Rozelle, S., and Uchida, E. (2008). Growth, population and industrialization, and

urban land expansion of China. Journal of Urban Economics, 63(1):96–115.

Dercon, S., Gilligan, D. O., Hoddinott, J., and Woldehanna, T. (2009). The impact of agricultural

extension and roads on poverty and consumption growth in fifteen Ethiopian villages. American Journal of

Agricultural Economics, 91(4):1007.

Doll, C. N., Muller, J.-P., and Morley, J. G. (2006). Mapping regional economic activity from night-time

light satellite imagery. Ecological Economics, 57(1):75–92.

Donaldson, D. (2015). The gains from market integration. Annual Review of Economics, 7.

Donaldson, D. and Hornbeck, R. (2016). Railroads and American economic growth: A “market access”

approach. The Quarterly Journal of Economics, 131(2):799.

Donaldson, D. and Storeygard, A. (2016). The view from above: Applications of satellite data in economics.

Journal of Economic Perspectives, 30(4):171–98.

Duranton, G. and Puga, D. (2015). Urban land use. In Handbook of regional and urban economics, volume 5,

pages 467–560. Elsevier.

Elvidge, C., Hsu, F.-C., Baugh, K., and Ghosh, T. (2014). National trends in satellite-observed lighting.

Global Urban Monit. Assess. Earth Obs., pages 91–118.

Elvidge, C. D., Baugh, K., Zhizhin, M., and Chi Hsu, F. (2013). Why VIIRS data are superior to DMSP

for mapping nighttime lights. In Proceedings of the Asia-Pacific Advanced Network.

Ethiopian Roads Authority (2015a). Road sector development program 19 years performance assessment.

Ethiopian Roads Authority (2015b). The road sector development program phase V.

European Space Agency (2017). 300 m annual global land cover time series from 1992 to 2015.

European Space Agency (2019). New Release of the C3S Global Land Cover products for 2016, 2017 and

2018 consistent with the CCI 1992 – 2015 map series.

Faber, B. (2014). Trade Integration, Market Size, and Industrialization: Evidence from China’s National

Trunk Highway System. The Review of Economic Studies, 81(3):1046–1070.

Fiorini, M. and Sanfilippo, M. (2019). Roads and jobs in Ethiopia. WIDER Working Paper 2019/116.

Fiorini, M., Sanfilippo, M., and Sundaram, A. (2021). Trade liberalization, roads and firm productivity.

Journal of Development Economics, 153:102712.

Ghosh, T., Anderson, S. J., Elvidge, C. D., and Sutton, P. C. (2013). Using nighttime satellite imagery as

a proxy measure of human well-being. Sustainability, 5(12):4988–5019.

Henderson, J. V., Storeygard, A., and Weil, D. N. (2012). Measuring economic growth from outer space.

American Economic Review, 102(2):994–1028.

Iimi, A., Mengesha, H. A., Markland, J., Asrat, Y., and Kassahun, K. (2018). Heterogeneous Impacts of

Main and Feeder Road Improvements: Evidence from Ethiopia. World Bank Policy Research Working

Paper, (8548).

Jedwab, R. and Storeygard, A. (2016). The Heterogeneous Effects of Transportation Investments: Evidence

from sub-Saharan Africa 1960-2010. https://urbanisation.econ.ox.ac.uk/papers/the-heterogeneous-effects-

of-transportation-investments-evidence-from-sub-saharan-africa-1960-2010.

Kebede, H. A. (2021). The gains from market integration: The welfare effects of new rural roads in Ethiopia.

Available at SSRN: https://ssrn.com/abstract=3971749.

Li, X. and Zhou, Y. (2017). A Stepwise Calibration of Global DMSP/OLS Stable Nighttime Light Data

(1992-2013). Remote Sensing, 9(6).

Li, X., Zhou, Y., Zhao, M., and Zhao, X. (2020). A harmonized global nighttime light dataset 1992–2018.

Sci Data, 7(168).

Michaels, G. (2008). The Effect of Trade on the Demand for Skill: Evidence from the Interstate Highway

System. The Review of Economics and Statistics, 90.

Michaels, G., Rauch, F., and Redding, S. J. (2012). Urbanization and structural transformation. The

Quarterly Journal of Economics, 127(2):535–586.

Mitnik, O. A., Sanchez, R., and Yanez-Pagans, P. (2018). Bright Investments: Measuring the Impact of

Transport Infrastructure Using Luminosity Data in Haiti.

Moneke, N. (2019). Can big push infrastructure unlock development? Evidence from Ethiopia. Working

paper.

Morten, M. and Oliveira, J. (2018). The effects of roads on trade and migration: Evidence from a planned

capital city.

Motamed, M. J., Florax, R. J., and Masters, W. A. (2014). Agriculture, transportation and the timing of

urbanization: Global analysis at the grid cell level. Journal of Economic Growth, 19(3):339–368.

MoWIE (2019). National electrification program 2.0: Integrated planning for universal access. Technical

report, Ethiopia Ministry of Water, Irrigation and Electricity.

OECD and Policy Studies Institute (2020). Rural Development Strategy Review of Ethiopia.

Ozlu, M. O., Alemayehu, A., Mukim, M., Lall, S. V., Kerr, O., Kaganova, O., Viola, C. O., Hill, R., Hamilton,

E., Gapihan, A. T., Ayane, B. L., Llano, A. I. A. D., Egziabher, T. G., and Gebretsadik, S. Z. (2015).

Ethiopia - Urbanization review : Urban institutions for a middle-income Ethiopia.

Redding, S. and Turner, M. A. (2014). Transportation costs and the spatial organization of economic activity.

Sant’Anna, P. H. and Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of

Econometrics, 219(1):101–122.

Schmidt, E., Dorosh, P. A., Kedir Jemal, M., and Smart, J. (2018). Ethiopia’s spatial and structural

transformation: Public policy and drivers of change. International Food Policy Research

Institute (IFPRI) and Ethiopian Development Research Institute (EDRI).

Shiferaw, A., Söderbom, M., Siba, E., and Alemu, G. (2015). Road infrastructure and enterprise dynamics

in Ethiopia. The Journal of Development Studies, 51(11):1541–1558.

Storeygard, A. (2016). Farther on down the road: Transport costs, trade and urban growth in sub-Saharan

Africa. The Review of Economic Studies, 83(3):1263–1295.

Sutton, P. C. and Costanza, R. (2002). Global estimates of market and non-market values derived from

nighttime satellite imagery, land cover, and ecosystem service valuation. Ecological Economics,

41(3):509–527.

Weidmann, N. B. and Schutte, S. (2017). Using night light emissions for the prediction of local wealth.

Journal of Peace Research, 54(2):125–140.

World Bank (2015). Ethiopia Urbanization Review: Urban Institutions for a Middle Income Ethiopia.

World Bank Group (2012). Ethiopia - Promoting economic growth: Ethiopia: Country results profile

(English).

World Bank Group, Independent Evaluation Group (2015). Ethiopia - Energy Access Project - Review.

Technical report.

Wu, J., He, S., Peng, J., Li, W., and Zhong, X. (2013). Intercalibration of DMSP-OLS night-time light data

by the invariant region method. International Journal of Remote Sensing, 34(20):7356–7368.

Zhang, Q., Pandey, B., and Seto, K. C. (2016). A Robust Method to Generate a Consistent Time Series

From DMSP/OLS Nighttime Light Data. IEEE Transactions on Geoscience and Remote

Sensing, 54(10).

The Impact of Ethiopia’s Road Investment
Program on Economic Development and Land
Use: Evidence from Satellite Data

Supplementary Information

The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. The
views expressed in this paper are those of the authors and do not necessarily reflect those of the SNB.
In addition, they do not necessarily represent the views of the International Bank for Reconstruction and
Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World
Bank or the governments they represent.
Contents
S1 Road Improvements by Baseline Woreda Nighttime Lights
S2 RSDP Road Dataset
S3 Nighttime Lights Data
S4 Trends in Woreda-Level Outcome Variables Over Time
S5 Kebele Information
S6 Universal Electricity Access Program versus RSDP
S7 Two-Way Fixed Effect Models
    S7.1 Results using Kebeles as Unit of Analysis
    S7.2 Results using 1x1km Pixels as Unit of Analysis
S8 Additional IV Results Analyzing the Impact of Areas Incidentally Connected to the RSDP
    S8.1 Treated and Control Summary Stats
    S8.2 OLS Results
    S8.3 First Stage Results
    S8.4 Second Stage Results
    S8.5 Second Stage Results - Different Baseline Nighttime Light Groups
S9 Balance Across Non-Targeted Treated and Control Areas
S10 Constructing Minimum Spanning Trees
S11 Estimating Travel Time from Road Network Data
S12 Market Access Two-Way Fixed Effect Model
S13 Difference in Differences - Additional Results
    S13.1 Results using 1x1km Grid as Unit of Analysis
    S13.2 Results when excluding Kebeles within 100km of Addis Ababa
    S13.3 Heterogeneity of impacts across Woredas with zero and positive baseline nighttime lights
S14 What land cover type does urban replace?
S15 Market Access, Long Difference - Additional Results
    S15.1 Varying Doughnut Sizes
    S15.2 Varying Values of Theta
    S15.3 Varying Baseline Nighttime Light Groups

S1 Road Improvements by Baseline Woreda Nighttime Lights

This section shows road improvements over time across Woredas with different levels of baseline nighttime

lights. Figure S1 shows trends in kilometers of road above 30, 50, and 70km/h. The trends appear roughly

equal across Woredas with different levels of baseline nighttime lights, which indicates that road

improvements were widespread and not targeted to areas with initially high nighttime lights. Figure S2 shows the

proportion of the road network by road speed limit across Woredas with different baseline levels of nighttime

lights. As with figure S1, figure S2 shows that Woredas across different levels of baseline nighttime lights

benefited from faster roads.

Figure S1: Length of road above select speed limits (km/h)

Figure S2: Length of road by speed limit

S2 RSDP Road Dataset

Table S1 shows the assumed travel speed information by road type and condition that the Ethiopian Roads

Authority used in constructing the RSDP panel dataset. The dataset provided by ERA contains the speed

of the road in each year (where a speed of 0km/h is assigned to roads that had not been constructed yet).

As the table shows, an increase in speed limit indicates that a road was improved, except for town roads,

whose assigned speeds do not change. Consequently, because our analysis relies on increased speeds to indicate

improvement, it focuses on roads connecting towns rather than roads within towns.

Pavement type and condition | Speed before rehabilitation or construction | Speed after rehabilitation or construction
Asphalt road (expressway) | | 120km/h
Asphalt roads (Highway - asphalt concrete or surface treatment) | 50km/h | 70km/h
Federal gravel Road (high class gravel roads) | 35km/h | 50km/h
Regional gravel Road (intermediate class gravel roads) | 25km/h | 45km/h
URRAP roads - lower district level roads (low class gravel roads) | 20km/h | 35km/h
Earth surfaced roads (very low class public roads) | 20km/h | 30km/h
Federal gravel or regional rural roads to asphalt roads | 25km/h to 35km/h | 70km/h
Town roads (asphalt) | 30km/h | 30km/h
Town roads (cobbled) | 20km/h | 20km/h
Town roads (gravel) | 15km/h | 15km/h
Town roads (earth) | 10km/h | 10km/h

Table S1: Road types, conditions, and speeds. Information provided by the Ethiopian Roads Authority
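To illustrate how the assigned speeds translate into travel times, the sketch below computes the time needed to traverse a road segment before and after an upgrade. It uses the federal and regional gravel road speeds from Table S1; the dictionary, the function name, and the 100km segment length are hypothetical and are not taken from the RSDP dataset.

```python
import math

# Speeds (km/h) before and after rehabilitation, from two rows of Table S1.
SPEEDS_KMH = {
    "federal_gravel": (35, 50),
    "regional_gravel": (25, 45),
}

def travel_time_hours(length_km, speed_kmh):
    """Hours needed to traverse a segment.

    A speed of 0 marks a road that has not been constructed yet, so the
    segment is treated as impassable (infinite travel time).
    """
    if speed_kmh <= 0:
        return math.inf
    return length_km / speed_kmh

# Hypothetical 100km federal gravel road.
before, after = SPEEDS_KMH["federal_gravel"]
hours_before = travel_time_hours(100, before)
hours_after = travel_time_hours(100, after)
hours_saved = hours_before - hours_after
```

Under these assumptions, rehabilitating the segment shortens the trip from roughly 2.9 hours to 2 hours.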

S3 Nighttime Lights Data

Figure S3 shows nighttime lights from Li et al. (2020), who develop a harmonized global dataset of nighttime

lights from 1992 to 2018. The dataset uses both DMSP-OLS (available from 1992 to 2013) and VIIRS

(available from 2012 to present). From 1992 to 2013, the dataset intercalibrates annual DMSP-OLS data; from

2014 to 2018, the dataset uses VIIRS to simulate DMSP-OLS-like data. Figure S3 shows that differences still

exist between the DMSP-OLS data and the simulated DMSP-OLS data; in particular, the simulated DMSP-

OLS data captures low values of nighttime lights (particularly seen in Ethiopia’s northwest), while these

values are not captured in the raw DMSP-OLS data.

Figure S3: Nighttime Lights in Multiple Years

S4 Trends in Woreda-Level Outcome Variables Over Time

This section shows trends in outcome variables over time. Trends are shown across different groupings of

Woredas, where Woredas are grouped by baseline levels of nighttime lights into dark Woredas (Woredas with

no positive nighttime lights), and those with low, medium, and high baseline levels of nighttime lights (we

use 3-quantiles of the maximum value of nighttime lights across Woredas with some positive nighttime lights

to form the low, medium, and high groups).
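The grouping described above can be sketched as follows. The function assigns units with no baseline light to a dark group and splits the remaining units at the terciles of their maximum baseline nighttime lights. The function name, the group labels, and the treatment of values that fall exactly on a cutoff are assumptions made for illustration.

```python
import numpy as np

def baseline_light_groups(max_ntl):
    """Label each unit as dark, low, medium, or high baseline nighttime lights."""
    max_ntl = np.asarray(max_ntl, dtype=float)
    groups = np.full(max_ntl.shape, "dark", dtype=object)
    lit = max_ntl > 0
    if not lit.any():
        return groups   # no unit has positive light, so every unit is dark

    # Tercile cutoffs are computed only among units with positive light.
    low_cut, high_cut = np.quantile(max_ntl[lit], [1 / 3, 2 / 3])
    groups[lit & (max_ntl <= low_cut)] = "low"
    groups[lit & (max_ntl > low_cut) & (max_ntl <= high_cut)] = "medium"
    groups[lit & (max_ntl > high_cut)] = "high"
    return groups

# Hypothetical maximum baseline nighttime lights for eight Woredas.
example_groups = baseline_light_groups([0, 0, 1, 2, 3, 4, 5, 6])
```

Because the cutoffs exclude dark units, the low, medium, and high groups are of roughly equal size among lit units regardless of how many units are dark.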

Figure S4 shows average trends over time. Woredas across all groupings saw growth in nighttime lights

and urban land; in addition, all followed a general pattern of increasing then decreasing cropland area. The

figure shows a sharp increase in nighttime lights among Woredas with a maximum nighttime value of 0

at baseline. This increase is due to a difference in the underlying nighttime lights data; from 1992-2013,

DMSP-OLS data is used; and from 2014 onwards, simulated DMSP-OLS data is used, which captures more

low-level light.

Figure S5 shows the distribution in growth rates in outcome variables from 1992 to 2016. The figure

shows that most Woredas saw growth in nighttime lights; while many Woredas saw growth in urban area,

a notable proportion saw no change in urban area. In addition, most Woredas saw no change in cropland

area. Among Woredas that did see change in cropland, Woredas were roughly equally split in seeing growth

or a reduction in cropland, except for Woredas with high initial nighttime lights, where Woredas tended to

see a reduction in cropland.

Figure S4: Trends in Outcome Variables Over Time

Figure S5: Distribution of Growth Rates in Outcome Variables, 1992 to 2016

S5 Kebele Information

Table S2 shows summary statistics of the size of Kebeles and figure S6 shows a map of Kebeles (the Kebele

file used for this analysis provided Kebele boundaries for all regions except Somali; for Somali region, we use

Woreda boundaries). The median area of Kebeles is 24.7km² and the mean area is 70.3km².

Table S2: Area of Kebeles (km²)

Min 25th Percentile Median Mean 75th Percentile Max


0.129 13.105 24.729 70.267 44.458 20,043.053

Figure S6: Kebele Map

S6 Universal Electricity Access Program versus RSDP

Figure S7 compares RSDP roads improved since the launch of the Universal Electricity Access Program

(2005) with electrical grid lines planned or under construction as of 2007¹ as part of the Electricity Access

Rural Expansion Project (Phase 2).

Figure S7: RSDP roads improved since 2005 and grid lines planned or under construction as of 2007

¹ Data downloaded from: https://datacatalog.worldbank.org/dataset/ethiopia-electricity-transmission-network

S7 Two-Way Fixed Effect Models

In addition to estimating difference-in-differences models using the approach proposed by Callaway and

Sant’Anna (2021), we also estimate simple OLS two-way fixed effects models. The model is the following:

$$y_{i,t} = \beta_0 + \sum_{f=-10}^{-2} \beta_f \, I(\text{year} - R_i = f) + \sum_{l=0}^{10} \beta_l \, I(\text{year} - R_i = l) + \gamma_i + \delta_t + \epsilon_{i,t}, \tag{1}$$

where $y_{i,t}$ is the value of the outcome variable in unit $i$ (Kebeles for our primary model and, to check the

sensitivity of these results, 1x1km pixels) in year $t$ and $R_i$ is the year when unit $i$ was treated. We define

$f$ as $-10$ when $f \le -10$ and $l$ as $10$ when $l \ge 10$; $\gamma_i$ are unit fixed effects and $\delta_t$ are year fixed effects. Year

fixed effects control for temporal trends throughout the study period (e.g., average nighttime lights generally

increase across areas over the study period). Unit fixed effects control for time-invariant characteristics of

units that might also be associated with the outcome.
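The event-study structure of the model above can be illustrated with a small simulated panel. The sketch below builds indicators for years relative to treatment, bins them at −10 and 10, omits the year before treatment as the reference period, removes unit and year fixed effects by two-way demeaning, and estimates the coefficients by OLS. The simulated data, the function names, the inclusion of a never-treated unit, and the use of demeaning on a balanced panel are assumptions for illustration and are not the estimation code used in the paper.

```python
import numpy as np

def event_time_dummies(years, treat_year, window=10, omitted=-1):
    """Indicator columns for years relative to treatment, binned at +/- window.

    Never-treated units (treat_year = NaN) get zeros in every column.
    """
    rel = np.clip(years - treat_year, -window, window)   # NaN stays NaN
    ks = [k for k in range(-window, window + 1) if k != omitted]
    cols = np.column_stack([(rel == k).astype(float) for k in ks])
    return ks, cols

def two_way_demean(values, n_units, n_years):
    """Remove unit and year means; exact for a balanced, unit-sorted panel."""
    grid = values.reshape(n_units, n_years)
    grid = (grid - grid.mean(axis=1, keepdims=True)
            - grid.mean(axis=0, keepdims=True) + grid.mean())
    return grid.ravel()

# Simulated balanced panel: 4 units observed from 1992 to 2016.
n_units, n_years = 4, 25
years = np.tile(np.arange(1992.0, 1992.0 + n_years), n_units)
treat_year = np.repeat([2000.0, 2005.0, 2010.0, np.nan], n_years)  # last unit never treated
unit_effect = np.repeat([1.0, 2.0, 3.0, 4.0], n_years)
# Outcome: unit effect + common trend + a constant gain of 0.5 after treatment.
y = unit_effect + 0.1 * (years - 1992) + 0.5 * (years >= treat_year)

ks, dummies = event_time_dummies(years, treat_year)
X = np.column_stack(
    [two_way_demean(dummies[:, j], n_units, n_years) for j in range(len(ks))]
)
beta, *_ = np.linalg.lstsq(X, two_way_demean(y, n_units, n_years), rcond=None)
```

With this noise-free simulated outcome, the estimated coefficients are zero for all pre-treatment periods and 0.5 for all post-treatment periods, which is the pattern an event-study figure would display when there are no pre-trends.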

Figures S8 and S9 show results using Kebeles as the unit of analysis, and figures S10 and S11 show

results using 1km pixels as the unit of analysis. The figures generally show similar trends as the difference-

in-difference results. Results show road upgrades associated with gains in nighttime lights and urban land,

particularly in areas with medium-to-high baseline nighttime lights and among faster roads. However, two-

way fixed effect models tend to show more pre-trends compared to difference-in-difference results.

S7.1 Results using Kebeles as Unit of Analysis

Figure S8: Association of road improvements to nighttime lights, urbanization, and cropland using Kebeles
as the unit of analysis. Dark, Low, Medium, and High are different groups of baseline nighttime lights. Low-High groups
are formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
Dark indicates Woredas where the maximum value of nighttime lights at baseline was 0. Nighttime Lights is the average
nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within
Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs.

Figure S9: Association of road improvements to nighttime lights, urbanization, and cropland using Kebeles as
the unit of analysis. The road type considers the speed limit of the road after the road was upgraded. Nighttime Lights is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation
as logs.

S7.2 Results using 1x1km Pixels as Unit of Analysis

Figure S10: Association of road improvements to nighttime lights, urbanization, and cropland using 1x1km
pixels as the unit of analysis. Dark, Low, Medium, and High are different groups of baseline nighttime lights. Low-High
groups are formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime
lights. Dark indicates Woredas where the maximum value of nighttime lights at baseline was 0. Nighttime Lights is the inverse
hyperbolic sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs. Urban and Cropland are
binary variables indicating whether the pixel contains urban or cropland.

Figure S11: Association of road improvements to nighttime lights, urbanization, and cropland using 1x1km
pixels as the unit of analysis. The road type considers the speed limit of the road after the road was upgraded. Nighttime Lights
is the inverse hyperbolic sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs. Urban and
Cropland are binary variables indicating whether the pixel contains urban or cropland.

S8 Additional IV Results Analyzing the Impact of Areas Incidentally Connected to the RSDP

Our main specification examining the impact of the RSDP on areas incidentally connected to RSDP roads

uses Kebeles as the unit of analysis and uses a minimum spanning tree (MST) constructed from regional

least cost distance MSTs appended together as an instrument. In this section, we test the sensitivity of

results to using 1x1km pixels as the unit of analysis and using different versions of the MSTs. Section S8.1

shows sample sizes and the number of units near the RSDP and MSTs. Section S8.2 shows OLS results,

section S8.3 shows first-stage results from IV models, section S8.4 shows second-stage results, and section

S8.5 shows second-stage results when using two groups of baseline nighttime lights as opposed to four.

S8.1 Treated and Control Summary Stats

For instrumental variable models, units that are near targeted areas are removed. Table S3 shows the number of units after removing targeted areas. In addition, Table S3 shows the number of units treated (near an RSDP road) and near the MST networks.

Table S3: Number of Treated and Control Units for IV Analysis Across Different Units

              Sample Size                    N Units Near RSDP and Different MSTs
Unit          N Total      N, Targeted       RSDP       MST Least   MST Least Dist:   MST Least   MST Least Cost:
                           Areas Removed                Dist        Regional          Cost        Regional
1x1km Grid    1,337,567    1,271,094         113,908    95,603      106,534           124,674     136,769
Kebeles       15,714       10,220            3,157      1,956       2,233             2,781       3,020

S8.2 OLS Results

Table S4 shows results from OLS models; models (1)–(6) show results using 1x1km pixels as the unit of analysis and models (7)–(12) show results using Kebeles as the unit of analysis. The results show road upgrades associated with gains in nighttime lights and urban land, particularly in areas with medium-to-high baseline nighttime lights.

Table S4: Long Difference, OLS Results - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.02∗∗∗   0.005     0.001∗∗∗  −0.0001   0.001     0.002     0.03∗∗    0.003     0.01∗     −0.001    0.005     0.01
                      (0.01)    (0.004)   (0.0002)  (0.0001)  (0.001)   (0.002)   (0.01)    (0.01)    (0.01)    (0.005)   (0.01)    (0.01)
Imp Rd. × NTL96 Low             0.001               0.001               −0.004              0.02                0.02                −0.01
                                (0.01)              (0.001)             (0.003)             (0.02)              (0.01)              (0.03)
Imp Rd. × NTL96 Med             0.08∗∗              0.001∗              0.002               0.13∗∗              0.02                0.01
                                (0.04)              (0.001)             (0.004)             (0.06)              (0.02)              (0.02)
Imp Rd. × NTL96 High            0.19∗∗∗             0.01∗∗              −0.0001             0.22∗∗              0.20∗∗              −0.04∗
                                (0.06)              (0.005)             (0.01)              (0.10)              (0.08)              (0.02)
Constant              0.04∗∗∗   0.04∗∗∗   0.002∗∗∗  0.002∗∗∗  0.004     0.004     −0.02     −0.01     −0.10∗∗   −0.09∗∗   −0.02     −0.02
                      (0.01)    (0.01)    (0.001)   (0.001)   (0.01)    (0.01)    (0.06)    (0.06)    (0.05)    (0.04)    (0.07)    (0.07)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.004     0.01      0.0002    0.002     0.0000    0.0001    0.003     0.02      0.001     0.02      -0.0001   -0.0001

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic
sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary
variables indicating whether the pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01
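The grouping of Woredas by baseline nighttime lights described in the table notes can be sketched as follows. This is an illustration under our own assumptions: the function name `ntl_groups`, the toy input values, and the handling of values that fall exactly on a cut point are hypothetical and not taken from the paper's code.

```python
import statistics


def ntl_groups(max_ntl_by_woreda: dict[str, float]) -> dict[str, str]:
    """Assign each Woreda to Dark, Low, Medium, or High baseline nighttime lights."""
    lit_values = [v for v in max_ntl_by_woreda.values() if v > 0]
    # Two cut points split the lit Woredas into three quantile groups.
    low_cut, high_cut = statistics.quantiles(lit_values, n=3)
    groups = {}
    for woreda, value in max_ntl_by_woreda.items():
        if value == 0:
            groups[woreda] = "Dark"  # excluded (reference) group in the tables
        elif value <= low_cut:
            groups[woreda] = "Low"
        elif value <= high_cut:
            groups[woreda] = "Medium"
        else:
            groups[woreda] = "High"
    return groups


# Hypothetical maximum baseline nighttime lights for seven Woredas.
example = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4, "f": 5, "g": 6}
groups = ntl_groups(example)
print(groups)
```

Note that the cut points are computed only from Woredas with positive baseline lights, so the Dark group does not affect where the Low, Medium, and High boundaries fall.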

S8.3 First Stage Results

Table S5 shows first stage results of IV models for different variations of the minimum spanning trees, using both 1x1km pixels (models 1–4) and Kebeles (models 5–8) as the unit of analysis. The MSTs are all strongly associated with roads improved under RSDP I-III; the first-stage F-statistics range from 194 to 250 in models using 1x1km pixels and from 31 to 88 in models using Kebeles.
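To make the first-stage F-statistic concrete, the sketch below regresses a treatment indicator on a single instrument and reports the slope and its F-statistic. It is a simplified illustration on simulated data: it omits the control for log distance to the nearest targeted area and uses homoskedastic rather than Woreda-clustered standard errors, and all variable names and simulation parameters are our own assumptions.

```python
import math
import random


def first_stage(z: list[float], d: list[float]) -> tuple[float, float]:
    """Regress d on a constant and z; return the slope and its F-statistic."""
    n = len(z)
    z_bar, d_bar = sum(z) / n, sum(d) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (di - d_bar) for zi, di in zip(z, d)) / s_zz
    intercept = d_bar - slope * z_bar
    rss = sum((di - intercept - slope * zi) ** 2 for zi, di in zip(z, d))
    std_err = math.sqrt(rss / (n - 2) / s_zz)  # homoskedastic, not clustered
    return slope, (slope / std_err) ** 2  # with one instrument, F = t**2


rng = random.Random(0)
near_mst = [float(rng.random() < 0.3) for _ in range(2000)]
# Hypothetical data: being near the MST raises the chance of an improved road.
near_road = [float(rng.random() < 0.1 + 0.3 * z) for z in near_mst]
slope, f_stat = first_stage(near_mst, near_road)
print(f"first-stage slope={slope:.2f}, F={f_stat:.1f}")
```

A common rule of thumb treats a first-stage F-statistic above 10 as indicating a sufficiently strong instrument.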

Table S5: First Stage, RSDP I-III

Dependent variable: Near Improved Road

Model                             (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
Near Least Cost MST               0.29∗∗∗                                 0.24∗∗∗
                                  (0.02)                                  (0.03)
Near Min. Distance MST                      0.27∗∗∗                                 0.15∗∗∗
                                            (0.02)                                  (0.03)
Near Least Cost MST: Regional                         0.26∗∗∗                                 0.22∗∗∗
                                                      (0.02)                                  (0.02)
Near Min. Distance MST: Regional                                0.25∗∗∗                                 0.15∗∗∗
                                                                (0.02)                                  (0.03)
Constant                          0.89∗∗∗   0.98∗∗∗   0.92∗∗∗   0.99∗∗∗   2.67∗∗∗   3.06∗∗∗   2.69∗∗∗   3.02∗∗∗
                                  (0.08)    (0.08)    (0.08)    (0.08)    (0.15)    (0.14)    (0.15)    (0.15)
Unit                              1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat                  250.2     213.3     208.6     194.8     88.1      31.3      84.5      35.9
Observations                      1,271,082 1,271,082 1,271,082 1,271,082 10,220    10,220    10,220    10,220
Adjusted R2                       0.21      0.19      0.21      0.19      0.22      0.20      0.22      0.20

Standard errors are clustered on Woredas. “Near” indicates that the unit was within 5km of the MST or improved road.
Least Cost MSTs minimize the estimated construction cost of connecting targeted areas (construction cost is a function of
slope and land cover), while Min. Distance MSTs minimize the Euclidean distance between targeted areas. Regional MSTs
are first computed for the targeted areas within each of Ethiopia’s regions and are then appended together. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

S8.4 Second Stage Results

Tables S6-S9 show results from the second stage of IV models using different versions of the MSTs. Across different versions of the MSTs, results show road upgrades associated with growth in NTL and urban area, particularly in areas with medium-to-high baseline nighttime lights.
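The logic of the second stage can be sketched with the simplest IV estimator: with one instrument and one endogenous regressor, two-stage least squares reduces to the ratio of two covariances. The simulation below is purely illustrative; the data-generating process, the confounder `u`, and the value of `TRUE_EFFECT` are assumptions of ours and not estimates from the paper, and the sketch omits controls, interactions, and clustering.

```python
import random


def cov(a: list[float], b: list[float]) -> float:
    """Sample covariance of two equally long lists."""
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    return sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(1)
TRUE_EFFECT = 0.1  # hypothetical effect of an improved road on the outcome
z, d, y = [], [], []
for _ in range(20000):
    u = rng.gauss(0, 1)             # unobserved local economic potential
    zi = float(rng.random() < 0.5)  # near the MST (instrument)
    di = float(1.5 * zi + u + rng.gauss(0, 1) > 0.5)  # near an improved road
    z.append(zi)
    d.append(di)
    y.append(TRUE_EFFECT * di + u + rng.gauss(0, 0.5))

ols_estimate = cov(d, y) / cov(d, d)  # biased: roads are placed where u is high
iv_estimate = cov(z, y) / cov(z, d)   # 2SLS with a single instrument
print(f"OLS={ols_estimate:.2f}  IV={iv_estimate:.2f}  truth={TRUE_EFFECT}")
```

In this simulated setting the OLS estimate is inflated because road placement is correlated with unobserved potential, while the IV estimate is centered near the true effect because the instrument is unrelated to that potential by construction.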

Table S6: Long Difference, IV Results using Least Euclidean Distance MST - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.03∗     0.01      0.001     0.0000    0.004     0.004     0.01      −0.005    0.04      0.04      −0.05     −0.06
                      (0.02)    (0.01)    (0.001)   (0.001)   (0.01)    (0.01)    (0.06)    (0.05)    (0.04)    (0.04)    (0.07)    (0.07)
Imp Rd. × NTL96 Low             −0.02∗∗             0.0001              −0.01               0.01                −0.01               −0.06
                                (0.01)              (0.0003)            (0.01)              (0.03)              (0.01)              (0.09)
Imp Rd. × NTL96 Med             0.13∗∗              0.004∗∗             0.001               0.19∗∗              0.06                0.06
                                (0.06)              (0.002)             (0.01)              (0.10)              (0.05)              (0.05)
Imp Rd. × NTL96 High            0.25∗∗              0.03∗               0.01                0.22∗               0.50∗               −0.06
                                (0.11)              (0.01)              (0.03)              (0.13)              (0.27)              (0.05)
Constant              0.04∗∗    0.03∗     0.001     0.0003    0.001     0.0004    0.03      −0.005    −0.20     −0.28     0.18      0.19
                      (0.02)    (0.02)    (0.001)   (0.001)   (0.01)    (0.01)    (0.20)    (0.21)    (0.13)    (0.17)    (0.25)    (0.25)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat      213.31    58.81     213.32    58.81     213.32    58.81     31.34     8.19      31.34     8.19      31.34     8.19
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.004     0.01      0.0001    -0.001    -0.0000   -0.0000   0.002     0.02      -0.003    -0.04     -0.01     -0.01

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic
sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary
variables indicating whether the pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S7: Long Difference, IV Results using Least Euclidean Distance Regional MST - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.02      −0.01     0.001     0.0000    0.01      0.01      −0.01     −0.04     0.02      0.02      −0.03     −0.03
                      (0.01)    (0.01)    (0.001)   (0.001)   (0.01)    (0.01)    (0.05)    (0.05)    (0.03)    (0.04)    (0.06)    (0.06)
Imp Rd. × NTL96 Low             −0.01               0.0002              −0.01∗              0.01                −0.004              −0.05
                                (0.01)              (0.0003)            (0.01)              (0.03)              (0.01)              (0.08)
Imp Rd. × NTL96 Med             0.14∗∗              0.004∗∗             −0.0003             0.19∗∗              0.05                0.04
                                (0.06)              (0.002)             (0.01)              (0.08)              (0.04)              (0.05)
Imp Rd. × NTL96 High            0.27∗∗              0.03∗               0.004               0.19                0.44∗               −0.05
                                (0.11)              (0.02)              (0.02)              (0.12)              (0.25)              (0.05)
Constant              0.05∗∗∗   0.04∗∗    0.001     0.0003    −0.001    −0.001    0.12      0.10      −0.14     −0.20     0.09      0.10
                      (0.02)    (0.02)    (0.001)   (0.001)   (0.01)    (0.01)    (0.18)    (0.18)    (0.11)    (0.15)    (0.22)    (0.22)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat      194.84    55.31     194.84    55.31     194.84    55.31     35.92     10.65     35.92     10.65     35.92     10.65
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.004     0.01      0.0002    -0.002    -0.0001   -0.0001   -0.003    0.02      0.001     -0.01     -0.002    -0.004

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic
sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary
variables indicating whether the pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S8: Long Difference, IV Results using Least Cost MST - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.05∗∗∗   0.02∗∗    0.002∗    −0.0000   0.01∗∗∗   0.01∗∗∗   0.09∗∗    0.07∗∗    0.01      0.01      0.02      0.02
                      (0.02)    (0.01)    (0.001)   (0.0004)  (0.005)   (0.01)    (0.04)    (0.03)    (0.02)    (0.02)    (0.04)    (0.04)
Imp Rd. × NTL96 Low             −0.003              0.001               −0.02∗∗∗            0.03                0.02                −0.05
                                (0.02)              (0.001)             (0.01)              (0.04)              (0.02)              (0.06)
Imp Rd. × NTL96 Med             0.13∗∗              0.003∗∗             −0.01               0.17∗               0.04                −0.005
                                (0.07)              (0.001)             (0.01)              (0.10)              (0.03)              (0.02)
Imp Rd. × NTL96 High            0.27∗∗              0.03∗               −0.001              0.20∗               0.38∗∗              −0.06
                                (0.11)              (0.02)              (0.01)              (0.11)              (0.20)              (0.04)
Constant              0.01      0.01      0.001     0.0002    −0.01     −0.01     −0.23∗    −0.26∗∗   −0.10     −0.16∗    −0.07     −0.07
                      (0.02)    (0.02)    (0.001)   (0.001)   (0.01)    (0.01)    (0.13)    (0.13)    (0.07)    (0.09)    (0.14)    (0.14)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat      250.25    69.65     250.25    69.65     250.25    69.65     88.07     24.73     88.07     24.73     88.07     24.73
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.002     0.01      0.0001    -0.002    -0.001    -0.001    -0.01     0.01      0.001     0.001     -0.001    -0.001

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic
sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary
variables indicating whether the pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S9: Long Difference, IV Results using Least Cost Regional MST - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.04∗∗∗   0.02      0.002∗    −0.0000   0.01∗∗∗   0.02∗∗∗   0.08∗∗    0.04      0.01      −0.01     0.04      0.05
                      (0.01)    (0.01)    (0.001)   (0.0004)  (0.004)   (0.01)    (0.04)    (0.03)    (0.02)    (0.02)    (0.04)    (0.04)
Imp Rd. × NTL96 Low             0.0002              0.001               −0.02∗∗∗            0.02                0.02                −0.06
                                (0.01)              (0.001)             (0.01)              (0.03)              (0.02)              (0.06)
Imp Rd. × NTL96 Med             0.15∗∗              0.003∗∗             −0.01               0.18∗∗              0.04                −0.004
                                (0.07)              (0.001)             (0.01)              (0.09)              (0.03)              (0.02)
Imp Rd. × NTL96 High            0.25∗∗∗             0.03∗               −0.005              0.17∗               0.27∗∗              −0.05∗
                                (0.09)              (0.01)              (0.01)              (0.09)              (0.13)              (0.03)
Constant              0.02      0.02      0.0005    0.0003    −0.01∗    −0.01     −0.18     −0.16     −0.09     −0.07     −0.15     −0.16
                      (0.02)    (0.02)    (0.001)   (0.001)   (0.01)    (0.01)    (0.13)    (0.13)    (0.08)    (0.07)    (0.15)    (0.15)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat      208.63    62.22     208.63    62.22     208.63    62.22     84.51     25.44     84.51     25.44     84.51     25.44
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.003     0.01      0.0001    -0.002    -0.001    -0.001    -0.01     0.02      0.001     0.02      -0.003    -0.004

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic
sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary
variables indicating whether the pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the
average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area
within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

S8.5 Second Stage Results - Different Baseline Nighttime Light Groups

Our primary models examine heterogeneity of impacts of road upgrades across four groups of baseline

nighttime lights. To test the sensitivity of the results to different groups of baseline nighttime lights, we also

estimate models using just two groups: Woredas with no nighttime lights and Woredas with some positive

nighttime lights at baseline. Here, results are generally similar to when using four groups: road upgrades

are associated with growth in nighttime lights and urban land, particularly in areas with higher baseline

nighttime lights.

Table S10: Long Difference, OLS and IV Results - RSDP I - III

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
Imp Rd.               0.04∗∗∗   0.02      0.002∗    0.0000    0.01∗∗∗   0.02∗∗∗   0.08∗∗    0.05      0.01      −0.01     0.04      0.05
                      (0.01)    (0.01)    (0.001)   (0.0004)  (0.004)   (0.01)    (0.04)    (0.03)    (0.02)    (0.02)    (0.04)    (0.04)
Imp Rd. × NTL96 Lit             0.11∗∗∗             0.01∗∗∗             −0.01∗∗             0.12∗∗∗             0.08∗∗∗             −0.03
                                (0.03)              (0.003)             (0.01)              (0.04)              (0.03)              (0.03)
Constant              0.02      0.02      0.0005    0.0001    −0.01∗    −0.01     −0.18     −0.17     −0.09     −0.08     −0.15     −0.16
                      (0.02)    (0.02)    (0.001)   (0.001)   (0.01)    (0.01)    (0.13)    (0.13)    (0.08)    (0.07)    (0.15)    (0.15)
Unit                  1km       1km       1km       1km       1km       1km       Keb.      Keb.      Keb.      Keb.      Keb.      Keb.
1st Stage F-Stat      208.63    104.26    208.63    104.26    208.63    104.26    84.51     42.55     84.51     42.55     84.51     42.55
Observations          1,271,082 1,271,082 1,271,094 1,271,094 1,271,094 1,271,094 10,220    10,220    10,220    10,220    10,220    10,220
Adjusted R2           0.003     0.005     0.0001    -0.001    -0.001    -0.001    -0.01     0.01      0.001     0.01      -0.003    -0.004

Standard errors are clustered on Woredas. The models include a control for the log distance to the nearest targeted area.
“Imp Rd.” indicates that a unit was near (within 5km of) an improved road. NTL96 Lit indicates that the Woredas had
some positive nighttime lights at baseline. The excluded group is Woredas where the maximum value of nighttime lights at
baseline was 0. When using 1x1km pixels as the unit of analysis, NTL is the inverse hyperbolic sine of nighttime lights of the
1x1km pixel, which has a similar interpretation as logs, and Urban and Cropland are binary variables indicating whether the
pixel contains urban or cropland. When using Kebeles as the unit of analysis, NTL is the average nighttime lights within
Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles; we use the
inverse hyperbolic sine transformation on all outcome variables. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

S9 Balance Across Non-Targeted Treated and Control Areas

We check balance in outcome variables across non-targeted treated and control areas. We use both levels

of outcome variables at baseline (1996) and compare differences in pre-trends (using the difference between

1992 and 1996). Levels and trends of outcome variables tend to be unbalanced, emphasizing that simple

comparisons between treated and non-treated areas are insufficient to understand impacts of road upgrades.
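A balance check of this kind compares group means and tests whether their difference is distinguishable from zero. The sketch below computes a difference in means and a Welch t-statistic on made-up values; the paper does not state which variant of the t-test it uses, so the unequal-variance form and the function name `difference_in_means` are assumptions for illustration only.

```python
import math
import statistics


def difference_in_means(treated: list[float], control: list[float]) -> tuple[float, float]:
    """Return the difference in means and its Welch t-statistic."""
    diff = statistics.mean(treated) - statistics.mean(control)
    std_err = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return diff, diff / std_err


# Hypothetical baseline values for units near and far from an RSDP road.
treated_ntl = [1.0, 2.0, 3.0, 4.0, 5.0]
control_ntl = [2.0, 4.0, 6.0, 8.0, 10.0]
diff, t_stat = difference_in_means(treated_ntl, control_ntl)
print(f"difference={diff:.2f}, t={t_stat:.2f}")
```

The same function can be applied to pre-trends by passing, for each unit, the change in the outcome between 1992 and 1996 instead of the 1996 level.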

Table S11: Summary Statistics Across Non-Targeted Treated and Control Areas: Kebeles, RSDP I-III

                        (1)             (2)             T-test
                        Treatment       Control         Difference
                        (< 5km RSDP)    (> 5km RSDP)
Variable                Mean/SE         Mean/SE         (1) - (2)
NTL (1996)              0.06            0.02            0.04∗∗
                        (0.02)          (0.01)
Urban (1996)            0.03            0.22            -0.19∗∗∗
                        (0.01)          (0.04)
Cropland (1996)         134.21          130.81          3.4
                        (2.66)          (1.78)
NTL (Pre-trends)        0.03            0.01            0.02∗
                        (0.01)          (0)
Urban (Pre-trends)      0               0.02            -0.02∗∗
                        (0)             (0.01)
Cropland (Pre-trends)   0.69            0.59            0.1
                        (0.09)          (0.07)
N                       3,157           7,063

Table S12: Summary Statistics Across Non-Targeted Treated and Control Areas: 1km Grid, RSDP I-III

                        (1)             (2)             T-test
                        Treatment       Control         Difference
                        (< 5km RSDP)    (> 5km RSDP)
Variable                Mean/SE         Mean/SE         (1) - (2)
NTL (1996)              0.0496          0.0056          0.044∗∗∗
                        (0.0024)        (2e-04)
Urban (1996)            0.001           8e-04           2e-04∗∗∗
                        (1e-04)         (0)
Cropland (1996)         0.6086          0.2338          0.3748∗∗∗
                        (0.0014)        (4e-04)
NTL (Pre-trends)        0.016           0.0022          0.0138∗∗∗
                        (0.0012)        (1e-04)
Urban (Pre-trends)      1e-04           1e-04           0
                        (0)             (0)
Cropland (Pre-trends)   0.0013          3e-04           0.001∗∗∗
                        (1e-04)         (0)
N                       113,908         1,157,186

S10 Constructing Minimum Spanning Trees

Minimum spanning trees (MST) are constructed to instrument improved roads. To develop the MSTs,

we first construct the least cost path between each pair of targeted areas, then use Kruskal’s algorithm to

construct the MST (Kruskal (1956)). Following Faber (2014), two costs are used. First, we use the Euclidean

distance between Woredas, calculated using the great circle distance between pairs of points. Second, we

connect Woredas using the least cost path, where cost is the construction cost of crossing a pixel. We use

the following equation to proxy construction cost for crossing each 1x1km pixel (Faber (2014)).

$$ c_i = 1 + Slope_i + 25 \cdot Developed_i + 25 \cdot Water_i + 25 \cdot Wetland_i \tag{2} $$

where $c_i$ is the construction cost of crossing pixel $i$, $Slope_i$ is the average slope gradient of pixel $i$, and $Developed_i$, $Water_i$, and $Wetland_i$ are indicator variables for whether the pixel contains developed (i.e., built-up), water, or wetland areas. Slope is derived from elevation data from the Shuttle Radar Topography Mission (SRTM), and Developed, Water, and Wetland come from the European Space Agency Globcover dataset using 1996 data. The original resolution of the Globcover dataset is 300 meters; consequently, each indicator variable is set to 1 if any portion of the 1x1km pixel is covered by the relevant land cover class. Figure S12 shows the construction cost surface.
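Equation (2) translates directly into a small function. The sketch below is illustrative only: the function names and the land cover class labels are hypothetical, and the `any_cover` helper mirrors the rule that an indicator equals 1 if any 300-meter Globcover cell within the 1x1km pixel belongs to the relevant class.

```python
def construction_cost(slope: float, developed: bool, water: bool, wetland: bool) -> float:
    """Proxy construction cost of crossing one 1x1km pixel, following equation (2)."""
    return 1 + slope + 25 * developed + 25 * water + 25 * wetland


def any_cover(subpixel_classes: list[str], target: str) -> bool:
    """Indicator is 1 if any 300m Globcover cell in the pixel has the target class."""
    return any(cls == target for cls in subpixel_classes)


# Hypothetical pixel: average slope of 2, partly built-up, no water or wetland.
classes = ["cropland", "developed", "cropland"]
cost = construction_cost(
    slope=2.0,
    developed=any_cover(classes, "developed"),
    water=any_cover(classes, "water"),
    wetland=any_cover(classes, "wetland"),
)
print(cost)
```

A flat pixel with none of the three land cover classes has the minimum cost of 1, so the penalty of 25 makes crossing a built-up, water, or wetland pixel far more costly than crossing moderately sloped open land.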

Figure S12: Construction cost surface used to construct the least cost distance. Lighter colors indicate higher construction
costs.

Between each pair of targeted areas, the least cost path is computed using Dijkstra’s algorithm. Each pixel can connect to one of the 4 horizontally or vertically neighboring cells or one of the 4 diagonally neighboring cells. Kruskal’s algorithm is then used to compute the MST.
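The least cost computation on a cost surface can be sketched with a standard implementation of Dijkstra's algorithm on an 8-connected grid. The cost of a step is not specified in the text; here we assume, for illustration only, that it equals the mean cost of the two cells multiplied by the step length (1 for horizontal or vertical moves, the square root of 2 for diagonal moves). The function name and the toy surface are hypothetical.

```python
import heapq
import math


def least_cost(cost: list[list[float]], start: tuple[int, int], end: tuple[int, int]) -> float:
    """Dijkstra's algorithm on a grid where each cell links to its 8 neighbors."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == end:
            return dist
        if dist > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Assumed convention: mean cost of the two cells times step length.
                step = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2
                if dist + step < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = dist + step
                    heapq.heappush(queue, (dist + step, (nr, nc)))
    return math.inf


surface = [[1, 1, 1], [1, 100, 1], [1, 1, 1]]  # costly pixel in the middle
total = least_cost(surface, (0, 0), (2, 2))
print(round(total, 3))
```

In this toy surface the cheapest route goes around the costly center pixel rather than diagonally through it, which is the behavior that makes least cost paths differ from straight lines.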

We use two approaches for determining which targeted areas should be connected. First, we develop MSTs

that connect all targeted areas. Second, we develop MSTs that connect targeted areas within each region,

then append the regional MSTs together. Given that regional governments initially submitted proposals

for which roads should be improved, the regional approach assumes that regions may prioritize connecting

cities within the region as opposed to outside of the region. Figure S13 shows MSTs that initially connect

all targeted areas, and figure S14 shows MSTs where regional MSTs are appended.
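The difference between the two approaches can be sketched with a compact implementation of Kruskal's algorithm. The node names, region labels, and edge costs below are made up for illustration; in the paper the edge costs would be the pairwise least cost (or Euclidean) distances between targeted areas.

```python
def kruskal(nodes: list[str], edges: list[tuple[float, str, str]]) -> list[tuple[float, str, str]]:
    """Kruskal's algorithm: add edges in order of cost unless they close a cycle."""
    parent = {node: node for node in nodes}

    def find(node: str) -> str:
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    tree = []
    for cost, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((cost, a, b))
    return tree


# Hypothetical targeted areas, their regions, and pairwise connection costs.
region = {"A": "north", "B": "north", "C": "south", "D": "south"}
edges = [(1, "A", "B"), (2, "B", "C"), (3, "A", "C"), (4, "C", "D"), (5, "B", "D")]

national_mst = kruskal(list(region), edges)

regional_mst = []
for name in sorted(set(region.values())):
    members = [n for n in region if region[n] == name]
    within = [e for e in edges if region[e[1]] == name and region[e[2]] == name]
    regional_mst += kruskal(members, within)  # append the regional trees

print(national_mst)
print(regional_mst)
```

In this example the appended regional trees contain no edge that crosses a regional border, so the regional network differs from the tree that connects all targeted areas at once.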

Figure S13: Minimum Spanning Trees, Appended: RSDP I - III

Figure S14: Regional Minimum Spanning Trees, Appended: RSDP I - III

S11 Estimating Travel Time from Road Network Data

To determine changes in market access, we estimate travel times from each administrative unit (Kebeles or

Woredas). Travel times are estimated using the speed assigned to roads in the RSDP dataset in each year.

To estimate travel times, we first transform the RSDP shapefile into a raster grid with 3x3km resolution. We assign each grid cell the speed limit of the fastest road that crosses it; if no road intersects the cell, we assign it a speed of 5km/h (walking speed). Figure S15 shows a rasterized version of the road network in 1996 and 2016. We then compute a second raster in which the value of each cell is the travel time needed to cross it.
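The conversion from road speeds to a per-cell travel time can be sketched as follows. The constants come from the text (3km cells and a 5km/h walking speed); the function names are ours, and the sketch assumes a cell is crossed along its side length, ignoring the longer distance of diagonal crossings.

```python
CELL_SIZE_KM = 3.0       # raster resolution stated in the text
WALKING_SPEED_KMH = 5.0  # speed assigned to cells with no road


def cell_speed(road_speeds_kmh: list[float]) -> float:
    """Speed of a cell: the fastest road crossing it, else walking speed."""
    return max(road_speeds_kmh, default=WALKING_SPEED_KMH)


def cell_travel_time_hours(road_speeds_kmh: list[float]) -> float:
    """Time to cross one cell, ignoring diagonal crossings (a simplification)."""
    return CELL_SIZE_KM / cell_speed(road_speeds_kmh)


print(cell_travel_time_hours([]))            # no road: 3 km at walking speed
print(cell_travel_time_hours([30.0, 70.0]))  # the fastest road sets the speed
```

Under these assumptions a cell without a road takes 0.6 hours to cross, so any upgrade that brings a road into a previously roadless cell sharply reduces the cost of traversing it.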

To determine the least cost path between a pair of locations, we use Dijkstra’s algorithm to minimize

the total cost (travel time) to traverse the grid between the locations. Figure S16 shows the least cost path

between two example cities—Debre Markos and Jimma—in 1996 and 2016. In 1996, the travel time between

the two cities was 10.24 hours using a route 508km in length; in 2016, the travel time was 7.04 hours, where

a longer route of 514km (using faster roads) was used.
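The figures in this example imply average speeds and a travel time reduction that can be checked with simple arithmetic. The snippet below uses only the distances and travel times reported in the text; the variable names are ours.

```python
routes = {1996: (508.0, 10.24), 2016: (514.0, 7.04)}  # (km, hours) from the text

avg_speed = {year: km / hours for year, (km, hours) in routes.items()}
time_saved = 1 - routes[2016][1] / routes[1996][1]

for year, speed in avg_speed.items():
    print(f"{year}: {speed:.1f} km/h on average")
print(f"travel time reduction: {time_saved:.1%}")
```

The implied average speed rises from roughly 50km/h to roughly 73km/h, and travel time falls by about 31 percent even though the 2016 route is 6km longer.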

Figure S15: Speed Limit Grid Across Ethiopia

Figure S16: Fastest route between Debre Markos and Jimma in 1996 and 2016. The green dots show the cities
(Debre Markos on top in the figure and Jimma in the bottom), and the white line shows the fastest route.

S12 Market Access Two-Way Fixed Effect Model

Our primary model to measure the impact of changes in market access uses a long difference framework.

To test the robustness of long difference results, and to take advantage of the panel nature of our data, we

implement a two-way fixed effect model. This approach follows from Blankespoor et al. (2017) who use a

similar two-way fixed effects approach to measure how changes in market access from 1986 to 2014 impact

employment and specialization in Mexico. Our model is described by the below equation:

$$ Y_{i,t} = \beta_M MA_{i,t} + \beta_I \left( MA_{i,t} \times N_i^{init} \right) + \beta_C C_{i,t} + \gamma_t + \delta_i + \epsilon_{i,t} \tag{3} $$

where $Y_{i,t}$ is our outcome variable (nighttime lights, urbanization, or cropland) for unit $i$ (Kebeles or Woredas) in time $t$, $MA$ is the log of market access, $N^{init}$ is a vector of interaction terms whose values are constant over time (low, medium, or high nighttime lights at baseline), $C$ is a vector of time-varying unit-level characteristics (temperature and precipitation), $\gamma_t$ are time fixed effects, $\delta_i$ are Woreda fixed effects, and $\epsilon_{i,t}$ is the error term. Our outcome variables include average nighttime lights, the number of pixels with values above 2 and 6, and the number of urban pixels within each Woreda. The coefficients of interest are $\beta_M$ and $\beta_I$, which capture the impact of market access and how that impact varies across groups of baseline nighttime lights. As in the panel data models, we also implement a version of equation (3) that instruments market access with the market access “doughnut” variable.
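The role of the unit and time fixed effects can be illustrated with the within transformation on a balanced panel: subtracting unit means and year means (and adding back the grand mean) removes both sets of fixed effects, after which the market access coefficient is a simple ratio. The sketch below is a stripped-down illustration that omits the interaction terms, the time-varying controls, and clustering; the noise-free data and the value of `TRUE_BETA` are hypothetical.

```python
def two_way_demean(panel: list[list[float]]) -> list[list[float]]:
    """Remove unit and time means from a balanced panel (rows: units, columns: years)."""
    n_units, n_years = len(panel), len(panel[0])
    unit_mean = [sum(row) / n_years for row in panel]
    year_mean = [sum(row[t] for row in panel) / n_units for t in range(n_years)]
    grand_mean = sum(unit_mean) / n_units
    return [
        [panel[i][t] - unit_mean[i] - year_mean[t] + grand_mean for t in range(n_years)]
        for i in range(n_units)
    ]


# Hypothetical noise-free panel: y = 0.5 * log market access + unit effect + year effect.
TRUE_BETA = 0.5
log_ma = [[(i + 1) * (t + 1) ** 0.5 for t in range(4)] for i in range(5)]
y = [[TRUE_BETA * log_ma[i][t] + 2.0 * i + 0.3 * t for t in range(4)] for i in range(5)]

x_dm, y_dm = two_way_demean(log_ma), two_way_demean(y)
beta_hat = sum(a * b for xr, yr in zip(x_dm, y_dm) for a, b in zip(xr, yr)) / sum(
    a * a for xr in x_dm for a in xr
)
print(round(beta_hat, 6))
```

Because the simulated unit and year effects are additive, the demeaning removes them entirely and the estimate recovers the assumed coefficient; with real data, identification comes only from changes in market access that differ across units within the same year.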

Tables S13 and S14 show results when using four groups of nighttime lights, and Tables S15 and S16 test the sensitivity of these results using two groups of nighttime lights. Results show that road upgrades are associated with growth in nighttime lights and urban land, particularly in areas with medium-to-high baseline nighttime lights. Coefficients tend to be insignificant for cropland.

Table S13: Association of market access with outcome variables using OLS, panel data

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
MA                    −0.01     −0.03∗∗∗  0.01∗∗∗   −0.003    −0.0003   0.001     0.02      −0.02     −0.02∗    −0.07∗∗∗  −0.01     −0.003
                      (0.01)    (0.01)    (0.002)   (0.002)   (0.002)   (0.002)   (0.02)    (0.03)    (0.01)    (0.02)    (0.01)    (0.01)
MA × NTL96 Low                  0.06∗∗∗             0.02∗∗∗             0.004               0.10∗∗              0.02                0.01
                                (0.02)              (0.01)              (0.004)             (0.04)              (0.02)              (0.01)
MA × NTL96 Med                  0.12∗∗∗             0.03∗∗∗             −0.003              0.15∗∗∗             0.09∗               0.01
                                (0.04)              (0.01)              (0.005)             (0.05)              (0.05)              (0.01)
MA × NTL96 High                 0.20∗∗∗             0.25∗∗∗             −0.06∗∗             0.14∗               0.48∗∗∗             −0.07
                                (0.04)              (0.05)              (0.02)              (0.08)              (0.14)              (0.07)
Unit                  Keb.      Keb.      Keb.      Keb.      Keb.      Keb.      Wor.      Wor.      Wor.      Wor.      Wor.      Wor.
Year FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Unit FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Observations          329,973   329,973   329,973   329,973   329,973   329,973   16,380    16,380    16,380    16,380    16,380    16,380
R2                    0.70      0.70      0.90      0.90      0.99      0.99      0.91      0.91      0.96      0.97      1.00      1.00
Adjusted R2           0.69      0.69      0.89      0.90      0.99      0.99      0.90      0.90      0.96      0.96      1.00      1.00

Standard errors clustered on Woredas when using Kebeles as the unit of analysis and standard errors clustered on Zones when
using Woredas as the unit of analysis. MA is the logged value of market access. NTL96 Low, Medium, and High are dummy
variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum value of
nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the maximum
value of nighttime lights at baseline was 0. NTL is the average nighttime lights within units (Kebeles or Woredas), Urban is
the total urban area within units, and Cropland is the total cropland area within units; we use the inverse hyperbolic sine
transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S14: Association of market access with outcome variables using 50km doughnut IV, panel data

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
MA                    −0.002    −0.04∗∗∗  0.02∗∗∗   −0.002    −0.001    0.003     0.04      −0.04     0.05∗∗    −0.05∗∗   −0.01     0.004
                      (0.01)    (0.01)    (0.004)   (0.003)   (0.005)   (0.01)    (0.03)    (0.04)    (0.03)    (0.02)    (0.01)    (0.01)
MA × NTL96 Low                  0.08∗∗∗             0.02∗∗∗             0.005               0.19∗∗∗             0.03                0.01
                                (0.03)              (0.01)              (0.01)              (0.06)              (0.03)              (0.01)
MA × NTL96 Med                  0.18∗∗∗             0.05∗∗∗             −0.01               0.22∗∗∗             0.12∗∗∗             −0.001
                                (0.05)              (0.01)              (0.01)              (0.07)              (0.05)              (0.01)
MA × NTL96 High                 0.25∗∗∗             0.37∗∗∗             −0.09∗∗             0.21∗               0.71∗∗∗             −0.09
                                (0.06)              (0.08)              (0.04)              (0.11)              (0.17)              (0.08)
Unit                  Keb.      Keb.      Keb.      Keb.      Keb.      Keb.      Wor.      Wor.      Wor.      Wor.      Wor.      Wor.
Year FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Unit FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Observations          329,973   329,973   329,973   329,973   329,973   329,973   16,380    16,380    16,380    16,380    16,380    16,380
R2                    0.70      0.70      0.90      0.90      0.99      0.99      0.91      0.91      0.96      0.96      1.00      1.00
Adjusted R2           0.69      0.69      0.89      0.90      0.99      0.99      0.90      0.90      0.96      0.96      1.00      1.00

Standard errors clustered on Woredas when using Kebeles as the unit of analysis and standard errors clustered on Zones when
using Woredas as the unit of analysis. MA is the logged value of market access. Market access is instrumented with a 50km
MA Doughnut variable; that is, in calculating market access, units within 50km are excluded. NTL96 Low, Medium, and High
are dummy variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum
value of nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the
maximum value of nighttime lights at baseline was 0. NTL is the average nighttime lights within units (Kebeles or Woredas),
Urban is the total urban area within units, and Cropland is the total cropland area within units; we use the inverse hyperbolic
sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S15: Association of market access with outcome variables using OLS, panel data

Outcome               NTL       NTL       Urban     Urban     Cropland  Cropland  NTL       NTL       Urban     Urban     Cropland  Cropland
Model                 (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
MA                    −0.01     −0.03∗∗∗  0.01∗∗∗   −0.004∗   −0.0003   0.001     0.02      −0.02     −0.02∗    −0.08∗∗∗  −0.01     −0.002
                      (0.01)    (0.01)    (0.002)   (0.002)   (0.002)   (0.002)   (0.02)    (0.03)    (0.01)    (0.02)    (0.01)    (0.01)
MA × NTL96 Lit                  0.11∗∗∗             0.06∗∗∗             −0.01∗              0.13∗∗∗             0.18∗∗∗             −0.02
                                (0.02)              (0.01)              (0.004)             (0.03)              (0.04)              (0.02)
Unit                  Keb.      Keb.      Keb.      Keb.      Keb.      Keb.      Wor.      Wor.      Wor.      Wor.      Wor.      Wor.
Year FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Unit FE               Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y         Y
Observations          329,973   329,973   329,973   329,973   329,973   329,973   16,380    16,380    16,380    16,380    16,380    16,380
R2                    0.70      0.70      0.90      0.90      0.99      0.99      0.91      0.91      0.96      0.96      1.00      1.00
Adjusted R2           0.69      0.69      0.89      0.89      0.99      0.99      0.90      0.90      0.96      0.96      1.00      1.00

Standard errors clustered on Woredas when using Kebeles as the unit of analysis and standard errors clustered on Zones when
using Woredas as the unit of analysis. MA is the logged value of market access. NTL96 Lit indicates that the Woreda had some
positive nighttime lights at baseline. The excluded group is Woredas where the maximum value of nighttime lights at baseline
was 0. NTL is the average nighttime lights within units (Kebeles or Woredas), Urban is the total urban area within units, and
Cropland is the total cropland area within units; we use the inverse hyperbolic sine transformation on all outcome variables,
which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S16: Association of market access on outcome variables using 50km doughnut IV, panel data

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.002 −0.03∗∗∗ 0.02∗∗∗ 0.0001 −0.001 0.003 0.04 −0.04 0.05∗∗ −0.05∗∗ −0.01 0.004
(0.01) (0.01) (0.004) (0.003) (0.005) (0.01) (0.03) (0.04) (0.03) (0.02) (0.01) (0.01)
MA×N T L96 Lit 0.15∗∗∗ 0.09∗∗∗ −0.02∗ 0.20∗∗∗ 0.28∗∗∗ −0.02
(0.03) (0.02) (0.01) (0.04) (0.06) (0.03)
Unit Keb. Keb. Keb. Keb. Keb. Keb. W or. W or. W or. W or. W or. W or.
Year FE Y Y Y Y Y Y Y Y Y Y Y Y
Unit FE Y Y Y Y Y Y Y Y Y Y Y Y
Observations 329,973 329,973 329,973 329,973 329,973 329,973 16,380 16,380 16,380 16,380 16,380 16,380
R2 0.70 0.70 0.90 0.90 0.99 0.99 0.91 0.91 0.96 0.96 1.00 1.00
Adjusted R2 0.69 0.69 0.89 0.89 0.99 0.99 0.90 0.90 0.96 0.96 1.00 1.00

Standard errors clustered on Woredas when using Kebeles as the unit of analysis and standard errors clustered on Zones when
using Woredas as the unit of analysis. MA is the logged value of market access. Market access is instrumented with a 50km
MA Doughnut variable; that is, in calculating market access, units within 50km are excluded. NTL96 Low, Medium, and High
are dummy variables for different groups of baseline nighttime lights; groups are formed using 3-quantiles of the maximum
value of nighttime lights within Woredas that had some positive nighttime lights. The excluded group is Woredas where the
maximum value of nighttime lights at baseline was 0. NTL is the average nighttime lights within units (Kebeles or Woredas),
Urban is the total urban area within units, and Cropland is the total cropland area within units; we use the inverse hyperbolic
sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

S13 Difference in Differences - Additional Results

We explore the sensitivity of the difference-in-differences results to using 1x1km pixels instead of Kebeles as the
unit of analysis, to excluding areas near Addis Ababa, and to using different groups of baseline nighttime
lights. First, Figures S17 and S18 show results when using the 1x1km grid as the unit of analysis. Second,
Figures S19 and S20 show results when excluding Kebeles within 100km of Addis Ababa. Third, Figure S21
shows heterogeneity of results across areas with zero or positive baseline nighttime lights (as opposed to the
primary models, which use four groups of baseline nighttime lights). The results are generally similar to
those of the main specification.

S13.1 Results using 1x1km Grid as Unit of Analysis

Figure S17: Association of road improvements with nighttime lights, urbanization, and cropland using 1x1km
pixels as the unit of analysis. Dark, Low, Medium, and High are different groups of baseline nighttime lights. Low-High
groups are formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime
lights. Dark indicates Woredas where the maximum value of nighttime lights at baseline was 0. Nighttime Lights is the inverse
hyperbolic sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs. Urban and Cropland are
binary variables indicating whether the pixel contains urban or cropland.
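
As an illustration of how the Dark, Low, Medium, and High groups described in the caption could be formed, the following is a minimal Python sketch, not the authors' code. The function name `baseline_groups`, the example values, and the treatment of values that fall exactly on a cut point are assumptions made for illustration.

```python
import numpy as np

def baseline_groups(max_ntl):
    """Label each Woreda by its maximum baseline nighttime lights.

    Hypothetical helper: Woredas with a maximum of 0 are 'Dark'; the
    remaining (lit) Woredas are split into terciles (3-quantiles).
    """
    max_ntl = np.asarray(max_ntl, dtype=float)
    groups = np.full(max_ntl.shape, "Dark", dtype=object)
    lit = max_ntl > 0
    if lit.any():
        # Cut points are computed only among Woredas with positive lights.
        low_cut, high_cut = np.quantile(max_ntl[lit], [1 / 3, 2 / 3])
        groups[lit & (max_ntl <= low_cut)] = "Low"
        groups[lit & (max_ntl > low_cut) & (max_ntl <= high_cut)] = "Medium"
        groups[lit & (max_ntl > high_cut)] = "High"
    return groups

# Made-up maximum baseline values for eight Woredas.
example_groups = list(baseline_groups([0, 0, 1, 2, 3, 4, 5, 6]))
print(example_groups)
```

Computing the cut points only among lit Woredas keeps the Dark group separate, which mirrors the description that the excluded group is Woredas whose maximum baseline value was 0.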

Figure S18: Association of road improvements with nighttime lights, urbanization, and cropland using 1x1km
pixels as the unit of analysis. The road type considers the speed limit of the road after the road was upgraded. Nighttime
Lights is the inverse hyperbolic sine of nighttime lights of the 1x1km pixel, which has a similar interpretation as logs. Urban
and Cropland are binary variables indicating whether the pixel contains urban or cropland.

S13.2 Results when excluding Kebeles within 100km of Addis Ababa

Figure S19: Association of road improvements with nighttime lights, urbanization, and cropland using Kebeles
as the unit of analysis - excluding areas within 100km of Addis Ababa. Dark, Low, Medium, and High are different
groups of baseline nighttime lights. Low-High groups are formed using 3-quantiles of the maximum value of nighttime lights
within Woredas that had some positive nighttime lights. Dark indicates Woredas where the maximum value of nighttime
lights at baseline was 0. Nighttime Lights is the average nighttime lights within Kebeles, Urban is the total urban area within
Kebeles, and Cropland is the total cropland area within Kebeles; we use the inverse hyperbolic sine transformation on all
outcome variables, which has a similar interpretation as logs.

Figure S20: Association of road improvements with nighttime lights, urbanization, and cropland using Kebeles
as the unit of analysis - excluding areas within 100km of Addis Ababa. The road type considers the speed limit of
the road after the road was upgraded. Nighttime Lights is the average nighttime lights within Kebeles, Urban is the total urban
area within Kebeles, and Cropland is the total cropland area within Kebeles; we use the inverse hyperbolic sine transformation
on all outcome variables, which has a similar interpretation as logs.

S13.3 Heterogeneity of impacts across Woredas with zero and positive baseline nighttime lights

Figure S21: Association of road improvements with nighttime lights, urbanization, and cropland using Kebeles
as the unit of analysis. NTL96 Lit indicates that the Woredas had some positive nighttime lights at baseline. The excluded
group is Woredas where the maximum value of nighttime lights at baseline was 0. Nighttime Lights is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs.
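
The inverse hyperbolic sine transformation applied to the outcome variables can be illustrated numerically. The sketch below uses made-up values (not data from the paper) to show that the transformation is defined at zero, unlike the log, and that for larger values it is close to log(2x), which is why coefficients can be read in a similar way to those from a log specification.

```python
import numpy as np

# Made-up outcome values, including a zero (e.g., an unlit unit).
values = np.array([0.0, 1.0, 10.0, 100.0])

# Inverse hyperbolic sine: asinh(x) = log(x + sqrt(x^2 + 1)).
ihs = np.arcsinh(values)

# Large-x approximation, computed only for the positive values.
log_2x = np.log(2.0 * values[1:])

for x, transformed in zip(values, ihs):
    print(f"x = {x:6.1f}   asinh(x) = {transformed:.4f}")
print("gap to log(2x) for positive x:", np.round(ihs[1:] - log_2x, 4))
```

The gap to log(2x) shrinks quickly as x grows, so the log-like reading is most accurate for units with larger outcome values and is only approximate near zero.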

S14 What land cover type does urban replace?

The Globcover data shows 6,956 300-meter pixels that transitioned to urban at some point from 1992 to
2018. Table S17 shows the frequency of the land cover classes that transitioned to urban areas. Cropland is the
most frequent land cover class that transitioned to urban; considering all cropland classes, which include (1)
Cropland, rainfed, (2) Mosaic cropland (>50%), (3) Cropland, rainfed - Herbaceous cover, and (4) Cropland,
irrigated or post-flooding, cropland makes up 52.6% of the land cover that transitioned to urban during the
study period.

Table S17: Summary Statistics of Dependent Variables

Land Cover Class N Proportion


Cropland, rainfed 1825 0.262
Mosaic cropland (>50%); natural vegetation (tree, shrub, herbaceous cover) (<50%) 1540 0.221
Mosaic natural vegetation (>50%); cropland (<50%) 1429 0.205
Shrubland 731 0.105
Grassland 584 0.084
Cropland, rainfed - Herbaceous cover 151 0.022
Cropland, irrigated or post-flooding 148 0.021
Mosaic herbaceous cover (>50%); tree and shrub (<50%) 124 0.018
Mosaic tree and shrub (>50%); herbaceous cover (<50%) 123 0.018
Tree cover, broadleaved, deciduous, open (15-40%) 111 0.016
Bare areas 44 0.006
Consolidated bare areas 41 0.006
Tree cover, flooded, saline water 33 0.005
Tree cover, broadleaved, evergreen, closed to open (>15%) 19 0.003
Sparse vegetation (tree, shrub, herbaceous cover) (<15%) 19 0.003
Tree cover, broadleaved, deciduous, closed to open (>15%) 10 0.001
Sparse shrub (<15%) 9 0.001
Water bodies 8 0.001
Deciduous shrubland 4 0.001
Sparse herbaceous cover (<15%) 2 0
Shrub or herbaceous cover, flooded - fresh, saline or brackish water 1 0
TOTAL 6956 1
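
The cropland share quoted in the text can be checked directly from the counts in Table S17. The short Python sketch below sums the four cropland classes listed in the text; the variable names are illustrative. Working from the pixel counts gives roughly 52.7%, while summing the rounded proportions in the table (0.262 + 0.221 + 0.022 + 0.021) gives 0.526, which appears to be the source of the 52.6% figure.

```python
# Pixel counts copied from Table S17 for the four cropland classes.
cropland_counts = {
    "Cropland, rainfed": 1825,
    "Mosaic cropland (>50%)": 1540,
    "Cropland, rainfed - Herbaceous cover": 151,
    "Cropland, irrigated or post-flooding": 148,
}
total_pixels = 6956  # TOTAL row of Table S17

cropland_pixels = sum(cropland_counts.values())
share_from_counts = cropland_pixels / total_pixels

# Same share computed from the rounded proportions reported in the table.
share_from_rounded = round(0.262 + 0.221 + 0.022 + 0.021, 3)

print(cropland_pixels, round(share_from_counts, 4), share_from_rounded)
```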

S15 Market Access, Long Difference - Additional Results

This section examines the sensitivity of the long difference market access results. Section S15.1 tests different
doughnut sizes when instrumenting market access, Section S15.2 tests different values of θ (travel elasticity)
when constructing market access, and Section S15.3 estimates models using two groups of baseline nighttime lights.

S15.1 Varying Doughnut Sizes

In this section, we explore the sensitivity of the market access results to using different doughnut sizes for the
market access instrument; i.e., excluding Woredas within different distances. Our main specification uses
50km; Table S18 shows results using a 20km buffer and Table S19 using a 100km buffer. The coefficient values
change slightly from the main specification; however, the magnitude and significance of the coefficients
generally stay the same.
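
To make the doughnut idea concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes a market access measure of the form MA_i = Σ_j P_j · t_ij^(−θ), where P_j is the size (for example, population) of unit j and t_ij is the travel time between units; this functional form, the function name `market_access`, the use of a strict distance cutoff, and all numbers are assumptions for illustration. The doughnut version simply drops destinations within the exclusion radius before summing.

```python
import numpy as np

def market_access(travel_time, size, theta=3.8, distance=None, exclude_within_km=0.0):
    """Hypothetical market access: sum over j of size_j * travel_time_ij ** (-theta).

    If `distance` is given, destinations at or within `exclude_within_km`
    are dropped (the 'doughnut'). The own unit is always excluded.
    """
    t = np.asarray(travel_time, dtype=float)
    size = np.asarray(size, dtype=float)
    keep = ~np.eye(len(size), dtype=bool)  # exclude the own unit
    if distance is not None:
        keep &= np.asarray(distance, dtype=float) > exclude_within_km
    # Use a placeholder time of 1 for excluded pairs to avoid 0 ** (-theta).
    safe_t = np.where(keep, t, 1.0)
    weights = np.where(keep, safe_t ** (-theta), 0.0)
    return weights @ size

# Made-up example with three units.
travel_time = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.5], [3.0, 2.5, 0.0]]  # hours
distance = [[0, 30, 120], [30, 0, 100], [120, 100, 0]]             # km
size = [100, 200, 300]

full_ma = market_access(travel_time, size, theta=1.0)
doughnut_ma = market_access(travel_time, size, theta=1.0,
                            distance=distance, exclude_within_km=50)
print(full_ma, doughnut_ma)
```

In this toy example, the first two units are 30km apart, so each loses the other as a destination under a 50km doughnut, while the third unit is unaffected. Changing `exclude_within_km` to 20 or 100 corresponds to the alternative doughnut sizes explored in this section.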

Table S18: Association of changes in market access with changes in outcome variables using a long difference, 20km doughnut
IV results

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.09∗∗∗ −0.11∗∗∗ −0.0005 −0.002 0.003 0.002 −0.10 −0.15∗ 0.005 −0.03 −0.0001 −0.005
(0.02) (0.02) (0.01) (0.005) (0.01) (0.01) (0.07) (0.08) (0.05) (0.06) (0.02) (0.02)
MA×NTL96 Low 0.07∗∗ −0.02∗∗∗ 0.005 0.24∗∗∗ 0.04 0.01
(0.03) (0.01) (0.01) (0.07) (0.04) (0.01)
MA×NTL96 Med 0.13∗∗∗ −0.02∗∗ 0.004 0.36∗∗∗ 0.19∗∗∗ 0.02∗∗
(0.03) (0.01) (0.01) (0.09) (0.06) (0.01)
MA×NTL96 High 0.29∗∗∗ 0.04 0.03 0.58∗∗∗ 0.67∗∗∗ 0.08∗∗∗
(0.04) (0.04) (0.02) (0.11) (0.20) (0.03)
MA, 1996 0.13∗∗∗ 0.11∗∗∗ 0.03∗∗∗ 0.03∗∗∗ −0.01∗∗ −0.01∗∗ 0.07∗∗∗ 0.03∗ 0.04∗∗ 0.01 −0.002 −0.005
(0.01) (0.01) (0.004) (0.003) (0.003) (0.003) (0.01) (0.02) (0.02) (0.01) (0.01) (0.01)
Log mean light, 1996 −0.001 −0.13∗∗∗ 0.45∗∗∗ 0.43∗∗∗ −0.05∗ −0.06∗∗ −0.12 −0.32∗∗∗ 0.31 0.08 −0.11 −0.14
(0.05) (0.05) (0.12) (0.13) (0.02) (0.03) (0.08) (0.08) (0.19) (0.19) (0.10) (0.10)
Pre-trend: log mean light 0.08 0.15∗∗ 0.15 0.18 −0.002 0.01 0.12 0.10 −0.20 −0.17 0.12 0.13∗
(0.08) (0.08) (0.15) (0.15) (0.03) (0.03) (0.13) (0.11) (0.23) (0.21) (0.08) (0.08)
Pre-trend: log N urban pixels 0.34∗∗ 0.34∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04 −0.26∗∗∗ −0.23∗∗ 0.34∗ 0.37∗ −0.05∗ −0.04∗
(0.14) (0.13) (0.21) (0.21) (0.12) (0.12) (0.09) (0.09) (0.20) (0.22) (0.02) (0.02)
Unit Keb. Keb. Keb. Keb. Keb. Keb. Wor. Wor. Wor. Wor. Wor. Wor.
Observations 15,714 15,714 15,714 15,714 15,714 15,714 780 780 780 780 780 780
Adjusted R2 0.17 0.18 0.22 0.23 0.14 0.14 0.33 0.36 0.28 0.30 0.37 0.37

Standard errors are clustered on Woredas when using Kebeles as the unit of analysis and are clustered on Zones when using
Woredas as the unit of analysis. All models include Zone fixed effects. MA is the logged difference of MA from 1996 to 2016.
MA is instrumented with a 20km MA Doughnut variable; that is, in calculating market access, units within 20km are
excluded. NTL96 Low, Medium, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1;
∗∗ p<0.05; ∗∗∗ p< 0.01

Table S19: Association of changes in market access with changes in outcome variables using a long difference, 100km doughnut
IV results

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.10∗∗∗ −0.11∗∗∗ 0.01∗ 0.01∗ 0.003 0.002 −0.17 −0.21∗ −0.11∗ −0.12∗∗ −0.01 −0.01
(0.03) (0.03) (0.01) (0.01) (0.01) (0.01) (0.12) (0.11) (0.06) (0.06) (0.04) (0.04)
MA×NTL96 Low 0.08∗∗ −0.02∗∗∗ 0.002 0.25∗∗∗ 0.04 0.01
(0.04) (0.01) (0.01) (0.08) (0.04) (0.01)
MA×NTL96 Med 0.16∗∗∗ −0.01 −0.001 0.40∗∗∗ 0.17∗∗∗ 0.03∗
(0.04) (0.01) (0.01) (0.10) (0.06) (0.02)
MA×NTL96 High 0.32∗∗∗ 0.06 0.03 0.60∗∗∗ 0.57∗∗∗ 0.10∗∗
(0.06) (0.04) (0.03) (0.11) (0.17) (0.04)
MA, 1996 0.20∗∗∗ 0.16∗∗∗ 0.04∗∗∗ 0.04∗∗∗ −0.01 −0.01 0.10∗∗∗ 0.04 0.09∗∗ 0.05∗ −0.003 −0.01
(0.02) (0.02) (0.01) (0.004) (0.01) (0.01) (0.02) (0.03) (0.04) (0.03) (0.01) (0.01)
Log mean light, 1996 0.03 −0.12∗∗ 0.46∗∗∗ 0.43∗∗∗ −0.05∗∗ −0.06∗∗ −0.11 −0.32∗∗∗ 0.31∗ 0.11 −0.11 −0.15
(0.05) (0.05) (0.12) (0.13) (0.02) (0.03) (0.08) (0.08) (0.18) (0.19) (0.10) (0.11)
Pre-trend: log mean light 0.10 0.17∗∗ 0.16 0.19 −0.003 0.01 0.12 0.10 −0.20 −0.18 0.12 0.13
(0.08) (0.08) (0.15) (0.15) (0.03) (0.03) (0.13) (0.11) (0.23) (0.21) (0.08) (0.08)
Pre-trend: log N urban pixels 0.35∗∗ 0.34∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04 −0.24∗∗∗ −0.23∗∗ 0.34∗ 0.36∗ −0.05∗ −0.05∗
(0.14) (0.13) (0.22) (0.21) (0.12) (0.12) (0.08) (0.09) (0.17) (0.19) (0.02) (0.02)
Unit Keb. Keb. Keb. Keb. Keb. Keb. Wor. Wor. Wor. Wor. Wor. Wor.
Observations 15,714 15,714 15,714 15,714 15,714 15,714 780 780 780 780 780 780
Adjusted R2 0.16 0.17 0.22 0.22 0.14 0.14 0.31 0.35 0.28 0.31 0.37 0.37

Standard errors are clustered on Woredas when using Kebeles as the unit of analysis and are clustered on Zones when using
Woredas as the unit of analysis. All models include Zone fixed effects. MA is the logged difference of MA from 1996 to 2016.
MA is instrumented with a 100km MA Doughnut variable; that is, in calculating market access, units within 100km are
excluded. NTL96 Low, Medium, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1;
∗∗ p<0.05; ∗∗∗ p< 0.01

S15.2 Varying Values of Theta

In this section, we explore the sensitivity of the market access results to using different values of θ (travel
elasticity). Our main specification uses 3.8; Table S20 uses a value of 2, Table S21 uses a value of 5, and
Table S22 uses a value of 8. The coefficient values change slightly from the main specification; however, the
magnitude and significance of the coefficients generally stay the same.
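
A simple worked example may help in reading coefficients across values of θ. Assuming, for illustration only, that market access takes the form P · t^(−θ) for a single destination of fixed size P reached in travel time t, the change in log market access from a road upgrade is −θ times the change in log travel time. The function name `log_ma_change` and the travel times below are hypothetical.

```python
import math

def log_ma_change(time_before, time_after, theta):
    """Change in log market access for one fixed-size destination.

    With MA = P * t ** (-theta), the size P cancels in the log difference,
    leaving -theta * (log(time_after) - log(time_before)).
    """
    return -theta * (math.log(time_after) - math.log(time_before))

# Hypothetical upgrade that halves travel time from 4 hours to 2 hours.
changes = {theta: log_ma_change(4.0, 2.0, theta) for theta in (2, 3.8, 5, 8)}
for theta, change in changes.items():
    print(f"theta = {theta}: change in log MA = {change:.3f}")
```

Under this simplification, the same travel time reduction produces a smaller change in log market access when θ is smaller, so a given change in the outcome would map into a larger coefficient. This may help explain differences in coefficient scale across tables, although with many destinations the relationship is not exactly proportional.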

Table S20: Association of changes in market access with changes in outcome variables using a long difference. θ = 2, using a
50km doughnut IV

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.27∗∗∗ −0.32∗∗∗ 0.04∗ 0.03 0.01 0.01 −0.36 −0.50∗ −0.12 −0.20 −0.02 −0.03
(0.10) (0.09) (0.02) (0.02) (0.03) (0.03) (0.24) (0.25) (0.11) (0.12) (0.09) (0.09)
MA×NTL96 Low 0.18∗∗ −0.03∗∗ 0.002 0.45∗∗∗ 0.07 0.01
(0.07) (0.01) (0.01) (0.15) (0.07) (0.02)
MA×NTL96 Med 0.31∗∗∗ −0.02 −0.0004 0.71∗∗∗ 0.31∗∗∗ 0.04
(0.07) (0.02) (0.02) (0.17) (0.10) (0.03)
MA×NTL96 High 0.62∗∗∗ 0.10 0.05 1.07∗∗∗ 1.04∗∗∗ 0.17∗∗
(0.11) (0.08) (0.04) (0.21) (0.31) (0.07)
MA, 1996 0.30∗∗∗ 0.23∗∗∗ 0.07∗∗∗ 0.07∗∗∗ −0.01 −0.01 0.17∗∗∗ 0.06 0.15∗∗ 0.08∗ −0.003 −0.02
(0.04) (0.04) (0.01) (0.01) (0.01) (0.01) (0.03) (0.06) (0.07) (0.04) (0.01) (0.01)
Log mean light, 1996 0.03 −0.11∗∗ 0.46∗∗∗ 0.43∗∗∗ −0.05∗∗ −0.06∗∗ −0.11 −0.32∗∗∗ 0.31 0.10 −0.11 −0.14
(0.05) (0.05) (0.12) (0.13) (0.02) (0.03) (0.08) (0.09) (0.18) (0.18) (0.10) (0.10)
Pre-trend: log mean light 0.11 0.16∗∗ 0.16 0.19 −0.003 0.005 0.12 0.11 −0.20 −0.17 0.12 0.13∗
(0.08) (0.08) (0.15) (0.15) (0.03) (0.03) (0.13) (0.10) (0.23) (0.20) (0.08) (0.08)
Pre-trend: log N urban pixels 0.34∗∗ 0.33∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04 −0.24∗∗ −0.22∗∗ 0.35∗ 0.36∗ −0.05∗∗ −0.05∗
(0.14) (0.13) (0.21) (0.21) (0.12) (0.12) (0.09) (0.09) (0.19) (0.20) (0.02) (0.02)
Unit Keb. Keb. Keb. Keb. Keb. Keb. Wor. Wor. Wor. Wor. Wor. Wor.
Observations 15,714 15,714 15,714 15,714 15,714 15,714 780 780 780 780 780 780
Adjusted R2 0.16 0.17 0.22 0.22 0.14 0.14 0.32 0.37 0.29 0.32 0.37 0.37

Standard errors are clustered on Woredas when using Kebeles as the unit of analysis and are clustered on Zones when using
Woredas as the unit of analysis. All models include Zone fixed effects. MA is the logged difference of MA from 1996 to 2016.
MA is instrumented with a 50km MA Doughnut variable; that is, in calculating market access, units within 50km are
excluded. NTL96 Low, Medium, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1;
∗∗ p<0.05; ∗∗∗ p< 0.01

Table S21: Association of changes in market access with changes in outcome variables using a long difference. θ = 5, using a
50km doughnut IV

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.06∗∗∗ −0.08∗∗∗ 0.01∗∗ 0.01∗ 0.002 0.001 −0.08 −0.11 0.03 0.01 −0.001 −0.004
(0.02) (0.02) (0.01) (0.004) (0.01) (0.01) (0.07) (0.07) (0.04) (0.05) (0.02) (0.02)
MA×NTL96 Low 0.07∗∗ −0.01∗∗ 0.003 0.19∗∗∗ 0.05∗ 0.01
(0.03) (0.01) (0.01) (0.06) (0.03) (0.01)
MA×NTL96 Med 0.13∗∗∗ −0.004 0.003 0.32∗∗∗ 0.18∗∗∗ 0.02∗
(0.03) (0.01) (0.01) (0.08) (0.04) (0.01)
MA×NTL96 High 0.25∗∗∗ 0.06 0.02 0.50∗∗∗ 0.58∗∗∗ 0.08∗∗
(0.04) (0.04) (0.02) (0.09) (0.16) (0.04)
MA, 1996 0.11∗∗∗ 0.10∗∗∗ 0.02∗∗∗ 0.02∗∗∗ −0.01∗ −0.01∗ 0.06∗∗∗ 0.03 0.02 −0.01 −0.002 −0.01
(0.01) (0.01) (0.002) (0.002) (0.003) (0.003) (0.01) (0.02) (0.02) (0.01) (0.004) (0.005)
Log mean light, 1996 0.02 −0.13∗∗ 0.46∗∗∗ 0.43∗∗∗ −0.05∗ −0.06∗∗ −0.11 −0.34∗∗∗ 0.32∗ 0.06 −0.11 −0.15
(0.05) (0.05) (0.12) (0.13) (0.02) (0.03) (0.07) (0.08) (0.19) (0.19) (0.10) (0.11)
Pre-trend: log mean light 0.09 0.16∗∗ 0.16 0.19 −0.003 0.01 0.11 0.09 −0.20 −0.18 0.13 0.13∗
(0.08) (0.08) (0.15) (0.16) (0.03) (0.03) (0.13) (0.10) (0.23) (0.21) (0.08) (0.08)
Pre-trend: log N urban pixels 0.35∗∗ 0.34∗∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04 −0.25∗∗ −0.22∗∗ 0.36∗ 0.39∗ −0.05∗ −0.04∗
(0.14) (0.13) (0.21) (0.21) (0.12) (0.12) (0.10) (0.09) (0.21) (0.23) (0.02) (0.02)
Unit Keb. Keb. Keb. Keb. Keb. Keb. Wor. Wor. Wor. Wor. Wor. Wor.
Observations 15,714 15,714 15,714 15,714 15,714 15,714 780 780 780 780 780 780
Adjusted R2 0.17 0.18 0.22 0.22 0.14 0.14 0.32 0.35 0.27 0.28 0.37 0.36

Standard errors are clustered on Woredas when using Kebeles as the unit of analysis and are clustered on Zones when using
Woredas as the unit of analysis. All models include Zone fixed effects. MA is the logged difference of MA from 1996 to 2016.
MA is instrumented with a 50km MA Doughnut variable; that is, in calculating market access, units within 50km are
excluded. NTL96 Low, Medium, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1;
∗∗ p<0.05; ∗∗∗ p< 0.01

Table S22: Association of changes in market access with changes in outcome variables using a long difference. θ = 8, using a
50km doughnut IV

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.06∗∗∗ −0.08∗∗∗ 0.01∗∗ 0.01∗ 0.002 0.001 −0.08 −0.11 0.03 0.01 −0.001 −0.004
(0.02) (0.02) (0.01) (0.004) (0.01) (0.01) (0.07) (0.07) (0.04) (0.05) (0.02) (0.02)
MA×NTL96 Low 0.07∗∗ −0.01∗∗ 0.003 0.19∗∗∗ 0.05∗ 0.01
(0.03) (0.01) (0.01) (0.06) (0.03) (0.01)
MA×NTL96 Med 0.13∗∗∗ −0.004 0.003 0.32∗∗∗ 0.18∗∗∗ 0.02∗
(0.03) (0.01) (0.01) (0.08) (0.04) (0.01)
MA×NTL96 High 0.25∗∗∗ 0.06 0.02 0.50∗∗∗ 0.58∗∗∗ 0.08∗∗
(0.04) (0.04) (0.02) (0.09) (0.16) (0.04)
MA, 1996 0.11∗∗∗ 0.10∗∗∗ 0.02∗∗∗ 0.02∗∗∗ −0.01∗ −0.01∗ 0.06∗∗∗ 0.03 0.02 −0.01 −0.002 −0.01
(0.01) (0.01) (0.002) (0.002) (0.003) (0.003) (0.01) (0.02) (0.02) (0.01) (0.004) (0.005)
Log mean light, 1996 0.02 −0.13∗∗ 0.46∗∗∗ 0.43∗∗∗ −0.05∗ −0.06∗∗ −0.11 −0.34∗∗∗ 0.32∗ 0.06 −0.11 −0.15
(0.05) (0.05) (0.12) (0.13) (0.02) (0.03) (0.07) (0.08) (0.19) (0.19) (0.10) (0.11)
Pre-trend: log mean light 0.09 0.16∗∗ 0.16 0.19 −0.003 0.01 0.11 0.09 −0.20 −0.18 0.13 0.13∗
(0.08) (0.08) (0.15) (0.16) (0.03) (0.03) (0.13) (0.10) (0.23) (0.21) (0.08) (0.08)
Pre-trend: log N urban pixels 0.35∗∗ 0.34∗∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04 −0.25∗∗ −0.22∗∗ 0.36∗ 0.39∗ −0.05∗ −0.04∗
(0.14) (0.13) (0.21) (0.21) (0.12) (0.12) (0.10) (0.09) (0.21) (0.23) (0.02) (0.02)
Unit Keb. Keb. Keb. Keb. Keb. Keb. Wor. Wor. Wor. Wor. Wor. Wor.
Observations 15,714 15,714 15,714 15,714 15,714 15,714 780 780 780 780 780 780
Adjusted R2 0.17 0.18 0.22 0.22 0.14 0.14 0.32 0.35 0.27 0.28 0.37 0.36

Standard errors are clustered on Woredas when using Kebeles as the unit of analysis and are clustered on Zones when using
Woredas as the unit of analysis. All models include Zone fixed effects. MA is the logged difference of MA from 1996 to 2016.
MA is instrumented with a 50km MA Doughnut variable; that is, in calculating market access, units within 50km are
excluded. NTL96 Low, Medium, and High are dummy variables for different groups of baseline nighttime lights; groups are
formed using 3-quantiles of the maximum value of nighttime lights within Woredas that had some positive nighttime lights.
The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0. NTL is the average nighttime
lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total cropland area within Kebeles;
we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar interpretation as logs. ∗ p<0.1;
∗∗ p<0.05; ∗∗∗ p< 0.01

S15.3 Varying Baseline Nighttime Light Groups

In this section, we explore heterogeneity in the impacts of changes in market access across two groups of baseline
nighttime lights: Woredas with no positive nighttime lights at baseline, and Woredas with some positive
nighttime lights at baseline. Tables S23 and S24 show results using Kebeles and Woredas, respectively.
Results are consistent with models that use four groups of nighttime lights; road upgrades are associated with
growth in nighttime lights and urban land, particularly in areas with positive baseline nighttime lights. As
with results using four groups of baseline nighttime lights, coefficients in models explaining urban land tend
to be more significant when using Woredas as the unit of analysis.
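
To read the interaction models, the coefficient on MA gives the association for Woredas with zero baseline nighttime lights (the excluded group), and the sum of the MA and MA×NTL96 Lit coefficients gives the association for lit Woredas. The Python sketch below works through this arithmetic using the nighttime lights coefficients from Table S23, assuming that the interaction coefficients belong to the even-numbered columns, so that column (2) pairs −0.04 with 0.23. The function name `implied_effect` is illustrative, and the calculation ignores the standard error of the sum.

```python
def implied_effect(ma_coef, interaction_coef, lit):
    """Implied coefficient on MA for a given baseline group.

    For the excluded (dark) group only the main MA coefficient applies;
    for lit units the interaction coefficient is added.
    """
    return ma_coef + (interaction_coef if lit else 0.0)

# Table S23, NTL outcome, column (2): MA = -0.04, MA x NTL96 Lit = 0.23.
dark_effect = implied_effect(-0.04, 0.23, lit=False)
lit_effect = implied_effect(-0.04, 0.23, lit=True)

print(round(dark_effect, 2), round(lit_effect, 2))
```

Under this reading, the implied association for lit areas is about 0.19, compared with −0.04 for areas that were dark at baseline, which matches the pattern described in the text.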

Table S23: Association of changes in market access with changes in outcome variables using a long difference, OLS and IV results
[Kebeles]

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.04∗∗∗ −0.04∗∗∗ −0.003∗∗ −0.003∗∗ 0.003 0.003 −0.04∗∗∗ −0.03∗∗∗ 0.01∗ 0.01∗ 0.002 0.002
(0.01) (0.01) (0.002) (0.002) (0.002) (0.002) (0.01) (0.01) (0.003) (0.003) (0.003) (0.003)
MA×NTL96 Lit 0.23∗∗∗ −0.005 0.002 0.23∗∗∗ 0.01 0.01
(0.04) (0.01) (0.01) (0.04) (0.02) (0.01)
MA, 1996 0.03∗∗∗ 0.03∗∗∗ 0.01∗∗∗ 0.01∗∗∗ −0.002∗∗ −0.002∗∗ 0.07∗∗∗ 0.06∗∗∗ 0.01∗∗∗ 0.01∗∗∗ −0.004∗ −0.004∗∗
(0.003) (0.003) (0.001) (0.001) (0.001) (0.001) (0.01) (0.01) (0.001) (0.001) (0.002) (0.002)
Log mean light, 1996 −0.01 −0.08 0.45∗∗∗ 0.45∗∗∗ −0.05∗ −0.05∗ 0.02 −0.05 0.46∗∗∗ 0.46∗∗∗ −0.05∗ −0.05∗∗
(0.05) (0.05) (0.12) (0.12) (0.02) (0.02) (0.05) (0.05) (0.12) (0.12) (0.02) (0.02)
Pre-trend: log mean light 0.06 0.04 0.15 0.15 −0.001 −0.001 0.08 0.06 0.16 0.16 −0.002 −0.003
(0.08) (0.07) (0.15) (0.15) (0.03) (0.03) (0.08) (0.07) (0.15) (0.15) (0.03) (0.03)
Pre-trend: log N urban pixels 0.34∗∗ 0.32∗∗ 0.56∗∗∗ 0.56∗∗∗ −0.04 −0.04 0.35∗∗ 0.33∗∗ 0.57∗∗∗ 0.57∗∗∗ −0.04 −0.04
(0.13) (0.13) (0.22) (0.22) (0.12) (0.12) (0.14) (0.13) (0.21) (0.22) (0.12) (0.12)
Model OLS OLS OLS OLS OLS OLS IV IV IV IV IV IV
Observations 15,714 15,714 15,714 15,714 15,714 15,714 15,714 15,714 15,714 15,714 15,714 15,714
Adjusted R2 0.18 0.19 0.22 0.22 0.14 0.14 0.17 0.18 0.22 0.22 0.14 0.14

The unit of analysis is Kebeles. Standard errors are clustered on Woredas and all models include Zone fixed effects. MA is the
logged difference of MA from 1996 to 2016. IV indicates that MA is instrumented with a 50km MA Doughnut variable; that
is, in calculating market access, Kebeles within 50km are excluded. NTL96 Lit indicates that the Woredas had some positive
nighttime lights at baseline. The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0.
NTL is the average nighttime lights within Kebeles, Urban is the total urban area within Kebeles, and Cropland is the total
cropland area within Kebeles; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar
interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

Table S24: Association of changes in market access with changes in outcome variables using a long difference, OLS and IV results
[Woredas]

NTL Urban Cropland NTL Urban Cropland


(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
MA −0.01 −0.004 −0.02 −0.02 −0.004 −0.003 −0.05 −0.05 0.03 0.03 −0.002 −0.002
(0.02) (0.02) (0.01) (0.01) (0.004) (0.004) (0.04) (0.04) (0.03) (0.03) (0.01) (0.01)
MA×NTL96 Lit 0.41∗∗∗ 0.18∗∗ 0.03 0.46∗∗∗ 0.29∗∗∗ 0.03∗
(0.09) (0.07) (0.02) (0.07) (0.09) (0.02)
MA, 1996 0.02∗∗∗ 0.01 0.01∗∗∗ 0.01∗∗ 0.001 0.0004 0.04∗∗∗ 0.02∗∗ 0.01 −0.001 −0.001 −0.002
(0.004) (0.01) (0.003) (0.002) (0.001) (0.001) (0.01) (0.01) (0.01) (0.01) (0.002) (0.002)
Log mean light, 1996 −0.15∗ −0.24∗∗∗ 0.29 0.26 −0.12 −0.12 −0.11 −0.23∗∗∗ 0.32∗ 0.25 −0.11 −0.12
(0.08) (0.07) (0.19) (0.18) (0.10) (0.10) (0.07) (0.07) (0.19) (0.18) (0.10) (0.10)
Pre-trend: log mean light 0.11 0.08 −0.20 −0.22 0.12 0.12 0.11 0.07 −0.20 −0.22 0.13 0.12
(0.13) (0.12) (0.23) (0.22) (0.08) (0.08) (0.13) (0.11) (0.23) (0.22) (0.08) (0.07)
Pre-trend: log N urban pixels −0.24∗∗ −0.20∗∗ 0.33∗ 0.34∗ −0.05∗∗ −0.05∗ −0.26∗∗ −0.22∗∗ 0.36∗ 0.38∗ −0.05∗ −0.04∗
(0.10) (0.09) (0.19) (0.18) (0.02) (0.02) (0.10) (0.09) (0.21) (0.20) (0.02) (0.02)
Model OLS OLS OLS OLS OLS OLS IV IV IV IV IV IV
Observations 780 780 780 780 780 780 780 780 780 780 780 780
Adjusted R2 0.33 0.37 0.28 0.29 0.37 0.37 0.33 0.36 0.27 0.28 0.37 0.37

The unit of analysis is Woredas. Standard errors are clustered on Zones and all models include Zone fixed effects. MA is the
logged difference of MA from 1996 to 2016. IV indicates that MA is instrumented with a 50km MA Doughnut variable; that
is, in calculating market access, Woredas within 50km are excluded. NTL96 Lit indicates that the Woredas had some positive
nighttime lights at baseline. The excluded group is Woredas where the maximum value of nighttime lights at baseline was 0.
NTL is the average nighttime lights within Woredas, Urban is the total urban area within Woredas, and Cropland is the total
cropland area within Woredas; we use the inverse hyperbolic sine transformation on all outcome variables, which has a similar
interpretation as logs. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p< 0.01

