Geological imaging and inverse modeling

Can Sentinel-2 be used to improve geological mapping?

Idriss MONTHE - S196395

1st Master in geological engineering

Academic year 2020-2021

Table of contents
List of figures
Introduction
A. The region of interest
B. Data about the region of interest
C. Identification of spaceborne images
1. Resampling
2. SRTM of the ROI
D. Spectra and band maths
1. Enhanced image for geological substrate detection
2. Spectra display
3. Band ratios
3.1. NDVI
3.2. Other interesting ratios
4. Analysis of the topography
5. Discussion on the paper of van der Werff & van der Meer (2016)
6. Characteristics of sensor to favour for improvement of geological mapping
E. Image classification
1. Unsupervised classification
2. Supervised classification
2.1. Selection of training sets
2.2. Supervised classification techniques
Conclusion
Improvements for the future of geological remote sensing
References
List of figures

Figure 1 : Location map of the region of interest (from Google map in 2021)
Figure 2 : Hydrological map of Burundi with major watersheds (BGR, 2021)
Figure 3 : A view of the ROI on Google Earth (Google Earth June 2021)
Figure 4 : Regional framework of the Karagwe-Ankole Belt : K, Kabanga massif ; M, Musongati massif ; B, Butare town (Fernandez-Alonso, et al., 2012)
Figure 5 : Detailed geology of the ROI (Fernandez-Alonso, et al., 2012)
Figure 6 : Map of the soils of the ROI (Institut National pour L'étude Agronomique Du Congo Belge, 2021)
Figure 7 : Agro-Ecological zones of Burundi (Gomez-Elipe & Aguirre Jaime, 2017)
Figure 8 : Sentinel 2 scene in (natural) true colour
Figure 9 : Sentinel 2 scene in false colour (infrared)
Figure 10 : SRTM of the image subset of the ROI
Figure 11 : Image resulting of histogram anamorphosis of band 11
Figure 12 : Objects selected for the spectra visualisation
Figure 13 : View of the spectra of the objects
Figure 14 : NDVI band image
Figure 15: Image obtained after Land/Sea Mask
Figure 16: Result from the alteration ratio
Figure 17 : Image from the iron oxide ratio
Figure 18: Comparison of the topography and the Land/Sea Mask images
Figure 19 : Unsupervised classification with 5 clusters and no elevation
Figure 20 : Unsupervised classification with 10 clusters and no elevation
Figure 21: Unsupervised classification with 10 clusters and elevation
Figure 22: Unsupervised classification with 5 clusters and elevation
Figure 23 : Position of the training sets


Introduction

Mankind is always trying to learn more about the Earth. The traditional means of exploring it has always been the “usual” geological
exploration, in which humans survey the ground themselves. This approach can be very difficult over large areas, especially where access is not easy.

Thus, the question of how to identify, to a first approximation, the geology of a region of interest (ROI) before going to the field for
confirmation is a crucial one. To answer this question, many techniques have emerged, and one of them is geological remote sensing.

The objective of this project is to investigate what Sentinel-2 data can uncover of the geology of a study area in Central Africa (Burundi).

This report first presents the region of interest, then the available information about the natural ecosystem and anthropic infrastructure
in the ROI, then a processing step in the SNAP software to reveal possible features of interest, and finally a classification of those
features with the MultiSpec software.

A. The region of interest

Burundi is a small country of 27,834 km² located in the centre of Africa. According to the 2008 census, the population amounts to 8.04
million people. The country consists mainly of a high plateau of variable altitude (from 772 m a.s.l. at Lake Tanganyika to 2,670
m a.s.l. at Mount Heha). It has an equatorial climate with mean annual temperatures that vary with altitude between 17 °C and 23 °C.
The mean annual precipitation is 1277 mm, with two rainy seasons (February to May and September to November) and two dry seasons (June to August
and December to January).

The region of interest (ROI) of this study is located in the south-western part of the country (Figure 1). It has an area of
about 4670 km² (including 1415 km² of Lake Tanganyika) and a perimeter of 279 km. It covers 6 provinces (Bujumbura Rural, Mwaro, Gitega,
Bururi, Makamba and Rumonge) and contains five natural forest reserves (Monge, Kigwena, Bururi, Vyanda and Makamba).

Figure 1 : Location map of the region of interest (from Google map in 2021)

B. Data about the region of interest

Burundi is located on the eastern side of the western branch of the East African Rift
System. It lies mainly on metamorphic, non-metamorphic and intrusive rocks and on an
Archean basement, which comprises the oldest rocks (OECD NUCLEAR ENERGY AGENCY, 1985). These
Archean rocks occur in three small complexes at the periphery of the country.
Cenozoic volcanic rocks are found in the north-western part of the country, while Cenozoic sediments
are found in Lake Tanganyika and in the Ruzizi valley (part of the NNW-SSE trending Ubendide-Ruzizide
Belt of the region). This belt consists mainly of geosynclinal metasediments, graphitic
metapelites, quartzites, meta-conglomerates, marbles, some basic metavolcanics and
volcano-sedimentary complexes.
Lower Proterozoic rocks, known as Ruzizian, are located in the north-west, as the continuation of
the Congo-Nile watershed of Rwanda, and east of the Ruzizi valley and Lake Tanganyika. Due
to the Ubendian-Ruzizian orogeny, there are many isoclinal folds dipping south-west and west.
The Ubendide-Ruzizide Belt was subsequently intruded by Ruzizian and Burundian plutons, ranging from
granites to ultramafites. Metamorphic grade ranges from greenschist to amphibolite facies.
Middle Proterozoic rocks, the most widespread in Burundi, are represented by the
Kibaride-Burundide Belt, which extends from southern Katanga (Democratic Republic of Congo) to Uganda.
This belt consists of metasediments (conglomerates, quartzites, metapelites).
Upper Proterozoic rocks in Burundi are known as the Malagarasian, which is part of the
Bukoban (Tanzania) - Katangan (Zambia and DRC) system. These deposits start with molasse-type
sediments derived from the Kibaride-Burundide mountain chains. Beds overlying these
early sediments are mostly unconformable and range from psephitic to pelitic.
The tectonic setting of the country is as follows (Dr Hahne, 2014):
− The major part of Burundi is built up of lithological units of the Precambrian Kibara
Belt. As this structure is not continuous along its total length, the north-eastern part of
this belt was renamed the Karagwe-Ankole Belt. Its main structural trend is NE-SW.
− Lake Tanganyika represents an active rift and is part of a major right-lateral fault zone
which is linked with the reactivated Kibaran shear zone.
As for mineral resources, cassiterite, wolframite and other tungsten minerals,
columbo-tantalite, bastnaesite and gold were mostly mined from secondary deposits. Important
resources of apatite, rare earths, nickel, copper, cobalt, lead, zinc, vanadium, titanium and
possibly platinum have been located in primary deposits.

Figure 2 shows that the ROI falls within the Lake Tanganyika and Ruvubu watersheds. Taking
advantage of the depth of Lake Tanganyika, the city of Rumonge, in our ROI, has a port for
boat transport. Figure 3 shows the ROI in a true colour satellite view.

Figure 2 : Hydrological map of Burundi with major watersheds (BGR, 2021)

Figure 3 : A view of the ROI on Google Earth (Google Earth June 2021)
The Karagwe-Ankole Belt is characterised by two structurally contrasting domains, the
Western Domain (WD) and the Eastern Domain (ED), separated by a boundary zone.
Figure 4 shows that our ROI is part of the Western Domain. The detailed geology of the ROI is
shown in Figure 5.

Figure 4 : Regional framework of the Karagwe-Ankole Belt: K, Kabanga massif; M, Musongati massif; B, Butare town (Fernandez-Alonso, et al., 2012)
Figure 5 : Detailed geology of the ROI (Fernandez-Alonso, et al., 2012)
Figure 6 : Map of the soils of the ROI (Institut National pour L'étude Agronomique Du Congo Belge, 2021)

Figure 7 : Agro-Ecological zones of Burundi (Gomez-Elipe & Aguirre Jaime, 2017)
Figure 6 presents the different soil types in the ROI. The corresponding agro-ecological
zones are illustrated in Figure 7.

After the collection of data on our ROI, the methodology of our study consists of:
- Identification of spaceborne images
- Calculations on spectra and bands to enhance the image and potentially reveal the
geological substrate
- Image classification.
C. Identification of spaceborne images

On the Sentinel-2 image download site, on 21 May 2021, we found 796 images
intersecting our ROI. From these images, we selected the best one
with respect to cloud coverage, which still has 60.2% of cloud coverage. This is
quite a large percentage, but the clouds are mainly located above Lake Tanganyika. Using the
SNAP software, we displayed that image in natural true colour (Figure 8) and in false colour
infrared (Figure 9). The black rectangle represents our ROI.

Figure 8 : Sentinel 2 scene in (natural) true colour

Figure 9 : Sentinel 2 scene in false colour (infrared)
For reasons of processing efficiency on our PC, we must reduce
the extent of our image. As its bands have different spatial resolutions, these must
first be brought to a single common resolution. This preliminary step is resampling.

1. Resampling

Generally, when an image is projected onto a new grid, the pixel centres of the target product
do not coincide with the pixel centres of the input product. Resampling is the
process of selecting and interpolating pixels of the source product to compute
the pixel values of the target product. The effects of resampling are especially visible if the
pixels in the target product are larger than the source pixels.

There are several resampling methods. However, the SNAP software only offers three:

- Nearest neighbour: every pixel value in the output product is set to the nearest input
pixel value. This method is very simple and fast, and it does not create new values by
interpolation. However, some pixels will be lost while others will be duplicated, which
can degrade fine detail in the resulting image.
- Bilinear interpolation: the new pixel is obtained by weighting the values of the four
surrounding pixels. With this method, extremes are smoothed and the image loses sharpness
compared to nearest neighbour. This leads to lower contrast, and new values appear
that were not initially present.
- Cubic convolution: the new pixel is obtained as in bilinear
interpolation, but by weighting the sixteen surrounding pixels. Extremes are also
smoothed, but the image is sharper than with bilinear interpolation. This method has the
same constraints as bilinear interpolation, and the computation time is much longer.

Looking at the different bands of our image, we can see that the finest spatial resolution
is 10 m, so we will resample all the bands to this resolution. The nearest neighbour option
should be used for categorical data since no new value is created. The bilinear and cubic options
should not be used with categorical data but produce better quality outputs for continuous data.
As our image bands are continuous data, we can exclude the nearest neighbour method. To
minimize computation time, we will use the bilinear method.
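
To make the difference between the first two methods concrete, the following is a minimal NumPy sketch, not the SNAP implementation. It upsamples a tiny 20 m band to 10 m (factor 2). The function names and the sample reflectance values are hypothetical, and edge pixels are simply clamped.

```python
import numpy as np

def upsample_nearest(band, factor):
    """Nearest neighbour: each source pixel is repeated factor x factor times."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

def upsample_bilinear(band, factor):
    """Bilinear: each target pixel is a distance-weighted mean of the
    four surrounding source pixels (coordinates are clamped at the edges)."""
    rows, cols = band.shape
    # Centres of the target pixels, expressed in source pixel coordinates.
    r = np.clip((np.arange(rows * factor) + 0.5) / factor - 0.5, 0, rows - 1)
    c = np.clip((np.arange(cols * factor) + 0.5) / factor - 0.5, 0, cols - 1)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)
    c1 = np.minimum(c0 + 1, cols - 1)
    wr = (r - r0)[:, None]  # vertical weights
    wc = (c - c0)[None, :]  # horizontal weights
    top = band[r0][:, c0] * (1 - wc) + band[r0][:, c1] * wc
    bottom = band[r1][:, c0] * (1 - wc) + band[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr

# Hypothetical 2 x 2 reflectance values of a 20 m band.
band_20m = np.array([[0.10, 0.30],
                     [0.50, 0.70]])
nearest_10m = upsample_nearest(band_20m, 2)
bilinear_10m = upsample_bilinear(band_20m, 2)
```

In this sketch the nearest neighbour output only contains the four original values, whereas the bilinear output contains intermediate values that were not present in the input, which is why it is reserved for continuous data.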

2. SRTM of the ROI

The Shuttle Radar Topography Mission (SRTM) is a space programme carried out by the USA through
its National Aeronautics and Space Administration (NASA) and National Geospatial-Intelligence Agency (NGA)
in February 2000. It consisted in acquiring radar images of the Earth to build a model
of its topography. The method resembles triangulation, but with only two points. After
a signal is emitted towards the Earth, one antenna (inside the shuttle) receives the returning
signal while another antenna, located outside at the end of a 60 m mast, also receives it. The same
signal is thus received at two slightly different angles. This method is known as single-pass interferometry.
The differences between the two received signals allow the surface elevation to be calculated. The resulting
digital elevation model describes the reflecting surface, so it does not separate the bare ground from anthropic constructions such as buildings.

The products obtained by this mission are:

- SRTM Non-Void Filled: data with a resolution of 1 arc-second (30 m) inside the USA
and 3 arc-seconds outside the USA (obtained by cubic convolution resampling)
- SRTM Void Filled: same resolution as Non-Void Filled but with some voids in several
- SRTM 1 Arc-Second Global: fewer voids, open access, resolution of 1 arc-second (30 m).

We subset our initial image so as to avoid most clouds and to leave out
the Democratic Republic of Congo, since we are working on Burundi. Figure 10 illustrates
the SRTM data obtained over the region of the subset. The colour scale ranges from blue (sea level) to
green (highest elevations).

Figure 10 : SRTM of the image subset of the ROI

D. Spectra and band maths
1. Enhanced image for geological substrate detection

To obtain a first view of the potential geological substrate of our ROI, we can
play with histogram anamorphosis and the look-up tables of the individual bands of our S2 image.
Looking at the statistics of the different bands, we observe that the bands in the short wavelength
range show low absorption by water. We therefore decided to work with a band at
longer wavelengths. Band 11 was chosen because it gives the best contrast between low-reflectance
and high-reflectance objects. Adjusting its histogram by keeping 95% of the data
and ranging the colours from blue to red (a 7-colour palette), we obtained the result
in Figure 11. We can see yellow-greenish formations in the north, the centre and the south. The
green part is vegetation, and the yellow part associated with it needs to be investigated,
as it may indicate a particular geological formation.
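
The histogram adjustment described above can be sketched as a percentile stretch. This is only an illustration, under the assumption that "95% of the data" means discarding 2.5% of the pixels in each tail. The function name and the synthetic band values are hypothetical and do not come from the actual scene.

```python
import numpy as np

def percentile_stretch(band, keep=95.0):
    """Clip a band to the central `keep` percent of its values and
    rescale the result linearly to the range [0, 1]."""
    tail = (100.0 - keep) / 2.0
    low, high = np.percentile(band, [tail, 100.0 - tail])
    stretched = (np.clip(band, low, high) - low) / (high - low)
    return stretched, low, high

# Synthetic stand-in for the Band 11 reflectance values (assumption).
rng = np.random.default_rng(0)
band11 = rng.gamma(shape=2.0, scale=0.05, size=(200, 200))
stretched, low, high = percentile_stretch(band11)
```

The stretched values can then be mapped to any colour palette. Pixels outside the retained interval saturate at the two ends of the palette, which is what increases the contrast in the central part of the histogram.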

Figure 11 : Image resulting of histogram anamorphosis of band 11
2. Spectra display

In order to compare different spectra in our image, we need to select some objects and
display their spectra. Figure 12 illustrates the objects selected.

Figure 12 : Objects selected for the spectra visualisation

The spectra of the selected objects are shown in Figure 13.

Figure 13 : View of the spectra of the objects

Water absorbs strongly and therefore has a low reflectance, so its spectrum remains very low.
At band 9, we observe a small jump in the water spectrum, because band 9 does not lie in a
good atmospheric window. For vegetation, we see an increase in the green, an absorption
in the red, and then an increase in the infrared. This is the typical behaviour of vegetation
spectra. The main difference between water and vegetation therefore appears in the infrared.
The soil spectrum has an overall ascending trend, while the others tend to decrease at the end.
Urban areas have the highest reflectance of all the objects, especially from the infrared
onwards. As we have seen earlier, it is not easy to distinguish rock types with histogram
anamorphosis in this case. We will therefore try to enhance differences between soil/rock types by using
band ratios.

3. Band ratios
3.1. NDVI

As we have seen earlier, vegetation has a low reflectance in the Red and a high
reflectance in the NIR, so we can use those two bands to compute an index that characterizes
vegetation: the Normalized Difference Vegetation Index (NDVI). This index is sensitive to the
quantity of vegetation. Typically, vegetation has an NDVI value higher than 0, generally between
0.1 and 0.7, and the greatest values correspond to very dense vegetation. Applying a histogram
anamorphosis to the NDVI leads to the same vegetation mask as earlier. We can also identify the
same features of interest (soil), as those features have NDVI values around 0. Indeed, as we have
seen in the spectrum view, the reflectance values of soil in the Red and the NIR are almost of the
same order, which leads to an NDVI value around 0. Figure 14 illustrates those features (red dots).
The colour scale ranges from black (low values) to white (high values).
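
As an illustration of the index discussed above, NDVI is computed as (NIR - Red) / (NIR + Red). For Sentinel-2 the usual choice is Band 8 for the NIR and Band 4 for the Red. The sketch below is not the SNAP band maths expression used in the study. The function name and the three sample reflectance pairs are hypothetical and were chosen only to mimic vegetation, bare soil and water.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index, (NIR - Red) / (NIR + Red).
    Pixels where both bands are zero are set to 0 to avoid dividing by zero."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    out = np.zeros_like(total)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Illustrative reflectances: vegetation, bare soil, water (assumed values).
red = np.array([0.08, 0.20, 0.02])
nir = np.array([0.40, 0.22, 0.01])
values = ndvi(red, nir)
```

With these assumed values, the vegetation pixel gets a clearly positive NDVI, the soil pixel stays close to 0 because its Red and NIR reflectances are of the same order, and the water pixel is negative.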

Figure 14 : NDVI band image
3.2. Other interesting ratios

In order to apply the iron oxide ratio to our image, we first mask the water with an
automated approach. Using the Land/Sea Mask, we obtained Figure 15.

Figure 15: Image obtained after Land/Sea Mask

The two features highlighted in the NDVI now clearly appear as outcrops. This is simply due
to the different range of values in the related histogram. After the removal of water, we can now compute
band ratios on our image. In this central area, the geological map shows that we have mainly
metasediments, so we apply the alteration ratio (Band 11/Band 12) to try to differentiate
sediments from other rock types. The result is displayed in Figure 16. The image confirms the
predominance of alteration soil (metasediments) in the ROI. We can also identify the two
previous features as the Archean craton of the geological map.

Figure 16: Result from the alteration ratio

Applying the iron oxide ratio (Band 4/Band 2), we can try to identify a feature located
in the south-east of the ROI. The result is displayed in Figure 17. According to the geological
map, this feature seems to be associated with a metamorphic intrusion.
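
The two ratios used above can be sketched with NumPy as follows. This is a minimal illustration, not the SNAP band maths workflow. The helper name, the tiny 2 x 2 arrays and the boolean land mask are hypothetical. Masked or invalid pixels are set to NaN so that water does not contribute to the ratio images.

```python
import numpy as np

def band_ratio(numerator, denominator, land_mask):
    """Pixel-wise ratio of two bands, computed on land pixels only.
    Water pixels and pixels with a non-positive denominator become NaN."""
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    valid = np.asarray(land_mask, dtype=bool) & (den > 0)
    ratio = np.full(num.shape, np.nan)
    np.divide(num, den, out=ratio, where=valid)
    return ratio

# Hypothetical reflectances; the bottom-right pixel is water.
land = np.array([[True, True], [True, False]])
b2 = np.array([[0.06, 0.10], [0.10, 0.02]])
b4 = np.array([[0.12, 0.10], [0.30, 0.01]])
b11 = np.array([[0.30, 0.24], [0.20, 0.02]])
b12 = np.array([[0.20, 0.24], [0.10, 0.01]])

alteration = band_ratio(b11, b12, land)  # Band 11 / Band 12
iron_oxide = band_ratio(b4, b2, land)    # Band 4 / Band 2
```

Keeping the water pixels as NaN also prevents them from stretching the histogram when the ratio image is displayed.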

Figure 17 : Image from the iron oxide ratio
4. Analysis of the topography

Topography can be a really useful tool for getting a first overview of potential
features of interest. In Figure 18, we can see that the outcrops correlate between the topography
image and the natural colour Land/Sea Mask image.

Figure 18: Comparison of the topography and the Land/Sea Mask images

5. Discussion on the paper of van der Werff & van der Meer (2016)

Van der Werff & van der Meer published a paper on the use of Sentinel-2A MSI and
Landsat 8 OLI to provide data continuity for geological remote sensing. The main goal of this
paper was to confirm the potential of Sentinel-2A MSI to provide this continuity.
They evaluated band ratio products of Sentinel-2A MSI on data of a ROI in
Spain and on simulated Sentinel-2A MSI data. As they had detailed information on the geology
of that place, they could confirm the results obtained with those band ratios.

The ratios applied were related to alteration and iron oxides. They also applied
vegetation ratios in order to mask the vegetation and minimize the bias it could have caused. By comparing
the band ratio results of Sentinel-2A MSI and Landsat 8 OLI, they found that,
to the human eye, the images obtained are quite similar, with a correlation of approximately 0.8
and higher. However, the data ranges differ significantly. They then confirmed their results by
comparing their images with a geological map of the area, and concluded that the
resulting maps for both Sentinel-2A and Landsat correspond to the geological model associated
with this ROI: an epithermal deposit. Moreover, Sentinel-2A has another great advantage,
its short revisit time, which could help to monitor rapid changes between acquisition dates.
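
The observation that two ratio images can be strongly correlated while their data ranges differ can be illustrated with a Pearson correlation coefficient. The sketch below uses purely synthetic arrays and hypothetical names. It does not reproduce the data or the exact procedure of the paper.

```python
import numpy as np

def image_correlation(image_a, image_b):
    """Pearson correlation between two co-registered images,
    ignoring pixels that are NaN or infinite in either image."""
    a = np.asarray(image_a, dtype=float).ravel()
    b = np.asarray(image_b, dtype=float).ravel()
    valid = np.isfinite(a) & np.isfinite(b)
    return float(np.corrcoef(a[valid], b[valid])[0, 1])

# Synthetic ratio images: the second is a rescaled, noisy copy of the first.
rng = np.random.default_rng(1)
ratio_sensor_a = rng.uniform(0.8, 1.6, size=(100, 100))
noise = rng.normal(0.0, 0.2, size=(100, 100))
ratio_sensor_b = 2.0 * ratio_sensor_a + 0.5 + noise
r = image_correlation(ratio_sensor_a, ratio_sensor_b)
```

Because the correlation coefficient is insensitive to a linear change of scale and offset, the two synthetic images remain highly correlated even though their value ranges barely overlap.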

As mentioned earlier, Sentinel-2A is a great tool for building the geological model of a
ROI. In particular, it can be very useful for mapping iron oxides. When the study area is
large, especially when mapping iron to assess the suitability of an area for growing crops,
remote sensing is the only suitable tool for surveying at a high temporal and spatial frequency. Yet
a relatively high spectral resolution is needed to map iron contents from reflectance data
(Van der Werff & Van der Meer, 2015), since the 0.9 μm iron absorption feature has to be covered.
While spaceborne sensors like ASTER and Landsat only have one band for this purpose,
Sentinel-2 has several. Moreover, by applying a curve-fitting
technique to Sentinel-2 data, it is possible to approximate the iron absorption feature at a
hyperspectral resolution. We thus gain access to information on the 0.9 μm absorption
feature that until now was reserved for hyperspectral instruments. With Sentinel-2, this
information comes at a lower spatial resolution than hyperspectral data, but with a large spatial
coverage and a frequent revisit time.

6. Characteristics of sensor to favour for improvement of geological mapping

As we have just seen, it would be good to combine the higher spatial resolution of hyperspectral
data with the large spatial coverage and frequent revisit time of a sensor like Sentinel-2.

For high spatial resolution, the sensor must have a small Instantaneous Field of View (IFOV).
However, this reduces the amount of energy that can be detected, as the area of the ground
resolution cell within the IFOV becomes smaller. This leads to reduced radiometric resolution.

As detecting a large amount of energy gives better quality images, a high
radiometric resolution is desirable. But increasing the radiometric resolution leads to a
decrease in spatial resolution. To increase the amount of energy detected (and thus the
radiometric resolution) without reducing spatial resolution, we would have to broaden the
wavelength range detected for a particular channel or band. Unfortunately, this would reduce
the spectral resolution of the sensor. Conversely, coarser spatial resolution would allow
improved radiometric and/or spectral resolution (Government of Canada, 2021).

In the end, we simply cannot have it all. These three types of resolution must be
balanced against the desired capabilities and objectives of the sensor.

E. Image classification

In this step, we classify the pixels of the image into classes. First, we carry out
an unsupervised classification, letting the software do it by itself. Afterwards, we
make a supervised classification by giving the software a training set in which we have
previously labelled some pixels as corresponding to known features.

For both classifications, we use Bands 2, 3, 4 and 8, as they natively have the best
spatial resolution (10 m). For each classification, we consider cases with and without elevation.

1. Unsupervised classification

The technique used for this part is ISODATA (Iterative Self-Organizing Data Analysis).

We make 2 classifications, of 10 clusters and 5 clusters respectively, with a convergence
criterion of 99%. This means that the program stops when 99% of the pixels no longer change
cluster from one iteration to the next. The minimum cluster size was set to 49. The results are displayed
in Figure 19 and Figure 20. In Figure 19, we can see that water and the outcrop previously
identified are placed in the same cluster. This is because our image contains clouds and their
shadows. These clouds and shadows account for two clusters, so the 3 remaining ones have to contain
vegetation, soil, outcrop and water. In Figure 20, there are just 5 predominant colours (clusters),
so an optimal number of clusters should be between 7 (5 clusters + 2 for clouds and shadows)
and 8 (10 minus the 2 for clouds and shadows).
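
The convergence criterion described above can be illustrated with a simplified clustering loop. This sketch is a plain k-means iteration, not the full ISODATA algorithm of MultiSpec: it has no cluster splitting or merging and no minimum cluster size. The function name and the synthetic four-band pixels are hypothetical.

```python
import numpy as np

def cluster_pixels(pixels, n_clusters, convergence=0.99, max_iter=100, seed=0):
    """Iteratively assign pixels to the nearest cluster centre and stop when
    at least `convergence` of the pixels keep the label of the previous iteration."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), size=n_clusters, replace=False)
    centres = pixels[start].astype(float)
    labels = np.full(len(pixels), -1)
    for _ in range(max_iter):
        distances = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        unchanged = np.mean(new_labels == labels)
        labels = new_labels
        if unchanged >= convergence:
            break
        # Move each centre to the mean of its current members.
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members) > 0:
                centres[k] = members.mean(axis=0)
    return labels, centres

# Synthetic pixels with 4 features standing in for Bands 2, 3, 4 and 8.
rng = np.random.default_rng(42)
water = rng.normal([0.03, 0.03, 0.02, 0.01], 0.005, size=(200, 4))
soil = rng.normal([0.10, 0.14, 0.20, 0.28], 0.005, size=(200, 4))
pixels = np.vstack([water, soil])
labels, centres = cluster_pixels(pixels, n_clusters=2)
```

Adding elevation to the classification would amount to appending a fifth column to `pixels`, ideally rescaled so that its range is comparable to that of the reflectance bands.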

Figure 19 : Unsupervised classification with 5 clusters and no elevation

Figure 20 : Unsupervised classification with 10 clusters and no elevation

Figure 21: Unsupervised classification with 10 clusters and elevation
From Figure 21 and Figure 22, we can observe that the significant
influence of topography is on the classification of water.

Figure 22: Unsupervised classification with 5 clusters and elevation

2. Supervised classification
2.1. Selection of training sets

As supervised classification needs a set of data for which we know the features in place,
we have to provide a training set representing these data. The choice was made on the basis of the
previous results. We defined 4 features in our training set: water, vegetation, outcrop and soil.
Figure 23 shows their positions (red circles).

Figure 23 : Position of the training sets

2.2. Supervised classification techniques

Many techniques are available here. We will focus on two of them: Maximum
Likelihood and Minimum Distance (SNAP help).

The maximum likelihood classifier is one of the most popular classification methods. Each
pixel is assigned to the class for which its likelihood is highest. The
likelihood is defined as the posterior probability of a pixel belonging to a given class.

With the minimum distance classifier, any pixel in the scene is categorized using the
distance between the image data of the pixel and the means of the classes derived from the
training sets. The pixel is assigned to the class with the shortest distance.
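
The two classifiers can be sketched as follows. This is a minimal illustration and not the SNAP or MultiSpec implementation. The maximum likelihood version assumes Gaussian classes with equal prior probabilities. All function names and the synthetic two-band training sets are hypothetical.

```python
import numpy as np

def train_statistics(training_sets):
    """Mean vector and covariance matrix of each class, from its training pixels."""
    means = np.array([s.mean(axis=0) for s in training_sets])
    covs = np.array([np.cov(s, rowvar=False) for s in training_sets])
    return means, covs

def minimum_distance_classify(pixels, means):
    """Assign each pixel to the class whose mean is closest (Euclidean distance)."""
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return distances.argmin(axis=1)

def maximum_likelihood_classify(pixels, means, covs):
    """Assign each pixel to the class with the highest Gaussian log-likelihood
    (equal priors assumed, constant terms dropped)."""
    scores = []
    for mean, cov in zip(means, covs):
        diff = pixels - mean
        mahalanobis = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        scores.append(-0.5 * (mahalanobis + logdet))
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Synthetic training sets in two bands: a compact class and a spread-out class.
rng = np.random.default_rng(7)
compact = rng.normal([0.05, 0.03], 0.01, size=(300, 2))
spread = rng.normal([0.25, 0.30], 0.10, size=(300, 2))
means, covs = train_statistics([compact, spread])

pixels = np.array([[0.05, 0.03], [0.25, 0.30], [0.13, 0.14]])
md_labels = minimum_distance_classify(pixels, means)
ml_labels = maximum_likelihood_classify(pixels, means, covs)
```

The third test pixel is slightly closer to the mean of the compact class, so the minimum distance rule assigns it to that class. The maximum likelihood rule also accounts for the spread of each class and assigns it to the more variable class instead, which is the main practical difference between the two techniques.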


Conclusion

Sentinel-2 remote sensing is an innovative tool for prospecting the geology of a region.
In this report, our aim was to analyse the potential of remote
sensing in support of the geological mapping of a ROI located in Burundi. This ROI is located
in the south-west of the country and covers an area of about 4670 km². It covers 6 provinces and several
natural forest reserves.

From the point of view of geology, Burundi mainly lies on metamorphic,
non-metamorphic and intrusive rocks, and on Archean rocks. In accordance with the geological
map of the area, metasediments cover most of the ROI.

The methodology of the work consisted in obtaining a satellite image (Sentinel-2) of the
ROI and carrying out several treatments (spectra and band maths, and image classification) in
order to try to highlight lithological variations in agreement with the actual geology of the area.

First of all, we resampled the bands so that they all have the same spatial resolution,
which then allowed us to cut our ROI out of the initial image. Then, the analysis of reflectance
spectra allowed us to identify different surface types (vegetation, urban area, bare soil and water). It was thus
possible subsequently to apply calculations to the bands in order to identify geological
structures of interest. In addition, the analysis of the topography of the area allowed us to
recognise the geological structures initially detected during the calculations on the bands. We thus
concluded that topography is a good first indicator of structures of interest in the field.

A discussion on the characteristics of an innovative sensor was then carried out. It turns
out that it is impossible to maximise spatial, radiometric and spectral resolutions at the same time.

Finally, the unsupervised classification of our ROI made it possible to highlight a
number of clusters suitable for a supervised classification. We were also able to assess the
impact of topography on a classification.

Nevertheless, the technical limitations of our computer prevented us from
going further in the study with a supervised classification of our image.

At the end of this study, it appears clear that remote sensing has enormous potential
as a support for geological mapping, especially through the use of Sentinel-2, which
offers a favourable balance of the desired characteristics (spatial, radiometric and spectral
resolutions) in a remote sensing sensor.

Improvements for the future of geological remote sensing

One possible improvement of geological remote sensing in the future is the use of
artificial intelligence. This can be achieved through machine learning, which is a
subfield of artificial intelligence. The application of this method in geosciences and remote
sensing is fairly new and limited. Machine learning models can be regarded as universal approximators,
as they can learn the behaviour of a system from a set of training data. The great innovation is
that with machine learning we do not need prior knowledge about the nature of the
relationships between the data. It can therefore be very useful in supervised classification, for example.
In fact, the applications of machine learning can be categorized into three areas (Lary, Alavi,
Gandomi, & Walker, 2016):

• The system's deterministic model is computationally expensive and ML can be used as
a code accelerator tool.
• There is no deterministic model but an empirical ML-based model can be derived using
the existing data.
• Classification problems.

In the end, we can combine the power of Sentinel-2 for image acquisition with the
interpretation capacity of machine learning.

