Results in Engineering 17 (2023) 100940

Technical Note

The Sustainable Development Goals and Aerospace Engineering: A critical note through Artificial Intelligence
Alejandro Sánchez-Roncero a , Òscar Garibo-i-Orts a,f , J. Alberto Conejero a , Hamidreza Eivazi b ,
Fermín Mallor b , Emelie Rosenberg b , Francesco Fuso-Nerini c,d , Javier García-Martínez e ,
Ricardo Vinuesa b,d , Sergio Hoyas a,∗
a Instituto de Matemática Pura y Aplicada, Universitat Politècnica de València, Camino de Vera 46024, València, Spain
b ACES – Association of Spanish Scientists in Sweden, Stockholm, Sweden
c Division of Energy Systems, KTH Royal Institute of Technology, Stockholm, Sweden
d KTH Climate Action Centre, Stockholm, Sweden
e Molecular Nanotechnology Lab, Department of Inorganic Chemistry, University of Alicante, Alicante, Spain
f GRID - Grupo de Investigación en Ciencia de Datos, Valencian International University - VIU, València, Spain


The 2030 Agenda of the United Nations (UN) revolves around the Sustainable Development Goals (SDGs). A critical step towards that objective is identifying whether scientific production aligns with the SDGs' achievement. To assess this, funders and research managers need to manually estimate the impact of their funding agenda on the SDGs, focusing on accuracy, scalability, and objectiveness. With this objective in mind, in this work, we develop ASDG, an easy-to-use Artificial-Intelligence-based model for automatically identifying the potential impact of scientific papers on the UN SDGs. As a demonstrator of ASDG, we analyze the alignment of recent aerospace publications with the SDGs. The Aerospace data set analyzed in this paper consists of approximately 820,000 papers published in English from 2011 to 2020 and indexed in the Scopus database. The most-contributed SDGs are 7 (on clean energy), 9 (on industry), 11 (on sustainable cities), and 13 (on climate action). The establishment of the SDGs by the UN in the middle of the 2010s did not significantly affect the data. However, we find clear discrepancies among countries, likely indicative of different priorities. Also, different trends can be seen in the most and least cited papers, with apparent differences in some SDGs. Finally, the number of abstracts the code cannot identify decreases with time, possibly showing the scientific community's growing awareness of the SDGs.

Keywords: Sustainability, United Nations, Sustainable Development Goals, Artificial Intelligence, Aerospace Engineering

1. Introduction

In 2015, all member states of the United Nations (UN) adopted the 2030 Agenda for Sustainable Development. The UN intends to promote peace and prosperity for people and the planet with a vision for the near future. To make that vision a reality, the 2030 Agenda consists of 17 Sustainable Development Goals (SDGs) [1]. They represent the actions that countries from all over the world (both developed and developing) should implement as global cooperation for the future of our planet.
The 17 SDGs, see description in [1], are as follows (those most closely related to Aerospace Engineering have been written in italic font):

• SDG 1: End poverty in all its forms everywhere.
• SDG 2: End hunger, achieve food security and improved nutrition and promote sustainable agriculture.
• SDG 3: Ensure healthy lives and promote well-being for all at all ages.
• SDG 4: Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all.
• SDG 5: Achieve gender equality and empower all women and girls.

* Corresponding author.
E-mail address: [email protected] (S. Hoyas).

Received 4 November 2022; Received in revised form 9 January 2023; Accepted 3 February 2023
Available online 10 February 2023
2590-1230/© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

• SDG 6: Ensure availability and sustainable management of water and sanitation for all.
• SDG 7: Ensure access to affordable, reliable, sustainable, and modern energy for all.
• SDG 8: Promote sustained, inclusive, and sustainable economic growth, full and productive employment, and decent work for all.
• SDG 9: Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation.
• SDG 10: Reduce inequality within and among countries.
• SDG 11: Make cities and human settlements inclusive, safe, resilient and sustainable.
• SDG 12: Ensure sustainable consumption and production patterns.
• SDG 13: Take urgent action to combat climate change and its impacts.
• SDG 14: Conserve and sustainably use the oceans, seas, and marine resources for sustainable development.
• SDG 15: Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and halt biodiversity loss.
• SDG 16: Promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels.
• SDG 17: Strengthen the means of implementation and revitalize the Global Partnership for Sustainable Development.

This article aims to contribute to two main questions: is the Aerospace Engineering scientific community focused on fulfilling the SDGs? What are the most relevant SDGs in this community? To the best of our knowledge, there is no published work about this relationship. In this work, our answer is given using Artificial Intelligence (AI) tools.
It is important to note that the recent paradigm change introduced by the fast digitalization of business, academia, daily life, and even policy-making is profound. A recent study by Vinuesa et al. [2] examined in depth how AI affects the accomplishment of the UN's 2030 Agenda. Although they found that 79% of the targets would be positively affected by AI, they also noted that the growth of AI could hinder or even have a detrimental impact on the achievement of 35% of these targets. The SDGs are all interconnected, and while there are numerous synergies, it is vital to recognize and properly document any trade-offs to reach the full potential of AI's ability to contribute to creating a sustainable future. Furthermore, Gupta et al. [3] extended this work to discuss the implications of AI on the SDGs at the indicator level. In this regard, it is crucial to emphasize that employing AI-based technologies to achieve the SDGs requires clear and understandable strategies. According to Vinuesa and Sirmacek [4], deploying interpretable AI would produce an algorithmic usage that focuses on accountability and transparency.
With this in mind, a preliminary version of our code ASDG (Automatic Classification of Impact to Sustainable Development Goals) can be found in [5]. We believe that a promising way to achieve significant progress in the SDG Agenda is by using AI-based methods to inform policy decisions to maximize the synergies and minimize the trade-offs. With this goal in mind, we created ASDG. This AI-based framework constitutes a step in this direction by enabling the automatic classification of hundreds of thousands of scientific papers by their impact on each SDG.
A critical example of why it matters to understand where a given funder stands with respect to the SDGs is climate change. The SDGs must be accomplished while we are amid a climate emergency, as confirmed in the last report of the Intergovernmental Panel on Climate Change [6]. This is particularly important in the case of Aerospace Engineering [7,8]. To illustrate the importance of aerodynamics, about a quarter of today's energy is spent moving fluids along pipes or vehicles through air or water. Turbulence dissipates 25% of this energy, which is responsible for up to 5% of the CO2 emitted by humanity every year [9]. Considering that 340 billion liters of fuel were used in 2017 for air transportation worldwide (as reported by IATA [10]), there is considerable potential for energy savings and fuel consumption reduction. Before the coronavirus disease 2019 (COVID-19) pandemic, this quantity grew yearly at an unsustainable 3% rate.
To summarize, ASDG can be seen as a contribution to SDG 17. However, we will not consider this SDG as it is the most difficult to identify. Certainly, few papers coming from the aerospace world are devoted to this issue. Excepting this SDG, ASDG is ready to analyze any field of science and technology. This work presents the first application of ASDG: Aerospace Engineering. A summary of ASDG is given in the next section, together with a description of the database. The results are explained in the third section. Conclusions and future work are described in the last section.

2. Methods

The code employed for this article, ASDG, can identify the connection between a paper and an SDG through its abstract. It uses four different models: Non-Negative Matrix Factorization (NMF) [11], Distributed Representations of Topics (Top2Vec) [12], Latent Dirichlet Allocation (LDA) [13], and BERTopic [14]. Due to their inherently different nature, the information that each model extracts from a text is different. In other words, their functionalities are complementary. To take advantage of this fact, ASDG introduces a voting mechanism. Similar ideas have been used very recently for studying the social network Twitter [15]. In the voting stage, ASDG takes the scores of each model for each text as inputs. Using this information, ASDG decides which identified SDGs have enough confidence to assume that the text relates to them.
The validation of ASDG was carried out in a previous publication [5]. The model's training (based on 510 manually-curated text files related to each SDG) was described in that work. Briefly, after downloading all papers referenced in [2], for a total of 186 works, we manually selected papers with at least a differentiated Abstract and Body, extracting the sections in 40% of them. A Deep Neural Network [16] was used to extract the remaining 60% automatically. This tool works on images instead of converting the PDF file to text. We validated this tool with the extracted PDF files and checked every abstract. As the authors of [2] classified all these papers based on an expert consensus, we used their labels as the reference against which ASDG's classification was compared, obtaining an 81% agreement.
The methods mentioned above are briefly described next.

2.1. NMF

The first method is the Non-negative Matrix Factorization (NMF) model [11]. This method can reduce the space dimension of the problem, extracting essential features. We consider 16 topics, as SDG 17 is currently not considered. All training and validation texts have been preprocessed. This includes:

• Word lemmatization and stop-word removal.
• Removing numeric and non-ASCII characters.
• The minimum word frequency and document frequency were set to 1. This configuration means that no words are excluded.
• Bigrams were allowed.
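The preprocessing steps listed above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the ASDG implementation: the function names `preprocess` and `with_bigrams` and the tiny `STOP_WORDS` set are hypothetical, and lemmatization is omitted because it needs an external library that the text does not name.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (assumption: a real
# pipeline would use a full list and also lemmatize each token).
STOP_WORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"})


def preprocess(text: str) -> List[str]:
    """Drop non-ASCII and numeric characters, lowercase, remove stop words."""
    # Remove every non-ASCII character.
    ascii_text = text.encode("ascii", errors="ignore").decode("ascii")
    # Replace digits and punctuation by spaces, keeping letters only.
    letters_only = re.sub(r"[^A-Za-z\s]", " ", ascii_text)
    return [t for t in letters_only.lower().split() if t not in STOP_WORDS]


def with_bigrams(tokens: List[str]) -> List[str]:
    """Append adjacent-token bigrams so that two-word terms become features."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


tokens = preprocess("Turbulence dissipates 25% of the energy – in pipes")
features = with_bigrams(tokens)
```

With a minimum word and document frequency of 1, every token and bigram produced this way would be kept as a feature for the topic models.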


All training texts are automatically identified with the appropriate SDG, and this information is used to associate each topic with one SDG. The score corresponding to each topic for each text file is queried after the model has been trained. The named SDGs are multiplied by that score, then recorded in a subject association map (nTopics × 17). The values for each topic are normalized (values/sum(values)), and those topics with scores of less than 0.1 are discarded. The final result is a matrix, where each row represents the likelihood that each SDG will be associated with a particular and single topic.

2.2. Top2Vec

A Top2Vec model [12] was trained using the embedding model "all-MiniLM-L6-v2". This embedding was pre-trained on a larger corpus, which works better when the training corpus is small. Only a light preprocessing is required, to remove non-ASCII characters. In this case, no document segmentation is defined. The extraction of topics was unsupervised. Since the association of the training texts with the SDGs was known beforehand, we queried the associated texts and their scores for each topic, creating an association matrix as was done with the NMF model.

2.3. LDA

A latent-Dirichlet-allocation model [13] was also trained with the following configuration:

• Number of topics: 16.
• Passes: 400. Iterations: 1000. Chunk size: 2000.
• Bigrams are allowed.
• Minimum word count: 1. Maximum word frequency: 0.7.

The training and validation texts were preprocessed similarly to the NMF case. In this case, the model assumes that the documents follow a Dirichlet distribution over topics and topics over words. Thus, it inherently allows having more than one topic in each document. The association matrix was calculated as with the other models. Only the UN training texts were used. Note that this method has been successfully employed to automatically classify the AI curricula of a wide range of universities based on their respective contents [17].

2.4. BERTopic

BERTopic is a topic modeling technique very similar to Top2Vec since both are unsupervised clustering-based techniques [14]. BERTopic extracts coherent topic representations by implementing a class-based variation of the term frequency-inverse document frequency (TF-IDF). The steps it follows are:

• The document embeddings are generated with a pretrained transformer-based language model. Embedded texts that are semantically similar will be placed close to each other in semantic space. In this way, document-level information is extracted from the corpora.
• The document embeddings are dimensionally reduced. This is because, as the dimensionality of the data increases, the distance to the closest point tends to approach the distance to the farthest point. As a result, in high-dimensional space, spatial locality becomes ill-defined, and distance measures differ little [14].
• The reduced embeddings are clustered with a density-based method. Centroid-based approaches assume that words near the cluster's centroid are most representative of that cluster. However, in practice, a cluster will not always lie within a sphere around a cluster centroid, which might lead to the extraction of misleading topics.
• Topic vectors are extracted from the clusters. A class-based version of TF-IDF is used to overcome the limitation of the centroid-based perspective. This has the advantage of separating the clustering technique from the topic generation, allowing more flexibility.

2.5. Voting

A combination of the previously described models is used to take advantage of their respective strengths, as the models complement each other. After a careful study, one document is linked to an SDG if:

• Any model's score on an SDG is greater than 0.4 (maximum 0.5), or
• The model's score on an SDG is greater than 0.1 for LDA and BERTopic.

Using this voting system, we successfully classified 81% of the papers based only on the information in the abstract.

2.6. Database and implementation

Regarding the database, we have downloaded 820,000 documents, comprising articles, conference papers, and books from the Scopus database [18]. The search criterion relied on seeking the words "aerospace," "aeronautics," "aeronautical," and "aviation" in all the metadata of the papers. We selected papers from 2011 to 2020, saving the following data:

• Abstract
• Year
• Citations, as of November 2022
• Country
• Keywords
• Open-access information

For obvious reasons, the language of the document must be English. This procedure may over-represent English-language affiliations, and some papers of one of the authors, e.g., [19,20], are not found based on these keywords. However, the list of special cases can be extremely long, and it is nearly impossible to add every possible author to the list. Nevertheless, the number of papers studied is high. We firmly believe it represents the state of Aerospace Engineering with respect to the SDGs, as we are analyzing a production of more than 80,000 papers yearly. The set was downloaded in packages of around 20,000 documents each, taking special care not to repeat any document. Finally, around one hundred papers were discarded because they did not contain an abstract.
To summarize, and following the flowchart of Fig. 1, for every document, we have performed the following algorithm:

1. Extract the abstract and metadata from a CSV file.
2. Lemmatize and remove any non-ASCII character.
3. Compute the score $\alpha_x$ for every SDG and every method $x$.
4. Evaluate the score for every SDG, following the rules of the box of Fig. 1.
5. Extract the SDG with the maximum score.
6. Save this SDG with the document's metadata to an output file.

This algorithm was implemented in Python version 3.9. The code is easily parallelizable, as every document can be run independently. We ran it on a typical computer, taking less than 3 hours to classify all the abstracts.

3. Results

To study the results of our analysis, we will use the frequency $F$, defined as

$$F = \frac{D_{\mathrm{sdg}}}{D_{\mathrm{total}}}.$$


Fig. 1. Flowchart of the ASDG framework, where 𝛼𝑥 stands for the score in method 𝑥 for SDG sdg. This process has been carried out for the 820,000 documents in
the database.
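The decision box of the flowchart corresponds to the voting rule of Section 2.5, and a minimal sketch of it may help to fix ideas. The function names `linked_sdgs` and `top_sdg` and the nested-dictionary layout of the scores are hypothetical. Two readings are assumptions rather than statements from the text: the second rule is taken to mean that both LDA and BERTopic score above 0.1, and the final SDG is taken to be the linked SDG with the largest single-model score.

```python
from typing import Dict, List, Optional

HIGH_THRESHOLD = 0.4   # any single model above this links the SDG
LOW_THRESHOLD = 0.1    # LDA and BERTopic must both exceed this (assumed reading)
AGREEING_MODELS = ("lda", "bertopic")

Scores = Dict[str, Dict[int, float]]  # scores[model][sdg] -> score


def linked_sdgs(scores: Scores) -> List[int]:
    """Return the SDGs that pass the voting rule, in increasing order."""
    all_sdgs = sorted({sdg for per_model in scores.values() for sdg in per_model})
    linked = []
    for sdg in all_sdgs:
        any_high = any(m.get(sdg, 0.0) > HIGH_THRESHOLD for m in scores.values())
        pair_agrees = all(
            scores.get(name, {}).get(sdg, 0.0) > LOW_THRESHOLD
            for name in AGREEING_MODELS
        )
        if any_high or pair_agrees:
            linked.append(sdg)
    return linked


def top_sdg(scores: Scores) -> Optional[int]:
    """Pick the linked SDG with the largest score, or None if unidentified."""
    candidates = linked_sdgs(scores)
    if not candidates:
        return None
    return max(candidates, key=lambda s: max(m.get(s, 0.0) for m in scores.values()))


example = {
    "nmf": {7: 0.45, 13: 0.05},
    "top2vec": {7: 0.20, 13: 0.08},
    "lda": {13: 0.15, 9: 0.12},
    "bertopic": {13: 0.20, 9: 0.05},
}
```

In this toy example SDG 7 is linked by the first rule (NMF scores 0.45), SDG 13 by the second (LDA and BERTopic both exceed 0.1), and SDG 9 is rejected because only LDA exceeds 0.1. A document for which no SDG passes either rule is reported as unidentified.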

Table 1
Frequency, expressed as a percentage, of selected SDGs in 2011 and 2020. The last row shows the difference between these two rows.

SDG     3       7       9       11      12      13      15
2011    5.30    21.92   12.8    9.85    12.1    20.4    4.00
2020    4.84    24.10   13.1    9.80    14.43   17.39   4.16
Diff    -0.46   2.19    0.30    -0.05   2.29    -2.96   0.16
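Entries such as those of Table 1 are frequencies, that is, the number of papers assigned to an SDG divided by the total number of papers in the chosen set, expressed as a percentage. The sketch below shows one way to compute them and their change between two sets. The helper names `sdg_frequencies` and `frequency_change` are hypothetical, and the label lists are toy data invented for illustration, not values from the database.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

Label = Optional[int]  # SDG number, or None for an unidentified abstract


def sdg_frequencies(labels: Iterable[Label]) -> Dict[Label, float]:
    """Frequency (in percent) of each label within one set of papers."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the set of papers is empty")
    return {label: 100.0 * n / total for label, n in counts.items()}


def frequency_change(before: Dict[Label, float],
                     after: Dict[Label, float]) -> Dict[Label, float]:
    """Difference in percentage points, after minus before, per label."""
    labels = set(before) | set(after)
    return {lab: after.get(lab, 0.0) - before.get(lab, 0.0) for lab in labels}


# Toy data (hypothetical): one label per paper for two different years.
freq_a = sdg_frequencies([7, 7, 13, 13, None])
freq_b = sdg_frequencies([7, 7, 7, 13, None])
diff = frequency_change(freq_a, freq_b)
```

Working with frequencies rather than raw counts keeps sets of different sizes comparable, which matters when the yearly number of papers grows.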

Fig. 2. Distribution of the documents in frequency grouped in two sets, namely 2011–2015 (blue circles) and 2016–2020 (red squares). Frequency is obtained by dividing the papers assigned to a particular SDG over the total papers of that period. Note that the SDGs are grouped into the three categories reported in [2], i.e., Society, Economy, and Environment. The black square indicates the fraction of documents ASDG could not identify.

In this definition of the frequency, $D$ indicates any set of abstracts with a particular restriction. For example, $D$ could be the set of all papers published in 2011. $D_{\mathrm{total}}$ is the total number of papers in that set, and $D_{\mathrm{sdg}}$ is the number of papers in the subset identified for a particular SDG. In many cases, this parameter will be preferable to the number of documents. In every case, the definition of the particular subset will be absolutely clear.
As the SDGs were launched in 2015, there has been enough time to see the consequences of their introduction. The global picture is shown in Fig. 2. Here we show the frequency of every SDG and of the non-classified abstracts. To facilitate the presentation of the data, we have divided the data set into two large groups, before and after the adoption of the SDGs in 2015. The years 2011–2015 are represented by blue circles, and 2016–2020 by red squares. We have also grouped the SDGs following the classification of [2]. Several ideas can be gained from this image. First of all, there is a large number of unidentified papers. There are two causes for this:

• This study has been done with abstracts, which makes the identification more difficult.
• We have used a high threshold, avoiding false identifications as much as possible.

Second, some SDGs seem to be more important in Aerospace Engineering, namely:

• Society: SDGs 3 (good health), 7 (clean energy) and 11 (sustainable cities).
• Economy: SDGs 9 (industry) and 12 (responsible consumption).
• Environment: SDGs 13 (climate action) and 15 (life on land).

After studying the global picture, we will focus on these SDGs.
Finally, the variation after the introduction of the SDGs is small. There is a positive point here: ASDG can better identify the abstracts, probably because researchers are more aware of the SDGs. However, there is a negative aspect too: the frequency of papers about climate change is lower after adopting the SDGs than before.
This global situation is completed with Fig. 3. The number of papers has almost doubled during the last ten years. This happens for all the SDGs, and the variations are small. Thus, as the number of documents increases steadily by at least 6% yearly, we think the frequency is a better parameter than the number of papers. Fig. 3 (right) also shows that the share of unidentified papers decreases steadily, reinforcing the idea that researchers are steering their work toward fulfilling the SDGs.
Apart from that, no clear pattern emerges from Fig. 3. The trends in the most popular SDGs seem to be present before the appearance of the SDGs. Some further insight can be gained from Table 1. As we can see, SDGs 7 and 12 have attracted more attention. As we mentioned earlier, this has partly been at the cost of SDG 13. This means that climate action is losing ground as responsible production grows. As these SDGs are tightly coupled, this is probably not as serious as it


Fig. 3. Distribution of the database in terms of absolute numbers (left) and frequency (right) for every year. The SDGs are represented by their colors, starting from
1 at the bottom and following the order in Fig. 2 in a counterclockwise sense. The black region corresponds to unidentified abstracts, and the white dotted lines
indicate the transition between the three large groups.

Fig. 4. Distribution in the frequency of selected SDGs by year. Background: China, dashed lines: European Union, and dash-dotted lines: USA.

Fig. 5. Sorting of the most important SDGs identified by their colors. The position of every dot indicates the order taking into account the mean citation value of the SDG that year. The size of the marker indicates the order considering only the most cited paper. The width of the line indicates the order considering the percentage of open-access documents.

seems from the point of view of climate action. It could be that further research in engineering (7 and 12) is necessary for advancing in SDG 13.
ASDG can also analyze the priorities of different countries and supranational entities. In Fig. 4, we can see the relative importance of the selected SDGs for the USA, China, and the European Union. For the last ten years, the Aerospace Engineering community in the USA has focused on climate action. However, China seems to focus on engineering and production instead of climate action. Quite curiously, the European Union is closer to China than to the USA. In this case, extensive effort is devoted to SDG 9. Also, SDG 12 is receiving much attention, clearly growing with the years.
Finally, we will focus our analysis on the number of citations. Sadly, this is the first metric used to measure the quality of a researcher. Thus, it is essential to identify whether some fields are preferred because work on them attracts more citations. In Fig. 5, we have sorted the SDGs by the mean citation index (MCI), i.e., the number of citations divided by the number of papers. The size of the dots indicates the order considering only the most cited document. The largest marker corresponds to the maximum. Finally, the width of the line between years $i$ and $i+1$ indicates the availability of the papers as open access (OA) in the year $i+1$. Again, the wider the line, the greater the percentage of OA papers.
Two groups emerge in this figure: the top cited SDGs, 3, 12, 7, and 15, and the less cited SDGs, 9, 11, and 13. This last case is curious since there are highly cited papers in this SDG (years 2012, 2015, 2016, and 2017), but it is almost always the SDG with the lowest MCI and a high percentage of OA papers. This could indicate that many documents in this SDG receive very few (or no) citations, so apart from receiving less attention every year, there are many low-quality papers. On the other hand, SDGs 3, 7, and 12 exhibit a very good MCI. In the case of SDG 7, the number of citations appears to be independent of the OA percentage.
Finally, it is also important to note that SDG 3 results can be biased, as health-related publications tend to have a very large number of citations.

4. Conclusions

In this work, we have used the tool ASDG [5] to study the alignment of Aerospace Engineering with the Sustainable Development Goals of the United Nations. ASDG uses NMF, LDA, Top2Vec, and BERTopic methods to identify the SDGs. We identified two main questions in the introduction. First, the introduction of the SDG Agenda did not have a clear impact on the scientific production of the Aerospace Engineering community. This result is quite concerning, as the community's attention is drifting away from extraordinary and urgent challenges such as climate change, although differences among countries exist. Second, we have identified 7 SDGs to which most of the works on Aerospace Engineering belong.


One possible limitation of this work is that it is based on abstracts, References

with far better availability than the whole paper. Moreover, at this
