Trends in Artificial Intelligence
Hamido Fujita
Philippe Fournier-Viger
Moonis Ali
Jun Sasaki (Eds.)
Lecture Notes in Artificial Intelligence 12144
Series Editors
Randy Goebel
University of Alberta, Edmonton, Canada
Yuzuru Tanaka
Hokkaido University, Sapporo, Japan
Wolfgang Wahlster
DFKI and Saarland University, Saarbrücken, Germany
Founding Editor
Jörg Siekmann
DFKI and Saarland University, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/1244
Editors
Hamido Fujita, Iwate Prefectural University, Takizawa, Japan
Philippe Fournier-Viger, Harbin Institute of Technology (Shenzhen), Shenzhen, China
Moonis Ali, Texas State University, San Marcos, TX, USA
Jun Sasaki, Iwate Prefectural University, Takizawa, Japan
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
In recent decades, society has entered a digital era in which computers have become ubiquitous in all aspects of life, including education, governance, science, healthcare, and industry. Computers have become smaller and faster, and the cost of data storage and communication has greatly decreased. As a result, more and more data is being collected and stored in databases. In addition, novel and improved computing architectures, such as big data frameworks, FPGAs, and GPUs, have been designed for efficient large-scale data processing. Thanks to these advances and recent breakthroughs in artificial intelligence, researchers and practitioners have developed more complex and effective artificial intelligence-based systems. This has led to greater interest in artificial intelligence for solving complex real-world problems, and to the proposal of many innovative applications.
This volume contains the proceedings of the 33rd edition of the International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems (IEA/AIE 2020), which was held during September 22–25, 2020, in Kitakyushu, Japan. IEA/AIE is an annual event that emphasizes applications of applied intelligent systems to solve real-life problems in all areas, including engineering, science, industry, automation and robotics, business and finance, medicine and biomedicine, bioinformatics, cyberspace, and human-machine interactions. This year, 119 submissions were received. Each paper was evaluated in a double-blind peer review by at least three reviewers from an international Program Committee consisting of 82 members from 36 countries. Based on the evaluations, a total of 62 papers were selected as full papers and 17 as short papers, which are presented in this book. We would like to thank all the reviewers for the time they spent writing detailed and constructive comments for the authors, and the authors for submitting many high-quality papers.
The program of IEA/AIE 2020 included two special sessions: Collective Intelligence in Social Media (CISM 2020) and Intelligent Knowledge Engineering in Decision Making Systems (IKEDS 2020). Moreover, three keynote talks were given by distinguished researchers: one by Prof. Tao Wu from the Shanghai Jiao Tong University School of Medicine (China), one by Enrique Herrera-Viedma from the University of Granada (Spain), and another by Ee-Peng Lim from Singapore Management University (Singapore). Lastly, we would like to thank everyone who contributed to the success of this year's edition of IEA/AIE, namely the authors, Program Committee members, reviewers, keynote speakers, and organizers.
Organization

General Chair
Hamido Fujita Iwate Prefectural University, Japan
General Co-chairs
Moonis Ali Texas State University, USA
Franz Wotawa TU Graz, Austria
Organizing Chair
Jun Sasaki Iwate Prefectural University, Japan
Program Chairs
Philippe Fournier-Viger Harbin Institute of Technology (Shenzhen), China
Hideyuki Takagi Kyushu University, Japan
Publicity Chair
Toshitaka Hayashi Iwate Prefectural University, Japan
Program Committee
Rui Abreu University of Lisbon, Portugal
Otmane Ait Mohamed Concordia University, Canada
Hadjali Allel ENSMA, France
Xiangdong An The University of Tennessee, USA
Artur Andrzejak Heidelberg University, Germany
Farshad Badie Aalborg University, Denmark
Ladjel Bellatreche ENSMA, France
Fevzi Belli Paderborn University, Germany
Adel Bouhoula University of Carthage, Tunisia
Ivan Bratko University of Ljubljana, Slovenia
João Paulo Carvalho University of Lisbon, Portugal
Chun-Hao Chen National Taipei University of Technology, Taiwan
Shyi-Ming Chen National Taiwan University of Science and Technology, Taiwan
Flávio Soares Corrêa da Silva University of São Paulo, Brazil
Giorgos Dounias University of the Aegean, Greece
Alexander Ferrein Aachen University of Applied Sciences, Germany
Philippe Fournier-Viger Harbin Institute of Technology (Shenzhen), China
Hamido Fujita Iwate Prefectural University, Japan
Vicente García Díaz University of Oviedo, Spain
Alban Grastien The Australian National University, Australia
Maciej Grzenda Warsaw University of Technology, Poland
Jun Hakura Iwate Prefectural University, Japan
Tim Hendtlass School of Biophysical Sciences and Electrical Engineering, Australia
Dinh Tuyen Hoang Yeungnam University, South Korea
Tzung-Pei Hong National University of Kaohsiung, Taiwan
Wen-Juan Hou National Central University, Taiwan
Ko-Wei Huang National Kaohsiung University of Science and Technology, Taiwan
Quoc Bao Huynh Ho Chi Minh City University of Technology, Vietnam
Said Jabbour University of Artois, France
He Jiang Dalian University of Technology, China
Rage Uday Kiran University of Aizu, Japan
Yun Sing Koh The University of Auckland, New Zealand
Adrianna Kozierkiewicz Wroclaw University of Science and Technology, Poland
Dariusz Krol Wroclaw University of Science and Technology, Poland
Philippe Leray University of Nantes, France
Mark Levin Russian Academy of Sciences, Russia
Jerry Chun-Wei Lin Western Norway University of Applied Sciences, Norway
Industrial Applications
Networking Applications
Multimedia Applications
Machine Learning
Colored Petri Net Modeling for Prediction Processes in Machine Learning . . . 663
Ibuki Kawamitsu and Morikazu Nakamura
A Fuzzy Crow Search Algorithm for Solving Data Clustering Problem . . . . . 782
Ze-Xue Wu, Ko-Wei Huang, and Chu-Sing Yang
Pattern Mining
TKU-CE: Cross-Entropy Method for Mining Top-K High Utility Itemsets . . . 846
Wei Song, Lu Liu, and Chaomin Huang
Process Decomposition and Test Selection for Distributed Fault Diagnosis . . . 914
Elodie Chanthery, Anna Sztyber, Louise Travé-Massuyès,
and Carlos Gustavo Pérez-Zuñiga
1 Introduction
Nowadays, teachers in school are far from the only way in which students can acquire knowledge. Besides traditional classroom education, plenty of sources are available, such as massive open online courses (MOOCs) and open educational materials. Sufficient and frequent quizzes help students achieve better learning outcomes than just studying textbooks or notes [3,8]. However, creating reasonable and meaningful questions is a costly task in both time and money. The number of available quizzes has not kept pace with the growing amount of online educational materials. Accordingly, it is worthwhile to build an automatic question generation (QG) system.
This research is partially supported by the "Aim for the Top University Project" of National Taiwan Normal University (NTNU), sponsored by the Ministry of Education and the Ministry of Science and Technology, Taiwan, R.O.C., under Grant no. MOST 108-2221-E-003-010.
2 Related Works
Researchers have dealt with question generation using rule-based approaches in the past [12]. These solutions depended on well-designed rules, based on profound linguistic knowledge, to transform declarative sentences into their syntactic representations and then generate interrogative sentences. [7] took an "overgenerate-and-rank" strategy, which used a set of rules to generate more-than-enough questions and then leveraged a supervised learning method to rank the produced questions. The rule-based approaches performed well on well-structured input text. However, because of the limitations of hand-crafted rules, these systems failed to deal with subtle or complicated text. In addition, these heuristic rule-based approaches focused on the syntactic information of the input words, and most of them ignored the semantic information.
Du et al. [6] first proposed a Seq2seq framework with an attention mechanism to model the question generation task for reading comprehension. Their work considered context information from both a sentence and a paragraph. [18] presented another QG model with two kinds of decoders. By considering the types of words, including interrogatives, topic words, and ordinary words, the model aimed to generate questions for an open-domain conversational system. To leverage more data with potentially useful information, the answer of a question was considered in [15] and [17]. The work of Tang et al. showed that the QA and QG tasks enhanced each other in their training framework.
The aforementioned QG works were remarkable; nevertheless, their success was inseparable from SQuAD, a relatively large, publicly available, general-purpose dataset. If there is a shortage of training data in a domain of interest, e.g., teaching materials for middle school, it is difficult to train such models to an acceptable level. For educational purposes, Z. Wang et al. [19] proposed a Seq2seq-based model, QG-net, that captures the way humans ask questions from a general-purpose dataset, SQuAD, and directly applied the trained model to learning material, the OpenStax textbooks. Similarly, our study aims to build a system for middle school education without a large-scale dataset of manually labeled sentence-question pairs. However, our work mainly differs from QG-net in the following aspects. First, we focus on the effectiveness of domain adaptation: we tune the proposed model with hundreds of labeled pairs from our target domain, middle school textbooks. Second, since we have labeled data in the target domain, we can perform quantitative evaluations, which were not possible for the questions QG-net generated in its target domain. Furthermore, our study proposes a semi-supervised approach that leverages generated questions as additional training pairs to fine-tune the baseline model.
Recently, various applications of the Seq2seq model have appeared in the natural language processing (NLP) community. A Seq2seq model typically consists of an encoder and a decoder, and most frameworks are implemented with RNNs. The encoder reads through the input text and converts it into a context vector carrying the textual information. The vector is then decoded by the decoder, which acts as a question generator. The decoding procedure often uses an attention mechanism [1,10] to generate a meaningful question corresponding to the input text. [16] proposed the pointer network, a modification of Seq2seq, to deal with words absent from the training set. Their work was later used by Z. Wang et al. [19] to point out which parts of the input content are more likely to appear in the output question. We describe in detail how we apply the Seq2seq model and its variations in Sect. 3.
Deep neural networks (DNNs) often benefit from transfer learning. In NLP, transfer learning has also been successfully applied to tasks such as QA [5]. [5] demonstrated that a simple transfer learning technique can be very useful for the task of multiple-choice question answering. The paper also showed that, with an iterative self-labeling technique, unsupervised transfer learning is still useful. Inspired by [5], we performed experiments to investigate the transferability of an encoder and decoder learned from a source QG dataset to a target dataset using a sequence-to-sequence pointer-generator network. The target dataset considered in our study is even smaller than that used in [5]. Although unsupervised transfer learning for QG remains a challenge, we show that transfer learning is helpful in a semi-supervised approach.
3 Methods
The QG task is to generate the question $Q$ that maximizes the conditional probability of the question given an input sentence $S$:

    $P(Q \mid S, \theta) = \prod_{t=1}^{L_Q} P(q_t \mid q_{<t}, S, \theta),$

where $L_Q$ denotes the length of the output question $Q$, and $q_t$ denotes each word within $Q$. Besides, $\theta$ denotes the set of parameters of a prediction model used to obtain $P(Q \mid S, \theta)$, the conditional probability of the predicted question sequence $Q$ given the input $S$. A basic assumption is that the answer to the generated question should be a consecutive segment of $S$. Accordingly, we mainly consider how to generate factual questions.
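To make the factorization concrete, the following minimal Python sketch (our illustration, not code from the paper) scores a candidate question by summing the per-step log-probabilities produced by a hypothetical model:

    import numpy as np

    def sequence_log_prob(step_distributions, question_ids):
        """Compute log P(Q | S, theta) = sum_t log P(q_t | q_<t, S, theta).

        step_distributions[t] is a hypothetical model's probability
        distribution over the vocabulary at decoding step t; question_ids
        holds the word id q_t chosen at each step."""
        return sum(float(np.log(step_distributions[t][q]))
                   for t, q in enumerate(question_ids))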
To prepare the source domain data and the target domain data, the following data preprocessing steps are required.
Source Domain Data. In the DRCD dataset, each data item consists of a triple (P, Q, A), where P is a paragraph, Q is a question about the paragraph, and A is the answer found in the paragraph. The following processing is performed to obtain a pair of an input sentence S and the corresponding question Q (see the sketch below):
1. Extract the sentence S that contains the answer A from the paragraph P.
2. Generate the sentence-question pair (S, Q).
3. Segment the texts in each sentence-question pair via Jieba [14].
The constructed dataset is denoted by $DB_{DRCD}$.
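A minimal sketch of these three steps follows; the sentence-splitting rule on end-of-sentence punctuation is our assumption, since the paper does not specify how sentence boundaries in P are detected.

    import re
    import jieba  # Chinese word segmentation, as cited in the paper [14]

    def make_pair(paragraph, question, answer):
        """Steps 1-3: extract the sentence S containing the answer A from
        paragraph P, pair it with question Q, and segment both with Jieba."""
        # Split on Chinese/Western end punctuation (our assumption).
        sentences = re.split(r"(?<=[。!?!?])", paragraph)
        for s in sentences:
            if answer in s:
                return jieba.lcut(s), jieba.lcut(question)
        return None  # answer not found within a single sentence; pair skipped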
Moreover, we apply the intra-decoder attention mechanism [20], where the attention distribution $a^d_t$ for each decoding step $t$ is obtained by normalizing the scores

    $e^d_{tt'} = (h^d_t)^{\top} W^d_{attn}\, h^d_{t'} + b^d_{attn}. \qquad (4)$
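A minimal NumPy sketch of this bilinear scoring; the hidden size is an assumption, and W_attn and b_attn stand for the learned parameters $W^d_{attn}$ and $b^d_{attn}$:

    import numpy as np

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    d = 256                                # assumed decoder hidden size
    W_attn = np.random.randn(d, d) * 0.01  # W^d_attn (learned in practice)
    b_attn = 0.0                           # b^d_attn

    def intra_decoder_attention(h_t, prev_states):
        """Score the current decoder state h_t against every previous
        decoder state h_{t'} (Eq. 4) and normalize into a^d_t."""
        scores = np.array([h_t @ W_attn @ h_p + b_attn for h_p in prev_states])
        return softmax(scores)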
The PTN creates an extra dictionary for the unseen words occurring in the input sentence. Those words will probably be chosen as the output when their probabilities from the attention weights over the input are large: $P_{PTN}(q_t = x_i) = a^e_{ti}$, where $x_i$ is a word in $S$. Finally,

    $P(q_t) = P_{gen} \times P_{voc}(q_t) + (1 - P_{gen}) \times P_{PTN}(q_t).$

By using the pointer network, this model can effectively deal with the out-of-vocabulary problem. In order to prevent repetition in the generation model, the coverage mechanism [13] is applied. A coverage value $cov_t$ is computed as the sum of the attention distributions over all previous decoder time steps, and it contributes to the loss:

    $cov_t = \sum_{t' < t} a^e_{t'}, \qquad (8)$

    $covloss_t = \sum_i \min(a^e_{ti}, cov_{ti}). \qquad (9)$
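A sketch of the copy mechanism and the coverage terms above; it assumes each input word already has an id in an extended vocabulary (OOV input words receive extra ids, as in pointer-generator networks) and that p_voc is padded with zeros for those extra ids:

    import numpy as np

    def final_distribution(p_gen, p_voc, attn, src_ids):
        """P(q_t) = p_gen * P_voc(q_t) + (1 - p_gen) * P_PTN(q_t);
        P_PTN scatters the encoder attention weights a^e_{ti} onto the
        (extended) vocabulary ids of the input words."""
        p = p_gen * np.asarray(p_voc, dtype=float)
        for i, w_id in enumerate(src_ids):
            p[w_id] += (1.0 - p_gen) * attn[i]
        return p

    def coverage_step(attn_t, coverage):
        """Eqs. (8)-(9): covloss_t = sum_i min(a^e_{ti}, cov_{ti});
        coverage is the running sum of past attention distributions."""
        covloss_t = float(np.minimum(attn_t, coverage).sum())
        return covloss_t, coverage + attn_t  # updated coverage for step t+1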
The total loss for time step $t$ is $loss_t = -\log P(w^*_t) + \lambda \times covloss_t$, where $-\log P(w^*_t)$ is the negative log likelihood (NLL) of the target word $w^*_t$ at step $t$, and $\lambda$ is a given hyperparameter that controls the weight of the coverage loss. The overall loss for the whole sequence is $loss = \frac{1}{L_S} \sum_t loss_t$. Note that we did not feed Chinese characters into the GRU directly. Instead, we first built a dictionary of frequent words from the word segmentation results of the training set. For each word in the dictionary, if it is in the vocabulary of fastText [2], its corresponding pre-trained word embedding is loaded.
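Loading the pre-trained vectors could look like the following sketch; the model file name is an assumption (any pre-trained Chinese fastText binary would do), and words missing from fastText's vocabulary are randomly initialized:

    import numpy as np
    import fasttext  # official fastText Python bindings

    EMB_DIM = 300
    ft = fasttext.load_model("cc.zh.300.bin")  # assumed pre-trained model file
    ft_vocab = set(ft.get_words())

    def build_embedding_matrix(dictionary):
        """One row per frequent word from the training-set segmentation;
        load the fastText vector when available, otherwise random-init."""
        emb = np.random.uniform(-0.1, 0.1, (len(dictionary), EMB_DIM))
        for idx, word in enumerate(dictionary):
            if word in ft_vocab:
                emb[idx] = ft.get_word_vector(word)
        return emb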
In the training process of a Seq2seq model, the inference of a new token is based on the current hidden state and the previously predicted token. A bad inference then makes the next inference worse; this phenomenon is a kind of error propagation. D. Bahdanau et al. [1] thus proposed a learning strategy, named teacher forcing, to ease the problem. Instead of always using the generated tokens, the strategy gently changes the training process from fully using the true tokens toward mostly using the generated tokens. This method can yield performance improvements for sequence prediction tasks such as QG. In the proposed model, we guide the training with a ratio of 0.75 at the beginning and decay the ratio by multiplying it by 0.9999 after each epoch.
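In code, the schedule amounts to a coin flip per decoding step and a per-epoch decay; this is a sketch of the idea, not the authors' implementation:

    import random

    tf_ratio = 0.75  # initial guidance ratio used in the proposed model

    def next_decoder_input(true_token, generated_token):
        """Feed the ground-truth token with probability tf_ratio,
        otherwise feed the model's own previous prediction."""
        return true_token if random.random() < tf_ratio else generated_token

    def end_of_epoch():
        """Decay the guidance ratio after each epoch, as in the paper."""
        global tf_ratio
        tf_ratio *= 0.9999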
1. Given an epoch number $epo_n$, train the PTN on the source domain, which contains abundant training data, and save the model of each epoch until $epo_n$ is reached.
2. Select the best model according to a selection strategy as the base model $M_b$. In the experiments, we choose the model with the highest average BLEU-4 on the validation set of the target domain.
3. Fine-tune the model $M_b$; that is, initialize another training process with the learned parameters of $M_b$ on the dataset of the target domain. A sketch of this procedure follows.
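The three steps can be summarized in the following PyTorch-flavored sketch; train_one_epoch and bleu4 are hypothetical helpers standing in for the training loop and the BLEU-4 evaluation, and the fine-tuning epoch count is an arbitrary assumption.

    import copy

    def transfer(model, source_data, target_train, target_val, epo_n,
                 train_one_epoch, bleu4, finetune_epochs=30):
        """Step 1: train on the source domain for epo_n epochs.
        Step 2: keep the checkpoint with the best BLEU-4 on the target
        validation set as the base model M_b.
        Step 3: fine-tune M_b on the target-domain training data."""
        best_score, best_state = -1.0, None
        for _ in range(epo_n):
            train_one_epoch(model, source_data)
            score = bleu4(model, target_val)
            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(model.state_dict())
        model.load_state_dict(best_state)   # base model M_b
        for _ in range(finetune_epochs):    # assumed count
            train_one_epoch(model, target_train)
        return model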
The PTN consists of the following layers: the embedding layers of the encoder and decoder, the bi-GRU layer and attention network of the encoder, the GRU layer and attention network of the decoder, the output layer for question generation, and the parameters for computing $P_{gen}$. We try various domain adaptation strategies by freezing or retraining some layers of the model; the details of these fine-tuning strategies are described in the experiments, and one of them is sketched below.
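One adaptation strategy (freezing the embeddings and encoder while retraining the decoder and output layers) might look like this in PyTorch; TinyPTN is a stand-in with the layer names listed above, not the actual PTN architecture:

    import torch
    import torch.nn as nn

    class TinyPTN(nn.Module):
        """Stand-in model (not the actual PTN)."""
        def __init__(self, vocab=1000, emb=64, hid=128):
            super().__init__()
            self.embedding = nn.Embedding(vocab, emb)
            self.encoder = nn.GRU(emb, hid, bidirectional=True, batch_first=True)
            self.decoder = nn.GRU(emb, 2 * hid, batch_first=True)
            self.out = nn.Linear(2 * hid, vocab)

    def freeze(module):
        for p in module.parameters():
            p.requires_grad = False

    model = TinyPTN()
    freeze(model.embedding)  # one possible strategy: keep embeddings fixed,
    freeze(model.encoder)    # freeze the encoder, retrain decoder + output
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)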
The model with the highest average BLEU-4 on the validation set of the target domain is then selected to perform the iteration from step 3 to step 7.
4 Performance Evaluation
4.1 Experiment Setup
For training the base model, we used $DB_{DRCD}$, built from DRCD, an open-domain Traditional Chinese machine reading comprehension (MRC) dataset. The dataset contains 10,014 paragraphs from 2,108 Wikipedia articles and more than 30,000 questions generated by annotators. We excluded the sentence-question pairs whose sentences and questions exceed 80 and 50 words, respectively, so 26,175 pairs were extracted. Moreover, the vocabulary dictionary contains the words with frequency no less than 3; the vocabulary size of $DB_{DRCD}$ is therefore 28,981. In the target domain, the dataset $DB_{textbook}$ contains 480 labeled pairs. We applied random sampling to separate the data into train/test/validation sets in the proportion 7/2/1. Accordingly, there are 336 pairs with a vocabulary size of 787 in the training set, 48 pairs for validation, and 96 pairs for testing. The effectiveness of transfer learning is evaluated by the model's performance on the test set of the target domain. The random split can be sketched as follows.
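This is our illustration of the split, not the authors' code; the seed is arbitrary and the paper's exact sampling procedure may differ.

    import random

    def split_7_2_1(pairs, seed=42):
        """Random 7/2/1 train/test/validation split; for the 480 textbook
        pairs this yields 336 / 96 / 48, matching the reported counts."""
        pairs = list(pairs)
        random.Random(seed).shuffle(pairs)
        n_train = int(0.7 * len(pairs))  # 336
        n_test = int(0.2 * len(pairs))   # 96
        return (pairs[:n_train],
                pairs[n_train:n_train + n_test],
                pairs[n_train + n_test:])  # remaining 48 for validation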