A Review of Traffic Congestion Prediction Using Artificial Intelligence


Hindawi

Journal of Advanced Transportation


Volume 2021, Article ID 8878011, 18 pages
https://doi.org/10.1155/2021/8878011

Review Article
A Review of Traffic Congestion Prediction Using
Artificial Intelligence

Mahmuda Akhtar and Sara Moridpour


Department of Civil and Infrastructure Engineering, RMIT University, Melbourne, VIC 3000, Australia

Correspondence should be addressed to Mahmuda Akhtar; [email protected]

Received 8 August 2020; Revised 7 January 2021; Accepted 18 January 2021; Published 30 January 2021

Academic Editor: Michael Bazant

Copyright © 2021 Mahmuda Akhtar and Sara Moridpour. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
In recent years, traffic congestion prediction has become a growing research area, especially within machine learning, a branch of artificial intelligence (AI). With the introduction of big data from stationary sensors and probe vehicles and the development of new AI models in the last few decades, this research area has expanded extensively. Traffic congestion prediction, especially short-term traffic congestion prediction, is made by evaluating different traffic parameters. Most studies rely on historical data to forecast traffic congestion; only a few make real-time predictions. This paper systematically summarises the existing research conducted by applying the various methodologies of AI, notably different machine learning models. The paper groups the models under their respective branches of AI and summarises the strengths and weaknesses of the models.

1. Introduction

Artificial intelligence (AI) is the most important branch of computer science in this era of big data. AI was born 50 years ago and has come a long way, making encouraging progress, especially in machine learning, data mining, computer vision, expert systems, natural language processing, robotics, and related applications [1]. Machine learning is the most popular branch of AI. Other classes of AI include probabilistic models, deep learning, artificial neural network systems, and game theory. These classes are developed and applied in a wide range of sectors. Recently, AI has been the leading research area in transportation engineering, especially in traffic congestion prediction.

Traffic congestion has a direct and indirect impact on a country’s economy and its dwellers’ health. According to Ali et al. [2], traffic congestion costs Pak Rs. 1 million every day in terms of opportunity cost and fuel consumption. Traffic congestion affects people at the individual level as well. Time loss, especially during peak hours, mental stress, and the added pollution contributing to global warming are also important consequences of traffic congestion.

Ensuring economic growth and road users’ comfort are two requirements for the development of a country, and neither is possible without smooth traffic flow. With the development of traffic information collection in the transportation sector, authorities are paying more attention to traffic congestion monitoring. Traffic congestion prediction gives the authorities the time required to plan the allocation of resources and make the journey smooth for travellers. The traffic congestion prediction problem discussed in this paper can be defined as the estimation of parameters related to traffic congestion into the short-term future, e.g., 15 minutes to a few hours, by applying different AI methodologies to collected traffic data. There are usually five parameters to evaluate while monitoring and predicting traffic congestion: traffic volume, traffic density, occupancy, traffic congestion index, and travel time. Depending on the nature of the collected data, a variety of AI approaches are applied to evaluate the congestion parameters. This article systematically discusses the models and their advantages and disadvantages. The primary motivation of this review is to gather the articles focusing solely on traffic congestion prediction models. The keywords used in
the search process included “traffic congestion prediction” OR “traffic congestion estimation” OR “congestion prediction modelling” OR “prediction of traffic congestion” OR “road congestion forecast” OR “traffic congestion forecast.” For efficient screening, the research paper search was done by year using search engines such as Scopus, Google Scholar, and Science Direct. After collecting all the peer-reviewed journal and conference papers written in the English language, 48 articles were found for review. Studies focusing on the cause of traffic congestion, traffic congestion control, traffic congestion impact, traffic congestion propagation, traffic congestion prevention, etc. were excluded from this manuscript.

A general layout of the prediction approaches is provided in Section 2. The data collection sources and congestion forecasting models are explained in Sections 3–6 and they provide the overall discussion and concluding remarks.

2. General Layout

Traffic congestion forecasting has two basic steps: data collection and prediction model development. Every step of the methodology is important and may affect the results if not done correctly. After data collection, data processing plays a vital role in preparing the training and testing datasets. The case area differs between studies. After the model is developed, it is validated against other base models and ground truth results. Figure 1 shows the general components of traffic congestion prediction studies. These branches are further divided into more specific sub-branches and are discussed in the following sections.

3. Data Source

Traffic datasets used in different studies can be mainly divided into two classes: stationary and probe data. Stationary data can be further divided into sensor data and fixed cameras. On the other hand, the probe data used in the studies were GPS data from devices mounted on vehicles.

Stationary sensors continuously capture spatiotemporal traffic data. However, sensor operation may be interrupted at any time. Authorities should always consider this temporary failure of the sensor while planning with these data. The advantage of sensor data is that there is no confusion about the location of the vehicles. The most used dataset was the Performance Measurement System (PeMS), which collects highway data on traffic flow, sensor occupancy, and travel speed across all major metropolitan areas of the State of California in real time. Most of the studies used data from the I-5 highway in San Diego, California, sampled every 5 minutes [3–6]. Other systems included the Genetec blufaxcloud travel-time system engine (GBTTSE) [7] and the Topologically Integrated Geographic Encoding and Referencing (TIGER) line graph [8].

On the other hand, probe data has the advantage of covering the entire road network. A network consists of roads of different structures. Therefore, studies, especially those that considered a network-wide area, used probe data. The most used dataset was GPS data collected every second from approximately 20000 taxis in Beijing, China. The data included the taxi number, the latitude-longitude of the vehicle, the timestamp when sampling, and whether there was a passenger or not. The data updating frequency of this dataset varies from 10 s to 5 min according to the quality of the GPS device [4, 5, 9]. Other probe data included low-frequency Probe Vehicle Data (PVD) [10] and bus GPS data [11, 12]. However, sometimes probe data show significant fluctuation. Besides, map matching is usually a must for probe data. But data can minimize this limitation. Probe data collected from one city cannot be used directly for modelling other city networks. This is because the data collected from Beijing, China, includes the latitude-longitude of the vehicle, which is unique. However, a generalised model using probe data can be generated for different cities.

Other data sources, e.g., data from tolling systems and data provided by transportation authorities, will add more reliable data as the sources are dependable. However, the study area often needs to be adjusted because, in most cases, tolled road information is not available. Tracking cellular phone movements without privacy breach can also be a source of data. However, the heterogeneity of the vehicle distribution will be hard, if not impossible, to determine from this dataset. Besides, due to pedestrians or cyclists travelling along the sidewalk, there might be many outliers in the dataset if modelling is done for a road network. Data collected from a questionnaire to the general public/drivers may provide a misleading result [13].

3.1. Clustering Algorithms. Some studies cluster the acquired data before applying the main congestion prediction models. This hybrid modelling technique is applied to fine-tune the input values and to use them in the training phase. Figure 2 shows the commonly used AI clustering models in this field of research. The models are described briefly in this section.

Fuzzy C-Means (FCM) is a popular nondeterministic clustering technique in data mining. In traffic engineering research, traffic pattern recognition plays an important role. Besides, these studies often face the limitation of missing or incomplete data. To deal with these constraints, FCM has become a commonly applied clustering technique. The advantage of this approach is that, unlike original C-means clustering methods, it can overcome the issue of getting trapped in the local optimum [14]. However, FCM requires setting a predefined cluster number, which is not always possible while dealing with massive data without any prior knowledge of the data dimension. Besides, this model becomes computationally expensive as the data size increases. Different studies have applied FCM successfully by addressing its limitations. Some studies changed the fuzzy index value for each FCM algorithm execution [15], some calculated the Davies-Bouldin (DB) index [10], while others applied the K-means clustering algorithm [16, 17].

K-means clustering is an effective and relatively flexible algorithm while dealing with large datasets. It is a popular unsupervised machine learning algorithm. Depending on the features, cluster number varied from two [18] to 50
Figure 1: The layout of traffic congestion prediction system. (Flowchart elements: traffic congestion prediction methodology; data preprocessing; dataset, split into stationary sensor data and probe vehicle data; scope of study area; traffic parameters, namely traffic volume, traffic density, sensor occupancy, traffic speed, and traffic congestion index; modelling applying AI, with or without clustering; AI models, namely probabilistic reasoning, shallow machine learning, and deep machine learning; traffic congestion state leveling; model validation.)

Figure 2: Commonly used AI clustering algorithms. (Fuzzy C-means; density-based spatial clustering of applications with noise; K-means.)

[19–21]. Like FCM, K-means clustering requires a predefined cluster number and the selection of K original cluster centres. GAP [22] and the WEKA toolbox [23] were used to define the value. For large datasets, as the sample distribution is unknown in the beginning, it is not always possible to fulfil these two requirements. A few studies used adaptive K-means clustering to overcome these limitations and exploited the pattern using principal component analysis (PCA) [24, 25].

DBSCAN is more of a general clustering application in machine learning and data mining. This method overcomes the limitation of FCM of predefining the cluster number. It
can automatically generate arbitrary cluster shapes surrounded by clusters of different characteristics and can easily recognise outliers. However, it requires two parameters to be preset. A suitable parameter determination method, e.g., the trial and error method [8] or human judgement [26], makes the model computationally expensive and requires a clear understanding of the dataset.

From the above discussion, it is concluded that only 16 out of 48 studies have done clustering before applying prediction models. Several time-series models and shallow machine learning (SML) algorithms have used a clustering approach. However, deep learning algorithms can process input data on different layers of the model and thus may not need clustering beforehand.

4. Applied Methodology

Traffic flow is a complex amalgamation of a heterogeneous traffic fleet. Thus, traffic pattern prediction modelling could be an easy and efficient congestion prediction approach. However, depending on the data characteristics and quality, different classes of AI are applied in various studies. Figure 3 shows the main branches: probabilistic reasoning and machine learning (ML). Machine learning comprises both shallow and deep learning algorithms. As the article progresses, these sections are subdivided into detailed algorithms.

Figure 3: Branches of artificial intelligence in this article. (Artificial intelligence: probabilistic reasoning, shallow machine learning, deep machine learning.)

Generalising traffic congestion forecasting studies that use different models is not straightforward. The common factors of all the articles include the study area, data collection horizon, predicted parameter, prediction intervals, and validation procedure. Most of the articles took a corridor segment as the study area [5, 27–30]. Other study areas included the traffic network [31, 32], ring road [9], and arterial road [33]. The data collection horizon varied from 2 years [34] to less than a day [35] in the studies. Congestion estimation is done by predicting traffic flow parameters, e.g., traffic speed [4], density, speed [5], and congestion index [31], to mention a few. The Congestion Index (CI) approach is suitable for monitoring the congestion level continuously in a spatiotemporal dimension. Studies that compared their results with the ground truth value or with other models used mean absolute error (MAE) (equation (1)), symmetric mean absolute percentage error (sMAPE) (equation (2)), MAPE, root-mean-squared error (RMSE) (equation (3)), false positive rate (FPR) (equation (4)), and detection rate (DR) (equation (5)). Many studies used SUMO to validate their models:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|, \tag{1} \]

\[ \mathrm{sMAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| Y_i - \hat{Y}_i \right|}{\left( \left| Y_i \right| + \left| \hat{Y}_i \right| \right)/2} \cdot 100, \tag{2} \]

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n}}, \tag{3} \]

where $Y_i$ is the original value, $\hat{Y}_i$ is the predicted value, and $n$ is the number of instances.

\[ \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{TN} + \mathrm{FP}}, \tag{4} \]

\[ \mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{FN} + \mathrm{TP}}, \tag{5} \]

where FP, TN, FN, and TP represent the false positive, true negative, false negative, and true positive, respectively.

The rest of this section will discuss the methodology the authors have applied in the studies.

4.1. Probabilistic Reasoning. Probabilistic reasoning is a significant section of AI. It is applied to deal with the field of uncertain knowledge and reasoning. A variety of these algorithms are commonly used in traffic congestion prediction studies. The probabilistic reasoning models discussed here are shown in Figure 4.

4.1.1. Fuzzy Logic. Fuzzy logic, introduced by Zadeh, is a commonly applied model in dynamic traffic congestion prediction as it allows vagueness instead of binary outcomes. In this method, several membership functions are developed that represent the degree of truth. As data volumes grow over time, traffic data are becoming complex and nonlinear. Due to its ability to deal with uncertainty in the dataset, fuzzy logic has become popular in traffic congestion prediction studies.

A fuzzy system comprises several fuzzy sets, which are built from membership functions. There are usually three codification shapes to choose from for the membership functions (MFs) of the input: triangular, trapezoidal, and Gauss function. The fuzzy rule-based system (FRBS) is the most common fuzzy logic system in traffic engineering research. It consists of several IF-THEN rules that logically relate the input variables with the output. It can effectively deal with the complexity resulting from real-world traffic situations by representing them in simple rules. These rules combine the relations among different traffic states to detect the resulting traffic condition [36]. However, with the growth in data complexity, the total number of rules also grows, which lessens the accuracy of the whole system and makes it computationally expensive. To better manage this problem, two types of fuzzy logic controls are applied. In hierarchical control (HFRBS), the input variables are ordered according to their significance and MFs are employed. Figure 5 shows a simple HFRBS structure. MFs are optimized by applying different algorithms, e.g., genetic algorithm (GA) [30], hybrid genetic algorithm (GA), and cross-entropy (CE)
Figure 4: Subdivision of probabilistic reasoning models. (Fuzzy logic; hidden Markov model; Gaussian mixture model; Bayesian network.)

Figure 5: A simple structure of HFRBS. (Diagram elements: variables 1 to 3, subsystems 1 and 2 with rule bases 1 and 2, and the output.)

[28, 37] compared the performance of evolutionary crisp rule learning (ECRL) and evolutionary fuzzy rule learning (EFRL) for road traffic congestion prediction. It was seen that ECRL models outperformed EFRL in terms of averaged accuracy and number of rules but were computationally expensive.

The Takagi-Sugeno-Kang (TSK) (FRBS) model is one of the simple fuzzy models due to its mathematical treatability. A weighted average computes the output of this model. Another simple FRBS model is the Mamdani-type model. The output of this model is a fuzzy set which needs defuzzification, which is time-consuming. Due to its good interpretability, it can improve the accuracy of fuzzy linguistic models. Cao and Wang [3] applied this model to show the congestion severity change among road grades. A few studies used this method to fuse heterogeneous parameters [7, 13]. The TSK model works on improving the interpretability of an accurate fuzzy model. TSK is applied for its fast calculation characteristics [37].

The fuzzy comprehensive evaluation (FCE) uses the principle of fuzzy transformation and maximum membership degree. This model consists of several layers and is a useful objective evaluation method that assesses all relevant factors. The number of layers depends on the complexity of the objective and the number of factors. Kong et al. [4] and Yang et al. [5] applied FCE in which the weights and the fuzzy matrix of multi-indexes were adapted according to the traffic flow to estimate the traffic congestion state. Adaptive control adjusts the weight coefficient based on the judgement matrix. Certain weights are assigned to calculate the membership degree of the parameters [35].

Other than GA and PSO, the Ant Colony Optimization (ACO) algorithm was also introduced by Daissaoui et al. [38] in a fuzzy logic system. They provided the theory for a smart city, where each vehicle's GPS data was taken as a pheromone, consistent with the concept of ACO. The objective was to predict traffic congestion one minute ahead from the information (pheromone) provided by past cars. However, the article does not give any results in support of the model.

As discussed before, with the development of optimisation algorithms, optimisation of the fuzzy logic system’s membership functions is becoming diverse. With time, the simplest form of FRBS-TSK has become popular due to its good interpretability. Other sectors of transportation where fuzzy logic models are popular include traffic light/signal control [39, 40], traffic flow prediction (Zhang and Ye [41]), traffic accident prediction [42], and modified fuzzy logic for freeway travel time estimation (Zhang and Ge [43]).

The fuzzy logic system is the only probabilistic reasoning model that can have an outcome of more than a congested/noncongested traffic state. This is one of the main advantages that has made this methodology popular. However, no study has provided any reasonable logic for selecting the membership function, which is a significant limitation of fuzzy logic models.

4.1.2. Hidden Markov Model. The hidden Markov model (HMM) is a combination of stochastic characteristics of Markov process and discrete characteristics of Markov chains. It is a stochastic, time-series event recognition technique. Some studies have applied the Markov chain model for traffic pattern recognition during congestion prediction [21, 25, 44]. The Pearson correlation coefficient (PCC) is commonly applied among the parameters during pattern construction. Zaki et al. [32] applied HMM to select the appropriate prediction model from several models they developed applying the adaptive neurofuzzy inference system (ANFIS). They obtained the optimal state transition by four processing steps: initialization, recursion, termination, and backtracking. The last step analysed the previous step to determine the probability of the current state by using the Viterbi algorithm. Based on the log-likelihood of the initial model parameter, defined by the expectation maximization (EM) algorithm, of HMM with the traffic pattern, a suitable congestion model was selected for prediction. Mishra et al. [23] applied the discretised multiple symbol HMM (MS-HMM) prediction model named future state prediction (FSP). They evaluated model adaptability for different road segments. A label was generated containing hidden states of MS-HMM, and the output was used for FSP to result in the next hidden state label.

In traffic engineering, especially while utilising probe vehicle data, HMM is very useful in map-matching. Sun et al. [45] applied HMM for mapping the trajectory of observed GPS points onto nearby roads. These candidate points were taken as hidden states of HMM. The candidate points closer to the observation point had higher observation probability.
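As an illustration of the four steps named above (initialization, recursion, termination, and backtracking), the following minimal Python sketch runs the Viterbi algorithm on a toy map-matching problem in which each hidden state is a candidate road segment. It is not the implementation used in [32] or [45]; the Gaussian form of the observation probability, the value of sigma_m, the transition matrix, and the distances are all hypothetical values chosen only for illustration.

```python
import math


def gaussian_emission(distance_m, sigma_m=20.0):
    """Observation probability that decays with the distance (metres)
    between a GPS fix and a candidate point. The Gaussian form and
    sigma_m are assumptions made for this sketch."""
    norm = sigma_m * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * (distance_m / sigma_m) ** 2) / norm


def viterbi(start_p, trans_p, emit_p):
    """Return the most likely hidden-state path.

    start_p[s]    : initial probability of state s
    trans_p[r][s] : probability of moving from state r to state s
    emit_p[t][s]  : probability of observation t given state s
    All probabilities must be strictly positive (logs are taken).
    """
    n_states = len(start_p)

    # Initialization: score every state for the first observation.
    score = [[math.log(start_p[s]) + math.log(emit_p[0][s])
              for s in range(n_states)]]
    back = []

    # Recursion: extend the best path into each state, step by step.
    for t in range(1, len(emit_p)):
        row, ptr = [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: score[-1][r] + math.log(trans_p[r][s]))
            row.append(score[-1][best_r]
                       + math.log(trans_p[best_r][s])
                       + math.log(emit_p[t][s]))
            ptr.append(best_r)
        score.append(row)
        back.append(ptr)

    # Termination: pick the best final state.
    last = max(range(n_states), key=lambda s: score[-1][s])

    # Backtracking: follow the stored pointers to recover the path.
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical input: distances (m) from three GPS fixes to two candidate segments.
distances = [[5.0, 40.0], [30.0, 25.0], [45.0, 5.0]]
emit = [[gaussian_emission(d) for d in row] for row in distances]
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]  # "sticky" transitions, an assumption

path = viterbi(start, trans, emit)
print(path)  # one candidate segment index per GPS fix
```

The decoded path assigns one candidate segment to each GPS fix. Working in log-probabilities avoids numerical underflow on long trajectories, and the sticky transition matrix discourages implausible jumps between segments.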
Transition probability of two adjacent candidates was also considered to avoid the misleading results generated from abrupt traffic situations.

HMM shows accuracy in selecting a traffic pattern or a traffic point. It has the advantage that it can deal with data containing outliers. However, points with a short sampling interval seem to be matched well, whereas long intervals and highly similar probe data decreased the model accuracy. Studies have found a significant mismatch for long sampling interval datasets and similar road networks.

The GPS tracking system has been widely developed in this era of the satellite. Thus, HMM modelling is currently more relevant for map matching. Other sectors of transport where HMM is applied include traffic prediction [46], modified HMM for speed prediction [47], and traffic flow state transition [48], etc.

4.1.3. Gaussian Distribution. Gaussian processes have proven to be a successful tool for regression problems. Formally, a Gaussian process is a collection of random variables, any finite number of which obeys a joint Gaussian prior distribution. For regression, the function to be estimated is assumed to be generated by an infinite-dimensional Gaussian distribution, and the observed outputs are contaminated by additive Gaussian noise.

Yang [29] applied Gaussian distribution for traffic congestion prediction. This study was divided into three parts. First, the sensor ranking was done according to the volume quality by applying p test. In the second part of the study, the congestion occurring probability was determined from a statistics-based method. In the learning phase of this part, two Gaussian probability models were developed from two datasets for every point of interest. In the decision phase, the model that the input traffic volume value fitted was evaluated, and a prediction score presenting the congestion state was determined from the ratio of the two models. Finally, the probability of congestion occurring at the point of interest was found by combining and sorting the prediction scores from all the ranked sensors. Zhu et al. [49] also presented the probability of traffic state distribution. Selection of the mean and variance parameters of the Gaussian distribution is an important step. In this study, the EM algorithm was applied for this purpose. The first step generated the log-likelihood expectation for the parameters, whereas the last step maximised it. Sun et al. [45] approximated the error in GPS location on the road with a Gaussian distribution with mean 0. The error was calculated from the actual GPS point, the matching point on the road section, and the standard deviation of the GPS measurement error.

From the abovementioned studies, it is seen that the Gaussian distribution model has a useful application in reducing feature numbers without compromising the quality of the prediction results or for location error estimation while using GPS data. Gaussian distribution is also applied in traffic volume prediction [50], traffic safety [51], and traffic speed distribution variability [52].

4.1.4. Bayesian Network. A Bayesian network (BN), also known as a causal model, is a directed graphical model for representing conditional independencies between a set of random variables. It is a combination of probability theory and graph theory and provides a natural tool for dealing with two problems that occur throughout applied mathematics and engineering: uncertainty and complexity [53].

Asencio-Cortés et al. [54] applied an ensemble of seven machine learning algorithms to compute the traffic congestion prediction. This methodology was developed as a binary classification problem applying the HIOCC algorithm. The machine learning algorithms applied in this study were K-nearest neighbour (K-NN), C4.5 decision trees (C4.5), artificial neural network (ANN) of backpropagation technique, stochastic gradient descent optimisation (SGD), fuzzy unordered rule induction algorithm (FURIA), Bayesian network (BN), and support vector machine (SVM). Three of these algorithms (C4.5, FURIA, and BN) can produce interpretable models of viewable knowledge. A set of ensemble learning algorithms was applied to improve the results found from these prediction models. The ensemble algorithm group included bagging, boosting (AdaBoost M1), stacking, and Probability Threshold Selector (PTS). The authors found a significant improvement in Precision for BN after applying ensemble algorithms. On the other hand, Kim and Wang [34] applied BN to determine the factors that affect congestion initialization on different road sections. The developed model of this study gave a framework to assess different scenario ranking and prioritizing.

A Bayesian network is seen to perform better with ensemble algorithms or when modified, e.g., in other transport sectors such as traffic flow prediction [55] and parameter estimation at signalised intersections [56, 57].

4.1.5. Others. Other than the models mentioned above, the Kalman Filter (KF) is also a popular probabilistic algorithm. With the increment of available data, data fusion methods are becoming popular. The fusion of historical and real-time traffic data can achieve a higher level of traffic congestion prediction accuracy. In this regard, KF is commonly applied. Extended KF (EKF) is an extension of KF, which can be used to stochastically filter the nonlinear noises to improve the mean and covariance of an estimated state. Therefore, after data fusion, it updated the estimated covariance error by removing outliers [7].

Wen et al. [8] applied GA in traffic congestion prediction from a spatiotemporal traffic environment. Temporal association rules were extracted from the traffic environment applying GA-based temporal association rules (GATARs). Their proposed Hybrid Temporal Association Rules Mining method (HTARM) included DBSCAN and GATAR methods. The DBSCAN application method was discussed previously in this article. While encoding using GATAR, road section number and congestion level were included in the chromosome. The decoding was done to obtain temporal association rules, which were sorted according to confidence and support value in the rule pool. For both simulated and real-
world scenarios, the proposed HTARM method outperformed GATAR in terms of extracting temporal association rules and prediction accuracy. However, the cluster number difference showed a big difference in the two scenarios. Besides, with the increment of road network complexity, the prediction accuracy decreased.

Table 1 summarises the methodologies and different parameters used in various studies we have discussed so far.

4.2. Shallow Machine Learning. Shallow machine learning (SML) algorithms include traditional and simple ML algorithms. These algorithms usually consist of a few hidden layers, and often only one. SML algorithms cannot extract features from the input, and features need to be defined beforehand. Model training can only be done after feature extraction. SML algorithms and their application in traffic congestion studies are discussed in this section and shown in Figure 6.

4.2.1. Artificial Neural Network. Artificial neural network (ANN) was developed, mimicking the function of the human brain to solve different nonlinear problems. It is a first-order mathematical or computational model that consists of a set of interconnected processors or neurons. Figure 7 shows a simple ANN structure. Due to its easy implementation and efficient forecasting ability, ANN has become popular in the field of traffic congestion prediction research. Hopfield network, feedforward network, and backpropagation are examples of ANN. Feedforward neural network (FNN) is the simplest NN, where the input data go to the hidden layer and from there to the output layer. Backpropagation neural network (BPNN) consists of feedforward and weight adjustment of the layers and is the most commonly applied ANN in transportation management. Xu et al. [31] applied BPNN to predict traffic flow and thus to evaluate the congestion factor in their study. They proposed an occupancy-based congestion factor (CRO) evaluation method with three other evaluated congestion factors based on mileage ratio of congestion (CMRC), road speed (CRS), and vehicle density (CVD). They also evaluated the effect of data size on real-time rendering of road congestion. Complex road networks with higher interconnections showed higher complication in simulation and rendering. The advantage of the proposed model was that it took little processing time for high sampling data rendering. The model can be used as a general congestion prediction model for different road networks. Some used hybrid NN for congestion prediction. Nadeem and Fowdur [11] predicted congestion in spatial space, applying the combination of one of six SML algorithms with NN. Six SML algorithms included moving average (MA), autoregressive integrated

Unlike the previous studies, which focused on traffic flow parameters to conduct traffic congestion prediction research, Ito and Kaneyasu [60] analysed drivers’ behaviour in predicting congestion. They showed that vehicle operators act differently in different phases of the journey. They used a one-layered BPNN to learn the behaviour of female drivers and extract the travel phase accordingly. The results showed an average efficiency of 82% in distinguishing the travel phase.

ANN is a useful machine learning model which has a flexible structure. The neurons of the layer can be adapted according to the input data. As mentioned above, a general model can be developed and applied for different road types by using the nonlinearity capturing ability of ANN. However, ANN requires larger datasets than the probabilistic reasoning models, which results in high complexity.

ANN shows great potential in diverse parameter analysis. ANN is the only model that has recently been applied for driver behaviour analysis for traffic congestion. ANN is popular in every section of transport: traffic flow prediction [61, 62], congestion control [63], driver tiredness [64], and vehicle noise [65, 66].

4.2.2. Regression Model. Regression is a statistical supervised ML algorithm. It models a real-valued output based on independent numerical input variables. Regression models can be further divided according to the number of input variables. The simplest regression model is linear regression with one input feature. When the feature number increases, the multiple regression model is generated.

Jiwan et al. [27] developed a multiple linear regression analysis (MLRA) model using weather data and traffic congestion data after preprocessing using Hadoop. At first, a single regression model was developed for all the variables using R. After a 3-fold reduction process, only ten variables were determined to form the final MLRA model. Zhang and Qian [22] took an interesting approach to predict morning peak hour congestion using household electricity usage patterns. They used LASSO regression to correlate the pattern features, taking advantage of its capability to select linearly related critical features.

On the other hand, Jain et al. [33] developed both linear and exponential regression models using IBM SPSS software to find the relevant variables. The authors converted heterogeneous vehicles into passenger car units (PCU) for simplification. Three independent variables were considered to estimate origin-destination (O-D) based congestion measures. They used PCC to evaluate the correlation among the parameters. However, simply averaging O-D node pa-
moving average (ARIMA), linear regression, second- and rameters may not provide the actual situation of dynamic
third-degree polynomial regression, and k-nearest neigh- traffic patterns.
bour (KNN). The model showing the least RMSE value was Regression models consist of some hidden coefficients,
combined with BPNN to form hybrid NN. The hidden layer which are determined in the training phase. The most ap-
had seven neurons, which was determined by trial and error. plied regression model is the autoregressive integrated
However, it was a very preliminary level work. It did not moving average (ARIMA). ARIMA has three parameters- p,
show the effect of data increment in the accuracy. d, and q. “p” is the auto regressive order that refers to how

Table 1: Traffic congestion prediction studies in probabilistic reasoning.


Data No. of congestion
Methodology Road type Input parameters Target domain Reference
source state levels∗
Occupancy Speed 2 Zhang et al. [30]
Hierarchical fuzzy rule-
Lopez-garcia
based system Speed Speed
Highway corridor Sensor et al. [37]
Evolutionary fuzzy rule Onieva et al.
Traffic flow Traffic density
learning [28]
Cao and Wang
Mamdani-type fuzzy Highway, trunk Speed 4
— [3]
logic inference road, branch road
Density
Congestion Index
Travel time
Wang et al. [58]
Fuzzy inference Highway corridor Camera Traffic flow
Speed
Fuzzy comprehensive Traffic volume Saturation Kong et al. [4]
Highway corridor Probe 5
evaluation Speed Density speed Yang et al. [5]
Traffic pattern
Emission matrix — Zaki et al. [32]
selection
Highway network Sensor
Emission matrix Traffic pattern
— Zaki et al. [25]
Transition matrix determination
Hidden Markov model
Observation
probability
Main road Probe Mapping GPS data — Sun et al. [45]
Transition
probability
Optimal feature
Gaussian distribution Highway corridor Sensor Traffic volume — Yang [29]
selection
Road and bus
Build-up area Simulation Yi Liu et al. [59]
increment
Intensity Congestion

Occupation probability Asencio-Cortés
Bridge Sensor
Average speed et al. [54]
Average distance
Network
direction
Bayesian network
Day and time
weather
Incidents Congestion Kim and Wang
Highway network Sensor
Traffic flow probability [34]
Occupancy
Speed
Level of service
Congestion state
Adetiloye and
Extended Kalman filter Highway Camera Travel time Data fusion —
Awasthi [7]
The table accumulates the data source, scope of the study area, input and resulting parameters, and how many cognitive traffic states were considered in the studies. ∗2 = free/congested, 4 = free/light/medium/severe, 5 = very free/free/light/medium/severe.

Figure 6: Subdivision of shallow machine learning models (artificial neural network, regression model, decision tree, and support vector machine).

many lags of the independent variable need to be considered for prediction. The moving average order “q” represents the number of lagged prediction errors. Lastly, “d” is used to make the time series stationary. Alghamdi et al. [67] took d as 1, as one differencing order could make the model stationary. Next, they applied the autocorrelation function (ACF) and the partial autocorrelation function (PACF) along with the minimum information criteria matrix to determine the values of p and q. They only took the time dimension into account. However, the results agreed with the true pattern for only one week and needed to be fine-tuned considering prediction errors. Besides, the study did not consider the spatial dimension.

Figure 7: A simple ANN structure (input layer, hidden layer, and output layer).

Regression models are useful for time series problems. Therefore, regression models are suitable for traffic forecasting problems. However, these models are not reliable for nonlinear, rapidly changing, multidimensional datasets. The results need to be modified according to prediction errors.

However, as discussed here and later in this article, most of the studies used different regression models to validate their proposed model [6, 11, 25, 68, 69].

With the growth of datasets and the complexity associated with it, regression models are becoming less popular in traffic congestion prediction. Currently, regression models are frequently used in modified form with other machine learning algorithms, e.g., ANN and kernel functions. Other sectors where regression models are applied include hybrid ARIMA in traffic speed prediction for a specific vehicle type (Wang et al. [70]), traffic volume prediction [71], and flow prediction applying modified ARIMA [72].

4.2.3. Decision Tree. A decision tree is a model that predicts an output based on several input variables. There are two types of trees: the classification tree and the regression tree. When these two trees merge, a new tree named classification and regression tree (CART) is generated. A decision tree uses the features extracted from the entire dataset. Random forest is a supervised ML classification algorithm that averages the results of multiple decision trees. The features are randomly used while developing the decision trees. It uses a vast number of CART decision trees. The decision trees vote for the predicted class in a random forest model.

Wang et al. [9] proposed a probabilistic method exploiting the information theory tools of entropy and Fano’s inequality to predict road traffic patterns and the associated congestion for urban road segments with no prior knowledge of the O-D of the vehicle. They incorporated road congestion level into time series for mapping the vehicle state into the traffic conditions. As interval influenced the predictability, an optimal segment length and velocity was found. However, with less available data, an increased number of segments increased the predictability. Another traffic parameter, travel time, was used to find CI by Liu and Wu [73]. They applied the random forest ML algorithm to forecast traffic congestion states. At first, they extracted 100 sample sets to construct 100 decision trees by using bootstrap. The number of feature attributes was determined as the square root of the total number of features. Chen et al. [16] also applied the CART method for prediction and classification of traffic congestion. The authors applied Moran’s I method to analyse the spatiotemporal correlation among the traffic flows of different road networks. The model showed effectiveness compared with SVM and the K-means algorithm.

Decision tree is a simple classification problem-solving model that can be applied to multifeature data; e.g., Liu and Wu [73] applied weather condition, road condition, time period, and holiday as the input variables. This model’s knowledge can be represented in the form of IF-THEN rules, making it easily interpretable. It should also be kept in mind that the classification results are usually binary and, therefore, not suitable where the congestion level is required to be known. Other sectors of transport where decision tree models are applied are traffic prediction [74] and traffic signal optimisation with fuzzy logic [75].

4.2.4. Support Vector Machine. The support vector machine (SVM) is a statistical machine learning method. The main idea of this model is to map the nonlinear data to a higher dimensional linear space where the data can be linearly classified by a hyperplane [1]. Therefore, it can be very useful in traffic flow pattern identification for traffic congestion prediction. Tseng et al. [13] determined travel speed in predicting real-time congestion applying SVM. They used Apache Storm to process big data using spouts and bolts. Traffic, weather sensors, and events collected from social media of close proximity were evaluated together by the system. They categorised vehicle speed into classes and referred to them as labels. The speed of the previous three intervals was used to train the proposed model. However, the congestion level categorised from 0 to 100 does not convey specific knowledge of the severity of the level, especially to the road users. Increasing the training data raised both accuracy and computational time. This may ultimately make it difficult to make real-time congestion predictions.

Traffic flow shows different patterns based on the traffic mixture or time of the day. SVM is applied to identify the appropriate pattern. Currently, modified SVM has its application in other sectors as well, e.g., freeway exiting traffic volume prediction [58], traffic flow prediction [76], and sustainable development of transportation and ecology [77].

Most of the studies compared their developed model with SVM [22, 78, 79]. Deep machine learning (DML) algorithms showed better results compared to SVM. Table 2 refers to the studies under this section.
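The idea behind SVM of mapping data to a higher dimensional space where a hyperplane can separate the classes can be illustrated with a small sketch. It is not a trained SVM and not the method of any reviewed study: the data, the labels, the feature map `lift`, and the hand-picked hyperplane are all hypothetical and chosen only to show that points which no single threshold can split in one dimension become linearly separable after lifting.

```python
# Illustrative sketch only: hypothetical data and a hand-picked hyperplane,
# showing why lifting to a higher dimension helps a linear classifier.

def lift(x):
    """Map a 1-D value to the 2-D feature space (x, x^2)."""
    return (x, x * x)

def separable_by_threshold(xs, ys):
    """Return True if a single threshold on x can split the two classes."""
    labels = [y for _, y in sorted(zip(xs, ys))]
    # A threshold works only if the sorted labels switch class at most once.
    changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return changes <= 1

def linear_decision(point, w, b):
    """Side of the hyperplane w . point + b = 0 on which the point lies."""
    score = sum(wi * pi for wi, pi in zip(w, point)) + b
    return 1 if score > 0 else -1

# Hypothetical feature: deviation from a nominal value; large deviations
# in either direction belong to class +1, small ones to class -1.
xs = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]

w, b = (0.0, 1.0), -2.0  # hyperplane x^2 = 2 in the lifted space
predictions = [linear_decision(lift(x), w, b) for x in xs]

print("separable in 1-D:", separable_by_threshold(xs, ys))
print("lifted predictions match labels:", predictions == ys)
```

In practice an SVM finds the separating hyperplane by optimisation and uses a kernel function instead of an explicit feature map; the sketch only shows why the mapping makes a linear boundary possible.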

Table 2: Traffic congestion prediction studies in shallow machine learning.


No. of
Data
Methodology Road type Input parameters Target domain congestion Reference
source
state levels
Sensor Occupancy
Road network Congestion factor 3 Xu et al. [31]
Simulation Density
Nadeem and Fowdur
Distance
Artificial neural Sensor Speed 2 [11]
network Highway Speed
corridor Speed
Traffic congestion Ito and Kaneyasu
Simulation Throttle opening 2
state [60]
Steering input angle
Temperature
Humidity
Highway Traffic congestion
Sensor Rainfall — Jiwan et al. [27]
corridor score
Regression model Traffic speed
Time
Arterial road Volume
Camera Congestion Index 4 Jain et al. [33]
Subarterial road Speed
Ring road Average speed Traffic predictability — Wang et al. [9]
Decision tree Probe Speed
Road network Moran index 5 Chen et al. [16]
Trajectory
Speed
Density
Support vector Highway
Sensor Traffic volume Travel speed — Tseng et al. [13]
machine corridor
difference
Rainfall volume

4.3. Deep Machine Learning. DML algorithms consist of several hidden layers to process nonlinear problems. The most significant advantage of these algorithms is that they can extract features from the input data without any prior knowledge. Unlike SML, feature extraction and model training are done together in these algorithms. DML can convert vast, continuous, and complex traffic data with a limited collection time horizon into patterns or feature vectors. In the last few years, DML has become popular in traffic congestion prediction studies. Traffic congestion studies that used DML algorithms are shown in Figure 8 and discussed in this section.

4.3.1. Convolutional Neural Network. Convolutional neural network (CNN) is a commonly applied DML algorithm in traffic engineering. Due to the excellent performance of CNN in image processing, traffic flow data are converted into a 2-D matrix for processing when CNN is applied in traffic prediction. There are five main parts of a CNN structure in transportation: the input layer, convolution layer, pooling layer, full connection layer, and output layer. Both the convolution and pooling layers extract important features. The depth of these two layers differs between studies. The majority of the studies converted traffic flow data into an image of a 2-D matrix. In the studies performed by Ma et al. [80] and Sun et al. [45], each component of the matrix represented average traffic speed on a specific part of the time. While tuning CNN parameters, they selected a convolutional filter size of (3 × 3) and max-pooling of size (2 × 2) of 3 layers according to parameter settings of LeNet and AlexNet and loss of information measurement. In contrast, Chen et al. [68] used a five-layered convolution with a filter size of (2 × 2) and no pooling layer. The authors applied a novel method called convolution-based deep neural network modelling periodic traffic data (PCNN). The study folded the time series to generate the input, combining real-time and historical traffic data. To capture the correlation of a new time slot with the immediate past, they duplicated the congestion level of the last slot in the matrix. Zhu et al. [49] also applied five convolution-pooling layers, with (3 × 3) and (2 × 2) sizes, respectively. Along with temporal and spatial data, the authors also incorporated time interval data to produce a 3-D input matrix. Unlike these studies, Zhang et al. [6] preprocessed the raw data by performing a spatiotemporal cross-correlation analysis of traffic flow sequence data using PCC. Then, they applied a model named spatiotemporal feature selection algorithm (STFSA) on the traffic flow sequence data to select the feature subsets as the input matrix. A 2-layered CNN with the same convolutional and pooling sizes as the previous studies was used. However, STFSA considers its heuristics, biases, and trade-offs and does not guarantee optimality.

CNN shows good performance where a large dataset is available. It has excellent feature learning capability with less time-consuming classification ability. Therefore, CNN can be applied where the available dataset can be converted into an image. CNN is applied in traffic speed prediction [81] and traffic flow prediction [6], and modified CNN with LSTM is also applied for traffic prediction [82]. However, as mentioned above, no model depth and parameter selection strategies are available.
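The conversion of traffic data into a 2-D matrix followed by a (3 × 3) convolution and (2 × 2) max-pooling can be sketched as below. This is a minimal illustration in plain Python, not the architecture of any reviewed study: the matrix layout (rows as road segments, columns as time slots), the speed values, and the averaging kernel are assumptions made for the example, and a real CNN would learn its kernels during training.

```python
# Hypothetical sketch: one valid 3x3 convolution and one 2x2 max-pooling step
# applied to a made-up space-time matrix of average speeds.

def conv2d_valid(matrix, kernel):
    """Slide the kernel over the matrix without padding (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(matrix) - kh + 1):
        row = []
        for j in range(len(matrix[0]) - kw + 1):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    total += matrix[i + a][j + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out

def max_pool(matrix, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    out = []
    for i in range(0, len(matrix) - size + 1, size):
        row = []
        for j in range(0, len(matrix[0]) - size + 1, size):
            window = [matrix[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(max(window))
        out.append(row)
    return out

# Assumed layout: rows = road segments, columns = time slots, values = km/h.
speeds = [
    [60, 58, 55, 40, 30, 28],
    [62, 60, 50, 35, 25, 20],
    [61, 59, 52, 38, 27, 22],
    [63, 61, 57, 45, 33, 30],
    [64, 62, 58, 50, 40, 35],
    [65, 63, 60, 55, 45, 40],
]
mean_kernel = [[1 / 9] * 3 for _ in range(3)]  # 3x3 averaging filter (illustrative)

features = conv2d_valid(speeds, mean_kernel)  # 6x6 input gives a 4x4 feature map
pooled = max_pool(features, size=2)           # 4x4 feature map gives 2x2 output

print(len(features), "x", len(features[0]), "->", len(pooled), "x", len(pooled[0]))
```

Each convolution-pooling stage shrinks the matrix, which is one reason the number of layers that can be stacked depends on the size of the input image.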

Figure 8: Subdivision of deep machine learning models (convolutional neural network, long short-term memory, and extreme learning machine).

4.3.2. Recurrent Neural Network. Recurrent neural network (RNN) has a wide usage in sequential traffic data processing by considering the influence of the related neighbour (Figure 9). Long short-term memory (LSTM) is a branch of RNN. In the hidden layer of LSTM, there is a memory block that includes four NN layers, which stores and regulates the information flow. In recent years, with different data collection systems with extended intervals, LSTM has become popular. Due to this advantage, Zhao et al. [12] developed an LSTM model consisting of three hidden layers and ten neurons using long interval data. They set an adequate target and fine-tuned the parameters until the training model stabilized. The authors also applied the congestion index and classification (CI-C) model to classify the congestion by calculating CI from LSTM output data. Most of the studies use the equal interval of CI to divide congestion states. This study tried two more intervals, natural breakpoint and geometric interval, and found that the latter provided the most information in terms of information entropy. Lee et al. [69] applied a 4-layer LSTM model with 100 neurons and a 3-D matrix input. The input matrix elements contained normalised speed to shorten the training time. While eliminating the dependency, the authors found that a random distribution of target road speed and more than optimally connected roads in the matrix reduced the performance. To eliminate the limitation of temporal dependency, Yuan-Yuan et al. [79] trained their model in the batch learning approach. The instance found from classifying the test dataset was used to train the model in an online framework. Some studies introduced new layers to modify the LSTM model for feature extraction. Zhang et al. [83] introduced an attention mechanism layer between the LSTM and prediction layers that enabled feature extraction from a traffic flow data sequence and captured the importance of a traffic state. Di et al. [84] introduced convolution that provides an input to the LSTM model to form the CPM-ConvLSTM model. All the studies applied the one-hot method to convert the input parameters. Adam, stochastic gradient descent (SGD), and leakage integral echo state network (LiESN) are a few optimisation methods applied to fine-tune the outcome.

Figure 9: A simple RNN structure.

A few studies combined RNN with other algorithms while dealing with vast parameters of the road network. In this regard, Ma et al. [85] applied the RNN and restricted Boltzmann machine (RNN-RBM) model for networkwide spatiotemporal congestion prediction. Here, they used conditional RBM to construct the proposed deep architecture, which is designed to process the temporal sequence by providing a feedback loop between the visible layer and hidden layer. The congestion state was analysed from traffic speed and was represented in binary format in a matrix as input. Also, Sun et al. [45] combined RNN of three hidden layers with its two other variants: LSTM and gated recurrent unit (GRU). The hidden layers included the memory block characteristic of LSTM, and the cell state and hidden state were incorporated by GRU.

As the sample size is increasing vastly, RNN is becoming popular as a current way of modelling. RNN has a short-term memory. This characteristic of RNN helps to model nonlinear time series data. The training of RNN is also straightforward, similar to multilayer FNN. However, this training may become difficult due to the conversion in a deep architecture with multiple layers in long-term dependency. In case of long-term dependency problems, LSTM is becoming more suitable to be applied, as LSTM can remember information for a long period of time. RNN has its application in other sectors of transport too, e.g., traffic passenger flow prediction [86], modified LSTM in real-time crash prediction [87], and road-network traffic prediction [88].

4.3.3. Extreme Learning Machine. In recent years, a novel learning algorithm called the extreme learning machine (ELM) has been proposed for training the single layer feed-forward neural network (SLFN). In ELM, input weights and hidden biases are assigned randomly instead of being exhaustively tuned. Therefore, ELM training is fast. Taking this advantage into account, Ban et al. [19] applied the ELM model for real-time traffic congestion prediction. They determined CI using the average travel speed. A 4-fold cross-validation was done to avoid noise in raw data. The model

Table 3: Traffic congestion prediction studies in deep machine learning.


Data No. of congestion
Methodology Road type Input parameters Target domain Reference
source state levels
Speed 3 Ma et al. [80]
Probe Average traffic speed Average traffic
Road network 5 Sun et al. [45]
Convolutional neural speed
networks Camera Congestion level Congestion level 3 Chen et al. [68]
Highway
Sensor Traffic flow Traffic flow — Zhang et al. [93]
corridor
Weather data
Road section Probe Congestion time 5 Zhao et al. [12]
Congestion time
Yuan-Yuan et al.
Arterial road Online Congestion level Congestion level
[79]
Spatial similarity
Camera 3
Recurrent neural feature
Road network Speed Lee et al. [69]
network Sensor Speed
Survey Peak hour
Speed
Highway
Sensor Travel time Congestion level 4 Zhang et al. [83]
corridor
Volume
Congestion state Congestion state 2 Ma et al. [85]
Current time
Road traffic state
cluster
Extreme learning Road network Probe Last congestion Congestion
— Ban et al. [19]
machine index Index
Road type
Number of adjacent
roads

found optimal hidden nodes to be 200 in terms of computational cost in the study. An extension of this study was done by Shen et al. [78] and Shen et al. [89] by applying a kernel-based semisupervised extreme learning machine (kernel-SSELM) model. This model can deal with the unlabelled data problem of ELM and the heterogeneous data influence. The model integrated small-scale labelled data of transportation personnel and large-scale unlabelled traffic data to evaluate urban traffic congestion. ELM sped up the processing, while the kernel function optimized the accuracy and robustness of the whole model. However, real-time labelled data collection was quite costly in terms of human resources and working time, and the number of experts for congestion state evaluation should have been higher. Another modification of ELM was applied by Yiming et al. [20]. They applied a symmetric extreme learning machine cluster (S-ELM-cluster) model for short-term traffic congestion prediction by determining the CI. The authors divided the study area and ran the submodels simultaneously for speed.

The ELM model has the advantage of processing large-scale data learning at high speed. ELM works better with labelled data. Where both labelled and unlabelled data are available, semisupervised ELM has shown good prediction accuracy, as seen from the studies. Other sectors where ELM was applied included air traffic flow prediction [90], traffic flow prediction [91], and traffic volume interval prediction [92].

Other than the models already discussed, Zhang et al. [93] proposed a deep autoencoder-based neural network model with symmetry of four layers for the encoder and the decoder to learn temporal correlations of a transportation network. The first component encoded the vector representation of historical congestion levels and their correlation. They then decoded to build a representation of congestion levels for the future. The second component of DCPS used two dense layers, which converted the output from the decoder to calculate a vector representation of congestion level. However, the process lost information as the congestion level of all the pixels was averaged. This approach needed many iterations and was computationally expensive as all the pixels, regardless of roads, were considered. Another study applied a generalised version of recurrent neural network named recursive neural network. The difference between the two is that, in recurrent NN, weights are shared along the data sequence, whereas recursive NN is a single neuron model; therefore, weights are shared at every node. Huang et al. [94] applied a recursive NN algorithm named echo state network (ESN). This model consists of an input layer, reservoir network, and output layer. The reservoir layer constructs the rules that connect the prediction origin and the forecasting horizon. As the study took a large study area with a vast number of links, they simplified the training rule complexity by applying recursive NN. Table 3 summarises some studies.

5. Discussion and Research Gaps

Research in traffic congestion prediction is increasing exponentially. Among the two sources, most of the studies used stationary sensor/camera data. Although sensor data cannot capture the dynamic traffic change, frequent change

in source makes it complicated to evaluate the flow patterns for probe data [95]. Data collection horizon is an important factor in traffic congestion studies. A small horizon of a few days [3] cannot capture the actual congestion situation, as traffic is dynamic. Other studies that used data for a few months showed the limitation of seasonality [22, 67].

The condition of the surroundings is an important factor in traffic congestion. A few studies focused on these factors. Two studies considered social media contribution as an input parameter [7, 13], and five considered weather condition [12, 13, 27, 34, 73]. Events, e.g., national events, school holidays, and popular sports events, play a big role in traffic congestion. For example, Melbourne, Australia, has two public holidays before and during the two most popular sports events of the country. The authorities close a few traffic routes to manage the traffic and the parade, resulting in traffic congestion. Therefore, more focus must be put on including these factors while forecasting.

Dealing with missing data is a challenge in data processing. Some studies excluded the respective data altogether [29], others applied different methods to retrieve the data [59, 85], and some replaced them with other data [45]. Missing data imputation can be a useful research scope in transportation engineering.

Machine learning algorithms, especially DML models, have developed with time. This shows a clear impact on the rise of their implementation in traffic congestion forecasting (Figure 10).

Figure 10: Application of AI models with time (categories: Deep ML, Shallow ML, Reasoning; axis ticks 0 to 12; legend years 2014 to 2019).

Probabilistic reasoning algorithms were mostly applied for a part of the prediction model, e.g., map matching and optimal feature number selection. Fuzzy logic is the most widely used algorithm in this class of algorithms. From the other branches, ANN and RNN are the most applied models. Most of the studies that applied hybrid or ensembled models belong to the probabilistic and shallow learning classes. Only two studies applied hybrid deep learning models while predicting networkwide congestion. Tables 4–6 summarize the advantages and weaknesses of the algorithms of the different branches.

Among all DML models, RNN is more suitable for time series prediction. In a few studies, RNN performed better than CNN as the gap between the traffic speeds in different classes was very small [12, 69]. However, due to little research in the traffic congestion field, a lot of new ML algorithms are yet to be applied.

SML models showed better results than DML while forecasting traffic congestion in the short term, as SML can process linearity efficiently and linear features contribute more to traffic flow in the short term. All the short-term forecasting studies discussed in this article applying SML showed promising results. At the same time, DML models showed good accuracy as these models can handle both linear and nonlinear features efficiently. Besides, real-time congestion prediction cannot afford high computation time. Therefore, models taking a short computational time are more effective in this case.

6. Future Direction

Traffic congestion is a promising area of research. Therefore, there are multiple directions for future research.

Numerous forecasting models have already been applied in road traffic congestion forecasting. However, with the newly developed forecasting models, there is more scope to make the congestion prediction more precise. Also, in this era of information, the use of increased available traffic data by applying the newly developed forecasting models can improve the prediction accuracy.

The semisupervised model was applied only for the ELM model. Other machine learning algorithms should be explored for using both labelled and unlabelled data for higher prediction accuracy. Also, a limited number of studies have focused on real-time congestion forecasting. In future, research should pay attention to the real-time traffic congestion estimation problem.

Another future direction can be focusing on the level of traffic congestion. A few studies have divided the traffic congestion into a few states. However, for better traffic management, knowing the grade of congestion is essential. Therefore, future research should focus on this. Besides, most studies focused on only one traffic parameter to forecast congestion. An excellent future direction is to give attention to more than one parameter and to combine the results during congestion forecasting to make the forecasting more reliable.
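Grading congestion rather than reporting a binary state can be illustrated with a minimal sketch. It assumes a congestion index (CI) already normalised to the range 0 to 1 and equal-interval breakpoints; the function name `grade_congestion`, the normalisation, and the thresholds are hypothetical and not taken from any reviewed study, while the level names follow the four- and five-state schemes noted under Table 1.

```python
# Hypothetical sketch: equal-interval grading of a normalised congestion index.

FOUR_LEVELS = ("free", "light", "medium", "severe")
FIVE_LEVELS = ("very free", "free", "light", "medium", "severe")

def grade_congestion(ci, levels=FOUR_LEVELS):
    """Map a congestion index in [0, 1] to one of the given levels.

    The range is split into len(levels) equal intervals; a CI of exactly 1
    falls into the last (most severe) level.
    """
    if not 0.0 <= ci <= 1.0:
        raise ValueError("congestion index must lie in [0, 1]")
    width = 1.0 / len(levels)
    index = min(int(ci / width), len(levels) - 1)
    return levels[index]

for value in (0.1, 0.3, 0.6, 0.9):
    print(value, grade_congestion(value), grade_congestion(value, FIVE_LEVELS))
```

Other breakpoint schemes, such as the natural breakpoint and geometric interval divisions mentioned in Section 4.3.2, would replace the equal-width rule inside the function.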

Table 4: The strengths and weaknesses of the probabilistic reasoning models.

Methodology: Fuzzy logic
Advantages:
(i) It converts the binary value into the linguistic description, hence portraying the traffic congestion state.
(ii) It can portray more than two states.
(iii) As it does not need an exact crisp input, it can deal with uncertainty.
Disadvantages:
(i) No appropriate membership function shape selection method exists.
(ii) Traffic pattern recognition capability is not as durable as ML algorithms.
(iii) Traffic state may not match the actual traffic state as the outcome is not exact.

Methodology: Hidden Markov model
Advantages:
(i) The model can overcome noisy measurements.
(ii) Can efficiently learn from non-preprocessed data.
(iii) Can evaluate multiple hypotheses of the actual mapping simultaneously.
Disadvantages:
(i) Accuracy decreases with scarce temporal probe trajectory data.
(ii) Not suitable in case of missing dataset.

Methodology: Gaussian mixture model
Advantages:
(i) Can do traffic parameter distribution over a period as a mixture regardless of the traffic state.
(ii) Can overcome the limitation of not being able to account for multimodal output by a single Gaussian process.
Disadvantages:
(i) Optimization algorithm used with GMM must be chosen cautiously.
(ii) Results may show wrong traffic patterns due to local optima limitation and lack of traffic congestion threshold knowledge of the optimisation algorithm.

Methodology: Bayesian network
Advantages:
(i) It can understand the underlying relationship between random variables.
(ii) It can model and analyse traffic parameters between adjacent road links.
(iii) The model can work with incomplete data.
Disadvantages:
(i) Computationally expensive.
(ii) The model performs poorly with the increment in data.
(iii) The model represents one-directional relation between variables only.

Table 5: The strengths and weaknesses of the shallow machine learning models.

Methodology: Artificial neural network
Advantages:
(i) It is an adaptive system that can change structure based on inputs during the learning stage [96].
(ii) When features are defined early, FNN shows excellent efficiency in capturing the nonlinear relationships in data.
Disadvantages:
(i) BPNN requires a vast amount of training data because its parameters are not shared, which increases parameter complexity [97].
(ii) The training convergence rate of the model is slow.

Methodology: Regression model
Advantages:
(i) Models are suitable for time series problems.
(ii) Traffic congestion forecasting problems can be easily solved.
(iii) ARIMA can increase accuracy while keeping the number of parameters to a minimum.
(iv) Minimum complexity in the model.
Disadvantages:
(i) Linear models cannot address nonlinearity, making it harder to solve complex prediction problems.
(ii) Linear models are sensitive to outliers.
(iii) Computationally expensive.
(iv) ARIMA cannot deal with multifeature datasets efficiently.
(v) ARIMA cannot capture rapidly changing traffic flow [8].

Methodology: Support vector machine
Advantages:
(i) It is efficient in pattern recognition and classification.
(ii) It is a universal learning algorithm that can diminish the classification error probability by reducing the structural risk [1].
(iii) It does not need a vast sample size.
Disadvantages:
(i) An improperly chosen kernel function may result in an inaccurate outcome.
(ii) The prediction accuracy of SVM needs to improve for unstable traffic flow.
(iii) It requires high computational time and memory.
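
The sensitivity of linear models to outliers listed in Table 5 can be illustrated with a small numerical sketch. The example below is not taken from any of the reviewed studies: the speed readings, the single faulty value, and the helper name `fit_line` are all hypothetical, and ordinary least squares is used only as the simplest representative of a linear model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: average speed (km/h) dropping by 2 km/h per time step.
times = list(range(10))
speeds = [60.0 - 2.0 * t for t in times]
clean_intercept, clean_slope = fit_line(times, speeds)

# Assumed sensor fault: the last reading is replaced by an implausible value.
corrupted = speeds.copy()
corrupted[-1] = 120.0
outlier_intercept, outlier_slope = fit_line(times, corrupted)

print(f"slope without outlier: {clean_slope:.3f}")
print(f"slope with one outlier: {outlier_slope:.3f}")
```

In this sketch a single corrupted reading out of ten is enough to turn the fitted trend from decreasing to increasing, which is the kind of behaviour the table entry warns about.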

Table 6: The strengths and weaknesses of the deep machine learning models.

Methodology: Convolutional neural networks
Advantages:
(i) Capable of learning features from local connections and composing them into high-level representations.
(ii) Classification is less time-consuming.
(iii) Can automatically extract features.
Disadvantages:
(i) Computationally expensive, as a huge kernel is needed for feature extraction.
(ii) A vast dataset is required.
(iii) Traffic data needs to be converted to an image.
(iv) No established strategies are available for selecting CNN model depth and parameters.

Methodology: Recurrent neural network
Advantages:
(i) Shows excellent performance in processing sequential data flow.
(ii) Efficient in sequence classification.
(iii) Efficient in processing time series with long intervals and postponements.
Disadvantages:
(i) Long-term dependencies result in poor performance.
(ii) No firm guideline is available for eliminating the dependency.

Methodology: Extreme learning machine
Advantages:
(i) Fast learning speed.
(ii) Can avoid local minima.
(iii) Modified models are available to deal with the unlabelled data problem.
Disadvantages:
(i) Training time increases as the number of hidden nodes rises.
(ii) Unlabelled data problem.
(iii) May produce less accurate results.
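
One common explanation for the long-term dependency weakness of plain recurrent networks listed in Table 6 is the vanishing gradient: the influence of an early state on a later one shrinks roughly geometrically with the number of time steps. The sketch below assumes a scalar recurrent unit h_t = tanh(w * h_{t-1} + x_t) with a hypothetical weight and constant input; it is an illustration only and does not reproduce any model from the reviewed articles.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return abs(grad)


# Hypothetical recurrent weight and constant input, chosen for illustration.
near_grad = gradient_through_time(0.9, [0.1] * 5)
far_grad = gradient_through_time(0.9, [0.1] * 50)

print(f"gradient after 5 steps:  {near_grad:.3e}")
print(f"gradient after 50 steps: {far_grad:.3e}")
```

Because each factor in the product has magnitude below 1 here, the gradient reaching a state 50 steps back is orders of magnitude smaller than the one reaching a state 5 steps back. Gated architectures such as LSTM are a widely used remedy.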

7. Conclusions

Traffic congestion prediction has received increasing attention over the last few decades. With the development of infrastructure, every country is facing traffic congestion problems. Forecasting congestion can therefore allow authorities to make plans and take the necessary actions to avoid it. The development of artificial intelligence and the availability of big data have led researchers to apply different models in this field. This article divides the methodologies into three classes. Although probabilistic models are simple in general, they become complex when different factors that affect traffic congestion, e.g., weather, social media, and events, are considered. Machine learning, especially deep learning, has an advantage in this case. Deep learning algorithms have therefore become more popular over time, as they can assess large datasets. However, a wide range of machine learning algorithms are yet to be applied, so considerable opportunity for research in the field of traffic congestion prediction remains.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank RMIT University and the Australian Government Research Training Program (RTP) for the financial support.

References

[1] Z. Shi, Advanced Artificial Intelligence, World Scientific, Singapore, 2011.
[2] M. S. Ali, M. Adnan, S. M. Noman, and S. F. A. Baqueri, “Estimation of traffic congestion cost-A case study of a major arterial in Karachi,” Procedia Engineering, vol. 77, pp. 37–44, 2014.
[3] W. Cao and J. Wang, “Research on traffic flow congestion based on Mamdani fuzzy system,” AIP Conference Proceedings, vol. 2073, 2019.
[4] X. Kong, Z. Xu, G. Shen, J. Wang, Q. Yang, and B. Zhang, “Urban traffic congestion estimation and prediction based on floating car trajectory data,” Future Generation Computer Systems, vol. 61, pp. 97–107, 2016.
[5] Q. Yang, J. Wang, X. Song, X. Kong, Z. Xu, and B. Zhang, “Urban traffic congestion prediction using floating car trajectory data,” in Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, pp. 18–30, Springer, Zhangjiajie, China, November 2015.
[6] W. Zhang, Y. Yu, Y. Qi, F. Shu, and Y. Wang, “Short-term traffic flow prediction based on spatio-temporal analysis and CNN deep learning,” Transportmetrica A: Transport Science, vol. 15, no. 2, pp. 1688–1711, 2019.
[7] T. Adetiloye and A. Awasthi, “Multimodal big data fusion for traffic congestion prediction,” Multimodal Analytics for Next-Generation Big Data Technologies and Applications, Springer, Berlin, Germany, pp. 319–335, 2019.
[8] F. Wen, G. Zhang, L. Sun, X. Wang, and X. Xu, “A hybrid temporal association rules mining method for traffic congestion prediction,” Computers & Industrial Engineering, vol. 130, pp. 779–787, 2019.
[9] J. Wang, Y. Mao, J. Li, Z. Xiong, and W.-X. Wang, “Predictability of road traffic and congestion in urban areas,” PLoS One, vol. 10, no. 4, Article ID e0121825, 2015.
[10] Z. He, G. Qi, L. Lu, and Y. Chen, “Network-wide identification of turn-level intersection congestion using only low-frequency probe vehicle data,” Transportation Research Part C: Emerging Technologies, vol. 108, pp. 320–339, 2019.
[11] K. M. Nadeem and T. P. Fowdur, “Performance analysis of a real-time adaptive prediction algorithm for traffic congestion,” Journal of Information and Communication Technology, vol. 17, no. 3, pp. 493–511, 2018.
[12] H. Zhao, X. Jizhe, L. Fan, L. Zhen, and L. Qingquan, “A peak traffic congestion prediction method based on bus driving time,” Entropy, vol. 21, no. 7, p. 709, 2019.
[13] F.-H. Tseng, J.-H. Hsueh, C.-W. Tseng, Y.-T. Yang, H.-C. Chao, and L.-D. Chou, “Congestion prediction with big data for real-time highway traffic,” IEEE Access, vol. 6, pp. 57311–57323, 2018.

[14] D. Li, J. Deogun, W. Spaulding, and B. Shuart, “Towards missing data imputation: a study of fuzzy K-means clustering method,” in Rough Sets and Current Trends in Computing, S. Tsumoto, R. Słowiński, J. Komorowski, and J. W. Grzymała-Busse, Eds., pp. 573–579, Springer, Berlin, Germany, 2004.
[15] Y. Yang, Z. Cui, J. Wu, G. Zhang, and X. Xian, “Fuzzy c-means clustering and opposition-based reinforcement learning for traffic congestion identification,” Journal of Information & Computational Science, vol. 9, no. 9, pp. 2441–2450, 2012.
[16] Z. Chen, Y. Jiang, D. Sun, and X. Liu, “Discrimination and prediction of traffic congestion states of urban road network based on spatio-temporal correlation,” IEEE Access, vol. 8, pp. 3330–3342, 2020.
[17] Y. Guo, L. Yang, S. Hao, and J. Gao, “Dynamic identification of urban traffic congestion warning communities in heterogeneous networks,” Physica A: Statistical Mechanics and Its Applications, vol. 522, pp. 98–111, 2019.
[18] A. Elfar, A. Talebpour, and H. S. Mahmassani, “Machine learning approach to short-term traffic congestion prediction in a connected environment,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2672, no. 45, pp. 185–195, 2018.
[19] X. Ban, C. Guo, and G. Li, “Application of extreme learning machine on large scale traffic congestion prediction,” Proceedings of ELM-2015, vol. 1, pp. 293–305, 2016.
[20] X. Yiming, B. Xiaojuan, L. Xu, and S. Qing, “Large-scale traffic congestion prediction based on the symmetric extreme learning machine cluster fast learning method,” Symmetry, vol. 11, no. 6, p. 730, 2019.
[21] Y. Zheng, Y. Li, C.-M. Own, Z. Meng, and M. Gao, “Real-time predication and navigation on traffic congestion model with equilibrium Markov chain,” International Journal of Distributed Sensor Networks, vol. 14, no. 4, Article ID 155014771876978, 2018.
[22] P. Zhang and Z. Qian, “User-centric interdependent urban systems: using time-of-day electricity usage data to predict morning roadway congestion,” Transportation Research Part C: Emerging Technologies, vol. 92, pp. 392–411, 2018.
[23] P. Mishra, R. Hadfi, and T. Ito, “Adaptive model for traffic congestion prediction,” in Proceedings of the International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, vol. 9799, pp. 782–793, Morioka, Japan, 2016.
[24] F. Li, J. Gong, Y. Liang, and J. Zhou, “Real-time congestion prediction for urban arterials using adaptive data-driven methods,” Multimedia Tools and Applications, vol. 75, no. 24, pp. 17573–17592, 2016.
[25] J. F. Zaki, A. Ali-Eldin, S. E. Hussein, S. F. Saraya, and F. F. Areed, “Traffic congestion prediction based on Hidden Markov Models and contrast measure,” Ain Shams Engineering Journal, vol. 11, no. 3, p. 535, 2020.
[26] P. Jiang, L. Liu, L. Cui, H. Li, and Y. Shi, “Congestion prediction of urban traffic employing SRBDP,” in Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), pp. 1099–1106, IEEE, Guangzhou, China, 2017.
[27] L. Jiwan, H. Bonghee, L. Kyungmin, and J. Yang-Ja, “A prediction model of traffic congestion using weather data,” in Proceedings of the 2015 IEEE International Conference on Data Science and Data Intensive Systems, pp. 81–88, Sydney, NSW, Australia, December 2015.
[28] E. Onieva, P. Lopez-Garcia, A. D. Masegosa, E. Osaba, and A. Perallos, “A comparative study on the performance of evolutionary fuzzy and crisp rule based classification methods in congestion prediction,” Transportation Research Procedia, vol. 14, pp. 4458–4467, 2016.
[29] S. Yang, “On feature selection for traffic congestion prediction,” Transportation Research Part C: Emerging Technologies, vol. 26, pp. 160–169, 2013.
[30] X. Zhang, E. Onieva, A. Perallos, E. Osaba, and V. C. S. Lee, “Hierarchical fuzzy rule-based system optimized with genetic algorithms for short term traffic congestion prediction,” Transportation Research Part C: Emerging Technologies, vol. 43, pp. 127–142, 2014.
[31] Y. Xu, L. Shixin, G. Keyan, Q. Tingting, and C. Xiaoya, “Application of data science technologies in intelligent prediction of traffic congestion,” Journal of Advanced Transportation, 2019.
[32] J. Zaki, A. Ali-Eldin, S. Hussein, S. Saraya, and F. Areed, “Time aware hybrid hidden Markov models for traffic congestion prediction,” International Journal on Electrical Engineering and Informatics, vol. 11, no. 1, pp. 1–17, 2019.
[33] S. Jain, S. S. Jain, and G. Jain, “Traffic congestion modelling based on origin and destination,” Procedia Engineering, vol. 187, pp. 442–450, 2017.
[34] J. Kim and G. Wang, “Diagnosis and prediction of traffic congestion on urban road networks using Bayesian networks,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2595, no. 1, pp. 108–118, 2016.
[35] W.-X. Wang, R.-J. Guo, and J. Yu, “Research on road traffic congestion index based on comprehensive parameters: taking Dalian city as an example,” Advances in Mechanical Engineering, vol. 10, no. 6, Article ID 168781401878148, 2018.
[36] E. Onieva, V. Milanes, J. Villagra, J. Perez, and J. Godoy, “Genetic optimization of a vehicle fuzzy decision system for intersections,” Expert Systems with Applications, vol. 39, no. 18, pp. 13148–13157, 2012.
[37] P. Lopez-Garcia, E. Onieva, E. Osaba, A. D. Masegosa, and A. Perallos, “A hybrid method for short-term traffic congestion forecasting using genetic algorithms and cross entropy,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 2, pp. 557–569, 2016.
[38] A. Daissaoui, A. Boulmakoul, and Z. Habbas, “First specifications of urban traffic-congestion forecasting models,” in Proceedings of the 27th International Conference on Microelectronics (ICM 2015), pp. 249–252, Casablanca, Morocco, December 2015.
[39] M. Collotta, L. Lo Bello, and G. Pau, “A novel approach for dynamic traffic lights management based on Wireless Sensor Networks and multiple fuzzy logic controllers,” Expert Systems with Applications, vol. 42, no. 13, pp. 5403–5415, 2015.
[40] M. B. Trabia, M. S. Kaseko, and M. Ande, “A two-stage fuzzy logic controller for traffic signals,” Transportation Research Part C: Emerging Technologies, vol. 7, no. 6, pp. 353–367, 1999.
[41] Y. Zhang and Z. Ye, “Short-term traffic flow forecasting using fuzzy logic system methods,” Journal of Intelligent Transportation Systems, vol. 12, no. 3, pp. 102–112, 2008.
[42] H. Wang, L. Zheng, and X. Meng, Traffic Accidents Prediction Model Based on Fuzzy Logic, Springer, Berlin, Germany, 2011.
[43] Y. Zhang and H. Ge, “Freeway travel time prediction using Takagi-Sugeno-Kang fuzzy neural network,” Computer-aided Civil and Infrastructure Engineering, vol. 28, no. 8, pp. 594–603, 2013.

[44] J. Zhao, “Research on prediction of traffic congestion state,” in Proceedings of the MATEC Web of Conferences, Les Ulis, France, 2015.
[45] S. Sun, J. Chen, and J. Sun, “Traffic congestion prediction based on GPS trajectory data,” International Journal of Distributed Sensor Networks, vol. 15, no. 5, Article ID 155014771984744, 2019.
[46] Y. Qi and S. Ishak, “A Hidden Markov Model for short term prediction of traffic conditions on freeways,” Transportation Research Part C: Emerging Technologies, vol. 43, pp. 95–111, 2014.
[47] B. Jiang and Y. Fei, “Traffic and vehicle speed prediction with neural network and hidden Markov model in vehicular networks,” in Proceedings of the 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 1082–1087, IEEE, Seoul, 2015.
[48] G. Zhu, K. Song, P. Zhang, and L. Wang, “A traffic flow state transition model for urban road network based on Hidden Markov Model,” Neurocomputing, vol. 214, pp. 567–574, 2016.
[49] L. Zhu, R. Krishnan, F. Guo, J. Polak, and A. Sivakumar, “Early identification of recurrent congestion in heterogeneous urban traffic,” in Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 4392–4397, IEEE, Auckland, New Zealand, October 2019.
[50] Y. Xie, K. Zhao, Y. Sun, and D. Chen, “Gaussian processes for short-term traffic volume forecasting,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2165, no. 1, pp. 69–78, 2010.
[51] S. Jin, X. Qu, and D. Wang, “Assessment of expressway traffic safety using Gaussian mixture model based on time to collision,” International Journal of Computational Intelligence Systems, vol. 4, no. 6, pp. 1122–1130, 2011.
[52] J. Jun, “Understanding the variability of speed distributions under mixed traffic conditions caused by holiday traffic,” Transportation Research Part C: Emerging Technologies, vol. 18, no. 4, pp. 599–610, 2010.
[53] S. Shiliang, Z. Changshui, and Y. Guoqiang, “A Bayesian network approach to traffic flow forecasting,” IEEE Transactions on Intelligent Transportation Systems, vol. 7, no. 1, pp. 124–132, 2006.
[54] G. Asencio-Cortés, E. Florido, A. Troncoso, and F. Martínez-Álvarez, “A novel methodology to predict urban traffic congestion with ensemble learning,” Soft Computing, vol. 20, no. 11, pp. 4205–4216, 2016.
[55] Z. Zhu, B. Peng, C. Xiong, and L. Zhang, “Short-term traffic flow prediction with linear conditional Gaussian Bayesian network,” Journal of Advanced Transportation, vol. 50, no. 6, pp. 1111–1123, 2016.
[56] S. Wang, W. Huang, and H. K. Lo, “Traffic parameters estimation for signalized intersections based on combined shockwave analysis and Bayesian Network,” Transportation Research Part C: Emerging Technologies, vol. 104, pp. 22–37, 2019.
[57] S. Wang, W. Huang, and H. K. Lo, “Combining shockwave analysis and Bayesian Network for traffic parameter estimation at signalized intersections considering queue spillback,” Transportation Research Part C: Emerging Technologies, vol. 120, Article ID 102807, 2020.
[58] X. Wang, K. An, L. Tang, and X. Chen, “Short term prediction of freeway exiting volume based on SVM and KNN,” International Journal of Transportation Science and Technology, vol. 4, no. 3, pp. 337–352, 2015.
[59] Y. Liu, X. Feng, Q. Wang, H. Zhang, and X. Wang, “Prediction of urban road congestion using a Bayesian network approach,” Procedia - Social and Behavioral Sciences, vol. 138, no. C, pp. 671–678, 2014.
[60] T. Ito and R. Kaneyasu, “Predicting traffic congestion using driver behavior,” in Proceedings of the International Conference on Knowledge Based and Intelligent Information and Engineering Systems, Marseille, France, 2017.
[61] K. Kumar, M. Parida, and V. K. Katiyar, “Short term traffic flow prediction for a non urban highway using artificial neural network,” Procedia - Social and Behavioral Sciences, vol. 104, pp. 755–764, 2013.
[62] K. Kumar, M. Parida, and V. K. Katiyar, “Short term traffic flow prediction in heterogeneous condition using artificial neural network,” Transport, vol. 30, no. 4, pp. 397–405, 2013.
[63] R. More, A. Mugal, S. Rajgure, R. B. Adhao, and V. K. Pachghare, “Road traffic prediction and congestion control using artificial neural networks,” in Proceedings of the 2016 International Conference on Computing, Analytics and Security Trends (CAST), pp. 52–57, IEEE, Pune, India, December 2016.
[64] C. Jacobé de Naurois, C. Bourdin, A. Stratulat, E. Diaz, and J.-L. Vercher, “Detection and prediction of driver drowsiness using artificial neural network models,” Accident Analysis & Prevention, vol. 126, pp. 95–104, 2019.
[65] P. Kumar, S. P. Nigam, and N. Kumar, “Vehicular traffic noise modeling using artificial neural network approach,” Transportation Research Part C: Emerging Technologies, vol. 40, pp. 111–122, 2014.
[66] V. Nourani, H. Gökçekuş, I. K. Umar, and H. Najafi, “An emotional artificial neural network for prediction of vehicular traffic noise,” Science of The Total Environment, vol. 707, p. 136134, 2020.
[67] T. Alghamdi, K. Elgazzar, M. Bayoumi, T. Sharaf, and S. Shah, “Forecasting traffic congestion using ARIMA modeling,” in Proceedings of the 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), pp. 1227–1232, IEEE, Tangier, Morocco, October 2019.
[68] M. Chen, X. Yu, and Y. Liu, “PCNN: deep convolutional networks for short-term traffic congestion prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 11, pp. 3550–3559, 2018.
[69] C. Lee, Y. Kim, S. Jin et al., “A visual analytics system for exploring, monitoring, and forecasting road traffic congestion,” IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 11, pp. 3133–3146, 2020.
[70] H. Wang, L. Liu, S. Dong, Z. Qian, and H. Wei, “A novel work zone short-term vehicle-type specific traffic speed prediction model through the hybrid EMD-ARIMA framework,” Transportmetrica B: Transport Dynamics, vol. 4, no. 3, pp. 159–186, 2016.
[71] X. F. Wang, X. Y. Zhang, Q. Y. Ding, and Z. Q. Sun, “Forecasting traffic volume with space-time ARIMA model,” Advanced Materials Research, vol. 156-157, pp. 979–983, 2010.
[72] L. Kui-Lin, Z. Chun-Jie, and X. Jian-Min, “Short-term traffic flow prediction using a methodology based on ARIMA and RBF-ANN,” in Proceedings of the 2017 Chinese Automation Congress (CAC), pp. 2804–2807, Jinan, China, October 2017.
[73] Y. Liu and H. Wu, “Prediction of road traffic congestion based on random forest,” in Proceedings of the 2017 10th International Symposium on Computational Intelligence and Design (ISCID), vol. 2, pp. 361–364, IEEE, Hangzhou, China, December 2017.
[74] W. Alajali, W. Zhou, S. Wen, and Y. Wang, “Intersection traffic prediction using decision tree models,” Symmetry, vol. 10, no. 9, p. 386, 2018.

[75] M. Balta and İ. Özçelik, “A 3-stage fuzzy-decision tree model for traffic signal optimization in urban city via a SDN based VANET architecture,” Future Generation Computer Systems, vol. 104, pp. 142–158, 2020.
[76] X. Feng, X. Ling, H. Zheng, Z. Chen, and Y. Xu, “Adaptive multi-kernel SVM with spatial-temporal correlation for short-term traffic flow prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 6, pp. 2001–2013, 2019.
[77] S. Lu and Y. Liu, “Evaluation system for the sustainable development of urban transportation and ecological environment based on SVM,” Journal of Intelligent & Fuzzy Systems, vol. 34, no. 2, pp. 831–838, 2018.
[78] Q. Shen, X. Ban, and C. Guo, “Urban traffic congestion evaluation based on kernel the semi-supervised extreme learning machine,” Symmetry, vol. 9, no. 5, p. 70, 2017.
[79] C. Yuan-Yuan, Y. Lv, Z. Li, and F.-Y. Wang, “Long short-term memory model for traffic congestion prediction with online open data,” in Proceedings of the 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), pp. 132–137, Rio de Janeiro, Brazil, December 2016.
[80] X. Ma, Z. Dai, Z. He, J. Ma, Y. Wang, and Y. Wang, “Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction,” Sensors, vol. 17, no. 4, p. 818, 2017.
[81] R. Ke, W. Li, Z. Cui, and Y. Wang, “Two-stream multi-channel convolutional neural network for multi-lane traffic speed prediction considering traffic volume impact,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2674, no. 4, pp. 459–470, 2020.
[82] T. Bogaerts, A. D. Masegosa, J. S. Angarita-Zapata, E. Onieva, and P. Hellinckx, “A graph CNN-LSTM neural network for short and long-term traffic forecasting based on trajectory data,” Transportation Research Part C: Emerging Technologies, vol. 112, pp. 62–77, 2020.
[83] T. Zhang, Y. Liu, Z. Cui, J. Leng, W. Xie, and L. Zhang, “Short-term traffic congestion forecasting using attention-based long short-term memory recurrent neural network,” in Proceedings of the International Conference on Computational Science, Faro, Portugal, June 2019.
[84] X. Di, Y. Xiao, C. Zhu, Y. Deng, Q. Zhao, and W. Rao, “Traffic congestion prediction by spatiotemporal propagation patterns,” in Proceedings of the 2019 20th IEEE International Conference on Mobile Data Management (MDM), pp. 298–303, Hong Kong, China, June 2019.
[85] X. Ma, H. Yu, Y. Wang, and Y. Wang, “Large-scale transportation network congestion evolution prediction using deep learning theory,” PLoS One, vol. 10, no. 3, Article ID e0119044, 2015.
[86] Z. Zhene, P. Hao, L. Lin et al., “Deep convolutional mesh RNN for urban traffic passenger flows prediction,” in Proceedings of the 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), pp. 1305–1310, IEEE, Guangzhou, China, October 2018.
[87] P. Li, M. Abdel-Aty, and J. Yuan, “Real-time crash risk prediction on arterials based on LSTM-CNN,” Accident Analysis & Prevention, vol. 135, Article ID 105371, 2020.
[88] W. Xiangxue, X. Lunhui, and C. Kaixun, “Data-driven short-term forecasting for urban road network traffic based on data processing and LSTM-RNN,” Arabian Journal for Science and Engineering, vol. 44, no. 4, pp. 3043–3060, 2019.
[89] Q. Shen, X. Ban, C. Guo, and C. Wang, “Kernel based semi-supervised extreme learning machine and the application in traffic congestion evaluation,” Proceedings of ELM-2015, vol. 1, pp. 227–236, 2016.
[90] Z. Zhang, A. Zhang, C. Sun et al., “Research on air traffic flow forecast based on ELM non-iterative algorithm,” Mobile Networks and Applications, pp. 1–15, 2020.
[91] Y.-m. Xing, X.-j. Ban, and R. Liu, “A short-term traffic flow prediction method based on kernel extreme learning machine,” in Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 533–536, IEEE, Shanghai, China, January 2018.
[92] L. Lin, J. C. Handley, and A. W. Sadek, “Interval prediction of short-term traffic volume based on extreme learning machine and particle swarm optimization,” in Proceedings of the 96th Transportation Research Board Annual, Washington DC, USA, 2017.
[93] S. Zhang, Y. Yao, J. Hu, Y. Zhao, S. Li, and J. Hu, “Deep autoencoder neural networks for short-term traffic congestion prediction of transportation networks,” Sensors, vol. 19, no. 10, p. 2229, 2019.
[94] D. Huang, Z. Deng, S. Wan, B. Mi, and Y. Liu, “Identification and prediction of urban traffic congestion via cyber-physical link optimization,” IEEE Access, vol. 6, pp. 63268–63278, 2018.
[95] V. Ahsani, M. Amin-Naseri, S. Knickerbocker, and A. Sharma, “Quantitative analysis of probe data characteristics: coverage, speed bias and congestion detection precision,” Journal of Intelligent Transportation Systems, vol. 23, no. 2, pp. 103–119, 2019.
[96] S. J. Kwon, Artificial Neural Networks, Nova Science Publishers, New York, NY, USA, 2011.
[97] Y. Liu, Z. Liu, and R. Jia, “DeepPF: a deep learning based architecture for metro passenger flow prediction,” Transportation Research Part C: Emerging Technologies, vol. 101, pp. 18–34, 2019.
