Determination of Peak Hour Ridership


sustainability

Article
Determination of the Peak Hour Ridership of Metro
Stations in Xi’an, China Using
Geographically-Weighted Regression
Lijie Yu, Yarong Cong * and Kuanmin Chen
Department of Traffic Engineering, College of Transportation Engineering, Chang’an University,
Xi’an 710064, China; [email protected] (L.Y.); [email protected] (K.C.)
* Correspondence: [email protected].

Received: 4 February 2020; Accepted: 11 March 2020; Published: 13 March 2020 

Abstract: The ridership of a metro station during a city’s peak hour is not always the same as that
during the station’s own peak hour. To investigate this inconsistency, this study introduces the peak
deviation coefficient to describe this phenomenon. Data from 88 metro stations in Xi’an, China,
are used to analyze the peak deviation coefficient based on the geographically weighted regression
model. The results demonstrate that when the land around a metro station is mainly land for work,
primary and middle schools, and residences, its station’s peak hour is consistent with the city’s peak
hour. Additionally, the station’s peak hour is more likely to deviate from the city’s peak hour for
suburban stations. There are two ridership options when designing stations, namely the extra peak
hour ridership during a city’s peak hour and that during a station’s peak hour, and the larger of the
two is used to design metro stations. The mixed land use ratio must be considered in urban land use
planning, because although non-commuting land can mitigate the traffic pressure of a city’s peak
hour, it may cause the deviation of the station’s peak hours from that of the city.

Keywords: urban rail transit station; peak deviation coefficient; transportation and land use;
geographically weighted regression; station design

1. Introduction
According to the Transit Capacity and Quality of Service Manual, the calculation procedure for a
desired station’s platform size includes choosing the corresponding average number of passengers,
adjusting for passenger characteristics as appropriate, estimating the maximum passenger demand for
the platform at a given time, and calculating the required waiting space by multiplying the average
space per person by the maximum passenger demand [1]. Estimating the maximum passenger demand
for a platform at a given time is one of the most important steps for station design. In China, the
Code for Design of Metro (the most recent edition of which is from 2003) states that the capacity of
a metro station must be determined by the extra peak hour passenger flow, which is the predicted
peak hour passenger flow multiplied by the extra peak hour factor, which is between 1.1 and 1.4 [2].
The predicted result for a metro station is extracted from the predicted result of the metro network,
as the total passenger flow volume of the metro network must be controlled to reduce forecasting
errors [3]. Today, macroscopic passenger flow forecasting still uses the four-step model [4], which was
primarily developed for the prediction of traffic on a regional scale [5] and the evaluation of large-scale
infrastructure projects [6]. Thus, the predicted ridership of a station is always in the city’s peak hour,
and it determines the station design [7]. However, surveys of urban rail transit operations have shown
that, in many areas, a station’s peak hours are not always the same as the city’s peak hours [8]. When
designing stations, designers are clearly biased toward adopting station ridership during the city’s

Sustainability 2020, 12, 2255; doi:10.3390/su12062255 www.mdpi.com/journal/sustainability



(and not the station’s) peak hours. This will increase the difficulty of both passenger flow forecasting
in the planning stage and train departure intervals and station management in the operation stage.
Because there is only one train departure interval in a period of time in a section of a metro line, if a
station’s peak hour is different from those of other stations, the train departure interval will be greater
in this station’s peak hour, and passengers will pile up.
The latest edition of the Code for Prediction of Urban Rail Transit Ridership in China (2016) states that,
for the stations in which the passenger flow peak does not appear within the morning and evening
peak hours of the city, the station’s own peak hour and passenger flow volume should be predicted [9].
Due to the lack of a theoretical basis, rail transit passenger flow forecast reports for each city can only
provide results from a qualitative angle. Therefore, the enlargement coefficient of the passenger flow
volume during the peak hours of a station should be determined. This means that, in future rail transit
station design, the predicted passenger flow volume during a city’s peak hours can be multiplied by
this enlargement coefficient, thereby providing a more accurate basis for station design.
The goal of this study is to determine an easy way to convert the ridership in a city’s peak hour as
predicted by the four-step method to the ridership in the station’s peak hour while simultaneously
retaining the advantages of the four-step method in the control of the total volume of the entire metro
network. With regard to the extra peak hour factor, this paper introduces the peak deviation coefficient
(PDC), which is the ratio between the predicted ridership of a metro station during its peak hour and
the predicted ridership of this metro station during the city’s peak hour. After forecasting the future
ridership during the city’s peak hour, the ridership in the station’s peak hour can be calculated from
the predicted results via multiplication by the PDC. When designing a new metro line, the land use
planning along the metro line will also be conducted simultaneously. Thus, by the analogy of the
relationship between the land use around the existing station and the PDC, the considered value of the
PDC in the planned station can be determined. The aim of this paper is to analyze the relationship
between the influencing factors and the PDC. Because urban development is not at equilibrium and
the geographically weighted regression (GWR) model can reflect the spatial heterogeneity, the spatial
characteristics of the PDC and its influencing factors are investigated based on the GWR model.

2. Literature Review
With the development of urban rail transit, the research on passenger flow forecasting has
continuously deepened and has yielded many results. Scholars first studied the four-step model,
which is a family of interrelated models (generation, distribution, assignment, and mode choice), and
gradually formed a more mature passenger flow forecasting method [10–13]. Researchers then began
to consider the total control and distribution of passenger flow in the rail transit network [3]. With the
construction and operation of urban rail transit, some scholars began to analyze the shortcomings of
the traditional four-step model [14,15] and carry out improvement measures [16].
The four-step method is one of the best tools for predicting travel volume [17]. However, it
is better suited to large-scale traffic prediction than to station-level forecasting. It cannot be
used to determine a station’s peak hour, as a station is small compared to the entire metro network.
Many scholars have therefore attempted to analyze the characteristics and influences of time-varying
passenger flow. Feng et al. [18] analyzed a proportional distribution of the hourly station ridership to the
daily station ridership, and others have found that land use [19] or the surrounding environment [20,21],
employment, and population [22] influence the diurnal pattern of station passenger flow. Sung and
Oh [23] developed different multiple regression models for the day and week. He et al. [24] used a
metro ridership estimation method to investigate station-level metro ridership at different times (day
of the week, week of the month, and month of the year). Although these studies were not aimed at the
investigation of the influence of the degree of the peak hour deviation of station passenger flow, this
type of research is a component of the time distributions of metro stations. Therefore, these studies can
provide references for the research of the present study.

Other scholars have used Bayesian networks [25], multivariate regression [26], bi-directional
long short-term memory networks [27], time series and regression models [28], and back propagation
neural networks [29] to predict short-term passenger flow during a station’s peak hour. However,
these methods are not suitable for stations that have not yet been constructed.
Few studies have been conducted on passenger flow during a station’s peak hour. Shen [30]
proposed the concept of two types of station ridership during peak hours, namely the ridership during
the city’s peak hour and the ridership during the station’s peak hour; however, the study provided no
research data. Others have found that a station’s peak hour is related to the station’s land use [31,32],
location, and travel purposes [8].
Fotheringham [33] proposed the geographically weighted regression (GWR) model, which
accounts for spatial heterogeneity. The GWR model is suited to solve the problems related to spatial
autocorrelation and non-stationarity in traffic system modeling [34] and has been used to analyze
traffic accidents [35], car ownership [36], average annual road traffic [37], public transport use [38], and
the relationship between transport and land use [39]. The GWR model has also been used to account
for spatial autocorrelation and non-stationarity at the station level [5,21,40].
Because urban development is not at equilibrium, the present research on the peak hour of metro
stations was carried out on the basis of previous studies of stations’ peak ridership during this time.
Thus, this paper puts forward an enlargement coefficient called the peak deviation coefficient (PDC),
which can be used to convert the ridership during a city’s peak hour to that during a station’s peak
hour. Considering the uneven development around the city center in most cities and the character of
the GWR model, the GWR is used in the present study to analyze the relationship between influencing
factors and the peak deviation coefficient.

3. Data Sources and Variables

3.1. Data Sources


The subjects of this study are the 88 metro stations in Xi’an, China. Station names consist of two
parts. The first part is the number of the line to which the station belongs (for example, 1# represents
Line 1); a transfer station takes the number of the line that began operating earlier. The second part is
the initials of the station name. The data include: 1) The location
of metro stations in Xi’an and whether the station is a transfer station or not, according to information
collected by the Baidu coordinate pickup system in 2018; 2) The automatic fare collection system (AFC)
credit card data of the Xi’an metro in March 2019, provided by Xi’an Metro Co., Ltd.; 3) Land use and
building density around stations, which were collected using Baidu Maps. The land use of Xi’an metro
stations is shown in Figure 1.

Figure 1. Land use of Xi’an metro stations ($\times 10^4$ m$^2$).

3.2. Dependent Variable

A station’s peak hour refers to the single hour of the day with the highest hourly ridership of the urban rail transit station, whereas a city’s peak hour refers to the single hour of the day that has the highest number of trips throughout the city. For example, the time distributions in the morning of three metro stations are presented in Figures 2–4. 1#CLP is in the urban area of Xi’an, 2#BKZ is co-built with a high-speed rail station, and 3#DYT is near a popular tourist spot, the Greater Wild Goose Pagoda Square. The boarding or alighting peak hours of these stations are incongruent with the city’s peak morning hour.

Figure 2. Time distribution of metro station 1#CLP in the morning.

Figure 3. Time distribution of metro station 2#BKZ in the morning.

Figure 4. Time distribution of metro station 3#DYT in the morning.

The predicted ridership of metro stations during a city’s peak hour is provided in the forecasting of passenger flow. The peak deviation coefficient (PDC) is used in this study to facilitate the conversion of ridership during a city’s peak hour to that during a station’s peak hour; the coefficient is the ratio between the predicted ridership of a metro station during its own peak hour and the predicted ridership of this metro station during the city’s peak hour. The calculation formula of the PDC is as follows:

$$\mathrm{PDC} = \frac{P_s}{P_c} \tag{1}$$

where $P_s$ is the ridership of a metro station during its own peak hour, $P_c$ is the ridership of this metro station during the city’s peak hour, and PDC is the peak deviation coefficient. The closer the value of the PDC is to 1, the more similar the ridership at the station’s peak hour is to the ridership at the city’s peak hour.
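
As a minimal sketch of Equation (1), the following Python function computes the PDC of one station from hourly ridership counts. The function name, the dictionary layout keyed by the start time of each hour, and the sample numbers are illustrative assumptions and are not taken from the Xi’an data.

```python
def peak_deviation_coefficient(station_hourly, city_peak_hour):
    """Return (PDC, station_peak_hour) following Equation (1): PDC = P_s / P_c.

    station_hourly: dict mapping an hour label to the station's ridership in that hour.
    city_peak_hour: the hour label of the city's peak hour (assumed to be known).
    """
    # P_c: the station's ridership during the city's peak hour
    p_c = station_hourly[city_peak_hour]
    if p_c <= 0:
        raise ValueError("ridership in the city's peak hour must be positive")
    # P_s: the station's ridership during its own busiest hour
    station_peak_hour = max(station_hourly, key=station_hourly.get)
    p_s = station_hourly[station_peak_hour]
    return p_s / p_c, station_peak_hour


# Hypothetical morning boardings for one station (illustrative values only).
boardings = {"06:30": 420, "07:30": 900, "08:30": 1170, "09:30": 610}
pdc_b, peak = peak_deviation_coefficient(boardings, city_peak_hour="07:30")
print(peak, round(pdc_b, 2))  # 08:30 1.3
```

Because $P_s$ is the maximum over all hours, the value returned is never below 1, and it equals 1 exactly when the station’s busiest hour coincides with the city’s peak hour.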

In this study, the boarding and alighting PDC values in the morning were calculated separately. $\mathrm{PDC}^b$ denotes the boarding PDC, and $\mathrm{PDC}^a$ denotes the alighting PDC; both are dependent variables. The boarding and alighting ridership in a city’s peak hour, the difference between ridership during a station’s peak hour and the city’s peak hour, and the PDC values of Xi’an metro stations in the morning are presented in Figures 5 and 6. Most PDC values are near 1, indicating that these stations’ peak hours align with the city’s peak hour. However, there are also some stations with a PDC value greater than 1, and some with a PDC value even greater than 1.4. Table 1 presents the morning and evening PDC values of 52 stations of Chongqing metro lines 1, 3, and 6 [41]. It can be seen that although most PDC values are in the range of 1–1.2, 7.69% and 13.46% of stations’ morning and evening PDCs, respectively, are found to be greater than 1.2. Thus, the phenomenon of the ridership of a metro station during a city’s peak hour not always being the same as that during a station’s own peak hour is not a unique case in Xi’an. For the sake of the meticulous design and sustainable development of metro systems, it is important to study the PDC.

Table 1. The statistical peak deviation coefficient (PDC) in Chongqing, China.

Range           Morning PDC          Evening PDC
                Data    Ratio        Data    Ratio
(1.00, 1.10)    37      71.15%       34      65.38%
(1.10, 1.20)    11      21.15%       11      21.15%
(1.20, 1.30)    3       5.77%        2       3.85%
(1.30, 1.40)    0       0.00%        1       1.92%
(1.40, +∞)      1       1.92%        4       7.69%

It can be determined from Figures 5 and 6 that, when the PDC is large, there may be two conditions.
The first condition is that the difference between the ridership during a station’s peak hour and the
city’s peak hour is very large compared to that of other stations, such as is the case for 1#KFL and
2#BKZ. The second condition is that the difference between the ridership during a station’s peak hour
and the city’s peak hour is small compared to that of other stations, but the station ridership itself is
small, and as a result the PDC becomes large, such as is the case for 3#WZ and 4#SZDD.
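
The two conditions can be illustrated with a short Python sketch. The helper name and both sets of ridership figures are hypothetical and chosen only to show that the same PDC can arise from very different absolute gaps.

```python
def pdc_and_gap(p_s, p_c):
    """Return the PDC (P_s / P_c) and the absolute gap (P_s - P_c)."""
    return p_s / p_c, p_s - p_c


# Hypothetical stations, not taken from the Xi'an data.
busy_station = pdc_and_gap(p_s=6000, p_c=4000)   # first condition: large gap
quiet_station = pdc_and_gap(p_s=300, p_c=200)    # second condition: small ridership

print(busy_station)   # (1.5, 2000)
print(quiet_station)  # (1.5, 100)
```

Both stations have the same PDC, although the gap at the busy station is twenty times larger, which is why the PDC is best read together with the absolute ridership.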

Figure 5. The morning boarding ridership in the peak hours of the city and stations, and the PDC values of Xi’an metro stations.

Figure 6. The morning alighting ridership in the peak hours of the city and stations, and the PDC values of Xi’an metro stations.

3.3. Independent Variable Selection

Research on the peak hours of metro stations is still in the initial stage, and is classified as the forecasting of the temporal distribution of station ridership. Therefore, this paper uses the direct station ridership model as a reference, and determines the influence factors of the PDC. The direct station ridership model generally categorizes the influence factors of station ridership into four classes: the built environment, population, social economy, and station characteristics [42].

3.3.1. Built Environment

The peak hours of trips are always different. People’s trips can be classified into two kinds according to the consistency of the city’s peak hour. The peak hour of the first kind of trip is consistent with the city’s peak hours, such as going to work and going to school. The peak hour of the second kind of trip is not consistent with the city’s peak hours, such as traveling for shopping. Because different land uses attract different purposes of trips, the ratio of land use ultimately determines the peak hour of the station. Thus, the proportion of the commuting area has an impact on the PDC.

The proportion of the commuting area refers to the proportion of land area that can generate commuter travel to the total land area. According to a 2015 resident travel survey in Xi’an, the city’s peak hours are 7:30–8:30 in the morning and 18:00–19:00 in the evening. The temporal distribution of trip purposes is presented in Figure 7. The peak hours of going to work, going to school, and going home fall within the city’s peak hours; however, the peak hours of other purposes do not fall within the city’s peak hours. In particular, the land for education consists of two kinds of land, namely land for primary and middle schools and land for colleges, which both generate different passenger flows. Most primary-and-middle-school students go to school and go back home every school day, and the temporal distribution of their trip is labeled “Going to school” in Figure 7. Other scholars have also found that home-based school trips are another important part of morning-peak ridership [43], and this peak hour falls within the city’s peak hours. However, most college students live and learn at their schools, and only leave occasionally. When they return, their destinations are their colleges, and the temporal distribution is labeled “Going back to college” in Figure 7. Thus, in this paper, the land areas for work, primary and middle schools, and residences (WPR) are used as the commuting area. The proportion of WPR refers to the sum of the land areas for work, primary and middle schools, and residences, which is then divided by the total land area around the station.
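
The definition of the proportion of WPR can be sketched in Python as follows. The land-use class names, the function name, and the sample areas are assumptions made for illustration; in particular, counting every listed class (including undeveloped land) in the total land area is one possible reading of the definition, not a detail confirmed by the text.

```python
# Land-use classes treated as commuting land (WPR): work, primary and
# middle schools, and residences. The labels themselves are hypothetical.
COMMUTING_CLASSES = {"work", "primary_middle_school", "residence"}


def wpr_proportion(land_areas):
    """Return the share of WPR land in the total land area around a station.

    land_areas: dict mapping a land-use class to its area (any consistent unit).
    """
    total = sum(land_areas.values())
    if total <= 0:
        raise ValueError("total land area must be positive")
    commuting = sum(area for use, area in land_areas.items()
                    if use in COMMUTING_CLASSES)
    return commuting / total


# Hypothetical areas around one station, in units of 10^4 m^2.
areas = {
    "work": 20.0,
    "primary_middle_school": 5.0,
    "residence": 35.0,
    "college": 10.0,       # colleges are not counted as commuting land
    "commerce": 8.0,
    "undeveloped": 2.0,
}
print(wpr_proportion(areas))  # 0.75
```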

Figure 7. Temporal distribution of trip purposes in Xi’an.

In China, only large cities have built up urban rail transit. The floating populations of these cities are large, and the populations around the stations are difficult to count. The high density of residential buildings can be attributed to more residents living near the stations [44]. Therefore, in this paper, the built environment is used in place of the population. In a study by Sung [45], the best catchment area for predicting station-level ridership is 500 m; thus, the statistical range of the current built environment is considered to be 500 m. The proportion of the commuting area as calculated by the built environment in a 500 m station catchment area is then used as an independent variable. The spatial distribution of the proportions of the commuting areas of Xi’an metro stations is presented in Figure 8.

Figure 8. Spatial distribution of the proportions of work, primary and middle schools, and residences (WPR) of Xi’an metro stations.

Yu [41] found that the area of undeveloped land around stations is related to the PDC, but no definite results were obtained. Thus, the undeveloped land area refers to the area of the undeveloped land around a station, and is used as an independent variable. The spatial distribution of undeveloped land areas of Xi’an metro stations is presented in Figure 9.

Figure 9. Spatial distribution of the undeveloped areas of Xi’an metro stations.

3.3.2. Distance to the City Center

García [46] determined that the location of a house (close to the sea or city center and located in a certain area) influenced housing prices. Therefore, properties in the same land-use class in different locations will have different prices, reflecting socioeconomic conditions in the area. If the metro station is far from the city center, people have to depart earlier because of the longer travel time. Thus, the distance to the city center is considered as an independent variable in this study.
3.3.3. Betweenness Centrality
If the land use is the same at two stations, the ridership volume of a transfer station is higher than that of a normal station [47]. Existing studies have used these two types of stations as discrete dummy variables [48] when analyzing the differences between transfer stations and normal stations; the dummy variable indicates whether the station is a transfer station or not. However, this approach does not consider the differences between transfer stations in different locations. Therefore, the concept of betweenness centrality (BC), which is used in graph theory, is used in the present study to determine the locations and properties of stations in the urban rail transit network. There is at least one shortest path from every starting station s to every terminal station t, but not all shortest paths pass through station i. The BC value reflects the share of all shortest paths in the metro network that pass through one station, and the formula is as follows:

$$BC_i = \sum_{s \neq t} \frac{\varepsilon_{sit}}{\varepsilon_{st}} \tag{2}$$

where $BC_i$ is the betweenness centrality of station i, $\varepsilon_{sit}$ is the number of shortest paths from station s to station t via station i, and $\varepsilon_{st}$ is the number of shortest paths in the metro network from station s to station t.
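
As an illustration of Formula (2), the following Python sketch counts shortest paths by breadth-first search on a small hypothetical network (five stations on a single line, not the Xi’an network). It assumes an unweighted, undirected graph and sums over ordered pairs (s, t) in which s, t, and i are all distinct. The optional normalization by (n − 1)(n − 2) is also an assumption, since the text does not state how the BC values were scaled.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """BFS from `source`; sigma[v] is the number of shortest paths source -> v."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, normalized=False):
    """BC_i = sum over ordered pairs s != t (both != i) of eps_sit / eps_st."""
    nodes = list(adj)
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    bc = {i: 0.0 for i in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for i in nodes:
                if i in (s, t) or i not in dist_s:
                    continue
                dist_i, sigma_i = info[i]
                # i is on a shortest s-t path iff the distances add up
                if t in dist_i and dist_s[i] + dist_i[t] == dist_s[t]:
                    bc[i] += sigma_s[i] * sigma_i[t] / sigma_s[t]
    if normalized and len(nodes) > 2:
        scale = (len(nodes) - 1) * (len(nodes) - 2)  # assumed scaling
        bc = {i: v / scale for i, v in bc.items()}
    return bc


# Hypothetical line A-B-C-D-E: the middle station carries the most paths.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
bc_line = betweenness(line)

# Hypothetical loop A-B-C-D-A: two equally short routes exist between A and C.
loop = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
bc_loop = betweenness(loop)
```

In the loop network, station B lies on only one of the two shortest paths between A and C, so each direction contributes 1/2 to its BC value.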
The spatial distribution of the BC values of Xi’an metro stations is presented in Figure 10.

Figure 10. Spatial distribution of the betweenness centrality (BC) values of Xi’an metro stations.

3.4. Summary of Variables


The variables used in this paper are described in Table 2.

Table 2. The variables used in this paper.

| Variable | Explanation | Value: Mean | Value: Max | Value: Min |
|---|---|---|---|---|
| PDCb | Dimensionless continuous values | 1.134 | 3.990 | 1.000 |
| PDCa | Dimensionless continuous values | 1.068 | 1.693 | 1.000 |
| Proportion of WPR | Dimensionless continuous values | 0.764 | 0.995 | 0.062 |
| Undeveloped land | Unit: 10^4 km^2 | 22.778 | 76.516 | 0.000 |
| Distance to the city center | Continuous values, unit: km | 7.561 | 17.800 | 0.200 |
| BC | Dimensionless continuous values | 0.113 | 0.457 | 0.000 |

4. Methodology
Building on the ordinary least squares (OLS) model, the GWR model introduces local parameters that depend on spatial position, so that the spatial characteristics of the data enter the model and reveal the local spatial characteristics and the instability of the spatial distribution in the research area [49]. The equation of the GWR model can be expressed as follows:

$$\mathrm{PDC}_i = \beta_0(u_i, v_i) + \sum_{k=1}^{4} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \tag{3}$$

where $\mathrm{PDC}_i$ is the PDC of metro station i, $(u_i, v_i)$ denotes the location of station i, $\beta_k(u_i, v_i)$ indicates the kth regression parameter at station i, which is a function of the geographical position, and $x_{ik}$ is the independent variable, of which there are four used in this study (the proportion of WPR, the undeveloped land area, the distance to the city center, and BC); their definitions and formulas were introduced in the previous section. Additionally, $\varepsilon_i$ is the normally distributed error term of station i, and p is the total number of stations.
After omitting the spatial position term, the resulting equation is as follows:

$$\mathrm{PDC}_i = \beta_{i0} + \sum_{k=1}^{4} \beta_{ik}\, x_{ik} + \varepsilon_i \tag{4}$$

Formula (4) can be expressed as:

$$\mathrm{PDC}_i = X(i)\beta(i) + \varepsilon_i \tag{5}$$

The weighted least-squares method is used to calculate the regression parameters of station i, which gives Formula (6):

$$\hat{\beta}(i) = \left[X^{T} W(i) X\right]^{-1} X^{T} W(i)\, \mathrm{PDC}_i \tag{6}$$

where W(i) is the weight matrix of the geographically weighted regression at station i, and X is a matrix
of explanatory variables. If the sample points of each station remain homogeneous in space, i.e., β(i) is
a constant, then the GWR is equivalent to the OLS model. However, in actuality, the assumption of
spatial independence or homogeneity is difficult to hold, especially in the related research of urban
rail passenger flow; although the land use situation around each station is similar, due to the different
spatial locations of each station, their boarding and alighting ridership are very different. It is clear
that determining how to choose the desired spatial weight matrix is the key to accurately describing
the spatial interaction, which largely determines the quality of the model fitting effect [50]. According
to Hanham et al. [51], the Gaussian distance function was chosen in this study to be the basic form of
space weight, and the expression is as follows:

$$\omega_{ij} = \exp\left[-\left(\frac{d_{ij}}{b}\right)^{2}\right] \tag{7}$$

where $d_{ij}$ is the distance between station i and station j, and b is the kernel bandwidth parameter.
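
A minimal Python sketch of Formula (7) follows. The station coordinates and the bandwidth are hypothetical, and the distance is assumed to be the straight-line distance in planar coordinates; the text does not state how the distance or the bandwidth were obtained.

```python
import math


def gaussian_weights(coords, i, bandwidth):
    """Return w_ij = exp(-(d_ij / b)^2) for every station j, relative to station i."""
    xi, yi = coords[i]
    weights = []
    for xj, yj in coords:
        d_ij = math.hypot(xj - xi, yj - yi)  # assumed Euclidean distance
        weights.append(math.exp(-((d_ij / bandwidth) ** 2)))
    return weights


# Hypothetical planar coordinates in km; stations lie 0, 5, and 10 km from station 0.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
w = gaussian_weights(coords, 0, bandwidth=5.0)
```

The weight is 1 at the regression point itself and decays smoothly with distance, so nearby stations dominate each local fit.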
Point-by-point regression is performed by using the above method to obtain the regression
parameter estimation matrix containing sample points:

$$\beta = \begin{bmatrix} \beta_{10} & \cdots & \beta_{n0} \\ \vdots & \ddots & \vdots \\ \beta_{1p} & \cdots & \beta_{np} \end{bmatrix} \tag{8}$$

Thus, the value of the dependent variable can be estimated as follows:

$$\mathrm{PDC}(i) = X(i)\hat{\beta}(i) \tag{9}$$
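
The point-by-point procedure of Formulas (6) to (9) can be sketched in plain Python as follows. This is an illustrative sketch rather than the software used in the study: the data are hypothetical, the bandwidth is fixed by hand rather than calibrated, and `y` denotes the vector of observed PDC values of all stations. The helper names (`solve`, `local_fit`, `gwr`) are introduced here for illustration only.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def local_fit(X, y, w):
    """Formula (6): beta_hat = (X^T W X)^{-1} X^T W y with W = diag(w)."""
    rows, p = len(X), len(X[0])
    xtwx = [[sum(w[r] * X[r][a] * X[r][b] for r in range(rows)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(w[r] * X[r][a] * y[r] for r in range(rows)) for a in range(p)]
    return solve(xtwx, xtwy)


def gwr(coords, X, y, bandwidth):
    """Fit one local regression per station; returns (betas, fitted values)."""
    betas, fitted = [], []
    for i, (xi, yi) in enumerate(coords):
        # Formula (7): Gaussian weights around station i
        w = [math.exp(-((math.hypot(xj - xi, yj - yi) / bandwidth) ** 2))
             for xj, yj in coords]
        beta_i = local_fit(X, y, w)
        betas.append(beta_i)
        # Formula (9): estimate at station i from its own local coefficients
        fitted.append(sum(b * v for b, v in zip(beta_i, X[i])))
    return betas, fitted


# Hypothetical data: one explanatory variable plus an intercept column,
# generated from y = 1 + 2x so the expected local coefficients are known.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
x_values = [0.2, 0.5, 0.9, 0.4]
X = [[1.0, v] for v in x_values]
y = [1.0 + 2.0 * v for v in x_values]
betas, fitted = gwr(coords, X, y, bandwidth=1.5)
```

Because the toy data are spatially homogeneous, every local fit recovers the same coefficients, which is the case in which the GWR reduces to the OLS model.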

5. Results and Discussion

5.1. Spatial Autocorrelation and Local Collinearity Test


Before using the GWR model to analyze the relationship between the PDC and the independent variables, a collinearity test of the independent variables must be carried out, and one of the collinear variables must be deleted. Table 3 shows the results of the collinearity test. According to the Mathematical dictionary [52], there are three criteria for judging the collinearity of the independent variables: the condition index is more than 10, at least one of the variance ratios is close to 1, and the eigenvalue is close to 0. In the first data row of Table 3, the condition index is more than 10, the variance ratio of the ‘distance to city center’ is 0.86, and the eigenvalue is 0.03, so this independent variable is deleted. The last row of Table 3 is the collinearity test result after deleting the ‘distance to city center’.

Table 3. Collinearity test result.

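| Eigenvalue | Condition Index | Variance ratio: Proportion of WPR | Variance ratio: Distance to City Center | Variance ratio: Undeveloped Land Area | Variance ratio: BC |
|---|---|---|---|---|---|
| 0.03 | 11.58 | 0.17 | 0.86 | 0.23 | 0.50 |
| 0.10 | 5.40 | 0.29 | - | 0.41 | 0.46 |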

Before establishing the GWR model, the Moran I test must be conducted to determine if spatial autocorrelation exists. The results are shown in Table 4. The P-value of each variable is less than 0.05, and each Moran I value lies between −1 and 1. Thus, spatial autocorrelation exists, and the GWR model is suitable for these data.

Table 4. Moran I test results.

| Variable | Moran I | Expectation Index | Mean Value | Z-score | P-value |
|---|---|---|---|---|---|
| Proportion of WPR | 0.099 | −0.012 | −0.011 | 1.788 | 0.044 |
| Undeveloped land area | 0.548 | −0.012 | −0.010 | 8.476 | 0.001 |
| BC | 0.523 | −0.012 | −0.008 | 6.610 | 0.001 |
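
For readers unfamiliar with the statistic, the sketch below computes the global Moran I for a toy example. The values and the binary adjacency weights are hypothetical; the text does not state which spatial weight scheme was used for Table 4. Under the null hypothesis of no spatial autocorrelation, the expected value of the statistic is −1/(n − 1), which is about −0.0115 for n = 88 and therefore of the same magnitude as the Expectation Index column.

```python
def morans_i(values, weights):
    """Global Moran I: (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def expected_morans_i(n):
    """Expected value of Moran I under no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Four hypothetical stations on a line; neighbours share a weight of 1.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1.0, 2.0, 3.0, 4.0], W)       # similar values are adjacent
alternating = morans_i([1.0, 4.0, 1.0, 4.0], W)  # dissimilar values are adjacent
```

A smoothly varying variable gives a positive Moran I, whereas an alternating pattern gives a negative one; values well above the expectation indicate positive spatial autocorrelation.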

5.2. Results

5.2.1. Summary Statistics


The summary statistics of the local coefficients are shown in Table 5. If the Akaike information criterion (AIC) difference between two models is greater than 3, the goodness of fit is considered to be significantly improved [44]; for a small sample, the AIC is replaced by the corrected Akaike information criterion (AICc). In Table 5, the AICc values of the GWR models are more than 67 lower than those of the OLS models. The adjusted R² values are larger for the GWR model (more than 0.8 for both PDCs) than for the OLS model. Cardozo et al. [5] used the GWR model to forecast ridership at the metro station level with nine variables and obtained an adjusted R² value of 0.7. Thus, the adjusted R² values in this paper are acceptable. The coefficients of the undeveloped land area and BC are greater than 0, indicating that these variables have positive influences on the PDC. For the PDCa, the coefficient of the proportion of WPR is less than 0, demonstrating that the variable has a negative influence on the PDC, whereas for the PDCb, the coefficient is near 0.

Table 5. Statistics of the local coefficients.

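| Model | Dependent Variable | Corrected Akaike Information Criterion (AICc) | Adjusted R² | Proportion of WPR | Undeveloped Land Area | BC |
|---|---|---|---|---|---|---|
| Ordinary least squares (OLS) | PDCb | 700.068 | 0.522 | −0.777 | 1.433 | 14.038 |
| Ordinary least squares (OLS) | PDCa | 643.217 | 0.750 | −14.110 | 2.013 | 23.630 |
| Geographically weighted regression (GWR) | PDCb | 607.986 | 0.860 | −0.611 | 0.752 | 8.955 |
| Geographically weighted regression (GWR) | PDCa | 575.484 | 0.900 | −8.184 | 1.419 | 15.182 |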
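
The model comparison can be reproduced from the AICc values in Table 5 with a few lines of Python. The `aicc` helper shows the generic small-sample correction for a model with k estimated parameters and n observations; this is an assumption for illustration, since GWR software commonly uses a variant based on the trace of the hat matrix and the text does not say which form was applied.

```python
def aicc(aic, n, k):
    """Generic small-sample correction: AICc = AIC + 2k(k + 1) / (n - k - 1)."""
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


# AICc values copied from Table 5.
aicc_table5 = {
    "PDCb": {"OLS": 700.068, "GWR": 607.986},
    "PDCa": {"OLS": 643.217, "GWR": 575.484},
}

# Reduction in AICc when moving from OLS to GWR, per dependent variable.
improvement = {dep: round(v["OLS"] - v["GWR"], 3) for dep, v in aicc_table5.items()}

# The text treats a difference greater than 3 as a significant improvement.
significant = all(diff > 3 for diff in improvement.values())
```

The smaller of the two reductions, for the PDCa, is the source of the statement that the AICc decreases by more than 67.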

5.2.2. Spatial Distributions of the Coefficients


The spatial distributions of the regression coefficients are presented in Figure 11, from which it is
evident that the spatial distributions of the same coefficients exhibit differences for different PDCs.
For the proportion of WPR, the regression coefficients of PDCb have both positive and negative
numbers, and the values are small, ranging from −2.5 to 1.0. This demonstrates that this variable does
not have a dominant influence on the PDC. For the spatial distribution of the proportion of WPR, the
coefficients are close to 0.0 in the center and the northeast of the city, the coefficients are less than 0.0 in
the north, south, and east of the city, and the coefficients are greater than 0.0 in the west of the city.
In the northeast of the city, the land has not been maturely developed. In the north, south, and east
of the city, there are three sub-centers near the stations 2#FCWL, 2#XZ, and 1#FZC. To the northwest
of Xi’an, there is another city—Xianyang. The two cities are so close that the linear distance from
1#HWZ to the city center of Xianyang is only about 10 km. The office space in this area is uncertain,
and the influences of the proportions of WPR on these stations are small. The regression coefficients of
PDCa are all negative, indicating that the proportion of WPR has a negative influence on the PDC. The
regression coefficients are higher in the city center and lower in the periphery. Because the regression
coefficients are all negative, the influence increases from the center to the periphery.
For the undeveloped land area, the range of the PDCa values is slightly larger than that of the
PDCb values. The spatial distributions are approximately the same, and the larger values occur in the
periphery. By comparing Figures 9 and 11c,d, it is clear that there are few stations in Metro Line 2
(the north–south line) that have undeveloped land, but the regression coefficients are substantially
different. For the stations in the southeast and northeast of the city with large undeveloped land areas,
the regression coefficients are still substantially different. This indicates that undeveloped land areas
have uncertain influences on both the PDCb and PDCa values, which is consistent with the fact that
trips to undeveloped land occur relatively seldom.

Figure 11. Spatial distribution of regression coefficients: (a) proportion of WPR of PDCb; (b) proportion of WPR of PDCa; (c) undeveloped land area of PDCb; (d) undeveloped land area of PDCa; (e) BC of PDCb; (f) BC of PDCa.

For the BC, the ranges of values for the PDCb and the PDCa are similar. Most values are positive, indicating that this variable has a positive influence on the PDC. The coefficients of the variable are smaller in the city center and larger in the periphery, which means that the BC has a greater influence on diverging a station’s peak hour from the city’s peak hour in suburban areas.

5.2.3. Station Classification


Because the undeveloped land area has an uncertain influence on the PDC, the k-means method
was used to classify the stations whose undeveloped land area is 0. The classifying variables are the
proportion of WPR, BC, PDCb, and PDCa, and the results are presented in Table 6 and Figure 12.
The PDCb and PDCa values in the 1st, 3rd, and 5th kinds of stations are all less than 1.17, and the
proportions of WPR in these kinds of stations are all greater than 0.5, indicating that land for work,
primary and middle schools, and residences makes up most of the land around these stations. However, the
BC values in these kinds of stations are different. The BC values in the 1st kind of station are greater
than 0.35, indicating that these stations are in important positions of the metro network; most of the
BC values in the 3rd and 5th kinds of stations are less than 0.3, and fall within a wide range. The
PDCa values in the 4th kind of station are greater than 1.13, and the PDCb values are very large. The
values of the proportion of WPR in the 4th kind of station are less than 0.5, indicating that other land
uses make up most of the land around these stations. The BC values in this kind of station are within a large range
from 0 to 0.2. For the 2nd kind of station, the proportion of WPR is greater than 0.8, and the BC is
about 0.1, meaning that these stations are commuting stations and not in very important areas. The
PDCb values are close to 1, but the PDCa values are the greatest among all the PDCa values. There are
two stations of the 2nd kind, namely 1#KYM and 4#DMG. 1#KYM has a large amount of industrial
land and many residential districts established by a factory manager, and these lands do not produce
medium- or long-distance travel. 4#DMG is near cultural relics and historic sites, including Daming
Palace National Heritage Park, but this area is not as developed as the Greater Wild Goose Pagoda
Square, and the station only sees 276 people during the city’s peak hour and 340 people during the
station’s peak hour.
Thus, the proportion of WPR has the greatest influence on the PDC. If the proportion of WPR is
less than 0.5, the PDCb and PDCa values are both very large; if the proportion of WPR is greater than
0.5, most PDCb and PDCa values are close to 1.
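
As a worked example, the PDC of the 4#DMG station can be recomputed from the two ridership figures quoted above. The sketch assumes the PDC is the ratio of the ridership in the station’s peak hour to the ridership in the city’s peak hour, which follows from its use as a multiplier on the city’s peak hour ridership. The text does not say whether the 276 and 340 figures are boarding or alighting counts.

```python
def peak_deviation_coefficient(station_peak_ridership, city_peak_ridership):
    """PDC = ridership in the station's own peak hour / ridership in the city's peak hour."""
    if city_peak_ridership <= 0:
        raise ValueError("city peak hour ridership must be positive")
    return station_peak_ridership / city_peak_ridership


# Figures quoted in the text for 4#DMG: 340 (station's peak hour), 276 (city's peak hour).
pdc_dmg = peak_deviation_coefficient(340, 276)
pdc_dmg_rounded = round(pdc_dmg, 3)
```

Rounded to three decimals, the result is 1.232, which coincides with the minimum PDCa listed for the 2nd kind of station in Table 6.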

Table 6. Summary statistics of the local coefficients.

| Variable | Statistic | 1st Kind | 2nd Kind | 3rd Kind | 4th Kind | 5th Kind |
|---|---|---|---|---|---|---|
| PDCb | Mean | 1.171 | 1.006 | 1.010 | 3.635 | 1.036 |
| PDCb | Max | 1.309 | 1.008 | 1.034 | 3.990 | 1.186 |
| PDCb | Min | 1.001 | 1.004 | 1.001 | 3.281 | 1.000 |
| PDCa | Mean | 1.051 | 1.280 | 1.025 | 1.230 | 1.014 |
| PDCa | Max | 1.143 | 1.328 | 1.101 | 1.323 | 1.047 |
| PDCa | Min | 1.004 | 1.232 | 1.000 | 1.137 | 1.004 |
| Proportion of WPR | Mean | 0.759 | 0.871 | 0.854 | 0.410 | 0.685 |
| Proportion of WPR | Max | 0.786 | 0.936 | 0.995 | 0.503 | 0.739 |
| Proportion of WPR | Min | 0.721 | 0.805 | 0.777 | 0.317 | 0.575 |
| BC | Mean | 0.405 | 0.094 | 0.143 | 0.113 | 0.160 |
| BC | Max | 0.457 | 0.108 | 0.248 | 0.225 | 0.323 |
| BC | Min | 0.367 | 0.079 | 0.000 | 0.000 | 0.045 |
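
The k-means classification step can be illustrated with a short Python sketch. Everything in it is hypothetical: six made-up stations described by two features (proportion of WPR and BC) are grouped into three clusters from fixed starting centroids. The study itself uses four classifying variables and five kinds of stations, and the text does not state the initialization or whether the variables were standardized.

```python
def kmeans(points, init_centroids, max_iter=100):
    """Plain k-means with squared Euclidean distance and fixed initial centroids."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical stations as (proportion of WPR, BC) pairs.
stations = [
    (0.90, 0.10), (0.85, 0.12),  # commuting land, low BC
    (0.40, 0.10), (0.35, 0.15),  # mostly other land
    (0.75, 0.40), (0.78, 0.42),  # commuting land, high BC
]
labels, centroids = kmeans(stations, init_centroids=[stations[0], stations[2], stations[4]])
```

With these well-separated toy points the algorithm converges after one update, and each pair of similar stations ends up in its own cluster.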

Figure 12. The values of variables for different kinds of stations: (a) PDCb values; (b) PDCa values; (c) proportions of WPR; (d) BC values.

5.3. Discussion—Future Station Design and Policy


The enlargement coefficient put forward in this paper, PDC, can be used as a simple way to convert the ridership during a city’s peak hour to the ridership during a station’s peak hour. The proportion of WPR was found to have a negative influence on the PDCa value. In other words, the larger the share of land for work, primary and middle schools, and residences, the smaller the deviation of ridership between a station’s peak hour and the city’s peak hour, as commuting trips on workdays constitute the city’s peak hour. This result is consistent with the results of previous studies [8,53] that investigated the metro ridership in Osaka, Shanghai, and Zhengzhou, and found that trips to work and to school make the station’s peak hour earlier, while shopping and traveling trips delay the station’s peak hour.
If the proportion of WPR of a station is greater than 0.5, it can be considered that the ridership during the city’s peak hour is the highest ridership of the whole day; if it is less than 0.5, the highest ridership is the ridership during the city’s peak hour multiplied by the PDC. This result is consistent with the findings of Yu [41], who examined two cities (Xi’an and Chongqing) and found that the PDC value of most metro stations is close to 1 when the proportion of WPR is greater than 0.5.
In the morning, the proportion of WPR has more influence on the alighting ridership than on the boarding ridership. For stations with a proportion of WPR greater than 0.5 but with a special type of land, such as the 1#KYM and 4#DMG stations, the PDCb value is close to 1, but the PDCa value is among the greatest of all stations. Moreover, the regression coefficients of PDCa are all negative, whereas the regression coefficients of PDCb include both positive and negative values. This means that a higher proportion of WPR concentrates the alighting ridership in the city’s peak hour, but does not have a clear effect on the boarding ridership. It indicates that land for work, primary and middle schools, and residences has more explanatory power regarding the peak hour deviation of the alighting ridership during morning peak hours. Land for work and for primary and middle schools mostly attracts office workers and students who need to arrive on time or ahead of schedule. Compared with their boarding behavior, their alighting behavior has a relatively clear arrival time. For different enterprises or schools, this time is almost the same, and it coincides with the city’s peak hour. However, the boarding times differ widely because of the varying distances between home and work. In China, administrative land is more concentrated than residential land [54], which leads to a concentrated distribution of commuting alighting passenger flow.
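
The 0.5 threshold rule stated above can be written as a small helper. This is a sketch of the rule of thumb only: the function name and the input numbers are hypothetical, a proportion of exactly 0.5 is treated like the lower case because the text does not cover it, and stations with special land types such as 1#KYM and 4#DMG are exceptions that the helper does not model.

```python
def highest_daily_ridership(city_peak_ridership, pdc, wpr_share, threshold=0.5):
    """Estimate the highest hourly ridership of the day from the city's peak hour ridership."""
    if wpr_share > threshold:
        # Commuting land dominates: the city's peak hour is taken as the station's peak.
        return city_peak_ridership
    # Otherwise scale the city's peak hour ridership by the peak deviation coefficient.
    return city_peak_ridership * pdc


# Hypothetical stations with the same ridership in the city's peak hour.
commuting_station = highest_daily_ridership(1000, pdc=1.05, wpr_share=0.8)
mixed_use_station = highest_daily_ridership(1000, pdc=1.8, wpr_share=0.4)
```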
The local coefficient of BC is greater than 0, meaning it has an influence on the deviation of the
peak hours of the station and city. In terms of spatial distribution, the coefficient of BC is greater in suburban areas,
indicating that the BC has a greater influence on diverging a station’s peak hour from the city’s peak
hour in suburban areas. This is because commuters who live in suburban areas and work in the
city center need to leave earlier. This finding is reasonable, as housing prices in suburban areas are
lower; people are more willing to live in these areas, thus increasing their time spent on the metro and
affecting the station’s peak hour. This evidence is consistent with the results of previous studies [43],
which found that more people want to live in suburban areas, resulting in the increased metro travel
demands of these areas.
In China, station design must consider the extra peak hour passenger flow, which is the predicted
peak hour passenger flow multiplied by the extra peak hour factor, which is between 1.1 and 1.4 [2].
The extra peak hour factor (EPHF) is the highest fifteen-minute ridership during the city’s peak hour
multiplied by 4, and then divided by the ridership during the city’s peak hour [55]. The EPHF is the
expanded threshold of the station’s capacity in the city’s peak hour. However, this study shows that some stations’ own peak hours are inconsistent with the city’s peak hour because of the land use around the stations and their BC values, so that peak load shifting occurs. As a result, the EPHF does not constrain the capacities of these stations with shifted peaks. Thus, this paper puts forward the PDC to describe this inconsistency of the station’s peak hour. Although the EPHF and the PDC both reflect the temporal distribution of ridership at metro stations, they are entirely different. The EPHF reflects the concentration of passenger flow within a city’s peak hour, while the PDC reflects the inconsistency between a station’s peak hour and the city’s peak hour. The two coefficients are not comparable. Moreover, the reasoning behind the EPHF and the PDC is completely different. For example, stations with large proportions of administrative land usually have
high EPHF values because of the instantaneous gathering of commuting ridership [56]. But the greater
commuting ridership results in the high consistency between the station’s peak hour and that of the
city, leading to a PDC value close to 1. By contrast, stations with large proportions of commercial land
usually have low EPHF values because of the randomness of the shopping flow [57]. However, the
peak hour of shopping flow lags behind the city’s peak hour, leading to the station’s peak hour being
highly inconsistent with the city’s, and this increases the PDC value. Thus, there are two ridership
options when designing stations, namely the extra peak hour ridership during a city’s peak hour and
ridership during a station’s peak hour.
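
The EPHF definition given above can be sketched as follows. The four 15-minute counts are hypothetical and stand for the ridership of one station during the city’s peak hour.

```python
def extra_peak_hour_factor(quarter_hour_counts):
    """EPHF = 4 * (highest 15-minute ridership) / (ridership in the city's peak hour)."""
    if len(quarter_hour_counts) != 4:
        raise ValueError("expected four 15-minute counts covering the city's peak hour")
    hourly = sum(quarter_hour_counts)
    if hourly <= 0:
        raise ValueError("peak hour ridership must be positive")
    return 4 * max(quarter_hour_counts) / hourly


# Hypothetical counts: the busiest quarter hour carries 300 of 1000 passengers.
ephf_peaked = extra_peak_hour_factor([200, 300, 250, 250])
# Perfectly even arrivals give the lower bound of 1.0.
ephf_even = extra_peak_hour_factor([250, 250, 250, 250])
```

The factor measures how unevenly passengers arrive within the city’s peak hour; it says nothing about whether the station’s own peak falls in a different hour, which is what the PDC captures.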
The morning boarding and alighting PDC and EPHF values of Xi’an metro stations are shown in
Figures 13 and 14. The fluctuations of the PDC and the EPHF values are different. For the boarding
ridership, the EPHF values change from 1.1 to 1.5, while most PDC values are about 1.0. However, five
stations’ PDC values are greater than 1.8, and nine stations’ PDC values are greater than their EPHF
values. For the alighting ridership, the EPHF values change from 1.0 to 1.8, while most PDC values
are about 1.0, but fifteen stations’ PDC values are greater than their EPHF values. Therefore, it cannot be determined in advance which is larger: the extra peak hour passenger flow during a city’s peak hour or the ridership during a station’s peak hour. If only one kind of coefficient is used to design the stations, some stations will be undersized. Thus, when a station is designed, both types of ridership (the extra peak hour ridership during a city’s peak hour and the ridership during a station’s peak hour) must be calculated, and the larger of the two is used to design the station.
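
The design rule can be expressed as a short calculation. In this sketch the extra peak hour ridership is the city’s peak hour ridership multiplied by the EPHF, and the station’s peak hour ridership is the city’s peak hour ridership multiplied by the PDC; the function name and the input numbers are hypothetical.

```python
def design_ridership(city_peak_ridership, ephf, pdc):
    """Return the larger of the two candidate design ridership values."""
    extra_peak = city_peak_ridership * ephf    # extra peak hour ridership, city's peak hour
    station_peak = city_peak_ridership * pdc   # ridership in the station's own peak hour
    return max(extra_peak, station_peak)


# Hypothetical commuting station: the EPHF governs the design.
commuting = design_ridership(1000, ephf=1.3, pdc=1.02)
# Hypothetical station with a shifted peak: the PDC governs the design.
shifted = design_ridership(1000, ephf=1.3, pdc=1.8)
```

Because both candidates share the same base ridership, the comparison reduces to taking the larger of the EPHF and the PDC for each station.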

Figure 13. The morning boarding PDC and EPHF values of Xi’an metro stations.

Figure 14. The morning alighting PDC and EPHF values of Xi’an metro stations.

The land development in the center of Xi’an has been completed, and the city is now faced with land replacement and land development in urban areas. Land replacement refers to moving industrial land from the city center to urban areas and changing the land to other types of property. Non-commuting land can mitigate the traffic pressure of a city’s peak hour, because its travel peak is different from that of commuting land. However, the mixed land use ratio must be considered. More non-commuting land around metro stations will result in the deviation of their peak hours from those of the city. This will increase the difficulty of both passenger flow forecasting in the planning stage and train departure intervals and station management in the operation stage. Because there is only one train departure interval in a period of time in a section of a metro line, if a station’s peak hour is different from that of other stations, the train departure interval will be greater in this station’s peak hour, and passengers will pile up. However, less non-commuting land around metro stations will result in a large extra peak hour passenger flow and a small off-peak hour passenger flow. Although a significant amount of money may be spent on building a large station, it may be nearly empty for most of the day. Thus, the proportion of commuting land around metro stations must be considered in land planning.

6. Conclusions
In this study, we investigated the differences in the ridership of metro stations between the stations’ peak hours and the city’s peak hours for 88 metro stations in Xi’an. The enlargement coefficient put forward in this paper, PDC, can be used as a simple way to convert the ridership during a city’s peak hour to the ridership during a station’s peak hour. GWR was used to determine the influencing factors on the ridership volume and specifically on the PDC. The key findings are as follows:
The proportion of WPR is found to have a negative influence on the PDCa value. In the morning, the proportion of WPR has more influence on the alighting ridership than on the boarding ridership. If the proportion of WPR of a station is greater than 0.5, it can be considered that the ridership during the city’s peak hour is the highest ridership of the whole day; if it is less than 0.5, the highest ridership is the ridership during the city’s peak hour multiplied by the PDC.
The BC has an influence on the deviation of the peak hours of the station and city, and the coefficient of BC is greater in suburban areas, indicating that the BC has a greater influence on diverging a station’s peak hour from the city’s peak hour in suburban areas.
Thus, if a metro station is primarily surrounded by commuting land (such as if the proportion
of commuting land area is greater than 0.5 in Xi’an), does not have special land (such as an external
transportation hub), and is located in the city center, its PDC is close to 1.
When designing a metro station, there are two ridership options, namely the extra peak hour ridership during a city’s peak hour and the ridership during a station’s peak hour. Which of the two is larger cannot be determined in advance, so the larger of the two is used to design the station. However, stations that meet the conditions in the previous paragraph only need to consider the extra peak hour factor, as their PDC values are close to 1.
When performing urban land use planning, the mixed land use ratio must be considered.
Non-commuting land can mitigate the traffic pressure of a city’s peak hour, but at the same time more
non-commuting land around metro stations will result in the deviation of their peak hours from those
of the city. This will increase the difficulty of both passenger flow forecasting in the planning stage and
train departure intervals and station management in the operation stage.
The transfer passenger volume cannot be counted directly by the automatic fare collection system,
and its correctness cannot be verified. Thus, this research considered only the boarding and alighting
volumes and did not account for transfer passengers at interchange stations. An inconsistent peak
hour phenomenon also occurs for transfer passengers and would influence the design of transfer
channels; this topic will be investigated in future research.

Author Contributions: Conceptualization, L.Y.; methodology, Y.C.; software, Y.C.; validation, L.Y.; formal analysis,
L.Y. and K.C.; investigation, L.Y.; resources, K.C.; data curation, Y.C.; writing—original draft preparation, L.Y.;
writing—review and editing, Y.C.; visualization, Y.C.; funding acquisition, K.C. All authors have read and agreed
to the published version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China, grant number 71871027.
Acknowledgments: We would like to acknowledge the anonymous reviewers and the authors of the cited papers
for their detailed comments, without which this work would not have been possible.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Kittelson, J. Associates, Parsons Brinckerhoff, KFH Group. Transit Capacity and Quality of Service Manual, 3rd ed.;
Transportation Research Board: Washington, DC, USA, 2013.
2. GB50157-2013. Code for Design of Metro. Available online: http://www.jianbiaoku.com/webarbs/book/1027/
1073295.shtml (accessed on 13 February 2020).
3. Ma, C.Q.; Wang, Y.P. Traffic volume forecast for xi’an urban rapid rail transit. Urban Rapid Rail Transit 2006,
19, 24–28.
4. Mwakalonge, J.L. Econometric Modeling of Total Urban Travel Demand Using Data Collected in Single and
Repeated Cross-Sectional Surveys. Ph.D. Thesis, Tennessee Technological University, Cookeville, TN, USA,
May 2010.
5. Cardozo, O.D.; Garcia-Palomares, J.C.; Gutierrez, J. Application of geographically weighted regression to the
direct forecasting of transit ridership at station-level. Appl. Geogr. 2012, 34, 548–558. [CrossRef]
Sustainability 2020, 12, 2255 20 of 22

6. Michael, G.M. The Four Step Model. In Handbook of Transport Modeling; Emerald Group Publishing Limited:
Bingley, UK, 2007.
7. Cui, Z.J. On Scale of Subway Station Platform. J. Southwest Jiaotong Univ. 1993, 3, 76–80.
8. Gu, L.P.; Ye, X.F. Study on the in and out passenger flow during peak hours of the rail transit station in Osaka.
Compr. Transp. 2014, 2, 57–61.
9. GB/T51150-2016. Code for Prediction of Urban Rail Transit Ridership. Available online: http://www.zzguifan.
com/webarbs/book/92970/3172164.shtml (accessed on 13 February 2020).
10. Tobin, R.L.; Friesz, T.L. Sensitivity analysis for equilibrium network flow. Transp. Sci. 1988, 22, 100–105.
[CrossRef]
11. Vovsha, P. Application of cross-nested logit model to mode choice in Tel Aviv, Israel, metropolitan area.
Transp. Res. Rec. 1997, 1607, 6–15. [CrossRef]
12. Sun, X.; Wilmot, C.G.; Kasturi, T. Household travel, household characteristics, and land use: An empirical
study from the 1994 Portland activity-based travel survey. Transp. Res. Rec. 1998, 1617, 10–17. [CrossRef]
13. Yam, R.C.M.; Whitfield, R.C.; Chung, R.W.F. Forecasting traffic generation in public housing estates. J. Transp.
Eng. 2000, 126, 358–361. [CrossRef]
14. Rezaeestakhruie, H. Analytical error propagation in four-step transportation demand models. Comput. Sci.
2017. [CrossRef]
15. Sanko, N.; Morikawa, T.; Nagamatsu, Y. Post-project evaluation of travel demand forecasts: Implications
from the case of a Japanese railway. Transp. Policy 2013, 27, 209–218. [CrossRef]
16. Ryu, S.; Chen, A.; Zhang, H.M. Path flow estimator for planning applications in small communities. Transp.
Res. Part A Policy Pract. 2014, 69, 212–242. [CrossRef]
17. Cervero, R.; Murakami, J.; Miller, M. Direct ridership model of bus rapid transit in Los Angeles County,
California. Transp. Res. Rec. 2010, 2145, 1–7. [CrossRef]
18. Feng, X.; Sun, Q.; Liu, J. Time Characteristic of Input Passenger in Urban Rail Transit Stations Among High
Density Residential Areas. In Proceedings of the Chinese Control Conference, Beijing, China, 29–31 July
2010.
19. Dend, J.; Xu, M. Characteristics of subway station ridership with surrounding land use: A case study in
Beijing. In Proceedings of the 2015 International Conference on Transportation Information and Safety
(ICTIS), Wuhan, China, 25–28 June 2015.
20. Chen, C.; Chen, J.; Barry, J. Diurnal pattern of transit ridership: A case study of the New York City subway
system. J. Transp. Geogr. 2009, 17, 176–186. [CrossRef]
21. Chan, S.; Miranda-Moreno, L. A station-level ridership model for the metro network in Montreal, Quebec.
Can. J. Civ. Eng. 2013, 40, 254–262. [CrossRef]
22. Choi, J.; Lee, Y.J.; Kim, T. An analysis of Metro ridership at the station-to-station level in Seoul. Transportation
2012, 39, 705–722. [CrossRef]
23. Sung, H.; Oh, J.T. Transit-oriented development in a high-density city: Identifying its association with transit
ridership in Seoul, Korea. Cities 2011, 28, 70–82. [CrossRef]
24. He, Y.X.; Zhao, Y.; Tsui, K.L. Modeling and analyzing spatiotemporal factors influencing metro station
ridership in taipei: An approach based on general estimating equation. arXiv 2019, arXiv:1904.01280.
25. Roos, J.; Bonnevay, S.; Gavin, G. Short-Term Urban Rail Passenger Flow Forecasting: A Dynamic Bayesian
Network Approach. In Proceedings of the 2016 15th IEEE International Conference on Machine Learning
and Applications (ICMLA), Anaheim, CA, USA, 18–20 December 2016.
26. Haire, A.R. A methodology for incorporating fuel price impacts into short-term transit ridership forecasts.
Ph.D. Thesis, The University of Texas at Austin, Austin, TX, UAS, May 2009.
27. Ma, X.L.; Zhang, J.Y.; Du, B. Parallel Architecture of Convolutional Bi-Directional LSTM Neural Networks
for Network-Wide Metro Ridership Prediction. IEEE Trans. Intell. Transp. Syst. 2018, 99, 1–11. [CrossRef]
28. Li, B. Research on the Computer Algorithm Application in Urban Rail Transit Holiday Passenger Flow
Prediction. In Proceedings of the 2016 International Conference on Network and Information Systems for
Computers, Wuhan, China, 15–17 April 2016.
29. Yang, R.; Wu, B. Short-term passenger flow forecast of urban rail transit based on BP neural network. In
Proceedings of the Intelligent Control & Automation, Jinan, China, 7–9 July 2010.
30. Shen, J.Y. Simplified calculation for the width of on and off region of station platform. Urban Rapid Rail
Transit 2008, 05, 9–12.
Sustainability 2020, 12, 2255 21 of 22

31. Chen, K.M.; Yu, L.J.; Ma, C.Q. Differentiated peak hours at urban rail transit stations in Xi’an. Urban Transp.
China 2018, 16, 51–58.
32. Ping, S.H. Characteristics of temporal passenger flow distribution at different stations on shenzhen metro
line 1. Urban Mass Transit 2018, 21, 85–87.
33. Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically weighted regression: A method for
exploring spatial non stationarity. Geogr. Anal. 1996, 28, 281–298. [CrossRef]
34. Chen, E.; Ye, Z.; Wang, C. Discovering the spatio-temporal impacts of built environment on metro ridership
using smart card data. Cities 2019, 95, 102359. [CrossRef]
35. Hadayeghi, A.; Shalaby, A.S.; Persaud, B.N. Development of planning level transportation safety tools using
geographically weighted poisson regression. Accid. Anal. Prev. 2010, 42, 676–688. [CrossRef] [PubMed]
36. Clark, S.D. Estimating local car ownership models. J. Transp. Geogr. 2007, 15, 184–197. [CrossRef]
37. Zhao, F.; Park, N. Using geographically weighted regression models to estimate annual average daily traffic.
J. Transp. Res. Board 2004, 1879, 99–107. [CrossRef]
38. Qian, X.; Ukkusuri, S.V. Spatial variation of the urban taxi ridership using GPS data. Appl. Geogr. 2015, 59,
31–42. [CrossRef]
39. Paez, A. Exploring contextual variations in land use and transport analysis using a probit model with
geographical weights. J. Transp. Geogr. 2006, 14, 167–176. [CrossRef]
40. Blainey, S.P.; Preston, J.M. A geographically weighted regression based analysis of rail commuting around
Cardiff, South Wales. In Proceedings of the 12th World Conference on Transportation Research, Lisbon,
Portugal, 11–15 July 2010.
41. Yu, L.; Chen, Q.; Chen, K. Deviation of peak hours for urban rail transit stations: A case study in Xi’an, China.
Sustainability 2019, 11, 2733. [CrossRef]
42. Kong, X.; Yang, J. A new method for forecasting station-level transit ridership from land-use perspective:
The case of shenzhen city. Sci. Geogr. Sin. 2018, 38, 2074–2083. [CrossRef]
43. Zhao, J.; Deng, W.; Song, Y. Analysis of Metro ridership at station level and station-to-station level in Nanjing:
An approach based on direct demand models. Transportation 2014, 41, 133–155. [CrossRef]
44. Zhu, Y.; Chen, F.; Wang, Z. Spatio-temporal analysis of rail station ridership determinants in the built
environment. Transportation 2019, 46, 2269–2289. [CrossRef]
45. Sung, H.; Choi, K.; Lee, S. Exploring the impacts of land use by service coverage and station-level accessibility
on rail transit ridership. J. Transp. Geogr. 2014, 36, 134–140. [CrossRef]
46. García, P.A. Several determinant factors of the secondhand housing price: An application of the hedonic
methodology. Rev. Estud. Reg. 2008, 82, 135–158.
47. Gutiérrez, J.; Cardozo, O.-D.; García-Palomares, J.-C. Transit ridership forecasting at station level: An
approach based on distance-decay weighted regression. J. Transp. Geogr. 2011, 19, 1081–1092. [CrossRef]
48. Report 16 Transit and Urban Form. Available online: http://onlinepubs.trb.org/onlinepubs/tcrp/tcrp_rpt_16-
2.pdf (accessed on 13 February 2020).
49. Fotheringham, A.S.; Brunsdon, C.; Charlton, M.E. Geographically Weighted Regression: The Analysis of Spatially
Varying Relationships; Wiley: Hoboken, NJ, USA, 2002.
50. Brunsdon, C.; Fotheringham, A.S.; Charlton, M. Geographically weighted summary statistics-a framework
for localised exploratory data analysis. Comput. Environ. Urban Syst. 2002, 26, 501–524. [CrossRef]
51. Hanham, R.; Hoch, R.J.; Spiker, J.S. The Spatially Varying Relationship Between Local Land-Use Policies and
Urban Growth: A Geographically Weighted Regression Analysis. In Planning and Socioeconomic Applications;
Springer: Berlin/Heidelberg, Germany, 2009.
52. Wang, Y.; Wen, L.; Chen, M.F. Mathematical Dictionary; Science Press: Beijing, China, 2010.
53. Zhao, X.F.; Tong, X.J. Characteristic analysis of temporal and spatial distributions of passengers on zhengzhou
metro line 1. Urban Mass Transit 2017, 20, 75–79.
54. Ma, X.; Liu, C.; Wen, H.; Wang, Y.; Wu, Y.J. Understanding commuting pattern using transit smart card data.
J. Transp. Geogr. 2017, 58, 135–145. [CrossRef]
55. Bassan, S. Modeling of peak hour factor on highways and arterials. KSCE J. Civ. Eng. 2013, 17, 224–232.
[CrossRef]
Sustainability 2020, 12, 2255 22 of 22

56. Jin, Y. Characteristics of peak hour passenger flow at rail transit stations in shanghai. Urban Transp. China
2019, 17, 50–57.
57. He, J. Study on the configuration quantity of automatic ticket checker in urban rail transit. Railw. Signal.
Commun. 2008, 44, 14–17.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
