AI - Reassessment Question - Final


Learning outcomes

1. Improved your understanding of some AI methods for prediction.


2. Gained more practical skills in applying AI techniques to real-world problems.

Specification
Overview
The aim of this reassessment coursework is to develop a framework, with several appropriate
models of your choice (see below for some possible models), to predict train delays for a chosen
route, e.g. Norwich to London Liverpool Street.
The historical data of the train services for this journey can be obtained from the database in
the same way as described in the normal coursework, via the module’s blackboard.

Description
The basic task is to write programs in Python (or any other programming language) to
implement a train delay prediction framework.
The task can be divided into a few sub-tasks. For the given data,
(1) to write a program to pre-process the data, for example by filtering out unfinished journeys.
(2) to detect the number of stops that a train journey has from the train running data. For
example, from Norwich to London Liverpool Street (LST), the intercity train services may
have different numbers of stops, at different stations, but the train running data (like a
timetable) should contain this information clearly. You can then write a program or method
to detect them.
(3) to define and extract some features from the train data in order to transform the raw data
into a structured representation with input variables and a target variable.
Below are some possible input variables you may define and use,
(i) the deviation of the departure time (i.e. actual departure time - timetabled departure time),
(ii) time: such as peak-time or off-peak time
(iii) Day: Monday to Sunday, or weekday or weekend.
(iv) train-ahead: on time or delayed.
The target output variable could be the deviation of the arrival time of a train at a given station,
i.e. the actual arrival time - the scheduled arrival time: ta(i, j) − ts(i, j), where, ta(i, j) is the
arrival time of train i at station j; ts(i, j) is the scheduled arrival time of train i at station j.
(4) to build some prediction models: there are different ways to represent the train delay
prediction problem for a detected or given journey.
To map the prediction problem onto a model such as a neural network (Multi-layer
Perceptron), a deep learning neural network, a regression model or a Bayesian model, you need to
decide on the inputs and outputs.
One simple way is to build a model for each station to predict the arrival time of a train there.
You will then have exactly as many predictive models as the train journey has stops.
(5) to test the prediction framework using the data given and assess the accuracy of its predictions.
You should consider using one or more commonly used measures, such as Mean Squared
Error (MSE), Root Mean Squared Error (RMSE), or R-squared (R2).
(6) to write a program within your framework to visualise the train timetable and your predictions
(if you can) with a train graph similar to the graph (the left one) shown on this web site:
https://observablehq.com/d/6d71c687bc5169dc
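
As an illustration of sub-tasks (1) and (3), the following is a minimal sketch in Python of how
raw records might be filtered and turned into input variables and a target variable. The record
layout (date, sched_dep, actual_dep, sched_arr, actual_arr), the peak-time windows and the sample
values are all assumptions made for this example; the real database fields may differ. It also
assumes that a journey does not cross midnight.

```python
from datetime import datetime

# Hypothetical record layout and made-up values; the real database may differ.
SAMPLE_ROWS = [
    {"date": "2023-03-06", "sched_dep": "07:30", "actual_dep": "07:34",
     "sched_arr": "09:25", "actual_arr": "09:33"},
    {"date": "2023-03-11", "sched_dep": "13:00", "actual_dep": "13:00",
     "sched_arr": "14:55", "actual_arr": None},  # unfinished journey
    {"date": "2023-03-12", "sched_dep": "13:00", "actual_dep": "12:59",
     "sched_arr": "14:55", "actual_arr": "14:53"},
]


def minutes_between(date, start, end):
    """Return (end - start) in minutes for two HH:MM times on the same date."""
    fmt = "%Y-%m-%d %H:%M"
    t0 = datetime.strptime(f"{date} {start}", fmt)
    t1 = datetime.strptime(f"{date} {end}", fmt)
    return (t1 - t0).total_seconds() / 60.0


def is_peak(hhmm):
    """Assumed peak windows: 06:30-09:30 and 16:00-19:00."""
    hours, minutes = map(int, hhmm.split(":"))
    total = hours * 60 + minutes
    return 390 <= total < 570 or 960 <= total < 1140


def extract_features(rows):
    """Drop unfinished journeys and build one example per remaining row."""
    examples = []
    for row in rows:
        if not row["actual_dep"] or not row["actual_arr"]:
            continue  # sub-task (1): filter out unfinished journeys
        weekday = datetime.strptime(row["date"], "%Y-%m-%d").weekday()
        examples.append({
            # input (i): actual departure time - timetabled departure time
            "dep_deviation": minutes_between(
                row["date"], row["sched_dep"], row["actual_dep"]),
            # input (ii): peak or off-peak, based on the timetabled departure
            "peak": int(is_peak(row["sched_dep"])),
            # input (iii): weekday (0) or weekend (1)
            "weekend": int(weekday >= 5),
            # target: actual arrival time - scheduled arrival time
            "arr_deviation": minutes_between(
                row["date"], row["sched_arr"], row["actual_arr"]),
        })
    return examples


examples = extract_features(SAMPLE_ROWS)
for example in examples:
    print(example)
```

Deviations are in minutes, and a negative value means the train ran early. The train-ahead
variable (iv) is left out here because it needs the records of the preceding service.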

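The accuracy measures named in sub-task (5) can be computed directly from the observed and
predicted arrival deviations. The sketch below uses plain Python so that the definitions are
visible; the function names and the sample values are made up for illustration, and a library
implementation could be used instead.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE, in minutes."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up arrival deviations in minutes (observed vs. predicted).
y_true = [2.0, 0.0, 5.0, -1.0]
y_pred = [1.0, 0.0, 4.0, 1.0]

print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2  :", r_squared(y_true, y_pred))
```

RMSE is in the same unit as the target (minutes), which makes it easier to interpret than MSE.
R-squared compares the model with always predicting the mean observed deviation.
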
Relationship to formative assessment


As this is a reassessment, there will be no other formative assessment. However, the second
piece of coursework (CW2) should help you to understand the data and tasks.

Deliverables
1. A working prototype of the framework for predicting train delays for the given journey.
All the source code must be submitted in a zipped file to a dropbox on the module’s
blackboard.
2. A technical report that describes your design, implementation and testing of your
prediction framework.
It should be logically structured, with sections such as Introduction, Analysis (of train
delay problems), Design, Implementation, Testing and Evaluation, Conclusion/ Summary.
References should be provided if you use/cite any other work.

Note: You are NOT allowed to upload any of your work (code, reports and video) to any public
media, e.g. GitHub or YouTube, without permission from the module organiser.
You may create a private repository on GitHub to store and manage your work, but you must
never make it public, even after the assessment.

Marking scheme
For this reassessment, there will be NO demonstration or presentation, so all the marks are
allocated to the technical report according to the following scheme.

1. Introduction to train delay problem and your work: 10%


2. Data pre-processing and feature extraction: 20%

3. Design and implementation of the framework: 20%

4. Train predictive models: 20%

5. Testing and evaluation: 20%

6. Summary/Conclusion: 10%

Resources
There are plenty of resources on the internet on train delay prediction, for example these
papers:
(1) "Weighted Ensemble Methods for Predicting Train Delays" by M. Al Ghamdi, G. Par and
W. Wang. Computational Science and Its Applications – ICCSA 2020 - 20th
International Conference, Proceedings. Pages: 586-600.
https://rd.springer.com/chapter/10.1007/978-3-030-58799-4_43
(2) "Dynamic Delay Predictions for Large-Scale Railway Networks: Deep and Shallow Extreme
Learning Machines Tuned via Thresholdout" by Oneto et al.
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7917288
