FoRex Trading Using Supervised Machine Learning
Research paper
Abstract
The exchange rate of each currency pair can be predicted by using a machine learning algorithm in a classification process. With the help
of a supervised machine learning model, the predicted uptrend or downtrend of the FoRex rate might help traders make the right decisions on
FoRex transactions. Installing machine learning algorithms in the online FoRex trading market can automate the buying and selling
transactions. All the transactions in the experiment are performed by using scripts added on to a transaction application.
The capital and profit results obtained with support vector machine (SVM) models are higher than those of normal trading (without the use of SVM).
Keywords: Classification; Foreign Exchange rate; Supervised Machine Learning; Transaction.
1. Introduction

Supervised machine learning can be applied to many areas of computer science, such as decision making, forecasting,
and especially the prediction of stock prices or currency exchange rates. Supervised machine learning means teaching
and monitoring the computer so that it classifies or clusters data from an observed data set. With the current
explosion of data, particularly business big data, supervised machine learning techniques can help people look deeper
into their data and analyze it in order to extract information useful for their company's purposes. In the stock
market, the rise or fall of money exchange transactions can be predicted with the help of machine learning and an
observed data set collected in the past.

Artificial intelligence systems are used with supervised machine learning methods such as logic-based methods,
perceptrons, and statistical methods (Bayesian networks, instance-based learning) [2]. The goal is the classification
or regression of a data set into alternative classes, called outputs. The target variable (y) usually takes nominal
(discrete) or continuous values [1]. For simplicity, the output y in the classification problem can be seen as a
binary value such as 0 or 1.

Support Vector Machines (SVMs), one family of supervised machine learning techniques, separate data in a hyperplane
space into two classes [3]. These techniques maximize the margins and the distance between them to separate the data
onto alternative sides of the hyperplane [2]. SVMs are therefore a possible candidate for predicting the Foreign
Exchange (FoRex) trend (up or down) in money exchange rate problems. The FoRex price prediction uses historical data
such as "Open", "Close", "Low" and "High" over different time ranges of FoRex transactions.

Alternative experiments with different support vector machine models are performed in this paper to show the
advantage of using them in FoRex rate prediction. The detailed research on SVMs in the FoRex market can be found in
[27]. The analysis of FoRex rate prediction is given in Section 3, which presents some specific concepts that
contribute to the predicted trend of the FoRex rate in the trading market. These terms are the basic explanations for
the classification output results.

More generally, a framework of supervised machine learning (SML) techniques for predicting the FoRex rate is shown in
Section 4. Based on this framework, a representative SVM model, proposed in the research experiment in [27], is
installed into an expert advisor (robotics) in FoRex transaction software to make actual FoRex transactions with a
demo account.

2. Related Works

Foreign Exchange problems can be seen as problems of predicting the FoRex rate (up or down) of each currency pair.
They can therefore be treated as classification problems whose outputs (FoRex rates) are binary values. Some earlier
techniques, such as the Auto Regressive Integrated Moving Average (ARIMA), have been used for predicting time series
data. According to [6], however, ARIMA is in general a univariate model. Moreover, these techniques assume linear and
stationary time series.

According to [7], noisy data makes FoRex prediction a challenging problem in the time series forecasting field. Some
publications in [7], [8], [6], [9] used RBF, MLP, and genetic methods to predict the FoRex rates.

The work in [6], for instance, used a Multilayer Perceptron network (MLP) with a training set of 1600 and a testing
set of 225 out of 1825 instances. The disadvantage of this work is its small data set, which might cause quick
regression or mis-predicted outputs. Another work, on Rumani stock with MLP models, can be found in [10]. These
publications showed the advantages of using supervised learning models to forecast the rise or fall of a stock price.

Some ANN and decision tree models have been used to forecast the uptrend or downtrend in the stock market [11-14].
Although these works reported an ability to work
Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology 401
with real-time data, their experiments were reported to lack high-frequency data [15].

Nonlinear regression problems, especially prediction problems, can be solved with SVM models. Some publications have
applied nonlinear regression prediction successfully, such as [16], [17], [23], [24], [25]. However, few publications
have used SVM models for financial time-series forecasting [19], [20], although SVM models can handle nonlinear
problems and have been reported as advantageous techniques. Therefore, continuing the research in [27], SVM is
selected as the representative supervised machine learning method to estimate the uptrend or downtrend of each
currency rate. This is because SVMs can avoid some issues such as over-fitting (as with ANN in [1]) and have no
parameters to tune except the upper bound C [21].

3. Research Background

3.1. Supervised Support Vector Machine

Support Vector Machines (SVMs) can be divided into linear and nonlinear SVMs. These are based on linear or nonlinear
hyperplanes that separate the training data set in the hyperplane space. The linear hyperplane has the following form:

$H: w \cdot x + b = 0$

[...] that a given exchange rate can make. Every currency pair has a corresponding value (a quote). For example, a
quote of 1.2776 means that the exchange rate of one EUR is USD 1.2776 (it costs the trader USD 1.2776 to buy a single
EUR). If the rate moves slightly from 1.2776 to 1.2777, it shows a 1 PIP increase.

The PIP value is calculated as:

$PIP_{value} = \frac{PIP}{rate} \times \text{Amount being purchased}$

For example, if a trader purchases 10,000 units at a EUR/USD rate of 1.2776, then
$PIP_{value} = 0.0001 / 1.2776 \times 10{,}000 = 0.7827$ EUR.

The concept of the transaction trend describes the direction in which the market is moving: uptrend or downtrend. In
an uptrend, a resistance level means there is a change in supply and demand (more sellers than buyers), whereas in a
downtrend there is the concept of a support level, at which there are more buyers than sellers. When a trend ends,
there are two possible outcomes: the start of an opposite trend, or a range market (stopping at certain resistance or
support levels). The task of the Expert Advisor (in the experiment in a later section) is to determine the trend,
estimate how long it will last, and decide whether the timing is right to place a good Bid/Ask transaction.
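The PIP value formula and the EUR/USD example given in this section can be checked with a few lines of Python. This is only a minimal sketch: the function names `pip_value` and `pips_moved` are hypothetical helpers introduced for illustration, and the PIP size of 0.0001 is taken from the worked example in the text rather than from any trading platform.

```python
def pip_value(rate: float, units: float, pip: float = 0.0001) -> float:
    """Value of one PIP in the base currency: PIP / rate * amount purchased."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return pip / rate * units


def pips_moved(old_rate: float, new_rate: float, pip: float = 0.0001) -> float:
    """Number of PIPs between two quotes (positive means the rate rose)."""
    # Rounding removes floating-point noise from the subtraction.
    return round((new_rate - old_rate) / pip, 6)


if __name__ == "__main__":
    # Worked example from the text: 10,000 units at EUR/USD = 1.2776.
    print(f"PIP value: {pip_value(1.2776, 10_000):.4f} EUR")
    # A move from 1.2776 to 1.2777 is a one-PIP increase.
    print(f"PIPs moved: {pips_moved(1.2776, 1.2777)}")
```

Keeping the PIP size as a parameter is a design choice of this sketch, so that the same helper could be reused for pairs quoted with a different PIP size.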
Step 1: Feature selection. The collected data set might include features that are unnecessary for the prediction
process. These features are therefore removed.

Step 2: Select the data set of a currency pair. In this step, a defined currency pair, e.g. EUR vs USD, is chosen.
Also in this step, the data is taken from alternative candles, time frames etc. so that it can be easily used when
installed in Robotics.

Step 3: Process the data with supervised machine learning (SML) models (here, support vector machine models). The data
set is divided into training, testing and validation sets. All data is mapped onto the range [0,1].

Step 4: Evaluate the models with the mean square error (MSE) assessment metric.

Steps 3 and 4 are repeated many times in order to choose the model with the minimum MSE.

The output is a definite trend (up or down).

5. Experiment

5.1. Data analysis and Model configuration

The experimental data was collected from 01/01/2013 to 30/09/2016 using MetaTrader 4 [22]. For general use, the
EUR/USD pair was chosen. Two data sets were created: a training set D (instances from 01/01/2013 to 31/12/2015) and a
testing set D' (data from 01/01/2016 to 30/09/2016).

The features are chosen based on the profits. In this paper, the profit taken is about 10 to 15 PIPs in each
transaction. These selections might affect the experimental results. The input features for the samples are the
exchange rates "Open", "Close", "Low" and "High". These rates are taken from alternative candles in the time window
(time frame): 1 minute, 5 minutes, 15 minutes, 1 hour and 1 day, corresponding to M1, M5, M15, H1 and D1 respectively.
Other features are taken from Custom Indicators, Bollinger Bands, RSI and MA.

Continuing the experiment in [27], 196 features are chosen (see Table 1). Adding more features (an increase from 137
to 196) is expected to give better classification results and better transactions. The data is thus taken from an
extended set of candlesticks and timeframes, in order to capture more detail in specific historical transaction data.

Table 1: Feature selections
Feature#     Data                                No    Explanations
#1-20        O, H, L, C on M1                    20    4 values x 5 nearest candles
#21-40       O, H, L, C on M5                    20    4 values x 5 nearest candles
#41-56       O, H, L, C on M15                   16    4 values x 4 nearest candles
#57-72       O, H, L, C on H1                    16    4 values x 4 nearest candles
#73-92       O, H, L, C on D1                    20    4 values x 5 nearest candles
#93-97       Time data                           5     Time attributes
#98-105      RSI(7) on M5, M15                   8     2 values x 4 Timeframe
#106-113     RSI(14) on M5, M15                  8     2 values x 4 Timeframe
#114-145     MA(9), MA(12), MA(100), MA(200)     32    8 values x 4 Timeframe
#146-165     Custom Indicator                    20    8 PAX + 12 MKC
#166-174     Bollinger Bands                     9     3 Lines x 3 Timeframe
#175-180     Average True Range                  6     2 values x 3 Timeframe
#181-196     Highest, Lowest M1, M5              16    4 values x 4 Timeframe

Model configurations

SVM models are chosen for the experiments. The model parameters are the same as those used in [27]:

RBF: $k(x, y) = e^{-\frac{\|x - y\|^2}{2\sigma^2}} = e^{-\gamma \|x - y\|^2}$ with $\gamma \in [0; 5]$

Polynomial: $k(x, y) = (x \cdot y + \theta)^d$, $\theta \in \mathbb{R}$, $d \in \mathbb{N}^*$, with $d = 2, 3, 4$ and $\theta \in [0; 1]$.

The parameter C has been used with $C \in [1; 10]$.

Table 2: Experimental Model Configurations
Models           Poly1      Poly2      Poly3      GsRBF1     GsRBF2     GsRBF3
C                1.0        1.0        1.0        2.0        1.0        2.0
Kernel           Poly       Poly       Poly       RBF        RBF        RBF
Parameters       Power 2    Power 3    Power 3    γ = 5.0    γ = 2.0    γ = 2.0
No of vectors    2500       3000       3972       3000       4500       5000

5.2. Classification results

Data set D can be separated into two parts, $D_{pos}$ and $D_{neg}$, with positive and negative outputs respectively.
These can be further split into sub-sets $D_{pos}^1, D_{pos}^2, \ldots, D_{pos}^k$ and
$D_{neg}^1, D_{neg}^2, \ldots, D_{neg}^k$, where each sub-set has $\frac{1}{k}|D_{pos/neg}|$ samples. Each sub-set in
turn is taken as D_Test, and D_Train consists of the remaining k-1 sub-sets. Here, k is 5.

Tables 3 and 4 show the experimental results for the 6 alternative models. The performance of the models is evaluated
and compared using the accuracy (exactness) rate and the precision of the positive class, the negative class, the
micro-average and the macro-average individually.

Table 3: Results of RBF Configuration Model
Kernel: Gaussian RBF         GsRBF1                GsRBF2                GsRBF3
Features: 196                D_Train    D_Test     D_Train    D_Test     D_Train    D_Test
Accuracy                     84.22%     55.80%     83.14%     58.14%     83.81%     58.17%
Precision: Positive          84.05%     55.59%     82.59%     57.55%     83.12%     57.56%
Precision: Negative          84.39%     55.98%     83.71%     58.70%     84.52%     58.76%
Precision: Micro-Average     84.22%     55.80%     83.14%     58.14%     83.81%     58.17%
Precision: Macro-Average     84.22%     55.79%     83.15%     58.13%     83.82%     58.16%

Table 4: Poly Configuration Models Results
Kernel: Polynomial           Poly1                 Poly2                 Poly3
Features: 196                D_Train    D_Test     D_Train    D_Test     D_Train    D_Test
Accuracy                     86.15%     74.20%     86.15%     74.35%     86.25%     74.32%
Precision: Positive          86.73%     80.29%     86.58%     80.18%     86.80%     80.21%
Precision: Negative          85.59%     70.36%     85.73%     70.62%     85.72%     70.56%
Precision: Micro-Average     86.15%     74.20%     86.15%     74.35%     86.25%     74.32%
Precision: Macro-Average     86.16%     75.32%     86.16%     75.40%     86.26%     75.38%

5.3. Expert Advisor experimental results

The Poly2 SVM model is chosen for installation into Robotics (Expert Advisor, EA) in MetaTrader 4 [30]. The script
(written in
EA) was run on a collection of observed data. The data incorporates some technical indicators and the buy or sell
transactions of the USD and EUR FoRex pair.

For comparison with the use of the supervised machine learning model in Robotics, the normal transaction (a test of
Robotics without the support of SVM) has also been performed.

The results can be seen in Table 5. The bold numbers in the table compare the transactions made with and without SVM.

[...] Robotics' transactions. In other words, when more features are used in the SVM, the profit loss can be better
controlled.

[Figure: Drawdown Comparison (bar chart; data labels 50.67%, 34.69%, 34.50%, 24.31%)]