
Proceedings ICNN'97, Houston, Texas, 9–12 June 1997, Vol. 4, pp. 2125–2128, IEEE, 1997

Sales Forecasting Using Neural Networks


Frank M. Thiesing and Oliver Vornberger
Department of Mathematics and Computer Science
University of Osnabrück
D-49069 Osnabrück, Germany
[email protected]

Abstract

Neural networks trained with the back-propagation algorithm are applied to predict the future values of time series that consist of the weekly demand on items in a supermarket. The influencing indicators of prices, advertising campaigns and holidays are taken into consideration. The design and implementation of a neural network forecasting system is described that has been developed as a prototype for the headquarters of a German supermarket company to support the management in the process of determining the expected sale figures. The performance of the networks is evaluated by comparing them to two prediction techniques used in the supermarket now. The comparison shows that neural nets outperform the conventional techniques with regard to the prediction quality.

1. Introduction

A central problem in science is predicting the future of temporal sequences. Examples range from forecasting the weather to anticipating currency exchange rates. The desire to know the future is often the driving force behind the search for laws in science and economics.

In recent years many sophisticated statistical methods have been developed and applied to forecasting problems [8]. However, there are two major drawbacks to these methods. First, for each problem an individual statistical model has to be chosen that makes some assumptions about underlying trends. Second, the power of deterministic data analysis can be exploited for single time series with some hidden regularity (though strange and hard to see, but existent); this approach fails, however, for multidimensional time series with mutual non-linear dependencies.

As an answer to the weakness of statistical methods in forecasting multidimensional time series, an alternative approach gains increasing attraction: neural networks [5]. The practicability of using neural networks for economic forecasting has already been demonstrated in a variety of applications, such as stock market and currency exchange rate prediction, market analysis and forecasting time series of political economy [6, 3, 1, 2].

These approaches are based on the idea of training a feed-forward multi-layer network by a supervised training algorithm in order to generalize the mapping between the input and output data, to discover the implicit rules governing the movement of the time series, and to predict its continuation in the future. Most of the proposals deal with one or only a few time series.

In this paper, neural networks trained with the back-propagation algorithm [7] are applied to predict the future values of 20 time series that consist of the weekly demand on items in a German supermarket. An appropriate network architecture will be presented for a mixture of both explanatory and time series forecasting. Unlike many other neural prediction approaches described in the literature, we compare the forecasting quality of the neural network to two prediction techniques currently used in the supermarket. This comparison shows that our approach produces good results.

2. Time Series Considered

The time series used in this paper consist of the sales information of 20 items in a product group of a supermarket. The information about the number of items sold and the sales revenue is available on a weekly basis starting in September 1994. There are important influences on the sales that should be taken into consideration: advertising campaigns, sometimes combined with temporary price reductions; holidays, which shorten the opening hours; and the season, which has an effect on the sales of the considered items.
We take the sales information, prices and advertising campaigns from the cash registers and the marketing team of the supermarket. The holidays are calculated. For the season information we use the time series of the turnover sum in DM of all items of this product group as an indicator. Its behavior over a term of 19 months is shown in figure 1.

Figure 1. Turnover sum in DM of the product group, September 1994 to March 1996 (weekly turnover sum and its difference, weeks 36/1994 to 13/1996)

We use feed-forward multilayer perceptron (MLP) networks with one hidden layer together with the back-propagation training method. In order to predict the future sales, the past information of recent weeks is given in the input layer. The only result in the output layer is the sale for the next week.

Due to the purchasing system used in the supermarket there is a gap of one week between the newest sale value and the forecasted week. In addition, the pricing information, advertising campaigns and holidays are already known for the future when the forecast is calculated. This information is also given to the input layer, as shown in figure 2.

Figure 2. Input and output of the MLP (inputs: the recent sale values x_{t-1}, x_{t-2} in a sliding window of width n = 2, the sale difference, the group turnover, and the price, advertising and holiday indicators of the week to be predicted; output: the sale x_{t+1})

3. Preprocessing the Input Data

An efficient preprocessing of the data is necessary to input it into the net. In general it is better to transform the raw time series data into indicators that represent the underlying information more explicitly. Due to the sigmoidal activation function of the back-propagation algorithm the sales information must be scaled to ]0, 1[. The scaling is necessary to support the back-propagation learning algorithm [4]. We tested several scalings z(t) for the sale and the turnover time series x = (x_t):

z_t = (x_t - min(x)) / (max(x) - min(x)) · 0.8 + 0.1    resp.    z_t = (x_t - μ) / (c · σ) + 0.5

where min and max are the minimum and maximum values of the time series x, and μ and σ are the average and the standard deviation. c is a factor to control the interval of the values.
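For illustration, the following sketch applies both scalings to a weekly sales series. This is not the authors' original code: the function names, the use of NumPy, the toy data and the default value of c are our assumptions; the formulas themselves follow the two scalings above.

```python
import numpy as np

def minmax_scale(x):
    """Min-max scaling into [0.1, 0.9], as in the first formula above."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min()) * 0.8 + 0.1

def mu_sigma_scale(x, c=4.0):
    """Mean/standard-deviation scaling centred at 0.5; the factor c controls
    the spread (c = 4.0 is an assumed example value, not from the paper)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (c * x.std()) + 0.5

weekly_sales = [12, 15, 9, 22, 18, 30, 11]   # toy data, not from the paper
print(minmax_scale(weekly_sales))
print(mu_sigma_scale(weekly_sales))
```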
For the prices the most effective indicator is the price change. So the prices are modeled as follows:

pri_t := 0.9 if the price increases within week t, 0.0 if the price stays equal, and -0.9 if the price decreases.

For both the time series of holidays and advertising campaigns we tested binary coding and linear aggregation to make them weekly. Their indicators are:

y_t := 0.9 if there is a holiday resp. an advertising campaign within week t, and 0.0 otherwise,

resp.

y_t := (number of advertising resp. holiday days within week t) / 6, i.e. the normalized number of special days within week t.
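A minimal sketch of this indicator coding follows. The helper names are hypothetical, and the divisor 6 in the aggregated indicator is our reading of the normalization described above.

```python
def price_indicator(price_this_week, price_last_week):
    """Price-change indicator pri_t: 0.9 increase, 0.0 unchanged, -0.9 decrease."""
    if price_this_week > price_last_week:
        return 0.9
    if price_this_week < price_last_week:
        return -0.9
    return 0.0

def binary_indicator(n_special_days):
    """Binary coding y_t: 0.9 if any holiday/advertising day falls in the week."""
    return 0.9 if n_special_days > 0 else 0.0

def aggregated_indicator(n_special_days, days_per_week=6):
    """Linear aggregation: normalized number of special days within the week
    (the divisor 6 is our assumption about the normalization)."""
    return n_special_days / days_per_week

# Example: a week with one holiday and a price drop from 5.49 to 4.99 DM
print(price_indicator(4.99, 5.49), binary_indicator(1), aggregated_indicator(1))
```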
4. Experimental Results

To determine the appropriate configuration of the feed-forward MLP network, several parameters have been varied:

- modeling of the input time series
- width of the sliding time window
- the number of hidden neurons
- interval of the initial random weights
- training rate and momentum
- number and selection of validation patterns
- dealing with overfitting

To evaluate the efficiency of the neural network approach, several tests have been performed. We compare the prediction error of a naive ("Naive") and a statistical prediction method ("MovAvg") to the successive prediction by neural networks ("Neural").

4.1. Naive Prediction

The naive prediction method uses the last known value of the time series of sales as the forecast value for the future. In our terms: x̂_{t+1} := x_{t-1}. This forecasting method is often used by the supermarket's personnel.
4.2. Statistical Prediction

The statistical method is currently being used by the supermarket's headquarters to forecast sales and to guide the personnel responsible for purchasing. It calculates the moving average of a maximum of nine recent weeks, after these sale values have been filtered from exceptions and smoothed.
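The following sketch implements both baselines for comparison. The naive forecast follows the formula in section 4.1; for the moving average we use a plain nine-week window, since the paper's exception filtering and smoothing steps are not specified here and are therefore omitted.

```python
import numpy as np

def naive_forecast(sales):
    """Naive forecast for week t+1: the last known value x_{t-1}
    (there is a one-week gap between the newest sale value and the forecast)."""
    return sales[-1]

def moving_average_forecast(sales, max_weeks=9):
    """'MovAvg' baseline: mean of up to nine recent weekly sales.
    The paper additionally filters exceptions and smooths the values first;
    that preprocessing is not reproduced in this sketch."""
    window = sales[-max_weeks:]
    return float(np.mean(window))

history = [12, 15, 9, 22, 18, 30, 11, 14, 16, 20]   # toy weekly sales
print(naive_forecast(history), moving_average_forecast(history))
```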
4.3. Neural Prediction

We reached good results for n = 2 recent values of the sale time series in the sliding window. The other inputs are, with one neuron each: both the difference of the sale (x'_t = x_t - x_{t-1}) and the turnover of the whole group of items for the last week, and the holiday, advertising and pricing information for the week to be predicted.

Thus, for each item a net with 7 input neurons and 4 hidden neurons is trained for a one-week-ahead forecast with a gap of one week. We reached better results with the binary scaling for the holiday and advertising time series and the μ-σ-scaling for the sales. The learning rate of the back-propagation algorithm was set to 0.3 with a momentum of 0.1. The initial weights were chosen from [-0.5, 0.5] by chance. The training was validated by 12 patterns and stopped at the minimum error.
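As a concrete illustration of this configuration, the sketch below (not the authors' code) implements a 7-4-1 back-propagation net with momentum in NumPy. The learning rate, momentum, layer sizes and weight-initialization interval follow the text above; the sigmoid output unit, the bias handling and the toy pattern values are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class SalesMLP:
    """Minimal 7-4-1 MLP trained by plain back-propagation with momentum."""

    def __init__(self, n_in=7, n_hidden=4, lr=0.3, momentum=0.1):
        self.lr, self.momentum = lr, momentum
        # initial weights drawn uniformly from [-0.5, 0.5], as in the paper
        self.W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))   # +1 bias weight
        self.W2 = rng.uniform(-0.5, 0.5, (1, n_hidden + 1))
        self.dW1 = np.zeros_like(self.W1)
        self.dW2 = np.zeros_like(self.W2)

    def forward(self, x):
        x = np.append(x, 1.0)                      # append bias input
        h = sigmoid(self.W1 @ x)                   # hidden activations
        hb = np.append(h, 1.0)
        y = sigmoid(self.W2 @ hb)[0]               # scaled sale forecast
        return x, h, hb, y

    def train_pattern(self, x, target):
        x, h, hb, y = self.forward(x)
        # deltas for squared error with sigmoid units
        delta_out = (y - target) * y * (1.0 - y)
        delta_hid = (self.W2[0, :-1] * delta_out) * h * (1.0 - h)
        # gradient step with momentum
        self.dW2 = -self.lr * delta_out * hb[None, :] + self.momentum * self.dW2
        self.dW1 = -self.lr * np.outer(delta_hid, x) + self.momentum * self.dW1
        self.W2 += self.dW2
        self.W1 += self.dW1
        return 0.5 * (y - target) ** 2

# One input pattern: scaled x_{t-1}, x_{t-2}, sale difference, group turnover,
# plus the holiday, advertising and price indicators of the forecast week.
pattern = np.array([0.42, 0.38, 0.55, 0.61, 0.0, 0.9, -0.9])
net = SalesMLP()
for _ in range(200):                               # toy training loop
    err = net.train_pattern(pattern, target=0.47)
print("training error:", err, "forecast:", net.forward(pattern)[-1])
```

In the paper's setting one such net is trained per item on the scaled sliding-window patterns, with 12 held-out validation patterns used to stop training at the minimum error.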
4.4. Comparison of Prediction Techniques

The results of the forecasting accuracy are calculated for the successive prediction of the 22 weeks 44/1995 to 13/1996. In these weeks there are influences of many campaigns and of the Christmas holidays.

To measure the error, the root mean squared error (RMSE) is divided by the mean value (Mean) of the time series. This is done for all 20 items and the average value is shown in table 1. Theil's U-statistic is calculated as well.

Table 1. Measuring forecasting accuracy, average of 20 items

Accuracy Measure   Neural   MovAvg   Naive
RMSE/Mean          0.84     1.01     1.16
Theil's U          0.76     0.92     1.00
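A sketch of the two accuracy measures follows. RMSE/Mean is unambiguous; for Theil's U we use the common form that relates the model's errors to those of the naive forecast (which would make the naive method's U exactly 1.00, consistent with table 1), but the paper does not spell out its exact definition, so treat that part as an assumption.

```python
import numpy as np

def rmse_over_mean(actual, predicted):
    """RMSE divided by the mean of the actual series (RMSE/Mean in table 1)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return rmse / actual.mean()

def theils_u(actual, predicted, naive):
    """Theil's U as the ratio of the model's RMSE to the naive forecast's RMSE
    (assumed form; the naive method then scores exactly 1.0)."""
    actual, predicted, naive = (np.asarray(a, float) for a in (actual, predicted, naive))
    rmse_model = np.sqrt(np.mean((actual - predicted) ** 2))
    rmse_naive = np.sqrt(np.mean((actual - naive) ** 2))
    return rmse_model / rmse_naive

# toy example over a few forecast weeks
actual = [20, 18, 25, 30, 22]
neural = [19, 20, 24, 28, 23]
naive  = [17, 20, 18, 25, 30]   # the shifted last-known values
print(rmse_over_mean(actual, neural), theils_u(actual, neural, naive))
```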

Based on the information in table 1, the naive approach is outperformed by the two other methods. For 18 of the 20 items the prediction by the neural network is better than that of the statistical prediction method.

A close inspection of the time series favored by the statistical approach shows that these are very noisy, without any implicit rules that could be learned by the neural network. In particular, one of these items has an average weekly sale of less than 4 items.

Figure 3 shows the predicted values for item 468978 calculated by the statistical and the neural approach. The price, advertising and holiday information is included in this figure.

Figure 3. Comparison of the sale prediction for an item by the statistical and the neural approach (item 468978, weeks 13/1995 to 13/1996; sale in pieces and price in DM, with advertising campaigns and holidays marked)

5. Conclusions and Future Research

For a special group of items in a German supermarket, neural nets have been trained to forecast future demands on the basis of the past data, augmented with further influences like price changes, advertising campaigns and holiday season information. The experimental results show that neural nets outperform the naive and statistical approaches that are currently being used in the supermarket.

Our procedure preprocesses the data of all kinds of time series in the same manner and uses the same network architecture for the prediction of all 20 time series of sales. The parameter optimization is based on all of the time series instead of on one special item.

The program runs as a prototype and handles only a small subset of the supermarket's inventory. Future work will concentrate on the integration of our forecasting tool into the whole enterprise data flow process. Since a huge number of varying products has to be managed, a selection process has to be installed that discriminates between steady time series suitable for conventional methods and chaotic candidates which will be processed by neural nets.

The prototype is part of a forecasting system that is able to take the raw data, do the necessary preprocessing, train the nets and produce an appropriate forecast. The next steps will be the development of additional adaptive transformation techniques and of methods to test the significance of inputs, which can be used to reduce the complexity of the nets.

References

[1] K. Chakraborty, K. Mehrotra, C. Mohan, and S. Ranka. Forecasting the behaviour of multivariate time series using neural networks. Neural Networks, 5:961–970, 1992.
[2] B. Freisleben and K. Ripper. Economic forecasting using neural networks. In Proceedings of the 1995 IEEE International Conference on Neural Networks, volume 2, pages 833–838, Perth, W.A., 1995. IEEE.
[3] A. Refenes, M. Azema-Barac, L. Chen, and S. Karoussos. Currency exchange rate prediction and neural network design strategies. Neural Computing & Applications, 1(1):46–58, 1993.
[4] H. Rehkugler and H. G. Zimmermann. Neuronale Netze in der Ökonomie (in German; Neural Networks in Economics). Verlag Vahlen, München, 1994.
[5] R. Rojas. Neural Nets. Springer, 1996.
[6] E. Schöneburg. Stock price prediction using neural networks: An empirical test. Neurocomputing, 2(1), 1991.
[7] V. R. Vemuri and R. D. Rogers. Artificial Neural Networks – Forecasting Time Series. IEEE Computer Society Press, 1994.
[8] A. S. Weigend and N. A. Gershenfeld. Time Series Prediction: Forecasting the Future and Understanding the Past. Addison-Wesley, 1994.
