Time Series Forecasting - SoftDrink - Business Report
Business Report
Project - Time Series Forecasting - Soft Drink
Sales Analysis
Date: 01/04/2022
Table of Contents
1. Executive Summary
2. Introduction
3. Data Details
Q1 Read the data as an appropriate Time Series data and plot the data
1.1 Reading the Data
1.2 Plotting the Data
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition
2.1 EDA
Null Value Check
Duplicate Value Check
Data Description
Yearly Box Plots
Monthly Box Plots
Monthly Sales across Years
2.2 Decomposition
3. Split the data into training and test. The test data should start in 1991
4. Build various exponential smoothing models on the training data and evaluate the model using RMSE on the test data. Other models such as Regression, Naïve forecast models and simple average models should also be built on the training data and check the performance on the test data using RMSE
4.1 Linear Regression
4.2 Naïve Model
4.3 Simple Average Model
4.4 Moving Average Model
4.5 Simple Exponential Smoothing (SES)
4.6 Double Exponential Smoothing (DES)
4.7 Triple Exponential Smoothing (TES)
4.8 Summary of all Models
5. Check for the stationarity of the data on which the model is being built on using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE
6.1 ARIMA Model
6.2 SARIMA Model
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE
7.1 ACF and PACF plots
8. Build a table with all the models built along with their corresponding parameters and the respective RMSE values on the test data
9. Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands
10. Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales
1. Executive Summary
You are an analyst at the RST soft drink company and you are expected to forecast soft
drink production for the 12 months following the end of the data. The production data
for soft drinks has been given to you from January 1980 to July 1995.
2. Introduction
The intent of this project is to perform forecasting analysis on the soft drink production
dataset. I will analyse this dataset using Linear Regression, Naïve, Simple Average and
Moving Average models, and Simple, Double and Triple Exponential Smoothing. The data
set contains 187 entries, and I will build the most optimum model(s) on the complete
data and predict 12 months into the future with appropriate confidence intervals/bands.
3. Data Details
The data set contains two columns: the first column shows the month and year of the
observation and the second the soft drink production for that month. The first rows are:
YearMonth   SoftDrinkProduction
1980-01     1954
1980-02     2302
1980-03     3054
1980-04     2414
1980-05     2226
1980-06     2725
Q1 Read the data as an appropriate Time Series data and plot the data
I have imported the data series and, as we can observe, each entry has a YearMonth value
with it, which is not really a data point but an index for the entry. So in reality the
dataset has a single column that contains the quantity of soft drink produced in that
particular month. Here, while reading the dataset, I have passed the arguments so that
the first column, which is the date column, is parsed as dates and used as the index of
the time series.
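A minimal sketch of this kind of read is shown below. It assumes pandas and a comma-separated file with a header row; neither the library nor the file layout is stated in the report, so both are assumptions. The six rows are the ones listed under Data Details.

```python
import io

import pandas as pd

# First six rows from "Data Details"; the comma-separated layout is assumed.
sample_csv = """YearMonth,SoftDrinkProduction
1980-01,1954
1980-02,2302
1980-03,3054
1980-04,2414
1980-05,2226
1980-06,2725
"""

# parse_dates converts the YearMonth strings to timestamps while reading,
# and index_col makes that column the index of the series.
df = pd.read_csv(
    io.StringIO(sample_csv),
    parse_dates=["YearMonth"],
    index_col="YearMonth",
)

print(df.index.min(), df.index.max())
print(df.head())
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path.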
It can be observed that the dataset has data starting from January 1980 and going till July 1995.
Now that I have uploaded the dataset with no arguments (and hence without parsing the
dates), I need to provide a time stamp value myself. In addition, I have removed the
YearMonth variable and added a time stamp in its place.
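The manual time stamp could be built as in the sketch below, assuming pandas. The column name `Time_Stamp` is a hypothetical name chosen for illustration, and only the first three rows of the data are used.

```python
import pandas as pd

# Stand-in for the data read without date parsing (first three rows only).
df = pd.DataFrame({
    "YearMonth": ["1980-01", "1980-02", "1980-03"],
    "SoftDrinkProduction": [1954, 2302, 3054],
})

# One month-start stamp per row, beginning in January 1980.
# "Time_Stamp" is a hypothetical column name.
df["Time_Stamp"] = pd.date_range(start="1980-01-01", periods=len(df), freq="MS")
df = df.drop(columns="YearMonth").set_index("Time_Stamp")

# The full span January 1980 to July 1995 holds 187 monthly stamps,
# which matches the number of entries reported for the data set.
full_range = pd.date_range(start="1980-01-01", end="1995-07-01", freq="MS")
print(len(full_range))  # 187
```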
As we can observe from the above plot, soft drink production has been moving in an upward
direction. There is a certain seasonality element that is visible in the graph. We will
explore the trend and seasonality further during decomposition.
2.1 EDA
There are no duplicate entries in the dataset, as each value corresponds to a different time
index; these are all production figures for different months.
Data Description
The time series has a high standard deviation, since there is a significant difference
between the minimum and maximum values. Moreover, the mean and the median differ, which
points to skewness. As mentioned earlier, there are 187 entries in total.
Following is the yearly box plot for the Soft drink Production time-series:
upward sales trend post 1988. The highest soft drink production can be observed in
1987 and the lowest in 1980. The highest variation in monthly production seems to be
in 1993 and the lowest in 1995.
There are outliers in the yearly production data; however, as it is a time series, we can
Following is the monthly box plot for the Soft drink Production time-series:
The Monthly Box Plots clearly show a seasonality element in the time series. Production
has an increasing trend in the last quarter of the year. Production seems to pick up
from July and is more or less consistent till June, shows some stagnancy in September
and then starts to pick up again from October (i.e. the last quarter). Monthly sales data
shows skewness without much exception.
The monthly sales across years can be seen in the following Pivot Table and the
associated graph:
As can be observed from the table and graph above, December seems to be the month that
drives the highest production figures.
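A month-by-year table of this kind can be sketched with a pandas pivot table. The values below are placeholders rather than the report's data, and the helper columns `Year` and `Month` are hypothetical names introduced for the example.

```python
import pandas as pd

# Placeholder series: 24 months of made-up values, not the report's data.
index = pd.date_range(start="1980-01-01", periods=24, freq="MS")
df = pd.DataFrame({"SoftDrinkProduction": range(1000, 1024)}, index=index)

# Hypothetical helper columns derived from the time index.
df["Year"] = df.index.year
df["Month"] = df.index.month

# Rows are calendar months, columns are years.
monthly_by_year = df.pivot_table(
    index="Month", columns="Year", values="SoftDrinkProduction"
)
print(monthly_by_year)
```

Reading down a column shows the seasonal profile within one year; reading across a row shows how a given month changes from year to year.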
2.2 Decomposition
I have provided the decomposed elements for the Time Series below:
I have performed both additive and multiplicative decomposition of the time series so that
I can determine whether the soft drink series is additive or multiplicative.
As we can observe from the above, we can say that the time series is clearly
The plots above clearly indicate that the production is unstable and not uniform.
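To illustrate what an additive decomposition does, the sketch below estimates the trend with a centred 12-month moving average and then averages the detrended values per calendar position. It is a textbook classical decomposition on synthetic data, not necessarily the routine used for the plots in this report, and the function names are hypothetical.

```python
def centered_ma_12(y):
    """2x12 centred moving average; None where the window does not fit."""
    trend = [None] * len(y)
    for t in range(6, len(y) - 6):
        # Half weight on the two end points, full weight on the 11 in between.
        window = 0.5 * y[t - 6] + sum(y[t - 5:t + 6]) + 0.5 * y[t + 6]
        trend[t] = window / 12
    return trend


def additive_seasonal(y, trend):
    """Average detrended value for each of the 12 positions in the cycle."""
    buckets = [[] for _ in range(12)]
    for t, (value, level) in enumerate(zip(y, trend)):
        if level is not None:
            buckets[t % 12].append(value - level)
    raw = [sum(b) / len(b) for b in buckets]
    mean = sum(raw) / 12
    return [s - mean for s in raw]  # centre so the indices sum to zero


# Synthetic series: linear trend plus a repeating pattern that sums to zero.
pattern = [-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2]
y = [100 + 2 * t + pattern[t % 12] for t in range(48)]

trend = centered_ma_12(y)
seasonal = additive_seasonal(y, trend)
print([round(s, 6) for s in seasonal])
```

On this synthetic series the recovered seasonal indices match `pattern`. A multiplicative version would divide by the trend instead of subtracting it.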
3. Split the data into training and test. The test data should start in 1991.
I have split the time series into Train and Test datasets below. As specified, the test data starts in 1991.
Figure 10: Training and Test Datasets for Soft Drink Time Series
I have also confirmed that the Train dataset indeed ends in 1990, and the Test dataset
indeed starts in 1991, by using the Head and Tail functions on the Training and Test
datasets. As we can observe, the size of the Train data frame is 132 observations and that
of the Test data frame is 55 observations (187 in total).
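The split sizes can be checked with a few lines of plain Python. This sketch works on (year, month) labels only, so it shows the logic of the split rather than the data frames themselves.

```python
# Month labels from January 1980 to July 1995, the span stated in the report.
months = [(year, month) for year in range(1980, 1996) for month in range(1, 13)]
months = [ym for ym in months if ym <= (1995, 7)]

# Train ends in December 1990; test starts in January 1991.
train = [ym for ym in months if ym[0] < 1991]
test = [ym for ym in months if ym[0] >= 1991]

print(len(months), len(train), len(test))  # 187 132 55
print(train[-1], test[0])                  # (1990, 12) (1991, 1)
```

With a date-indexed pandas data frame the same split is usually written as a filter on the index year.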
I have also plotted the Train and Test data frames for the time series below:
The plot depicts the Train dataset (January ’80 to December ’90) and, in orange, the Test
dataset (January ’91 to July ’95).
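4. Build various exponential smoothing models on the training data and
evaluate the model using RMSE on the test data. Other models such as
Regression, Naïve forecast models and simple average models should
also be built on the training data and check the performance on the test
data using RMSE.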
In this section I will run the various available models on the time series data set.
The extracts of Training and Test time stamps for the Linear Regression can be seen
below:
The Regression plots above depict the regression on the training set as the red line and that
on the test set as the green line. As we can observe from the above plot and metric, soft
drink production shows an upward trend on the training data set and a downward trend on
the test data set.
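The regression on time can be sketched as an ordinary least-squares line fitted to a running time index. The example below uses only the six values listed under Data Details as a toy training set, so its numbers are illustrative and are not the report's results; `fit_line` is a hypothetical helper.

```python
def fit_line(t, y):
    """Ordinary least squares for y = intercept + slope * t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Toy training set: the first six observations, indexed 1..6.
train_y = [1954, 2302, 3054, 2414, 2226, 2725]
train_t = list(range(1, len(train_y) + 1))

intercept, slope = fit_line(train_t, train_y)

# The test period simply continues the time index after the training set.
test_t = [7, 8]
forecast = [intercept + slope * t for t in test_t]
print(round(slope, 2), [round(f, 1) for f in forecast])
```

Because the fitted line only extrapolates the training trend, it cannot follow a test period whose direction differs from the training period.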
The summarized performance of the model run on the dataset can be seen below:
The extracts of Training and Test data for the Naïve Model can be seen below:
suitable for the soft drink dataset since the forecasts depend on the last observation.
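As a minimal sketch, a naïve forecast repeats the last training observation for every period of the test horizon. The values are the six rows from Data Details, used purely as an illustration.

```python
def naive_forecast(train, horizon):
    """Repeat the last observed value for every future period."""
    return [train[-1]] * horizon


train_y = [1954, 2302, 3054, 2414, 2226, 2725]
print(naive_forecast(train_y, 3))  # [2725, 2725, 2725]
```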
The extracts of Training and Test data for the Simple Average Model can be seen below:
Figure 18: Training and Test data for Simple Average Model
Following are the results from running a Simple Average Model:
The summarized performance of the models run on the dataset can be seen below:
model has the best performance among all the three models run till now.
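The simple average model can be sketched in the same way: every future period is forecast as the mean of the training data. The six values below are again only the rows shown under Data Details.

```python
def simple_average_forecast(train, horizon):
    """Forecast every future period as the mean of the training data."""
    mean = sum(train) / len(train)
    return [mean] * horizon


train_y = [1954, 2302, 3054, 2414, 2226, 2725]
forecast = simple_average_forecast(train_y, 2)
print([round(f, 2) for f in forecast])
```

Unlike the naïve model, this forecast uses the whole history, but it is still a flat line and carries no trend or seasonality.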
The Moving Average data for the dataset can be seen below:
MAPE = 10.67
For 4 point Moving Average Model forecast on the Testing Data, RMSE = 687.181 | MAPE = 13.71
For 6 point Moving Average Model forecast on the Testing Data, RMSE = 710.513 | MAPE = 15.01
For 9 point Moving Average Model forecast on the Testing Data, RMSE = 735.889 | MAPE = 15.33
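For reference, the two metrics quoted above can be computed as in the sketch below. It uses the standard definitions, with MAPE expressed in percent; the numbers in the example are made up.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


# Made-up numbers, only to show the calls.
actual = [100, 200]
forecast = [110, 180]
print(round(rmse(actual, forecast), 3), round(mape(actual, forecast), 2))
```

RMSE is in the units of the series, while MAPE is relative, so the two can rank models differently when the level of the series changes over the test period.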
The summarized performance of the models run on the soft drink dataset can be seen below:
As we can observe from the above plots, all of the trailing average plots show prediction
values below the actual train and test data sets, and the 9 point trailing average plot
shows the lowest prediction of all the plots. The closest prediction to actual data is shown
by the 2 point trailing moving average model. This observation is corroborated by the
As can be seen from the summarized performance of all the models, the 2 point moving
average has shown the best performance of all the models run on dataset.
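A trailing moving average of window k replaces each point with the mean of the k most recent observations up to and including that point. The sketch below assumes the average is taken over the actual series, so a short window tracks the data closely; whether the report computed it exactly this way is an assumption.

```python
def trailing_moving_average(y, window):
    """Mean of the last `window` values up to and including each point."""
    out = [None] * (window - 1)  # not enough history for the first points
    for t in range(window - 1, len(y)):
        out.append(sum(y[t - window + 1:t + 1]) / window)
    return out


series = [1954, 2302, 3054, 2414, 2226, 2725]
print(trailing_moving_average(series, 2))
print(trailing_moving_average(series, 4))
```

A longer window smooths more and lags further behind the data, which is consistent with the 9 point average performing worst of the four.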
For Alpha = 0.216 Simple Exponential Smoothing Model forecast on the Test data,
The summarized performance of the models run on the soft drink dataset can be seen below:
Although the SES model should be used on data which has no element of trend or
seasonality, I still applied it to the data set to see how it would perform.
I used Alpha = 0.216 for the SES model and, as expected, it did not perform well.
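The SES recursion is short enough to sketch directly. The version below initialises the level with the first observation, which is one common convention and an assumption here; library implementations may initialise or optimise differently.

```python
def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing: flat forecast at the final smoothed level."""
    level = train[0]  # assumed initialisation: first observation
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


train_y = [1954, 2302, 3054, 2414, 2226, 2725]
forecast = ses_forecast(train_y, alpha=0.216, horizon=3)
print([round(f, 1) for f in forecast])
```

The forecast is the same value for every future month, which is why SES is expected to do poorly on a series with trend and seasonality.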
For Alpha = 0.1, Beta = 0.1 Double Exponential Smoothing Model forecast on the Test
The summarized performance of the models run on the soft drink dataset can be seen below:
As the DES model should be used on data which has level and trend but no seasonality,
I began with a grid search and concluded that Alpha = 0.1 and Beta = 0.1 give the
lowest RMSE and MAPE.
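A grid search of this kind can be sketched with Holt's linear method written out by hand. The initial level and trend (first observation and first difference) and the grid values are assumptions made for the example, and the toy data is not the report's data.

```python
import math


def holt_forecast(train, alpha, beta, horizon):
    """Double exponential smoothing (Holt's linear trend method)."""
    level = train[0]             # assumed initial level
    trend = train[1] - train[0]  # assumed initial trend
    for y in train[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def grid_search(train, test, grid):
    """Return (rmse, alpha, beta) for the best pair on the grid."""
    best = None
    for alpha in grid:
        for beta in grid:
            error = rmse(test, holt_forecast(train, alpha, beta, len(test)))
            if best is None or error < best[0]:
                best = (error, alpha, beta)
    return best


# Toy data with a clean linear trend, used only to exercise the functions.
toy_train = [10, 12, 14, 16, 18]
toy_test = [20, 22]
best = grid_search(toy_train, toy_test, [0.1, 0.5, 0.9])
print(best)
```

Selecting parameters by test-set RMSE, as done here and in the report, tunes the model to the test period, so the reported error is likely to be optimistic.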
The TES Parameters for the Soft Drink dataset can be seen below:
The summarized performance of the models run on the soft drink dataset can be seen below:
Now that we have run all the models planned, let’s view the summary of the performance
of the models on the dataset:
As we can observe, for this dataset the Triple Exponential Smoothing gives the best
performance.
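5. Check for the stationarity of the data on which the model is being built
on using appropriate statistical tests and also mention the hypothesis for
the statistical test. If the data is found to be non-stationary, take appropriate
steps to make it stationary. Check the new data for stationarity and comment.
Note: Stationarity should be checked at alpha = 0.05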
I have performed the stationarity test on the data frame, using an augmented Dickey-
Fuller (ADF) test on the soft drink data set. The null hypothesis is that the series is
non-stationary (it has a unit root); the alternative hypothesis is that it is stationary.
As we can observe from the above, we fail to reject the null hypothesis since the p value
is greater than alpha (0.05), hence we will have to stationarise the data. A stationary
series is one whose properties do not depend on the time at which the series is observed,
so non-stationarity is basically a hint of a trend/seasonality element in the dataset.
After taking a first-order difference between consecutive observations to stationarise
the data, the test was repeated on the differenced series.
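First-order differencing itself is a one-line transformation, sketched below in plain Python on toy numbers. The ADF test is not reproduced here.

```python
def difference(y, lag=1):
    """Differenced series: y[t] - y[t - lag]."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# A series with a steady upward trend becomes constant after one difference.
trending = [100, 103, 106, 109, 112]
print(difference(trending))  # [3, 3, 3, 3]

# With lag=12 the same function removes a repeating yearly pattern.
```

Each difference shortens the series by `lag` observations, and the order of differencing corresponds to the d term in ARIMA(p, d, q).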
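6. Build an automated version of the ARIMA/SARIMA model in which the
parameters are selected using the lowest Akaike Information Criteria
(AIC) on the training data and evaluate this model on the test data using
RMSE.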
As we can see from the above, the lowest AIC recorded for the data is 2027.528, for
p, d, q values of (3, 1, 3) respectively. The p values of the MA1 and MA2 coefficients
are 0 and 0.013, which means that they are statistically significant.
As can be observed, the model with p, d, q as 3, 1, 3 respectively has the lowest AIC,
which is 14. The p values of ar.S.L12 and ma.S.L12 are less than 0.05, which makes them
statistically significant.
RMSE: 429.452
MAPE: 9.95
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and
PACF on the training data and evaluate this model on the test data
using RMSE.
An autocorrelation (ACF) plot represents the autocorrelation of the series with lags of
itself. A partial autocorrelation (PACF) plot represents the amount of correlation between
a series and a lag of itself that is not explained by correlations at all lower-order lags.
The above shows the ACF and PACF for a stationary time series, respectively. The ACF and
PACF plots indicate that an MA(1) model would be appropriate for the time series,
because the ACF cuts off after lag 1 while the PACF shows a slowly decreasing trend.
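The quantity plotted in an ACF chart can be sketched as below, using the usual sample autocorrelation normalised by the lag-0 sum of squares. The input is a toy alternating series, not the differenced soft drink data.

```python
def acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Toy alternating series: strongly negative at lag 1, positive at lag 2.
toy = [1, -1, 1, -1, 1, -1]
print([round(r, 3) for r in acf(toy, 3)])
```

When reading the real plot, a lag is treated as significant when its bar lies outside the confidence band, which is approximately plus or minus 1.96 divided by the square root of the number of observations.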
8. Build a table with all the models built along with their corresponding
parameters and the respective RMSE values on the test data.
I have sorted the models based on lowest RMSE and MAPE values on test data.
Figure 42: RMSE and MAPE values on test data for all the model runs
We can observe that SARIMA (3, 1, 3)(3, 0, 0, 12) has the lowest RMSE and MAPE.
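9. Based on the model-building exercise, build the most optimum model(s)
on the complete data and predict 12 months into the future with
appropriate confidence intervals/bands.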
We can plot the real and the forecasted sales for the time series.
10. Comment on the model thus built and report your findings and
suggest the measures that the company should be taking for future sales.
The company should come up with discount offers in the months of January to
May as the sales are low in these months.
Also, the company can adopt a good price for its soft drinks, as we saw there were many
outliers in the yearly production data.
To increase sample size
To increase the number of independent variables
Try more combinations of variables to see if accuracy of the model can be
improved.