Time Series Forecasting - SoftDrink - Business Report

Business Report
Project - Time Series Forecasting – Soft Drink
Sales Analysis
Divjyot Shah Singh
Date: 01/04/2022
Table of Contents
1. Executive Summary
2. Introduction
3. Data Details
Q1 Read the data as an appropriate Time Series data and plot the data
1.1 Reading the Data
1.2 Plotting the Data
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform
decomposition
2.1 EDA
Null Value Check
Duplicate Value Check
Data Description
Yearly Box Plots
Monthly Box Plots
Monthly Sales across Years
2.2 Decomposition
3. Split the data into training and test. The test data should start in 1991.
4. Build various exponential smoothing models on the training data and evaluate the model
using RMSE on the test data. Other models such as Regression, Naïve forecast models and
simple average models should also be built on the training data and check the performance on
the test data using RMSE
4.1 Linear Regression
4.2 Naïve Model
4.3 Simple Average Model
4.4 Moving Average Model
4.5 Simple Exponential Smoothing (SES)
4.6 Double Exponential Smoothing (DES)
4.7 Triple Exponential Smoothing (TES)
4.8 Summary of all Models
5. Check for the stationarity of the data on which the model is being built on using
appropriate statistical tests and also mention the hypothesis for the statistical test. If the data
is found to be non-stationary, take appropriate steps to make it stationary. Check the new data
for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are
selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate
this model on the test data using RMSE
6.1 ARIMA Model
6.2 SARIMA Model
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the
training data and evaluate this model on the test data using RMSE
7.1 ACF and PACF plots
8. Build a table with all the models built along with their corresponding parameters and the
respective RMSE values on the test data
9. Based on the model-building exercise, build the most optimum model(s) on the complete
data and predict 12 months into the future with appropriate confidence intervals/bands
10. Comment on the model thus built and report your findings and suggest the
measures that the company should be taking for future sales
1. Executive Summary
You are an analyst at the RST soft drink company and you are expected to forecast soft
drink production for the 12 months following the end of the data. The production data
has been given to you from January 1980 to July 1995.
2. Introduction
The intent of this project is to perform a forecasting analysis on the soft drink production
dataset. I will analyse this dataset using Linear Regression, Naïve, Simple Average and
Moving Average models, and Simple, Double and Triple Exponential Smoothing. The data
set contains 187 entries, and I will build the most optimum model(s) on the complete
data and predict 12 months into the future with appropriate confidence intervals/bands.
3. Data Details
Data set contains two columns, where the first column shows the month and year of the
corresponding Production Quantity recorded in the second column.
YearMonth   SoftDrinkProduction
1980-01     1954
1980-02     2302
1980-03     3054
1980-04     2414
1980-05     2226
1980-06     2725
Table 1: Soft drink Production Dataset Details
Q1 Read the data as an appropriate Time Series data and plot the data
1.1 Reading the Data
I have imported the data series and, as we can observe, each entry has a YearMonth value
with it, which is not really a data point but an index for the production entry. So in reality
the dataset has a single column that contains the quantity of soft drink produced in that
particular month. While reading the dataset, I have passed arguments so that the first
column, which is the date column, is parsed as dates, and used squeeze to indicate that
this is a one-column series.
Figure 1: Reading the Soft Drink Production Dataset
It can be observed that the dataset starts in January 1980 and runs until July 1995, so
there are 187 entries in total.
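The reading step described above can be sketched as follows. This is a minimal illustration, not the code shown in Figure 1: the column names are taken from Table 1, and the six inlined rows stand in for the real file, whose name is not given in the report. In recent pandas versions the `squeeze` option of `read_csv` is no longer available, so the sketch calls `DataFrame.squeeze` after reading instead.

```python
import io
import pandas as pd

# First six rows of Table 1, inlined so the sketch runs on its own.
# With the real file, pass its path to read_csv instead of the StringIO object.
csv_text = """YearMonth,SoftDrinkProduction
1980-01,1954
1980-02,2302
1980-03,3054
1980-04,2414
1980-05,2226
1980-06,2725
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["YearMonth"],  # parse the first column as dates
    index_col="YearMonth",      # and use it as the index
)

# Collapse the one-column DataFrame into a Series.
series = df.squeeze("columns")
print(series.head())
```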
1.2 Plotting the Data
Here I have loaded the dataset again with no arguments, and hence without parsing the
dates, so I need to provide the time stamp values myself. To do that, I have removed the
YearMonth variable and added a time stamp index to the dataset.
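A sketch of building the time stamp by hand, assuming month-start stamps and the 187 monthly entries stated earlier. The frame `raw` is a placeholder for the file read without date parsing, and its production values are dummies rather than the real data.

```python
import pandas as pd

# Placeholder for the raw file read without date parsing (dummy values).
raw = pd.DataFrame({
    "YearMonth": [f"{1980 + i // 12}-{i % 12 + 1:02d}" for i in range(187)],
    "SoftDrinkProduction": range(187),
})

# One month-start stamp per row, beginning in January 1980.
time_stamp = pd.date_range(start="1980-01-01", periods=len(raw), freq="MS")

# Drop the text column and use the generated stamps as the index.
df = raw.drop(columns="YearMonth")
df.index = time_stamp
print(df.index[0], df.index[-1])
```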
I have plotted the time series below.

Figure 2: Soft Drink Production Time Series Plot
As we can observe from the above plot, soft drink production has trended upward. A
seasonal element is also visible in the graph. We will explore the trend and seasonality
further during decomposition, which gives a much more detailed view of these two
factors.
2. Perform appropriate Exploratory Data Analysis to understand the data
and also perform decomposition
2.1 EDA
Null Value Check
Performing a Null value check on the time series, I got:
Figure 3: Null Value Check

Duplicate Value Check
There are no duplicate entries in the dataset, as each value corresponds to a different time
index; these are all production figures for different months.
Data Description
Figure 4: Soft Drink Production Time Series Data Description

As we can see from the above, the soft drink production time series data look skewed.
The standard deviation is high, since the Min and Max differ significantly. There is also
a difference between the mean and the median, which is consistent with the skewness.
As mentioned earlier, there are 187 records in total in the dataset.
Yearly Box Plots
Following is the yearly box plot for the Soft drink Production time-series:
Figure 5: Yearly Box Plots

As we can observe from the above plot, soft drink production shows no trend until 1988
and an upward trend after 1988. The highest production can be observed in 1987 and the
lowest in 1980. The highest variation in monthly production seems to be in 1993, and
the lowest variation in 1995.
There are outliers in the yearly production data; however, as it is a time series, we can
ignore the outliers.
Monthly Box Plots
Following is the monthly box plot for the Soft drink Production time-series:
Figure 6: Monthly Box Plots
From the Monthly Box Plots, we can clearly see a seasonal element in the time series.
Production shows an increasing trend in the last quarter of the year. It is more or less
consistent until June, picks up from July, stagnates in September and then picks up
again from October (i.e. the last quarter). The monthly data shows skewness with few
exceptions.
Monthly Sales across Years
The monthly sales across years can be seen in the following Pivot Table and the
associated graph:
Figure 7: Monthly Production across Years
As can be observed from the above table and graph, December is the month that drives
the highest production figures, with the second highest production in November. We
can observe a seasonal element in the graph above.
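A month-by-year pivot table of the kind shown in Figure 7 can be sketched as below. The column names `production`, `year` and `month` are illustrative choices, and the values are dummies, not the actual production figures.

```python
import numpy as np
import pandas as pd

# Monthly index covering January 1980 to July 1995 (187 months).
idx = pd.date_range("1980-01-01", periods=187, freq="MS")

# Dummy production values for illustration only.
df = pd.DataFrame({
    "production": 2000 + np.arange(len(idx)),
    "year": idx.year,
    "month": idx.month,
})

# Rows are calendar months, columns are years; missing months become NaN.
monthly_by_year = df.pivot_table(values="production", index="month", columns="year")
print(monthly_by_year.round(0))
```

Plotting each column of `monthly_by_year` as a line gives one curve per year, which makes the seasonal shape easy to compare across years.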
2.2 Decomposition
I have provided the decomposed elements for the Time Series below:
Figure 8: Additive Decomposition
Figure 9: Multiplicative Decomposition

We can see the decomposition of the time series above. I have tried both additive and
multiplicative decomposition so that I can determine whether the soft drink series is
multiplicative or additive.

From the above, we can say that the time series is multiplicative in nature and has a
seasonal component.
The plots also indicate that production is not uniform over time and has an apparent
seasonal pattern.
3. Split the data into training and test. The test data should start in 1991.
I have split the time series into Train and Test datasets below. As given in the question,
the Test data should start in 1991.
Figure 10: Training and Test Datasets for Soft Drink Time Series
I have also confirmed that the Train dataset ends in 1990 and the Test dataset starts in
1991 by using the head and tail functions on the Training and Test datasets. As we can
observe, the Train data frame has 132 observations and the Test data frame has 55
observations.
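The split and the head/tail check can be sketched as follows, assuming a monthly series indexed by month-start dates. The values are dummies; only the index matters for the split.

```python
import pandas as pd

# Monthly index from January 1980 to July 1995 with dummy values.
idx = pd.date_range("1980-01-01", periods=187, freq="MS")
series = pd.Series(range(187), index=idx, name="SoftDrinkProduction")

# Everything before 1991 is training data; 1991 onwards is test data.
train = series[series.index.year < 1991]
test = series[series.index.year >= 1991]

print(len(train), len(test))
print(train.tail(1).index[0], test.head(1).index[0])
```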
I have also plotted the Train and Test data frames below:
Figure 11: Plot for Training and Test data frames

We can observe the training and test data in the above plot: the blue part of the plot
depicts the Train dataset (January ’80 – December ’90), and the orange part depicts the
Test dataset (January ’91 – July ’95).
4. Build various exponential smoothing models on the training data and
evaluate the model using RMSE on the test data. Other models such as
Regression, Naïve forecast models and simple average models should
also be built on the training data and check the performance on the test
data using RMSE
In this section I will run the various available models on the time series data set. Let’s
kick off the analysis with the Linear Regression model.

4.1 Linear Regression
The extracts of Training and Test time stamps for the Linear Regression can be seen
below:
Figure 12: Training and Test data for Linear Regression

Following are the results from a Linear Regression model on the dataset:
Figure 13: Linear Regression Outcome
The regression plots above depict the regression on the training set as the red line and
that on the test set as the green line. As we can observe from the above plot and metric,
soft drink production shows an upward trend on the training data set and a downward
trend on the test data set.
For the Regression on Time forecast on the Test Data,
RMSE = 775.807 | MAPE = 16.12
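For reference, a regression on time and the two error metrics can be sketched as below. This uses `numpy.polyfit` as a stand-in for whatever regression routine produced Figure 13, and a perfectly linear dummy series, so the numbers it prints are not the report's results.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Time index 1..187 with the 132/55 train/test split used in the report.
n_train, n_test = 132, 55
t = np.arange(1, n_train + n_test + 1)
y = 2000 + 10 * t  # dummy, perfectly linear series

# Fit y = intercept + slope * t on the training part only.
slope, intercept = np.polyfit(t[:n_train], y[:n_train], deg=1)
test_pred = intercept + slope * t[n_train:]

test_rmse = rmse(y[n_train:], test_pred)
test_mape = mape(y[n_train:], test_pred)
print(test_rmse, test_mape)
```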
The summarized performance of the model run on the dataset can be seen below:
Figure 14: Performance of the Linear Regression Model
4.2 Naïve Model
The extracts of Training and Test data for the Naïve Model can be seen below:
Figure 15: Training and Test data for Naive Model

Following is the result from running a Naïve Model:
Figure 16: Naive Model Outcome

For the Naïve model forecast on the Test Data,
RMSE = 1519.259 | MAPE = 37.75

Figure 17: Performance of the two Models

As can be seen from the performance above, the Naïve model is not suitable for the soft
drink dataset, since every forecast simply repeats the last observation of the training data.
4.3 Simple Average Model
The extracts of Training and Test data for the Simple Average Model can be seen below:
Figure 18: Training and Test data for Simple Average Model
Following are the results from running a Simple Average Model:
Figure 19: Simple Average Model Outcome

For Simple Average Model,
RMSE = 934.353 | MAPE = 20.12
The summarized performance of the models run on the dataset can be seen below:
Figure 20: Performance of the three Models

As can be seen from the performance summary above, the Regression model has the best
performance among the three models run so far.
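The two baseline forecasts from sections 4.2 and 4.3 can be sketched together. The training values are the six rows of Table 1; the test values are hypothetical and only serve to make the error calculation runnable.

```python
import numpy as np

train = np.array([1954, 2302, 3054, 2414, 2226, 2725], dtype=float)  # Table 1 rows
test = np.array([2500.0, 2700.0, 2600.0])  # hypothetical hold-out values

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Naive model: every future value equals the last training observation.
naive_forecast = np.full(len(test), train[-1])

# Simple average model: every future value equals the training mean.
simple_avg_forecast = np.full(len(test), train.mean())

print("Naive RMSE:", rmse(test, naive_forecast))
print("Simple average RMSE:", rmse(test, simple_avg_forecast))
```

Both forecasts are flat lines, which is why neither can follow a series with trend and seasonality.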
4.4 Moving Average Model
The Moving Average data for the dataset can be seen below:
Figure 21: Moving Average Model Data

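Following is the result from running a Moving Average Model on the dataset:
Figure 22: Moving Average Model Outcome

For the 2-point Moving Average Model forecast on the Test Data, RMSE = 556.725 | MAPE = 10.67
For the 4-point Moving Average Model forecast on the Test Data, MAPE = 13.71
For the 6-point Moving Average Model forecast on the Test Data, MAPE = 15.01
For the 9-point Moving Average Model forecast on the Test Data, MAPE = 15.33
The summarized performance of the models run on the dataset can be seen below: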
Figure 23: Summarized Performance of the Models
I have applied 2, 4, 6 and 9-point trailing averages on the dataset.
As we can observe from the above plots, all of the trailing average plots show predictions
below the actual train and test values, and the 9-point trailing average shows the lowest
predictions of all. The prediction closest to the actual data comes from the 2-point
trailing moving average model. This observation is corroborated by the error scores for
each of these moving average models.
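A trailing moving average can be sketched with `rolling(...).mean()`. The series below holds only the six rows of Table 1, so just the 2-point and 4-point windows are shown; the 6-point and 9-point windows work the same way on the full data. The `Trailing_*` column names are illustrative.

```python
import pandas as pd

series = pd.Series(
    [1954, 2302, 3054, 2414, 2226, 2725],
    index=pd.date_range("1980-01-01", periods=6, freq="MS"),
    name="SoftDrinkProduction",
)

ma = series.to_frame()
for window in (2, 4):
    # Each value is the mean of the current and the previous (window - 1) points.
    ma[f"Trailing_{window}"] = series.rolling(window).mean()

print(ma)
```

The first `window - 1` rows of each trailing column are NaN because there are not yet enough observations to average.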

As can be seen from the summarized performance of all the models, the 2-point moving
average has shown the best performance of all the models run on the dataset so far.
4.5 Simple Exponential Smoothing (SES)
The SES Parameters for the dataset can be seen below:
Figure 24: SES Parameters
Following is the result from running an SES Model on the dataset:
Figure 25: Simple Exponential Smoothing Outcome
For the Alpha = 0.216 Simple Exponential Smoothing Model forecast on the Test data,
RMSE = 847.635 | MAPE = 18.86
An SES model should be used on data that has no trend or seasonality. I still applied it
to the data set to see how the model performs in this case.
I used Alpha = 0.216 for the SES model and, as expected, it did not perform as well as
the regression and moving average models run previously.
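The SES update rule can be written out directly. This is a hand-rolled sketch that initialises the level with the first observation, which is an assumption and may differ from the initialisation used by the library that produced Figure 24, so its output will not match the reported numbers.

```python
def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing with a flat forecast of the final level."""
    level = train[0]  # assumed initial level: the first observation
    for y in train[1:]:
        # New level = weighted mix of the latest observation and the old level.
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

# Six rows from Table 1, smoothed with the report's Alpha = 0.216.
table1_values = [1954.0, 2302.0, 3054.0, 2414.0, 2226.0, 2725.0]
print(ses_forecast(table1_values, alpha=0.216, horizon=3))
```

Because the forecast is the same level for every future month, SES cannot reproduce the trend or the seasonal swings in this series.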
4.6 Double Exponential Smoothing (DES)
The DES Parameters for the dataset can be seen below:
Figure 27: DES Parameters

Following is the result from running a DES Model on the dataset:
Figure 28: Double Exponential Smoothing Outcome
For the Alpha = 0.1, Beta = 0.1 Double Exponential Smoothing Model forecast on the
Test data, RMSE = 982.938

A DES model should be used on data that has level and trend but no seasonality. I used
a grid search, which showed that Alpha = 0.1 and Beta = 0.1 give the lowest RMSE and
MAPE. With an RMSE of 982.938 on the test data, however, the DES model does not
improve on the models run so far; the 2-point moving average still performs best.
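A grid search over Alpha and Beta for double exponential smoothing (Holt's linear method) can be sketched as below. The smoothing is hand-rolled with a simple initialisation, the data is a synthetic trending series, and each pair is scored on whatever hold-out data is passed in; none of this is taken from the code behind Figure 27.

```python
import itertools
import numpy as np

def holt_forecast(train, alpha, beta, horizon):
    """Holt's linear method: smooth level and trend, then extrapolate."""
    level, trend = train[0], train[1] - train[0]  # assumed initial values
    for y in train[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return np.array([level + h * trend for h in range(1, horizon + 1)])

def grid_search(train, test, grid):
    """Return (rmse, alpha, beta) for the pair with the lowest hold-out RMSE."""
    best = None
    for alpha, beta in itertools.product(grid, repeat=2):
        pred = holt_forecast(train, alpha, beta, len(test))
        score = float(np.sqrt(np.mean((np.asarray(test) - pred) ** 2)))
        if best is None or score < best[0]:
            best = (score, alpha, beta)
    return best

# Synthetic trending series with a wiggle, split into 24 train and 6 test points.
t = np.arange(30, dtype=float)
y = 2000 + 10 * t + 50 * np.sin(t)
grid = [round(0.1 * i, 1) for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

best_rmse, best_alpha, best_beta = grid_search(y[:24], y[24:], grid)
print(best_rmse, best_alpha, best_beta)
```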
4.7 Triple Exponential Smoothing (TES)
The TES Parameters for the Soft Drink dataset can be seen below:
Figure 3: TES Parameters for the Soft drink data set

The TES train and test data can be seen below:
Figure 4: TES Model Train and Test data

Following is the result from running a TES Model on the dataset:
Figure 5: Triple Exponential Smoothing Outcome
For Alpha=0.099, Beta=0.019, Gamma=0.355, Triple Exponential Smoothing Model
forecast on the Test data, RMSE = 443.499
4.8 Summary of all Models
Now that we have run all the models planned, let’s view the summary of the performance
of all the models on the dataset:
Figure 34: Sorted Model Performance Summary
As we can observe, the Triple Exponential Smoothing model gives the best RMSE and
MAPE among all the models run so far.

5. Check for the stationarity of the data on which the model is being built
on using appropriate statistical tests and also mention the hypothesis for
the statistical test. If the data is found to be non-stationary, take
appropriate steps to make it stationary. Check the new data for
stationarity and comment. Note: Stationarity should be checked at alpha
= 0.05
I have performed the stationarity test on the data frame, using an augmented Dickey-
Fuller (ADF) test on the soft drink data set. The hypotheses, tested at Alpha = 0.05, are:
H0 (null): the series has a unit root, i.e. it is non-stationary.
H1 (alternative): the series is stationary.

Figure 35: Stationarity
As we can observe from the above, the p-value is greater than alpha, so we fail to reject
the null hypothesis: the series is non-stationary and has to be made stationary. A
stationary series is one whose statistical properties do not depend on the time at which
it is observed, so this result is consistent with the trend and seasonality seen earlier in
the dataset. After taking the first difference between consecutive observations, the
p-value is less than 0.05, so we reject the null hypothesis and treat the differenced series
as stationary.
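The differencing step can be sketched on the six rows of Table 1. The ADF test itself is not reproduced here; the sketch only shows what the first difference between consecutive observations looks like before the test is re-run on it.

```python
import pandas as pd

series = pd.Series(
    [1954, 2302, 3054, 2414, 2226, 2725],
    index=pd.date_range("1980-01-01", periods=6, freq="MS"),
    name="SoftDrinkProduction",
)

# First difference: each value minus the previous month's value.
# The first row has no predecessor, so it is dropped.
differenced = series.diff().dropna()
print(differenced)
```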

6. Build an automated version of the ARIMA/SARIMA model in which the
parameters are selected using the lowest Akaike Information Criteria
(AIC) on the training data and evaluate this model on the test data using
RMSE.
6.1 ARIMA Model

Figure 36: Running Automated ARIMA Model

Following are the results of the ARIMA model on the soft drink dataset:
Figure 37: Results of Automated ARIMA Model
As we can see from the above, the lowest AIC recorded for the data is 2027.528, for
(p, d, q) values of (3, 1, 3). The p-values of the MA1 and MA2 coefficients are 0 and
0.013, both below 0.05, so these coefficients are statistically significant. The RMSE and
MAPE values are:
RMSE: 784.989 MAPE: 16.2

6.2 SARIMA Model
Following is the outcome of the SARIMA Model run on the data:

Figure 38: SARIMA Model

As can be observed, the model with (p, d, q) of (3, 1, 3) has the lowest AIC, which is 14.
The p-values of ar.S.L12 and ma.S.L12 are less than 0.05, which makes them
statistically significant. The RMSE and MAPE values are:
RMSE: 429.452
MAPE: 9.95
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and
PACF on the training data and evaluate this model on the test data
using RMSE.
7.1 ACF and PACF plots
An autocorrelation (ACF) plot represents the autocorrelation of the series with lags of
itself. A partial autocorrelation (PACF) plot represents the correlation between a series
and a lag of itself that is not explained by correlations at all lower-order lags.
We would like all the spikes to fall in the blue region.

Figure 39: ACF and PACF result
The plots above show the ACF and PACF, respectively, for a stationary time series. They
indicate that an MA(1) model would be appropriate for the time series, because the ACF
cuts off after lag 1 while the PACF decays slowly.
Following is the outcome of SARIMA Model run on data:

Figure 40: SARIMA model
Following is the outcome of ARIMA Model run on data:

Figure 41: ARIMA model
8. Build a table with all the models built along with their corresponding
parameters and the respective RMSE values on the test data.
I have sorted the models based on lowest RMSE and MAPE values on test data.
Figure 42: RMSE and MAPE values on test data for all the model runs
We can observe that SARIMA (3, 1, 3)(3, 0, 0, 12) has the lowest RMSE and MAPE
scores on the test data and hence is the best model.
9. Based on the model-building exercise, build the most optimum model(s)
on the complete data and predict 12 months into the future with
appropriate confidence intervals/bands.
We can plot the real and the forecasted sales for the time series.
Figure 43: Forecasted sales

Figure 44: Lower and Upper Confidence interval bands
Figure 45: Lower and Upper Confidence interval forecasted plot
10. Comment on the model thus built and report your findings and
suggest the measures that the company should be taking for future sales.
• The company should come up with discount offers in the months of January to
May, as sales are low in these months.
• The company can also adopt a suitable price for soft drinks, as we saw many
outliers in the yearly data.
• Increase the sample size.
• Increase the number of independent variables.
• Try more combinations of variables to see if the accuracy of the model can be
improved.
