InCCCS 2024 Stock Price Prediction
Abstract—The application of Artificial Intelligence (AI) in stock price prediction has demonstrated significant advancements, with Machine Learning and Deep Learning techniques proving highly efficient in this domain. Two widely adopted architectures for stock price prediction are the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models. Recognizing the potential impact of financial news sentiment on forecasting, this study investigates whether incorporating such sentiment yields superior results compared to relying solely on historical stock prices. Accurate predictions in the financial market are challenging due to its inherent high volatility and non-linear nature. To contribute insights into the effectiveness of various deep learning architectures, a comparative analysis was conducted on Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and the Prophet model. The evaluation focused on three companies listed on the National Stock Exchange (NSE). The study employs the Mean Squared Error (MSE) metric to assess and compare the models’ performance. The results of the comparative analysis provide valuable implications for precision in stock price prediction within the context of market volatility and non-linearity. The study adds to the growing body of research on the application of AI in financial forecasting, emphasizing the importance of considering sentiment analysis alongside traditional approaches for enhanced predictive accuracy.
Index Terms—LSTM, GRU, Prophet, Time Series Analysis, National Stock Exchange (NSE)
I. INTRODUCTION
Stock price prediction is a compelling domain of interest, attracting researchers seeking to enhance forecasting and analysis within the financial landscape. Time series forecasting, a prevalent method applied to variables evolving over time, is particularly relevant in tracking the dynamic nature of stock prices. The overarching objective of this research is to enhance the predictive performance compared to existing models designed for stock price projection. The intricate nature of stock price prediction arises from its inherent volatility and unpredictability. This forecasting challenge can be categorized into three distinct timeframes:
a. Short-term forecasting encompasses predictions over brief periods such as seconds, minutes, hours, a few days, weeks, or months.
b. Medium-term forecasting extends the prediction horizon to one to two years.
c. Long-term forecasting involves predictions over periods exceeding two years.
Numerous models and techniques have been developed and utilized for stock price prediction. Our focus lies within time series forecasting, broadly categorized into two classes:
a. Linear Models: This encompasses ARIMA [1] and its variations, including SARIMA [2]. These models utilize predefined equations to fit mathematical models to univariate time series data. Our paper will delve into the ARIMA model, discussed in a subsequent section.
b. Non-Linear Models: This category includes deep learning algorithms, GARCH [3], and others. Deep learning algorithms, renowned for capturing non-linear patterns, are of particular interest. In our research, we employ Convolutional Neural Network (CNN) [4] and Long Short-Term Memory (LSTM) models. LSTMs, integral to Recurrent Neural Networks (RNNs), possess the ability to retain input information over extended periods, a crucial aspect in stock price prediction for enhanced precision and accuracy.
Given the vast and highly non-linear nature of stock market data, the preference for deep learning models is evident. However, working with time series data demands careful consideration of certain aspects:
a. Stationarity: This involves assessing statistical properties such as mean, variance, covariance, and standard deviation, ensuring they remain constant over time for a stationary time series.
b. Seasonality: Recognizing periodic fluctuations or patterns within the time series, denoted as seasonality, is crucial. Any predictable oscillation repeating over time falls within this category.
c. Autocorrelation: This metric measures the correlation between a variable’s current value and its past values, indicating the degree of correlation across different successive time intervals.
In summary, the immense and non-linear nature of stock market data necessitates sophisticated models, and our research aims to contribute to this domain by exploring and enhancing the performance of both linear and non-linear time series forecasting models.
The remainder of the paper is organized as follows. Section II provides a comprehensive review of related work, delving into existing literature in the field. Section III outlines the research methodology employed, elucidating the approach taken in conducting the study. Section IV discusses the obtained results, shedding light on the outcomes of the research. Finally, Section V concludes the paper, summarizing key findings and exploring the future scope of the study.
II. RELATED BACKGROUND
An expanding corpus of literature addresses the utilization of different machine-learning and deep-learning techniques for stock price prediction. Several studies have substantiated the effectiveness of these methodologies across various datasets and techniques in forecasting stock prices.
A notable contribution to the field of stock price prediction is presented in a study [6], which employed the Autoregressive Integrated Moving Average (ARIMA) model [5]. The research, titled "Stock Price Prediction Using the ARIMA Model," details the process of constructing a predictive model for stock prices. Utilizing stock data published in the Nigeria Stock Exchange (NSE) and New York Stock Exchange (NYSE), the authors developed stock price predictive models. Their findings highlight the ARIMA model’s significant potential for short-term predictions.
Subsequently, the same researchers, along with Otokiti Sunday O, extended their work in [8]. This time, they explored stock price prediction using Artificial Neural Networks (ANN) [7] and introduced a hybridized approach. This approach combines variables from technical and fundamental analyses of stock market indicators to enhance existing prediction methodologies.
Another study [9] delves into stock price prediction utilizing various deep learning models, specifically Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN) with a Sliding Window Model. The primary emphasis is on making price forecasts for individual companies or anticipating the movements of stock indexes using daily closing prices. The method under consideration employs a model-independent strategy.
A separate investigation [10] explored stock price prediction using Seasonal Autoregressive Integrated Moving Average (SARIMA) and the Prophet Prediction Model. Their models achieved a root mean square error (RMSE) difference of 44.9545.
As a whole, these investigations emphasise the possible effectiveness of machine learning and deep learning methodologies in forecasting stock prices. The application of these methodologies to forecast the prices of individual companies or stock index movements utilising daily closing prices shows potential for improving the accuracy of predictions. Furthermore, it has been suggested in the literature that specific models demonstrate noteworthy efficacy, including the Hybrid LSTM-GRU model and the Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). It has been observed that convolutional neural network (CNN) and ARIMA models perform well in scenarios requiring short-term predictions. It is acknowledged, however, that model performance may vary depending on the specific dataset and characteristics of the population under study.
III. PROPOSED FLOW
Figure 1 illustrates the proposed flow of the research.
Fig. 1. Proposed Flow
A. Data Preparation
Historical stock price datasets for TCS, ICICI, and Powergrid were procured from the National Stock Exchange (NSE) website, spanning a one-year timeframe. These datasets encompass diverse attributes, including but not limited to 'OPEN', 'Date', 'LOW', 'HIGH', 'Series', 'PREV. CLOSE', 'Close', 'VWAP', 'LTP', '52W L', '52W H', 'VALUE', 'VOLUME', and '# Trades'. These attributes collectively offer a comprehensive overview of the daily trading activities and performance metrics associated with the respective stocks.
Each dataset, representing TCS, ICICI, and Powergrid, was analyzed using five distinct machine learning models. The selected models comprised Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), AutoRegressive Integrated Moving Average (ARIMA), a Hybrid Model integrating LSTM and Gated Recurrent Unit (GRU), and Facebook’s Prophet. This approach was employed to systematically explore and evaluate the predictive capabilities of each model with respect to the unique characteristics of the individual datasets. By adopting such a comprehensive strategy, the research sought to enhance the depth and breadth of its findings, providing a nuanced understanding of the performance of each model in the context of the specific stock datasets under consideration.
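As an illustration of the data preparation step, the following minimal sketch extracts the date and closing-price columns from a CSV export. It assumes a comma-separated file with a header row containing 'Date' and 'Close'; the function name `load_close_series`, the header normalisation, and the sample rows are hypothetical and are not taken from the datasets used in this study.

```python
import csv
import io


def load_close_series(csv_text):
    """Return (dates, closes) parsed from CSV text with 'Date' and 'Close' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    dates, closes = [], []
    for row in reader:
        # Normalise header names, since exports may differ in case and padding.
        clean = {key.strip().lower(): value.strip() for key, value in row.items()}
        dates.append(clean["date"])
        # Remove thousands separators before converting the price to a float.
        closes.append(float(clean["close"].replace(",", "")))
    return dates, closes


# Illustrative rows only; these are not values from the paper's datasets.
sample = """Date,OPEN,HIGH,LOW,Close
01-Jan-2023,100.0,105.0,99.0,104.5
02-Jan-2023,104.5,106.0,103.0,"1,105.25"
"""
dates, closes = load_close_series(sample)
```

Only the standard library is used here so that the sketch stays self-contained; a dataframe library would serve equally well.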
B. Exploratory Data Analysis
The target variable selected for forecasting is ’close,’
with its preceding timestamp being utilized for analysis.
Consequently, all the analyses outlined below will be
executed with a specific emphasis on this feature.
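The autocorrelation of the 'close' series at a given lag can be estimated with the standard sample autocorrelation formula. The sketch below is one possible implementation under that assumption; the function name `autocorrelation` and the toy series are hypothetical, and the code is not the authors' analysis script.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (value lies between -1 and 1)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    # The total squared deviation normalises the lagged covariance.
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Toy closing prices for illustration only.
toy_close = [1.0, 2.0, 3.0, 4.0, 5.0]
lag1 = autocorrelation(toy_close, 1)
```

Values that decay slowly as the lag grows typically suggest a trend, and therefore a non-stationary series.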
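Fig. 5. Autocorrelation - Powergrid dataset
$$
\begin{aligned}
z_t &= \sigma(X_z \cdot [h_{t-1}, y_t] + a_z) \\
r_t &= \sigma(X_r \cdot [h_{t-1}, y_t] + a_r) \\
\tilde{h}_t &= \tanh(X_h \cdot [r_t \odot h_{t-1}, y_t] + a_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
\tag{3}
$$
Here, $\sigma$ is the sigmoid activation function, $\odot$ denotes element-wise multiplication, $X_z$, $X_r$, $X_h$ are weight matrices, and $a_z$, $a_r$, $a_h$ are the corresponding bias vectors. $y_t$ is the input at time $t$, $h_{t-1}$ and $h_t$ are the previous and current hidden states, and $\tilde{h}_t$ is the candidate state.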
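To make the gate equations concrete, the following minimal sketch performs a single GRU update with plain Python lists. It assumes the hidden state is a vector `h_prev`, the input is a vector `y_t`, and the weights are stored as lists of rows; the names `gru_step`, `affine`, and the `params` dictionary keys are hypothetical, and the zero-valued parameters are placeholders rather than trained values.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, vector, bias):
    """Compute weights @ vector + bias for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def gru_step(h_prev, y_t, params):
    """One GRU update: gates, candidate state, then the blended hidden state."""
    concat = h_prev + y_t  # concatenation [h_{t-1}, y_t]
    z = [sigmoid(v) for v in affine(params["Xz"], concat, params["az"])]  # update gate
    r = [sigmoid(v) for v in affine(params["Xr"], concat, params["ar"])]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + y_t  # [r_t * h_{t-1}, y_t]
    h_cand = [math.tanh(v) for v in affine(params["Xh"], gated, params["ah"])]
    # The update gate interpolates between the old state and the candidate state.
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


# Placeholder parameters: hidden size 2, input size 1, all weights and biases zero.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {"Xz": zeros, "Xr": zeros, "Xh": zeros,
          "az": [0.0, 0.0], "ar": [0.0, 0.0], "ah": [0.0, 0.0]}
h_next = gru_step([1.0, -2.0], [0.3], params)
```

With all parameters set to zero, both gates evaluate to 0.5 and the candidate state is zero, so the new hidden state is simply half of the previous one.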
Fig. 6. Partial Autocorrelation - Powergrid dataset
iii. Hybrid Model of GRU and LSTM: The hybrid model, integrating both Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), capitalizes on the respective strengths of each architecture to enhance predictive performance. The hybrid prediction is formulated as a weighted combination of the individual LSTM and GRU predictions, using an optimized weight parameter denoted as $\sigma$ (here $\sigma$ denotes this weight rather than the sigmoid function). Equation 4 gives the mathematical representation of the hybrid prediction model.
$$
\text{Hybrid} = \sigma \times \text{LSTM} + (1 - \sigma) \times \text{GRU} \tag{4}
$$
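The weighted combination in Equation 4 can be sketched as follows. The text does not state how the weight is optimized, so the grid search over [0, 1] that minimizes the mean squared error on held-out values is purely an assumption; the names `hybrid_forecast`, `best_weight`, and the toy numbers are hypothetical.

```python
def hybrid_forecast(lstm_pred, gru_pred, weight):
    """Blend two forecasts: weight * LSTM + (1 - weight) * GRU, element by element."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * l + (1 - weight) * g for l, g in zip(lstm_pred, gru_pred)]


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def best_weight(actual, lstm_pred, gru_pred, steps=100):
    """Assumed selection rule: grid search for the weight with the lowest MSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_squared_error(actual, hybrid_forecast(lstm_pred, gru_pred, w)),
    )


# Toy values for illustration only; they are not results from the paper.
actual = [10.0, 20.0]
lstm_pred = [12.0, 22.0]
gru_pred = [8.0, 18.0]
weight = best_weight(actual, lstm_pred, gru_pred)
blended = hybrid_forecast(lstm_pred, gru_pred, weight)
```

In practice the weight would be selected on a validation split rather than on the test data, to avoid an optimistic error estimate.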
Here, W_o is another weight matrix, and σ is the sigmoid activation function.
ii. GRU: The Gated Recurrent Unit (GRU), classified as another variant of recurrent neural network (RNN), features a cell with two primary gates: the update gate (z_t) and the reset gate (r_t). These gates govern the information
In this expression, 'N' signifies the total number of data points, a_d denotes the actual value on day 'd', and p_d represents the corresponding predicted value. The MSE is computed by squaring the difference between each actual and predicted value, summing these squared differences, and subsequently dividing by the total number of data points.
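Following the verbal definition above, the MSE corresponds to (1/N) multiplied by the sum over d of (a_d - p_d)^2, and the RMSE is its square root. A minimal sketch of both metrics is given below; the function names `mse` and `rmse` are hypothetical.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average of the squared differences a_d - p_d."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, expressed in the same units as the prices."""
    return math.sqrt(mse(actual, predicted))
```

For example, `mse([2, 4], [1, 2])` averages the squared errors 1 and 4.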
IV. RESULTS & DISCUSSION
This section provides a concise overview of the results, accompanied by an analysis of their significance within the context of the study topic. The historical prices for TCS, ICICI Bank, and Powergrid spanned a one-year duration and were directly sourced from the National Stock Exchange (NSE). In this study, the training of our model exclusively utilized the 'close' column.
A. TCS
The LSTM, LSTM-GRU hybrid, CNN, and Prophet models were implemented on the TCS dataset. The prediction graphs for these models are shown in Figures 7, 8 and 9.
Fig. 7. LSTM-GRU & PROPHET - TCS data
Fig. 8. LSTM model - TCS data
Fig. 9. CNN model - TCS data
B. POWERGRID
The LSTM, LSTM-GRU hybrid, CNN, and Prophet models were implemented on the Powergrid dataset. The prediction graphs for these models are shown in Figures 10, 11 and 12.
Fig. 10. LSTM-GRU & PROPHET - Powergrid data
Fig. 11. LSTM model - Powergrid data
Fig. 12. CNN model - Powergrid data
C. ICICI BANK
The LSTM, LSTM-GRU hybrid, CNN, and Prophet models were implemented on the ICICI Bank dataset. The prediction graphs for these models are shown in Figures 13, 14 and 15.
Fig. 15. CNN model - ICICI BANK data
In Figures 7, 10 and 13, the green line indicates the 'Hybrid' model and the orange line indicates the 'Prophet' model. In Figures 8, 11 and 14, the orange line indicates the training prediction and the green line indicates the test prediction.
The Root Mean Squared Error (RMSE) score was also calculated for the aforementioned models. In terms of the RMSE score, the LSTM-GRU Hybrid model demonstrates superior performance in forecasting the TCS stock price.
The calculation of the Root Mean Squared Error (RMSE)