Gold Price Prediction Using Ensemble Based Supervised Machine Learning


GOLD PRICE PREDICTION USING ENSEMBLE BASED SUPERVISED MACHINE LEARNING

INTERNAL GUIDE
Mrs. M. MANASA
ASSISTANT PROFESSOR

B.ROHAN NETHA 16C01A0503


V.DURGA DAS 16C01A0525
S.S.S VAMSI 16C01A0560
TABLE OF CONTENTS
1 ABSTRACT

2 INTRODUCTION

3 SYSTEM ANALYSIS

4 SYSTEM SPECIFICATION

5 MODULES

6 SYSTEM DESIGN

7 ALGORITHMS

8 SYSTEM TESTING

9 CONCLUSION
ABSTRACT

This project is based on a study conducted to understand the
relationship between gold price and selected factors
influencing it. Three machine learning algorithms, linear
regression, random forest regression and gradient boosting
regression were used in analyzing these data. We predict
future gold rates based on 22 market variables using machine
learning techniques.
INTRODUCTION

• In this session we explain the procedure of the project and how the
algorithms are used to predict gold prices.

• Algorithms such as linear regression and XGBoost are explained, and
their practical usage is shown.

• The bar graphs and dataset are implemented using the support
vector machine algorithm and the ARMA model.
SYSTEM ANALYSIS
EXISTING SYSTEM
• Open, Close, High, Low, and Volume of the commodity
• Text mining and artificial neural networks (ANN)
• Extreme learning machines (ELM)
The existing work concludes that ELM performs best, with an accuracy of 93.82%.

DISADVANTAGES OF EXISTING SYSTEM


• The existing system cannot predict future gold prices, and its accuracy is
also lower.
• The existing system cannot predict year-to-year, month-to-month, and
day-to-day sales.

PROPOSED SYSTEM
The column labeled ‘Proposed’ lists the attributes we used
to build the models.

• Dataset
• Correlation analysis
• Machine Learning Models

ADVANTAGES
• Consumption demand
• Protection against volatility
• Gold and interest rates
• Gold and inflation
SYSTEM SPECIFICATION
HARDWARE REQUIREMENTS
• System: Intel i3
• Hard Disk: 40 GB
• Floppy Drive: 1.44 MB
• Monitor: 14 Color Monitor
• Mouse: Optical Mouse
• RAM: 512 MB or higher

SOFTWARE REQUIREMENTS
• Operating system: Windows 10 Ultimate
• Coding Language: Python (Anaconda distribution)
• Front-End: Python
• Data set: Kaggle
MODULES

• User
• Data and Methodology
• Linear Regression
• Random Forest Regressor
• Gradient Boosting Regression
SYSTEM DESIGN

SYSTEM ARCHITECTURE
DATA FLOW DIAGRAM
UML DIAGRAMS

USE-CASE DIAGRAM
[Figure: use-case diagram. Actor: User. Use cases: Loading Dataset, Dataset shape, Gold Price, Confusion Matrix, Linear Regression, Random forest.]


CLASS DIAGRAM
[Figure: class diagram with three classes: Pandas, Dataset, Algorithm.]

Pandas
+file path = data_inr.csv
+Read.csv=data(data_inr.csv)

Dataset
+data_inr.csv
+list
+head()
+Indianrupees()
+Year()
+Price()

Algorithm
+MSE
+MAE
+RMSE
+Linear Regression()
+Random Forest()
SEQUENCE DIAGRAM
[Figure: sequence diagram. Participants: pandas, Dataset, Algorithm.]
1 : Loading Dataset()
2 : Dataset Loaded()
3 : Gold Price()
4 : MAE()
5 : MSE()
6 : RMSE()
7 : Logistic Regression()
8 : Random Forest()
ACTIVITY DIAGRAM
[Figure: activity diagram. Recoverable node labels: Loading Dataset, DataFrame Created, Indian Rupees, MSE, MAE, RMSE (each with Increase/Decrease branches), Logistic regression, Random forest, Predicted Future.]
ALGORITHMS

Support Vector Regressor (SVR)


• The model produced by Support Vector Regression depends only on a subset
of the training data, because the cost function for building the model ignores
any training data close to the model prediction.
• We used SVR with kernel='linear'.
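
As a rough illustration of the point about support vectors, the sketch below fits a linear-kernel SVR with scikit-learn. The data is synthetic (three hypothetical features standing in for market variables), and C and epsilon are left at the library defaults, which is an assumption since the slides do not state them.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-in data: 200 samples, 3 hypothetical market features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Linear kernel as stated on the slide; C and epsilon keep their defaults.
svr = SVR(kernel="linear")
svr.fit(X, y)

# Points inside the epsilon tube do not become support vectors,
# so the fitted model depends on only a subset of the training data.
n_support = len(svr.support_)
print(f"{n_support} of {len(X)} training points are support vectors")
```

Widening epsilon would typically leave fewer support vectors, at the cost of a coarser fit.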

Random Forest Regressor


• Random Forest is a supervised learning algorithm: an ensemble of Decision
Trees, usually trained with the “bagging” method. The general idea
of bagging is that combining several learning models improves the
overall result.
• We used two parameters: n_estimators=50, the number of trees in the
forest (the default was 10 in scikit-learn releases before 0.22 and is 100
from 0.22 onward), and random_state=0, the seed used by the random
number generator.
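
A minimal sketch of these settings with scikit-learn follows. The data is synthetic and the 75/25 random split is an assumption made only for illustration; the project's own features and split are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with one nonlinear and one linear effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] ** 2 + 3.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 50 bagged trees with a fixed seed, matching the parameters named above.
rf = RandomForestRegressor(n_estimators=50, random_state=0)
rf.fit(X_train, y_train)

# R^2 on the held-out rows.
r2 = rf.score(X_test, y_test)
print(f"trees: {len(rf.estimators_)}, test R^2: {r2:.3f}")
```

For real price series, a chronological split is usually a safer choice than a random one, since shuffling can leak future information into training.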
Lasso CV
• The Lasso is a linear model that estimates sparse coefficients.
• It performs L1 regularization, i.e. it adds a penalty proportional to the sum of
the absolute values of the coefficients.
• Minimization objective = least-squares objective + α * (sum of absolute values
of coefficients)
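
The sketch below shows how a cross-validated Lasso could be fitted with scikit-learn's LassoCV. The data is synthetic, with only two of ten hypothetical features carrying signal, and cv=5 is an assumed setting. Note that scikit-learn scales the least-squares term by 1/(2n), so its α is not numerically identical to the α in the objective above.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic data: 10 hypothetical features, only the first two matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ true_coef + rng.normal(scale=0.1, size=200)

# LassoCV picks alpha from an automatic grid by 5-fold cross-validation.
lasso = LassoCV(cv=5, random_state=0)
lasso.fit(X, y)

# Count coefficients set exactly to zero (may be none if alpha_ is small).
n_zero = int(np.sum(lasso.coef_ == 0.0))
print(f"alpha: {lasso.alpha_:.4f}, exact zeros: {n_zero}")
print("coefficients:", np.round(lasso.coef_, 3))
```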

Ridge CV
• Ridge regression addresses some of the problems of Ordinary Least Squares by
imposing a penalty on the size of coefficients. The ridge coefficients minimize
a penalized residual sum of squares,

$$\min_{w} \lVert Xw - y \rVert_2^2 + \alpha \lVert w \rVert_2^2$$
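
To make the penalized objective concrete, the sketch below fits scikit-learn's RidgeCV on synthetic data and compares it with the closed-form minimizer w = (XᵀX + αI)⁻¹Xᵀy. The alpha grid is an assumption, and fit_intercept=False is set only so that the closed form applies directly.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic stand-in data with 5 hypothetical features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=100)

# Candidate penalties (assumed grid); RidgeCV selects one by cross-validation.
alphas = [0.01, 0.1, 1.0, 10.0]
ridge = RidgeCV(alphas=alphas, fit_intercept=False)
ridge.fit(X, y)

# Closed-form minimizer of ||Xw - y||^2 + alpha * ||w||^2 at the chosen alpha.
alpha = float(ridge.alpha_)
w_closed = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# The gradient of the objective should vanish at the minimizer.
grad = 2.0 * X.T @ (X @ w_closed - y) + 2.0 * alpha * w_closed
max_diff = float(np.max(np.abs(w_closed - ridge.coef_)))
print(f"chosen alpha: {alpha}, max |closed form - RidgeCV|: {max_diff:.2e}")
```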


Bayesian Ridge
• Bayesian regression techniques can be used to include regularization
parameters in the estimation procedure: the regularization parameter is not set
in a hard sense but tuned to the data at hand.
• We used the default parameters for this algorithm.
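
A minimal sketch with scikit-learn's BayesianRidge at its default parameters is shown below, on synthetic data. It illustrates that the regularization is estimated from the data: after fitting, alpha_ holds the estimated noise precision and lambda_ the estimated precision of the weights.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Synthetic stand-in data with 4 hypothetical features.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = X @ np.array([1.5, 0.0, -1.0, 0.3]) + rng.normal(scale=0.2, size=150)

# Default parameters, as stated above.
model = BayesianRidge()
model.fit(X, y)

# Predictions come with a per-sample standard deviation.
mean, std = model.predict(X[:3], return_std=True)
print(f"noise precision alpha_: {model.alpha_:.3f}")
print(f"weight precision lambda_: {model.lambda_:.3f}")
print("predictive std:", np.round(std, 3))
```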

Gradient Boosting Regressor


• Gradient Tree Boosting, or Gradient Boosted Regression Trees (GBRT), is a
generalization of boosting to arbitrary differentiable loss functions.
• Parameters used:
n_estimators=70, the number of boosting stages to perform;
learning_rate=0.1, which shrinks the contribution of each tree;
max_depth=4, the maximum depth of each tree, which limits its number of nodes;
random_state=0, the seed for the random number generator;
loss='ls', least squares regression (this value was renamed 'squared_error'
in scikit-learn 1.0).
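
The sketch below applies these gradient boosting settings with scikit-learn on synthetic data. The loss argument is deliberately left at its default, which is least squares in both older and newer releases, so the snippet does not depend on whether the installed version spells it 'ls' or 'squared_error'.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data with a nonlinear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

# 70 boosting stages, shrinkage 0.1, trees of depth at most 4, fixed seed.
# loss is left at the default (least squares).
gbr = GradientBoostingRegressor(
    n_estimators=70, learning_rate=0.1, max_depth=4, random_state=0
)
gbr.fit(X, y)

# train_score_ records the training loss after each boosting stage.
first_loss = float(gbr.train_score_[0])
last_loss = float(gbr.train_score_[-1])
print(f"training loss: {first_loss:.4f} -> {last_loss:.4f}")
```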
SYSTEM TESTING

• The purpose of testing is to discover errors.

Types of testing
• Unit testing
• Functional testing
• Integration testing
• White box testing
• Black box testing
• Acceptance testing
TEST CASES
S.NO  Test Case for gold price  Expected Result  Result  Remarks
1     Histogram diagram         Success          Pass    If Histogram not available
2     Line graph diagram        Success          Pass    If Line graph not available
3     Linear regression         Success          Pass    If LR not available
4     Random forest             Success          Pass    If RF not available
5     Gradient boosting         Success          Pass    If GBR not available
6     Indian rupee              Success          Pass    If INR not available
7     Result for RMSE           Success          Pass    If RMSE not available
8     R Square                  Success          Pass    If R square not available
9     ACF and PACF plots        Success          Pass    If ACF, PACF not available
10    State space model         Success          Pass    If state model not available
11    Accuracy measure-LR       Success          Pass    If Accuracy measure not available
12    Accuracy measure-RF       Success          Pass    If Accuracy measure not available
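
The RMSE and R Square checks listed in the test cases can be computed as in the sketch below. The function name regression_metrics and the four price values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))
    mse = float(np.mean(err ** 2))
    rmse = float(np.sqrt(mse))
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical actual vs. predicted prices.
metrics = regression_metrics(
    [100.0, 110.0, 120.0, 130.0],
    [102.0, 108.0, 123.0, 129.0],
)
print(metrics)
```

Here the errors are -2, 2, -3 and 1, so MAE is 2.0, MSE is 4.5, and R^2 is 1 - 18/500 = 0.964.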
SCREENSHOTS

SNAPSHOT OF DATA
EXPLORATORY VISUALIZATION OF DATA
DAILY RETURNS OF GOLD AND OTHER FACTORS
GOLD MOVING AVERAGE
OPEN-CLOSE & HIGH-LOW PRICES
SCATTER PLOTS
REVIEW OF BENCHMARK AND SOLUTION MODEL
ENSEMBLE SOLUTION ACTUAL vs. PREDICTED
GOLD PRICE COMPARISON FOR 14 YEARS
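
The daily returns and moving average named in the screenshot titles above could be derived with pandas roughly as follows. The six closing prices and the 3-day window are hypothetical; the project's actual data and window length are not given in the slides.

```python
import pandas as pd

# Hypothetical closing prices indexed by date.
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 106.0],
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
    name="Close",
)

# Daily return: percentage change from the previous day's close.
daily_return = close.pct_change()

# 3-day simple moving average; the first two entries are NaN.
ma3 = close.rolling(window=3).mean()

print(pd.DataFrame({"Close": close, "Return": daily_return, "MA3": ma3}))
```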
CONCLUSION

• In this study, we used machine learning algorithms to predict gold rates
very accurately.
• Our study is also the most comprehensive to date, taking into
consideration economic indicators from various countries and
companies.

FUTURE EXTENSIBILITY
• Monthly Returns
• Performance of gold with markets
• Break-up of historical demand of gold data
• Should you invest?
• Gold ETF rise and fall
• Gold Inflation
ANY QUERIES..?
