Natural Gas Price Prediction Using Machine Learning-IJRASET
https://doi.org/10.22214/ijraset.2021.37291
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.429
Volume 9 Issue VIII Aug 2021- Available at www.ijraset.com
3) Semi-supervised learning: Semi-supervised learning is used for the same applications as supervised learning, but it trains on both
labelled and unlabelled data, typically a small amount of labelled data with a large amount of unlabelled data
(because unlabelled data is less expensive and takes less effort to acquire). This type of learning can be used with methods
such as classification, regression and prediction. Semi-supervised learning is useful when the cost of labelling is
too high to allow a fully labelled training process. An early example is identifying a person's face on a webcam.
4) Reinforcement learning: Reinforcement learning is often used for robotics, gaming and navigation. With reinforcement
learning, the algorithm discovers through trial and error which actions yield the greatest rewards. This type of learning has
three primary components: the agent (the learner or decision maker), the environment (everything the agent interacts with) and
actions (what the agent can do). The objective is for the agent to choose actions that maximize the expected reward over a
given amount of time. The agent will reach the goal much faster by following a good policy. So, the goal in reinforcement
learning is to learn the best policy.
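The trial-and-error loop described above can be illustrated with a minimal sketch. It assumes a toy "multi-armed bandit" environment and an epsilon-greedy agent; the function name `run_bandit`, the reward probabilities and the parameter values are hypothetical choices made for illustration and are not taken from this paper.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions      # how often the agent tried each action
    values = [0.0] * n_actions    # running estimate of each action's mean reward
    total_reward = 0.0
    for _ in range(steps):
        # Agent: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])
        # Environment: reward 1 with the action's hidden probability, else 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Update the estimate for the chosen action (incremental mean).
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", values)
print("preferred action:", best_action)
```

The learned policy here is simply "pick the action with the highest estimated value"; richer environments would need a state-dependent policy.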
C. Applications of Machine Learning
1) Classification: Classification, or categorization, is the process of assigning objects or instances to a set of predefined
classes. A machine learning approach makes a classifier system more dynamic. The goal of the ML approach is to
build a concise model that improves the efficiency of the classifier system. Every instance in a data set
used by a machine learning algorithm is represented using the same set of features. If the instances
have known labels, the task is called supervised learning. In contrast, if the labels are unknown, it is
called unsupervised learning. Both variations of the machine learning approach are used for classification problems.
2) Prediction: Prediction is the process of estimating a future outcome from previous history, for example weather prediction, traffic
prediction, and many more. All sorts of forecasts can be made using a machine learning approach. Several methods, such as the
Hidden Markov model, can be used for prediction.
3) Regression: Regression is another application of machine learning, and several regression techniques are available.
Suppose X1, X2, X3, ..., Xn are the input variables and Y is the output. In this case, machine learning is used
to produce the output (Y) on the basis of the input variables (X). A model is used to express the relationship between the
parameters as Y = g(X). Using a machine learning approach in regression, the parameters of g can be optimized.
4) Image Recognition: Image recognition is one of the most significant examples of machine learning and artificial intelligence.
It is an approach for identifying and detecting a feature or an object in a digital image. This technique
can also be used for further analysis, such as pattern recognition, face detection, face recognition, optical character recognition, and
many more. Though several techniques are available, using a machine learning approach for image recognition is preferable.
In a machine learning approach, image recognition involves extracting the key features from the image and then
feeding these features to a machine learning model.
5) Sentiment Analysis: Sentiment analysis is another real-time machine learning application. It is also referred to as opinion mining or
sentiment classification. It is the process of determining the attitude or opinion of the speaker or the writer. In other words,
it is the process of finding the emotion expressed in a text. The main concern of sentiment analysis is "what do other people think?".
Assume that someone writes 'the movie is not so good.' Finding out the actual opinion in the text (is it good or
bad) is the task of sentiment analysis. Sentiment analysis can also be applied in further applications, such as
review-based websites and decision-making applications. The machine learning approach is a discipline that constructs a system by
extracting knowledge from data. Additionally, this approach can use big data to develop a system. In the machine learning
approach, there are two types of learning algorithm, supervised and unsupervised. Both of these can be used for sentiment
analysis.
6) Speech Recognition: Speech recognition is the process of transforming spoken words into text. It is also called
automatic speech recognition, computer speech recognition, or speech to text. This field has benefited from advances in
machine learning and big data. At present, commercial speech recognition systems use a machine
learning approach to recognize speech. A speech recognition system using a machine learning approach outperforms
one using a traditional method because, in a machine learning approach, the system is trained
before it is validated. The machine learning software for speech recognition works in two learning phases:
a) Before the software purchase (train the software in a speaker-independent domain)
b) After the user purchases the software (train the software in a speaker-dependent domain). This application can also be used for
further analysis, e.g., in the health care, education, and military domains.
7) Recommendation for Products and Services: A recommender system is a system that is capable of predicting a user's
future preference over a set of items and recommending the top items. One key reason why we need recommender
systems in modern society is that people have too many options to choose from due to the prevalence of the Internet. We can find large
scale recommender systems in retail, video on demand, and music streaming. In order to develop and maintain such systems, a
company typically needs a group of expensive data scientists and engineers. Machine learning algorithms in recommender
systems are typically classified into two categories, content-based and collaborative filtering methods, although modern
recommenders combine both approaches. Content-based methods are based on the similarity of item attributes, and collaborative
methods calculate similarity from interactions. Collaborative methods enable users to discover new
content dissimilar to items viewed in the past.
8) Information Retrieval: Information retrieval is one of the most significant machine learning and AI applications. It is the process of
extracting knowledge or structured data from unstructured data. The amount of available information has
grown tremendously through web blogs, websites, and social media, so information retrieval plays a vital role in the big data sector. In
a machine learning approach, a set of unstructured data is taken as input and knowledge is extracted from it.
9) Robot Control: Machine learning algorithms are used in a variety of robot control systems. For instance, several research
efforts have recently worked on achieving stable helicopter flight and helicopter aerobatics. A DARPA-sponsored
competition in which robots drove for over one hundred miles in the desert was won by a robot that used machine learning to
refine its ability to notice distant objects.
10) Virtual Personal Assistant: A virtual personal assistant is an advanced application of machine learning and artificial
intelligence. A machine learning based assistant takes an input, processes it,
and gives the resulting output. The machine learning approach is important because such systems act based on
experience. Examples of virtual personal assistants include smart speakers such as Amazon Echo and Google Home and mobile apps such as
Google Allo.
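As an illustration of the regression application above, where the parameters of Y = g(X) are optimized from data, the following is a minimal sketch. It assumes g is linear and fits it by ordinary least squares on synthetic data; the coefficients in `true_w`, the intercept and the noise level are hypothetical values chosen for the example, not results from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs X1..X3 and an output Y generated from assumed parameters.
X = rng.uniform(0, 10, size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])     # hypothetical "true" weights
y = X @ true_w + 4.0 + rng.normal(0, 0.1, size=200)

# Append a column of ones so the intercept is learned with the weights.
A = np.column_stack([X, np.ones(len(X))])
params, *_ = np.linalg.lstsq(A, y, rcond=None)


def g(x):
    """Fitted model: predicts Y from an input vector (x1, x2, x3)."""
    return np.asarray(x) @ params[:-1] + params[-1]


print("fitted weights and intercept:", params)
print("prediction for [1, 2, 3]:", g([1.0, 2.0, 3.0]))
```

A non-linear g (for example a decision tree) would be fitted through the same pattern: choose a model family, then optimize its parameters against the observed (X, Y) pairs.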
II. SYSTEM DESIGN
B. Pre-processing
Data pre-processing is the process of cleaning raw data, i.e. data collected in the real world is converted to a clean data
set. Certain steps are executed to convert the data into a small, clean data set and make it feasible for analysis; this part of the
process is called data pre-processing. Most real-world data is messy in one of the following ways:
1) Missing Data
2) Noisy Data
3) Inconsistent Data
Some of the basic pre-processing techniques that can be used to convert raw data are:
a) Conversion of Data
b) Ignoring the missing values
c) Filling the missing values
d) Detection of outliers
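The four techniques listed above can be sketched with pandas as follows. The tiny table, its column names (`date`, `price`) and its values are made up for illustration, and the interquartile-range rule used for outlier detection is one common choice assumed here; the text does not prescribe a particular method at this point.

```python
import pandas as pd

# Hypothetical raw data: prices stored as text, one missing, one implausible.
raw = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-01-03",
             "2020-01-04", "2020-01-05", "2020-01-06"],
    "price": ["2.12", "2.15", None, "2.20", "25.00", "2.18"],
})

# a) Conversion of data: parse dates and turn price strings into numbers.
df = raw.copy()
df["date"] = pd.to_datetime(df["date"])
df["price"] = pd.to_numeric(df["price"])

# b) Ignoring the missing values: drop rows without a price.
dropped = df.dropna(subset=["price"])

# c) Filling the missing values: here with the column median.
filled = df.assign(price=df["price"].fillna(df["price"].median()))

# d) Detection of outliers: flag values outside 1.5 * IQR of the quartiles.
q1, q3 = filled["price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (filled["price"] < q1 - 1.5 * iqr) | (filled["price"] > q3 + 1.5 * iqr)
outliers = filled[is_outlier]

print(dropped.shape, filled["price"].tolist())
print(outliers)
```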
C. Feature Extraction
When the input data to an algorithm is too large to be processed and is suspected to be redundant, it can be transformed into
a reduced set of features. Determining a subset of the initial features is called feature selection. The selected features are expected
to contain the relevant information from the input data, so that the desired task can be performed by using this reduced
representation instead of the complete initial data. Feature extraction involves reducing the number of resources required to
describe a large set of data. When performing analysis of complex data, one of the major problems stems from the number of
variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation
power, and it may cause a classification algorithm to overfit the training samples and generalize poorly to new samples. Feature
extraction is a general term for methods of constructing combinations of the variables to get around these problems while still
describing the data with sufficient accuracy. Many machine learning practitioners believe that properly optimized feature
extraction is the key to effective model construction.
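One widely used way of constructing combinations of variables is principal component analysis (PCA). The sketch below is an assumption for illustration, since the text does not name a specific technique: it builds ten redundant variables from two hidden factors and shows that two extracted features describe the data almost completely.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 10 observed variables that are noisy mixtures of 2 factors.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + rng.normal(0, 0.01, size=(300, 10))

# Extract 2 features (linear combinations of the 10 original variables).
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()

print("reduced shape:", X_reduced.shape)
print("fraction of variance kept:", explained)
```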
D. Model Selection
Model selection is the process of selecting one final machine learning model from among a collection of candidate machine
learning models for a training dataset. Model selection is a process that can be applied both across different types of models and
across models of the same type configured with different model hyperparameters.
Candidate classification models include:
1) K-Nearest Neighbor
2) Naive Bayes
3) Decision Trees/Random Forest
4) Support Vector Machine
5) Logistic Regression
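A minimal sketch of selecting among the candidate models listed above is to compare their cross-validated scores on the same data. The synthetic dataset, the 5-fold setting and the default hyperparameters below are assumptions made for the example and do not reflect the configuration used in this paper.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a labelled training dataset.
X, y = make_classification(n_samples=300, n_features=8,
                           n_informative=4, random_state=0)

candidates = {
    "K-Nearest Neighbor": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy for every candidate.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("selected:", best_name)
```

The same loop can compare one model type under different hyperparameter settings by adding several configured instances to `candidates`.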
E. Train and Test Data
For training a model we initially split the dataset into two sections, 'training data' and 'testing data'. The classifier is
trained using the training data set, and its performance is then tested on the unseen test data set. Training set: The training set
is the material through which the computer learns how to process information. Machine learning uses algorithms to perform the
training part. The training data set is used for learning and to fit the parameters of the classifier. Test set: A set of unseen data used
only to assess the performance of a fully specified classifier.
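The split described above can be sketched with scikit-learn's `train_test_split`. The synthetic data, the 80/20 ratio and the decision tree classifier are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic features and a label that depends on the first two features.
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 20% of the rows as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit on the training set only, then score on the test set.
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

print(len(X_train), len(X_test), test_accuracy)
```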
F. Evaluation
Model evaluation is an integral part of the model development process. It helps to find the model that best represents the data and
to estimate how well the chosen model will work in the future. To improve the model, its hyperparameters can be tuned so that the
accuracy increases. A confusion matrix can guide this tuning, the aim being to increase the number of true positives and true negatives.
The model predicts outputs for the test inputs, the predictions are compared with the known test outputs, and the result is displayed.
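As a small worked example of the confusion matrix mentioned above, the sketch below compares a made-up list of true labels with a made-up list of predictions; the label values are illustrative only.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score

# Hypothetical test labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = accuracy_score(y_true, y_pred)     # (tp + tn) / all predictions
precision = precision_score(y_true, y_pred)   # tp / (tp + fp)

print("tn, fp, fn, tp:", tn, fp, fn, tp)
print("accuracy:", accuracy, "precision:", precision)
```

Here 6 of the 8 predictions are correct (3 true positives and 3 true negatives), so the accuracy is 0.75, and 3 of the 4 predicted positives are correct, so the precision is also 0.75.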
G. Interface
A web interface is built to take input and display an output. Flask, a Python web framework, is used to build the web interface, and the pickle library is
used to serialize the trained model so that the web page can load it.
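The way Flask and pickle can fit together is sketched below. The route name `/predict`, the JSON field `features` and the tiny stand-in model are hypothetical, and the model is pickled in memory to keep the example self-contained; the actual application would load its trained model from a file.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.linear_model import LinearRegression

# Stand-in model trained on toy data (y = 2x + 1); not the paper's model.
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])

# Serialize and restore the model with pickle, as a saved file would be.
model_bytes = pickle.dumps(model)
loaded_model = pickle.loads(model_bytes)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Read feature values from the JSON body and return the prediction."""
    payload = request.get_json()
    features = [[float(v) for v in payload["features"]]]
    prediction = loaded_model.predict(features)[0]
    return jsonify({"prediction": float(prediction)})

# To serve the page locally, one option is to call app.run() from a script.
```

A client would then send a POST request such as `{"features": [3.0]}` to `/predict` and receive the predicted value as JSON.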
H. Dataset Description
We used a dataset that is available from the Kaggle repository.
Steps involved in predicting the natural gas price:
1) Step 1: Install and import required libraries
2) Step 2: Check the information of the data set. It is an object of the class ‘pandas.core.frame.DataFrame’. It also shows the
datatype of the variables.
4) Step 4: Data munging. Check for missing values in each column of the dataset.
7) Step 7: Label encoding. Import LabelEncoder from sklearn.preprocessing and encode the categorical data.
8) Step 8: Use stats to compute the z-score of the data and check it against a threshold.
9) Step 9: Data pre-processing. This step fills the missing values of categorical variables with the mode of the respective variable.
10) Step 10: In this step the data is fit and transformed using StandardScaler.
11) Step 11: The model is dumped to disk using the joblib library and tested.
12) Step 12: The data is split using the train_test_split function, the model is trained on the training portion, and it can then predict the output.
13) Step 13: Decision Tree Regressor: Apply Decision Tree Regression to the train dataset. Predict the output of the test dataset from the
trained model. Calculate accuracy, precision and confusion matrix.
14) Step 14: Random Forest Classifier: Apply the Random Forest Classifier to the train dataset. Predict the output of the test dataset from the
trained model. Calculate accuracy and precision.
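The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the Kaggle dataset's columns are not given in the text, so a synthetic table with hypothetical columns (`month`, `region`, `demand`, `price`) stands in for it; `stats` is assumed to mean `scipy.stats`; missing values are filled before encoding and the scaler is fitted on the training split only, so the order differs slightly from the list; and the regressor is scored with R², a regression metric.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the dataset (all columns are hypothetical).
rng = np.random.default_rng(0)
n = 240
df = pd.DataFrame({
    "month": rng.integers(1, 13, n),
    "region": rng.choice(["north", "south"], n),
    "demand": rng.normal(100, 10, n),
})
df["price"] = (2.0 + 0.02 * df["demand"] + 0.1 * df["month"]
               + rng.normal(0, 0.05, n))
df.loc[rng.random(n) < 0.1, "region"] = None   # inject missing categories

# Steps 2 and 4: inspect the data and count missing values per column.
df.info()
missing_before = df.isnull().sum()

# Step 9: fill missing categorical values with the mode of the column.
df["region"] = df["region"].fillna(df["region"].mode()[0])

# Step 7: label-encode the categorical column.
df["region"] = LabelEncoder().fit_transform(df["region"])

# Step 8: keep rows whose numeric z-scores are below a threshold of 3.
z = np.abs(stats.zscore(df[["demand", "price"]].to_numpy()))
df = df[(z < 3).all(axis=1)].copy()

# Step 12: split into training and test data.
X = df[["month", "region", "demand"]]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Step 10: standardize features (fitted on the training split only).
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 13: fit a decision tree regressor and score it on the test data.
model = DecisionTreeRegressor(max_depth=5, random_state=0)
model.fit(X_train_s, y_train)
r2 = model.score(X_test_s, y_test)

# Step 11: dump the model with joblib and load it back to test it.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)
restored = joblib.load(path)

print("missing values before filling:", missing_before.to_dict())
print("test R^2:", r2)
```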
IV. CONCLUSION
The present work has presented the current state of the field of natural gas price forecasting. The empirical results demonstrate that the
prediction methods have decent performance in forecasting the natural gas price. Predicting the exact
daily price of natural gas has always been a difficult task, but our model is convenient and cost-effective, and it can predict the price for the next 12 months.
The presented algorithms highlight that machine learning models constitute a wide and efficient choice for addressing price
prediction and also provide a significant boost in forecasting performance. The main contribution of this work is the
development of a new forecasting model for the short-term prediction of natural gas price.