

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 07 Issue: 02 | Feb 2020 p-ISSN: 2395-0072

House Price Predictor using ML through Artificial Neural Network

Kalaiselvi. S¹, Kokila. S², Bhavanathi. K³, S. Saravanan⁴, B.E, M.E, PHD
¹,²,³ UG Student, Department of CSE, Agni College of Technology, Chennai, India
⁴ Department of CSE, Agni College of Technology, Chennai, India

Abstract - Housing prices keep changing day in and day out, and are sometimes hyped rather than based on valuation. Predicting housing prices from real factors is the main crux of our research project. Here we aim to base our evaluation on every basic parameter that is considered while determining the price. We use various regression techniques along this pathway, together with an artificial neural network, which yields lower error and higher accuracy than the individual algorithms applied alone. We also propose to use real-time neighbourhood details, based on location, to get a more exact real-world valuation.

1. INTRODUCTION

Machine learning algorithms let us solve real-world problems with applications that are not complicated to implement. In this house price prediction project we use regression algorithms to predict the price of a house. The model is given a valid dataset whose input features include square footage, number of bedrooms, etc.; applying regression techniques to these features yields a predicted price for the house. The problem statement is to predict the monetary value of a house located in Bangalore with more accuracy using an artificial neural network, and to develop and evaluate the performance and predictive power of a model trained and tested on data collected from houses. A previous project makes use of linear regression, forest regression and boosted regression; the efficiency of those algorithms was further increased with the use of neural networks, and a system that aims to provide an accurate prediction of housing prices was developed. In our project we predict house prices for Bangalore city using various machine learning algorithms. The efficiency of each algorithm is tested with its R-squared value. Our survey led to the conclusion that the actual real estate value also depends on nearby local amenities such as railway stations, schools, hospitals, etc.

The modules are: exploring and processing the data; building and training with machine learning algorithms; comparing the R-squared values of the algorithms, where the algorithm with the highest R-squared value is used for house price prediction; and web development. The features used in the project are area type, availability, location, BHK, society, total square feet, bathrooms and balcony. The task is a supervised learning regression problem, so the dataset was tested with several ML algorithms: linear regression, decision tree regression, random forest regression and support vector regression.

2. LITERATURE SURVEY

First, we investigated various papers and discussions on machine learning for house price prediction [1]. One paper covers house price prediction with machine learning and neural networks, and describes achieving minimum error and maximum accuracy [2]. The next paper uses hedonic models based on price data from Belfast to infer submarkets in residential valuation; the model is used to identify submarkets over a wider spatial scale, with implications for the evaluation process related to the selection of comparable evidence and the quality of the variables that the valuation may need. [3] Another paper, on understanding recent trends in house prices and home ownership, describes a feedback mechanism or social epidemic that encourages a view of housing as an important investment in the market.

3. METHODS AND ALGORITHMS

DATA COLLECTION

The dataset is collected from Bangalore house prices. It contains several features: area type, availability, location, BHK, society, total square feet, bathrooms and balcony. Area type is categorized into three types: super built-up area, which is an already fully developed area; plot area, which is an area of empty ground; and built-up area, which is an area that is still being developed. Availability is categorized into ready to move, immediate possession and others.

LINEAR REGRESSION

Linear regression is a supervised learning algorithm. It predicts the value of a dependent variable (Y) based on a given independent variable (X), by modelling a linear relationship between the input (X) and the output (Y). It is one of the most well-known and well-understood algorithms in machine learning. Topics associated with linear regression models include simple linear regression, ordinary least squares, gradient descent and regularization.

DECISION TREE REGRESSION

Decision tree regression trains a model in the structure of a tree to predict future data and produce meaningful continuous output. The concepts involved are the fundamentals of decision trees, maximizing information gain, classification trees and regression trees. A decision tree is constructed by recursive partitioning. The root node is the first parent node, and each node can be split into child nodes. These nodes can in turn become the parent nodes of their own child nodes. To split the nodes at the most informative features, an objective function is defined that the tree learning algorithm optimizes; maximizing the information gain at each split is that objective.

CLASSIFICATION TREES

Classification trees are used to predict which class of a categorical dependent variable an object belongs to, from its measurements on one or more predictor variables.

REGRESSION TREES

Regression trees allow the input variables to be a mixture of continuous and categorical variables.

RANDOM FOREST REGRESSION

Random forest is an ensemble learning method for classification and regression that operates by constructing a multitude of decision trees. Decision trees are a popular method for various machine learning tasks. Tree learning comes close to meeting the requirements for an off-the-shelf procedure for data mining, because it is invariant under scaling and various other transformations of the feature values. However, trees that are grown very deep tend to learn highly irregular patterns. Random forest is a way of averaging multiple deep decision trees, trained on different parts of the same training set. This comes at the expense of a small increase in bias and some loss of interpretability.

SUPPORT VECTOR REGRESSION

Support vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.

In our experiments with several machine learning algorithms for this regression problem, the Decision Tree algorithm gave the minimum loss. The R-squared value for the Decision Tree is 0.998, which indicates a good model. The web application was built using the Decision Tree model.
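The model selection step described in this paper (fit several regressors, compare their R-squared values, keep the best) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the data values, the names `sqft`, `price`, `fit_line` and `fit_stump` are hypothetical, the tree is limited to a single split, and R-squared is computed on the training data for brevity, whereas a held-out test set would normally be used.

```python
from statistics import mean

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_stump(xs, ys):
    """Depth-1 regression tree: choose the threshold that minimizes
    the summed squared error of the two leaves (one recursive-partitioning step)."""
    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

# Hypothetical toy data: total square feet and price (made-up values).
sqft = [600, 800, 1000, 1200, 1500, 1800]
price = [30.0, 38.0, 52.0, 60.0, 75.0, 92.0]

models = {"linear": fit_line(sqft, price), "stump": fit_stump(sqft, price)}
scores = {name: r_squared(price, [model(x) for x in sqft])
          for name, model in models.items()}
best_model = max(scores, key=scores.get)  # model with the highest R-squared
```

On this nearly linear toy data the straight line scores higher than the single-split tree, which says nothing about the results reported for the real Bangalore dataset; a full decision tree would keep splitting its leaves recursively.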



Feature selection used select-K with the chi-square parameter, which selected the five most highly correlated features from the 50 input features. The selected features are random blood glucose level, blood urea, serum creatinine, packed cell volume and white blood cell count. The risk factor of a patient can be predicted from the selected feature values.
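The idea behind chi-square based select-K can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the paper: each feature is assumed to be already binarized (1 = abnormal, 0 = normal), the label is binary (1 = high risk), and the feature names and values are hypothetical. Library implementations such as scikit-learn's `SelectKBest` with `chi2` compute the statistic differently, on non-negative feature values.

```python
def chi_square(feature, labels):
    """Pearson chi-square statistic for the 2x2 table of a binary feature
    against a binary class label."""
    n = len(labels)
    observed = {(f, c): 0 for f in (0, 1) for c in (0, 1)}
    for f, c in zip(feature, labels):
        observed[(f, c)] += 1

    stat = 0.0
    for f in (0, 1):
        row_total = observed[(f, 0)] + observed[(f, 1)]
        for c in (0, 1):
            col_total = observed[(0, c)] + observed[(1, c)]
            expected = row_total * col_total / n
            if expected > 0:
                stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat

def select_k_best(columns, labels, k):
    """Rank features by chi-square score and keep the k highest."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical binarized features for eight patients (made-up values).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "high_glucose": [1, 1, 1, 1, 0, 0, 0, 0],  # tracks the label exactly
    "high_urea":    [1, 1, 1, 0, 0, 0, 0, 1],  # partly related to the label
    "noise":        [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label
}

selected, scores = select_k_best(columns, labels, k=2)
```

A feature that is independent of the label scores zero, so it is the first to be dropped when only the top k features are kept.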

Machine learning algorithms: with the selected features, various machine learning algorithms were tested. The algorithms are Support Vector Machine, Random Forest, Naïve Bayes, Logistic Regression, Decision Tree and K-NN. The highest accuracy was obtained with the selected features and the Support Vector Machine algorithm. The accuracy achieved was 95% with the 5 input features.
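How a classifier is scored by accuracy can be illustrated with one of the algorithms listed above. K-NN is used here only because it is short to write in plain Python; the paper reports the Support Vector Machine as the most accurate. The two-feature points, the labels and the function names are hypothetical, and the accuracy obtained on this toy data is unrelated to the 95% reported above.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical training set: two well-separated groups (made-up values).
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["low", "low", "low", "high", "high", "high"]

# Hypothetical held-out points used only for scoring.
test_x = [(1.1, 1.0), (3.0, 3.0), (0.8, 0.9), (2.9, 3.1)]
test_y = ["low", "high", "low", "high"]

predictions = [knn_predict(train_x, train_y, point) for point in test_x]
test_accuracy = accuracy(test_y, predictions)
```

Scoring on points that were not used for training is what makes the accuracy figure meaningful when several algorithms are compared.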

Web development: the proposed system was deployed using Django, with the selected feature values as input. The web application gives a real-time prediction of the risk factor.


The HCC-affected person's risk factor was classified with a Support Vector Machine. This was achieved with the select-K feature selection method using the chi-square parameter. The five most effective features were selected from 50 features using this method. The result achieved was 95% accuracy. The trained SVM model with 5 input features is able to predict low risk or high risk. An advantage of feature selection is that it eliminates unwanted features, which may otherwise increase the cost of the person's blood tests.


The proposed system predicts house prices in Bangalore city from several features. We tried several machine learning algorithms to find the best model. Compared with all the other algorithms, the Decision Tree algorithm produced the lowest loss and the highest R-squared. We developed the web application using the Django framework. It takes eight features as the input to the model.

1. Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002; 16:6-21.

2. Dhanasekaran R, Limaya A, Cabrera R. Hepatocellular carcinoma: current trends in worldwide epidemiology, risk factors, diagnosis, and therapeutics. Hepat Med. 2012; 4:19.

3. Mizuno Y, Meamura K, Tanaka Y, et al. Expression of delta-like 3 is down regulated by aberrant DNA methylation and histone modification in hepatocellular carcinoma. Oncol Rep. 2018; 39:220-2216.

4. Zhang Y, Petropoulos S, Liu J, et al. The signature of liver cancer in immune cells DNA methylation. Clin Epigenetics. 2018; 10:8.

5. Tsukuma H, Hiyama T, Tanaka S, et al. Risk factors for hepatocellular carcinoma among patients with chronic liver disease. N Engl J Med. 1993; 328(25):


6. Yoshizawa H. Hepatocellular carcinoma associated with hepatitis C virus infection in Japan: Projection to other countries in the foreseeable future. Oncology. 2002; 62 Suppl 1:8-17.

7. Chen JD, Yang HI, Hoeje UH, et al. Carriers of inactive hepatitis B virus are still at risk for hepatocellular carcinoma and liver-related death. Gastroenterology. 2010; 138(5):1747-1754.

8. Nishida N, Nagasaka T, Nishimura T, Ikai I, Boland CR, et al. (2008) Aberrant methylation of multiple tumor suppressor genes in aging liver, chronic hepatitis, and hepatocellular carcinoma. Hepatology

9. Feng Q, Stern JE, Haws SE, Lu H, Jiang M, et al. (2010) DNA methylation changes in normal liver tissues and hepatocellular carcinoma with different viral infection. Exp Mol Pathol 88: 287-292.

10. Ishak KG, Sobin LH (1994) Histological typing of tumors in the liver (International histological classification of tumors 2nd ed). Berlin: Springer-

11. Yeh CC, Goyal A, Shen J, et al. Global Level of plasma DNA Methylation is Associated with Overall Survival in Patients with Hepatocellular Carcinoma. Ann Surg Oncol. 2017; 24:3788-3795.

12. Xu R, Wei W, Krawczyk M, et al. Circulating tumor DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma. Nat Mater. 2017; 16:1155.

© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 3240
