Car Price Prediction


CAR RESALE VALUE PREDICTION
ABSTRACT
The main motivation for this project is to present a prediction model for
car prices. The work also aims to identify the best classification algorithm
for sales analysis.

• This project proposes a forecasting approach that is based solely on data
retrieved from sales records and allows for straightforward human
interpretation. It therefore proposes two generalized models for predicting
future sales. In an extensive evaluation, data sets consisting of car price
data are used.

• In this work, the data mining classification algorithm Naïve Bayes is used
to develop a prediction system that analyzes and predicts sales volume. In
addition, various grouping operations and charts are prepared in the proposed
system for better classification results. The project is implemented in
Python 3.7.
EXISTING SYSTEM
• In the existing system, a car dataset containing attributes such as engine
type, cylinder number, and price is taken, and classification/prediction is
carried out on it. The Naïve Bayes algorithm is used. 75% of the whole data
set is used as training data to build the model. The remaining 25% is then
used as test data and checked against the trained model.

DRAWBACKS

• Naïve Bayes classification yields conditional probability values only for
the existing dataset; new test data is not added for classification.
• Naïve Bayes classification is not preferred when the data contains many
outliers.
• Chart preparation is not carried out.
PROPOSED SYSTEM
• All the approaches of the existing system are carried out in the proposed
system. In addition to Naïve Bayes based classification, various grouping
operations are used to support the prediction model. This is found to be
suitable especially when the data set has a large number of records or
contains outlier data. A wide variety of sales records can be taken for
classification by engine type and cylinder count, and a new model can be
predicted at the same time, increasing efficiency. SVM and KNN
classification are also carried out.

ADVANTAGES
• Chart preparation is carried out.
• Grouping of records by various columns is prepared and displayed.
• Sales per engine type are computed.
• Sales per cylinder count are computed.
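The grouping by engine type and cylinder count listed above can be sketched in a few lines. This is only an illustrative sketch using the Python standard library; the sample records and the field names engine_type, cylinders, and price are hypothetical and are not taken from the project's dataset.

```python
from collections import defaultdict

# Hypothetical sample records; the real dataset's column names may differ.
records = [
    {"engine_type": "ohc", "cylinders": 4, "price": 13495.0},
    {"engine_type": "ohc", "cylinders": 4, "price": 16500.0},
    {"engine_type": "ohcv", "cylinders": 6, "price": 16500.0},
    {"engine_type": "dohc", "cylinders": 4, "price": 13950.0},
    {"engine_type": "ohcv", "cylinders": 6, "price": 17450.0},
]

def group_summary(rows, key):
    """Return {group value: (record count, mean price)} for the given key."""
    totals = defaultdict(lambda: [0, 0.0])  # group -> [count, price sum]
    for row in rows:
        totals[row[key]][0] += 1
        totals[row[key]][1] += row["price"]
    return {k: (count, total / count) for k, (count, total) in totals.items()}

by_engine = group_summary(records, "engine_type")
by_cylinders = group_summary(records, "cylinders")

for name, (count, mean_price) in sorted(by_engine.items()):
    print(f"{name}: {count} records, mean price {mean_price:.2f}")
```

The same helper serves both groupings, so adding another column-wise summary only needs a different key.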
REQUIREMENTS
HARDWARE SPECIFICATION
• Processor : Intel Core 2 Quad
• Hard Disk Capacity : 500 GB
• RAM : 4 GB SD
• Monitor : 17-inch Color
• Keyboard : 102 keys
• Mouse : Optical Mouse
SOFTWARE SPECIFICATION
• Front-End : Python 3.7
• Operating System : Windows 10/11
MODULES

• DATASET COLLECTION
• NAÏVE BAYES CLASSIFICATION
• SVM CLASSIFICATION
• KNN CLASSIFICATION
• CHART PREPARATION
1. DATASET COLLECTION
• In this module, the sales dataset from Kaggle, which contains attributes
such as engine type, cylinder number, and price, is taken. Records with
null values are eliminated during preprocessing.
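A minimal sketch of the null-record elimination step follows. It uses only the standard csv module and an inline CSV excerpt; the column names and values are hypothetical stand-ins for the Kaggle file, whose actual layout is not given here.

```python
import csv
import io

# Hypothetical CSV excerpt standing in for the Kaggle file; column names are assumed.
RAW = """engine_type,cylinders,price
ohc,4,13495
ohcv,6,
dohc,,13950
ohc,4,16500
"""

def load_clean_rows(handle):
    """Read CSV rows and drop any record that has an empty (null) field."""
    rows = []
    for row in csv.DictReader(handle):
        if any(value is None or value.strip() == "" for value in row.values()):
            continue  # null-value record: eliminated during preprocessing
        rows.append({
            "engine_type": row["engine_type"],
            "cylinders": int(row["cylinders"]),
            "price": float(row["price"]),
        })
    return rows

clean = load_clean_rows(io.StringIO(RAW))
print(len(clean), "of 4 records kept")
```

For a real file, the io.StringIO object would be replaced by an opened file handle.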

2. NAÏVE BAYES CLASSIFICATION
• The Naïve Bayes classifier is based on Bayes’ theorem with
independence assumptions between predictors. A Naïve
Bayes model is easy to build, with no complicated
iterative parameter estimation, which makes it particularly
useful for very large datasets.
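To make the idea concrete, here is a small sketch of a Gaussian Naïve Bayes classifier written with the standard library only. The Gaussian variant, the two features (engine size and horsepower), and the "low"/"high" price-band labels are all assumptions chosen for illustration; the project's actual features and class column are not specified here.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature mean/variance in one pass."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, features):
    """Return the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        # Independence assumption: per-feature log likelihoods simply add up.
        for v, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: [engine size, horsepower] -> price band.
X_train = [[97, 69], [109, 85], [110, 86], [183, 123], [209, 182], [234, 155]]
y_train = ["low", "low", "low", "high", "high", "high"]

model = fit_gaussian_nb(X_train, y_train)
print(predict_gaussian_nb(model, [100, 75]))
print(predict_gaussian_nb(model, [220, 160]))
```

Training only computes counts, means, and variances, which is why no iterative parameter estimation is needed.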
3. SVM CLASSIFICATION
• SVM stands for Support Vector Machine. It is a supervised machine learning
approach used for classification and regression analysis. SVM models are
trained by learning algorithms on labeled data and analyze large amounts of
data to identify patterns in them.
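As a rough illustration of how such a model can be trained, the sketch below fits a linear SVM by sub-gradient descent on the regularized hinge loss. The linear kernel, the training procedure, the hyperparameters, and the toy data are assumptions made for this example; which SVM variant or library the project uses is not stated.

```python
def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Fit weights w and bias b by sub-gradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, features)) + b)
            if margin < 1:
                # Inside the margin or misclassified: move toward the sample.
                w = [wi - lr * (lam * wi - label * xi) for wi, xi in zip(w, features)]
                b += lr * label
            else:
                # Correct with enough margin: only apply regularization shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict_linear_svm(w, b, features):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, features)) + b
    return 1 if score >= 0 else -1

# Hypothetical, already-centered features; +1 = higher price band, -1 = lower.
X_train = [[-3, -3], [-2, -3], [-3, -2], [3, 3], [2, 3], [3, 2]]
y_train = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X_train, y_train)
print([predict_linear_svm(w, b, x) for x in X_train])
```

Real features would normally be scaled before training, since the hinge-loss updates are sensitive to feature magnitude.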
4. KNN CLASSIFICATION
• In this module, KNN classification is carried out with K = 6, using the type
column (generated from the rating value, split into below and above 5.0) as the
binary class label. 75% of the data is used as training data and 25% as testing
data. For each testing record, the record number and the predicted type are
found and displayed as the result.
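The module's steps can be sketched as follows, using only the standard library. K = 6, the 5.0 rating threshold, and the 75/25 split come from the description above; the feature columns, the sample values, the treatment of a rating of exactly 5.0, and the unshuffled split are assumptions made to keep the example small and deterministic.

```python
import math
from collections import Counter

K = 6  # value stated in the module description

# Hypothetical records: (engine size, horsepower, rating).
records = [
    (98, 68, 3.9), (205, 152, 7.8), (102, 72, 4.4), (198, 149, 6.9),
    (95, 65, 3.5), (210, 158, 8.2), (105, 75, 4.8), (195, 145, 6.5),
    (100, 70, 4.1), (202, 151, 7.4), (97, 66, 3.7), (208, 155, 8.0),
    (101, 71, 4.2), (200, 150, 7.1), (99, 69, 4.0), (204, 153, 7.6),
]

# Binary "type" column: 1 if the rating is above 5.0, else 0
# (how a rating of exactly 5.0 is handled is an assumption).
data = [((size, hp), 1 if rating > 5.0 else 0) for size, hp, rating in records]

split = int(len(data) * 0.75)  # first 75% for training, remaining 25% for testing
train, test = data[:split], data[split:]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def knn_predict(train_rows, point, k=K):
    """Majority vote among the k training rows closest to point."""
    nearest = sorted(train_rows, key=lambda row: euclidean(row[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    # With an even k a tie is possible; most_common keeps the first label seen.
    return votes.most_common(1)[0][0]

predictions = [knn_predict(train, point) for point, _ in test]
for offset, predicted in enumerate(predictions):
    print("record", split + offset, "-> type", predicted)
```

Because K is even and the label is binary, a 3-3 vote can occur; an odd K or a distance-weighted vote would avoid that ambiguity.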

5. CHART PREPARATION
• Using barplot, the car price records are grouped with their count values and
plotted. Using scatter.smooth(), the data set’s column values are plotted with
‘range’ as the X column and ‘count’ as the Y column.
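The data behind such a chart can be sketched as below: prices are grouped into ranges and counted, giving the 'range' and 'count' columns. The sample prices and the bin width of 5000 are hypothetical, and the sketch prints a text bar chart instead of calling a plotting library, so it shows only the grouping step and not the project's actual plotting calls.

```python
from collections import Counter

# Hypothetical prices; the bin width of 5000 is also an assumption.
prices = [5118, 6295, 7898, 8921, 9980, 12940, 13495, 16500, 17450, 22018]
BIN_WIDTH = 5000

def price_range_counts(values, width):
    """Count how many values fall in each [start, start + width) range."""
    counts = Counter((int(v) // width) * width for v in values)
    return sorted(counts.items())

range_counts = price_range_counts(prices, BIN_WIDTH)

# 'range' plays the role of the X column and 'count' the Y column.
for start, count in range_counts:
    label = f"{start}-{start + BIN_WIDTH - 1}"
    print(f"{label:>12} | {'#' * count} ({count})")
```

The resulting (range, count) pairs can be handed to any plotting routine as the X and Y values.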
