
KLEF

Department of Computer Science Engineering

Course Code: 15CS4171

MACHINE LEARNING

III B.Tech – 2nd Semester

Academic Year 2018-2019

Project Based Lab
on
SCALING UP THE ACCURACY OF NAIVE BAYES CLASSIFIER AND DECISION TREE

Submitted by
Section – 23
Batch No: 2

Student ID    Student Name

160030411     G. NAGA TEJITH
160030459     G. PRIYANKA
160030559     R. KAMAL

KLEF
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
(DST-FIST Sponsored Department)

CERTIFICATE

This is to certify that the course based project entitled “SCALING UP THE ACCURACY OF NAIVE BAYES CLASSIFIER AND DECISION TREE” is a bona fide work done by G. NAGA TEJITH (160030411), G. PRIYANKA (160030459) and R. KAMAL (160030559) in partial fulfilment of the requirements for the award of the degree of BACHELOR OF TECHNOLOGY in Computer Science Engineering during the academic year 2018-2019.

Faculty In-Charge                              Head of the Department

Dr. Swarna

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
(DST-FIST Sponsored Department)

DECLARATION

We hereby declare that this project based lab report entitled “SCALING UP THE ACCURACY OF NAIVE BAYES CLASSIFIER AND DECISION TREE” has been prepared by us in partial fulfilment of the requirements for the award of the degree “BACHELOR OF TECHNOLOGY in COMPUTER SCIENCE ENGINEERING” during the academic year 2018-2019.

We also declare that this project based lab report is the result of our own effort and that it has not been submitted to any other university for the award of any degree.

Date:

Place: Vaddeswaram

SUBMITTED BY:

Student ID Student Name


160030411 G.NAGA TEJITH
160030459 G.PRIYANKA
160030559 R.KAMAL

ACKNOWLEDGMENTS

Our sincere thanks to Dr. Swarna and the teachers in the lab for their outstanding support throughout the project and for the successful completion of the work.

We express our gratitude to Hari Kiran Vege, Head of the Department of Computer Science and Engineering, for providing us with adequate facilities, ways and means by which we were able to complete this project.

We would like to place on record our deep sense of gratitude to the honourable Vice Chancellor, K L University, for providing the facilities necessary to carry out the project.

Last but not least, we thank all the teaching and non-teaching staff of our department, and especially our classmates and friends, for their support in the completion of our project.

PROJECT ASSOCIATES

Student ID Student Name


160030411 G.NAGA TEJITH
160030459 G.PRIYANKA
160030559 R.KAMAL

TABLE OF CONTENTS

1. ACKNOWLEDGMENTS

2. INTRODUCTION

3. LITERATURE REVIEW

4. METHODOLOGY

5. RESULTS AND DISCUSSION

6. CONCLUSION AND FUTURE SCOPE

7. REFERENCES

Scaling Up the Accuracy of Naive Bayes Classifier and Decision Tree
Introduction:

Classification:

Data classification is the process of organizing data into categories or groups in such a way that data objects in the same group are similar to one another, while data objects from different groups are dissimilar. A classification algorithm assigns each instance to a particular class so that the classification error is minimal. It is used to extract models that accurately describe the important data classes within a given dataset, such as the adult dataset used here.

Classification techniques can handle large volumes of data. They predict categorical class labels, classify data based on a model built from a training set and its associated class labels, and can then be used to classify newly available test data. Classification is therefore an integral part of data analysis and is gaining popularity. Classification uses a supervised learning approach: in supervised learning, a training dataset of records with associated class labels is available.

The classification process is divided into two main steps. The first is the training step, in which the classification model is built. The second is classification itself, in which the trained model is applied to assign an unknown data object to one of a given set of class labels. This report surveys classification techniques that are commonly used. A comparative study between two algorithms (Naive Bayes classification and decision trees) is used to show the strength and accuracy of each algorithm in terms of performance efficiency and time complexity. Such a study brings out the advantages and disadvantages of one method over the other, and provides a guideline for interesting research issues, which in turn can help other researchers develop innovative algorithms for applications or requirements that are not yet served.
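
As a minimal sketch of these two steps in R (illustrative only: it uses the built-in iris data rather than the adult dataset analysed later), the model is first built on labeled training records and then applied to held-out records:

# Step 1 (training): fit a classifier on labeled records
# Step 2 (classification): apply the trained model to unseen records
library(e1071)                                      # provides naiveBayes()
set.seed(1)
train_idx <- sample(nrow(iris), 0.7 * nrow(iris))
fit <- naiveBayes(Species ~ ., data = iris[train_idx, ])
preds <- predict(fit, iris[-train_idx, ])
mean(preds == iris$Species[-train_idx])             # proportion classified correctly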

Literature Review:

Many attempts have been made to extend Naive Bayes or to restrict the learning of general Bayesian networks. Approaches based on feature subset selection may help, but they cannot increase the representation power as was done here, so we do not review them.

A Naive Bayes Style Possibilistic Classifier (NBSPC) was proposed by Borgelt and Gebhardt (1999) to deal with imprecise training sets. For this classifier, imprecision concerns only the attribute values of instances (the class attribute and the testing set are assumed to be perfect). Given the class attribute, possibility distributions for attributes are estimated by computing the maximum-based projection (Borgelt and Kruse 1988) over the set S of precise instances (S is included in the extended dataset) that contain both the target value of the considered attribute and the class.

A naive possibilistic network classifier, proposed by Haouari et al. (2009), presents a building procedure that deals with imperfect dataset attributes and classes, and a classification procedure used to classify unseen examples which may have imperfect attribute values. This imperfection is modeled through a possibility distribution given by an expert who expresses partial ignorance due to a lack of a priori knowledge. There are some similarities between the proposed approach and that of Haouari et al. (2009). In particular, both are based on the idea that an attribute value is all the more possible if there is an example in the training set with the same attribute value (in the discrete case in Haouari et al. 2009) or a very close attribute value (in terms of similarity, in the numerical case). However, the approach of Haouari et al. (2009) does not require any conditional distribution over attributes to be defined in the certain case, whereas the main focus of the approach considered here is how to estimate such a possibility distribution for numerical data in the certain case.

Methodology:

We briefly review methods for the induction of decision trees and Naive Bayes.

Decision trees (Quinlan 1993; Breiman et al. 1984) are commonly built by recursive partitioning. A univariate (single attribute) split is chosen for the root of the tree using some criterion (e.g., mutual information, gain ratio, Gini index). The data is then divided according to the test, and the process repeats recursively for each child. After a full tree is built, a pruning step is executed which reduces the tree size. In the experiments, we compared our results with the C4.5 decision-tree induction algorithm (Quinlan 1993), which is a state-of-the-art algorithm. Naive Bayes (Good 1965; Langley, Iba, & Thompson 1992) uses Bayes' rule to compute the probability of each class given the instance, assuming the attributes are conditionally independent given the label. The version of Naive Bayes used in the experiments was implemented in MLC++ (Kohavi et al. 1994). The data is pre-discretized using an entropy-based algorithm (Fayyad & Irani 1993; Dougherty, Kohavi, & Sahami 1995). The probabilities are estimated directly from counts in the data (without any corrections, such as Laplace or m-estimates).

Accuracy Scale-Up: A Naive Bayes classifier requires estimation of the conditional probabilities for each attribute value given the label. For discrete data, because only a few parameters need to be estimated, the estimates tend to stabilize quickly, and more data does not change the underlying model much. With continuous attributes, the discretization is likely to form more intervals as more data becomes available, thus increasing the representation power. However, even with continuous data, the discretization is global and cannot take attribute interactions into account. Decision trees are non-parametric estimators and can approximate any “reasonable” function as the database size grows (Gordon & Olshen 1984). This theoretical result, however, may not be very comforting if the database size required to reach the asymptotic performance is larger than the number of atoms in the universe, as is sometimes the case. In practice, some parametric estimators, such as Naive Bayes, may perform better.
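
As a hedged sketch of this count-based estimation (this is not the MLC++ implementation; 'age' and 'income' are assumed column names, and cut() is a simple equal-width stand-in for the entropy-based discretization cited above):

# Estimate P(attribute interval | class) directly from counts, with no
# Laplace or m-estimate correction, as described in the text
adult <- read.csv("adult.csv")                   # assumes a header row with 'age' and 'income'
adult$age_bin <- cut(adult$age, breaks = 4)      # equal-width bins (illustrative, not entropy-based)
counts <- table(adult$age_bin, adult$income)     # joint counts of bin and class
prop.table(counts, margin = 2)                   # each column sums to 1: P(bin | class)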

Using the adult dataset, we need to predict the income attribute, which is considered the target attribute of the dataset.

ALGORITHMS:

Naive Bayes Classification:

Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. Bayesian classification is based on Bayes' theorem. Bayesian classifiers exhibit high accuracy and speed when applied to large databases. The family includes Naïve Bayesian classifiers and Bayesian belief networks. Naïve Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes, while Bayesian belief networks are graphical methods that allow the representation of dependencies among subsets of attributes. In this report, for the comparative study of classification algorithms, we have taken Naïve Bayesian classification. Naïve Bayesian classification is a simple and well-known method for supervised learning of a classification problem. It makes the assumption of class-conditional independence, i.e., given the class label of a tuple, the values of the attributes are assumed to be conditionally independent of one another.
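
To make the class-conditional independence assumption concrete, the following small R sketch uses illustrative numbers (not estimated from the adult dataset): the posterior is proportional to the class prior multiplied by the per-attribute conditional probabilities.

# P(class | a1, a2) is proportional to P(class) * P(a1 | class) * P(a2 | class)
prior <- c(le50K = 0.76, gt50K = 0.24)    # illustrative class priors
p_a1  <- c(le50K = 0.30, gt50K = 0.10)    # illustrative P(a1 value | class)
p_a2  <- c(le50K = 0.20, gt50K = 0.50)    # illustrative P(a2 value | class)
score <- prior * p_a1 * p_a2              # unnormalized posterior scores
score / sum(score)                        # normalized posterior over the two classes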

Decision Tree Induction:

Decision tree induction is the learning of decision trees from class-labeled training tuples. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label. The topmost node in a tree is the root node. Internal nodes are denoted by rectangles, and leaf nodes are denoted by ovals. Some decision tree algorithms produce only binary trees, whereas others can produce non-binary trees. The construction of decision tree classifiers does not require any domain knowledge or parameter setting, and is therefore appropriate for exploratory knowledge discovery. Decision trees can handle high-dimensional data and have good accuracy. The decision tree induction algorithm applied to the dataset in this study is random forest. A random forest is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by the individual trees. It runs efficiently on large databases and can handle thousands of input variables without variable deletion. Generated forests can be saved for future use on other data.
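
A hedged sketch of random forest induction with the randomForest package follows; the report does not show this call, and the data frame and column names reuse those defined in the source code section below:

# Fit an ensemble of trees; the predicted class is the majority vote of the forest
install.packages("randomForest")
library(randomForest)
rf <- randomForest(as.factor(income) ~ ., data = mtraining, ntree = 500)
pred_rf <- predict(rf, mtesting)
table(mtesting$income, pred_rf)    # actual class vs majority-vote prediction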

 Dataset:

The adult dataset is available in the UCI Machine Learning Repository and has a size of 3,755 KB. It consists of 32,561 records and 15 attributes.

 Data Pre-Processing:

Data preprocessing is processing applied to raw data to make it easier and more effective to process further. It is an important step in the data mining process. The product of data preprocessing is the final training set. Kotsiantis et al. (2006) present a well-known algorithm for each step of data preprocessing.

The data preprocessing techniques are:

 Data Cleaning
 Data Integration
 Data Transformation
 Data Reduction

These data preprocessing techniques are not mutually exclusive; they may work together. Data preprocessing techniques, when applied before mining, can substantially improve the overall quality of the patterns mined and reduce the time required for actual mining. They can improve the quality of the data and the accuracy and efficiency of the mining process.

 Preprocessing of the adult dataset:

In order to improve the quality of the data and the accuracy and efficiency of the mining process, the adult dataset undergoes a preprocessing step. The less sensitive attributes final weight, capital gain, capital loss and hours per week are removed, since they are not considered relevant attributes for privacy preservation in data mining. The number of attributes is thus reduced to 10.

The first 100 instances of the dataset are taken, and the instances with missing values are then removed, resulting in a dataset of 91 instances.
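
A hedged R sketch of these preprocessing steps follows (column names are assumed to match the adult dataset header, in which '?' marks missing values):

# Drop the four less sensitive attributes, keep the first 100 instances,
# and remove rows with missing values (91 instances remain)
adult <- read.csv("adult.csv", na.strings = c("?", " ?"))
drop_cols <- c("fnlwgt", "capital.gain", "capital.loss", "hours.per.week")
adult <- adult[, !(names(adult) %in% drop_cols)]
adult <- na.omit(adult[1:100, ])
dim(adult)    # expected: 91 rows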

SOURCE CODE:

DECISION TREE CODE:

# Install and load the packages used for model training and tree plotting
install.packages("caret")
library(caret)
install.packages("rpart.plot")
library(rpart.plot)

# Read the adult dataset; header = FALSE, so columns are named V1..V15
setwd("C:\\Users\\USER\\Documents\\3-2\\skilling")
adult <- read.csv("adults.csv", sep = ",", header = FALSE)
str(adult)
head(adult)

# Split the data: 70% training, 30% testing (V15 is the income target)
set.seed(3033)
intrain <- createDataPartition(y = adult$V15, p = 0.7, list = FALSE)
training <- adult[intrain, ]
testing <- adult[-intrain, ]   # note the comma: row subsetting (the original dropped it)
dim(training)
dim(testing)
anyNA(adult)
summary(adult)

# 10-fold cross-validation repeated 3 times
trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# Fit an rpart decision tree using the Gini index as the split criterion
set.seed(3333)
dtree_fit_gini <- train(V15 ~ ., data = training, method = "rpart",
                        parms = list(split = "gini"),
                        trControl = trctrl, tuneLength = 10)
dtree_fit_gini

# Plot the final pruned tree
prp(dtree_fit_gini$finalModel, box.palette = "Blues", tweak = 1.2)
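
The report does not show the evaluation of this model; a plausible sketch for obtaining the decision-tree accuracy on the held-out partition (reusing the objects defined above) is:

# Predict the income class of the test rows and summarize the accuracy
test_pred <- predict(dtree_fit_gini, newdata = testing)
confusionMatrix(test_pred, as.factor(testing$V15))    # caret: accuracy and per-class statistics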

NAIVE BAYES CODE:

# Read the adult dataset (this file has a header row, including 'income')
setwd("C:\\Users\\USER\\Documents\\3-2\\skilling")
mydata <- read.csv(file = "adult.csv")
str(mydata)
dim(mydata)

# Random 70/30 train/test split by row index
tindex <- sort(sample(nrow(mydata), nrow(mydata) * 0.7))
mtraining <- mydata[tindex, ]
mtesting <- mydata[-tindex, ]

# Alternative split with caTools (kept from the original, commented out)
#install.packages("caTools")
#library(caTools)
#msplit <- sample.split(mydata, SplitRatio = 0.8)
#mtraining <- subset(mydata, msplit == "TRUE")
#mtesting <- subset(mydata, msplit == "FALSE")

# Fit a Naive Bayes classifier with e1071, income as the target
install.packages("e1071")
library(e1071)
NB <- naiveBayes(income ~ ., data = mtraining)
print(NB)
summary(NB)

# Predict the income class of the test instances and tabulate the results
predNB1 <- predict(NB, mtesting, type = "class")
summary(predNB1)
table(mtesting$income, predNB1)   # confusion matrix: actual vs predicted
plot(predNB1)

# Re-train with caret using 10-fold cross-validation; the predictors must
# exclude the target column (the original dropped column 4, which is not
# the 'income' column of the adult dataset)
install.packages("caret")
library(caret)
x <- mtraining[, names(mtraining) != "income"]
y <- mtraining$income
model <- train(x, y, 'nb', trControl = trainControl(method = 'cv', number = 10))
model
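
Similarly, the cross-validated caret model can be assessed on the held-out partition (a sketch; this step is not in the original code):

# Accuracy of the caret-trained Naive Bayes model on unseen data
predNB2 <- predict(model, newdata = mtesting)
confusionMatrix(predNB2, as.factor(mtesting$income))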

Results and Discussion:

1) Naive Bayes implemented on the adult dataset:

[Figure: plot of the predicted income classes]

Accuracy: 97% [Figure: model accuracy output]

2) Decision tree:

Accuracy: 85% [Figure: model accuracy output]
Conclusion:

The above experimentation with classification algorithms on the adult dataset shows that Naïve Bayes performs best, followed by ZeroR and the decision tree. The Naive Bayes classifier is simple and fast, and it also exhibits a higher accuracy rate than the other algorithms discussed above.

The accuracy of Naive Bayes classification is 97%.

The accuracy of the decision tree is 85%.

The highest accuracy is thus achieved by the Naive Bayes classifier.

Future Scope:

Our work can be extended to other data mining techniques such as clustering and association rule mining, and to other classification algorithms. We have implemented the classification techniques and found the accuracy for a dataset with just 91 instances. This study can be carried forward by implementing the same algorithms on larger datasets.

References:

1. https://rd.springer.com/content/pdf/10.1007%2Fs00500-012-0947-9.pdf

2. A Comparative Study of Classification Techniques on Adult Data Set: https://www.ijert.org/research/a-comparative-study-of-classification-techniques-on-adult-data-set-IJERTV1IS8243.pdf

3. Kohavi, R. (1996). Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. Proceedings of KDD-96 (KDD96-033.pdf).

