Using rules to analyse bio-medical data: a comparison between C4.5 and PCL

J Li, L Wong - International Conference on Web-Age Information Management, 2003 - Springer
Abstract
For easy comprehensibility, rules are preferable to non-linear kernel functions in the analysis of bio-medical data. In this paper, we describe two rule induction approaches, C4.5 and our PCL classifier, for discovering rules from both traditional clinical data and recent gene expression or proteomic profiling data. C4.5 is a widely used method, but it has two weaknesses that affect its accuracy: the single coverage constraint and the fragmentation problem. PCL is a new rule-based classifier that overcomes these two weaknesses of decision trees by using many significant rules. We present a thorough comparison showing that our PCL method is much more accurate than C4.5 and is also generally superior to Bagging and Boosting.