A data mining approach to the diagnosis of tuberculosis by cascading clustering and classification

S Natarajan, KNB Murthy - arXiv preprint arXiv:1108.1045, 2011 - arxiv.org
In this paper, a methodology for the automated detection and classification of tuberculosis (TB) is presented. Tuberculosis is a disease caused by Mycobacterium tuberculosis, which spreads through the air and readily infects people with weakened immune systems. Our methodology cascades clustering and classification to assign TB cases to two categories: pulmonary tuberculosis (PTB) and retroviral PTB (RPTB), that is, PTB in patients with Human Immunodeficiency Virus (HIV) infection. First, K-means clustering groups the TB data into two clusters, and a class is assigned to each cluster. Next, several classification algorithms are trained on the resulting labeled set, and the final classifier model is built using K-fold cross-validation. The methodology is evaluated on 700 raw TB records obtained from a city hospital. The best accuracy, 98.7%, was obtained with a support vector machine (SVM). The proposed approach can support doctors in their diagnostic decisions and in treatment planning for the different categories.
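
The cascade described in the abstract (cluster first, then train classifiers on the cluster-derived labels under K-fold cross-validation) might be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' implementation. The toy two-feature records, the farthest-point centroid initialization, the cluster-to-class mapping, and the function names are all assumptions made here. A 1-nearest-neighbour classifier stands in for the SVM and the other classifiers the paper compares, purely to keep the sketch short.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=2, iters=50):
    """Plain Lloyd's K-means; returns one cluster id per point.
    Initialization (assumed here): first point, then repeatedly the
    point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels

def predict_1nn(train_x, train_y, x):
    """Stand-in classifier: label of the nearest training record."""
    nearest = min(range(len(train_x)), key=lambda i: dist2(train_x[i], x))
    return train_y[nearest]

def kfold_accuracy(xs, ys, k=10, seed=0):
    """K-fold cross-validation: every record is held out exactly once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct += sum(predict_1nn(tx, ty, xs[i]) == ys[i] for i in fold)
    return correct / len(xs)

# Toy stand-in data (NOT the paper's 700 hospital records): two
# well-separated groups of 20 records with two numeric features each.
group_a = [((i % 5) * 0.5, (i // 5) * 0.5) for i in range(20)]
group_b = [(x + 10.0, y + 10.0) for x, y in group_a]
records = group_a + group_b

# Stage 1: K-means with two clusters supplies the class labels.
labels = kmeans(records, k=2)
# Hypothetical mapping; the abstract does not say how classes are assigned.
class_names = {0: "PTB", 1: "RPTB"}

# Stage 2: evaluate a classifier on the cluster-derived labels.
accuracy = kfold_accuracy(records, labels, k=10)
print(class_names[labels[0]], class_names[labels[-1]], accuracy)
```

Because the toy groups are far apart, the score here only confirms that the two stages fit together; it says nothing about the 98.7% reported for the SVM. Note also that the classifier is scored against labels produced by the clustering step, so cross-validated accuracy in this setup measures agreement with the clusters rather than with an independent diagnosis.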