Selecting genes with dissimilar discrimination strength for sample class prediction
Proceedings Of The 5th Asia-Pacific Bioinformatics Conference, 2007•World Scientific
Abstract
One of the main applications of microarray technology is to determine the gene expression profiles of diseases and disease treatments. This is typically done by selecting a small number of genes from amongst thousands to tens of thousands, whose expression values are collectively used as classification profiles. This gene selection process is notoriously challenging because microarray data normally contain only a very small number of samples but range over thousands to tens of thousands of genes. Most existing gene selection methods carefully define a function to score the differential levels of gene expression under a variety of conditions, in order to identify top-ranked genes. Such single gene scoring methods suffer because some selected genes have very similar expression patterns, so using them all in classification is largely redundant. Furthermore, these selected genes can prevent the consideration of other genes that are individually less, but collectively more, differentially expressed. We propose to cluster genes in terms of their class discrimination strength and to limit the number of selected genes per cluster. By combining this idea with several existing single gene scoring methods, we show by experiments on two cancer microarray datasets that our methods identify gene subsets which collectively have significantly higher classification accuracies.
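The idea of capping the number of selected genes per cluster can be illustrated with a minimal sketch. The abstract does not specify the scoring function or the clustering procedure, so everything below is an assumption made for illustration: a Welch-style t-score stands in for the single gene scoring method, a greedy grouping by absolute Pearson correlation stands in for clustering by class discrimination strength, and the names `score_genes`, `select_genes`, `max_per_cluster`, and `corr_threshold` are hypothetical.

```python
import numpy as np

def score_genes(X, y):
    """Absolute Welch-style t-score per gene (an assumed stand-in scorer).

    X is a samples-by-genes matrix; y holds 0/1 class labels.
    """
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    spread = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(diff) / (spread + 1e-12)

def select_genes(X, y, n_select=10, max_per_cluster=1, corr_threshold=0.9):
    """Walk genes from best to worst score, keeping at most
    max_per_cluster genes from each group of highly correlated genes."""
    order = np.argsort(-score_genes(X, y))
    corr = np.abs(np.corrcoef(X, rowvar=False))
    leaders = []   # first (highest-scoring) gene seen in each cluster
    counts = {}    # cluster leader -> number of genes selected from that cluster
    selected = []
    for g in order:
        # Join the first existing cluster whose leader is strongly correlated with g.
        leader = next((l for l in leaders if corr[g, l] >= corr_threshold), None)
        if leader is None:
            leaders.append(g)
            counts[g] = 0
            leader = g
        if counts[leader] < max_per_cluster:
            selected.append(int(g))
            counts[leader] += 1
        if len(selected) == n_select:
            break
    return selected

# Toy data (made up): genes 0 and 1 are near-duplicates, gene 2 is weaker
# but not redundant with them, gene 3 carries no class signal.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.array([
    [0.0, 0.01, 0.0, 1.0],
    [0.1, 0.10, 1.0, 0.0],
    [0.2, 0.21, 0.0, 1.0],
    [0.1, 0.10, 1.0, 0.0],
    [3.0, 3.00, 2.0, 0.0],
    [3.1, 3.11, 3.0, 1.0],
    [3.2, 3.20, 2.0, 0.0],
    [3.1, 3.11, 3.0, 1.0],
])
capped = select_genes(X, y, n_select=2, max_per_cluster=1, corr_threshold=0.95)
uncapped = select_genes(X, y, n_select=2, max_per_cluster=2, corr_threshold=0.95)
```

On this toy data, plain top-ranking would take both near-duplicate genes, whereas the capped selection skips the redundant one and admits the individually weaker third gene. The threshold and cap are free parameters here, not values taken from the paper.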