Comparing genome versus proteome-based identification of clinical bacterial isolates
Briefings in bioinformatics, 2018•academic.oup.com
Abstract
Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data, thereby relying uncritically on the available taxonomic information.
Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful ‘gold standard’, the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. The tools were additionally evaluated using a data set of 200 Staphylococcus aureus isolates.
CLARK and Kraken (with k = 31) performed best, with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken showed the highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 in terms of runtime.
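For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how a mean F-measure can be computed from reference and predicted species labels. It assumes the mean is taken over the species present in the reference labels; the abstract does not state how the authors averaged, so this is an illustration rather than their procedure. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def per_class_f_measure(true_labels, predicted_labels):
    """Return {label: F-measure} for every label in the reference set.

    A prediction of None means the tool left the isolate unclassified:
    it counts as a false negative for the true label and as a false
    positive for no label.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == predicted:
            tp[truth] += 1
        else:
            fn[truth] += 1
            if predicted is not None:
                fp[predicted] += 1

    scores = {}
    for label in set(true_labels):
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / actual_total if actual_total else 0.0
        denominator = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / denominator if denominator else 0.0
    return scores


def mean_f_measure(true_labels, predicted_labels):
    """Unweighted mean of the per-label F-measures (assumed averaging scheme)."""
    scores = per_class_f_measure(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores)


# Toy labels, not data from the study.
truth = ["E. coli", "E. coli", "K. pneumoniae", "K. pneumoniae"]
calls = ["E. coli", "K. pneumoniae", "K. pneumoniae", None]
print(per_class_f_measure(truth, calls))
print(mean_f_measure(truth, calls))
```

An unweighted mean gives rare and common species equal influence on the score, which is one plausible choice when a data set is dominated by a few species.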
We conclude that nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allow for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest that WGS-based genotyping is a promising alternative to MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
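To illustrate the general idea behind k-mer-based classification, here is a deliberately simplified sketch. It is not the algorithm implemented by CLARK or Kraken: it indexes only k-mers that occur in exactly one reference taxon and assigns a read by majority vote, ignores reverse complements and sequencing errors, and uses k = 4 on made-up sequences, whereas the study used k = 31. All function names, taxon names, and sequences are hypothetical.

```python
from collections import Counter, defaultdict


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def build_index(references, k):
    """Map each k-mer found in exactly one taxon to that taxon.

    `references` is {taxon_name: reference_sequence}. K-mers shared
    between taxa are discarded because they cannot discriminate.
    """
    owners = defaultdict(set)
    for taxon, sequence in references.items():
        for kmer in kmers(sequence, k):
            owners[kmer].add(taxon)
    return {kmer: next(iter(taxa)) for kmer, taxa in owners.items() if len(taxa) == 1}


def classify(read, index, k):
    """Return the taxon with the most k-mer hits, or None if nothing matches."""
    votes = Counter(index[kmer] for kmer in kmers(read, k) if kmer in index)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy reference sequences, not real genomes.
references = {"taxon_A": "ACGTACGTGGA", "taxon_B": "TTGCATGCAAC"}
index = build_index(references, k=4)
print(classify("CGTACG", index, k=4))
print(classify("GCATGC", index, k=4))
print(classify("AAAAAA", index, k=4))
```

Because classification reduces to exact dictionary lookups rather than alignment, approaches of this kind can be fast, which is consistent with the runtime advantage reported above.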
Oxford University Press