A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes

Nat Commun. 2014 Jul 24:5:4498. doi: 10.1038/ncomms5498.

Abstract

Metagenomics, or sequencing of the genetic material from a complete microbial community, is a promising tool to discover novel microbes and viruses. Viral metagenomes typically contain many unknown sequences. Here we describe the discovery of a previously unidentified bacteriophage present in the majority of published human faecal metagenomes, which we refer to as crAssphage. Its ~97 kbp genome is six times more abundant in publicly available metagenomes than all other known phages together; it comprises up to 90% and 22% of all reads in virus-like particle (VLP)-derived metagenomes and total community metagenomes, respectively; and it totals 1.68% of all human faecal metagenomic sequencing reads in the public databases. The majority of crAssphage-encoded proteins match no known sequences in the database, which is why it was not detected before. Using a new co-occurrence profiling approach, we predict a Bacteroides host for this phage, consistent with Bacteroides-related protein homologues and a unique carbohydrate-binding domain encoded in the phage genome.
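The abstract does not say how the co-occurrence profiling was computed, so the following is only a minimal sketch of the general idea: a phage should tend to be abundant in the same samples as its host, so candidate hosts can be ranked by how closely their abundance profile across metagenomes tracks that of the phage. Spearman rank correlation is assumed here as the similarity measure and may differ from the measure used in the paper. The function names, taxon labels and abundance values are all hypothetical illustrations, not data from the study.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_candidate_hosts(phage_profile, host_profiles):
    """Sort candidate hosts by co-occurrence with the phage, best first."""
    scores = {
        name: spearman(phage_profile, profile)
        for name, profile in host_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-sample abundances (e.g. normalised read depth)
# across five metagenomes; not taken from the paper.
phage_profile = [0.0, 5.0, 1.0, 9.0, 3.0]
host_profiles = {
    "taxon_A": [0.1, 4.0, 0.9, 7.5, 2.0],  # rises and falls with the phage
    "taxon_B": [6.0, 1.0, 5.0, 0.5, 3.0],  # moves opposite to the phage
}

for name, score in rank_candidate_hosts(phage_profile, host_profiles):
    print(f"{name}: {score:+.2f}")
```

In this toy input the taxon whose abundance follows the phage across samples ranks first. A real analysis would need many more samples and some control for compositional effects and for taxa that are simply common everywhere.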

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteriophages / genetics
  • Bacteriophages / isolation & purification*
  • Bacteroides / virology
  • Clustered Regularly Interspaced Short Palindromic Repeats
  • Feces / microbiology
  • Feces / virology*
  • Female
  • Humans
  • Metagenome*
  • Molecular Sequence Data
  • Viral Proteins / genetics

Substances

  • Viral Proteins

Associated data

  • GENBANK/JQ995537
  • GENBANK/KM000086
  • GENBANK/KM000087
  • GENBANK/KM000088
  • GENBANK/KM000089
  • GENBANK/KM000090
  • GENBANK/KM000091
  • GENBANK/KM000092
  • GENBANK/KM000093
  • GENBANK/KM000094
  • GENBANK/KM000095
  • GENBANK/KM000096
  • GENBANK/KM000097
  • GENBANK/KM000098
  • GENBANK/KM000099
  • GENBANK/KM000100
  • GENBANK/KM000101
  • GENBANK/KM000102
  • GENBANK/KM000103
  • GENBANK/KM000104
  • GENBANK/KM000105
  • GENBANK/KM000106
  • GENBANK/KM000107
  • GENBANK/KM000108
  • GENBANK/KM000109
  • GENBANK/KM000110
  • GENBANK/KM000111
  • GENBANK/KM000112
  • GENBANK/KM000113
  • GENBANK/KM000114
  • GENBANK/KM000115
  • GENBANK/KM000116
  • GENBANK/KM000117
  • GENBANK/KM000118
  • GENBANK/KM000119
  • GENBANK/KM000120
  • GENBANK/KM000121