Research Project: Investigating Microbial, Digestive, and Animal Factors to Increase Dairy Cow Performance and Nutrient Use Efficiency

Location: Cell Wall Biology and Utilization Research

Title: metaFlye: scalable long-read metagenome assembly using repeat graphs

Author
KOLMOGOROV, MIKHAIL - University Of California, San Diego
Bickhart, Derek
BEHSAZ, BAHAR - University Of California, San Diego
GUREVICH, ALEXEY - St Petersburg State University
RAYKO, MIKHAIL - St Petersburg State University
Shin, Sung
Kuhn, Kristen
YUAN, JEFFREY - University Of California, San Diego
POLEVIKOV, EVGENY - St Petersburg State University
Smith, Timothy - Tim
PEVZNER, PAVEL - University Of California, San Diego

Submitted to: Nature Methods
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/11/2020
Publication Date: 10/5/2020
Citation: Kolmogorov, M., Bickhart, D.M., Behsaz, B., Gurevich, A., Rayko, M., Shin, S.B., Kuhn, K.L., Yuan, J., Polevikov, E., Smith, T.P.L., Pevzner, P.A. 2020. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nature Methods. 17:1103-1110. https://doi.org/10.1038/s41592-020-00971-x.
DOI: https://doi.org/10.1038/s41592-020-00971-x

Interpretive Summary: The study of microbiomes has gained increasing recognition in the research community. Microbial communities are difficult to study because of the genetic diversity within each system, so tools that can recognize and separate this diversity are needed. In this study, we present a new software tool, metaFlye, to address this problem. This software will allow a closer look at microbiomes that are important for animal agriculture.

Technical Abstract: Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to the fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present the metaFlye algorithm that addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. We benchmark metaFlye against state-of-the-art long-read assemblers using mock and real bacterial community datasets and show that it produces complete assemblies of nearly all bacteria in mock datasets. We also performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly-complete bacterial genomes within single contigs. Finally, we show that long-read assembly of the human microbiome enables the discovery of novel biosynthetic gene clusters that encode biomedically important natural products.