Location: Cell Wall Biology and Utilization Research
Title: metaFlye: scalable long-read metagenome assembly using repeat graphs
Author
KOLMOGOROV, MIKHAIL - University Of California, San Diego
Bickhart, Derek
BEHSAZ, BAHAR - University Of California, San Diego
GUREVICH, ALEXEY - St Petersburg State University
RAYKO, MIKHAIL - St Petersburg State University
Shin, Sung
Kuhn, Kristen
YUAN, JEFFREY - University Of California, San Diego
POLEVIKOV, EVGENY - St Petersburg State University
Smith, Timothy - Tim
PEVZNER, PAVEL - University Of California, San Diego
Submitted to: Nature Methods
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/11/2020
Publication Date: 10/5/2020
Citation: Kolmogorov, M., Bickhart, D.M., Behsaz, B., Gurevich, A., Rayko, M., Shin, S.B., Kuhn, K.L., Yuan, J., Polevikov, E., Smith, T.P.L., Pevzner, P.A. 2020. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nature Methods. 17:1103-1110.
DOI: https://doi.org/10.1038/s41592-020-00971-x
Interpretive Summary: Study of the microbiome has gained increasing recognition in the research community. These communities are difficult to study due to the genetic diversity of each system. Tools that can recognize and separate this diversity are needed. In this study, we present a new software tool to solve this problem. This software will finally allow a closer look at microbiomes that are important for animal agriculture.
Technical Abstract: Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to the fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present the metaFlye algorithm that addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. We benchmark metaFlye against state-of-the-art long-read assemblers using mock and real bacterial community datasets and show that it produces complete assemblies of nearly all bacteria in mock datasets. We also performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly-complete bacterial genomes within single contigs. Finally, we show that long-read assembly of the human microbiome enables the discovery of novel biosynthetic gene clusters that encode biomedically important natural products.