Location: Animal Parasitic Diseases Laboratory
Title: Long-read sequencing improves assembly of Trichinella genomes 10-fold, revealing substantial synteny between lineages diverged over seven million years
Author:
THOMPSON, PETER - ORISE Fellow
Zarlenga, Dante
LIU, MINGYUAN - Jilin University
Rosenthal, Benjamin
Submitted to: Parasitology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/12/2017
Publication Date: 6/9/2017
Publication URL: http://handle.nal.usda.gov/10113/5763058
Citation: Thompson, P., Zarlenga, D.S., Liu, M., Rosenthal, B.M. 2017. Long-read sequencing improves assembly of Trichinella genomes 10-fold, revealing substantial synteny between lineages diverged over seven million years. Parasitology. 6:1-14. doi:10.1017/S0031182017000348.
Interpretive Summary: Historically, undercooked pork products were a major cause of human disease due to the presence of a parasitic worm called Trichinella spiralis. Modern pork production practices and controls have virtually eliminated this threat to U.S. food safety. However, the current trend towards pasture-raised pork may be exposing pigs to a closely related parasitic worm, Trichinella murrelli, which is more common in North American wildlife and game animals. To understand the similarities and differences between these two parasite species, we sequenced the T. murrelli genome using the latest technology and compared the resulting assembly to one previously published for T. spiralis. As a result, we achieved a 10-fold improvement in the completeness and ordering of both genomes. The improved assemblies will be valuable resources for future studies aimed at understanding how these worms cause disease, and whether the two species of worms should be considered equivalent threats to human health. These data will interest parasitologists, veterinarians, genome biologists, and those interested in the health of swine, particularly swine raised in accordance with the Organic production standard.
Technical Abstract: Genome evolution influences a parasite’s pathogenicity, host-pathogen interactions, environmental constraints, and invasion biology, while genome assemblies form the basis of comparative sequence analyses.
Given that closely related organisms typically maintain appreciable synteny, the genome assembly of one organism can improve that of a closely related organism. Therefore, in order to improve Trichinella genome assemblies, we used third-generation, long-read technology to sequence the genome of an encysted species, Trichinella murrelli, and used syntenic comparisons to improve scaffolding of both the novel T. murrelli assembly and the existing Trichinella spiralis assembly. Long reads derived from T. murrelli genomic DNA were assembled into a high-quality draft genome with a total size of 63.2 Mbp consisting of only 653 contigs. Over half of the assembly’s length was derived from just 26 contigs, each of which was longer than 571,000 bp (N50). When compared with existing Trichinella genome assemblies, this long-read assembly consisted of 10-fold fewer contigs that were 5 times longer on average, and revealed 5 Mbp of new sequences that were previously designated as indeterminate gap sequences. Macrosyntenic comparisons showed that 81.7% of long-read T. murrelli sequences were collinear with the previously published 2011 T. spiralis assembly scaffolds, which in turn guided scaffolding of T. murrelli contigs. At a smaller scale, long-read T. murrelli contigs suggested that there were local misassemblies within the 2011 T. spiralis scaffolds and allowed for improved ordering of T. spiralis contigs within scaffolds. Furthermore, data from single-copy orthologs and median short-read coverage of T. murrelli contigs provided informed hypotheses for chromosomal groupings of T. murrelli and T. spiralis scaffolds. The net result was new assemblies for both species organized into three chromosomal scaffolds. Eighteen percent of sequences from both genome assemblies, which had high levels of repetitive DNA, have yet to be placed with confidence on the chromosomes. Long-read sequencing of the T. murrelli genome was an economical investment that improved not only assembly of the target species, but also that of the closely related T. spiralis genome. The improved assemblies will be valuable resources for future studies linking phenotypic traits within each species to their underlying genetic bases.
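
For readers unfamiliar with the N50 statistic cited in the Technical Abstract, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not the authors' code or data, and the abstract does not say which software produced the reported figure.

```python
def n50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    L50 is the number of contigs in that set.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy lengths (bp), NOT data from the study.
toy_lengths = [900, 700, 400, 300, 200, 100]
toy_n50, toy_l50 = n50(toy_lengths)
print(f"N50 = {toy_n50} bp, L50 = {toy_l50} contigs")
```

In these terms, the abstract's statement that over half of the assembly came from 26 contigs, each longer than 571,000 bp, corresponds to an L50 of 26 and an N50 of roughly 571,000 bp.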