
Title: Improving the goat long-read assembly with optical mapping

Author
Bickhart, Derek
Smith, Timothy - Tim
Rosen, Benjamin - Ben
KOREN, SERGEY - National Institutes Of Health (NIH)
PHILLIPPY, ADAM - National Institutes Of Health (NIH)
HASTIE, ALEX - Bionano Genomics, Inc
SULLIVAN, SHAWN - Microsoft
LIACHKO, IVAN - University Of Washington
BURTON, JOSHUA - University Of Washington
SAYRE, BRIAN - Virginia State University
Liu, Ge - George
Schroeder, Steven - Steve
SONSTEGARD, TAD - Former ARS Employee
Van Tassell, Curtis - Curt

Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: 1/12/2016
Publication Date: 1/12/2016
Citation: Bickhart, D.M., Smith, T.P., Rosen, B.D., Koren, S., Phillippy, A., Hastie, A.R., Sullivan, S.T., Liachko, I., Burton, J.N., Sayre, B.L., Liu, G., Schroeder, S.G., Sonstegard, T.S., Van Tassell, C.P. 2016. Improving the goat long-read assembly with optical mapping. Plant and Animal Genome Conference Proceedings. San Diego, CA, Jan. 9–13.

Interpretive Summary:

Technical Abstract: Reference genome assemblies provide important context in genetics by standardizing the order of genes and providing a universal set of coordinates for individual nucleotides. Because of the high complexity of genic regions and the higher copy number of genes involved in immune function, immunity-related genes are often misassembled in current reference assemblies. This problem is particularly common in the reference genomes of non-model organisms, which often do not receive the years of curation necessary to resolve annotation and assembly errors. In this study, we reassemble a reference genome of the goat (Capra hircus) using modern PacBio technology in tandem with BioNano Genomics Irys optical maps and Lachesis clustering in order to provide a high-quality reference assembly without the need for extensive filtering. Initial PacBio assemblies using P5C4 chemistry achieved contig N50s of 4 megabases and a BUSCO completion score of 84.0%, which is comparable to several finished model organism reference assemblies. We used BioNano Genomics' Irys platform to generate 336 scaffolds from these data, with a scaffold N50 of 24 megabases and total genome coverage of 98%. Lachesis interaction maps were used with a clustering algorithm to associate Irys scaffolds into the expected 30 chromosome physical maps. Comparison of the initial hybrid scaffolds, generated from the long-read contigs and optical map information, with a previously generated RH map revealed that the entirety of the goat autosome 20 physical map was contained within one scaffold. Additionally, the BioNano scaffolding resolved several difficult regions containing genes related to innate immunity, which were problem regions in previous reference genome assemblies. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.
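
Note on the N50 statistic: the abstract reports contig and scaffold N50 values without defining them. Under the common definition, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The Python sketch below illustrates that definition only; the function name n50 and the example lengths are hypothetical and are not taken from the study or its software.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (not from the study): walks the lengths from
    longest to shortest and returns the length at which the running
    total first reaches half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Made-up lengths (arbitrary units), purely for illustration.
example_lengths = [2, 3, 4, 5, 6]
example_n50 = n50(example_lengths)
```

With these made-up lengths the total is 20, and the two longest sequences (6 and 5) sum to 11, which first crosses the halfway point, so the N50 is 5.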