Author
WARR, AMANDA - University Of Edinburgh
HALL, RICHARD - Pacific Biosciences Inc
KIM, KIRSTI - Pacific Biosciences Inc
TSENG, ELIZABETH - Pacific Biosciences Inc
KOREN, SERGEY - National Institutes Of Health (NIH)
PHILLIPPY, ADAM - National Institutes Of Health (NIH)
Bickhart, Derek
Rosen, Benjamin - Ben
Schroeder, Steven - Steve
HUME, DAVID - Roslin Institute
TALBOT, RICHARD - University Of Edinburgh
RUND, LAURIE - University Of Illinois
SCHOOK, LAWRENCE - University Of Illinois
CHOW, WILLIAM - Wellcome Trust Sanger Institute
HOWE, KIRSTIN - Wellcome Trust Sanger Institute
Nonneman, Danny - Dan
Rohrer, Gary
PUTNAM, NICHOLAS - Dovetail Genomics
GREEN, ED - Dovetail Genomics
WATSON, MICK - Roslin Institute
Smith, Timothy - Tim
ARCHIBALD, ALAN - Roslin Institute
Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 10/31/2016
Publication Date: 1/14/2017
Citation: Warr, A., Hall, R., Kim, K., Tseng, E., Koren, S., Phillippy, A.M., Bickhart, D.M., Rosen, B.D., Schroeder, S.G., Hume, D.A., Talbot, R., Rund, L., Schook, L.B., Chow, W., Howe, K., Nonneman, D.J., Rohrer, G.A., Putnam, N., Green, E., Watson, M., Smith, T.P., Archibald, A.L. 2017. Exploiting long read sequencing technologies to establish high quality highly contiguous pig reference genome assemblies [abstract]. Plant and Animal Genome Conference XX, January 14-18, 2017, San Diego, California. Paper No. 25025.

Interpretive Summary:

Technical Abstract: The current pig reference genome sequence (Sscrofa10.2) was established using Sanger sequencing and following the clone-by-clone hierarchical shotgun sequencing approach used in the public human genome project. However, as sequence coverage was low (4-6x) the resulting assembly was only of draft quality. We have built new de novo genome assemblies from whole genome shotgun (WGS) sequence reads generated using Pacific Biosciences (PacBio) long read sequencing technology for two pigs – the original reference animal (Duroc sow 2-14) and a Duroc/Landrace/Yorkshire crossbred barrow. About 60-70x coverage WGS data per animal were assembled with the Falcon assembler and error corrected with Quiver/Arrow and Pilon using high coverage WGS PacBio and Illumina reads, respectively. The estimated accuracy (99.999%) of the Duroc assembly meets the requirement of a Gold standard finished sequence. The Duroc assembly was scaffolded with paired-end reads from isogenic BAC and fosmid clones. The crossbred assembly was scaffolded using Dovetail’s Hi-Rise.
The current statistics for these assemblies are: Duroc 2-14 (Sscrofa11) for SSC1-18, SSCX (2.39 Gbp, 122 contigs; contig N50=58.5 Mbp; scaffold N50=107.6 Mbp); Duroc/Landrace/Yorkshire crossbred for SSC1-18, SSCX, SSCY (2.62 Gbp, 14,924 contigs; contig N50=6.5 Mbp; scaffold N50=132 Mbp). The BAC and fosmid clone resource from Duroc 2-14 will facilitate further targeted sequence closure. These improved genome assemblies will be a key resource for research in pigs and will enable applications in agriculture and biomedicine. The assemblies are being deposited in the public database under the pre-publication data release terms of the Toronto Statement (Nature 461:168-70).
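
For readers unfamiliar with the contig and scaffold N50 figures quoted above, the following is a minimal sketch of how an N50 is conventionally computed: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the abstract or from either pig assembly, and the authors' own tooling is not described in the source.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, not data from Sscrofa11.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70 (80 + 70 = 150, half of the 300 total)
```

Applied to the contig lengths of an assembly this gives the contig N50, and applied to the scaffold lengths it gives the scaffold N50.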