
Title: Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

Author
LIU, HAIBO - Iowa State University
MANCHANDA, NANCY - Iowa State University
Nonneman, Danny - Dan
Smith, Timothy - Tim
TUGGLE, CHRIS - Iowa State University

Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: 12/1/2016
Publication Date: 1/18/2017
Citation: Liu, H., Manchanda, N., Nonneman, D.J., Smith, T.P., Tuggle, C. 2017. Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome [abstract]. In proceedings: Plant and Animal Genome Conference, Jan. 14-18, 2017, San Diego, CA. P1162.

Interpretive Summary:

Technical Abstract: PacBio long-read sequencing technology is increasingly popular for genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled from long reads produced by this technology. To annotate this genome assembly in fine detail, the transcriptomes of nine tissues from the same pig used for the genome assembly were deeply sequenced using PacBio Iso-Seq and Illumina RNA-seq. PacBio Iso-Seq “full-length” cDNA reads were error-corrected with preprocessed RNA-seq reads by combining the SMRTAnalysis Tools with Proovread. Between 300,000 and 400,000 high-quality, error-corrected cDNA reads from each tissue, ranging in length from 0.5 to 10 kb, were then combined, clustered and collapsed to remove redundancy. In parallel, preprocessed RNA-seq reads from these tissues were pooled and assembled de novo using Trinity. Noise and artifacts in both sets of transcriptomes were filtered out based on sequence characteristics associated with such artifacts. On average, ~85% of the error-corrected reads from each tissue could be uniquely mapped back to the new pig reference genome, compared with ~70% for Sscrofa10.2. Both sets of transcriptomes are being compared and integrated, and they will be used to annotate the new swine reference genome. Allele-specific expression, tissue-specific expression and tissue-specific alternative splicing will be investigated. Through these efforts, we learned valuable lessons about library construction, error correction and noise removal, and we will offer recommendations for similar projects. In summary, we systematically catalogued the transcriptomes of nine porcine tissues, which will be used to annotate the new reference genome along with other extant evidence.
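
For readers unfamiliar with the "uniquely mapped" percentages quoted above, the following is a minimal sketch of how such a rate could be computed. It is not the authors' pipeline: the function name `unique_mapping_rate`, the input layout (one read ID per reported alignment) and the toy data are assumptions made purely for illustration, and a real analysis would derive the alignment records from an aligner's output.

```python
from collections import Counter


def unique_mapping_rate(alignment_read_ids, total_reads):
    """Return the fraction of reads that have exactly one reported alignment.

    alignment_read_ids: iterable of read IDs, one entry per reported alignment
                        (hypothetical input layout, assumed for illustration).
    total_reads:        number of reads submitted to the aligner, including
                        reads that did not map at all.
    """
    if total_reads == 0:
        return 0.0
    # Count how many alignments each read received.
    hits_per_read = Counter(alignment_read_ids)
    # A read is "uniquely mapped" when it has exactly one alignment.
    unique_reads = sum(1 for n in hits_per_read.values() if n == 1)
    return unique_reads / total_reads


# Toy data: r2 maps to two places, r4 does not map at all.
toy_alignments = ["r1", "r2", "r2", "r3", "r5"]
rate = unique_mapping_rate(toy_alignments, total_reads=5)
print(f"{rate:.0%}")  # prints "60%"
```

In this toy case three of five reads (r1, r3, r5) have a single alignment, so the rate is 60%. Note that unmapped reads count in the denominator, so the two percentages in the abstract would be comparable only if the same read set was aligned to both assemblies.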