Author
LIU, HAIBO - Iowa State University
Smith, Timothy - Tim
Nonneman, Danny - Dan
DEKKERS, JACK - Iowa State University
TUGGLE, CHRISTOPHER - Iowa State University
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/14/2017
Publication Date: 6/24/2017
Publication URL: https://handle.nal.usda.gov/10113/5801807
Citation: Liu, H., Smith, T.P., Nonneman, D.J., Dekkers, J.C., Tuggle, C.K. 2017. A high-quality annotated transcriptome of swine peripheral blood. BMC Genomics. 18:479.
DOI: https://doi.org/10.1186/s12864-017-3863-7

Interpretive Summary: Pigs are important both as livestock and as a biomedical model, because their organs and immune system resemble those of humans. Current advanced technologies for characterizing health status, metabolic state, and drug response rely on profiling gene expression in blood and blood cells. In swine, however, the foundational data needed to interpret high-throughput gene expression profiling assays are not yet advanced enough to support accurate interpretation of results. The manuscript focuses on creating a database of functional information and sequence for use in interpreting profiling assays, in support of physiological and genetics research using pigs. A high-quality set of gene transcript abundance measurements (the amount of message generated from each gene) and RNA-sequence-based gene annotations was produced and described, and the data were placed in public repositories for research use.

Technical Abstract: Background: High-throughput gene expression profiling assays of peripheral blood are widely used in biomedicine, as well as in animal genetics and physiology research. Accurate, comprehensive, and precise interpretation of such high-throughput assays relies on well-characterized reference genomes and/or transcriptomes. However, neither the reference genome nor the peripheral blood transcriptome of the pig has been sufficiently assembled and annotated to support such profiling assays in this emerging biomedical model organism.
We aimed to assemble published and novel RNA-seq data to provide a comprehensive, well-annotated blood transcriptome for pigs by integrating a de novo assembly with a genome-guided assembly.
Results: A de novo and a genome-guided transcriptome of porcine whole peripheral blood were assembled from ~162 million pairs of paired-end and ~183 million single-end, trimmed and normalized Illumina RNA-seq reads (~6 billion initial reads) from five independent studies, using the Trinity and Cufflinks software, respectively. We then removed low-confidence putative transcripts (PTs) from both assemblies and merged the remaining PTs into an integrated transcriptome of 132,928 PTs, of which 126,225 (~95%) came from the de novo assembly and more than 91% were spliced. In the integrated transcriptome, ~90% and 63% of PTs had significant sequence similarity to sequences in the NCBI NT and NR databases, respectively; 68,754 (~52%) PTs were annotated with 15,965 unique GO terms; and 7,618 PTs annotated with Enzyme Commission codes were assigned to 134 KEGG pathways. Full exon-intron junctions of 17,528 PTs were validated by PacBio IsoSeq full-length cDNA reads from three other porcine tissue types, NCBI pig RefSeq mRNAs, and transcripts from the Ensembl Sscrofa10.2 annotation. Completeness of the 5’ termini of 37,569 PTs was validated by public CAGE data. By comparison with the Ensembl transcripts, we found that the deduced precursors of 54,402 PTs shared at least one intron or exon with those of 18,437 Ensembl transcripts; that 12,262 PTs had both 5’ and 3’ UTRs longer than those of their maximally overlapping Ensembl transcripts; and that 41,838 spliced PTs were entirely missing from the Sscrofa10.2 annotation. Similar results were obtained when the PTs were compared with the pig NCBI RefSeq mRNA collection.
Conclusion: We built, validated and annotated a comprehensive porcine blood transcriptome with significant improvement over the annotation of Ensembl Sscrofa10.2 and the pig NCBI RefSeq mRNAs, and laid a foundation for blood-based high-throughput transcriptomic assays in pigs and for advancing annotation of the pig genome.
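
The proportions quoted in the Results can be re-derived from the reported counts. The following is a minimal sketch, not part of the published analysis: the counts are copied from the abstract, while the names `TOTAL_PTS`, `reported`, and `share` are hypothetical and introduced only for illustration. Shares that the abstract does not state (for example, the fraction of PTs with validated junctions) are simple ratios against the 132,928 integrated PTs and should be read as arithmetic, not as reported results.

```python
# Counts taken from the abstract; variable and function names are illustrative only.
TOTAL_PTS = 132_928  # putative transcripts (PTs) in the integrated transcriptome

reported = {
    "from the de novo assembly": 126_225,
    "annotated with GO terms": 68_754,
    "annotated with EC codes and assigned to KEGG pathways": 7_618,
    "full exon-intron junctions validated": 17_528,
    "5' termini validated by CAGE": 37_569,
}


def share(count, total=TOTAL_PTS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


for label, count in reported.items():
    print(f"{label}: {count:,} PTs ({share(count):.1f}% of {TOTAL_PTS:,})")
```

Rounded to whole percentages, the first two entries agree with the ~95% and ~52% figures given in the abstract.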