Title: The porcine translational research database: A manually curated, genomics and proteomics-based research resource

Authors:
Dawson, Harry
Chen, Celine
Gaynor, Brady
Shao, Jonathan
Urban, Joseph

Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/2/2017
Publication Date: 8/2/2017
Citation: Dawson, H.D., Chen, C.T., Gaynor, B., Shao, J.Y., Urban Jr, J.F. 2017. The porcine translational research database: A manually curated, genomics and proteomics-based research resource. Biomed Central (BMC) Genomics. doi: 10.1186/s12864-017-4009-7.

Interpretive Summary: The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation needed to study pig models that are relevant to human studies and to compare them with rodent models. Furthermore, they contain a significant number of errors because they rely primarily on machine-based annotation. To address these deficiencies, a comprehensive literature-based survey was conducted to identify genes that have demonstrated function in humans, mice or pigs. The process identified 11,805 candidate genes or proteins, which were used to select potential porcine homologs by searching multiple online sources of porcine gene information. The data in the Porcine Translational Research Database are supported by >6,000 references. The database contains 65 data fields for each entry, including >8,000 full-length (5’ and 3’) unambiguous pig sequences, >2,400 real-time PCR assays, and reactivity information on >1,300 antibodies. It also contains gene and/or protein expression data for >2,000 genes, and it identifies and corrects gene duplication artifacts, mis-assemblies, mis-annotations, and incorrect species assignments for >4,900 porcine genes. This database provides the first comprehensive description of three major superfamilies or functionally related groups of proteins (Cluster of Differentiation (CD) Marker genes, Solute Carrier Superfamily, ATP-Binding Cassette Superfamily), and a comparative description of porcine microRNAs. It is the largest manually curated database for any veterinary species and is unique among porcine gene databases in linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene databases.