Research Project: Genomes to Phenomes in Beef Cattle Research

Location: Genetics and Animal Breeding

Title: Enhanced bovine genome annotation through integration of transcriptomics and epi-transcriptomics datasets facilitates genomic biology

Author
item BEIKI, HAMID - Iowa State University
item MURDOCH, BRENDA - University Of Idaho
item PARK, CARISSA - Iowa State University
item KERN, CHANDLER - Pennsylvania State University
item KONTECHY, DENISE - University Of Idaho
item BECKER, GABRIELLE - University Of Idaho
item RINCON, GONZALO - Zoetis
item JIANG, HONGLIN - Virginia Tech
item ZHOU, HUAIJUN - University Of California, Davis
item THORNE, JACOB - Iowa State University
item KOLTES, JAMES - Iowa State University
item MICHAL, JENNIFER - Washington State University
item DAVENPORT, KIMBERLEY - University Of Missouri
item RIJNKELS, MONIQUE - Texas A&M University
item ROSS, PABLO - University Of California, Davis
item HU, RUI - Virginia Tech
item CORUM, SARAH - Zoetis
item MCKAY, STEPHANIE - University Of Vermont
item Smith, Timothy - Tim
item LIU, WANSHENG - Pennsylvania State University
item MA, WENZHI - Pennsylvania State University
item ZHANG, XIAOHUI - Washington State University
item XU, XIANOQING - University Of California, Davis
item HAN, XUELEI - University Of California, Davis
item JIANG, ZHIHUA - University Of California, Davis
item HU, ZHI-LIANG - University Of Iowa
item REECY, JAMES - Iowa State University

Submitted to: Gigascience
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/27/2024
Publication Date: 4/16/2024
Citation: Beiki, H., Murdoch, B.M., Park, C.A., Kern, C., Kontechy, D., Becker, G., Rincon, G., Jiang, H., Zhou, H., Thorne, J., Koltes, J.E., Michal, J.J., Davenport, K.G., Rijnkels, M., Ross, P.J., Hu, R., Corum, S., McKay, S., Smith, T.P.L., Liu, W., Ma, W., Zhang, X., Xu, X., Han, X., Jiang, Z., Hu, Z., Reecy, J.M. 2024. Enhanced bovine genome annotation through integration of transcriptomics and epi-transcriptomics datasets facilitates genomic biology. Gigascience. 13. Article giae019. https://doi.org/10.1093/gigascience/giae019.
DOI: https://doi.org/10.1093/gigascience/giae019

Interpretive Summary: An international effort to document the functional parts of the cattle genome assembly, a process known as "annotation", is named the "Functional Annotation of Animal Genomes" (FAANG) project. The U.S. effort to support FAANG involved generating a series of data types, known collectively as "omics", from a common set of samples at various institutions. Integrating data that measure gene expression with other data types that describe the state of the genome near the originating gene provides a means to identify variation between animals that might affect their phenotype. This annotation is necessary to provide context for deciding whether genome sequence variation between individual animals might control gene expression levels, and thus underlie heritable effects on traits such as productivity, health, or environmental impact. The output is a resource for studies that determine the DNA sequence of specific individuals and try to predict the effects of that sequence to support breeding or management objectives.

Technical Abstract: Background: The accurate identification of the functional elements in the bovine genome is a fundamental requirement for high-quality analysis of data informing both genome biology and genomic selection. Functional annotation of the bovine genome was performed to identify a more complete catalog of transcript isoforms across bovine tissues. Results: A total of 160,820 unique transcripts (50% protein coding) representing 34,882 unique genes (60% protein coding) were identified across tissues. Among them, 118,563 transcripts (73% of the total) were structurally validated by independent datasets (PacBio isoform sequencing data, Oxford Nanopore Technologies sequencing data, de novo assembled transcripts from RNA sequencing data) and comparison with Ensembl and NCBI gene sets. In addition, all transcripts were supported by extensive data from different technologies such as whole transcriptome termini site sequencing, RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression, chromatin immunoprecipitation sequencing, and assay for transposase-accessible chromatin using sequencing. A large proportion of identified transcripts (69%) were unannotated, of which 86% were produced by annotated genes and 14% by unannotated genes. A median of two 5′ untranslated regions were expressed per gene. Around 50% of protein-coding genes in each tissue were bifunctional and transcribed both coding and noncoding isoforms. Furthermore, we identified 3,744 genes that functioned as noncoding genes in fetal tissues but as protein-coding genes in adult tissues. Our new bovine genome annotation extended more than 11,000 annotated gene borders compared to Ensembl or NCBI annotations. The resulting bovine transcriptome was integrated with publicly available quantitative trait loci data to study tissue–tissue interconnection involved in different traits and construct the first bovine trait similarity network.
Conclusions: These validated results show significant improvement over current bovine genome annotations.