Location: Virus and Prion Research
Title: Tripal, a community update after 10 years of supporting open source, standards-based genetic, genomic and breeding databases
Authors:
STATON, MARGARET - University Of Tennessee
Cannon, Ethalinda
SANDERSON, LACEY-ANNE - University Of Saskatchewan
WEGRZYN, JILL - University Of Connecticut
BUEHLER, SEAN - University Of Tennessee
FICKLIN, STEPHEN - Washington State University
GRAU, EMILY - University Of Connecticut
GUIGNON, VALENTIN - Bioversity International
GUNOSKEY, JESSICA - University Of Connecticut
JUNG, SOOK - Washington State University
MAIN, DORRIE - Washington State University
Poelchau, Monica
RAMNATH, RISHARDE - University Of Connecticut
COBO, IRENE - University Of Connecticut
RICHTER, PETER - University Of Connecticut
WEST, JOE - University Of Tennessee
Anderson, Tavis
INDERSKI, BLAKE - Orise Fellow
Faaberg, Kay
Lager, Kelly
Submitted to: Briefings in Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/1/2021
Publication Date: 7/12/2021
Citation: Staton, M., Cannon, E.K., Sanderson, L., Wegrzyn, J., Buehler, S., Ficklin, S., Grau, E., Guignon, V., Gunoskey, J., Jung, S., Main, D., Poelchau, M.F., Ramnath, R., Cobo, I., Richter, P., West, J., Anderson, T.K., Inderski, B., Faaberg, K.S., Lager, K.M. 2021. Tripal, a community update after 10 years of supporting open source, standards-based genetic, genomic and breeding databases. Briefings in Bioinformatics. 22(6). https://doi.org/10.1093/bib/bbab238.
DOI: https://doi.org/10.1093/bib/bbab238
Interpretive Summary: Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate disparate data types, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. To address this problem, a common framework was developed that implements best practices and can be deployed across multiple databases. The system, called Tripal, can reduce development burden, provide interoperability, ensure use of common standards, and remain sustainable. Tripal provides functionality for searching, browsing, loading, and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals, and human data, primarily storing genomics, genetics and breeding data. The platform allows organisms, including swine pathogens, to be annotated to facilitate data mining and hypothesis generation. The integrated tools provide researchers timely access to sequences and associated descriptive data, allowing for biological data mining and epidemiological studies.
The results that Tripal provides allow for a better understanding of the organisms stored in such databases. This information can be used to describe the emergence of novel viruses and how these novel organisms are disseminated in the US and abroad, and it provides a toolkit for discovering new patterns of biological consequence.
Technical Abstract: Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards, and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading, and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals, and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles.