Publication : USDA ARS

Publication #275322

Title: Uniform standards for genome databases in forest and fruit trees

Author

	WEGRZYN, JILL - University of California
	MAIN, DORRIE - Washington State University
	FIGUEROA, BEN - University of California
	CHOI, MINYOUNG - University of California
	NEALE, DAVID - University of California
	JUNG, SOOK - Washington State University
	STANTON, MARGARET - Clemson University
	ZHENG, PING - Washington State University
	FICKLIN, STEPHEN - Washington State University
	CHO, ILHYUONG - Washington State University
	PEACE, CAMERON - Washington State University
	EVANS, KATE - Washington State University
	Volk, Gayle
	ORAGUZIE, NNADOZIE - Washington State University
	CHEN, CHUNXIAN - University of Florida
	GMITTER, FRED - University of Florida
	ABBOTT, ALBERT - Clemson University

Submitted to: Tree Genetics and Genomes
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/16/2012
Publication Date: 3/27/2012
Citation: Wegrzyn, J.L., Main, D., Figueroa, B., Choi, M., Neale, D.B., Jung, S., Stanton, M., Zheng, P., Ficklin, S., Cho, I., Peace, C., Evans, K., Volk, G.M., Oraguzie, N., Chen, C., Gmitter, F.G., Abbott, A.G. 2012. Uniform standards for genome databases in forest and fruit trees. Tree Genetics and Genomes. 8:549-557.

Interpretive Summary: Genomic databases for tree fruit and forestry species contain critical data for research and breeding programs. These data are most valuable when descriptive information about physical traits, experimental conditions, and environmental conditions is also available. This manuscript describes the development and integration of standardized vocabularies intended to increase the value of the genomic data within the TreeGenes and tfGDR databases.

Technical Abstract: TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for submitting and retrieving this information to enable comparative genomics research. Large-scale genotype and phenotype projects have recently spawned independent tools and interfaces within these repositories to deliver information to both geneticists and breeders. The growth in next-generation sequencing projects has increased both the amount of data and the scale of analysis that can be performed. The two repositories are now working toward a shared goal of archiving the diverse, independent data sets generated from genotype/phenotype experiments. This is achieved through focused development of data input standards (templates), pipelines for storage and automated curation, and consistent annotation using widely accepted ontologies to improve the extraction and exchange of data for comparative analysis. Efforts toward standardization are not limited to genotype/phenotype experiments; they are also being applied to other data types to improve gene prediction and annotation for de novo sequencing projects. The resources developed toward these goals represent the first large-scale coordinated effort in plant databases to add informatic value to diverse genotype/phenotype experiments.
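
To illustrate the general idea of pairing an input template with ontology-based annotation, the following is a minimal sketch in Python. It is not the TreeGenes or tfGDR implementation: the field names, the `validate_record` function, and the `EX:` term identifiers are all hypothetical placeholders chosen for this example, and the vocabulary is a stand-in for whichever ontology a repository adopts.

```python
import re

# Hypothetical controlled vocabulary. The "EX:" identifiers are placeholders
# invented for this sketch; they are not terms from any real ontology.
VOCABULARY = {
    "EX:0000001": "fruit weight",
    "EX:0000002": "tree height",
}

# Hypothetical template: every phenotype record must supply these fields.
REQUIRED_FIELDS = ("sample_id", "trait_id", "value", "unit")

# Assumed identifier shape: a letter prefix, a colon, and seven digits.
TERM_ID_PATTERN = re.compile(r"^[A-Za-z]+:\d{7}$")


def validate_record(record):
    """Return a list of problems found in one phenotype record (empty if none)."""
    errors = []

    # Template check: each required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            errors.append(f"missing field: {field}")

    # Annotation check: the trait must be a well-formed, known term ID.
    trait_id = str(record.get("trait_id") or "").strip()
    if trait_id:
        if not TERM_ID_PATTERN.match(trait_id):
            errors.append(f"malformed term ID: {trait_id}")
        elif trait_id not in VOCABULARY:
            errors.append(f"unknown term ID: {trait_id}")

    # Measurement check: the value must parse as a number when supplied.
    raw_value = record.get("value")
    if raw_value is not None and str(raw_value).strip():
        try:
            float(raw_value)
        except (TypeError, ValueError):
            errors.append(f"non-numeric value: {raw_value}")

    return errors


if __name__ == "__main__":
    good = {"sample_id": "S1", "trait_id": "EX:0000001", "value": "152.4", "unit": "g"}
    bad = {"sample_id": "S2", "trait_id": "fruit weight", "value": "150", "unit": "g"}
    print(validate_record(good))
    print(validate_record(bad))
```

Rejecting free-text trait names such as "fruit weight" in favor of term identifiers is what would let records from two repositories be compared directly, since both sides refer to the same vocabulary entry rather than to locally chosen labels.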