
Research Project: Federated Plant Database Initiative for the Legumes

Location: Corn Insects and Crop Genetics Research

Project Number: 5030-21000-069-003-N
Project Type: Non-Funded Cooperative Agreement

Start Date: Apr 15, 2015
End Date: Mar 31, 2020

Objective:
This research will extend and build on the Legume Information System (LIS), PeanutBase, SoyBase, and a genome database maintained at JCVI for the model species Medicago truncatula. While this work would be very similar to work already planned for LIS and other databases in the project, it would be more outward-looking, explicitly focusing on building collaborations with several other legume databases (e.g., CoolSeasonFoodLegume.org and KnowPulse) and with the iPlant project. The objectives are:
1. Assist in porting several important databases to state-of-the-art open-source model organism database tools (chiefly Chado, Tripal, JBrowse, CMap II, and InterMine).
2. Improve standardization of plant data and metadata collection by improving templates and protocols and by implementing common data storage schemas.
3. Improve the capacity of organism database projects to collect and manage complex phenotype data, using controlled vocabularies and well-defined protocols and schemas.
4. Implement a common, open, virtualized Data Repository, utilizing persistent locator IDs for data sets where possible and practical, standardized metadata, and robust methods for extracting and sharing data sets from project databases.
5. Integrate genetic, genomic, and phenotypic data across legume species, to enable identification of common molecular bases for important traits and to enable traversal across database projects via orthology, synteny, and mappings of other significant features.
6. Build and maintain orthology relationships with improved gene family and phylogenetic methods, and with regularly updated mappings between major phylogenetic databases.

Approach:
1. Genetic and genomic data for several crop species, including chickpea, common bean, peanut, pigeon pea, and others, will be entered into the primary database schema for the project (the Chado open-source biology schema). Data for these species will also be loaded into the GBrowse and JBrowse genome viewers for interactive viewing and exploration.
2. Standardization of plant data collection will be accomplished through improved data collection templates (shared with collaborators) and through new methods at iPlant for managing metadata (that is, data about data sets, such as provenance and experimental design).
3. Management of plant trait (phenotype) data will be improved partly by improving the data schemas used to store it. This will require ongoing revision of the biological database schemas.
4. A central Data Repository will be developed as a collaborative project, making use of storage capacity at iPlant and of improved metadata standards for describing the data sets.
5. Integration of genetic, genomic, and phenotypic data across legume species is largely a continuation of work now underway at SoyBase and LIS.
6. Improved orthology relationships, and their use in integrating across species: this work amounts to identifying corresponding genes between species. Such relationships establish gene "families" (sets of related genes). We will maintain mappings between the gene families defined in this project and those in other related projects, including the ARS Gramene project. This will allow traversal between species and among genomic databases and portals.
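
As an illustration of item 6 only, the idea of traversing between species through gene families can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's actual schema or tooling: the species labels are taken from the text, but every gene ID, family ID, external ID, and function name below is hypothetical.

```python
from collections import defaultdict

# Hypothetical gene-family assignments: (species, gene ID) -> family ID.
# All identifiers here are invented for illustration.
GENE_TO_FAMILY = {
    ("soybean", "gene_A1"): "FAM0001",
    ("soybean", "gene_A2"): "FAM0001",
    ("peanut", "gene_B1"): "FAM0001",
    ("chickpea", "gene_C1"): "FAM0002",
    ("peanut", "gene_B2"): "FAM0002",
}

# Hypothetical mapping from this project's family IDs to the family IDs
# used by another project (one family may map to several external ones).
FAMILY_TO_EXTERNAL = {
    "FAM0001": ["EXT_17"],
    "FAM0002": ["EXT_42", "EXT_43"],
}


def build_family_index(gene_to_family):
    """Invert the gene -> family table into family -> set of (species, gene)."""
    index = defaultdict(set)
    for gene, family in gene_to_family.items():
        index[family].add(gene)
    return index


def corresponding_genes(species, gene_id, target_species,
                        gene_to_family=GENE_TO_FAMILY):
    """Return genes in target_species that share a family with the query gene."""
    family = gene_to_family.get((species, gene_id))
    if family is None:
        return []  # gene has no family assignment
    index = build_family_index(gene_to_family)
    return sorted(g for s, g in index[family] if s == target_species)


def external_families(species, gene_id,
                      gene_to_family=GENE_TO_FAMILY,
                      family_to_external=FAMILY_TO_EXTERNAL):
    """Return the external project's family IDs that correspond to the query gene."""
    family = gene_to_family.get((species, gene_id))
    if family is None:
        return []
    return list(family_to_external.get(family, []))


if __name__ == "__main__":
    print(corresponding_genes("soybean", "gene_A1", "peanut"))
    print(external_families("chickpea", "gene_C1"))
```

In this sketch, a family is simply the set of genes sharing a family ID, and the second table plays the role of the cross-project mapping that lets a user move from a gene in one database to the corresponding family in another.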