Location: Plant, Soil and Nutrition Research
Project Number: 8062-21000-052-002-A
Project Type: Cooperative Agreement
Start Date: Sep 16, 2020
End Date: Sep 15, 2025
Objective:
Breeding Insight (BI) will continue to support breeding projects across ARS with a team of specialists in information technology, genomics, and breeding process design who partner with individual ARS pre-breeding and breeding groups. The initial two years focused on recruiting the BI team, preparing breeding programs to adopt new technologies, and integrating informatics tools. The project leverages investments in nine open source informatics tools that are already funded by ARS, CGIAR, USAID, and BMGF and are being used and developed by six institutions. Currently, these nine tools cover the range of activities needed for most breeding programs to operate efficiently, but they do not interoperate well, and some needs are not covered (e.g., animal welfare management). Additionally, some of the tools have been scaled for very large programs and need to be simplified for smaller breeding programs. The software engineers will continue to improve and integrate these tools, or create new tools, so they can be applied to specialty animal and plant breeding programs.
Initially, the platform addresses four use cases: efficient genotyping, pedigree verification, genomic prediction, and identification of novel favorable variants. A director, a software development team, and application specialist coordinators lead this effort from Ithaca, NY. During the pilot phase in the initial years, the project focused on six breeding programs. In this phase, the project, in collaboration with ARS coordinators, will expand service and support to an additional five species (bringing the total number of species served to eleven). In later years, the software engineering team will focus on integration with germplasm collections, scaling, and support for the wide range of biology encountered across dozens of breeding programs.
Breeding Insight could more than double the efficiency of breeding programs, which would result in more sustainable, nutritious, and profitable fruits, vegetables, aquacultural species, and rangeland plants. The shared open source platform will also allow innovation and talent to be shared much more widely.
The open source software system developed by this project will also have broad applicability to numerous non-agricultural species, including species critical for ecology, conservation biology, pathology, or any genomic diversity study. This project conducts training to ensure researchers in other communities can use and contribute to the software platform. Additionally, because of the training and the platform’s open source nature, it will provide a catalyst for start-up companies and university-based breeding programs to accelerate their efforts.
Approach:
To build the Breeding Insight software, a team of developers and coordinators is developing a platform that combines the various software tools so that:
1. Breeding programs can track germplasm resources and field experiment designs.
2. Phenotypic data can be easily collected in the field and integrated with genomic data.
3. Genotyping data can be easily integrated with germplasm and phenotypic data.
4. Pedigree relationships can be evaluated.
5. Whole-genome prediction of phenotypic traits can be performed.
6. Animal welfare data can be efficiently tracked, monitored, and reported to regulatory agencies.
This project will continue to coordinate the genomic diversity analyses for all included species, which include using long-read DNA sequencing technology to assemble genomes, resequencing to discover variants, developing low-cost genotyping assays, and then genotyping relevant germplasm.
In its first two years, this project worked on integrating and implementing BrAPI and BreedBase in a container that is easily deployable to cloud services, deploying FieldBook to work with BreedBase via BrAPI, and developing user interfaces that are intuitive to breeders.
Building on these accomplishments, the sequence of the software development will be:
1. Continue development of interfaces and workflows required by another 5 species.
2. Update the EBS and GOBII modules to support genomics.
3. Update Sample Tracker to support working with genotyping vendors.
4. Integrate Pangenome Graphs to support whole genome information.
5. Develop BrAPI communication between BI and GRINGlobal and USDA NPGS.
6. Identify or develop animal welfare data collection software for improved record keeping.
Element 1 will continue throughout the project. Elements 2-4 are expected to take a year and a half of software development time, and Elements 5-6 another two years.
The coordinators have worked with 6 pilot plant and animal species to develop and test the system on their breeding programs. In this project, coordinators will continue to support the needs of the pilot species and will begin supporting an additional 5 species.