Location: Plant Genetics Research
Title: The allele catalog tool: a web-based interactive tool for allele discovery and analysis
Author:
CHAN, YEN ON - University Of Missouri
DIETZ, NICHOLAS - University Of Missouri
ZENG, SHUAI - University Of Missouri
WANG, JUEXIN - University Of Missouri
Flint-Garcia, Sherry
SALAZAR-VIDAL, NANCY - University Of California, Davis
SKRABISOVA, MARIA - Palacky University
Bilyeu, Kristin
JOSHI, TRUPTI - University Of Missouri System
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/31/2023
Publication Date: 3/10/2023
Citation: Chan, Y., Dietz, N., Zeng, S., Wang, J., Flint Garcia, S.A., Salazar-Vidal, N.M., Skrabisova, M., Bilyeu, K.D., Joshi, T. 2023. The allele catalog tool: a web-based interactive tool for allele discovery and analysis. BMC Genomics. 24: Article 107. https://doi.org/10.1186/s12864-023-09161-3.
DOI: https://doi.org/10.1186/s12864-023-09161-3

Interpretive Summary: Generating genomic sequence data for large numbers of accessions is now feasible for many agriculturally relevant species. However, the analysis of those big data sets has been descriptive, noncomprehensive, and static. The objective of this research was to design and develop a gene- and accession-based interactive bioinformatics tool utilizing genomic sequence data from large accession sets. The Allele Catalog Tool is an online resource for research on soybean, maize, and the model plant Arabidopsis that empowers users to explore the data in a gene-based, standardized format. The results are rendered with summary accession information along with details of the gene information. Detailed meta-information is also available for all accessions. The results are downloadable for additional analysis offline. The impact of this work is the ability to conduct biological investigations on previously generated data and therefore connect genotypes to phenotypes for an initial set of agriculturally important species.

Technical Abstract:
Background: Advances in sequencing technologies have made a plethora of whole-genome re-sequenced (WGRS) data publicly available. However, research utilizing the WGRS data without further configuration is nearly impossible. To solve this problem, our research group has developed an interactive Allele Catalog Tool to enable researchers to explore the coding region allelic variation present in over 1,000 re-sequenced accessions each for soybean, Arabidopsis, and maize.
Results: The Allele Catalog Tool was originally designed with soybean genomic data and resources. The Allele Catalog datasets were generated using our variant calling pipeline (SnakyVC) and the Allele Catalog pipeline (AlleleCatalog). The variant calling pipeline processes raw sequencing reads in parallel to generate Variant Call Format (VCF) files, and the Allele Catalog pipeline takes the VCF files, performs imputation and functional effect prediction, and assembles alleles for each gene to generate curated Allele Catalog datasets. Both pipelines were used to generate the data panels (VCF files and Allele Catalog files), in which the accessions of the WGRS datasets were collected from various sources, currently representing over 1,000 diverse accessions each for soybean, Arabidopsis, and maize. The main features of the Allele Catalog Tool include data query, visualization of results, categorical filtering, and download functions. Queries are performed from user input, and results are presented in tabular form: summary results by categorical description, and genotype results of the alleles for each gene. The categorical information is specific to each species; additionally, available detailed meta-information is provided in modal popups. The genotypic information contains the variant positions, reference or alternate genotypes, the functional effect classes, and the amino-acid changes of each accession. The results can also be downloaded for other research purposes.

Conclusions: The Allele Catalog Tool is a web-based tool that currently supports three species: soybean, Arabidopsis, and maize. The Soybean Allele Catalog Tool is hosted on the SoyKB website (https://soykb.org/SoybeanAlleleCatalogTool/), while the Allele Catalog Tool for Arabidopsis and maize is hosted on the KBCommons website (https://kbcommons.org/system/tools/AlleleCatalogTool/Zmays and https://kbcommons.org/system/tools/AlleleCatalogTool/Athaliana). Researchers can use this tool to connect variant alleles of genes with meta-information of species.
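
As an illustration of the allele assembly and categorical summary described in the Results, the following Python fragment is a minimal sketch and not the AlleleCatalog pipeline's actual code. It assumes genotypes have already been called and imputed, so every accession has a reference or alternate call at every variant position. The `Variant` class, the function names, the gene and accession identifiers, and the category labels are all hypothetical and invented for this example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One coding-region variant (hypothetical record layout)."""
    gene: str
    position: int
    ref: str
    alt: str
    effect: str     # functional effect class, e.g. "missense_variant"
    aa_change: str  # amino-acid change, e.g. "p.Thr10Ala"


def assemble_alleles(variants, genotypes):
    """Build {gene: {accession: allele}}.

    An allele is a tuple with one entry per variant position of the gene,
    ordered by position. `genotypes` maps accession -> {(gene, position):
    "ref" or "alt"} and is assumed complete (already imputed).
    """
    by_gene = defaultdict(list)
    for variant in variants:
        by_gene[variant.gene].append(variant)

    alleles = {}
    for gene, gene_variants in by_gene.items():
        ordered = sorted(gene_variants, key=lambda v: v.position)
        alleles[gene] = {}
        for accession, calls in genotypes.items():
            allele = tuple(
                f"{v.alt}|{v.effect}|{v.aa_change}"
                if calls[(gene, v.position)] == "alt"
                else f"{v.ref}|Ref"
                for v in ordered
            )
            alleles[gene][accession] = allele
    return alleles


def summarize_by_category(alleles_for_gene, category_of):
    """Count accessions per allele, broken down by a categorical label."""
    counts = defaultdict(Counter)
    for accession, allele in alleles_for_gene.items():
        counts[allele][category_of[accession]] += 1
    return counts


# Made-up example data: one gene, two variant positions, three accessions.
variants = [
    Variant("GeneA", 100, "A", "G", "missense_variant", "p.Thr10Ala"),
    Variant("GeneA", 50, "C", "T", "synonymous_variant", "p.Leu5Leu"),
]
genotypes = {
    "acc1": {("GeneA", 50): "ref", ("GeneA", 100): "alt"},
    "acc2": {("GeneA", 50): "ref", ("GeneA", 100): "alt"},
    "acc3": {("GeneA", 50): "ref", ("GeneA", 100): "ref"},
}
category_of = {"acc1": "Landrace", "acc2": "Elite", "acc3": "Landrace"}

alleles = assemble_alleles(variants, genotypes)
summary = summarize_by_category(alleles["GeneA"], category_of)
for allele, per_category in summary.items():
    print(allele, dict(per_category))
```

Representing each allele as a tuple of per-position calls keeps it hashable, so accessions carrying the same combination of variants group together when counted by category, which mirrors the summary-plus-genotype layout the tool presents.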