Location: Plant Genetics Research
Title: Application of SNPViz v2.0 using next-generation sequencing data sets in the discovery of potential causative mutations in candidate genes associated with phenotypes
Authors:
ZENG, SHUAI - University of Missouri
SKRABISOVA, MARIA - Palacky University
LYU, ZHEN - University of Missouri
CHAN, YEN ON - University of Missouri
DIETZ, NICHOLAS - University of Missouri
Bilyeu, Kristin
JOSHI, TRUPTI - University of Missouri
Submitted to: International Journal of Data Mining and Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/5/2021
Publication Date: 7/27/2021
Citation: Zeng, S., Skrabisova, M., Lyu, Z., Chan, Y., Dietz, N., Bilyeu, K.D., Joshi, T. 2021. Application of SNPViz v2.0 using next-generation sequencing data sets in the discovery of potential causative mutations in candidate genes associated with phenotypes. International Journal of Data Mining and Bioinformatics. 25(1-2):65-85.
DOI: https://doi.org/10.1504/IJDMB.2021.116886
Interpretive Summary: The ability to generate genomic sequence data now exceeds the research capacity to efficiently analyze and interpret the data for soybean and other crop species. This research was aimed at creating an online tool that enables researchers to analyze and interpret genomic sequence data by visualizing the sequence variation from selected soybean accessions in user-defined regions of the genome. The visualization presents the results so that users can examine haplotypes, gene information, and any modifying effects of the sequence variants, and can also input statistical and phenotype results into the tool for advanced analyses. The research presented use cases in soybean and Arabidopsis to demonstrate broader applicability to other crop species. The impact of this research is a tool available to the research community that empowers researchers to analyze genomic sequence data as part of investigations to discover the genes that control important traits.
Technical Abstract: Single nucleotide polymorphisms (SNPs) and insertions/deletions (Indels) are the most common biological markers and are widely spread across all chromosomes of the genome. Because large amounts of SNP and Indel data have become available in the past ten years, it is a challenge to intuitively integrate, compare, and visualize them, and to effectively perform analysis across multiple samples simultaneously.
Genome-wide association studies (GWAS) are an approach to finding genetic variants associated with a trait, but they lack an efficient way to investigate the functions of genomic variants. To tackle these issues, we developed SNPViz v2.0, a web-based tool to visualize large-scale haplotype blocks with detailed SNPs and Indels grouped by their chromosomal coordinates, along with their overlapping gene models, phenotype-to-genotype accuracies, Gene Ontology (GO), protein families (Pfam), and their functional effects. SNPViz v2.0 is available in both SoyKB and KBCommons. For soybean, SNPViz v2.0 is available at http://soykb.org/SNPViz2/ . For other plants such as Arabidopsis thaliana and Zea mays, SNPViz v2.0 is available in the respective knowledge bases at https://kbcommons.org.