Research Project: MaizeGDB - Database and Computational Resources for Maize Genetics, Genomics, and Breeding Research

Location: Corn Insects and Crop Genetics Research

Title: PanEffect: A pan-genome visualization tool for variant effects in maize

Author
Andorf, Carson
Haley, Olivia - ORISE Fellow
Hayford, Rita - ORISE Fellow
Portwood, John
Sen, Shatabdi - Iowa State University
Cannon, Ethalinda
Gardiner, Jack - University of Missouri
Woodhouse, Margaret

Submitted to: bioRxiv
Publication Type: Pre-print Publication
Publication Acceptance Date: 9/26/2023
Publication Date: 9/26/2023
Citation: Andorf, C.M., Haley, O., Hayford, R., Portwood II, J.L., Sen, S., Cannon, E.K., Gardiner, J.M., Woodhouse, M.H. 2023. PanEffect: A pan-genome visualization tool for variant effects in maize. bioRxiv. Article 09.25.559155. https://doi.org/10.1101/2023.09.25.559155.
DOI: https://doi.org/10.1101/2023.09.25.559155

Interpretive Summary: Understanding the effects of genetic variants is crucial for accurately predicting traits and phenotypic outcomes. Recent advances have used artificial intelligence models to score all possible protein-coding mutations and indicate whether those mutations are benign or likely to have an observable effect. A reliable tool is now needed to explore these effects at the pan-genome level. To address this gap, a new tool called PanEffect was developed. PanEffect is available at the Maize Genetics and Genomics Database and enables a comprehensive examination of the potential effects of coding variants across 51 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and observe the effects of the 2.3 million natural variations in the maize pan-genome. The strength of PanEffect lies in its potential to advance investigations into protein variants and to pinpoint genetic targets that can enhance crop breeding strategies. Thus, in an era where food security and sustainable agriculture are paramount, tools like PanEffect can assist researchers and breeders seeking to harness genetic potential in crops like maize.

Technical Abstract: Understanding the effects of genetic variants is crucial for accurately predicting traits and phenotypic outcomes. Recent advances have utilized protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 51 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and also to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the ESM protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to strong phenotypic consequences. Additionally, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement.
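
Illustrative sketch: the abstract describes each variant effect score as a log-likelihood ratio derived from a protein language model. The Python below shows one common way such a score can be computed, as log P(variant residue) minus log P(reference residue) at a position, so that values near zero suggest a tolerated substitution and strongly negative values suggest a likely effect. It does not call ESM and is not PanEffect's code: the probability table from toy_log_probs, the function names, and the five-residue sequence are all hypothetical stand-ins for real model output.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_log_probs(sequence, ref_prob=0.62):
    """Hypothetical stand-in for model output: one {residue: log-prob} dict per position.

    The reference residue gets probability ref_prob and the other 19 residues
    share the remainder equally. A real protein language model would supply
    position-specific probabilities instead.
    """
    other_prob = (1.0 - ref_prob) / (len(AMINO_ACIDS) - 1)
    return [
        {aa: math.log(ref_prob if aa == ref else other_prob) for aa in AMINO_ACIDS}
        for ref in sequence
    ]


def llr_score(log_probs, position, ref, variant):
    """Log-likelihood ratio: log P(variant) - log P(reference) at one position."""
    row = log_probs[position]
    return row[variant] - row[ref]


def substitution_scores(log_probs, sequence):
    """Score every possible single substitution (19 per position)."""
    return {
        (pos, aa): llr_score(log_probs, pos, ref, aa)
        for pos, ref in enumerate(sequence)
        for aa in AMINO_ACIDS
        if aa != ref
    }


sequence = "MKTAY"  # made-up peptide, not a maize protein
log_probs = toy_log_probs(sequence)
scores = substitution_scores(log_probs, sequence)
```

The resulting dictionary has one entry per (position, substituted residue) pair, which is the shape of data a position-by-residue heatmap like the one described above would need.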