Location: Corn Insects and Crop Genetics Research
Title: Pandagma: a tool for identifying pan-gene sets and gene families at desired evolutionary depths and accommodating whole genome duplications
Authors:
Cannon, Steven
Lee, Hyun-Oh - ORISE Fellow
Weeks, Nathan
Berendzen, Joel - GenerisBio, LLC
Submitted to: Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/18/2024
Publication Date: 8/24/2024
Citation: Cannon, S.B., Lee, H., Weeks, N.T., Berendzen, J. 2024. Pandagma: a tool for identifying pan-gene sets and gene families at desired evolutionary depths and accommodating whole genome duplications. Bioinformatics. https://doi.org/10.1093/bioinformatics/btae526.
DOI: https://doi.org/10.1093/bioinformatics/btae526
Interpretive Summary: Identifying corresponding genes from different individuals in a species, or from different species in a genus, is important for discovering the causes of differences between those individuals. In turn, understanding the causes of those differences helps breeders and researchers to select and generate crop varieties with improved characteristics. This publication describes software for comparing and analyzing all genes in a set of individuals. The software places them into collections of genes that represent the core sets of all genes in a species -- or if applied to multiple species, all genes in that set of species. This software is expected to help breeders and researchers to better utilize diverse genetic material for crop improvement.
Technical Abstract: Identification of allelic or corresponding genes (pan-genes) within a species or genus is important for discovery of biologically significant genetic conservation and variation. Similarly, identification of orthologs (gene families) across wider evolutionary distances is important for understanding the genetic basis for similar or differing traits. Especially in plants, several complications make identification of pan-genes and gene families challenging, including whole-genome duplications, evolutionary rate differences among lineages, and varying qualities of assemblies and annotations. Here, we document and distribute a set of workflows that we have used to address these problems.