Location: Plant Genetics Research
Title: Trait association and prediction through integrative K-mer analysis
Author
HE, CHENG - Kansas State University
Washburn, Jacob
SCHLEIF, NATHANIEL - University of Wisconsin
HAO, YANGFAN - Kansas State University
KAEPPLER, HEIDI - University of Wisconsin
KAEPPLER, SHAWN - University of Wisconsin
ZHANG, ZHIWU - Washington State University
YANG, JINLIANG - University of Nebraska
LIU, SANZHEN - Kansas State University
Submitted to: The Plant Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/22/2024
Publication Date: 9/11/2024
Citation: He, C., Washburn, J.D., Schleif, N., Hao, Y., Kaeppler, H., Kaeppler, S., Zhang, Z., Yang, J., Liu, S. 2024. Trait association and prediction through integrative K-mer analysis. The Plant Journal. 120(2): 833-850. https://doi.org/10.1111/tpj.17012.
DOI: https://doi.org/10.1111/tpj.17012
Interpretive Summary: Genome-wide association study (GWAS) and genomic prediction (GP) are popular and effective methods for determining which genes potentially contribute to a trait and for predicting how different individuals manifest that trait. Both methods traditionally require mapping DNA sequences to a sequenced reference genome. This mapping process is error-prone and depends on the existence and quality of a reference genome. An alternative approach was developed and tested that uses k-mers, short DNA sequence fragments of length k, directly, without a mapping step. This approach was shown to complement traditional methods and, in some cases, to be more accurate than those methods.
Technical Abstract: Genome-wide association study (GWAS) with single nucleotide polymorphisms (SNPs) has been widely used to explore genetic controls of phenotypic traits. Alternatively, GWAS can use counts of k-mers, substrings of length k from longer sequencing reads, as genotyping data. Using maize cob and kernel color traits, we demonstrated that k-mer GWAS can effectively identify associated k-mers. Co-expression analysis of kernel color k-mers and genes directly found k-mers from known causal genes. Analyzing the complex traits of kernel oil and leaf angle resulted in k-mers from both known and candidate genes. A gene encoding a MADS transcription factor was functionally validated by showing that ectopic expression of the gene led to less upright leaves. Evolutionary analysis revealed that most k-mers positively correlated with kernel oil were strongly selected against in maize populations, while most k-mers for upright leaf angle were positively selected. In addition, genomic prediction of kernel oil, leaf angle, and flowering time using k-mer data achieved prediction accuracy similarly high to that of the standard SNP-based method. Collectively, we showed that k-mer GWAS is a powerful approach for identifying trait-associated genetic elements. Further, our results demonstrated the bridging role of k-mers for data integration and functional gene discovery.
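Illustrative sketch: the abstract describes using counts of length-k substrings from sequencing reads as genotyping data. The short Python example below shows only that basic counting step on made-up reads. The function name `count_kmers`, the toy reads, and the choice to skip windows containing the ambiguous base N are assumptions made for illustration; this is not the authors' pipeline, which is not described in this record.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a list of reads.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k along the read, one base at a time.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: windows containing the ambiguous base N are skipped.
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy reads, invented for this example.
reads = ["ACGTAC", "CGTACG"]
kmer_counts = count_kmers(reads, 3)
for kmer, n in sorted(kmer_counts.items()):
    print(kmer, n)
```

In a k-mer GWAS setting, a table of such counts per individual would take the place of the SNP genotype matrix; no reference genome is needed to build it.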