Location: Plant Genetics Research
Title: Trait association and prediction through integrative K-mer analysis
Author
HE, CHENG - Kansas State University
Washburn, Jacob
HAO, YANGFAN - Kansas State University
ZHANG, ZHIWU - Washington State University
YANG, JINLIANG - University Of Nebraska
LIU, SANZHEN - Kansas State University
Submitted to: bioRxiv
Publication Type: Pre-print Publication
Publication Acceptance Date: 11/19/2021
Publication Date: 11/19/2021
Citation: He, C., Washburn, J.D., Hao, Y., Zhang, Z., Yang, J., Liu, S. 2021. Trait association and prediction through integrative K-mer analysis. bioRxiv. https://doi.org/10.1101/2021.11.17.468725.
DOI: https://doi.org/10.1101/2021.11.17.468725
Interpretive Summary: Genome-wide association studies (GWAS) and genomic prediction (GP) are popular and effective methods for determining which genes potentially contribute to a trait, and for predicting how different individuals manifest that trait. Both methods traditionally require mapping DNA sequences to a sequenced reference genome. This mapping process is error-prone and depends on the existence and quality of a reference genome. An alternative approach was developed and tested that uses k-mers, short fragments of length k taken from DNA sequences, directly, without a mapping step. This approach was shown to work in ways that are complementary to traditional methods, and in some cases to be more accurate than those methods.
Technical Abstract: Genome-wide association study with single nucleotide polymorphisms (SNPs) has been widely used to explore genetic controls of phenotypic traits. Here we employed an approach based on k-mers, short substrings from sequencing reads. Using maize cob and kernel color traits, we demonstrated that k-mer GWAS can identify associated k-mers from known loci. Co-expression analysis of kernel color associated k-mers and pathway genes directly found k-mers from causal genes. Analyzing complex traits of kernel oil and leaf angle resulted in associated k-mers from known and candidate genes. Evolution analysis revealed most k-mers positively correlated with kernel oil were under purifying selection in maize populations, while most k-mers for upright leaf angle were positively selected. In addition, phenotypic prediction of flowering time using k-mer data showed a similar prediction accuracy to the SNP method. Collectively, our results demonstrated that the k-mer can be a bridging element for data integration and functional gene discovery.
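To make the idea of using k-mers "directly without a mapping step" concrete, the following is a minimal illustrative sketch in Python, not the authors' pipeline. It counts k-mers in raw reads and builds a presence/absence table across samples, the kind of reference-free genotype table that an association or prediction model could take as input. The function names (`canonical`, `count_kmers`, `presence_matrix`), the `min_count` threshold, and the choice to collapse each k-mer with its reverse complement are all assumptions made for illustration; the record above does not specify them.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement.

    Collapsing strands this way is a common convention (assumed here), since
    sequencing reads can come from either strand.
    """
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over an iterable of read strings."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def presence_matrix(samples, k, min_count=1):
    """Build a k-mer presence/absence table across samples.

    samples: dict mapping sample name -> list of read strings.
    min_count: hypothetical threshold; a k-mer counts as present in a sample
               only if it was seen at least this many times.
    Returns (sample_names, {kmer: [0 or 1 per sample]}).
    """
    names = list(samples)
    per_sample = {name: count_kmers(samples[name], k) for name in names}
    all_kmers = sorted(set().union(*per_sample.values()))
    matrix = {
        kmer: [int(per_sample[name][kmer] >= min_count) for name in names]
        for kmer in all_kmers
    }
    return names, matrix


if __name__ == "__main__":
    toy_samples = {"s1": ["ACGT"], "s2": ["CGCG"]}
    names, matrix = presence_matrix(toy_samples, k=2)
    print(names)
    for kmer, row in matrix.items():
        print(kmer, row)
```

Each row of the resulting table is a binary marker that could be tested against a phenotype in the same way a SNP genotype would be, with no reference genome involved. Real analyses use much longer k-mers and dedicated counting tools; the tiny `k=2` here only keeps the toy example readable.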