Location: Genetic Improvement for Fruits & Vegetables Laboratory
Title: Candidate gene identification of existing or induced mutations with pipelines applicable to large genomes
Author
DONG, JIAQIANG - Rutgers University
TU, MIN - Rutgers University
FENG, YAPING - Rutgers University
ZDEPSKI, ANNA - Rutgers University
GE, FEI - Rutgers University
KUMAR, DIBYENDU - Rutgers University
Slovin, Janet
MESSING, JOACHIM - Rutgers University
Submitted to: Plant Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/2/2018
Publication Date: 11/12/2018
Citation: Dong, J., Tu, M., Feng, Y., Zdepski, A., Ge, F., Kumar, D., Slovin, J.P., Messing, J. 2018. Candidate gene identification of existing or induced mutations with pipelines applicable to large genomes. Plant Journal. 97:673-682. https://doi.org/10.1111/tpj.14153.
DOI: https://doi.org/10.1111/tpj.14153
Interpretive Summary: Two groups of programs, entitled VarMapDNA and VarMapRNA, for analyzing DNA and RNA sequence data were developed. They are easy to use and can be implemented by researchers without advanced computer capabilities. The programs allow researchers to pinpoint changes in sequence between the standard reference sequence and DNA or RNA sequences from organisms that have deleterious or beneficial traits. The programs are particularly useful for organisms with large genomes like maize. The programs were tested with sequence data from strawberry, and were then used to identify differences in gene sequence that cause the kernels on an ear of corn to develop improperly. These programs will be used by researchers studying how genes act, and by breeders working to improve crops.
Technical Abstract: Bulked segregant analysis is used to identify natural or induced variants that are linked to phenotypes. Although it is used in Arabidopsis and rice, it remains challenging for crops with large genomes, such as maize. Moreover, analysis of huge data sets can present a bottleneck for linking phenotypes to their molecular basis, especially for geneticists without programming experience. Here, we identified two genes of maize defective kernel mutants with user-friendly analysis pipelines that require no programming and should be applicable to any large genome. In the 1970s, Neuffer and Sheridan generated a chemically induced “defective kernel” (dek) mutant collection with the potential to uncover critical genes for seed development.
To locate such mutations, the dek phenotypes were introgressed into two inbred lines to take advantage of maize haplotype variations and their sequenced genomes. We generated two pipelines that take fastq files derived from next-generation paired-end DNA or cDNA sequencing as input, invoke several well-established and freely available genomic tools to call SNPs and INDELs, and generate lists of the most likely causal mutations together with variant index plots to locate the mutation to a specific sequence position on a chromosome. The pipelines were validated with a known strawberry mutation before cloning the dek mutants, thereby enabling phenotypic analysis of large genomes by next-generation sequencing.
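The variant index idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the VarMapDNA or VarMapRNA code; it assumes that per-site reference and alternate read counts for a mutant bulk have already been extracted from a variant caller's output, and the function names, the tuple layout, and the window and step sizes are all hypothetical. The index at a site is taken here as alternate reads divided by total depth, and a sliding-window mean is used to find the region where the index approaches 1, as would be expected near a recessive causal mutation in a bulk of mutant individuals.

```python
from typing import List, Tuple

# Hypothetical input layout: (position, ref_reads, alt_reads) for one chromosome.
Site = Tuple[int, int, int]


def variant_index(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a site that carry the alternate allele."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return alt_reads / depth


def windowed_means(sites: List[Site], window: int, step: int) -> List[Tuple[int, float]]:
    """Mean variant index in sliding windows [start, start + window).

    Returns (window_start, mean_index) pairs; windows without sites are skipped.
    """
    if not sites:
        return []
    last_pos = max(pos for pos, _, _ in sites)
    means = []
    start = 0
    while start <= last_pos:
        end = start + window
        values = [
            variant_index(ref, alt)
            for pos, ref, alt in sites
            if start <= pos < end and ref + alt > 0  # ignore uncovered sites
        ]
        if values:
            means.append((start, sum(values) / len(values)))
        start += step
    return means


def peak_window(means: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Window with the highest mean variant index (candidate linked region)."""
    return max(means, key=lambda item: item[1])


# Made-up counts for illustration only; not data from the study.
example_sites = [(100, 10, 10), (600, 8, 12), (1100, 2, 18), (1400, 0, 20), (2100, 9, 11)]
example_peak = peak_window(windowed_means(example_sites, window=1000, step=500))
print("candidate window start:", example_peak[0], "mean index:", round(example_peak[1], 3))
```

Plotting the windowed means against window start would give a rough equivalent of a variant index plot. A real analysis would also filter on depth and variant quality and would typically compare the mutant bulk with a wild-type bulk, which this sketch leaves out.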