Author
CHEN, CHARLES - Cornell University
DECLERCK, GENEVIEVE - Cornell University
TIAN, FENG - China Agricultural University
SPOONER, WILLIAM - Cornell University
MCCOUCH, SUSAN - Cornell University
Buckler, Edward - Ed
Submitted to: PLOS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/4/2012
Publication Date: 11/7/2012
Citation: Chen, C., Declerck, G., Tian, F., Spooner, W., McCouch, S., Buckler IV, E.S. 2012. PICARA, an analytical pipeline providing probabilistic inference about a priori candidates genes underlying genome-wide association QTL in plants. PLoS One. 7(11): e46596.
Interpretive Summary: Quantitative genetics and molecular biology take different approaches to identifying the genes responsible for complex traits, but the two approaches are highly complementary. PICARA provides the statistical basis and algorithms for combining them. The power of the approach is demonstrated by dissecting flowering time in corn, but it could be applied to any species with sufficient quantitative genetics and molecular biology data. The approach is expected to help researchers identify genes responsible for key traits when both types of data are available.
Technical Abstract: PICARA is an analytical pipeline designed to systematically summarize observed SNP/trait associations identified by genome-wide association studies (GWAS) and to identify candidate genes involved in the regulation of complex trait variation. The pipeline provides probabilistic inference about a priori candidate genes using integrated information derived from genome-wide association signals, gene homology, and curated gene sets embedded in pathway descriptions. In this paper, we demonstrate the performance of PICARA using data for flowering time variation in maize, a key trait for the geographical and seasonal adaptation of plants.
Among 406 curated flowering time-related genes from Arabidopsis, we identify 61 orthologs in maize that are significantly enriched for GWAS SNP signals, including key regulators such as FT (Flowering Locus T) and GI (GIGANTEA), and genes centered in the Arabidopsis circadian pathway, including TOC1 (Timing of CAB Expression 1) and LHY (Late Elongated Hypocotyl). In addition, we discover a regulatory feature that is characteristic of these a priori flowering time candidates in maize. This new probabilistic analytical pipeline helps researchers infer the functional significance of candidate genes associated with complex traits and helps guide future experiments by providing statistical support for gene candidates based on the integration of heterogeneous biological information.
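To make the idea of a candidate gene set being "enriched for GWAS SNP signals" concrete, the sketch below computes a one-sided hypergeometric tail probability. This is a generic, textbook enrichment test offered only as an illustration; the abstract does not describe PICARA's actual statistical model, and this is not it. The function name, its parameters, and every number in the example call are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, candidate_genes, hit_genes, candidate_hits):
    """One-sided hypergeometric tail probability P(X >= candidate_hits).

    Hypothetical model: `hit_genes` genes carrying a GWAS signal are drawn
    without replacement from `total_genes`, of which `candidate_genes`
    belong to the a priori candidate set. X counts candidates among the hits.
    """
    upper = min(candidate_genes, hit_genes)
    denominator = comb(total_genes, hit_genes)
    tail = sum(
        comb(candidate_genes, k) * comb(total_genes - candidate_genes, hit_genes - k)
        for k in range(candidate_hits, upper + 1)
    )
    return tail / denominator


# Toy numbers for illustration only (NOT results from the paper):
# 1000 genes in total, 50 a priori candidates, 100 genes with a GWAS
# signal, 12 of which are candidates.
p_value = hypergeom_enrichment_p(1000, 50, 100, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would mean that the overlap between candidates and GWAS-hit genes is larger than expected by chance under this simple model. A pipeline that integrates homology and pathway information, as the abstract describes, would need considerably more than this single test.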