
Title: A Microarray Analysis for Differential Gene Expression in the Soybean Genome Using Bioconductor and R

Author
ALVORD, W - NATIONAL CANCER INSTITUTE
ROAYAEI, JEAN - NATIONAL CANCER INSTITUTE
QUINONES, OCTAVIO - NATIONAL CANCER INSTITUTE
Schneider, Katherine

Submitted to: Briefings in Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/31/2007
Publication Date: 10/17/2007
Citation: Alvord, W.G., Roayaei, J., Quinones, O.A., Schneider, K. 2007. A microarray analysis for differential gene expression in the soybean genome using Bioconductor and R. Briefings in Bioinformatics. 1093:43.

Interpretive Summary: The Affymetrix Soybean GeneChip® microarray contains transcripts that can be used to study three different species: Glycine max (soybean), Phytophthora sojae (a water mold that commonly attacks soybean crops), and Heterodera glycines (soybean cyst nematode). Affymetrix provides MAS5 software for the analysis of gene expression data collected using the microarrays, but experts disagree on the utility of the software. This paper describes specific procedures for conducting quality assessment of Affymetrix microarray soybean genome data and performing analyses to determine differential gene expression using the open-source R statistical language in conjunction with the open-source Bioconductor analysis software package. Procedures are described for extracting Glycine max Affymetrix IDs from the other transcripts on the soybean chip. RNA degradation and recommended procedures from Affymetrix for quality control are discussed, and graphical data modeling plots that may be used to identify aberrant chips are displayed. The robust multi-chip averaging (RMA) procedure was used for background correction, normalization, and summarization of the raw data to obtain expression level data and discover differentially expressed genes. Data from soybean lines resistant and susceptible to Phakopsora pachyrhizi are used as an example to show how these procedures successfully identified differentially expressed genes that may play a role in soybean resistance to a fungal pathogen. Complete source code for performing all quality assessment and statistical procedures may be downloaded from http://css.ncifcrf.gov/services/download/MicroarraySoybean.zip

Technical Abstract: This paper describes specific procedures for conducting quality assessment of Affymetrix GeneChip® soybean genome data and performing analyses to determine differential gene expression using the open-source R language and environment in conjunction with the open-source Bioconductor package. Procedures are described for extracting Glycine max Affymetrix IDs from the other transcripts on the soybean chip. RNA degradation and recommended procedures from Affymetrix for quality control are discussed, and chip pseudo-images of weights, residuals, and signed residuals, along with additional probe-level modeling plots that may be used to identify aberrant chips, are displayed. The robust multi-chip averaging (RMA) procedure was used for background correction, normalization, and summarization of the AffyBatch probe-level data to obtain expression level data and discover differentially expressed genes. Examples of boxplots and MA plots are shown for the expression level data. Volcano plots and heatmaps are used to demonstrate the use of (log) fold changes in conjunction with ordinary and moderated t statistics for determining interesting genes. All analyses were performed using Bioconductor and R. We show, with real data, how these procedures successfully identified differentially expressed genes that may play a role in soybean resistance to a fungal pathogen, Phakopsora pachyrhizi. Complete source code for performing all quality assessment and statistical procedures may be downloaded from http://css.ncifcrf.gov/services/download/MicroarraySoybean.zip.
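
The idea of combining (log) fold changes with an ordinary t statistic to pick out interesting genes can be illustrated with a minimal sketch. This is not the authors' code, which is written in R with Bioconductor and is available from the download link above. It is a small Python illustration under stated assumptions: the gene names, expression values, thresholds, and function names are all hypothetical; the values are assumed to be already on the log2 scale (as RMA output is); and only the ordinary equal-variance t statistic is computed, not the moderated one.

```python
import math
from statistics import mean, variance

def log2_fold_change(resistant, susceptible):
    """Difference of group means; inputs are assumed to be log2 expression values."""
    return mean(resistant) - mean(susceptible)

def ordinary_t(resistant, susceptible):
    """Two-sample t statistic with a pooled (equal-variance) standard error."""
    n1, n2 = len(resistant), len(susceptible)
    pooled_var = ((n1 - 1) * variance(resistant)
                  + (n2 - 1) * variance(susceptible)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(resistant) - mean(susceptible)) / std_err

def volcano_points(expr):
    """Map each gene to its (log2 fold change, t statistic) pair."""
    return {gene: (log2_fold_change(r, s), ordinary_t(r, s))
            for gene, (r, s) in expr.items()}

def flag_interesting(points, min_abs_lfc=1.0, min_abs_t=4.0):
    """Keep genes that are large on BOTH axes (thresholds are illustrative only)."""
    return sorted(gene for gene, (lfc, t) in points.items()
                  if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t)

# Hypothetical log2 expression values: (resistant replicates, susceptible replicates).
example_expr = {
    "gene_A": ([8.0, 8.2, 7.8], [6.0, 6.1, 5.9]),  # clear shift between groups
    "gene_B": ([7.0, 7.5, 6.5], [7.1, 6.9, 7.0]),  # no shift between groups
}

points = volcano_points(example_expr)
interesting = flag_interesting(points)
```

Requiring both a large fold change and a large t statistic is the reasoning a volcano plot makes visual: a gene with a big fold change but noisy replicates, or a tiny but very consistent shift, falls outside the flagged region. A moderated t statistic would replace the per-gene pooled variance with one shrunk toward a value estimated across all genes, which this sketch does not attempt.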