Publication #199846

Title: Characterization of an EST database for the perennial weed leafy spurge: an important resource for weed biology research

Authors
Anderson, James
Horvath, David
Chao, Wun
Foley, Michael
Hernandez, Alvaro - Univ of Illinois-Urbana
Thimmapuram, Jyothi - Univ of Illinois-Urbana
Lei, Liu - Univ of Illinois-Urbana
Gong, George - Univ of Illinois-Urbana
Band, Mark - Univ of Illinois-Urbana
Kim, Ryan - Univ of Illinois-Urbana
Mikel, Mark - Univ of Illinois-Urbana

Submitted to: Weed Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/22/2006
Publication Date: 5/1/2007
Citation: Anderson, J.V., Horvath, D.P., Chao, W.S., Foley, M.E., Hernandez, A.G., Thimmapuram, J., Lei, L., Gong, G.L., Band, M., Kim, R., Mikel, M.A. 2007. Characterization of an EST database for the perennial weed leafy spurge: an important resource for weed biology research. Weed Science. 55:193-203. DOI:10.1614/WS-06-138.1.

Interpretive Summary: Genomics programs in the weed science community have not developed as rapidly as those for other crop, horticultural, forestry, and model plant systems. Development of genomic resources for selected model weeds is expected to enhance our understanding of weed biology, just as it has in other plant systems. In this report, we describe the development, characteristics, and information gained from an expressed sequence tag (EST) database for the perennial weed leafy spurge. ESTs, which are partial sequences (200-800 base pairs) of expressed genes, were obtained using a normalized cDNA library prepared from a comprehensive collection of leafy spurge tissues. A sequencing success rate of 88% yielded 45,314 ESTs, which were further assembled into 23,472 unique sequences representing 19,015 unigenes. Similarity searches against available databases indicate that 77.4% of the 23,472 unique sequences and 74.7% of the 19,015 unigenes are similar to known proteins, and approximately 15.5% of the unigenes are novel to leafy spurge. Functional classifications for unique sequences of leafy spurge were proportional to those for genes of Arabidopsis, with the exception of the unclassified or unknown and transposable element categories, which were significantly reduced in leafy spurge. Although these EST resources were developed for the purpose of constructing high-density leafy spurge microarrays, they are already providing valuable information related to plant growth and development in weedy perennials such as leafy spurge.

Technical Abstract: Genomics programs in the weed science community have not developed as rapidly as those for other crop, horticultural, forestry, and model plant systems. Development of genomic resources for selected model weeds is expected to enhance our understanding of weed biology, just as it has in other plant systems. In this report, we describe the development, characteristics, and information gained from an expressed sequence tag (EST) database for the perennial weed leafy spurge. ESTs were obtained using a normalized cDNA library prepared from a comprehensive collection of tissues, including tissue from stressed and senescing plants. During the EST characterization process, redundancy was minimized by periodic subtractions of the normalized cDNA library. A sequencing success rate of 88% yielded 45,314 ESTs with an average read length of 671 nucleotides. Using bioinformatic analysis, the leafy spurge EST database was assembled into 23,472 unique sequences representing 19,015 unigenes (10,293 clusters and 8,722 singletons). BLAST similarity searches against the GenBank non-redundant protein database identified 18,186 total matches, of which 14,205 were non-redundant. These data indicate that 77.4% of the 23,472 unique sequences and 74.7% of the 19,015 unigenes are similar to known proteins. Further bioinformatic analysis indicated that 2,950, or 15.5%, of the unigenes are novel to leafy spurge. Functional classifications assigned to leafy spurge unique sequences using the Munich Information Center for Protein Sequences (MIPS) or Gene Ontology (GO) schemes were proportional to functional classifications for genes of Arabidopsis, with the exception of the unclassified or unknown and transposable element categories, which were significantly reduced in leafy spurge. Although these EST resources were developed for the purpose of constructing high-density leafy spurge microarrays, they are already providing valuable information related to sugar metabolism, cell cycle regulation, dormancy, terpenoid secondary metabolism, and flowering.
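
The counts in the technical abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the published analysis: the variable names and the `pct` helper are assumptions introduced here, and the only inputs are the figures quoted in the abstract. It recomputes the unigene total from clusters plus singletons and the three reported percentages.

```python
# Counts as quoted in the technical abstract (names are illustrative).
unique_sequences = 23_472
clusters = 10_293
singletons = 8_722
total_matches = 18_186         # BLAST matches among unique sequences
nonredundant_matches = 14_205  # non-redundant BLAST matches
novel_unigenes = 2_950

# Unigenes are the sum of clusters and singletons.
unigenes = clusters + singletons


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


summary = {
    "unique sequences with a match": pct(total_matches, unique_sequences),
    "unigenes with a match": pct(nonredundant_matches, unigenes),
    "unigenes novel to leafy spurge": pct(novel_unigenes, unigenes),
}

print(f"unigenes: {unigenes}")
for label, value in summary.items():
    print(f"{label}: {value:.2f}%")
```

The second and third ratios agree with the reported 74.7% and 15.5%. The first comes out slightly under 77.5%, compared with the reported 77.4%; the abstract does not say how that figure was rounded.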