
Title: DATA MINING FOR MICROSATELLITES IN BRASSICACEAE EXPRESSED SEQUENCE TAGS (ESTS) FOR POTENTIAL CROSS-GENERIC USE

Authors:
Salywon, Andrew
BARBER, MATTHEW - AZ STATE UNIV
HERLING, NATHAN - AZ STATE UNIV
STEWART, WILLIAM - AZ STATE UNIV
Dierig, David

Submitted to: Agronomy Abstracts
Publication Type: Abstract Only
Publication Acceptance Date: 10/15/2005
Publication Date: 11/8/2005
Citation: Salywon, A.M., Barber, M., Herling, N., Stewart, W., Dierig, D.A. 2005. Data mining for microsatellites in Brassicaceae expressed sequence tags (ESTs) for potential cross-generic use. Agronomy Abstracts. CD-ROM (P8064).

Technical Abstract: The purpose of this study was to use databases of expressed sequence tags (ESTs) from Arabidopsis thaliana L. and Brassica crop species to determine the potential for developing microsatellite markers that amplify across generic boundaries within Brassicaceae, especially in Lesquerella. In April 2004, 347,844 Arabidopsis thaliana and 44,851 Brassica EST sequences were downloaded from public databases. Sequences containing microsatellites were identified using a Perl script; the microsatellite ESTs were then masked using the RepeatMasker program and clustered using the StackPACK 2.0 system (with its associated d2_cluster, Phrap, and CRAW programs). The EST database was then queried again with the microsatellite-containing EST clusters to extend the consensus sequences and to reduce redundancy by clustering significantly similar ESTs. Information from all stages was stored in a relational database. After clustering, the StackPACK output files identified 2,058 microsatellite ESTs for Arabidopsis and 540 microsatellite ESTs for Brassica spp. In both Arabidopsis and Brassica, trinucleotide repeat motifs were the most abundant (69% and 59%, respectively), followed by dinucleotide repeat motifs (23% and 36%, respectively) and tetranucleotide repeat motifs (8% and 5%, respectively). Preliminary results from clustering orthologous microsatellite ESTs from Arabidopsis and Brassica indicate that this method can be used to develop microsatellite primers that may amplify across genera in the family for population or genomic studies, although the realized number of orthologous regions for comparison is currently limited by the number of ESTs available. This study indicates that EST data from the model organism Arabidopsis thaliana and from the agriculturally important Brassica spp. can potentially benefit molecular genetic studies of other related genera, such as Lesquerella, for which little, if any, sequence data may exist.
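
The abstract says only that microsatellite-containing sequences were identified with a Perl script and gives no detail of the search criteria. As an illustration of how such a screen might work, the following is a minimal Python sketch, not the authors' script. The minimum repeat counts in MIN_REPEATS and the function names find_microsatellites and motif_class_shares are assumptions made for this example.

```python
import re
from collections import Counter

# Assumed thresholds (not from the abstract): minimum number of tandem
# copies required for di-, tri-, and tetranucleotide motifs.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def _is_shorter_repeat(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return (motif, copy_count, start) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if _is_shorter_repeat(motif):
                continue  # e.g. do not report (AT)n a second time as (ATAT)n
            hits.append((motif, len(match.group(0)) // motif_len, match.start()))
    return hits


def motif_class_shares(hits):
    """Percentage of hits in each motif-length class (2 = di, 3 = tri, 4 = tetra)."""
    counts = Counter(len(motif) for motif, _, _ in hits)
    total = sum(counts.values())
    return {length: 100.0 * n / total for length, n in counts.items()} if total else {}


if __name__ == "__main__":
    # Toy sequence, not real EST data.
    example = "CCG" + "AT" * 7 + "GGC" + "GAA" * 5 + "TCC"
    print(find_microsatellites(example))
    print(motif_class_shares(find_microsatellites(example)))
```

A screen of this kind reports only perfect repeats. Imperfect or compound microsatellites would need a more permissive search. Tallying hits by motif length, as motif_class_shares does, yields the kind of class breakdown reported in the abstract.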