Agricultural Research Service United States Department of Agriculture

Research Project: MOLECULAR AND GENETIC MECHANISMS OF FUNGAL DISEASE RESISTANCE IN GRAIN CROPS

Location: Crop Production and Pest Control Research

Title: A Perl script for targeted local genome assembly

Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: November 11, 2011
Publication Date: January 15, 2012
Citation: Crane, C.F., Nemacheck, J.A., Subramanyam, S., Williams, C.E. 2012. A Perl script for targeted local genome assembly. Plant and Animal Genome Conference Proceedings. PO981.

Technical Abstract: When a finished genome is unavailable, the characterization of gene families, promoters, and enhancers would benefit from a program for de novo assembly around a user-supplied initial sequence. The iterative script described here uses BLAST and phrap for this purpose. In each cycle, the script identifies matching reads with BLAST, retrieves the low-copy reads that hit the contigs retained from the previous cycle, and assembles those reads with phrap. Cycles continue until the number of new reads to be added falls below a user-specified fraction of the total reads retrieved, or until phrap fails to assemble the reads. The initial sequence can be protein or nucleotide, but subsequent searches use blastn against the contigs from the previous cycle that best match the initial sequence. Thus contigs “grow” until they encounter repetitive sequence or insufficient depth of coverage in the reads database. The script was tested with four DNA sequences encoding a putative dirigent-like protein (HfrDir) from wheat, using pyrosequencing reads from cerealsdb (www.cerealsdb.uk.net). It assembled 30 contigs that matched at least one of the initial sequences at an e-value below 1e-12 and ranged from 598 to 7395 bases in length. The contigs obtained did not precisely match the number or length of dirigent-positive contigs in the cerealsdb draft assembly, and thus offer a different view of the dirigent-like gene family. From five of the contigs, 14 contig-specific primer pairs were used for PCR on Chinese Spring wheat; all produced single amplicons, and all but two primers yielded Sanger sequence. The Sanger sequences differed from the assembled contigs by occasional SNPs and one 46-base deletion, and these differences are under further investigation. Nevertheless, the assembled sequences appear to be sufficiently accurate to direct further investigation of a gene and its adjacent environment from any collection of reads having sufficient length and coverage.
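
The cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Perl script, and it does not call BLAST or phrap. The search and assembly steps are passed in as functions, and the toy stand-ins used here (shared k-mer matching and greedy overlap merging) are assumptions made only so the example runs on its own. All names (`iterative_local_assembly`, `find_reads`, `keep_contig`, `min_new_fraction`, `max_cycles`) are hypothetical, low-copy filtering is not modelled, and the sketch assumes that reads retrieved in a final below-threshold cycle are not assembled, a detail the abstract does not specify.

```python
import string

def iterative_local_assembly(seed, find_reads, assemble, keep_contig,
                             min_new_fraction=0.05, max_cycles=50):
    """Grow contigs around a seed by repeated search and assembly.

    find_reads(queries) -> set of reads hitting any query (stand-in for BLAST)
    assemble(reads)     -> list of contigs, or None/[] on failure (stand-in for phrap)
    keep_contig(contig) -> True if the contig still matches the seed
    """
    queries, retrieved, contigs = [seed], set(), []
    for _ in range(max_cycles):
        new_reads = find_reads(queries) - retrieved
        retrieved |= new_reads
        # Stop when too few new reads were added relative to all reads retrieved.
        if not new_reads or len(new_reads) < min_new_fraction * len(retrieved):
            break
        assembled = assemble(retrieved)
        kept = [c for c in (assembled or []) if keep_contig(c)]
        if not kept:  # assembly failed or no contig matches the seed
            break
        contigs = queries = kept  # next search uses the retained contigs
    return contigs

# --- Toy stand-ins (assumptions for illustration only) ---
# A "genome" of distinct symbols, so every k-mer is unique (no repeats).
GENOME = string.ascii_letters
READS = {GENOME[i:i + 12] for i in range(0, len(GENOME) - 11, 4)}
K = 6

def shares_kmer(a, b, k=K):
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))

def toy_find_reads(queries):
    return {r for r in READS if any(shares_kmer(q, r) for q in queries)}

def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def toy_assemble(reads, min_overlap=K):
    contigs = sorted(reads)
    while True:
        # Drop contigs fully contained in another contig.
        contigs = [c for c in contigs
                   if not any(c != d and c in d for d in contigs)]
        n, a, b = max(((overlap(a, b, min_overlap), a, b)
                       for a in contigs for b in contigs if a != b),
                      default=(0, "", ""))
        if n == 0:
            return contigs
        contigs = [c for c in contigs if c not in (a, b)] + [a + b[n:]]

if __name__ == "__main__":
    seed = GENOME[20:30]
    result = iterative_local_assembly(
        seed, toy_find_reads, toy_assemble, lambda c: shares_kmer(seed, c))
    print(result)
```

In this toy setting the contig extends by one read on each side per cycle until the reads run out, which mirrors the "growth" described above. In a real run, repetitive sequence or thin coverage would end the extension earlier.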

   

 
Project Team
Goodwin, Stephen - Steve
Crane, Charles
Last Modified: 05/25/2013