Publication : USDA ARS

Publication #265663

Title: A Statistical Approach for Ambiguous Sequence Mappings

Author

	JOHNSON, MICHAEL - Tennessee Technological University
	GAUTAM, DILIP - Mississippi State University
	ANKALA, ARUNKANTH - Emory University, School Of Medicine
	Sonstegard, Tad
	Schroeder, Steven
	WILKINSON, JEFF - Mississippi State University
	RAMKUMAR, MAHALINGAM - Mississippi State University
	PERKINS, ANDY - Mississippi State University

Submitted to: Mid-South Computational Biology and Bioinformatics Society Conference
Publication Type: Abstract Only
Publication Acceptance Date: 3/11/2011
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: When mapping RNA sequences to a reference genome, a high percentage of short sequence reads is often assigned to multiple genomic locations. One approach to handling these “ambiguous mappings” has been to discard them. This results in a loss of data, which can sometimes be as much as 45% of the original number of sequence reads. Task-specific computer programs are being developed and employed to offer an alternative to essentially “throwing away” large amounts of data when identifying significantly expressed genomic locations. Handling ambiguous mappings is a multi-step process that begins with using the open source software tool Bowtie to identify all possible mappings within the genome for each sequence read. From this initial mapping, statistical methods are employed to compare the expression of regions of interest to that of randomly selected genomic locations. Using these comparisons, it should be possible to establish a threshold value at which a gene is considered significantly expressed and to determine which location is most likely the best mapping for each ambiguous sequence. These methods are currently being developed and applied to short mRNA sequences from Zea mays.
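The abstract does not specify the statistical methods, so the following Python sketch is only one possible reading of the comparison step, not the authors' implementation. It assumes that read counts per genomic window are already available (for example, tallied from Bowtie output), compares a region of interest to randomly selected background windows with an empirical p-value, and assigns each ambiguous read to the candidate location with the most uniquely mapped reads. All function names, the window-count representation, and the tie-breaking rule are hypothetical.

```python
import random


def sample_background(window_counts, n, seed=0):
    """Draw n read counts from randomly selected genomic windows.

    Assumption: `window_counts` is a list of read counts, one per
    genomic window, computed beforehand from the initial mapping.
    """
    rng = random.Random(seed)
    return [rng.choice(window_counts) for _ in range(n)]


def empirical_p_value(region_count, background_counts):
    """Estimate how unusual a region's read count is.

    Returns the add-one-smoothed fraction of background windows whose
    count is at least as large as the region's count.
    """
    at_least = sum(1 for c in background_counts if c >= region_count)
    return (at_least + 1) / (len(background_counts) + 1)


def assign_ambiguous(read_to_candidates, unique_counts):
    """Choose one location for each ambiguously mapped read.

    Assumption: the "best" location is the candidate with the most
    uniquely mapped reads; ties go to the first candidate listed.
    """
    return {
        read: max(cands, key=lambda loc: unique_counts.get(loc, 0))
        for read, cands in read_to_candidates.items()
    }


# Hypothetical toy data, for illustration only.
background = sample_background([0, 1, 1, 2, 3, 5, 8], n=1000)
p = empirical_p_value(region_count=8, background_counts=background)
best = assign_ambiguous(
    {"read_1": ["locus_A", "locus_B"]},
    {"locus_A": 2, "locus_B": 5},
)
```

A region would be called significantly expressed when its empirical p-value falls below a chosen cutoff. The cutoff, the window size, and the number of background samples are all choices the abstract leaves open.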