Submitted to: Mid-South Computational Biology and Bioinformatics Society Conference
Publication Type: Abstract Only
Publication Acceptance Date: March 11, 2011
Publication Date: N/A
When mapping RNA sequences to a reference genome, a high percentage of short sequence reads is often assigned to multiple genomic locations. One approach to handling these “ambiguous mappings” has been to discard them. This results in a loss of data that can sometimes amount to as much as 45% of the original number of sequence reads. Task-specific computer programs are being developed and employed as an alternative to essentially “throwing away” large amounts of data when identifying significantly expressed genomic locations. Handling ambiguous mappings is a multi-step process that begins with using the open-source software tool Bowtie to identify all possible mappings within the genome for each sequence read. From this initial mapping, statistical methods are employed to compare the expression of regions of interest to that of randomly selected genomic locations. Using these comparisons, it should be possible to establish a threshold value above which a gene is considered significantly expressed and to determine which location is most likely the best mapping for each ambiguous sequence. These methods are currently being developed and applied to short mRNA sequences from Zea mays.
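As an illustration only, the comparison against randomly selected genomic locations could be sketched as an empirical p-value, with each ambiguous read then assigned to its candidate location with the strongest evidence of expression. The abstract does not name the statistic used, so everything below is an assumption: the function names `empirical_p_value` and `best_location`, the coordinate labels, the use of uniquely mapped read counts as the expression measure, and the add-one correction.

```python
from typing import Dict, List, Sequence


def empirical_p_value(region_count: int, background_counts: Sequence[int]) -> float:
    """Share of background locations whose read count is >= region_count.

    Assumption: background_counts holds read counts observed at randomly
    selected genomic locations. An add-one correction keeps the estimate
    away from exactly zero.
    """
    at_least = sum(1 for count in background_counts if count >= region_count)
    return (at_least + 1) / (len(background_counts) + 1)


def best_location(
    candidates: List[str],
    unique_counts: Dict[str, int],
    background_counts: Sequence[int],
) -> str:
    """Pick the candidate location with the smallest empirical p-value.

    Assumption: unique_counts maps a location label to the number of
    unambiguously mapped reads there; unseen locations count as zero.
    Ties are resolved in favor of the earlier candidate.
    """
    return min(
        candidates,
        key=lambda loc: empirical_p_value(unique_counts.get(loc, 0), background_counts),
    )


# Hypothetical toy data, not taken from the study.
background = [0, 1, 2, 3, 4, 5, 6, 7, 8]
unique = {"chr1:100": 7, "chr5:900": 1}
choice = best_location(["chr1:100", "chr5:900"], unique, background)
```

In this toy case the read would be assigned to the location with more uniquely mapped support. A real analysis would also need a significance cutoff and a rule for reads whose candidate locations are all weakly expressed, neither of which is specified here.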