Author
ZHU, BIN - University Of Maryland
JIANG, LU - University Of Maryland
Liu, Ge - George
Submitted to: Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/13/2009
Publication Date: 1/14/2010
Citation: Zhu, B., Jiang, L., Liu, G. 2010. A dynamic neighboring extension search algorithm for genome coordinate conversion in the presence of short sequence duplications. Gene Expression to Genetical Genomics. 2:29-36.
Interpretive Summary: Microarrays are increasingly used in comparative genomic hybridization (CGH) to detect genomic copy number variation (CNV). Microarray designs usually prefer uniquely mapped probes but routinely include probes that map to multiple locations within a genome in order to maintain high coverage and resolution. These duplicated probes can confuse CNV calling and hamper genome coordinate conversion between different genome assemblies. In this study, we tested genome coordinate conversion for over 385,000 probes between two cattle genome assemblies and found that 33,910 (8.8%) of these probes could not be uniquely mapped because of short sequence duplications. We studied the distribution pattern of these short sequence duplications and discussed their potential impacts. We proposed and tested a dynamic neighboring extension search (DNES) algorithm to solve this conversion problem and so facilitate direct migration and comparison of array CGH results across different genome assemblies.
Technical Abstract: Oligonucleotide arrays are increasingly used in comparative genomic hybridization (CGH) to detect genomic copy number variation (CNV). Array designs usually prefer uniquely mapped probes but routinely include probes that map to multiple locations within a genome in order to maintain high coverage and resolution. These perfectly duplicated probes impose several limitations: besides affecting CNV calling, this kind of design also makes it difficult to convert genome coordinates between different genome assemblies.
In this study, we tested genome coordinate conversion for over 385,000 probes between two cattle genome assemblies and found that 33,910 (8.8%) of these probes could not be uniquely mapped because of short sequence duplications. We studied the distribution pattern of these short sequence duplications and discussed their potential impacts. We proposed and tested a dynamic neighboring extension search (DNES) algorithm to solve this conversion problem and so facilitate direct migration and comparison of array CGH results across different genome assemblies.
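The abstract names the DNES algorithm but does not describe its steps. The sketch below is only an illustration of one way a "neighboring extension" search could work, not the authors' implementation: when a probe sequence matches several places in the new assembly, the query is widened with flanking bases taken from the old assembly until exactly one match remains. The function names (`find_all`, `convert_coordinate`), the `step` and `max_flank` parameters, the exact-match search, and the toy sequences are all assumptions made for this example.

```python
from typing import List, Optional


def find_all(genome: str, query: str) -> List[int]:
    """Return every 0-based start of `query` in `genome` (overlaps allowed)."""
    hits = []
    start = genome.find(query)
    while start != -1:
        hits.append(start)
        start = genome.find(query, start + 1)
    return hits


def convert_coordinate(old: str, new: str, pos: int, length: int,
                       step: int = 2, max_flank: int = 50) -> Optional[int]:
    """Map the probe old[pos:pos+length] onto `new` (hypothetical helper).

    The flanking context is grown by `step` bases per side until the
    extended sequence has exactly one exact match in `new`.
    Returns the probe start in `new`, or None if no unique placement is found.
    """
    flank = 0
    while flank <= max_flank:
        left = max(0, pos - flank)
        right = min(len(old), pos + length + flank)
        hits = find_all(new, old[left:right])
        if len(hits) == 1:
            # Shift from the start of the extended window back to the probe.
            return hits[0] + (pos - left)
        if not hits:
            return None  # extended context is not conserved in the new assembly
        if left == 0 and right == len(old):
            return None  # nothing left to extend
        flank += step
    return None


# Toy data (assumed): the 8-base probe occurs twice in the old assembly,
# and the new assembly carries two extra leading bases.
probe = "ACGTACGT"
old_assembly = "TTGCA" + probe + "GGATC" + probe + "CCTAG"
new_assembly = "GG" + old_assembly

first_copy = convert_coordinate(old_assembly, new_assembly, pos=5, length=8)
second_copy = convert_coordinate(old_assembly, new_assembly, pos=18, length=8)
print(first_copy, second_copy)
```

In this toy case the bare probe matches two positions in the new assembly, and a two-base flank on each side is enough to tell the copies apart. A real conversion would need to tolerate mismatches and assembly rearrangements, which this exact-match sketch does not attempt.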