Publication : USDA ARS

Publication #305989

Title: Reliable Radiation Hybrid Maps: An Efficient Scalable Clustering-based Approach

Author

	SEETAN, RAED - North Dakota State University
	DENTON, ANNE - University Of Minnesota
	AL-AZZAM, OMAR - University Of Minnesota
	IQBAL, JAVED - North Dakota State University
	KIANIAN, SHAHRYAR

Submitted to: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/6/2014
Publication Date: 8/6/2014
Citation: Seetan, R., Denton, A., Al-Azzam, O., Iqbal, J., Kianian, S. 2014. Reliable Radiation Hybrid Maps: An Efficient Scalable Clustering-based Approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Available: http://www.computer.org/csdl/trans/tb/preprint/06827170.pdf.

Interpretive Summary: Genomic resources allow thousands of data points to be generated rapidly. However, analyzing those data accurately is a challenge for the scientific community, and analysis that is not done correctly can cause major errors when the results are applied to crop improvement. One such example is the data generated from radiation hybrid mapping populations. Here we describe various approaches to analyzing these data, generated by our group, and the accuracy and pitfalls of each method. Overall, our approaches have a very low computational complexity and produce solid framework maps with good chromosome coverage and high agreement with the physical map marker order.

Technical Abstract: The process of mapping markers from radiation hybrid mapping (RHM) experiments is equivalent to the traveling salesman problem and, thereby, has combinatorial complexity. As an additional problem, experiments typically result in some unreliable markers that reduce the overall quality of the map. We propose a clustering approach for addressing both problems efficiently by eliminating unreliable markers without the need for mapping the complete set of markers. Traditional approaches for eliminating markers use resampling of the full data set, which has an even higher computational complexity than the original mapping problem. In contrast, the proposed approach uses a divide-and-conquer strategy to construct framework maps based on clusters that exclude unreliable markers. Clusters are ordered using parallel processing and are then combined to form the complete map. We present three algorithms that explore the trade-off between the number of markers included in the map and placement accuracy. Using an RHM data set of the human genome, we compare the framework maps from our proposed approaches with published physical maps and with the results of using the Carthagene tool. Overall, our approaches have a very low computational complexity and produce solid framework maps with good chromosome coverage and high agreement with the physical map marker order.
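
The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' published algorithm: the Hamming distance, the single-linkage clustering rule, the greedy nearest-neighbour ordering, and the names `cluster_markers`, `order_cluster`, `framework_map`, `threshold`, and `min_size` are all assumptions chosen for illustration. Each marker is assumed to be a string of 0/1 retention calls across the hybrid lines. Markers that fall into clusters smaller than `min_size` are treated as unreliable and dropped, and each remaining cluster is ordered independently, so the per-cluster step could be run in parallel. The final step of combining clusters into one map is omitted.

```python
from itertools import combinations


def hamming(a, b):
    """Number of hybrid lines in which two retention patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_markers(markers, threshold):
    """Single-linkage clustering (assumed rule): link two markers when
    their retention patterns differ in at most `threshold` lines."""
    names = list(markers)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if hamming(markers[a], markers[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


def order_cluster(cluster, markers):
    """Greedy nearest-neighbour ordering, a simple heuristic for the
    traveling-salesman-like ordering problem within one cluster."""
    if len(cluster) < 3:
        return list(cluster)

    def dist(a, b):
        return hamming(markers[a], markers[b])

    # Start from one end of the most distant pair in the cluster.
    start, _ = max(combinations(cluster, 2), key=lambda p: dist(*p))
    ordered = [start]
    remaining = [m for m in cluster if m != start]
    while remaining:
        nxt = min(remaining, key=lambda m: dist(ordered[-1], m))
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered


def framework_map(markers, threshold=1, min_size=2):
    """Cluster, drop small clusters as unreliable, order each cluster.
    Clusters are independent, so the ordering step could be parallelized."""
    clusters = cluster_markers(markers, threshold)
    kept = [c for c in clusters if len(c) >= min_size]
    return [order_cluster(c, markers) for c in kept]


# Hypothetical toy data: four consistent markers and one noisy marker X.
markers = {
    "C": "11000000",
    "A": "11110000",
    "X": "01010101",
    "D": "10000000",
    "B": "11100000",
}
print(framework_map(markers))  # prints [['A', 'B', 'C', 'D']]
```

In this toy example the noisy marker X differs from every other marker in at least four hybrid lines, so it forms a singleton cluster and is excluded without ever being placed on a map. A larger `min_size` or smaller `threshold` keeps fewer markers but places them with more support, which mirrors the trade-off between marker count and placement accuracy described in the abstract.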

U.S. DEPARTMENT OF AGRICULTURE

Cereal Disease Lab: St. Paul, MN