
Title: Comparing genome guided assembly and phased variants based assembly approach to separate the homoeolog transcripts in tetraploid peanut (Arachis hypogaea L.)

Author
CHOPRA, RATAN - Texas Tech University
BUROW, MARK - Texas A&M University
Burow, Gloria

Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Proceedings
Publication Acceptance Date: 12/4/2014
Publication Date: 1/10/2015
Citation: Chopra, R., Burow, M., Burow, G.B. 2015. Comparing genome guided assembly and phased variants based assembly approach to separate the homoeolog transcripts in tetraploid peanut (Arachis hypogaea L.). Proceedings of Plant and Animal Genome Conference. p.264.

Interpretive Summary:

Technical Abstract: Homoeologous copies of transcripts are abundant in many self-pollinating species, including tetraploid peanut, and make it challenging to build a transcriptome reference without merging homoeologs. De novo transcriptome assembly of tetraploid OLin with single k-mer and multiple k-mer approaches has helped reduce homoeolog collapse to an extent, but further effort is required to resolve transcriptome complexities. In this study, we compared two approaches for generating reference transcriptome assemblies: genome-guided transcriptome assembly, and separation of phased transcripts from the de novo transcriptome assembly. On aligning the raw reads back to each of the generated assemblies and to the de novo assembly, we observed that 37% of transcripts in the de novo assembly had two or more SNPs, compared with 16.5% for the genome-guided approach and 21% for the phased approach. This proportion could serve as one measure of effectiveness in separating homoeologs, because transcripts that show variants when reads are aligned back to the reference are potential indicators of merged sequences, whether homoeologs or paralogs. A subset of one hundred transcripts classified as collapsed in the OLin de novo assembly was compared against the genome-guided and phased transcriptome assemblies, and the collapsed contigs were observed to be separated into two or more new transcripts. This suggests that the genome-guided and phased transcriptome assembly approaches will be useful for resolving collapsed homoeologs. The genome-guided approach may have an advantage over the phased approach because of its more complete representation of genes in a genome, but it requires the genome sequences of the diploid progenitor species.
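
The collapse measure described in the abstract (the share of transcripts carrying two or more SNPs after reads are aligned back to an assembly) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `percent_multi_snp`, the `(transcript_id, position)` record format, and the toy data are all hypothetical, and the SNP calls are assumed to have been produced beforehand by a separate variant caller.

```python
from collections import Counter


def percent_multi_snp(snp_records, all_transcripts, min_snps=2):
    """Return the percentage of transcripts with at least `min_snps` SNP calls.

    snp_records: iterable of (transcript_id, position) pairs, one per SNP call
                 (hypothetical format, assumed to come from a variant caller).
    all_transcripts: list of every transcript ID in the assembly, including
                     those with no SNP calls, so the denominator is complete.
    """
    if not all_transcripts:
        raise ValueError("all_transcripts must not be empty")
    # Count SNP calls per transcript; transcripts without calls count as 0.
    counts = Counter(transcript_id for transcript_id, _pos in snp_records)
    flagged = sum(1 for t in all_transcripts if counts[t] >= min_snps)
    return 100.0 * flagged / len(all_transcripts)


# Toy data (illustrative only, not from the study).
transcripts = ["t1", "t2", "t3", "t4"]
snps = [("t1", 10), ("t1", 55), ("t2", 7), ("t4", 3), ("t4", 90), ("t4", 120)]

# t1 and t4 each carry two or more SNPs, so 2 of 4 transcripts are flagged.
print(percent_multi_snp(snps, transcripts))  # 50.0
```

Under this measure a lower percentage for one assembly than another, computed from the same reads, would be read as fewer potentially merged homoeologs or paralogs, which is how the 37%, 16.5%, and 21% figures above are compared.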