Location: Crops Pathology and Genetics Research
Title: Haplotype-phased genome assembly of virulent Phytophthora ramorum isolate ND886 facilitated by long-read sequencing reveals effector polymorphisms and copy number variation
Author
MALAR, MATHU - Council Of Scientific And Industrial Research (CSIR)
YUZON, JENNIFER - University Of California, Davis
DAS, SUBHADEEP - Council Of Scientific And Industrial Research (CSIR)
DAS, ABHISHEK - Council Of Scientific And Industrial Research (CSIR)
PANDA, ARIJIT - Council Of Scientific And Industrial Research (CSIR)
GHOSH, SAMRAT - Council Of Scientific And Industrial Research (CSIR)
TYLER, BRETT - Oregon State University
Kasuga, Takao
TRIPATHY, SUCHETA - Council Of Scientific And Industrial Research (CSIR)
Submitted to: Molecular Plant-Microbe Interactions
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/10/2019
Publication Date: 7/15/2019
Citation: Malar, M.C., Yuzon, J.D., Das, S., Das, A., Panda, A., Ghosh, S., Tyler, B.M., Kasuga, T., Tripathy, S. 2019. Haplotype-phased genome assembly of virulent Phytophthora ramorum isolate ND886 facilitated by long-read sequencing reveals effector polymorphisms and copy number variation. Molecular Plant-Microbe Interactions. 32(8):1047-1060. https://doi.org/10.1094/MPMI-08-18-0222-R.
Interpretive Summary: Phytophthora ramorum is an invasive and devastating plant pathogen that causes sudden oak death in coastal forests of the western United States and ramorum blight in nursery ornamentals and native plants in various landscapes. The genome sequence released in 2006 is fragmented and contains 12 million bp of assembly gaps. In this research, we used PacBio long-read technology to sequence the genome. As a result, almost all of the gaps were closed, which allows us to study the genome of P. ramorum at higher resolution.
Technical Abstract: Phytophthora ramorum is a destructive pathogen that causes sudden oak death. The genome sequence of P. ramorum isolate Pr102 was previously produced using Sanger reads and contained 12 Mb of gaps. However, isolate Pr102 has shown reduced aggressiveness and genome abnormalities. To produce an improved genome assembly for P. ramorum, we performed long-read sequencing of the highly aggressive P. ramorum isolate CDFA1418886 (abbreviated as ND886). We generated a 60.5-Mb assembly of the ND886 genome using the Pacific Biosciences (PacBio) sequencing platform. The assembly includes 302 primary contigs (60.2 Mb) and nine unplaced contigs (265 kb). Additionally, we found a 'highly repetitive' component among the unassembled, unmapped PacBio reads, containing tandem repeats that are not part of the 60.5-Mb genome.
The overall repeat content of the primary assembly was much higher than that of the Pr102 Sanger assembly (48% versus 29%), indicating that the long reads captured repetitive regions effectively. The 302 primary contigs were phased into 345 haplotype blocks containing 222,892 phased variants; the longest phased block was 1,513,201 bp, with 7,265 phased variants. The improved phased assembly facilitated identification of 21 and 25 Crinkler effectors and 393 and 394 RXLR effector genes from the two haplotypes. Of these, 24 and 25 RXLR effectors were newly predicted from haplotypes A and B, respectively. In addition, seven previously unreported paralogs of effector Avh207 were found in contig 54. Comparison of the ND886 assembly with the Pr102 V1 assembly suggests that several smaller, repeat-rich scaffolds within the Pr102 V1 assembly were possibly misassembled; these regions are now fully encompassed within ND886 contigs. Our analysis further reveals that Pr102 is a heterokaryon, with multiple nuclear types in the sequences corresponding to contig 10 of the ND886 assembly.