Research Project: Mapping Crop Genome Functions for Biology-Enabled Germplasm Improvement

Location: Plant, Soil and Nutrition Research

Title: An improved reference of the grapevine genome reasserts the origin of the PN40024 highly-homozygous genotype

Author
VELT, AMANDINE - University Of Strossmayer
FROMMER, BIANCA - Bielefeld University
BLANC, SOPHIE - University Of Strossmayer
HOLTGRAWE, DANIELA - University Of Strossmayer
DUMAS, VINCENT - University Of Strossmayer
GRIMPLET, JEROME - Centro De Investigacion
HUGUENEY, PHILIPPE - University Of Strossmayer
LAHAYE, MARIE - University Of Strossmayer
KIM, CATHERINE - Cold Spring Harbor Laboratory
MATUS, JOSE TOMAS - Universitat De València
NAVARRO-PAYA, DAVID - Universitat De València
ORDUNA, LUIS - Universitat De València
TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
VITULO, NICOLA - Universita Degli Studi Di Salerno
Ware, Doreen
RUSTENHOLZ, CAMILLE - University Of Strossmayer

Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/20/2023
Publication Date: 3/26/2023
Citation: Velt, A., Frommer, B., Blanc, S., Holtgrawe, D., Dumas, V., Grimplet, J., Hugueney, P., Lahaye, M., Kim, C., Matus, J., Navarro-Paya, D., Orduna, L., Tello-Ruiz, M.K., Vitulo, N., Ware, D., Rustenholz, C. 2023. An improved reference of the grapevine genome reasserts the origin of the PN40024 highly-homozygous genotype. G3, Genes/Genomes/Genetics. https://doi.org/10.1093/g3journal/jkad067.
DOI: https://doi.org/10.1093/g3journal/jkad067

Interpretive Summary: Every agriculturally important crop has an assembled genome sequence that serves as the reference for many research and breeding studies. For grapevine, that reference is the assembly of the Vitis vinifera variety or clone PN40024. This genotype is diploid, meaning it contains two complete sets of chromosomes, one from each parent, and highly homozygous, meaning the two sets are genetically very similar as a result of self-crossing. Despite several improvements to the PN40024 assembly, its current version, PN12X.v2, is quite fragmented and represents only half of the complete (diploid) genome, that is, its haploid state, as if it came from a single parent. Being nearly homozygous means that the two halves of the genome are almost identical, yet the PN40024 assembly still contains various heterozygous regions. New and improved sequencing technologies were used to fully discriminate the sequences of each half of the genome (haplotype). These approaches had already been used to assemble other grapevine genomes and were applied here to generate an improved version of the PN40024 reference, called PN40024.v4. In addition, a full alternative haplotype (the other half of the genome) was built for grapevine for the first time. An optimized gene annotation workflow that outperformed previous versions was used to obtain higher-quality gene models. Integrating manually curated gene structures from the grapevine gene reference catalogue further improved the annotation and yielded the most reliable estimate to date: 35,230 grapevine genes. Finally, this study demonstrates that PN40024 resulted from selfings of a variety known as ‘Helfensteiner’ (a cross of ‘Pinot’ and ‘Schiava grossa’) rather than from selfings of a single ‘Pinot’ plant.
These advances will help maintain the PN40024 genome as a gold-standard reference and contribute to eventually building a grapevine pangenome (i.e., the entire set of genes from all grapevine varieties).

Technical Abstract: The genome sequence assembly of the diploid and highly homozygous V. vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements to the PN40024 genome assembly, its current version, PN12X.v2, is quite fragmented and represents only the haploid state of the genome, with mixed haplotypes. In fact, although the PN40024 genome is nearly homozygous, it still contains various heterozygous regions. Taking advantage of the improvements that long-read sequencing technologies offer for fully discriminating haplotype sequences, and considering that several Vitis sp. genomes have recently been assembled with these approaches, an improved version of the reference, called PN40024.v4, was generated. Incorporating long genomic sequencing reads into the assembly greatly increased the continuity of the 12X.v2 scaffolds: the number of scaffolds decreased from 2,059 to 640 and the number of N bases was reduced by 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved, and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation also helped improve the annotation, while defining the most reliable estimate to date of 35,230 genes. Finally, we demonstrate that PN40024 resulted from selfings of cv. ‘Helfensteiner’ (a cross of cv. ‘Pinot’ and ‘Schiava grossa’) rather than of a single ‘Pinot’. These advances will help maintain the PN40024 genome as a gold-standard reference and contribute to the eventual construction of the grapevine pangenome.