Research Project: Genotypic Characterization of Genetic Resources for Cacao, Coffee, and Other Tropical Perennial Crops Economically Important to the United States

Location: Sustainable Perennial Crops Laboratory

Title: Three de novo assembled wild cacao genomes from Upper Amazon reveal new insights into an early divergence of chocolate trees

Author
NOUSIAS, ORESTIS - University Of Nebraska
ZHENG, JINFANG - University Of Nebraska
LI, TANG - University Of Nebraska
Meinhardt, Lyndel
Bailey, Bryan
Gutierrez, Osman
Cohen, Stephen
Zhang, Dapeng
YIN, YANBIN - University Of Nebraska

Submitted to: Scientific Data
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/3/2024
Publication Date: 4/11/2024
Citation: Nousias, O., Zheng, J., Li, T., Meinhardt, L.W., Bailey, B.A., Gutierrez, O.A., Cohen, S.P., Zhang, D., Yin, Y. 2024. Three de novo assembled wild cacao genomes from Upper Amazon reveal new insights into an early divergence of chocolate trees. Scientific Data. https://doi.org/10.1038/s41597-024-03215-1.
DOI: https://doi.org/10.1038/s41597-024-03215-1

Interpretive Summary: Theobroma cacao, the chocolate tree, is indigenous to the Amazon basin, the greatest biodiversity hotspot on earth. Large intra-specific variations have been observed among cacao populations, yet the mechanisms responsible for the origin of this rich diversity remain unresolved. In this study, we constructed three high-quality chromosome-level genome assemblies of wild cacao from the Upper Amazon, the center of origin of cacao. These three wild cacao accessions are the most widely used parents in cacao breeding programs worldwide. Extensive comparative genomics analyses revealed remarkably high levels of genomic diversity among the different cacao genomes. Furthermore, our results showed that cacao population differentiation started approximately 1.83 to 0.69 million years ago, providing new insight into the origin and evolutionary history of cacao in the Amazon basin. The high-quality genome assemblies of the three wild cacao plants provide valuable resources for studying the origin and evolution of cacao genetic diversity. Cacao researchers will use this information to design conservation strategies and to develop molecular tools for breeding new varieties with enhanced productivity, disease resistance, and quality attributes.

Technical Abstract: Theobroma cacao, the chocolate tree, is indigenous to the Amazon basin, the greatest biodiversity hotspot on earth. Large intra-specific variations have been observed among cacao populations, yet the mechanisms responsible for the origin of this rich diversity remain unresolved. In this study, we constructed three high-quality chromosome-level genomes, assembled de novo from PacBio HiFi long reads and scaffolded using a reference-free strategy. These genomes represent the three most important genetic clusters of wild cacao trees from the Upper Amazon region. Comparative genomics analysis was performed among the three wild cacao genomes, together with the two reference genomes of domesticated cacao. We found that the five cacao genotypes diverged in the early and middle Pleistocene, approximately 1.83 to 0.69 million years ago. This finding challenges the compelling refugia hypothesis, which places cacao population differentiation in the last glaciation period (22,000 to 13,000 years BP). We also discovered that DNA variants, gene duplications, and genome structural variations drive cacao adaptation, unveiling a possible evolutionary mechanism responsible for generating intra-specific divergence. The three wild cacao genomes provide valuable genomic resources for studying cacao genetic diversity and developing molecular markers for crop improvement.