Research Project: Enhancement of Apple, Pear, and Sweet Cherry Quality

Location: Physiology and Pathology of Tree Fruits Research

Title: PlantTribes2: tools for comparative gene family analysis in plant genomics

Author
WAFULA, ERIC - Pennsylvania State University
ZHANG, HUITING - Washington State University
VON KUSTER, GREG - Pennsylvania State University
LEEBENS-MACK, JAMES - University of Georgia
Honaas, Loren
DEPAMPHILIS, CLAUDE - Pennsylvania State University

Submitted to: Frontiers in Plant Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/2/2022
Publication Date: 1/31/2023
Citation: Wafula, E., Zhang, H., Von Kuster, G., Leebens-Mack, J.H., Honaas, L.A., dePamphilis, C.W. 2023. PlantTribes2: tools for comparative gene family analysis in plant genomics. Frontiers in Plant Science. 13. Article 1011199. https://doi.org/10.3389/fpls.2022.1011199.
DOI: https://doi.org/10.3389/fpls.2022.1011199

Interpretive Summary: The field of functional genomics aims to link genes to traits at a massive scale (e.g., all ~40,000 apple genes at once). However, apple trees are hard to study: they are large, and a tree can take five or more years to produce fruit, which makes molecular biology experiments that directly examine gene function in apple very difficult. Instead, researchers tend to study gene function in plants that are amenable to laboratory experiments, such as rice, tomato, and Arabidopsis, a small cousin of the mustard plant. This creates a need to transfer knowledge from well-studied plants to important agricultural crops, analogous to experiments in mice aimed at learning about gene function in humans. One method for doing this starts by sorting genes into a family tree, which helps us identify the genes in apple that have distant cousins in, for instance, tomato. Importantly, it allows us to transfer experimental knowledge between plants. This in turn accelerates studies of the genes that control traits in important crops, like apple trees, that are difficult to study in the lab. The software pipeline we describe in our manuscript, PlantTribes2, is a user-friendly way to sort genes into gene families so that we can more easily transfer gene knowledge between plants.

Technical Abstract: Plant genome-scale resources are being generated at an increasing rate, creating new opportunities for comparative genomics research. Although the cost of building plant genomes continues to fall, the cost of downstream analysis remains high. Plant genomes differ considerably in quality and scale because of their varying sizes and complexity and the technologies used for assembly and annotation. Researchers increasingly rely on comparative genomics approaches that integrate across plant community resources and data types. Taking full advantage of these diverse resources has yielded novel insights into the evolutionary history of genomes and gene families, including those of complex non-model organisms. However, the essential tools for genome-scale gene family analysis are not neatly packaged, and the learning curve can be steep. Here we present PlantTribes2, a scalable, easily accessible, highly customizable, and broadly applicable gene family analysis framework. It uses objective classifications of complete protein sequences from existing, high-quality plant genomes for comparative and evolutionary studies. We explore two examples of its application in functional genomic studies of economically important plant species. PlantTribes2 can improve transcript models and then sort them into pre-computed orthologous gene family clusters with rich functional annotation information. For gene families of interest, downstream analyses using integrated, open source tools then include (1) multiple sequence alignment, (2) phylogenetic tree construction, and (3) inference of large-scale duplication events. PlantTribes2 is freely available for use within the main public Galaxy instance and can be downloaded from GitHub or Bioconda.