Research Project: Development of Tools, Models and Datasets for Genome-enabled Studies of Bacterial Phytopathogens

Location: Emerging Pests and Pathogens Research

2018 Annual Report


Objectives
Objective 1: Develop datasets and computational tools to facilitate the study of large-scale genomic and pan-genomic features of plant-associated bacteria, including genomic islands and virulence pathways. [NP303, C2, PS2A]
Subobjective 1A: Develop deep proteogenomic datasets to guide the annotation of poorly characterized type strains and field isolates of select bacterial plant pathogens and other plant-associated bacteria.
Subobjective 1B: Develop or refine annotation methods for genomic regions of anomalous nucleotide composition and for the systems-level analysis of pathways related to virulence and adaptation to plant-associated niches.
Objective 2: Identify genes and candidate transcription factor binding sites using comparative genomics and available ChIP-Seq, RNA-Seq and proteomics datasets, and ensure that gene calls include experimental evidence whenever appropriate. [NP303, C2, PS2A]
Subobjective 2A: Extend comparative genomics methods to propagate the experimentally supported genome annotation updates from targeted bacterial strains to related strains.
Subobjective 2B: Leverage proteomics and other high-throughput datasets, along with comparative genomics methods, to identify conserved motifs representing candidate promoters and other regulatory binding sites.


Approach
A good genome annotation includes a complete set of biological components (e.g., coding and non-coding genes) and a description of the interactions between them (e.g., promoters and binding sites for transcriptional regulators). Constructing this level of detail relies on painstaking experimental investigations of individual genes and their regulation, a luxury enjoyed by only a small handful of model organisms such as Escherichia coli, Pseudomonas aeruginosa, and Bacillus subtilis. The goal of this project is to use proteomics and other evidence-based computational analysis to rapidly produce high-quality bacterial genome annotations that biologists can use to design experiments and interpret experimental results. Our primary goal is to develop high-quality genomic resources for field isolates currently causing disease outbreaks, including Clavibacter michiganensis, Pantoea ananatis, Xylella fastidiosa, and Dickeya species. In addition, we will use existing and novel computational methods to establish pipelines for propagating our experimentally driven genome annotations to other members of their clades, with special emphasis on pathways related to virulence and fitness. This work will be conducted in collaboration with the Prokaryotic Genome Annotation Pipeline (PGAP) team at the National Center for Biotechnology Information (NCBI). In this manner, improvements to a small number of genomes will result in improvements to thousands of genome annotations. Both of these objectives build on our prior experience leading experimental and computational efforts to develop genomic resources for P. syringae pv. tomato DC3000.
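As an illustration of how proteomics evidence can inform annotation, the following minimal Python sketch flags peptides that match a six-frame translation of a genome but not any annotated protein. It is a simplified illustration under stated assumptions, not the project's actual pipeline: the function names (`six_frame_translations`, `novel_peptides`) are hypothetical, peptides are matched by exact substring, and the standard genetic code is used for all frames.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative
# start codons or translation tables are considered in this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translations(dna):
    """Translate both strands in all three reading frames.

    Codons containing characters other than A, C, G, T become 'X'.
    """
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def novel_peptides(peptides, annotated_proteins, genome):
    """Return peptides found in the genome translation but in no annotated protein."""
    frames = six_frame_translations(genome)
    novel = []
    for pep in peptides:
        in_annotation = any(pep in prot for prot in annotated_proteins)
        in_genome = any(pep in frame for frame in frames)
        if in_genome and not in_annotation:
            novel.append(pep)
    return novel


if __name__ == "__main__":
    # Toy inputs, invented purely for illustration.
    toy_genome = "ATGAAACCCGGGTTTTAA"
    print(novel_peptides(["KPGF", "STQ"], ["MSTQ"], toy_genome))
```

Real proteogenomic analyses typically rely on dedicated search engines with statistical scoring and false-discovery control; exact substring matching is used here only to keep the idea visible.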


Progress Report
Objective 1A: High-quality proteomics data have been collected for the plant pathogenic bacteria Pseudomonas syringae pv. tomato (DC3000), Pantoea ananatis (LMG20103), and Clavibacter michiganensis subsp. michiganensis (NCPPB382), and for a laboratory strain of E. coli K12 (MG1655). Preliminary data were collected for a second strain of P. syringae pv. tomato (15025) and for Liberibacter crescens (BT-1). The high-quality datasets were analyzed using multiple methods to determine the suite of proteins (i.e., the proteome) present in the bacteria under the tested conditions. Our analysis of Clavibacter michiganensis found 34 novel proteins (i.e., proteins not described in the existing genome annotation) and evidence that the previous annotations of over 80 known proteins were incorrect. Based on several characteristics of the protein sequences, these novel proteins are likely to be associated with cell membranes and may contribute to the bacterium's ability to cause disease. In collaboration with other ARS scientists in Ithaca, New York, we completed an analysis of a new genome sequence of L. crescens, a relative of the bacterial pathogen that causes citrus greening disease. This analysis helped to confirm that the new genome has a much higher quality sequence and annotation than the original genome and will be more useful in the study of the citrus greening pathogen.
Objective 1B: Several existing genome annotation methods (e.g., RAST, Prokka) were compared for their ability to annotate raw bacterial genome sequences and identify genes and other features important to the biology of these organisms. We found that, by and large, these programs identified genes about as well as one another and as PGAP, the Prokaryotic Genome Annotation Pipeline developed by NCBI at NIH. For our purposes, the advantage of a package like Prokka is that it can be deployed locally and modified to suit the needs of our project.
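One common way to compare gene calls from two annotation tools is to match predictions by their 3' (stop) end, since tools often agree on the stop codon while choosing different start sites. The sketch below illustrates that idea only; it is not the comparison procedure used in this project. The `GeneCall` type and the `compare_calls` function are hypothetical, coordinates are assumed to be 1-based and inclusive (as in GFF3), and parsing of each tool's output is assumed to happen elsewhere.

```python
from typing import NamedTuple


class GeneCall(NamedTuple):
    """A predicted gene; coordinates assumed 1-based and inclusive."""
    contig: str
    start: int
    end: int
    strand: str  # "+" or "-"


def stop_key(call):
    """Identify a call by contig, strand, and the position of its 3' end."""
    three_prime = call.end if call.strand == "+" else call.start
    return (call.contig, call.strand, three_prime)


def compare_calls(calls_a, calls_b):
    """Classify gene calls from tool A against those from tool B."""
    by_stop_b = {stop_key(c): c for c in calls_b}
    summary = {
        "identical": [],
        "same_stop_different_start": [],
        "only_in_a": [],
        "only_in_b": [],
    }
    matched = set()
    for call in calls_a:
        key = stop_key(call)
        other = by_stop_b.get(key)
        if other is None:
            summary["only_in_a"].append(call)
            continue
        matched.add(key)
        if other == call:
            summary["identical"].append(call)
        else:
            # Same stop codon, but the tools disagree on the start site.
            summary["same_stop_different_start"].append((call, other))
    summary["only_in_b"] = [c for c in calls_b if stop_key(c) not in matched]
    return summary


if __name__ == "__main__":
    # Toy gene calls, invented purely for illustration.
    tool_a = [GeneCall("contig1", 100, 400, "+"), GeneCall("contig1", 500, 800, "-")]
    tool_b = [GeneCall("contig1", 130, 400, "+"), GeneCall("contig1", 500, 800, "-")]
    for category, calls in compare_calls(tool_a, tool_b).items():
        print(category, len(calls))
```

Calls in the "same stop, different start" category are natural candidates for resolution with experimental evidence such as proteomics.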
We have obtained specialized software to identify secreted proteins encoded by bacterial genomes that might be involved in disease induction or virulence. Wet lab experiments are needed to confirm these predictions. Previously, we developed software that could identify features of bacterial DNA involved in gene regulation. That analysis was originally done using RNA sequence data from the bacteria. For this project, we hypothesized that our proteomics datasets could be used instead, which would increase the value of the datasets by using them for a second analysis. Unfortunately, although the proteomics data could identify some regulatory features, the RNA data provided consistently better results.
This research project has been operating with only one of two planned scientist positions and without any support personnel due to the federal hiring freeze. As such, the research outlined in Objective 2 has been put on hold. New hires are planned in the next year.


Accomplishments


Review Publications
Giglio, K., Keohane, C.E., Stodghill, P., Steele, A.D., Fetzer, C., Sieber, S., Filiatrault, M.J., Wuest, W.M. 2018. Transcriptomic profiling suggests that promysalin alters metabolic flux, motility, and iron regulation in Pseudomonas putida KT2440. ACS Infectious Diseases. https://doi.org/10.1021/acsinfecdis.8b00041.
Fishman, M., Zhang, J., Bronstein, P., Stodghill, P., Filiatrault, M.J. 2017. The Ca2+ induced two-component system, CvsSR regulates the Type III secretion system and the extracytoplasmic function sigma-factor AlgU in Pseudomonas syringae pv. tomato DC3000. Journal of Bacteriology. https://doi.org/10.1128/JB.00538-17.