
Research Project: Omics-Based Approach to Detection, Identification, and Systematics of Plant Pathogenic Phytoplasmas and Spiroplasmas

Location: Molecular Plant Pathology Laboratory

Title: AGRAMP: Machine learning models for predicting antimicrobial peptides against phytopathogenic bacteria

Author
Shao, Jonathan
Zhao, Yan
Wei, Wei
Vaisman, Iosif - George Mason University

Submitted to: Frontiers in Microbiology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/12/2024
Publication Date: 3/7/2024
Citation: Shao, J.Y., Zhao, Y., Wei, W., Vaisman, I. 2024. AGRAMP: Machine learning models for predicting antimicrobial peptides against phytopathogenic bacteria. Frontiers in Microbiology. 15:1304044. https://doi.org/10.3389/fmicb.2024.1304044.
DOI: https://doi.org/10.3389/fmicb.2024.1304044

Interpretive Summary: Bacterial plant pathogens cause numerous plant diseases, posing a significant threat to global agricultural productivity and food security. The conventional use of chemical pesticides and antibiotics raises concerns about environmental and human health. An alternative approach involves antimicrobial peptides (AMPs), which offer a more sustainable and eco-friendly solution. AMPs are a group of small peptides that can slow the growth of harmful bacteria or kill them. However, the process of identifying these AMPs is intricate and time-consuming. To address this challenge, ARS scientists at Beltsville, Maryland, in collaboration with colleagues at George Mason University, Virginia, developed a computer program. This program uses N-grams (short runs of consecutive letters taken from the amino acid sequence of a peptide) and the Random Forest machine learning technique (an ensemble of decision trees that also indicates which N-grams matter most) to predict AMPs. To facilitate broader accessibility, a publicly available online resource called AGRAMP (Agricultural N-grams Antimicrobial Peptides) was constructed. This platform enables users to input short peptide sequences and obtain predictions of putative AMPs. To demonstrate the effectiveness of the computer program, a subset of the predicted putative AMPs was tested in the laboratory and shown to be effective at slowing the growth of Spiroplasma citri, a bacterium known to cause serious citrus diseases. The successful identification and selection of potential AMPs contribute to the development of innovative strategies for plant disease management in agriculture. This article will be of interest to farmers and plant doctors who are involved in plant disease management and to researchers who are devoted to developing environmentally sound therapeutics against pathogenic bacteria.

Technical Abstract: Introduction: Antimicrobial peptides (AMPs) are promising alternatives to traditional antibiotics for combating plant pathogenic bacteria in agriculture and the environment. However, identifying potent AMPs through laborious experimental assays is resource-intensive and time-consuming. To address these limitations, this study presents a bioinformatics approach utilizing machine learning models for predicting and selecting AMPs active against plant pathogenic bacteria. Methods: N-gram representations of peptide sequences with 3-letter and 9-letter reduced amino acid alphabets were used to capture the sequence patterns and motifs that contribute to the antimicrobial activity of AMPs. A 5-fold cross-validation technique was used to train the machine learning models and to evaluate their predictive accuracy and robustness. Results: The models were applied to predict putative AMPs encoded by intergenic regions and small open reading frames (ORFs) of the citrus genome. Approximately 7% of the 10,000-peptide dataset from the intergenic region and 7% of the 685,924-peptide dataset from the whole genome were predicted as probable AMPs. The prediction accuracy of the reported models ranges from 0.72 to 0.91. A subset of the predicted AMPs was selected for experimental testing against Spiroplasma citri, the causative agent of citrus stubborn disease. The experimental results confirm the antimicrobial activity of the selected AMPs against the target bacterium, demonstrating the predictive capability of the machine learning models. Discussion: Hydrophobic amino acid residues and positively charged amino acid residues are among the key features in predicting AMPs by the Random Forest algorithm. Aggregation propensity appears to be correlated with the effectiveness of the AMPs. The described models would contribute to the development of effective AMP-based strategies for plant disease management in agricultural and environmental settings.
To facilitate broader accessibility, our model is publicly available on the AGRAMP (Agricultural N-grams Antimicrobial Peptides) server.
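
As an illustration of the Methods described above, the following Python sketch shows one way N-gram features over a reduced amino acid alphabet could be computed, together with a simple 5-fold split of sample indices. It is not the AGRAMP implementation. The three-group alphabet (H = hydrophobic, P = polar, C = charged), the function names, and the interleaved fold assignment are all assumptions made for this example; the actual 3-letter and 9-letter groupings used in the study are not given in this record.

```python
from collections import Counter
from itertools import product

# Hypothetical 3-letter reduced alphabet (an assumption for illustration,
# not the grouping used by AGRAMP): each group letter covers several residues.
REDUCED_ALPHABET = {
    "H": "AVLIMFWC",  # hydrophobic
    "P": "GSTYNQP",   # polar, uncharged
    "C": "KRHDE",     # charged
}
AA_TO_GROUP = {aa: group for group, aas in REDUCED_ALPHABET.items() for aa in aas}


def reduce_sequence(peptide):
    """Map each of the 20 standard residues to its group letter.

    Raises KeyError for any character outside the 20 standard amino acids.
    """
    return "".join(AA_TO_GROUP[aa] for aa in peptide.upper())


def ngram_features(peptide, n=3):
    """Count overlapping n-grams of the reduced sequence.

    Returns a fixed-length list with one count per possible n-gram,
    ordered alphabetically over the group letters.
    """
    reduced = reduce_sequence(peptide)
    counts = Counter(reduced[i:i + n] for i in range(len(reduced) - n + 1))
    vocabulary = ["".join(p) for p in product(sorted(REDUCED_ALPHABET), repeat=n)]
    return [counts[gram] for gram in vocabulary]


def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are assigned to folds in an interleaved fashion; a real
    pipeline would normally shuffle (and often stratify) first.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # Made-up peptide string used only to exercise the functions.
    features = ngram_features("KLAKLAKKLAKLAK", n=3)
    print(len(features), sum(features))
    for train, test in k_fold_splits(10, k=5):
        print(len(train), len(test))
```

Feature vectors of this kind, one per peptide, could then be passed to any Random Forest implementation, with each of the five folds held out once for evaluation. Reducing the alphabet keeps the feature space small: 3 groups give 27 possible 3-grams, whereas the full 20-letter alphabet gives 8,000.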