Research Project: Intervention Strategies to Control Endemic and New and Emerging Influenza A Virus Infections in Swine

Location: Virus and Prion Research

Title: Phylogenetic-based methods for fine-scale classification of PRRSV-2 ORF5 sequences: a comparison of their robustness and reproducibility

Author
VANDERWAAL, KIMBERLY - University Of Minnesota
PAMORNCHAINVAKUL, NAKIRIN - University Of Minnesota
KIKUTI, MARIANA - University Of Minnesota
LINHARES, DANIEL - Iowa State University
TREVISAN, GIOVANI - Iowa State University
ZHANG, JIANQIANG - Iowa State University
Anderson, Tavis
ZELLER, MICHAEL - Iowa State University
ROSSOW, STEPHANIE - University Of Minnesota
HOLTKAMP, DERALD - University Of Minnesota
MAKAU, DENNIS - University Of Minnesota
CORZO, CESAR - University Of Minnesota
PAPLOSKI, IGOR - University Of Minnesota

Submitted to: Frontiers in Virology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/24/2024
Publication Date: 8/13/2024
Citation: Vanderwaal, K., Pamornchainvakul, N., Kikuti, M., Linhares, D.C., Trevisan, G., Zhang, J., Anderson, T.K., Zeller, M., Rossow, S., Holtkamp, D.J., Makau, D.N., Corzo, C.A., Paploski, I.A. 2024. Phylogenetic-based methods for fine-scale classification of PRRSV-2 ORF5 sequences: a comparison of their robustness and reproducibility. Frontiers in Virology. https://doi.org/10.3389/fviro.2024.1433931.
DOI: https://doi.org/10.3389/fviro.2024.1433931

Interpretive Summary: Disease management and epidemiological investigations of porcine reproductive and respiratory syndrome virus-type 2 (PRRSV-2) often rely on grouping together highly related sequences. At present, the method used by animal health practitioners, diagnosticians, and researchers to label closely related sequences relies on restriction fragment length polymorphisms (RFLPs) in a single structural protein. Unrelated sequences are often grouped into the same RFLP-type, and closely related sequences sometimes have different RFLP-types. Thus, an alternative classification scheme is needed to provide better resolution on the relatedness of PRRSV-2 viruses and to better inform disease management and monitoring efforts. Here, we compare potential alternative systems for classifying PRRSV-2 variants using a database of 28,730 sequences, representing ~60% of the U.S. pig population. Machine learning algorithms were trained to classify new sequences to existing groups or to identify genes as unique with high accuracy. This work lays the foundation for a naming system that reliably groups related viruses together and provides clarity for decision-making surrounding disease management. Through identifying robust and reproducible groups of genetically similar viruses within PRRSV-2, vaccine formulations may be generated that match the genetic diversity of viruses circulating in swine production systems.

Technical Abstract: Disease management and epidemiological investigations of porcine reproductive and respiratory syndrome virus-type 2 (PRRSV-2) often rely on grouping together highly related sequences. At present, the method utilized by animal health practitioners, diagnosticians, and researchers to label closely related sequences relies on restriction fragment length polymorphisms (RFLPs) in the ORF5 gene sequence. Unrelated sequences are often grouped into the same RFLP-type, and closely related sequences sometimes have different RFLP-types. Thus, an alternative classification scheme is needed to provide better resolution on the relatedness of PRRSV-2 viruses and to better inform disease management and monitoring efforts. Here, we compare potential alternative systems for classifying PRRSV-2 variants using a database of 28,730 sequences, representing ~60% of the U.S. pig population. In total, we compared 140 approaches that differed in their tree-building method, criteria, and thresholds for defining variants within phylogenetic trees using TreeCluster. We identified three approaches that did not produce overly granularized variants (i.e., ≤5 sequences per cluster) and that resulted in reproducible and robust outputs even when the input data or input phylogenies were changed. In the three best performing approaches, the average genetic distance amongst sequences belonging to the same variant was 2.1-2.5%, and the genetic divergence between variants was 2.5-2.7%. Machine learning algorithms could also be trained to assign new sequences to a variant with >95% accuracy, which shows that newly generated sequences could be assigned without repeating the phylogenetic and clustering analyses. Finally, we identified 73 sequence-clusters associated with circulation events on single farms. The percent of farm sequence-clusters with an ID change was 6.5-8.7% for our best approaches. In contrast, ~43% of farm sequence-clusters had an RFLP-type change, further demonstrating how an alternative fine-scale classification system could address shortcomings of the current system based on RFLPs. Through identifying robust and reproducible classification approaches for PRRSV-2, this work lays the foundation for an alternative system that would more reliably group related field viruses and provide improved clarity for decision-making surrounding disease management.
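
As an illustration of the within-variant distance reported in the abstract, the following is a minimal Python sketch that averages pairwise distances among aligned sequences sharing a variant label. The abstract does not state which distance measure the authors used, so the simple proportion of differing sites (p-distance) is an assumption here, and the function names, variant labels, and toy sequences are hypothetical rather than taken from the study.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences.

    Assumption: p-distance stands in for whatever genetic distance the study used.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within_variant_distance(seqs_by_variant: dict) -> dict:
    """Average pairwise p-distance among sequences that share a variant label.

    Variants with fewer than two sequences have no pairs and are omitted.
    """
    result = {}
    for variant, seqs in seqs_by_variant.items():
        pairs = list(combinations(seqs, 2))
        if pairs:
            result[variant] = sum(p_distance(a, b) for a, b in pairs) / len(pairs)
    return result


# Hypothetical toy data: short aligned fragments, not real ORF5 sequences.
toy = {
    "variant_A": ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"],
    "variant_B": ["TTGTACGTAC"],
}
within = mean_within_variant_distance(toy)
for variant, dist in within.items():
    print(f"{variant}: {dist:.1%}")
```

On real data the inputs would be full-length aligned ORF5 sequences and the labels would come from the clustering step; the same pairwise averaging applied across labels, rather than within them, would give a between-variant figure.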