Research Project: Intervention Strategies to Control Endemic and New and Emerging Influenza A Virus Infections in Swine

Location: Virus and Prion Research

Title: Machine learning for sequence classification and predicting antigenic phenotype

Authors: Anderson, Tavis; Baker, Amy

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 6/18/2022
Publication Date: 10/6/2022
Citation: Anderson, T.K., Baker, A.L. 2022. Machine learning for sequence classification and predicting antigenic phenotype (abstract). The 65th Annual Meeting of the American Association of Veterinary Laboratory Diagnosticians. p.6.

Technical Abstract: Genome sequencing has become a common task in veterinary diagnostic laboratories. A subsequent challenge is integrating the sequences with complex data from public sources and with locally inferred secondary data, such as phenotypic or epidemiologic information, to make informed diagnostic decisions. Under this paradigm, meaningful datasets for inference must be formed, and this can be achieved across large volumes of data using in-database machine learning algorithms. After these context datasets are formed, machine learning approaches can be used to identify genetic variations of diagnostic significance, and the results can be applied in targeted surveillance or in real-time genomic epidemiology. Analytical pipelines may also be generated that assign classifications to genetic sequence data and visualize the identified sequence variants. These pipelines are intuitive and customizable, and diagnosticians can use them to develop trained prediction models that run automatically and provide accurate output faster than alternative methods. An extension of these machine learning algorithms is a class of models that predict antigenic diversity and drift from genetic sequence data. Such models can provide a fast and accurate method for linking diagnostic sequence data to antigenic characteristics. Machine learning approaches can identify genetic features associated with virus classification, predict the antigenic novelty of a virus strain, and increase researchers’ understanding of endemic and emerging viruses by helping them quickly identify new viral variants.
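
The abstract does not say which algorithms or features the authors use. The sketch below is illustrative only and is not the authors' pipeline. It shows one simple way a sequence classifier could work: each sequence is summarized as overlapping k-mer counts, and a new sequence is assigned to the class whose pooled k-mer profile is most similar by cosine similarity. The sequences, the labels `clade_A` and `clade_B`, and all function names are made up for the example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count the overlapping k-mers in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two k-mer count profiles."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled, k=3):
    """Pool k-mer counts per label to form one profile per class."""
    centroids = {}
    for seq, label in labelled:
        centroids.setdefault(label, Counter()).update(kmer_profile(seq, k))
    return centroids


def classify(seq, centroids, k=3):
    """Return the most similar label and its similarity score."""
    profile = kmer_profile(seq, k)
    scores = {label: cosine(profile, c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: these are NOT real influenza sequences or clades.
TRAINING = [
    ("ATGAAGGCAATACTAGTAGTT", "clade_A"),
    ("ATGAAGGCAATACTAGTCGTT", "clade_A"),
    ("ATGCCGTTTGGGCCCTTTGGG", "clade_B"),
    ("ATGCCGTTTGGGCCCTTAGGG", "clade_B"),
]

centroids = train_centroids(TRAINING)
label, score = classify("ATGAAGGCAATACTAGTAGTA", centroids)
print(label, round(score, 3))
```

A production classifier would be trained on curated reference sequences and validated against known classifications. The low similarity a sequence shows to every known class could, under further assumptions, serve as a crude flag for a potentially novel variant.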