Location: Virus and Prion Research
Title: Inference of Reticulations in Phylogenetic Networks
Author
WAGLE, SANKET - Iowa State University
MARKIN, ALEXEY - Orise Fellow
Anderson, Tavis
EULENSTEIN, OLIVER - Iowa State University
Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 3/29/2021
Publication Date: 4/16/2021
Citation: Wagle, S., Markin, A., Anderson, T.K., Eulenstein, O. 2021. Inference of Reticulations in Phylogenetic Networks [abstract]. Annual Bioinformatics and Computational Biology Symposium. P. none assigned.
Interpretive Summary:
Technical Abstract: Phylogenetic networks are a powerful tool in evolutionary biology that capture both divergent (speciation) and convergent (hybridization, reassortment, recombination) evolution. They have been invaluable in explaining how genes, genomes, and species have evolved. The application of these networks has had far-reaching consequences, especially in understanding the evolution of pathogens such as influenza A virus, rotavirus A, and bluetongue virus, where convergent evolution via gene reassortment is common. Unfortunately, inferring such networks is inherently difficult and susceptible to errors in the underlying data, which may limit their utility in practice.
To account for error in biological data, and to identify biologically meaningful networks, we propose a layered approach to traversing the search space of all possible phylogenetic networks. The search begins at the 0th layer, and each subsequent layer consists of all possible networks with one more reticulation than the layer before it. The heuristics find the best possible network in each layer, after which we traverse to the next layer by adding the new reticulation that maximizes the heuristic score. We repeat these steps until we reach the specified bound on the number of reticulations or find a network with a satisfactory heuristic score. This approach allows the algorithm to infer phylogenetic networks efficiently while ensuring that the resulting networks are robust to errors in the underlying data.
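The layered traversal described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the network representation, the heuristic score, or the within-layer search, so everything named here is hypothetical. The sketch assumes a fixed base tree, represents a network only by the set of reticulations added to it, takes the candidate reticulations and the `score` function from the caller, and uses an optional `refine` callback as a stand-in for the within-layer heuristic.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical representation: a reticulation is a (source, target) pair of
# node labels, and a network is the set of reticulations on a fixed base tree.
Reticulation = Tuple[str, str]
Network = FrozenSet[Reticulation]


def layered_search(
    candidates: Iterable[Reticulation],
    score: Callable[[Network], float],
    max_reticulations: int,
    target_score: float,
    refine: Callable[[Network], Network] = lambda net: net,
) -> List[Tuple[Network, float]]:
    """Greedy layer-by-layer search; returns the network kept at each layer.

    Layer k holds networks with exactly k reticulations. The search stops at
    the bound on reticulations or once the score reaches target_score.
    """
    pool = list(candidates)
    current: Network = refine(frozenset())  # layer 0: no reticulations
    current_score = score(current)
    history = [(current, current_score)]

    for _ in range(max_reticulations):
        if current_score >= target_score:
            break  # satisfactory score reached
        # All networks in the next layer reachable by adding one reticulation.
        options = [current | {r} for r in pool if r not in current]
        if not options:
            break  # no candidate reticulations left
        # Move to the next layer via the addition that maximizes the score,
        # then let the (assumed) within-layer heuristic improve the result.
        current = refine(max(options, key=score))
        current_score = score(current)
        history.append((current, current_score))

    return history
```

A caller would supply its own scoring function, for example one that measures how well a network explains a set of gene trees, and could then pick the highest-scoring entry from the returned list. Returning one network per layer, rather than only the last, keeps the choice of how many reticulations to accept with the user.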
Using this algorithm, we analyzed the evolution of an H3 swine influenza A virus (IAV) lineage with all available whole-genome strain data (n=429). These data recapitulated known reassortment events and the evolutionary trajectory of the virus lineage. Additionally, the best phylogenetic network identified previously undescribed interlineage and intralineage reassortments, which provide an objective way to identify viruses for additional characterization. Our empirical study on IAV demonstrates the biological relevance and computational performance of our algorithms. Currently, we are developing more efficient and accurate algorithms that will enable the inference of networks with many more reticulations and the analysis of larger numbers of taxa.