Introduction
Seasonal influenza control and vaccine design depend on anticipating dominant strains, while pandemic preparedness depends on identifying animal influenza viruses with elevated potential for human emergence. Existing assessments rely on expert review and experimental assays that are not scalable to modern genomic surveillance. There is a need for computational approaches that can rapidly prioritize animal influenza strains using sequence data alone, while remaining grounded in empirically observed evolutionary constraints.
Materials and Methods
We developed Emergenet, a sequence-based digital twin of influenza evolution that learns mutational constraints from observed viral populations. Using 463,266 HA and NA sequences from NCBI Virus and GISAID, the model infers conditional dependencies via conditional inference trees, inducing an intrinsic evolutionary distance metric. We evaluated vaccine strain forecasting against World Health Organization (WHO) recommendations and emergence-risk estimation against Centers for Disease Control and Prevention (CDC) Influenza Risk Assessment Tool (IRAT) scores.
Results
Across 2 decades of retrospective forecasting, Emergenet consistently outperformed WHO recommendations for H1N1 and H3N2 in both hemispheres. For H1N1, Emergenet reduced mismatch to circulating strains by 3.7 amino acids over 2 decades and 5.5 over the last decade in the Northern Hemisphere; comparable improvements were observed for H3N2. Emergenet shifted mismatch from the regime associated with poorly matched seasons to that associated with the best-matched seasons. Emergenet-derived risk scores correlated with CDC IRAT emergence scores (r = 0.72) while requiring 30 seconds per strain. Application to 6,354 animal influenza A virus sequences from 2020 to 2024 identified a subset of high-risk strains across subtypes, hosts, and regions.
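As an illustration of how a mismatch measured in amino acids might be computed, the following minimal sketch counts differing residues between a candidate vaccine strain and each circulating strain, then averages the counts. It assumes pre-aligned sequences of equal length and a simple Hamming-style count; the function names and the toy sequence fragments are hypothetical, and this is not necessarily the exact metric used in the study.

```python
from statistics import mean


def hamming(a: str, b: str) -> int:
    """Count positions at which two aligned amino-acid sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def mean_mismatch(candidate: str, circulating: list[str]) -> float:
    """Average amino-acid mismatch between a candidate and circulating strains."""
    return mean(hamming(candidate, s) for s in circulating)


# Toy aligned fragments (hypothetical, for illustration only).
circulating = ["MKTIIALSYI", "MKTIIALSYV", "MKAIIALSYV"]
reference_pick = "MKTLIALCYI"  # stands in for a recommended strain
model_pick = "MKTIIALSYV"      # stands in for a model-selected strain

reference_score = mean_mismatch(reference_pick, circulating)
model_score = mean_mismatch(model_pick, circulating)

# A positive value means the model-selected strain is closer on average.
improvement = reference_score - model_score
print(f"reference: {reference_score:.2f}, model: {model_score:.2f}, "
      f"improvement: {improvement:.2f} amino acids")
```

Under these assumptions, a reported reduction of 3.7 amino acids would correspond to the difference between two such averages taken over the circulating strains of the relevant seasons.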
Conclusions
A sequence-based digital twin of influenza evolution can support more informed vaccine strain selection and scalable biosurveillance by quantifying the evolutionary proximity of animal influenza viruses to human-adapted viruses under learned constraints. Although it is not a mechanistic predictor of zoonosis or of the immune response to selected vaccine strains, Emergenet provides strain-level prioritization tools relevant to vaccine strain selection, military readiness, and public health preparedness.