Yang L, Li G, Yang J, Zhang T, Du J, Liu T, Zhang. Deep-Learning Model for Influenza Prediction From Multisource Heterogeneous Data in a Megacity: Model Development and Evaluation. J Med Internet Res. 2023 Feb 13;25:e44238
Background: In megacities, there is an urgent need to establish more sensitive forecasting and early warning methods for acute respiratory infectious diseases. Existing prediction and early warning models for influenza and other acute respiratory infectious diseases have limitations and therefore there is room for improvement.
Objective: The aim of this study was to explore a new and better-performing deep-learning model to predict influenza trends from multisource heterogeneous data in a megacity.
Methods: We collected multisource heterogeneous data from the 26th week of 2012 to the 25th week of 2019, including influenza-like illness (ILI) cases and virological surveillance data, climate and demographic data, and search engine data. To avoid collinearity, we selected the best predictor according to the weight and correlation of each factor. We established a new multiattention-long short-term memory (LSTM) deep-learning model (MAL model), which was used to predict the percentage of ILI (ILI%) cases and the product of ILI% and the influenza-positive rate (ILI%×positive%), respectively. We also combined the data in different forms and, for comparison, included several machine-learning and deep-learning models commonly used to predict influenza trends. The R² value, explained variance score, mean absolute error, and mean squared error were used to evaluate the quality of the models.
Results: The highest correlation coefficients were found for the Baidu search data for ILI% and for air quality for ILI%×positive%. We first used the MAL model to calculate the ILI%, and then combined ILI% with climate, demographic, and Baidu data in different forms. The ILI%+climate+demography+Baidu model had the best prediction effect, with an explained variance score of 0.78, R² of 0.76, mean absolute error of 0.08, and mean squared error of 0.01. Similarly, we used the MAL model to calculate the ILI%×positive% and combined this prediction with different data forms. The ILI%×positive%+climate+demography+Baidu model had the best prediction effect, with an explained variance score of 0.74, R² of 0.70, mean absolute error of 0.02, and mean squared error of 0.02. Comparisons with random forest, extreme gradient boosting, LSTM, and gated recurrent unit models showed that the MAL model had the best prediction effect.
Conclusions: The newly established MAL model outperformed existing models. Natural factors and search engine query data were more helpful in forecasting ILI patterns in megacities. With more timely and effective prediction of influenza and other respiratory infectious diseases and the epidemic intensity, early and better preparedness can be achieved to reduce the health damage to the population.
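The four evaluation metrics named in the Methods (R², explained variance score, mean absolute error, and mean squared error) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics`, its dictionary keys, and the toy numbers used below are assumptions made for illustration, and the formulas are the standard textbook definitions.

```python
from statistics import fmean, pvariance


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, R^2, and explained variance for two equal-length series.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Residual = observed value minus predicted value, per week.
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = fmean(abs(r) for r in residuals)   # mean absolute error
    mse = fmean(r * r for r in residuals)    # mean squared error

    var_true = pvariance(y_true)             # population variance of observations
    if var_true == 0:
        raise ValueError("R^2 and explained variance are undefined for constant y_true")

    # R^2 = 1 - SS_res / SS_tot, which equals 1 - MSE / Var(y_true).
    r2 = 1 - mse / var_true
    # Explained variance = 1 - Var(residuals) / Var(y_true); unlike R^2,
    # it does not penalize a constant bias in the predictions.
    evs = 1 - pvariance(residuals) / var_true

    return {"mae": mae, "mse": mse, "r2": r2, "explained_variance": evs}
```

For example, with toy observations `[1.0, 2.0, 3.0, 4.0]` and predictions `[1.5, 2.5, 3.5, 4.5]` (every prediction is 0.5 too high), the function gives a mean absolute error of 0.5, a mean squared error of 0.25, an R² of about 0.8, and an explained variance score of 1.0. A constant offset lowers R² but not the explained variance score, which is consistent with the explained variance scores in the Results being slightly higher than the corresponding R² values.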