Time series for forecasting infectious disease outbreaks in Latin America
DOI:
https://doi.org/10.70577/9j5qky84
Keywords:
Diseases, Epidemiology, Models, Prediction, Time Series.
Abstract
This study analyzes the predictive performance of time series models applied to infectious disease outbreaks in Latin America, using a data science approach. Two approaches were compared: the seasonal SARIMA model and a hybrid SARIMA + NNAAR (autoregressive neural network) model. The results show that, although SARIMA has limited explanatory power (negative R²), it maintains acceptable performance in terms of error (RMSE = 1.55; MAE = 0.87). The hybrid model, by contrast, performed worse, with higher errors and an even more negative R², which indicates that adding a neural network does not necessarily improve the predictive capacity of the system. The learning curve of the NNAAR model suggests possible undertraining, which reinforces the need for careful calibration when complex models are integrated. The study highlights the importance of selecting models according to the structure of the data rather than their technical sophistication, and recommends methodological optimizations before hybrid models are deployed in epidemiological surveillance systems. This analysis, based on realistic simulated data, underscores the value of time series methodologies for disease prediction and public health decision-making.
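The combination of a negative R² with moderate RMSE and MAE can look contradictory, so a minimal sketch may help. It assumes the usual definitions (RMSE and MAE on the raw errors, and R² = 1 - SS_res / SS_tot); the abstract does not state which formulas were used. The function name `forecast_metrics` and the weekly counts are hypothetical illustrations, not data from the study.

```python
import math


def forecast_metrics(actual, predicted):
    """Return (rmse, mae, r2) for two equal-length numeric sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # spread around the mean
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed series is constant")
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical low-variance weekly case counts (illustrative only).
actual = [3, 4, 3, 5, 4, 3, 4, 4]
# Hypothetical forecast that is off by exactly one case every week.
predicted = [4, 3, 4, 4, 5, 4, 3, 5]

rmse, mae, r2 = forecast_metrics(actual, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.2f}")
```

In this toy series every forecast misses by a single case, so RMSE and MAE are both 1, yet R² is negative because the observed values vary so little that simply predicting their mean would have a smaller squared error. Under these assumptions, a negative R² means the model does worse than the mean baseline on the evaluation window, not that its absolute errors are large.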
Contribution of Individual Authors to the Preparation of a Scientific Article (Ghostwriting Policy)
All authors contributed equally to the development of the article.
Sources of Funding for the Research Presented in the Scientific Article or for the Scientific Article Itself
No funding was received for this study.
Conflict of Interest
The authors declare that they have no conflict of interest relevant to the content of this article.
Creative Commons Attribution 4.0 License (Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the Creative Commons Attribution 4.0 License.