Data Mining for the Optimization of Industrial Processes in Latin American Manufacturing
DOI: https://doi.org/10.70577/c7273g40
Keywords: data, industry, mining, models, optimization.
Abstract
This study explores the application of data mining and machine learning techniques to industrial process optimization in Latin America, with an emphasis on the context of Industry 4.0. Using simulated data representative of real operations, statistical methods were applied, including imputation models, variable selection, principal component analysis (PCA), clustering, and predictive models such as XGBoost and SVM. The results indicate that lead time, mean time between failures (MTBF), and CO₂ emissions have a direct impact on the defect rate in parts per million (PPM), highlighting the interrelationship between logistical, maintenance, and environmental factors. The clustering analysis identified three operational profiles differentiated by energy efficiency and quality, which makes targeted interventions possible. Although the XGBoost model performed well, possible overfitting was noted, so cross-validation is recommended. Time trends showed no significant seasonality, suggesting that internal process variables have the greater influence. The study concludes that integrating advanced analytics, predictive maintenance, and artificial intelligence can significantly improve competitiveness, sustainability, and quality in Latin American manufacturing environments.
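The cross-validation recommendation in the abstract can be illustrated with a minimal sketch. It is not the study's code: the synthetic records, the coefficients linking lead time, MTBF, and CO₂ to PPM, and all function names are hypothetical, and a 1-nearest-neighbour regressor stands in for XGBoost so that the example runs with the Python standard library alone. The point is only the comparison: a model that memorizes its training data scores perfectly when evaluated on that same data, while k-fold cross-validation gives a more honest estimate.

```python
import random
import statistics


def make_synthetic_rows(n, seed=0):
    """Hypothetical simulated records: (lead_time, mtbf, co2) -> ppm."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        lead_time = rng.uniform(1, 30)   # days (assumed range)
        mtbf = rng.uniform(50, 500)      # hours (assumed range)
        co2 = rng.uniform(10, 100)       # tonnes (assumed range)
        noise = rng.gauss(0, 150)
        # Assumed relationship, chosen only for illustration.
        ppm = 500 + 20 * lead_time - 1.0 * mtbf + 5 * co2 + noise
        rows.append(((lead_time, mtbf, co2), ppm))
    return rows


def predict_1nn(train, x):
    """Return the target of the training row closest to x (squared Euclidean)."""
    nearest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_r2(rows, k=5, seed=0):
    """Shuffle the rows, split them into k folds, and score each held-out fold."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        preds = [predict_1nn(train, x) for x, _ in test]
        scores.append(r2_score([y for _, y in test], preds))
    return scores


rows = make_synthetic_rows(200)

# Scoring on the training data itself: the memorizing model looks perfect.
train_r2 = r2_score([y for _, y in rows], [predict_1nn(rows, x) for x, _ in rows])

# Scoring on held-out folds: a more realistic estimate of generalization.
cv_scores = k_fold_r2(rows, k=5)
cv_r2 = statistics.fmean(cv_scores)

print(f"training R^2:        {train_r2:.3f}")
print(f"5-fold CV mean R^2:  {cv_r2:.3f}")
```

A large gap between the training score and the cross-validated score is the kind of signal that would support the overfitting concern raised for the XGBoost model; the same comparison applies to any estimator substituted for the stand-in.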
Individual Author Contributions to the Preparation of the Scientific Article (Ghostwriting Policy)
All authors contributed equally to the development of the article.
Funding Sources for the Research Presented in the Scientific Article or for the Scientific Article Itself
No funding was received for this study.
Conflict of Interest
The authors declare no relevant conflict of interest regarding the content of this article.
Creative Commons Attribution 4.0 License (Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the Creative Commons Attribution 4.0 License.
License
Copyright (c) 2025 Carlos Javier Lara Lascano (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
You are free to:
- Share: copy and redistribute the material in any medium or format
- Adapt: remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial: You may not use the material for commercial purposes.
- ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.










