Development of a Machine Learning Model for Predicting River Pollution Levels Caused by Illegal Gold Mining Activities in Kuantan Singingi Regency

  • Jasri, Universitas Islam Kuantan Singingi
  • Aprizal, Universitas Islam Kuantan Singingi
  • Erlinda, Universitas Islam Kuantan Singingi
  • Febri Haswan, Universitas Islam Kuantan Singingi
  • Amirel Hafief Agiel, Universitas Islam Kuantan Singingi
Keywords: Machine Learning, River Pollution, Random Forest, Water Quality, Illegal Gold Mining

Abstract

Illegal gold mining, known locally as PETI, in Kuantan Singingi Regency has polluted river water and poses serious threats to environmental sustainability and public health. Conventional water quality monitoring relies on periodic laboratory testing and cannot deliver rapid predictions. This study therefore developed a Machine Learning-based prediction system for river pollution levels caused by illegal gold mining. The model uses seven water quality parameters: pH, temperature, Total Suspended Solids (TSS), Dissolved Oxygen (DO), Biological Oxygen Demand (BOD), Chemical Oxygen Demand (COD), and mercury concentration. The dataset was preprocessed through data cleaning, normalization, feature selection, and data splitting. Three Machine Learning algorithms, Random Forest, Support Vector Machine, and Artificial Neural Network, were implemented and compared to identify the best prediction model. Random Forest performed best of the three, with high accuracy and stable classification results. The resulting model was integrated into a web-based system that provides pollution visualization, river information, and a Geographic Information System (GIS). The system is expected to support environmental monitoring and decision-making in river pollution management in Kuantan Singingi Regency.
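The cleaning, normalization, and splitting stages named in the abstract can be illustrated with a minimal sketch. It assumes each sample arrives as a dictionary keyed by parameter name. The field names, the sample values, the three pollution labels, and the 75/25 split are hypothetical choices made for illustration and are not taken from the study. Feature selection is omitted, and min-max scaling is only one possible normalization.

```python
import random

# Parameter names follow the abstract; everything else here is assumed.
FEATURES = ["ph", "temperature", "tss", "do", "bod", "cod", "mercury"]


def clean(records):
    """Drop records with a missing feature value or a missing label."""
    return [
        r for r in records
        if r.get("label") is not None
        and all(r.get(f) is not None for f in FEATURES)
    ]


def min_max_normalize(records):
    """Rescale every feature to [0, 1] using the min and max in `records`."""
    lo = {f: min(r[f] for r in records) for f in FEATURES}
    hi = {f: max(r[f] for r in records) for f in FEATURES}
    scaled = []
    for r in records:
        row = {"label": r["label"]}
        for f in FEATURES:
            span = hi[f] - lo[f]
            # A constant feature has zero span; map it to 0.0.
            row[f] = (r[f] - lo[f]) / span if span else 0.0
        scaled.append(row)
    return scaled


def split(records, test_ratio=0.25, seed=42):
    """Shuffle with a fixed seed and return (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up sample values for illustration only (not measured data).
samples = [
    {"ph": 6.8, "temperature": 27.5, "tss": 40.0, "do": 6.2,
     "bod": 2.1, "cod": 12.0, "mercury": 0.0005, "label": "low"},
    {"ph": 5.9, "temperature": 29.0, "tss": 180.0, "do": 4.1,
     "bod": 5.5, "cod": 38.0, "mercury": 0.0040, "label": "high"},
    {"ph": 6.4, "temperature": 28.2, "tss": None, "do": 5.0,
     "bod": 3.8, "cod": 25.0, "mercury": 0.0018, "label": "medium"},
    {"ph": 6.5, "temperature": 28.0, "tss": 95.0, "do": 5.3,
     "bod": 3.4, "cod": 22.0, "mercury": 0.0015, "label": "medium"},
    {"ph": 7.0, "temperature": 27.0, "tss": 30.0, "do": 6.8,
     "bod": 1.8, "cod": 10.0, "mercury": 0.0003, "label": "low"},
]

cleaned = clean(samples)                 # the record with a missing TSS is dropped
normalized = min_max_normalize(cleaned)
train, test = split(normalized)
print(len(cleaned), len(train), len(test))  # prints: 4 3 1
```

The sketch follows the order of stages given in the abstract. A common alternative is to split first and compute the scaling bounds on the training portion only, so that no information from the test samples influences the normalization.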



Published
2026-06-25
How to Cite
Jasri, Aprizal, Erlinda, Febri Haswan, & Amirel Hafief Agiel. (2026). Development of a Machine Learning Model for Predicting River Pollution Levels Caused by Illegal Gold Mining Activities in Kuantan Singingi Regency. JURNAL TEKNOLOGI DAN OPEN SOURCE, 9(1), 390–410. https://doi.org/10.36378/jtos.v9i1.5724