Classification and Mapping of Online Gambling Based on News Articles Using NER and SVM

  • Wisnu Mukti Darwansah, Universitas Pembangunan Nasional "Veteran" Jawa Timur
  • Amalia Anjani Arifiyanti, Universitas Pembangunan Nasional "Veteran" Jawa Timur
  • Rizka Hadiwiyanti
Keywords: Online Gambling, Support Vector Machine (SVM), Named Entity Recognition (NER), Risk Classification, Data Visualization

Abstract

Online gambling in Indonesia has grown rapidly and poses serious social and economic threats. This study classifies and maps online gambling activity reported in digital news, using the Support Vector Machine (SVM) algorithm for risk classification and Named Entity Recognition (NER) for location extraction. Articles published between 2017 and 2024 were collected from the news portals Detik.com, Kompas.com, and Tribunnews by web scraping. The workflow comprised environment setup and library import, data upload, data exploration, data labeling according to Law No. 1 of 2023, data preprocessing, data filtering, location normalization and extraction, and location data cleaning. An SVM model was then trained for risk classification and used for prediction. Performance was evaluated with accuracy, to measure overall correctness, and F1-score, to measure how evenly the model performs across classes. Among the models evaluated, the Normal SVM model performed best, with an accuracy of 96.94% and an F1-score of 0.97. These findings indicate that combining NER and SVM effectively identifies both the location and the risk level of online gambling activity. The results are expected to support law enforcement authorities and policymakers in preventing and addressing online gambling in Indonesia.
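The two evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not state which F1 averaging was used, so macro averaging is an assumption here, and the risk labels (`high`, `medium`, `low`) and the sample predictions are hypothetical values chosen only for illustration, not data from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, built from per-class precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical risk labels for six articles (not data from the study).
y_true = ["high", "high", "low", "low", "medium", "medium"]
y_pred = ["high", "low", "low", "low", "medium", "high"]

print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("per-class F1:", f1_per_class(y_true, y_pred))
print("macro F1:", round(macro_f1(y_true, y_pred), 4))
```

Reporting F1 next to accuracy matters when risk classes are unevenly represented: a model can reach high accuracy by favoring the majority class, whereas per-class F1 exposes weak precision or recall on the smaller classes.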

Published
2025-12-10
How to Cite
Wisnu Mukti Darwansah, Amalia Anjani Arifiyanti, & Rizka Hadiwiyanti. (2025). Classification and Mapping of Online Gambling Based on News Articles Using NER and SVM. JURNAL TEKNOLOGI DAN OPEN SOURCE, 8(2), 788–797. https://doi.org/10.36378/jtos.v8i2.4707