Benchmarking IndoBERT and Multilingual BERT for Indonesian Financial News Sentiment Classification

Matius Ivan Bimasena; I Kadek Yogi Prayoga; I Gusti Agung Putu Mahendra; Purnama Sidik

doi:10.36378/jtos.v9i1.5583

Matius Ivan Bimasena Institut Teknologi dan Bisnis STIKOM Bali
I Kadek Yogi Prayoga Department of Information Technology, Institut Teknologi dan Bisnis STIKOM Bali, Indonesia
I Gusti Agung Putu Mahendra Department of Informatic Engineering, Politeknik Negeri Bengkalis, Indonesia
Purnama Sidik Department of Information Technology, Institut Teknologi dan Bisnis STIKOM Bali, Indonesia

Keywords: Financial news; Classification; Transformer-based models; IndoBERT; Multilingual BERT

Abstract

Financial news sentiment classification is important for understanding market narratives and investor perception, but Indonesian financial news remains challenging to classify because it contains domain-specific terminology, numerical expressions, and imbalanced sentiment categories. This study benchmarks two transformer-based models, IndoBERT and Multilingual BERT, for classifying Indonesian financial news sentiment into negative, neutral, and positive classes. The dataset consists of economic and financial news articles published by Kontan, CNBC, and Bisnis.com during the first quarter of 2026. After preprocessing, 3,366 articles were retained: 3,070 neutral, 184 negative, and 112 positive. The dataset was divided into training, validation, and test sets using stratified splitting, and class weighting was applied to reduce the effect of class imbalance. IndoBERT achieved the better overall performance, with 0.94 accuracy and a macro F1-score of 0.71, while Multilingual BERT achieved 0.93 accuracy and a macro F1-score of 0.70. These findings indicate that IndoBERT is more suitable for Indonesian financial news sentiment classification, although Multilingual BERT remains competitive, especially in detecting positive sentiment.
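
To make the class imbalance concrete, the sketch below uses only the class counts reported in the abstract. It assumes an inverse-frequency ("balanced") weighting scheme, which the abstract does not specify, and the function names are hypothetical. It also computes the accuracy and macro F1-score of a trivial classifier that always predicts the majority class, evaluated on the full dataset rather than on the study's test split.

```python
# Class counts reported in the abstract.
counts = {"neutral": 3070, "negative": 184, "positive": 112}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count).

    Assumption: the abstract does not state which weighting formula was used;
    this is one common heuristic, shown for illustration only.
    """
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}


def majority_baseline(counts):
    """Accuracy and macro F1 of a classifier that always predicts the largest class."""
    total = sum(counts.values())
    majority = max(counts, key=counts.get)
    accuracy = counts[majority] / total
    # For the majority class, precision equals accuracy and recall is 1.
    precision, recall = accuracy, 1.0
    f1_majority = 2 * precision * recall / (precision + recall)
    # The other classes are never predicted, so their F1 is taken as 0.
    macro_f1 = f1_majority / len(counts)
    return accuracy, macro_f1


weights = balanced_class_weights(counts)
accuracy, macro_f1 = majority_baseline(counts)

for label, w in weights.items():
    print(f"{label:>8}: weight = {w:.3f}")
print(f"majority baseline: accuracy = {accuracy:.3f}, macro F1 = {macro_f1:.3f}")
```

Under these assumptions the neutral class receives a weight of about 0.37, while the negative and positive classes receive about 6.1 and 10.0. The majority baseline reaches roughly 0.91 accuracy but only about 0.32 macro F1-score, which suggests that the macro F1-score is the more informative of the two reported metrics for this dataset.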
