Semantic Diversity in Generating Math Word Problems with Large Language Models
Abstract
The development of Large Language Models (LLMs) has opened new opportunities for the automatic generation of math word problems (MWPs). However, many existing approaches still produce repetitive, template-based problems because they vary little in context, narrative structure, and semantic relationships. This limitation makes such problems less effective for assessing students’ conceptual understanding. This study develops a semantic diversity pipeline based on context-aware generation to produce more varied, meaningful, and curriculum-aligned math word problems for Indonesian elementary education. The proposed method involves building an Indonesian MWP dataset, fine-tuning an LLM with Low-Rank Adaptation (LoRA), and designing a generation pipeline consisting of context retrieval, prompt diversification, semantic evaluation, and solvability filtering. Evaluation used automated metrics (Self-BLEU, Jaccard Similarity, and Cosine Similarity over Sentence-BERT embeddings) together with qualitative assessments from mathematics teachers. The results indicate that the proposed approach improves lexical, semantic, and contextual diversity in the generated problems. The Self-BLEU score of 4.79 indicates low repetition, the Jaccard Similarity score of 0.194 reflects high vocabulary variation, and the Cosine Similarity score of 0.424 suggests balanced semantic diversity with mathematical consistency maintained. Teacher evaluations further confirm that the generated problems are relevant to the curriculum, appropriately challenging, and more natural than those produced by conventional methods. Overall, this research contributes to more adaptive and diverse LLM-based math word problem generation systems for mathematics learning in Indonesia.
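As a rough illustration of how pairwise diversity scores such as the Jaccard and Cosine Similarity figures above can be computed, the following minimal sketch averages a similarity function over all pairs of generated problems. It is not the authors' implementation: the function names, the whitespace tokenization, and the sample problems are assumptions made for illustration, and the cosine function expects embedding vectors that would in practice come from a Sentence-BERT model. Lower average similarity corresponds to a more diverse set.

```python
from itertools import combinations
from math import sqrt


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two texts."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (e.g. embeddings)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mean_pairwise(items, sim) -> float:
    """Average of sim(x, y) over all unordered pairs of items."""
    pairs = list(combinations(items, 2))
    return sum(sim(x, y) for x, y in pairs) / len(pairs) if pairs else 0.0


# Hypothetical generated problems (illustrative only, not from the paper's dataset).
problems = [
    "Budi has 3 apples and buys 2 more",
    "Siti has 3 books and buys 2 more",
    "A farmer plants 12 rows of corn",
]
print(round(mean_pairwise(problems, jaccard), 3))
```

The same `mean_pairwise` helper can be applied to a list of embedding vectors with `cosine` as the similarity function, which keeps the lexical and semantic measures on a common footing.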
References
A. Aghzal et al., “Evaluating Large Language Models on Advanced Scientific and Mathematical Problem Solving,” IEEE Transactions on Learning Technologies, vol. 18, no. 1, pp. 21–35, 2025.
N. Mohiuddin, A. Hassan, and M. Rahman, “Large Language Models for Educational Natural Language Processing: A Survey,” IEEE Access, vol. 11, pp. 112345–112367, 2023.
Y. Liu et al., “Large Language Models for Mathematical Reasoning: Challenges and Opportunities,” Artificial Intelligence Review, vol. 57, no. 1, pp. 1–29, 2024.
R. E. N. Wang, H. Li, and J. Zhao, “Automatic Math Word Problem Generation Using Transformer Models,” Expert Systems with Applications, vol. 159, p. 113596, 2020.
D. A. Cruse, Meaning in Language: An Introduction to Semantics and Pragmatics. Oxford, UK: Oxford University Press, 2000.
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, et al., “Language Models are Few-Shot Learners,” in Advances in Neural Information Processing Systems, 2020, pp. 1877–1901.
E. J. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models,” in International Conference on Learning Representations (ICLR), 2022.
S. Wang, X. Chen, and F. Li, “Semantic Diversity Enhancement in Neural Text Generation,” IEEE Transactions on Artificial Intelligence, vol. 2, no. 4, pp. 322–334, 2021.
D. Zhang, L. Wang, and Y. Liu, “Generating Mathematical Word Problems with Semantic Constraints,” in Proceedings of COLING, 2020, pp. 245–256.
J. Lyons, Semantics. Cambridge, UK: Cambridge University Press, 1977.
P. Anand, M. Sharma, and K. Gupta, “Reasoning Limitations of Large Language Models in Mathematical Word Problems,” Computers & Education: Artificial Intelligence, vol. 5, p. 100178, 2024.
X. Xie et al., “Multi-Level Evaluation Metrics for Diverse Text Generation,” Information Processing & Management, vol. 60, no. 2, p. 103210, 2023.
M. Strohmaier et al., “Holistic Evaluation Framework for Educational Text Generation Systems,” Education and Information Technologies, vol. 30, no. 2, pp. 1551–1570, 2025.
Z. Chen et al., “Self-Instruct and Rejection Sampling for High-Quality Educational Dataset Generation,” IEEE Access, vol. 13, pp. 44122–44138, 2025.
A. Mahran and K. Simbeck, “Semantic Control in Educational Language Models,” Journal of Artificial Intelligence in Education, vol. 35, no. 1, pp. 44–63, 2025.
N. Reimers and I. Gurevych, “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks,” in Proceedings of EMNLP-IJCNLP, 2019, pp. 3982–3992.
Copyright (c) 2026 Andra Havid Andra Ramadhon, Agung Cahyono, Taufiq Agung Cahyono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium. Users may read, download, copy, distribute, search, or link to the full text of articles in this journal without asking prior permission, provided they give appropriate credit, provide a link to the license, and indicate if changes were made. Anyone who remixes, transforms, or builds upon the material must distribute their contributions under the same license as the original.












