Semantic Diversity In The Formation Of Story Questions From The Big Language Model

  • Andra Havid Andra Ramadhon andra
  • Agung Cahyono
  • Taufiq Agung Cahyono
Keywords: Large Language Model, Semantic Diversity, Math Word Problem, Prompt Diversification, Context-Aware Generation

Abstract

The development of Large Language Models (LLMs) has opened new opportunities for the automatic generation of math word problems (MWPs). However, many existing approaches still produce repetitive, template-based problems because they offer little variation in context, narrative structure, and semantic relationships. This limitation makes such problems less effective for assessing students’ conceptual understanding. This study develops a semantic diversity pipeline based on context-aware generation to produce more varied, meaningful, and curriculum-aligned math word problems for Indonesian elementary education. The proposed method involves building an Indonesian MWP dataset, fine-tuning an LLM with Low-Rank Adaptation (LoRA), and designing a generation pipeline consisting of context retrieval, prompt diversification, semantic evaluation, and a solvability filter. Evaluation used automated metrics (Self-BLEU, Jaccard Similarity, and Cosine Similarity computed on Sentence-BERT embeddings) together with qualitative assessments from mathematics teachers. The results show that the proposed approach improves the lexical, semantic, and contextual diversity of the generated problems. The Self-BLEU score of 4.79 indicates low repetition, the Jaccard Similarity score of 0.194 reflects high vocabulary variation, and the Cosine Similarity score of 0.424 indicates balanced semantic diversity while maintaining mathematical consistency. Teacher evaluations further confirm that the generated problems are relevant to the curriculum, appropriately challenging, and more natural than those produced by conventional methods. Overall, this research contributes to the development of more adaptive and diverse LLM-based math word problem generation systems for mathematics learning in Indonesia.
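As a rough illustration of how the pairwise diversity metrics named in the abstract can be computed, the following minimal sketch averages a similarity function over all pairs of generated problems. It is not the authors' evaluation code. The tokenizer, the function names, and the sample problems are assumptions made for illustration, and a bag-of-words cosine is used here as a simple stand-in for cosine similarity on Sentence-BERT embeddings.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Naive whitespace tokenizer (an assumption): lowercase, strip punctuation.
    punct = ".,?!;:()\"'"
    tokens = (w.strip(punct).lower() for w in text.split())
    return [t for t in tokens if t]


def jaccard(a, b):
    # Jaccard similarity between the vocabulary sets of two texts.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def bow_cosine(a, b):
    # Cosine similarity on word-count vectors. Stand-in only: the study
    # reports cosine similarity on Sentence-BERT embeddings instead.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    na = math.sqrt(sum(c * c for c in ca.values()))
    nb = math.sqrt(sum(c * c for c in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_pairwise(items, sim):
    # Average a similarity function over all unordered pairs of items.
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical sample problems, not taken from the study's dataset.
    problems = [
        "Budi has 12 mangoes and gives 5 to Siti. How many mangoes are left?",
        "A farmer plants 8 rows of corn with 6 plants each. How many plants?",
        "Rina buys 3 books for 4000 rupiah each. How much does she pay?",
    ]
    print("mean Jaccard:", round(mean_pairwise(problems, jaccard), 3))
    print("mean BoW cosine:", round(mean_pairwise(problems, bow_cosine), 3))
```

Lower mean pairwise similarity suggests a more varied set of problems. Swapping `bow_cosine` for a function that compares sentence embeddings would bring the sketch closer to the Sentence-BERT metric described above.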

Published
2026-06-16
How to Cite
Havid Andra Ramadhon, A., Agung Cahyono, & Taufiq Agung Cahyono. (2026). Semantic Diversity In The Formation Of Story Questions From The Big Language Model. JURNAL TEKNOLOGI DAN OPEN SOURCE, 9(1), 217 - 227. https://doi.org/10.36378/jtos.v9i1.5490