Dental and Oral Disease Image Classification Using MobileViT Architecture on Imbalanced Datasets

  • Dwi Pambudi Utomo, Electrical Engineering Department, Faculty of Engineering, Universitas Negeri Semarang, Indonesia
  • Septian Eko Prasetyo, Electrical Engineering Department, Faculty of Engineering, Universitas Negeri Semarang, Indonesia
Keywords: MobileViT, Image Classification, Dental and Oral Disease, Imbalanced Dataset, Transfer Learning, Data Augmentation

Abstract

Dental and oral diseases are a highly prevalent global health problem, yet their diagnosis still relies heavily on subjective visual inspection by clinicians. Automated diagnostic systems exist, but previous studies predominantly use conventional Convolutional Neural Networks (CNNs), which struggle to capture global anatomical dependencies, or standard Vision Transformers (ViTs), whose large parameter counts hinder deployment on clinical edge devices. Existing research also frequently covers only a few disease classes and does not explicitly address severe class imbalance in clinical data. To bridge these gaps, this study proposes a multi-class oral disease image classification system based on MobileViT, a lightweight hybrid architecture that combines local convolutions with global transformer self-attention. The model is evaluated on a large-scale dataset covering six disease classes: calculus, dental caries, gingivitis, aphthous ulcers, tooth discoloration, and hypodontia. Class imbalance is addressed with a WeightedRandomSampler combined with multi-level data augmentation using RandAugment and RandomErasing. The dataset is split 70:15:15 into training, validation, and test sets. On the test set, the proposed model achieves an accuracy of 93.61%, precision of 94.76%, recall of 93.61%, and an F1-score of 93.75%. An ablation study shows that combining augmentation with sample weighting improves the F1-score by 4.2 points over a baseline with neither treatment. MobileViT also outperforms the conventional architectures ResNet50, EfficientNetB0, and MobileNetV3. These results indicate that lightweight hybrid vision transformers can address both the representational limitations and the class imbalance that constrained earlier work on clinical oral disease classification.
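
The sample-weighting idea mentioned in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the class counts below are invented, the helper name `inverse_frequency_weights` is hypothetical, and inverse class frequency is only one common way to choose the weights. Each example receives a weight of one over the size of its class, so every class carries the same total weight and is drawn about equally often when sampling with replacement.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per example: 1 / (size of that example's class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Hypothetical, deliberately imbalanced label list (counts are invented for illustration).
labels = ["caries"] * 600 + ["gingivitis"] * 300 + ["hypodontia"] * 100
weights = inverse_frequency_weights(labels)

# Every class now has total weight 1.0, so each draw is equally likely
# to land in any class, regardless of how many images that class has.
rng = random.Random(0)
drawn = rng.choices(labels, weights=weights, k=6000)

print("original:", Counter(labels))
print("resampled:", Counter(drawn))
```

In PyTorch, a per-sample weight sequence of this kind is what `torch.utils.data.WeightedRandomSampler` takes as its `weights` argument, together with `num_samples` and `replacement`. Because minority-class images are then repeated many times within an epoch, pairing the sampler with randomized augmentation such as RandAugment and RandomErasing helps keep those repeats from being pixel-identical.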

Published
2026-06-24
How to Cite
Utomo, D. P., & Prasetyo, S. E. (2026). Dental and Oral Disease Image Classification Using MobileViT Architecture on Imbalanced Datasets. JURNAL TEKNOLOGI DAN OPEN SOURCE, 9(1), 324–334. https://doi.org/10.36378/jtos.v9i1.5723