The Use of the K-Means Algorithm in Analyzing E-Commerce Consumer Segmentation: A Case Study of the Online Retail Dataset (UK)

  • Ardo Kusdaryanto, Esa Unggul University
  • Christoporus Dimas Wijanarko, Computer Science Faculty, Esa Unggul University, Bekasi Utara, Indonesia
  • Paskalis Dwi Widyantara Usat, Computer Science Faculty, Esa Unggul University, Bekasi Utara, Indonesia
  • Ary Prabowo, Computer Science Faculty, Esa Unggul University, Bekasi Utara, Indonesia
Keywords: Customer Segmentation, E-Commerce, Data Mining, K-Means, Online Retail

Abstract

This study analyzes consumer segmentation on e-commerce platforms using the K-Means algorithm as the primary clustering method. Using the Online Retail (UK) dataset, which contains transaction records from a UK-based online retail company, the research identifies behavioral patterns among consumers. Several key variables, including purchase frequency, total transaction value, and recency (time of the most recent visit), are processed to create clusters that represent different types of consumer behavior. Before K-Means is applied, the data pass through a series of preprocessing steps (data cleaning, feature selection, and normalization) to ensure accurate clustering results. Once the clusters are formed, each consumer group is analyzed to determine its characteristics, purchasing tendencies, and potential value to the business. The segmentation results help businesses develop targeted marketing strategies and personalized service offerings. By understanding the preferences and behaviors within each cluster, companies can optimize promotional efforts, improve customer retention, and enhance the overall user experience. The findings indicate that data-driven segmentation using the K-Means algorithm is an effective approach for gaining actionable insights into consumer behavior, thereby supporting more strategic decision-making in the e-commerce environment.
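The pipeline described in the abstract (cleaning, deriving recency, frequency, and monetary features, normalization, then K-Means) can be illustrated with a minimal sketch. This is not the authors' implementation: the transaction rows are made up, the snapshot date, the choice of z-score normalization, k = 2, and all function names are assumptions chosen for illustration, and only the Python standard library is used.

```python
from datetime import date
from statistics import mean, pstdev
import random

# Hypothetical toy transactions: (customer_id, invoice_date, amount).
# These rows are invented; the real dataset has more columns and rows.
TRANSACTIONS = [
    ("C1", date(2011, 12, 1), 500.0), ("C1", date(2011, 12, 5), 450.0),
    ("C1", date(2011, 11, 20), 600.0), ("C2", date(2011, 12, 3), 550.0),
    ("C2", date(2011, 11, 28), 480.0), ("C2", date(2011, 12, 7), 520.0),
    ("C3", date(2011, 3, 1), 20.0), ("C4", date(2011, 2, 15), 35.0),
    ("C5", date(2011, 4, 10), 15.0), ("C5", date(2011, 1, 5), -15.0),
    (None, date(2011, 6, 1), 99.0),
]
SNAPSHOT = date(2011, 12, 10)  # assumed reference date for recency


def build_rfm(rows, snapshot):
    """Return {customer: [recency_days, frequency, monetary]} after cleaning."""
    rfm = {}
    for cust, day, amount in rows:
        # Cleaning step: drop rows with no customer ID or a non-positive amount.
        if cust is None or amount <= 0:
            continue
        rec, freq, mon = rfm.get(cust, [float("inf"), 0, 0.0])
        rfm[cust] = [min(rec, (snapshot - day).days), freq + 1, mon + amount]
    return rfm


def standardize(vectors):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*vectors))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in vectors]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with randomly sampled initial centroids."""
    centroids = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels


rfm = build_rfm(TRANSACTIONS, SNAPSHOT)
customers = sorted(rfm)
labels = kmeans(standardize([rfm[c] for c in customers]), k=2)
segments = dict(zip(customers, labels))

for cust in customers:
    print(cust, rfm[cust], "-> cluster", segments[cust])
```

On this toy data the recent, frequent, high-spending customers (C1, C2) end up in one cluster and the lapsed, low-spending customers (C3, C4, C5) in the other. The cluster numbers themselves are arbitrary, so segments should be interpreted from their average recency, frequency, and monetary values rather than from their labels.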

 



Published
2025-12-02
How to Cite
Kusdaryanto, A., Wijanarko, C. D., Widyantara Usat, P. D., & Prabowo, A. (2025). The Use of the K-Means Algorithm in Analyzing E-Commerce Consumer Segmentation: A Case Study of the Online Retail Dataset (UK). JURNAL TEKNOLOGI DAN OPEN SOURCE, 8(2), 612-622. https://doi.org/10.36378/jtos.v8i2.4798