World Country Clustering Based on Socioeconomic and Demographic Data of 2023 Using PCA and K-Means
Abstract
Social, economic, and demographic indicators are important measures of a country's progress. They reflect quality of life, economic conditions, and population dynamics, all of which shape policy and development planning. Clustering countries that share similar characteristics across these dimensions therefore helps in understanding each country's conditions. This study identifies clusters of countries worldwide from socioeconomic and demographic data for 2023 using Principal Component Analysis (PCA) and K-Means clustering. The analysis examines the relationship between GDP, birth rate, death rate, population, and CO2 emissions. The results reveal three clusters with distinct characteristics. Cluster 0 has high GDP, low infant mortality, and controlled CO2 emissions. Cluster 1 has lower GDP and high infant mortality, and faces challenges in the health and economic sectors. Cluster 2, which includes countries such as China, India, and the US, has high GDP but also high CO2 emissions. These findings indicate the need for integrated policies that improve global well-being by addressing economic, health, and environmental factors in a sustainable manner.
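The pipeline named in the abstract (reduce the indicators with PCA, then group the countries with K-Means into three clusters) can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic random values standing in for the five indicators, the helper names `standardize`, `pca`, and `kmeans` are hypothetical, and the standardization step and the choice of two principal components are assumptions not stated in the abstract. NumPy is used here in place of whatever tooling the study relied on.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # share of variance per component
    return Xc @ vt[:n_components].T, explained[:n_components]

def assign(X, centroids):
    """Return the index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm, initial centroids sampled from the data."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids

# Synthetic stand-in for the 2023 indicators (NOT real data).
rng = np.random.default_rng(42)
n_countries = 60
features = ["gdp", "birth_rate", "death_rate", "population", "co2_emissions"]
X = np.column_stack([
    rng.lognormal(mean=10, sigma=1.5, size=n_countries),  # gdp
    rng.uniform(8, 45, size=n_countries),                 # birth_rate
    rng.uniform(2, 15, size=n_countries),                 # death_rate
    rng.lognormal(mean=16, sigma=1.5, size=n_countries),  # population
    rng.lognormal(mean=10, sigma=2.0, size=n_countries),  # co2_emissions
])

Z = standardize(X)
scores, explained = pca(Z, n_components=2)   # two components: an assumption
labels, centroids = kmeans(scores, k=3)      # three clusters, as in the abstract

print("explained variance ratio:", np.round(explained, 3))
print("cluster sizes:", np.bincount(labels, minlength=3))
```

Clustering on the PCA scores rather than on the raw indicators keeps large-magnitude variables such as population and GDP from dominating the distance computation, which is one common reason for pairing the two methods.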
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed, 6 months after publication, under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of its authorship and initial publication in this journal.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (for example, posting it to an institutional repository or publishing it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (for example, in institutional repositories or on their website) before and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of the published work. (See The Effect of Open Access.)