Optimizing Loan Eligibility Prediction with Resampling Techniques and Boosting Algorithms (Optimasi Prediksi Kelayakan Pinjaman dengan Teknik Resampling dan Algoritma Boosting)

Authors

  • Muhammad Ricky Perdana Putra, Universitas Amikom Yogyakarta
  • Siti Juwariyah, Universitas Amikom Yogyakarta
  • Muhammad Ridwan, Universitas Amikom Yogyakarta
  • Robert Marco, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.34010/komputika.v14i2.15485

Abstract

Loan eligibility assessment is a crucial element of financial risk mitigation: it aims to minimize losses from bad debts and to ensure that resources are allocated properly. Traditional rule-based approaches are limited by poor scalability, the risk of subjective bias, and the difficulty of managing complex data. Machine Learning (ML) offers a solution because it can analyze complex patterns in historical data. Two significant obstacles remain, however: class imbalance, where defaulted borrowers are far fewer than borrowers who are current on their payments, and missing values in the dataset. This study evaluates the SMOTE and SMOTE-ENN resampling methods to address class imbalance and uses mean imputation to handle missing values. Among the boosting algorithms evaluated (Gradient Boosting, XGBoost, LightGBM, AdaBoost, and CatBoost), the combination of CatBoost with SMOTE-ENN resampling achieves the highest prediction accuracy, 91.67%. This finding supports the significant potential of ML to improve the accuracy, efficiency, and fairness of predictions, and contributes to the development of data-driven decision-making systems in the financial sector.
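
The preprocessing steps named in the abstract (mean imputation, SMOTE oversampling, and ENN cleaning) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `impute_mean`, `smote`, and `enn_clean` are hypothetical, the toy data are invented, and the sketch uses only the Python standard library instead of whatever packages the study relied on. The ENN step is assumed to use a simple majority vote among the k nearest neighbours, and the boosting classifier that would be trained afterwards is omitted.

```python
import random
from statistics import mean


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [[col_means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic samples, each placed on the line segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def enn_clean(X, y, k=3):
    """Edited Nearest Neighbours (majority-vote variant, an assumption here):
    drop every sample whose label disagrees with most of its k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        nearest = sorted((j for j in range(len(X)) if j != i),
                         key=lambda j: sq_dist(xi, X[j]))[:k]
        agreeing = sum(1 for j in nearest if y[j] == yi)
        if agreeing * 2 > k:
            keep_X.append(xi)
            keep_y.append(yi)
    return keep_X, keep_y


# Invented toy data: two features per borrower, None marks a missing value.
rows = [
    [5.0, 1.0], [6.0, None], [5.5, 1.2], [6.5, 0.8], [5.8, 1.1], [6.2, 0.9],  # label 0
    [1.0, 3.0], [1.5, None], [2.0, 3.4],                                      # label 1
]
labels = [0] * 6 + [1] * 3  # 1 = defaulted borrower (the minority class)

X = impute_mean(rows)
minority = [x for x, label in zip(X, labels) if label == 1]
n_new = labels.count(0) - labels.count(1)  # oversample until the classes match
synthetic = smote(minority, n_new, k=2, rng=random.Random(42))

X_res = X + synthetic
y_res = labels + [1] * n_new
X_clean, y_clean = enn_clean(X_res, y_res, k=3)

print("class counts after SMOTE:", y_res.count(0), y_res.count(1))
print("samples kept after ENN:", len(X_clean))
```

In this sketch the imputation means are computed over all rows for brevity. In a real evaluation the imputation statistics and the resampling would normally be fitted on the training split only, so that the reported test accuracy is not inflated by leakage.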

Published

2025-11-24

How to Cite

[1] “Optimasi Prediksi Kelayakan Pinjaman dengan Teknik Resampling dan Algoritma Boosting”, Komputika, vol. 14, no. 2, Nov. 2025, doi: 10.34010/komputika.v14i2.15485.