Perbandingan Klasifikasi Penyakit Kanker Paru-Paru Menggunakan Decision Tree Dan Random Forest

Comparison of Lung Cancer Classification Using Decision Tree and Random Forest

Authors

  • Rabiatul Adawiyah Universitas PGRI Adi Buana Surabaya
  • Dwi Cahya Julia Kartikasari Universitas PGRI Adi Buana Surabaya

DOI:

https://doi.org/10.34010/kp5h2h96

Abstract

Lung cancer is the leading cause of cancer-related death across age groups, with risk factors that include smoking, air pollution, and chronic disease. It is characterized by the uncontrolled growth of cells in lung tissue, which can spread to other organs through metastasis. Machine learning-based classification can assist in early detection. This study compares the Decision Tree and Random Forest methods for classifying lung cancer on a dataset of seven attributes and 1,010 records. Missing values were handled with mode imputation. Feature importance analysis with Random Forest identified Coughing, Chronic Disease, Smoking, and Shortness of Breath as the most influential features. Without feature selection, Decision Tree achieved an accuracy of 64.85%, higher than Random Forest at 52.62%. After feature selection, Decision Tree accuracy fell to 55.94%, while Random Forest declined slightly to 52.47%. These findings indicate that Decision Tree captures the patterns in this data more effectively without feature selection, whereas Random Forest tends to perform less well on relatively small datasets.
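
The mode imputation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes categorical records stored as lists, with `None` marking a missing entry. The function name `impute_mode`, the column meanings, and the toy records are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter


def impute_mode(rows, missing=None):
    """Replace missing entries in each column with that column's most frequent value."""
    if not rows:
        return []
    n_cols = len(rows[0])
    modes = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not missing]
        # most_common(1) gives the value with the highest count;
        # a column with no observed values is left as missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else missing)
    return [
        [modes[c] if row[c] is missing else row[c] for c in range(n_cols)]
        for row in rows
    ]


# Toy records, made up for illustration: columns are (smoking, coughing).
toy = [[1, 1], [1, None], [None, 0], [0, 1], [1, 1]]
filled = impute_mode(toy)
print(filled)
```

Mode imputation suits categorical symptom flags like these because it always fills a gap with a value that actually occurs in the column, although it can reinforce the majority class when many entries are missing.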

Keywords – Machine Learning; Classification; Feature Importance; Entropy; Gain.
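
The keywords Entropy and Gain suggest that the Decision Tree splits were chosen with an entropy-based criterion, although this page does not give the details. As a hedged illustration, the sketch below computes the standard Shannon entropy of a label list and the information gain of one feature. The function names and the toy `coughing` and `cancer` values are assumptions for illustration, not data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy values, made up for illustration: 1 = present, 0 = absent.
coughing = [1, 1, 1, 0, 0, 0, 1, 0]
cancer = [1, 1, 1, 0, 0, 0, 0, 1]

base = entropy(cancer)
gain = information_gain(coughing, cancer)
print(f"entropy={base:.4f}, gain={gain:.4f}")
```

A tree builder in this style would compute the gain for every candidate attribute and split on the one with the largest value, then repeat on each resulting subset.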

Published

2025-05-03

How to Cite

[1] R. Adawiyah and D. C. J. Kartikasari, “Perbandingan Klasifikasi Penyakit Kanker Paru-Paru Menggunakan Decision Tree Dan Random Forest”, Komputika, vol. 14, no. 1, pp. 79–85, May 2025, doi: 10.34010/kp5h2h96.