Comparison of TF-IDF and BoW in Shopee-Tokopedia Sentiment Analysis
DOI:
https://doi.org/10.34010/jamika.v16i1.17552
Keywords:
Sentiment Analysis, Bag of Words, Random Forest, SVM, TF-IDF
Abstract
The rapid growth of e-commerce in Indonesia has increased the volume of user reviews and opinions about services and products. This textual data contains valuable information that sentiment analysis can process to better understand user perceptions. This study compares the effectiveness of the Term Frequency–Inverse Document Frequency (TF-IDF) and Bag of Words (BoW) feature extraction methods in classifying user sentiment, and evaluates the performance of the Support Vector Machine (SVM) and Random Forest (RF) algorithms on reviews from the Shopee and Tokopedia platforms. A total of 5,000 user reviews were analyzed through text preprocessing, lexicon-based sentiment labeling, TF-IDF and BoW feature extraction, model training with SVM and RF, and performance evaluation using accuracy, precision, recall, and F1-score. The experimental results show that the combination of BoW and SVM achieved the highest accuracy, 90%, on Shopee reviews, making it the best-performing configuration in this study. On Tokopedia reviews, the same configuration (BoW and SVM) also produced a strong accuracy of 88%. Overall, SVM showed more stable performance than RF, and BoW, reaching up to 90% accuracy, proved more effective than TF-IDF at representing this Indonesian-language e-commerce review data. These findings contribute to the development of more accurate sentiment analysis systems in the local e-commerce domain.
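To make the difference between the two feature extraction methods concrete, the following is a minimal sketch in plain Python. It is illustrative only: the toy reviews, the whitespace `tokenize` helper, the unsmoothed `log(N / df)` weighting, and all function names are assumptions made for this example, not the study's code. The abstract does not state the exact preprocessing steps, the TF-IDF variant, or the classifier settings. The sketch also shows how the four reported metrics are computed for one positive class.

```python
import math
from collections import Counter

# Hypothetical toy reviews, not taken from the study's dataset.
reviews = [
    "good product fast delivery",
    "broken product slow delivery",
    "good app very good",
]

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real preprocessing."""
    return text.lower().split()

def build_vocabulary(docs):
    """Sorted list of the distinct tokens found across all documents."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def bow_vector(doc, vocab):
    """Bag of Words: raw count of each vocabulary term in the document."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]

def idf_weights(docs, vocab):
    """Unsmoothed IDF, log(N / df); libraries often use smoothed variants."""
    n = len(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    return [math.log(n / sum(term in s for s in token_sets)) for term in vocab]

def tfidf_vector(doc, vocab, idf):
    """TF-IDF: each term count scaled by that term's IDF weight."""
    return [tf * w for tf, w in zip(bow_vector(doc, vocab), idf)]

def classification_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

vocab = build_vocabulary(reviews)
idf = idf_weights(reviews, vocab)

# Compare both representations of the third review.
print(vocab)
print(bow_vector(reviews[2], vocab))
print([round(x, 3) for x in tfidf_vector(reviews[2], vocab, idf)])

# Hypothetical labels and predictions, only to exercise the metric function.
y_true = ["positive", "negative", "positive", "positive"]
y_pred = ["positive", "positive", "positive", "negative"]
print(classification_metrics(y_true, y_pred))
```

In this sketch, BoW keeps the raw count of "good" in the third review, while TF-IDF down-weights it because "good" also appears in another review. Either kind of vector could then be passed to a classifier such as SVM or RF.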