Analysis of Splitting Criteria in Decision Tree and Random Forest for Vehicle Evaluation Classification

Arie Nugroho

doi:10.53624/jsitik.v1i1.154

Authors

Arie Nugroho, Universitas Nusantara PGRI Kediri

Keywords:

Splitting Criteria, Decision Tree, Random Forest, Classification

Abstract

Classification is one of the core topics in data mining. Algorithms or models used for classification include decision trees, k-nearest neighbors (K-NN), and Naïve Bayes. A decision tree is easy to understand because it can be visualized. Random forest is a classification model that extends the decision tree. The choice of splitting criterion in a decision tree or random forest can affect the resulting accuracy. This article compares splitting criteria for decision tree and random forest classifiers on vehicle evaluation data. Using a data split and cross-validation, with evaluation by confusion matrix, the choice of splitting criterion is shown to affect the accuracy of the resulting models.
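To make the notion of a splitting criterion concrete, the following is a minimal sketch of how two commonly used criteria, Gini impurity and entropy (information gain), score a candidate split on a categorical attribute. The abstract does not state which criteria the article compares, so the choice of these two is an assumption. The helper names (`gini`, `entropy`, `split_gain`) and the toy records are hypothetical and are not taken from the article or its dataset.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: minus the sum of p * log2(p) over classes."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(rows, labels, attribute, impurity):
    """Impurity reduction from a multiway split on one categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    n = len(labels)
    # Weighted average impurity of the child nodes after the split.
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - weighted


# Hypothetical toy records (assumption): illustrative only, not the article's data.
rows = [
    {"safety": "low", "persons": "2"},
    {"safety": "low", "persons": "4"},
    {"safety": "med", "persons": "2"},
    {"safety": "med", "persons": "4"},
    {"safety": "high", "persons": "2"},
    {"safety": "high", "persons": "4"},
]
labels = ["unacc", "unacc", "unacc", "acc", "unacc", "acc"]

for name, impurity in (("gini", gini), ("entropy", entropy)):
    for attribute in ("safety", "persons"):
        gain = split_gain(rows, labels, attribute, impurity)
        print(f"{name:8s} {attribute:8s} gain={gain:.4f}")
```

A tree builder would pick, at each node, the attribute with the largest gain under the chosen criterion. On this toy data both criteria prefer the same attribute, but on real data they can rank candidate splits differently, which is one way the criterion can influence the final accuracy.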
