COMPARISON ACCURACY OF C4.5 ALGORITHM AND K-NEAREST NEIGHBORS FOR RAINFALL CLASSIFICATION

Authors

  • Muhammad Fauzan Nasrullah, Telkom University, Indonesia
  • RD. Rohmat Saedudin, Telkom University, Indonesia
  • Faqih Hamami, Telkom University, Indonesia

DOI:

https://doi.org/10.5281/zenodo.14715070

Keywords:

Rainfall, Climate Indonesia, Classification, C4.5, K-Nearest Neighbor, Data Mining

Abstract

Indonesia has a predominantly tropical climate, so temperature varies little while rainfall varies widely. This rainfall variability affects many aspects of human life and business activity, which makes rainfall information important for decision making. Producing that information requires suitable analysis stages and methods. This study therefore compares two data mining algorithms, C4.5 and K-Nearest Neighbors, to find the better method for classifying rainfall data. Both algorithms are used to build classification models based on relevant attributes, and both models are then tested and evaluated using Accuracy, Precision, Recall, and F1-Score. Hyperparameter tuning with the RandomizedSearchCV method is also applied to find the parameters that give the highest accuracy. The results show good accuracy for both algorithms, meaning that both can classify rainfall based on Indonesia's climate well. With default parameters, C4.5 achieves the higher accuracy at 81.42%, compared with 78.10% for K-Nearest Neighbors. After applying the best parameters found by RandomizedSearchCV, the accuracy of K-Nearest Neighbors rises substantially to 83.37%, while that of C4.5 rises to 82.56%.
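
The evaluation workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, a from-scratch K-Nearest Neighbors classifier, and a simplified random search over the number of neighbors that mimics the idea behind RandomizedSearchCV (sample a few candidate settings, score each by cross-validation, keep the best). The synthetic two-feature data, the class labels, the candidate values of k, the fold scheme, and the choice of macro-averaged metrics are all assumptions made for illustration. C4.5 is not included.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1 (averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cv_accuracy(X, y, k, folds=3):
    """Mean validation accuracy of KNN with k neighbors over simple interleaved folds."""
    scores = []
    for f in range(folds):
        val_idx = [i for i in range(len(X)) if i % folds == f]
        tr_idx = [i for i in range(len(X)) if i % folds != f]
        tr_X = [X[i] for i in tr_idx]
        tr_y = [y[i] for i in tr_idx]
        hits = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in val_idx)
        scores.append(hits / len(val_idx))
    return sum(scores) / folds


def random_search_k(X, y, candidates, n_iter, rng):
    """Sample n_iter candidate values of k at random and keep the best by CV accuracy."""
    tried = rng.sample(candidates, n_iter)
    return max(tried, key=lambda k: cv_accuracy(X, y, k))


# Synthetic stand-in for rainfall data: three hypothetical classes, two hypothetical features.
rng = random.Random(0)
centers = {"low": (0.0, 0.0), "medium": (3.0, 3.0), "high": (6.0, 0.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(40):
        X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        y.append(label)

# Shuffle, then hold out 25% of the samples for testing.
idx = list(range(len(X)))
rng.shuffle(idx)
split = int(0.75 * len(idx))
train_X = [X[i] for i in idx[:split]]
train_y = [y[i] for i in idx[:split]]
test_X = [X[i] for i in idx[split:]]
test_y = [y[i] for i in idx[split:]]

candidates = list(range(1, 16, 2))  # odd k values; an illustrative search space
best_k = random_search_k(train_X, train_y, candidates, n_iter=4, rng=rng)
preds = [knn_predict(train_X, train_y, x, best_k) for x in test_X]
report = macro_metrics(test_y, preds)
print("best k:", best_k)
print(report)
```

Because only a random subset of candidates is scored, the search is cheaper than an exhaustive grid but may miss the best setting. Tuning on cross-validated training data and reporting metrics on a held-out test set keeps the reported accuracy from being inflated by the search itself.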

Published

31-07-2024

Section

Articles

How to Cite

COMPARISON ACCURACY OF C4.5 ALGORITHM AND K-NEAREST NEIGHBORS FOR RAINFALL CLASSIFICATION. (2024). SITEKNIK: Sistem Informasi, Teknik Dan Teknologi Terapan, 1(2), 90-101. https://doi.org/10.5281/zenodo.14715070
