Comparison of KNN and Random Forest Methods for Tuberculosis Classification Based on Patients' Clinical Symptoms

SELAN, Cory Alvania (2026) Perbandingan Metode KNN dan Random Forest untuk Klasifikasi Penyakit Tuberkulosis Berdasarkan Gejala Klinis Pasien. Undergraduate thesis, Universitas Katolik Widya Mandira.

Abstract

Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis and remains one of the leading causes of death worldwide, including in Indonesia. The disease primarily affects the lungs and requires rapid, accurate diagnosis. However, shortages of medical personnel and laboratory facilities, particularly in rural primary healthcare centers, often delay diagnosis. A decision support system based on clinical symptom data is therefore needed to assist early diagnosis.

This study aims to develop a TB classification model using patients' clinical symptom data. Features were selected with the Random Forest algorithm based on Gini Importance, and the performance of Random Forest was compared with that of the K-Nearest Neighbors (KNN) algorithm using feature sets of 10 and 15 features under 5-Fold and 10-Fold Cross Validation schemes. Model performance was evaluated using the Accuracy, Precision, Recall, and F1-Score metrics.

The training results indicate that Random Forest achieved the best performance, with an Accuracy of 99.0093%, Precision of 99.0584%, Recall of 99.0183%, and F1-Score of 99.0088% using 15 features and 5-Fold Cross Validation. Testing on real data using the Majority Voting method produced an Accuracy of 98.00%, with 48 out of 50 test samples correctly classified. Based on these results, Random Forest is identified as the more effective of the two methods and is considered reliable for TB classification in primary healthcare facilities.
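
The evaluation steps named in the abstract (Majority Voting followed by Accuracy, Precision, Recall, and F1-Score) can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not say what is voted over, so the sketch assumes each test sample receives one predicted label from each of several trained models. The function names `majority_vote` and `binary_metrics`, the labels "TB" and "non-TB", and the four sample patients are all hypothetical, and the metrics are computed for a single positive class, which may differ from the averaging used in the thesis.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label that occurs most often among one sample's votes."""
    # With an odd number of voters and two classes, ties cannot occur.
    return Counter(labels).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive="TB"):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical votes from three models for four patients (not thesis data).
votes = [
    ["TB", "TB", "non-TB"],
    ["non-TB", "non-TB", "non-TB"],
    ["TB", "non-TB", "non-TB"],
    ["TB", "TB", "TB"],
]
y_true = ["TB", "non-TB", "TB", "TB"]

# One final label per patient, then the four metrics on those labels.
y_pred = [majority_vote(v) for v in votes]
metrics = binary_metrics(y_true, y_pred)
print(y_pred)
print(metrics)
```

Under this standard definition of accuracy (correct predictions divided by total predictions), 48 correct out of 50 corresponds to 96%, whereas 98.00% would correspond to 49 out of 50. The abstract does not indicate which of the two figures is intended.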

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Tuberculosis, Clinical Symptoms, Random Forest, KNN, Classification, Cross Validation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Fakultas Teknik > Program Studi Ilmu Komputer
Depositing User: CORY ALVANIA SELAN
Date Deposited: 13 Mar 2026 01:52
Last Modified: 13 Mar 2026 01:52
URI: http://repositori.unwira.ac.id/id/eprint/24221
