K-Nearest Neighbors Approach to Classify Diabetes Risk Categories

Kadek Gemilang Santiyuda

Abstract

Diabetes is a prevalent chronic disease that poses significant challenges worldwide, so accurate and early detection of risk categories is needed to improve management and prevention strategies. This research evaluates the K-Nearest Neighbors (KNN) algorithm for classifying diabetes risk categories using the Pima Indian Diabetes dataset. The study applies preprocessing steps, including handling missing values, normalization, and feature engineering, to prepare the dataset for KNN's distance-based calculations. Hyperparameter tuning and a comparison of distance metrics, such as Euclidean and Manhattan, are carried out to improve model accuracy. The KNN model achieves a moderate accuracy of 66%, with a precision of 0.52 and a recall of 0.58 for the diabetic class, which suggests that it captures general patterns but has limited ability to handle imbalanced datasets. The research identifies glucose level and BMI as key predictors and emphasizes the importance of balanced datasets and advanced feature selection techniques. Recommendations for future work include integrating additional clinical features and hybrid models to improve diagnostic accuracy and applicability in clinical settings. The study underscores KNN's potential as a foundational tool in machine learning for medical diagnostics, contributing to the broader effort to improve healthcare outcomes through data-driven decision-making.
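
As a rough illustration of the pipeline the abstract describes (normalization followed by KNN voting under Euclidean or Manhattan distance), a minimal pure-Python sketch might look like the following. It is not the study's implementation: the function names, the choice of min-max scaling, the value k=3, and the six toy (glucose, BMI) rows are all assumptions made for the example, and none of the numbers come from the Pima Indian Diabetes dataset.

```python
from collections import Counter
import math


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    # A constant column has zero span; fall back to 1.0 to avoid dividing by zero.
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]
    scaled = [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]
    return scaled, lows, spans


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3, metric=euclidean):
    """Return the majority label among the k training points nearest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (glucose, BMI) rows, invented purely for illustration.
raw_x = [[85, 22.0], [90, 24.5], [95, 23.0], [150, 33.0], [165, 35.5], [170, 31.0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = diabetic, 0 = non-diabetic

train_x, lows, spans = min_max_scale(raw_x)


def scale_query(row):
    """Apply the training-set scaling parameters to a new sample."""
    return [(v - lo) / s for v, lo, s in zip(row, lows, spans)]


pred_euclidean = knn_predict(train_x, labels, scale_query([160, 34.0]), k=3, metric=euclidean)
pred_manhattan = knn_predict(train_x, labels, scale_query([88, 23.5]), k=3, metric=manhattan)
```

Scaling the query with the training minimum and span, rather than rescaling it on its own, keeps new samples in the same coordinate system as the stored neighbors. In practice k and the distance metric would be chosen by cross-validation rather than fixed by hand as they are here.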

How to Cite
Santiyuda, K. (2025). K-Nearest Neighbors Approach to Classify Diabetes Risk Categories. JSIKTI : Jurnal Sistem Informasi Dan Komputer Terapan Indonesia, 7(2), 74-83. https://doi.org/10.33173/jsikti.197
