Support Vector Machine for Classifying Prostate Cancer Data
Abstract
Prostate cancer is one of the most prevalent cancers among men worldwide, making early detection and accurate classification essential for improving patient outcomes. This study investigates Support Vector Machine (SVM) models for classifying prostate cancer from clinical and demographic data. Prostate-specific antigen (PSA) levels, Gleason scores, tumor stage, and patient age were used as features to train and evaluate the model. Preprocessing included handling missing values, normalizing features, and addressing class imbalance with the Synthetic Minority Over-sampling Technique (SMOTE). The SVM model, optimized with a radial basis function (RBF) kernel, achieved an accuracy of 94.2%, and its precision, recall, and F1-scores indicated reliable classification of both cancerous and non-cancerous cases. However, the results also reveal difficulties in classifying the minority class, emphasizing the need for better handling of imbalanced datasets. SHAP (SHapley Additive exPlanations) was used to provide interpretable insights into the model’s predictions, identifying PSA levels and Gleason scores as the most influential features. This research demonstrates the potential of SVM in prostate cancer classification and provides a foundation for integrating machine learning models into clinical workflows for improved diagnostic precision and patient care.
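Two ideas in the abstract can be made concrete with a short sketch: how SMOTE creates synthetic minority samples, and why per-class precision, recall, and F1 are reported alongside accuracy. The code below is a minimal illustration in standard-library Python, not the authors' implementation. The function names (`smote`, `precision_recall_f1`), the parameter defaults, and the toy values are all assumptions made for the example. It picks base samples at random, a common simplification of the original procedure, and it assumes features have already been normalized, since the neighbour search is distance-based.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    Simplified sketch of the idea in Chawla et al. (2002)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical, already-normalized two-feature minority samples.
    minority = [[0.10, 0.20], [0.15, 0.25], [0.30, 0.10], [0.20, 0.40]]
    for point in smote(minority, n_synthetic=3, k=2, seed=42):
        print([round(v, 3) for v in point])

    # Hypothetical labels: a model can be accurate overall and still
    # miss many minority-class (label 1) cases.
    y_true = [0] * 18 + [1] * 2
    y_pred = [0] * 18 + [1, 0]
    print(precision_recall_f1(y_true, y_pred, positive=1))
```

Each synthetic point lies on the line segment between two real minority samples, so oversampling adds no information beyond what the minority class already contains. In practice, oversampling is usually applied to the training split only, so that evaluation metrics reflect the original class distribution.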

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.