Hypertension Classification Using HistGradientBoostingClassifier, HealthD, And Model Optimization
Abstract
Hypertension is among the world's most common cardiovascular conditions and carries serious risks such as stroke and heart attack. Despite progress in medical testing, early detection remains difficult because lifestyle and hereditary factors interact in complex ways. This study addresses accurate hypertension prediction using machine learning methods suited to heterogeneous health data. Motivated by the growing demand for evidence-based preventive care, it applies the HistGradientBoostingClassifier to a dataset of 1,985 patient records with eleven lifestyle and physiological indicators, including age, body mass index, sleep duration, sodium intake, and stress level. The main contribution is the use of histogram-based gradient boosting, which handles mixed feature types efficiently and limits overfitting through early stopping and regularization. Evaluation shows that the model reaches 97% accuracy, with balanced precision, recall, and F1-score across the hypertensive and non-hypertensive classes. These results indicate that the model is reliable and suitable for integration into early-warning tools for hypertension risk assessment. Future work will examine model interpretability through SHAP analysis and compare boosting classifiers with neural network methods to improve understanding and adaptability in practical clinical settings.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.