Machine Learning-Based Hotel Booking Cancellation Prediction Using XGBoost
Abstract
The rapid growth of online booking platforms has significantly increased the availability of hotel reservation data, enabling data-driven decision-making in the hospitality industry. However, high hotel booking cancellation rates remain a major challenge, leading to revenue loss and inefficient resource utilization. Accurately predicting booking cancellations is therefore essential to support effective reservation and revenue management strategies. Motivated by the limitations of traditional statistical and basic machine learning approaches in handling complex and imbalanced booking data, this study proposes a machine learning-based hotel booking cancellation prediction model using Extreme Gradient Boosting (XGBoost). The main contribution of this research lies in the systematic application of XGBoost combined with comprehensive data preprocessing, class imbalance handling, and hyperparameter optimization to improve prediction accuracy and robustness. The proposed approach is evaluated using a publicly available hotel booking demand dataset and assessed through multiple performance metrics, including accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC-AUC). Experimental results demonstrate that the XGBoost model achieves strong and balanced classification performance in predicting both canceled and non-canceled bookings, outperforming conventional baseline methods reported in related studies. Despite the promising results, further improvements can be explored by incorporating additional contextual features and deploying explainable artificial intelligence techniques to enhance model transparency. Future work will also focus on real-time implementation and validation of the proposed model in operational hotel management systems to assess its effectiveness in dynamic booking environments.
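The evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made purely for illustration, with label 1 standing for a canceled booking.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = canceled."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive example is
    scored above a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true outcomes and predicted cancellation scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy alone can look favorable on imbalanced booking data, which is why precision, recall, and F1-score for the canceled class are reported alongside it. ROC-AUC is computed from the raw scores, so it does not depend on the chosen threshold.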
License
Copyright (c) 2025 ACSIE (International Journal of Application Computer Science and Informatic Engineering)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.