[P] ARIMA vs LSTM – Forecasting Weekly Hotel Cancellations
For a while now, I’ve been working on a side project to forecast weekly hotel cancellations (original data and authors available here).
While the original intent of this research was to identify the drivers of such cancellations and predict whether customers would cancel using classification (i.e. cancelling customer = 1, non-cancelling customer = 0), I wanted to investigate whether time series forecasting could be a good addition to this study.
The first step was data manipulation with pandas: grouping the cancellations by week and then summing them to get the total number of cancellations per week.
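For anyone curious what that aggregation looks like, here is a minimal sketch on toy data. The column names (`arrival_year`, `arrival_week`, `is_canceled`) and the rows are assumptions for illustration, not the actual schema or values of the hotel datasets.

```python
import pandas as pd

# Toy booking records; column names and values are hypothetical.
bookings = pd.DataFrame({
    "arrival_year": [2015, 2015, 2015, 2015, 2016, 2016],
    "arrival_week": [27, 27, 28, 28, 1, 1],
    "is_canceled":  [1, 0, 1, 1, 0, 1],  # 1 = cancelled, 0 = not cancelled
})

# Group by (year, week) so week 1 of 2016 sorts after week 28 of 2015,
# then sum the 0/1 flag to count cancellations in each week.
weekly = (
    bookings
    .groupby(["arrival_year", "arrival_week"], sort=True)["is_canceled"]
    .sum()
    .reset_index(name="cancellations")
)

print(weekly)
```

Grouping on year and week together, rather than week number alone, keeps the same week number from different years from being merged into one bucket.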
Following this, I decided to use both ARIMA and LSTM to predict future cancellations across the test set. This was done for two separate hotel datasets (H1 and H2).
Interestingly, I found that LSTM performed better on the more volatile dataset (H2), while ARIMA was more accurate on the dataset with a smoother trend (H1).
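One common way to compare two sets of forecasts against the same test set is root mean squared error (RMSE). The sketch below assumes RMSE as the metric, which is my choice for illustration and may not match what the linked articles use. The numbers and the names `forecast_a` and `forecast_b` are made up and are not results from H1 or H2.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical weekly cancellation counts and two hypothetical forecasts.
actual     = [100, 120, 90, 110]
forecast_a = [105, 115, 95, 105]   # every forecast is off by 5
forecast_b = [110, 130, 80, 120]   # every forecast is off by 10

print("model A RMSE:", rmse(actual, forecast_a))
print("model B RMSE:", rmse(actual, forecast_b))
```

The lower RMSE indicates the closer fit on that test set, so the same comparison can be run separately per dataset.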
Ultimately, this project reinforced for me that machine learning models like LSTM are just like any other model: they are not necessarily suitable for every situation, and you need to understand the data you are working with before selecting a model.
If you’re interested in the findings, feel free to take a closer look. It is a three-part study; the relevant links are below:
– ARIMA Forecasts (first half of the article covers classification with SVM)
Hope you find this of use, and grateful for any feedback!