The aim of the project is to forecast monthly admissions to Singapore public acute adult hospitals. The admissions were treated as a hierarchical time series. Admissions were forecasted at each level. Every country has a hierarchical order to its public hospitals. In Singapore, there are 3 levels:
National level
|-- Cluster level (Clusters are networks of hospitals grouped by geographical region. There are 3 health clusters in Singapore: NUHS, NHG and SHS.)
|---- Hospital level (There are 8 public acute adult hospitals: NTFGH, NUH, AH, TTSH, KTPH, SGH, SKH and CGH.)
Forecasting admissions at the hospital level can help hospital managers plan manpower deployment for predicted peak periods. Forecasting at higher levels, such as the cluster or national level, can help senior management and policy planners develop better strategies to deal with high and low periods and review other strategies to reduce the admission rate. A manageable admission rate helps to ensure clinicians have sufficient time to review their patients.
Both classical and machine learning approaches were adopted for forecasting. The best model was an ensemble of retuned Random Forest and retuned Prophet Boost with a 9:1 weighting. This model's accuracy was:
Level | RMSE (testing set) | MAE (testing set) |
---|---|---|
Across all levels | 535 | 412 |
National | 949 | 789 |
Cluster | 657 | 528 |
Hospital | 393 | 321 |
While a smaller RMSE is favoured in general, care is needed when comparing RMSE across levels because the magnitude of admissions differs for each hierarchical level. The superordinate levels have more admissions, so a larger RMSE can be expected there.
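To make the scale dependence concrete, here is a minimal sketch of the two metrics. The project itself is written in R; plain Python is used here only to keep the example self-contained. The admission figures are hypothetical, not taken from the project data: both series are forecast with the same 5% relative error, yet the national errors are ten times larger in absolute terms.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly admissions (illustration only).
hospital_actual = [3000, 3200, 3100]
hospital_forecast = [2850, 3360, 3255]      # each forecast is 5% off
national_actual = [30000, 32000, 31000]
national_forecast = [28500, 33600, 32550]   # also 5% off

hospital_rmse = rmse(hospital_actual, hospital_forecast)
national_rmse = rmse(national_actual, national_forecast)
hospital_mae = mae(hospital_actual, hospital_forecast)
national_mae = mae(national_actual, national_forecast)

print(round(hospital_rmse, 1), round(national_rmse, 1))
print(hospital_mae, national_mae)
```

Both metrics are in the units of the series (admissions per month), which is why they are best compared within a level rather than across levels.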
The datasets, model outputs and key objects are housed in this GitHub repository. The rest of this README outlines the project; more details can be found in my blog.
The dataset is monthly admissions to Singapore public acute adult hospitals. It starts in Jan 2016 and ends in Feb 2021. The forecast horizon was 10 months, i.e. forecasting to the end of 2021. The training set was from Jan 16 to Apr 20 (4 years, 4 months) and the testing set was from May 20 to Feb 21 (10 months).
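The window lengths can be checked by counting months. The sketch below is an illustration in plain Python (the project itself is in R); the helper `month_range` is a hypothetical name, not something from the project code.

```python
from datetime import date

def month_range(start, end):
    """List the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

all_months = month_range(date(2016, 1, 1), date(2021, 2, 1))
split = date(2020, 5, 1)                      # first month of the testing set

train = [m for m in all_months if m < split]  # Jan 2016 to Apr 2020
test = [m for m in all_months if m >= split]  # May 2020 to Feb 2021
future = month_range(date(2021, 3, 1), date(2021, 12, 1))  # forecast horizon

print(len(train), len(test), len(future))     # 52 10 10
```

The split is by time rather than at random, so the testing set always lies after the training set and has the same length as the forecast horizon.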
Trends, seasonality, anomalies, lags and correlation of time series features and statistics were explored and analysed.
- In general, the number of admissions increased until the first half of 2020, the peak of the Covid pandemic. After the peak, admissions to KTPH and NTFGH did not return to pre-peak numbers.
- The number of admissions to SKH increased markedly during 2018 as the new hospital fully opened its entire campus.
- There are fewer admissions in Feb, likely for two reasons. First, Feb is the shortest month. Second, Chinese Lunar New Year tends to fall in Feb.
- There are more admissions in the final quarter of the year, mostly in Oct and Dec.
- Most of the anomalies detected occurred during the peak of the COVID-19 pandemic, from Jan 20 to Jul 20.
- The anomaly in 2018 came from SKH and was not observed in other hospitals or at the more aggregated cluster level. The anomaly was likely due to the change in the number of admissions before and after the hospital was opened in Jul 18.
- Correlation of the 48 time series features and statistics was conducted, as 48 is a large number of variables to analyse individually. The correlation also shows the associative relationships between the features. Correlation was done for each level as well as for all levels together, because admissions at superordinate levels would have some correlation with admissions at the subordinate levels.
- The spread in the correlations revealed the heterogeneity of the features: they identify a variety of time series traits.
- PCA was conducted to condense the information, as most features have moderate correlation with each other.
- The first 5 principal components captured 88% of the variance.
For the classical approach, 3 hierarchical forecasting techniques were used:
- Bottom-up (`bu`)
- Reconciliation using ordinary least squares (`ols`)
- Reconciliation using minimum trace with sample covariance (`mint_cov`)
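The first two techniques can be sketched with a summing matrix S that maps the 8 hospital series to all 12 series (1 national, 3 cluster, 8 hospital). Bottom-up computes S b from the hospital base forecasts b, while OLS reconciliation projects all 12 base forecasts ŷ onto the coherent space as S (SᵀS)⁻¹ Sᵀ ŷ. This is a minimal plain-Python illustration, not the project's R code: the cluster membership below is an assumption (the README lists the clusters and hospitals but not the mapping), the forecast values are hypothetical, and `mint_cov` is not covered because it additionally needs the covariance of the base forecast errors.

```python
# Assumed cluster membership (not stated explicitly in this README).
CLUSTERS = {
    "NUHS": ["NTFGH", "NUH", "AH"],
    "NHG": ["TTSH", "KTPH"],
    "SHS": ["SGH", "SKH", "CGH"],
}
HOSPITALS = [h for members in CLUSTERS.values() for h in members]

def summing_matrix():
    """Rows: national, then clusters, then hospitals. Columns: hospitals."""
    rows = [[1.0] * len(HOSPITALS)]
    for members in CLUSTERS.values():
        rows.append([1.0 if h in members else 0.0 for h in HOSPITALS])
    for i in range(len(HOSPITALS)):
        rows.append([1.0 if j == i else 0.0 for j in range(len(HOSPITALS))])
    return rows

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def bottom_up(hospital_forecasts):
    """Aggregate hospital forecasts upwards: y = S b."""
    return matvec(summing_matrix(), hospital_forecasts)

def ols_reconcile(base_forecasts):
    """Project base forecasts of all 12 series: y = S (S'S)^-1 S' y_hat."""
    S = summing_matrix()
    m = len(S[0])
    StS = [[sum(row[i] * row[j] for row in S) for j in range(m)] for i in range(m)]
    Sty = [sum(row[i] * y for row, y in zip(S, base_forecasts)) for i in range(m)]
    return matvec(S, solve(StS, Sty))

# Hypothetical one-month base forecasts (illustration only).
hospital_base = [1000.0, 2000.0, 500.0, 2500.0, 1500.0, 4000.0, 1800.0, 2700.0]
base = [16500.0, 3600.0, 4100.0, 8400.0] + hospital_base  # upper levels disagree

bu = bottom_up(hospital_base)
reconciled = ols_reconcile(base)
print(bu[:4])
print([round(v, 1) for v in reconciled[:4]])
```

Bottom-up ignores the national and cluster base forecasts entirely, whereas OLS reconciliation uses information from every level and adjusts all 12 forecasts so that they add up.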
Base models for the above techniques included:
- ETS
- ARIMA
- ARIMA with Covid (peak period) as regressor
- ARIMA with Covid regressor with 1 month lag
- ARIMA with Covid regressor with 2 month lag
- ARIMA with Covid regressor with 3 month lag
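The Covid regressor variants differ only in how far the dummy variable is shifted in time. The sketch below shows one way to build such regressors in plain Python (the project itself is in R). The peak window of Jan 2020 to Jul 2020 is an assumption borrowed from the anomaly period noted earlier, and the helpers `covid_dummy` and `lag` are hypothetical names.

```python
# Months as (year, month) tuples, Jan 2016 to Feb 2021.
months = [(y, m) for y in range(2016, 2022) for m in range(1, 13)
          if (y, m) <= (2021, 2)]

def covid_dummy(months, peak_start, peak_end):
    """1 for months inside the peak window, 0 otherwise."""
    return [1 if peak_start <= month <= peak_end else 0 for month in months]

def lag(values, k, fill=0):
    """Shift a series k steps later: the value at t becomes the original at t-k."""
    if k == 0:
        return list(values)
    return [fill] * k + list(values[:-k])

# Assumed peak window (not confirmed by the README).
dummy = covid_dummy(months, (2020, 1), (2020, 7))

# One regressor per variant: no lag, then 1, 2 and 3 month lags.
regressors = {k: lag(dummy, k) for k in range(4)}

for k, series in regressors.items():
    first = months[series.index(1)]
    print(f"lag {k}: regressor switches on in {first[0]}-{first[1]:02d}")
```

A lagged regressor lets the model capture a delayed effect, for example admissions responding a month or more after the start of the peak period.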
The best model's hierarchical forecast on the testing set is plotted below:
Hospital level:
Cluster level:
National level:
- Basic recipe (`rec_basic`)
  - Lags
  - Rolling lags
  - Covid peak period (dummy variable)
  - Relevant temporal features from `step_timeseries_signature`, e.g. month, year, quarter of the year
  - Hierarchical levels, e.g. National level, Cluster level
  - Members in the corresponding level, e.g. CGH hospital, SHS cluster
- Basic recipe + time series features and statistics (`rec_ft`)
- Basic recipe + PCA of the time series features and statistics (`rec_PC`)
- Basic recipe + kernel PCA of the time series features and statistics (`rec_kPC`)
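The lag and rolling-lag features in the basic recipe can be illustrated as follows. This is a plain-Python sketch rather than the project's R recipe; the function names, lag length and window size are hypothetical choices for illustration.

```python
def lag_feature(series, k):
    """Value observed k steps earlier; None where no history exists yet."""
    return [None] * k + list(series[:-k])

def rolling_mean_of_lag(series, k, window):
    """Mean of `window` consecutive lagged values ending at each time point."""
    lagged = lag_feature(series, k)
    out = []
    for t in range(len(lagged)):
        chunk = lagged[max(0, t - window + 1): t + 1]
        if len(chunk) == window and None not in chunk:
            out.append(sum(chunk) / window)
        else:
            out.append(None)  # not enough history for a full window
    return out

# Hypothetical admissions series (illustration only).
admissions = [10, 20, 30, 40, 50, 60]

lag_1 = lag_feature(admissions, 1)
roll_3 = rolling_mean_of_lag(admissions, 1, 3)

print(lag_1)   # [None, 10, 20, 30, 40, 50]
print(roll_3)  # [None, None, None, 20.0, 30.0, 40.0]
```

Rolling over the lagged series, rather than the raw series, keeps each feature free of the value it is meant to help predict. When forecasting several months ahead in one shot, lags shorter than the horizon are not known at forecast time unless predictions are fed back in recursively.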
A random forest model with cross-validation was used to screen the recipes. The best recipe, `rec_PC`, was used for machine learning modelling.
Recipe | RMSE (avg cv) |
---|---|
rec_PC | 514 |
rec_kPC | 516 |
rec_basic | 526 |
rec_ft | 543 |
The best recipe was passed into the following models and tuned with resampling:
- Elastic net regression with splines (`GLM`)
- Multivariate adaptive regression spline (`MARS`)
- Random forest (`RF`)
- Extreme gradient boost (`XGB`)
- Boosted PROPHET (`PB`)

LightGBM was not included: it has seen success with hierarchical time series in the M5 competition, but fatal errors were encountered when running it in R.
Model | RMSE (avg cv) | MAE (avg cv) |
---|---|---|
RF | 549 | 409 |
PB | 1137 | 799 |
XGB | 1231 | 888 |
MARS | 3796 | 3312 |
GLM | 9847 | 8281 |
The top 2 models, RF and PB, were manually retuned.
Example of identifying more appropriate parameter range for retuning Prophet Boost
Both Random Forest and Prophet Boost benefited from retuning.
Model | RMSE (avg cv) | MAE (avg cv) |
---|---|---|
RF with retuning | 545 | 409 |
RF | 549 | 409 |
PB with retuning | 945 | 673 |
PB | 1137 | 799 |
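- All the ensemble models achieved a lower RMSE than their member models, although Random Forest on its own had a slightly lower MAE (409 vs 412 to 423).
- Better performing ensemble models had a stronger bias towards Random Forest.
- Some of the classical approaches outperformed Prophet Boost (reconciliation with mint_cov beat even the retuned version), but none outperformed Random Forest.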
Approach | Model | RMSE (training set) | MAE (training set) |
---|---|---|---|
Machine Learning | Ensemble model (RF retuned + PB retuned, weights 9:1) | 535 | 412 |
Machine Learning | Ensemble model (RF + PB retuned, weights 9:1) | 538 | 412 |
Machine Learning | Ensemble model (RF retuned + PB retuned, weights 8:2) | 539 | 421 |
Machine Learning | Ensemble model (RF + PB retuned, weights 8:2) | 543 | 423 |
Machine Learning | RF retuned | 545 | 409 |
Machine Learning | RF | 549 | 409 |
Classical | Reconciliation with mint_cov. Base model: ARIMA + Covid regressor | 847 | 745 |
Machine Learning | PB retuned | 945 | 673 |
Classical | Reconciliation with OLS. Base model: ARIMA + Covid regressor | 1085 | 937 |
Classical | Base model of ARIMA + Covid regressor | 1117 | 991 |
Machine Learning | PB | 1137 | 799 |
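The weighted ensembles in the table combine the member forecasts as a weighted average. The sketch below shows the arithmetic for a 9:1 weighting in plain Python (the project itself is in R); the function name and the forecast values are hypothetical.

```python
def weighted_ensemble(forecasts, weights):
    """Weighted average of several forecasts, point by point."""
    total = sum(weights)
    horizon = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts)) / total
        for t in range(horizon)
    ]

# Hypothetical member forecasts for three months (illustration only).
rf_forecast = [3000.0, 3100.0, 3200.0]   # Random Forest
pb_forecast = [2800.0, 3300.0, 3000.0]   # Prophet Boost

blend = weighted_ensemble([rf_forecast, pb_forecast], [9, 1])
print(blend)  # [2980.0, 3120.0, 3180.0]
```

With a 9:1 weighting the blend stays close to the Random Forest forecast, while the small Prophet Boost share nudges it wherever the two models disagree.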
The best machine learning model's forecast on the testing set is plotted below:
Hospital level:
Cluster level:
National level:
To recap, the training set was from Jan 16 to Apr 20 (4 years, 4 months), the testing set was from May 20 to Feb 21 (10 months), and the forecast horizon was from Mar 21 to Dec 21 (10 months). The best model was an ensemble of retuned Random Forest and retuned Prophet Boost with a 9:1 weighting. Below are the forecasted future admissions using the best model.
Hospital level:
Cluster level:
National level:
Most of the hospital-level forecasts were relatively flat, perhaps due to insufficient data compared with the cluster and national levels, which had more observations from the aggregation of subordinate levels. The forecasts for the cluster and national levels appeared more plausible, with some peaks and dips and an upward trend. Nonetheless, forecasting during this Covid period is challenging. Any forecast can be thrown off the rails as the situation is erratic and dynamic. For instance, the Covid infection rate was stable after Aug 20 but became more serious in May 21, with the Singapore government implementing stricter social distancing measures.
- Trying more machine learning models and deep learning
- Replacing XGB in Prophet Boost with other tree-based boosting models such as CatBoost or LightGBM
- Predicting all hospital admissions at once with a global model