Short-Term Probabilistic Load Forecasting using Conditioned Bernstein-Polynomial Normalizing Flows (STPLF-BNF)
In the study, we compared combinations of two neural network architectures with four methods for modeling the conditional marginal distributions in 24h-ahead forecasting. The following example shows the 99% and 60% confidence intervals, the median (blue) of the predicted conditional probability densities, and the measured observations (orange) for one household with unusually high load during the Christmas week.
We use public data from 363 smart meter customers of the CER dataset to train and evaluate the models.
The transition to a fully renewable energy grid requires better forecasting of demand at the low-voltage level to increase efficiency and ensure reliable control. However, high fluctuations and increasing electrification cause huge forecast variability, not reflected in traditional point estimates. Probabilistic load forecasts take future uncertainties into account and thus allow more informed decision-making for the planning and operation of low-carbon energy systems. We propose an approach for flexible conditional density forecasting of short-term load based on Bernstein polynomial normalizing flows, where a neural network controls the parameters of the flow. In an empirical study with 363 smart meter customers, our density predictions compare favorably against Gaussian and Gaussian mixture densities. Also, they outperform a non-parametric approach based on the pinball loss for 24h-ahead load forecasting for two different neural network architectures.
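To make the core idea more concrete, here is a minimal sketch of a monotone Bernstein polynomial transformation in plain Python. It is not the implementation used in this project (which is a TensorFlow Probability bijector). The softplus-based construction of increasing coefficients is an assumption about one common way to enforce monotonicity, and `raw_outputs` stands in for hypothetical network outputs.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def increasing_coefficients(raw):
    """Map unconstrained values to strictly increasing coefficients.

    Assumption: the first value is used as-is, every further value adds a
    positive increment obtained via softplus.
    """
    thetas = [raw[0]]
    for r in raw[1:]:
        thetas.append(thetas[-1] + softplus(r))
    return thetas


def bernstein_polynomial(z, thetas):
    """Evaluate sum_i theta_i * B_{i,M}(z) for z in [0, 1]."""
    m = len(thetas) - 1
    return sum(
        theta * math.comb(m, i) * z**i * (1.0 - z) ** (m - i)
        for i, theta in enumerate(thetas)
    )


raw_outputs = [-2.0, 0.5, -1.0, 2.0, 0.1]  # hypothetical network outputs
thetas = increasing_coefficients(raw_outputs)
grid = [k / 10 for k in range(11)]
values = [bernstein_polynomial(z, thetas) for z in grid]
```

Because the coefficients increase, the polynomial is strictly increasing on [0, 1] and therefore invertible, which is the property a normalizing flow needs.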
To get a local copy up and running, follow these steps.
The project uses a number of Python packages, along with my implementation of Bernstein polynomials as a TensorFlow Probability bijector, a TensorFlow time-series dataset, and my small CLI tool to run these TensorFlow experiments.
The dependencies are defined in the respective setup.py and are installed automatically by conda when the experiments are reproduced as described below.
Data Version Control (DVC) was used to version the data and build an ML pipeline with data cleansing, feature generation, training, and evaluation:
- scripts/prepar.py is used to prepare the dataset. First, non-residential buildings are dropped, since the stochastic behavior of residential customers was of explicit interest in this study. Then all incomplete records are removed. Optionally, a random subset (10% / 363 customers in the paper) is extracted.
- scripts/features.py is used to add additional features such as holiday or weather information.
- scripts/split.py is used to split the data into a train and test set. All records until 2010/10/31 23:30:00 were used for training; the remaining readings were held out for testing.
- scripts/validate_data.py finally applies some sanity checks and extracts descriptive statistics.
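The date-based split can be illustrated with a small standalone sketch. This is not the code of scripts/split.py. The function name `split_records`, the record layout, and the choice to treat the split timestamp as inclusive for training are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Last timestamp that still belongs to the training set (taken from the text).
SPLIT = datetime(2010, 10, 31, 23, 30)


def split_records(records, split=SPLIT):
    """Split (timestamp, load) pairs into train (<= split) and test (> split)."""
    train = [r for r in records if r[0] <= split]
    test = [r for r in records if r[0] > split]
    return train, test


# Hypothetical half-hourly readings around the split date.
start = datetime(2010, 10, 31, 22, 0)
records = [(start + timedelta(minutes=30 * k), 0.1 * k) for k in range(8)]
train, test = split_records(records)
```

Splitting by time rather than at random keeps the test period strictly after the training period, so no future readings are available during training.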
32. Each sample consists of an input tuple x = (x_h, x_m) containing the historical data x_h, with the lagged electric load of the past seven days, and the metadata x_m, with trigonometrically encoded time information and a binary holiday indicator. The prediction target y is the load for the next day at a resolution of 30 minutes. Hence, the model predicts 48 conditional densities p(y_1|x), ..., p(y_48|x), one for every future time step.
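The sample layout described above can be sketched as follows. This is an illustrative reconstruction, not the project's data pipeline. The helper names `encode_time` and `make_sample`, the choice of time-of-day and day-of-year as the encoded quantities, and the synthetic load series are all assumptions.

```python
import math
from datetime import datetime, timedelta

STEPS_PER_DAY = 48           # 30-minute resolution
HISTORY = 7 * STEPS_PER_DAY  # lagged load of the past seven days


def encode_time(ts):
    """Sine/cosine encoding of time of day and day of year (assumed features)."""
    day_frac = (ts.hour * 60 + ts.minute) / (24 * 60)
    year_frac = (ts.timetuple().tm_yday - 1) / 365.0
    return [
        math.sin(2 * math.pi * day_frac),
        math.cos(2 * math.pi * day_frac),
        math.sin(2 * math.pi * year_frac),
        math.cos(2 * math.pi * year_frac),
    ]


def make_sample(load, timestamps, start, holidays=frozenset()):
    """Build ((x_h, x_m), y) for the forecast day beginning at index `start`."""
    x_h = load[start - HISTORY:start]
    forecast_ts = timestamps[start]
    is_holiday = 1.0 if forecast_ts.date() in holidays else 0.0
    x_m = encode_time(forecast_ts) + [is_holiday]
    y = load[start:start + STEPS_PER_DAY]
    return (x_h, x_m), y


# Synthetic eight-day series: seven days of history plus one target day.
t0 = datetime(2010, 1, 1)
timestamps = [t0 + timedelta(minutes=30 * k) for k in range(8 * STEPS_PER_DAY)]
load = [0.5 + 0.4 * math.sin(2 * math.pi * k / STEPS_PER_DAY)
        for k in range(len(timestamps))]
(x_h, x_m), y = make_sample(load, timestamps, start=HISTORY)
```

With half-hourly data, seven days of history give 336 lagged values, and the target covers the 48 steps of the following day.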
An Anaconda environment (conda_env.yaml) in combination with an MLflow project (MLProject) is provided for easy reproducibility.
Note: The MLProject is used to reproduce the DVC pipeline described in dvc.yaml. It is also possible to use dvc repro directly, but then the required packages from conda_env.yaml have to be installed manually beforehand.
Follow these steps to set up and prepare the experiments.
- First, ensure that you have a working Anaconda or Miniconda installation.
- Create a new conda environment and install MLflow and DVC:
  conda create -n stplf-bnf
  conda activate stplf-bnf
  pip install mlflow dvc
- Clone this repository:
  git clone https://github.com/MArpogaus/stplf-bnf.git ./exp
  cd exp
- Add your copy of the CER Smart Meter dataset and extract it to data/raw/. Then add it to DVC.
If everything went well, the provided MLflow project can be executed to reproduce the DVC pipeline:
mlflow run .
After the pipeline has been reproduced, you can show the results with:
dvc metrics show --show-md
This should print the metrics shown in the following table:
| Path | continuous_ranked_probability_score | loss | mean_quantile_score | median_absolute_error | median_squared_error |
|--------------------------------------------------+-------------------------------------+------------+---------------------+-----------------------+----------------------|
| metrics/feed_forward_bernstein_flow.yaml | 0.01696 | -130.30296 | 0.01678 | 0.32215 | 0.6905 |
| metrics/feed_forward_gaussian_mixture_model.yaml | 0.01697 | -129.05446 | 0.01679 | 0.32317 | 0.41046 |
| metrics/feed_forward_normal_distribution.yaml | 0.01918 | -98.8528 | 0.01897 | 0.35269 | 0.6313 |
| metrics/feed_forward_quantile_regression.yaml | 0.01685 | -119.47409 | 0.01667 | 0.3195 | 0.4099 |
| metrics/wavenet_bernstein_flow.yaml | 0.01709 | -133.62024 | 0.01691 | 0.32437 | 0.56243 |
| metrics/wavenet_gaussian_mixture_model.yaml | 0.01798 | -127.82545 | 0.0178 | 0.33884 | 0.49286 |
| metrics/wavenet_normal_distribution.yaml | 0.0182 | -104.14383 | 0.01801 | 0.34255 | 0.37162 |
| metrics/wavenet_quantile_regression.yaml | 0.01776 | -115.97292 | 0.01757 | 0.32931 | 0.43222 |
| metrics/baseline.yaml | - | -101.34346 | 0.023 | 0.43612 | 0.68262 |
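As background for the quantile-based scores, the pinball loss mentioned in the abstract can be sketched as below. This is a generic textbook formulation. Whether the mean_quantile_score column is computed in exactly this way is an assumption, and the function names and example numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for one observation and quantile level q."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1.0) * diff


def mean_pinball_loss(y_true, quantile_preds):
    """Average the pinball loss over observations and quantile levels.

    quantile_preds maps each level q to predictions aligned with y_true.
    """
    losses = [
        pinball_loss(y, p, q)
        for q, preds in quantile_preds.items()
        for y, p in zip(y_true, preds)
    ]
    return sum(losses) / len(losses)


# Hypothetical observations and predicted 10% / 90% quantiles.
y_true = [1.0, 2.0]
quantile_preds = {0.1: [0.5, 2.5], 0.9: [1.5, 1.5]}
score = mean_pinball_loss(y_true, quantile_preds)
```

Lower values are better: the loss penalizes a quantile forecast asymmetrically, depending on whether the observation falls above or below the predicted quantile.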
Distributed under the GNU GPLv3 License
Copyright (C) 2022 Marcel Arpogaus
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
Marcel Arpogaus
Project Link: https://github.com/MArpogaus/stplf-bnf
Parts of this work have been funded by the Federal Ministry for the Environment, Nature Conservation and Nuclear Safety due to a decision of the German Federal Parliament (AI4Grids: 67KI2012A), by the Federal Ministry for Economic Affairs and Energy (BMWi) within the program SINTEG as part of the showcase region WindNODE (03SIN539) and by the Federal Ministry of Education and Research of Germany (BMBF) in the project DeepDoubt (grant no. 01IS19083A).
Public data from the CER Smart Metering Project - Electricity Customer Behaviour Trial, 2009-2010, accessed via the Irish Social Science Data Archive - www.ucd.ie/issda, was used in the development of this project.
Please consider citing our work in all publications and presentations if the code provided in this repository was involved.
@unpublished{Arpogaus2022a,
title = {Short-{{Term Density Forecasting}} of {{Low-Voltage Load}} Using {{Bernstein-Polynomial Normalizing Flows}}},
author = {Arpogaus, Marcel and Voss, Marcus and Sick, Beate and Nigge-Uricher, Mark and Dürr, Oliver},
date = {2022-04-29},
eprint = {2204.13939},
eprinttype = {arxiv},
primaryclass = {cs, stat},
archiveprefix = {arXiv}
}
@inproceedings{Arpogaus2021,
title={Probabilistic Short-Term Low-Voltage Load Forecasting using Bernstein-Polynomial Normalizing Flows},
author={Arpogaus, Marcel and Voß, Marcus and Sick, Beate and Nigge-Uricher, Mark and Dürr, Oliver},
booktitle={ICML 2021 Workshop on Tackling Climate Change with Machine Learning},
url={https://www.climatechange.ai/papers/icml2021/20},
year={2021}
}
@software{Arpogaus2021,
title = {Short-Term Probabilistic Load Forecasting using Conditioned Bernstein-Polynomial Normalizing Flows},
author = {Marcel Arpogaus},
date = {2022-01-20},
url = {https://github.com/MArpogaus/stplf-bnf}
}