This is the original PyTorch implementation of FPPformerV2 from the following paper: [FPPformerV2: EMD-Based Short Input Long Sequence Time-Series Forecasting] (manuscript submitted to IEEE TNNLS).
The schematic in Figure 1 shows the architecture of FPPformerV2. Compared with the former version, its encoder gains a novel attention mechanism, dubbed IEMD attention, which extracts the inter-relationships of different variables on the basis of EMD. EMD plays the role of a discriminator that determines whether an arbitrary variable pair has an underlying inter-relationship. IEMD attention is placed at the end of each encoder stage, together with a conventional feed-forward layer, to maintain the hierarchical architecture of the encoder and to exploit the fully extracted sequence features of each variable provided by the preceding element-wise and patch-wise attention. The inter-relationships in IEMD attention are extracted at the patch level, rather than at the entire sequence level, to reduce the computational cost. In addition, the decoder receives a hybrid of seasonal signals, whose periods are identified from the IMFs of the input sequences, in lieu of a simple zero-initialized tensor. Instance normalization, a prevailing technique proposed by T. Kim et al., is applied to this decoder input, as it is to the encoder input, to ensure that the input and prediction sequences share an identical distribution. IEMD attention is not deployed in the decoder, since the encoder has already extracted the inter-relationships of the input sequences from all variables, whose existence is determined by the dominant periodic components of each input sequence. These dominant periodic components also constitute the decoder input, which makes IEMD attention redundant in the decoder.
As a whole, on the basis of EMD, the self-attention modules in the FPPformerV2 encoder extract the parametric global input sequence features shared by all time-series sequences, as well as the dynamic cross-variable inter-relationships, while the FPPformerV2 decoder receives the non-parametric local input sequence features, which vary with different input sequences. The global and local features interact in the cross-attention modules of the decoder, endowing FPPformerV2 with the property of global-local forecasting.
Figure 1. The architecture of FPPformerV2. Two improvements to the former version are highlighted in red.
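The instance normalization step mentioned above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names `instance_normalize` and `denormalize` are hypothetical, the sketch handles a single univariate sequence in plain Python, and any learnable scaling is left out. The idea is that each sequence is standardized with its own statistics, and the same statistics are used to map the prediction back to the original scale.

```python
from statistics import fmean, pstdev


def instance_normalize(seq, eps=1e-5):
    """Standardize one sequence with its own mean and standard deviation.

    Returns the normalized sequence together with the statistics that are
    needed later to undo the normalization (hypothetical helper).
    """
    mean = fmean(seq)
    std = pstdev(seq) + eps  # eps avoids division by zero for constant input
    return [(x - mean) / std for x in seq], mean, std


def denormalize(pred, mean, std):
    """Map a prediction made in normalized space back to the original scale."""
    return [p * std + mean for p in pred]


# Example: normalize an input window, then restore it.
window = [10.0, 12.0, 11.0, 13.0]
normalized, mean, std = instance_normalize(window)
restored = denormalize(normalized, mean, std)
```

Because the statistics are computed per input sequence, a level shift between training and test data does not change what the model sees in normalized space.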
- python == 3.11.4
- numpy == 1.24.3
- pandas == 1.5.3
- scipy == 1.11.3
- scikit_learn == 0.24.1
- torch == 2.1.0+cu118
- EMD-signal == 1.5.2
Dependencies can be installed using the following command:
pip install -r requirements.txt
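After installation, the pinned versions can be compared with what is actually installed. The following is a minimal sketch using only the standard library; `check_versions` is a hypothetical helper, and the distribution name `scikit-learn` is assumed to correspond to the `scikit_learn` entry in the list above.

```python
from importlib import metadata

# Pinned versions copied from the dependency list above.
PINNED = {
    "numpy": "1.24.3",
    "pandas": "1.5.3",
    "scipy": "1.11.3",
    "scikit-learn": "0.24.1",  # assumed distribution name for scikit_learn
    "torch": "2.1.0+cu118",
    "EMD-signal": "1.5.2",
}


def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems
```

Calling `check_versions(PINNED)` returns an empty dictionary when the environment matches the list exactly.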
The ETT, ECL, Traffic and Weather datasets were acquired at: here. The Solar dataset was acquired at: Solar. The raw data of the Air dataset was acquired at: Air. The raw data of the River dataset was acquired at: River. The raw data of the BTC dataset was acquired at: BTC. The raw data of the ETH dataset was acquired at: ETH. The last four datasets (Air, River, BTC and ETH) require data preparation before use, so prepared versions are already included in this repository. If you are interested in their raw data, you can also preprocess them yourself with the preprocessing program provided (described in a later section).
After you acquire the raw data of all datasets, please place them in the corresponding folders under `./FPPformerV2/data`.
We take ETT from the folder `./ETT-data`, ECL from the folder `./electricity`, Traffic from the folder `./traffic` and Weather from the folder `./weather` of the download link (the folder tree in the link is shown below), place them in the folder `./data`, and rename the folders from `./ETT-data`, `./electricity`, `./traffic` and `./weather` to `./ETT`, `./ECL`, `./Traffic` and `./weather` respectively. We rename the data files of ECL/Traffic from `electricity.csv`/`traffic.csv` to `ECL.csv`/`Traffic.csv` and rename their last variable from `OT`/`OT` back to the original `MT_321`/`Sensor_861` respectively.
The folder tree in https://drive.google.com/drive/folders/1ZOYpTUa82_jCcxIdTmyr0LXQfvaM9vIy?usp=sharing:
|-autoformer
| |-ETT-data
| | |-ETTh1.csv
| | |-ETTh2.csv
| | |-ETTm1.csv
| | |-ETTm2.csv
| |
| |-electricity
| | |-electricity.csv
| |
| |-traffic
| | |-traffic.csv
| |
| |-weather
| | |-weather.csv
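The folder and file renaming described above can be sketched as follows. This is only an illustration under stated assumptions: `arrange` and `rename_last_column` are hypothetical helpers that are not part of the repository, the download is assumed to be unpacked locally with the folder tree shown above, and the last column of each CSV header is assumed to be the `OT` variable that needs renaming.

```python
import csv
import shutil
from pathlib import Path

# Downloaded folder name -> folder name expected under ./data.
FOLDER_MAP = {
    "ETT-data": "ETT",
    "electricity": "ECL",
    "traffic": "Traffic",
    "weather": "weather",
}
# Target folder -> (old file name, new file name, new name of the last column).
FILE_MAP = {
    "ECL": ("electricity.csv", "ECL.csv", "MT_321"),
    "Traffic": ("traffic.csv", "Traffic.csv", "Sensor_861"),
}


def rename_last_column(src, dst, new_name):
    """Copy src to dst, replacing the last header cell with new_name."""
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader)
        header[-1] = new_name
        writer.writerow(header)
        writer.writerows(reader)


def arrange(download_dir, data_dir):
    """Copy the downloaded folders into data_dir under their new names."""
    download_dir, data_dir = Path(download_dir), Path(data_dir)
    for old, new in FOLDER_MAP.items():
        src = download_dir / old
        if src.is_dir():
            shutil.copytree(src, data_dir / new, dirs_exist_ok=True)
    for folder, (old_file, new_file, column) in FILE_MAP.items():
        src = data_dir / folder / old_file
        if src.is_file():
            rename_last_column(src, data_dir / folder / new_file, column)
            src.unlink()  # drop the copy that still carries the old name
```

For example, `arrange("./autoformer", "./data")` would leave the original download untouched and write the renamed copies under `./data`.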
We take Solar from the folder `./financial` of the Solar link (the folder tree in the link is shown below), place it in the folder `./data` and rename the folder to `./Solar`.
The folder tree in https://drive.google.com/drive/folders/1Gv1MXjLo5bLGep4bsqDyaNMI2oQC9GH2?usp=sharing:
|-dataset
| |-financial
| | |-solar_AL.txt
We place the raw data of Air/River/BTC/ETH from the Air/River/BTC/ETH links (the folder trees in the links are shown below) into the folders `./Air`/`./River`/`./BTC`/`./ETH` respectively.
The folder tree in https://archive.ics.uci.edu/dataset/360/air+quality:
|-air+quality
| |-AirQualityUCI.csv
| |-AirQualityUCI.xlsx
The folder tree in https://www.kaggle.com/datasets/samanemami/river-flowrf2:
|-river-flowrf2
| |-RF2.csv
The folder tree in https://www.kaggle.com/datasets/prasoonkottarathil/btcinusd:
|-btcinusd
| |-BTC-Hourly.csv
The folder tree in https://www.kaggle.com/datasets/franoisgeorgesjulien/crypto:
|-crypto
| |-Binance_ETHUSDT_1h (1).csv
Then you can run `./data/preprocess.py` to preprocess the raw data of the Air, River, BTC and ETH datasets. Attention! If you directly use the preprocessed datasets provided in this repository, do not run `./data/preprocess.py`; otherwise errors will occur.
In `preprocess.py`, we replace the missing values, which are tagged with the value -200, by the average of the valid values. We remove the variable `NMHC(GT)` from the Air dataset because all data of this variable in the test subset is missing. In the River dataset, we only select the first eight variables, as the others are the corresponding time-lagged observations. Moreover, we remove the discrete variables from the BTC/ETH datasets.
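The missing-value replacement described above can be sketched in a few lines. This is a simplified illustration, not the code in `preprocess.py`: `fill_missing_with_mean` is a hypothetical helper that works on one column given as a list of floats, and computing the mean per column is an assumption.

```python
MISSING = -200.0  # sentinel value that tags missing entries, as described above


def fill_missing_with_mean(column, missing=MISSING):
    """Replace sentinel entries by the mean of the remaining valid entries."""
    valid = [v for v in column if v != missing]
    if not valid:
        # Every value is missing: there is nothing to average,
        # so the column is returned unchanged.
        return list(column)
    mean = sum(valid) / len(valid)
    return [mean if v == missing else v for v in column]


# Example: the second entry is tagged as missing.
filled = fill_missing_with_mean([1.0, -200.0, 3.0])
```

A column whose values are all missing cannot be repaired this way, which is consistent with dropping a variable outright when an entire subset of it is missing.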
After you successfully run `./data/preprocess.py`, you will obtain the following folder tree:
|-data
| |-Air
| | |-Air.csv
| |
| |-BTC
| | |-BTC.csv
| |
| |-ECL
| | |-ECL.csv
| |
| |-ETH
| | |-ETH.csv
| |
| |-ETT
| | |-ETTh1.csv
| | |-ETTh2.csv
| | |-ETTm1.csv
| | |-ETTm2.csv
| |
| |-River
| | |-River.csv
| |
| |-Solar
| | |-solar_AL.txt
| |
| |-Traffic
| | |-Traffic.csv
| |
| |-weather
| | |-weather.csv
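Before starting an experiment, it can be useful to confirm that the data folder matches the tree above. The sketch below is a hypothetical helper, not part of the repository; the file list is copied from the tree and the location of the data folder is passed in by the caller.

```python
from pathlib import Path

# Expected files, copied from the folder tree above.
EXPECTED = {
    "Air": ["Air.csv"],
    "BTC": ["BTC.csv"],
    "ECL": ["ECL.csv"],
    "ETH": ["ETH.csv"],
    "ETT": ["ETTh1.csv", "ETTh2.csv", "ETTm1.csv", "ETTm2.csv"],
    "River": ["River.csv"],
    "Solar": ["solar_AL.txt"],
    "Traffic": ["Traffic.csv"],
    "weather": ["weather.csv"],
}


def missing_files(data_dir):
    """Return the expected files (relative to data_dir) that do not exist."""
    root = Path(data_dir)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

An empty result from `missing_files("./data")` means every file in the tree is present.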
We select eight up-to-date baselines, including three TSFTs (ARM, iTransformer, Basisformer), two TSFMs (TSMixer, FreTS), one TCN (ModernTCN), one RNN-based forecasting method (WITRAN) and one cutting-edge statistics-based forecasting method (OneShotSTL). Most of these baselines were published after FPPformer, and their state-of-the-art performance makes them capable of challenging or even surpassing it. The origins of their source code are given below:
Baseline | Source Code |
---|---|
ARM | https://openreview.net/forum?id=JWpwDdVbaM |
iTransformer | https://github.com/thuml/iTransformer |
Basisformer | https://github.com/nzl5116190/Basisformer |
TSMixer | https://github.com/google-research/google-research/tree/master/tsmixer |
FreTS | https://github.com/aikunyi/frets |
ModernTCN | https://openreview.net/forum?id=vpJMJerXHU |
WITRAN | https://github.com/Water2sea/WITRAN |
OneShotSTL | https://github.com/xiao-he/oneshotstl |
Moreover, the default experiment settings/parameters of the aforementioned eight baselines are given below:
Baselines | Settings/Parameters name | Descriptions | Default mechanisms/values |
---|---|---|---|
ARM | d_model | The number of hidden dimensions | 64 |
ARM | n_heads | The number of heads in multi-head attention mechanism | 8 |
ARM | e_layers | The number of encoder layers | 2 |
ARM | d_layers | The number of decoder layers | 1 |
ARM | preprocessing_method | The preprocessing method | AUEL |
ARM | conv_size | The size of kernels in conv layers | [49, 145, 385] |
ARM | conv_padding | The padding value | [24, 72, 192] |
ARM | ema_alpha | The trainable EMA parameter | 0.9 |
iTransformer | d_model | The number of hidden dimensions | 512 |
iTransformer | d_ff | Dimension of fcn | 512 |
iTransformer | n_heads | The number of heads in multi-head attention mechanism | 8 |
iTransformer | e_layers | The number of encoder layers | 3 |
Basisformer | N | The number of learnable basis | 10 |
Basisformer | block_nums | The number of blocks | 2 |
Basisformer | bottleneck | reduction of bottleneck | 2 |
Basisformer | map_bottleneck | reduction of mapping bottleneck | 2 |
Basisformer | n_heads | The number of heads in multi-head attention mechanism | 16 |
Basisformer | d_model | The number of hidden dimensions | 100 |
TSMixer | n_block | The number of block for deep architecture | 2 |
TSMixer | d_model | The hidden feature dimension | 64 |
FreTS | embed_size | The number of embedding dimensions | 128 |
FreTS | hidden_size | The number of hidden dimensions | 256 |
FreTS | channel_independence | Whether channels are dependent | 1 |
ModernTCN | d_model | The number of hidden dimensions | 64 |
ModernTCN | ffn_ratio | The FFN ratio | 8 |
ModernTCN | kernel | The kernel size | 51 |
ModernTCN | patch_size | The patch size | 8 |
ModernTCN | stride | The stride value | 4 |
ModernTCN | e_layers | The number of ModernTCN blocks | 3 |
WITRAN | d_model | The number of hidden dimensions | 32 |
WITRAN | e_layers | The number of encoder layers | 8 |
WITRAN | WITRAN_dec | The prediction module of WITRAN | Concat |
WITRAN | WITRAN_deal | WITRAN deal data type | None |
WITRAN | WITRAN_grid_cols | Numbers of data grid cols for WITRAN | 24 |
OneShotSTL | lambda1 | The hyper-parameter to control smoothness | 1.0 |
OneShotSTL | lambda2 | The hyper-parameter to control smoothness | 0.5 |
OneShotSTL | lambda3 | The hyper-parameter to control smoothness | 1.0 |
Commands for training and testing FPPformerV2 on all datasets for multivariate/univariate forecasting are in `./scripts/Main.sh`/`./scripts/Univariate_ECL.sh` respectively.
For more parameter information, please refer to `main.py`.
We provide a complete command for training and testing FPPformerV2:
For multivariate forecasting:
python -u main.py --data <data> --features <features> --input_len <input_len> --pred_len <pred_len> --encoder_layer <encoder_layer> --patch_size <patch_size> --d_model <d_model> --learning_rate <learning_rate> --dropout <dropout> --batch_size <batch_size> --train_epochs <train_epochs> --patience <patience> --itr <itr> --train --Cross <Cross> --EMD <EMD>
For univariate forecasting:
python -u main.py --data <data> --features <features> --input_len <input_len> --pred_len <pred_len> --encoder_layer <encoder_layer> --patch_size <patch_size> --d_model <d_model> --learning_rate <learning_rate> --dropout <dropout> --batch_size <batch_size> --train_epochs <train_epochs> --patience <patience> --itr <itr> --train --target <target> --EMD <EMD>
Here we provide a more detailed and complete command description for training and testing the model:
Parameter name | Description of parameter |
---|---|
data | The dataset name |
root_path | The root path of the data file |
data_path | The data file name |
features | The forecasting task. This can be set to M or S (M: multivariate forecasting, S: univariate forecasting) |
target | Target feature in S task |
ori_target | Default target, determine the EMD result order |
checkpoints | Location of model checkpoints |
input_len | Input sequence length |
pred_len | Prediction sequence length |
enc_in | Input size |
dec_out | Output size |
d_model | Dimension of model |
dropout | Dropout |
encoder_layer | The number of encoder layers |
patch_size | The size of each patch |
Cross | Whether to use cross-variable attention |
EMD | Whether to use EMD as the prediction initialization |
itr | Number of times each experiment is repeated |
train_epochs | Train epochs of the second stage |
batch_size | The batch size of training input data in the second stage |
patience | Early stopping patience |
learning_rate | Optimizer learning rate |
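When many runs are launched, the command line can be assembled from a dictionary of the parameters listed above. The following is a minimal sketch: `build_command` is a hypothetical helper, and the values in the example (`ETTh1`, `M`, `96`) are illustrative placeholders rather than recommended settings; the actual settings are in the scripts under `./scripts/`.

```python
import shlex


def build_command(params, flags=("train",)):
    """Assemble a main.py call from option/value pairs and value-less flags."""
    cmd = ["python", "-u", "main.py"]
    for name, value in params.items():
        cmd += [f"--{name}", str(value)]
    # Flags such as --train take no value in the commands shown above.
    cmd += [f"--{flag}" for flag in flags]
    return cmd


# Placeholder values for illustration only.
example = build_command({"data": "ETTh1", "features": "M", "pred_len": 96})
print(shlex.join(example))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.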
The experiment parameters of each dataset are given in the `Main.sh` and `Univariate_ECL.sh` files in the directory `./scripts/`. You can refer to these parameters for experiments, and you can also adjust them to obtain better MSE and MAE results or to draw better prediction figures.
We provide the results of the EMD process in the link EMD. You can download them and place them in the corresponding folders under `./FPPformerV2/EMD` to reduce the time consumption.
Figure 2. Multivariate forecasting results under 1-hour-level datasets
Figure 3. Multivariate forecasting results under minute-level datasets
Figure 4. Univariate forecasting results
If you have any questions, feel free to contact Li Shen by email or through GitHub issues. Pull requests are highly welcome!