NeurIPS 2024
We propose a multi-resolution architecture for the diffusion noise estimator, motivated by the observation that different attributes can influence the time series at varying scales.
Our model divides the original time series into several patch sequences with different resolutions. We then concatenate the patch embedding sequences of the different resolutions into a single sequence and feed them in parallel into the processing module, together with other information, e.g., the attributes and the diffusion step. An attention mask ensures that attention is only computed among patches of the same resolution.
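As a rough illustration, below is a minimal sketch of such a resolution-wise attention mask, assuming hypothetical patch sizes (4, 8, 16) and a sequence length of 64; the actual resolutions and layer interfaces are defined in the code.

```python
import torch

# Hypothetical patch sizes; the model's actual resolutions may differ.
patch_sizes = [4, 8, 16]
seq_len = 64

# Number of patch tokens contributed by each resolution.
num_patches = [seq_len // p for p in patch_sizes]  # [16, 8, 4]

# Resolution id of every token in the concatenated sequence.
res_id = torch.cat([torch.full((n,), i) for i, n in enumerate(num_patches)])

# Boolean mask: a query token may attend to a key token only if both
# belong to the same resolution (a block-diagonal pattern).
attn_mask = res_id[:, None] == res_id[None, :]

# The mask can then be passed to a standard attention call, e.g.:
# torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
```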
To overcome the lack of paired source and target time series in real-world datasets, we propose a training algorithm called bootstrap. Specifically, the model scores the time series it generates itself, selects the top-K samples with the highest scores, and uses these samples to further update the diffusion model.
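A minimal sketch of one bootstrap round is shown below; `model.generate`, `score_fn`, and `model.update` are hypothetical placeholders standing in for the repo's actual interfaces.

```python
import torch

def bootstrap_round(model, score_fn, source_ts, target_attrs, k=16, n_candidates=64):
    # Generate candidate edited time series with the current model
    # (hypothetical interface).
    candidates = model.generate(source_ts, target_attrs, n=n_candidates)

    # Self-score each candidate, e.g., by how well it matches the target attributes.
    scores = torch.stack([score_fn(c, target_attrs) for c in candidates])

    # Keep the top-K highest-scoring samples as pseudo targets.
    top_idx = scores.topk(k).indices
    pseudo_targets = [candidates[i] for i in top_idx]

    # Use the selected samples to further update the diffusion model
    # (hypothetical interface).
    model.update(source_ts, target_attrs, pseudo_targets)
```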
We compare our method, Time series Editing (TEdit), with the baselines on a synthetic dataset (Synthetic) and two real-world datasets (Air and Motor). As shown in the table below, our method significantly improves performance on both the overall and the edited attributes, while preserving performance on the preserved attributes.
torch==2.2.1
pandas==2.0.3
pyyaml==6.0.2
linear_attention_transformer==0.19.1
tensorboard==2.14.0
scikit-learn==1.3.2
You can use the following command to prepare your environment.
pip install -r requirements.txt
Download the datasets from Google Drive or Baidu Cloud.
Assume the datasets are in `/path/to/data/`. The layout should look like:
/path/to/data/:
    synthetic/:
        pretrain:
            train_ts.npy
            train_attrs_idx.npy
            valid_ts.npy
            valid_attrs_idx.npy
            ...
        trend_types:
            ...
        ...
    air/:
        pretrain:
            ...
        city:
            ...
        season:
            ...
    motor/:
        pretrain:
            ...
        id:
            ...
        motor:
            ...
NOTE: The arg `--data_folder=/path/to/data/` should be passed to the training script.
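As a quick sanity check, the arrays can be loaded directly with NumPy; the file names follow the layout above, and the array shapes depend on the dataset.

```python
import os
import numpy as np

data_folder = "/path/to/data/"

# Pretraining split of the synthetic dataset.
train_ts = np.load(os.path.join(data_folder, "synthetic/pretrain/train_ts.npy"))
train_attrs = np.load(os.path.join(data_folder, "synthetic/pretrain/train_attrs_idx.npy"))

print(train_ts.shape, train_attrs.shape)
```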
Download the checkpoints from Google Drive or Baidu Cloud.
Assume the checkpoints are in `/path/to/save/`. The layout should look like:
/path/to/save/:
    [dataset_name]:
        energy:
            ...
        [pretrain_model]:
            [run_id]:
                ckpts:
                    model_best.pth
                    eval_configs.yaml
                    pretrain_configs.yaml
                    model_configs.yaml
        [finetune_model]:
            [sub_dataset]:
                [run_id]:
                    ...
    ...
NOTE: The arg `--save_folder=/path/to/save/` should be passed to the training script.
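To inspect a downloaded checkpoint, a plain `torch.load` works; what the file contains (e.g., a raw state dict or a larger training state) depends on the training script.

```python
import torch

# Bracketed parts are the placeholders from the layout above.
ckpt_path = "/path/to/save/[dataset_name]/[pretrain_model]/[run_id]/ckpts/model_best.pth"

# Load onto the CPU first to avoid requiring a GPU just for inspection.
state = torch.load(ckpt_path, map_location="cpu")
print(list(state)[:5] if isinstance(state, dict) else type(state))
```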
To pretrain the model on a specific dataset, run:
bash scripts/dataset_name/pretrain_multi_weaver.sh
To finetune the model on a specific dataset, run:
bash scripts/dataset_name/finetune_multi_weaver.sh
After training, check the results at the following paths:
{save_folder}/{run_id}/results_stat.csv
{save_folder}/{run_id}/results_finetune_stat.csv
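These result files are plain CSVs, so pandas (already in the requirements) can be used to inspect them; `run_id` below is a hypothetical placeholder, and the column names depend on the evaluation script.

```python
import os
import pandas as pd

save_folder = "/path/to/save/"
run_id = "my_run"  # hypothetical; use the run id from your own training run

stats = pd.read_csv(os.path.join(save_folder, run_id, "results_stat.csv"))
print(stats)
```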
To evaluate the model with the downloaded checkpoints, run:
bash scripts/dataset_name/eval_multi_weaver.sh
All code in this repository runs on a GPU by default. To run on the CPU, modify the device-related parameters in the config files.
This project is licensed under the MIT License - see the LICENSE file for details.
If our work helps your research, please give us a star or cite us with the following:
@article{jing2024towards,
title={Towards Editing Time Series},
author={Jing, Baoyu and Gu, Shuqi and Chen, Tianyu and Yang, Zhiyu and Li, Dongsheng and He, Jingrui and Ren, Kan},
journal={Advances in Neural Information Processing Systems},
year={2024}
}