This is a PyTorch implementation of Long and Diverse Text Generation with Planning-based Hierarchical Variational Model, rewritten from the TensorFlow version of the project.
This port may still contain bugs. If you find one, reports and fixes are welcome.
- Dataset
Our dataset contains 119K pairs of product specifications and the corresponding advertising text. For more information, please refer to our paper.
- Preprocess
- Download data from here and unzip the file, which will create a new directory named data. The path to our dataset is ./data/data.jsonl.
- We provided most preprocessed data under ./data/processed/ except pre-trained word embeddings which can be generated with the following command line:
bash preprocess.sh
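To sanity-check the download before preprocessing, the dataset file can be read one JSON record per line. The snippet below is only a minimal sketch: the helper name load_jsonl is hypothetical and not part of this repository, and because the record fields are not documented here, it simply lists whatever keys the first record contains instead of assuming any.

```python
import json
from pathlib import Path

# Path taken from the Preprocess step above.
DATA_PATH = Path("./data/data.jsonl")


def load_jsonl(path):
    """Hypothetical helper: read a JSON Lines file, one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__" and DATA_PATH.exists():
    records = load_jsonl(DATA_PATH)
    print(f"{len(records)} records loaded")
    # Field names are an unknown here, so only report what the first record holds.
    if records and isinstance(records[0], dict):
        print("keys in first record:", sorted(records[0].keys()))
```

Loading the whole file into a list keeps the sketch short. For a quick count only, iterating over the file without storing the records would use less memory.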
- Train
./run.sh
- Test
./test.sh