The official implementation of the paper "Hello Again! LLM-powered Personalized Agent for Long-term Dialogue".
We recommend the following dependencies:
- Python 3.10.0
- PyTorch 1.13.0
- Transformers (>= 4.32.0)
Then install the remaining dependencies with:
```bash
pip install -r requirements.txt
```
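For a clean setup, here is a minimal sketch assuming conda is available (pick the PyTorch build that matches your CUDA version):
```bash
# Minimal environment sketch; assumes conda. Adjust the torch build to
# match your CUDA driver.
conda create -n ld-agent python=3.10.0
conda activate ld-agent
pip install torch==1.13.0 "transformers>=4.32.0"
pip install -r requirements.txt
```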
We recommend a GPU with more than 32 GB of memory.
The datasets for event summary, persona extraction, response generation, and MSC can be downloaded here. Please organize the dataset path as `LD-Agent/dataset`.
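For illustration, one way to put the downloaded data in place (the archive name below is hypothetical; use whatever file the download provides):
```bash
# Illustrative only: the archive name is hypothetical.
mkdir -p LD-Agent/dataset
unzip ld_agent_datasets.zip -d LD-Agent/dataset
```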
To automatically evaluate response quality, you should download the compressed metric files here. Then decompress them and organize them under `LD-Agent/nlgeval/metric`.
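Similarly, a hypothetical decompression step (replace the archive name with the one you actually downloaded):
```bash
# Illustrative only: metric.tar.gz stands in for the real archive name.
mkdir -p LD-Agent/nlgeval/metric
tar -xzf metric.tar.gz -C LD-Agent/nlgeval/metric
```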
Following the training approach of ChatGLM3-6B, we provide separate LoRA tuning strategies for event summary, persona extraction, and response generation. You can run the following instructions to train these modules.
Summarizer
```bash
bash scripts/summarizer_tuning.sh
```
Extractor
```bash
bash scripts/extractor_tuning.sh
```
Generator
```bash
bash scripts/generator_tuning.sh
```
You can adjust the detailed training configs in `Trainer/configs`.
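For example, a hypothetical pre-training check (the exact file names under `Trainer/configs` are not specified here):
```bash
# Illustrative only: list the available configs and edit the one for the
# module you are tuning (the file name below is hypothetical).
ls Trainer/configs
${EDITOR:-vi} Trainer/configs/generator_lora.yaml
```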
We provide evaluation implementations for both ChatGPT and ChatGLM3-6B.
ChatGPT
To evaluate using ChatGPT, set `${API_KEY}` in `scripts/msc_gpt_eval.sh` to your OpenAI API key and run:
```bash
bash scripts/msc_gpt_eval.sh
```
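For illustration, one non-interactive way to inject the key, assuming the script assigns `API_KEY=` on its own line:
```bash
# Illustrative only: assumes scripts/msc_gpt_eval.sh contains a line of the
# form API_KEY=...; adjust the pattern if the script differs.
sed -i 's|^API_KEY=.*|API_KEY="sk-your-key"|' scripts/msc_gpt_eval.sh
bash scripts/msc_gpt_eval.sh
```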
ChatGLM
To evaluate using ChatGLM3-6B, you can run:
```bash
bash scripts/msc_glm_eval.sh
```
Edit `${SUMMARIZER}`, `${EXTRACTOR}`, and `${GENERATOR}` to specify the LoRA models used for event summary, persona extraction, and response generation, respectively. Setting a variable to `"default"` applies the original ChatGLM to the target module.
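For illustration, the variables in `scripts/msc_glm_eval.sh` might look like this (the checkpoint paths are hypothetical):
```bash
# Illustrative only: checkpoint paths are hypothetical.
SUMMARIZER="output/summarizer_lora"  # LoRA checkpoint for event summary
EXTRACTOR="output/extractor_lora"    # LoRA checkpoint for persona extraction
GENERATOR="default"                  # fall back to the base ChatGLM3-6B
```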
If you find this code useful, please cite the following paper:
```bibtex
@article{LD-Agent,
  title={Hello Again! LLM-powered Personalized Agent for Long-term Dialogue},
  author={Li, Hao and Yang, Chenghao and Zhang, An and Deng, Yang and Wang, Xiang and Chua, Tat-Seng},
  journal={arXiv preprint arXiv:2406.05925},
  year={2024}
}
```