This repository contains the code and data for the User Simulator (USi), a model for answering clarifying questions in mixed-initiative conversational search, described in the paper "Evaluating Mixed-initiative Conversational Search Systems via User Simulation" and presented at WSDM 2022.
USi is trained on Qulac and ClariQ to answer clarifying questions in line with the underlying information need. You can use USi to help evaluate any model that generates clarifying questions, on any dataset that provides an information-need (facet, topic) description, which many TREC-like collections do.
Requirements:
pytorch
transformers
scikit-learn
pytorch_lightning
pandas
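These can be installed with pip. Note that the PyPI package names may differ slightly from the names above (e.g., torch for pytorch, pytorch-lightning for pytorch_lightning), so the following one-liner is only a best-guess sketch; prefer the repository's requirements file if one is provided:
pip install torch transformers scikit-learn pytorch-lightning pandas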
Download the pre-trained model here and run predictions:
python run.py --test_mode 1 \
    --test_ckp checkpoints/model_8.ckpt \
    --temperature 0.7 \
    --top_k 0 \
    --top_p 0.9 \
    --min_output_len 1
You can find all the controllable parameters in the argparse arguments defined in run.py.
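The --temperature, --top_k, and --top_p flags control how the answer is sampled during decoding. The sketch below illustrates what standard top-k and nucleus (top-p) filtering of next-token logits typically looks like in PyTorch; it is a generic illustration of these decoding parameters, not the actual code in run.py:

import torch
import torch.nn.functional as F

def filter_logits(logits, temperature=0.7, top_k=0, top_p=0.9):
    """Illustrative top-k / nucleus (top-p) filtering of a 1-D tensor of next-token logits."""
    logits = logits / temperature
    if top_k > 0:
        # Keep only the top_k most likely tokens; mask out everything else.
        kth_best = torch.topk(logits, top_k).values[-1]
        logits[logits < kth_best] = float("-inf")
    if 0.0 < top_p < 1.0:
        # Keep the smallest set of tokens whose cumulative probability exceeds top_p.
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cum_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
        remove = cum_probs > top_p
        remove[1:] = remove[:-1].clone()  # shift right so the first token over the threshold is kept
        remove[0] = False
        logits[sorted_idx[remove]] = float("-inf")
    return logits

# Sampling the next token id from the filtered distribution:
# next_id = torch.multinomial(F.softmax(filter_logits(logits), dim=-1), num_samples=1)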
Consistent multi-turn interactions proved difficult for USi. To foster further research on answering clarifying questions in multi-turn interactions, we release a novel multi-turn dataset aimed at constructing conversations with hypothetical cases, in which the clarifying question is repeated, off-topic, or simply ignores the context.
Download the dataset, consisting of 1,000 conversations of up to depth 3, from here.
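The exact file format of the release is not described here; assuming a tabular file with one row per turn and hypothetical column names (conversation_id, question, answer), loading it with pandas might look like:

import pandas as pd

# Hypothetical file name and column names -- adapt to the actual released format.
df = pd.read_csv("multi_turn_dataset.csv")

# Group the turns of each conversation (the description above mentions up to depth 3).
conversations = df.groupby("conversation_id").apply(
    lambda turns: list(zip(turns["question"], turns["answer"]))
)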
If you find this code or data useful, please cite:
@inproceedings{10.1145/3488560.3498440,
author = {Sekuli\'{c}, Ivan and Aliannejadi, Mohammad and Crestani, Fabio},
title = {Evaluating Mixed-Initiative Conversational Search Systems via User Simulation},
year = {2022},
booktitle = {Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining},
pages = {888–896},
series = {WSDM '22}
}
Updates to the repository coming shortly.