Human Annotator Simulation (HAS)

This repository contains code for the following two papers:

Modelling Variability in Human Annotator Simulation [Findings of ACL 2024, conference version]
It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation [Preprint, journal version]

Please see the papers above for detailed descriptions of the proposed human annotator simulation (HAS) method.

Dependencies

PyTorch==1.11.1
speechbrain==0.5.14
normflows==1.6
numpy==1.21.0
scikit-learn==1.0.2
statsmodels==0.13.5
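
Before running anything, it can help to confirm that the installed packages match the pins listed above. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. Local build suffixes such as `+cu113` are ignored when comparing.

```python
from importlib import metadata

# Pins copied from the list above; "torch" is the assumed distribution name for PyTorch.
PINS = {
    "torch": "1.11.1",
    "speechbrain": "0.5.14",
    "normflows": "1.6",
    "numpy": "1.21.0",
    "scikit-learn": "1.0.2",
    "statsmodels": "0.13.5",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that does not match.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            # Drop a local build suffix such as "+cu113" before comparing.
            found = get_version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package is present at the pinned version.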

Data preparation

Emotion classification on MSP-Podcast

Prepare labels: python3 data_preparation/prep_msp-label.py
Prepare training scp: python3 data_preparation/prep_msp-scp.py

Hate speech detection on HateXplain

python3 data_preparation/prep_hx.py

Speech quality assessment on SOMOS

python3 data_preparation/prep_somos.py
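
If all three datasets are needed, the preparation scripts above can be run in one pass. This is only a sketch under assumptions: the helper names `prep_commands` and `run_all` are hypothetical, the scripts are assumed to take no command-line arguments (as in the commands above), and any dataset paths they expect are assumed to be configured already. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script paths as listed in the data preparation commands above.
PREP_SCRIPTS = [
    "data_preparation/prep_msp-label.py",
    "data_preparation/prep_msp-scp.py",
    "data_preparation/prep_hx.py",
    "data_preparation/prep_somos.py",
]


def prep_commands(scripts=PREP_SCRIPTS, python=sys.executable):
    """Build one command (as an argument list) per preparation script."""
    return [[python, script] for script in scripts]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)
```

Calling `run_all(prep_commands(), dry_run=False)` from the repository root would execute the scripts in the listed order.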

Training

Conditional Softmax Flow (S-CNF) for Categorical Annotations

python3 Train_S-CNF.py Train_S-CNF.yaml --output_folder='exp'

Conditional Integer Flows (I-CNF) for Ordinal Annotations

python3 Train_I-CNF.py Train_I-CNF.yaml --output_folder='exp'

Scoring

For S-CNF: python3 scoring_S-CNF.py exp/test_outcome-E{PLACEHOLDER}.npy
For I-CNF: python3 scoring_I-CNF.py exp/test_outcome-E{PLACEHOLDER}.npy
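
To see which outcome files are available to pass to the scoring scripts, the output folder can be listed. The sketch below is not part of the repository: the helper name `find_outcomes` is hypothetical, and it assumes only that outcome files follow the `test_outcome-E<suffix>.npy` naming shown above. It does not assume what the suffix means; numeric suffixes are simply sorted numerically.

```python
import glob
import os
import re


def find_outcomes(exp_dir="exp"):
    """Return (suffix, path) pairs for files named test_outcome-E<suffix>.npy."""
    pattern = os.path.join(exp_dir, "test_outcome-E*.npy")
    found = []
    for path in glob.glob(pattern):
        match = re.fullmatch(r"test_outcome-E(.+)\.npy", os.path.basename(path))
        if match:
            found.append((match.group(1), path))

    def sort_key(item):
        suffix = item[0]
        # Numeric suffixes first, in numeric order; anything else alphabetically after.
        if suffix.isdigit():
            return (0, int(suffix), suffix)
        return (1, 0, suffix)

    found.sort(key=sort_key)
    return found


if __name__ == "__main__":
    for suffix, path in find_outcomes("exp"):
        print(suffix, path)
```

Any path printed this way can be given as the argument to `scoring_S-CNF.py` or `scoring_I-CNF.py`, depending on which model produced it.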

Citation

If you find our papers and/or code useful for your research, please consider citing them:

@inproceedings{wu2024modelling,
  title={Modelling Variability in Human Annotator Simulation},
  author={Wu, Wen and Chen, Wenlin and Zhang, Chao and Woodland, Phil},
  booktitle={Findings of the Association for Computational Linguistics ACL 2024},
  pages={1139--1157},
  year={2024}
}

@article{wu2023has,
  title={It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation},
  author={Wu, Wen and Chen, Wenlin and Zhang, Chao and Woodland, Philip C},
  journal={arXiv preprint arXiv:2310.00486},
  year={2023}
}
