Few-Shot Detection of Machine-Generated Text using Style Representations

This is the official repository for the ICLR 2024 paper titled Few-Shot Detection of Machine-Generated Text using Style Representations. Our main contributions are twofold:

  1. We show that stylistic features that distinguish human authors from one another also distinguish human authors from machine authors, and machine authors from each other. Our work explores various factors leading to effective style representations for this task, finding that contrastive training on large amounts of human-authored text is sufficient to obtain useful representations across a variety of few-shot settings.

  2. In light of the unavoidable distribution shifts stemming from the introduction of new LLMs, topics, and domains, this work focuses on the few-shot setting. Our evaluations assess the ability to detect writing samples produced by LLMs unseen at training time, drawn from new domains and topics.
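
As a rough illustration of the few-shot setting described above, the sketch below scores a query document against a handful of example documents from a target LLM by comparing style embeddings. It is not the repository's implementation: the vectors are toy placeholders standing in for the output of a style encoder, and `mean_vector`, `cosine`, and `few_shot_score` are hypothetical helper names.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def few_shot_score(support: List[Vector], query: Vector) -> float:
    """Similarity of a query embedding to the mean of N support embeddings.

    A higher score means the query's style is closer to the support author's.
    """
    return cosine(mean_vector(support), query)


# Toy 3-dimensional "style embeddings" (placeholders, not real model output).
machine_support = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.8, 0.2, 0.0]]  # N=3
machine_query = [0.95, 0.05, 0.05]
human_query = [0.0, 0.2, 0.9]

machine_score = few_shot_score(machine_support, machine_query)
human_score = few_shot_score(machine_support, human_query)
print(machine_score > human_score)
```

In a real pipeline the placeholder vectors would be replaced by embeddings from a trained style encoder, and the score would be thresholded to decide whether the query matches the target LLM.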

The gist

The gist is that stylistic representations estimated on human authors are useful for distinguishing humans from machines, and machines from each other. You can see this in the following figure:

[Figure]

Demo on ICLR 2024 reviews

To see how this would work in practice, we highly encourage you to look through our demo here.

Environment setup

Install the Python (3.8) environment using the following commands:

python3 -m venv mluar
source mluar/bin/activate
pip install -r requirements.txt

Downloading datasets & setting up paths

To download datasets, pre-trained models, and set up the project paths, run the following script:


This will download everything to ./data. If you don't have enough space there, you can instead download the files individually and set up the paths manually; in that case, see the section titled 'Individual download paths, and "file_config.ini" description'.


To quickly reproduce the UAR (Reddit 5M) results from Table 1, run the following commands:

cd evaluation/
cd ../process_results/

This will produce a pAUC of 0.868 for N=5 and a pAUC of 0.958 for N=10.
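
This README does not define pAUC or state the false-positive-rate cutoff used. The sketch below assumes pAUC means the standardized partial area under the ROC curve up to a maximum false positive rate, using the same McClish standardization that scikit-learn applies in `roc_auc_score` when `max_fpr` is set. The function names and the `max_fpr` value in the example are illustrative, and the code assumes both classes are present in `labels`.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, sweeping the decision threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def partial_auc(labels: List[int], scores: List[float], max_fpr: float) -> float:
    """Standardized partial AUC over FPR in [0, max_fpr]."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the TPR at the FPR cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    min_area = 0.5 * max_fpr ** 2  # area under the chance diagonal
    max_area = max_fpr             # area under a perfect detector
    return 0.5 * (1.0 + (area - min_area) / (max_area - min_area))


# Toy example: label 1 = machine-generated, higher score = more machine-like.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
perfect = partial_auc(labels, scores, max_fpr=0.1)
reversed_scores = partial_auc(labels, [-s for s in scores], max_fpr=0.1)
print(perfect, reversed_scores)
```

With this standardization a chance-level detector scores 0.5 and a perfect one scores 1.0, regardless of the cutoff chosen.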

Reproducing the main body of results

The following scripts will reproduce all the results in Table 1, Figure 2a, Figure 2b, and Figure 3.

cd evaluation/
cd ../process_results/

These scripts will run all the single-target, multiple-target, and paraphrasing experiments. Note that they run everything on a single GPU, so they can take quite a while to finish. If you have access to more resources, we recommend splitting the commands across different machines.

Once the commands have finished, you will see a printout of all the relevant tables, and the plots will be saved under ./process_results/.

Training the baselines

We've already provided the pre-trained AI-Detector, PROTONET, and MAML baselines. However, if you wish to train your own, please read the README under the baseline_training folder.

Individual download paths, and "file_config.ini" description

Model / Dataset Name Download Link
Fewshot Data
UAR Multi-domain
AI Detector (fine-tuned)

The file_config.ini file contains the main application paths. Here is a description of each field:

  • aac_data_path - Path to the AAC data ("AAC" in the table above)
  • lwd_data_path - Path to the LWD data ("LWD" in the table above)
  • fewshot_data_path - Path to the Few-Shot data ("Fewshot Data" in the table above)
  • weights_path - Path to the pre-trained models (folder containing the extracted files from the downloads of "MAML", "UAR Multi-domain", "UAR Multi-LLM", "AI Detector (fine-tuned)", and "PROTONET" above)
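
For illustration, the sketch below writes a config with these four fields to a temporary file and reads it back with Python's standard `configparser`. The section name `[Paths]`, the example directories, and the `load_paths` helper are assumptions made for this example; check the file_config.ini shipped with the repository for the actual layout.

```python
import configparser
import os
import tempfile

REQUIRED_FIELDS = ["aac_data_path", "lwd_data_path", "fewshot_data_path", "weights_path"]

# Hypothetical contents: the section name and directories are placeholders.
EXAMPLE_CONFIG = """\
[Paths]
aac_data_path = ./data/aac
lwd_data_path = ./data/lwd
fewshot_data_path = ./data/fewshot
weights_path = ./data/weights
"""


def load_paths(config_file: str) -> dict:
    """Read an INI file and collect the required path fields from any section."""
    parser = configparser.ConfigParser()
    parser.read(config_file)
    paths = {}
    for section in parser.sections():
        for field in REQUIRED_FIELDS:
            if parser.has_option(section, field):
                paths[field] = parser.get(section, field)
    missing = [field for field in REQUIRED_FIELDS if field not in paths]
    if missing:
        raise KeyError(f"missing fields in {config_file}: {missing}")
    return paths


# Write the example to a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    config_file = os.path.join(tmp, "file_config.ini")
    with open(config_file, "w") as handle:
        handle.write(EXAMPLE_CONFIG)
    paths = load_paths(config_file)

print(paths["weights_path"])
```

A check like this can be useful after setting the paths manually, since a missing field is reported up front instead of surfacing later during evaluation.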

Related Works

This work is based on our previous efforts in learning stylistic representations and studying their properties. For more information, please see the following:

  1. Learning Universal Authorship Representations
  2. Can Authorship Representation Learning Capture Stylistic Features?


Here's a list of the people who have contributed to this work:


      title={Few-Shot Detection of Machine-Generated Text using Style Representations}, 
      author={Rafael Rivera Soto and Kailin Koch and Aleem Khan and Barry Chen and Marcus Bishop and Nicholas Andrews},


LUAR is distributed under the terms of the Apache License (Version 2.0).

All new contributions must be made under the Apache-2.0 license.

See LICENSE and NOTICE for details.

SPDX-License-Identifier: Apache-2.0
