Create detailed_annotation_information.md
1 parent 5ade986, commit 22d5967
Showing 1 changed file with 13 additions and 0 deletions.
@@ -0,0 +1,13 @@
- The information of annotators

We hired five annotators in total: three are master's students engaged in LLM research, and the other two are master's students in linguistics. All five worked on-site as full-time annotators.

- Hours worked on annotation

On average, annotators worked 8 hours per day. Six datasets containing a total of 557 CF instances were examined; verification took 2 annotators 3 days. This task was assigned to the linguistics master's students to ensure semantic consistency. Six datasets containing KPR, 1,924 pairs in total, were processed: replacing key words and phrases in the Chinese datasets took 5 annotators 3 days, and replacing them in the English datasets took 2 annotators 2 days. Nine datasets containing a total of 955 AK instances were verified, which took 2 annotators 1 day.

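The staffing figures above imply a total effort that can be tallied directly. The sketch below is illustrative only: it assumes every annotator assigned to a task worked full days on it and that each day matched the stated 8-hour average, so the totals are rough estimates rather than figures reported by the authors. The names `tasks`, `person_days`, and `HOURS_PER_DAY` are introduced here for illustration.

```python
# (annotators, days) per task, copied from the paragraph above.
HOURS_PER_DAY = 8  # stated average; assumed to hold for every day

tasks = {
    "CF verification": (2, 3),
    "KPR replacement (Chinese)": (5, 3),
    "KPR replacement (English)": (2, 2),
    "AK verification": (2, 1),
}


def person_days(annotators: int, days: int) -> int:
    """Effort for one task, assuming everyone worked all listed days."""
    return annotators * days


effort = {name: person_days(*spec) for name, spec in tasks.items()}
total_person_days = sum(effort.values())
total_hours = total_person_days * HOURS_PER_DAY

for name, days in effort.items():
    print(f"{name}: {days} person-days")
print(total_person_days, total_hours)  # 27 216
```

Under these assumptions, KPR replacement accounts for 19 of the 27 person-days, which is consistent with it being the only task that needed all five annotators.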
- Detailed annotation process

First, we trained the five annotators and had them conduct a trial annotation according to the following guidelines:
- For CFI, three types of cases require manual modification: a) The CF generated by GPT-4 did not achieve the intended interference effect. b) The CF generated by GPT-4 altered or added to the original facts, resulting in conflicts. c) The CF generated by GPT-4 did not replace the main subject, leading to ambiguity in the interpretation of the question.
- For KPR, certain words or phrases in the answer must be replaced; if they also appear in the question, they must be replaced there as well. The detailed requirements are as follows: a) Ensure that the replacement words or phrases differ entirely in characters from the originals. b) Maximize the difference between the rewritten sentence and the original sentence by replacing as many elements as possible. c) Prefer replacing fixed terms, that is, terms used consistently throughout the original text with no synonyms. d) The replacement concept may be fabricated or inconsistent with common knowledge, as long as a relevant answer can still be found within the context.
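
Two of the KPR rules lend themselves to a mechanical check. The sketch below is not the annotators' tooling (the work described here was manual); it is one possible reading of requirement a) as "no character in common", plus the rule that a term replaced in the answer is also replaced in the question when it occurs there. The helper names `shares_no_characters` and `replace_in_pair`, and the example sentences, are hypothetical.

```python
def shares_no_characters(original: str, replacement: str) -> bool:
    """Requirement a), read literally: the two strings share no character."""
    orig_chars = {ch for ch in original if not ch.isspace()}
    repl_chars = {ch for ch in replacement if not ch.isspace()}
    return orig_chars.isdisjoint(repl_chars)


def replace_in_pair(
    question: str, answer: str, original: str, replacement: str
) -> tuple[str, str]:
    """Replace a term in the answer, and in the question if it occurs there."""
    if original not in answer:
        raise ValueError("the term to replace must appear in the answer")
    # str.replace leaves the question unchanged when the term is absent.
    return (
        question.replace(original, replacement),
        answer.replace(original, replacement),
    )


# Hypothetical example pair; the replacement is a fabricated place name,
# which requirement d) permits.
question = "北京是哪个国家的首都？"
answer = "北京是中国的首都。"

print(shares_no_characters("北京", "南京"))  # False: both contain 京
print(shares_no_characters("北京", "沧澜"))  # True
new_question, new_answer = replace_in_pair(question, answer, "北京", "沧澜")
print(new_question)  # 沧澜是哪个国家的首都？
print(new_answer)    # 沧澜是中国的首都。
```

A character-level test like this fits the Chinese datasets; for English text, where unrelated words routinely share letters, a word-level comparison would likely be the more sensible interpretation.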