- Convert JSONL files to JSON files in jsonl2json.py
- Remove duplicate elements from a JSON file based on the 'question' key in de_duplicate.py
- Randomly sample n records from a JSON file and save them to a new JSON file in split_dataset.py

This folder will be updated frequently, and users can define new functions according to their needs.
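
As a rough illustration of what the three utilities above do, here is a minimal sketch. It is not the code in the scripts themselves: the function names, arguments, and the choice to keep the first occurrence of each duplicate are assumptions made for the example, and it assumes the JSON file holds a list of objects.

```python
import json
import random


def jsonl_to_json(jsonl_path, json_path):
    """Read one JSON object per line and write them out as a single JSON array."""
    with open(jsonl_path, encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not raise an error.
        records = [json.loads(line) for line in f if line.strip()]
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return records


def de_duplicate(records, key="question"):
    """Keep only the first record seen for each value of `key` (assumed policy)."""
    seen = set()
    unique = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)
            unique.append(record)
    return unique


def sample_records(records, n, seed=None):
    """Draw n records at random without replacement; `seed` makes it repeatable."""
    rng = random.Random(seed)
    return rng.sample(records, n)
```

A new helper for this folder could follow the same pattern: load the list with `json.load`, transform it, and write it back with `json.dump`.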