GitHub - azdatascience/pytorch-transformers: 👾 A library of state-of-the-art pretrained models for Natural Language Processing (NLP)

# Supporting code for the paper "Ablations over Transformer Models for Biomedical Relationship Extraction"

This repo is a fork of HuggingFace/Transformers, extended with the code used in the paper "Ablations over Transformer Models for Biomedical Relationship Extraction".

## `run_semeval.py`: relationship classification using R-BERT

R-BERT is a relationship classification head for BERT and RoBERTa, described here.

This example code fine-tunes R-BERT on the SemEval-2010 Task 8 dataset:

    python ./examples/run_semeval.py \
        --data_dir $SEMEVAL_DIR \
        --output_dir $RESULTS_DIR \
        --model_name_or_path bert-base-uncased \
        --do_train \
        --do_eval \
        --overwrite_output_dir \
        --num_train_epochs 8.0 \
        --per_gpu_train_batch_size 16 \
        --per_gpu_eval_batch_size 16 \
        --learning_rate 2e-5 \
        --max_seq_length 128 \
        --task_name semeval2010_task8 \
        --train_on_other_labels \
        --eval_on_other_labels \
        --include_directionality

`$SEMEVAL_DIR` should point to the extracted SemEval archive.
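
For orientation, the training file in the archive stores each example as a numbered, quoted sentence with `<e1>`/`<e2>` entity tags, followed by a label line and a comment line. The following is a minimal sketch of a reader under that assumed layout. The function name `parse_semeval` and the two sample sentences are made up for illustration, and this is not necessarily how `run_semeval.py` loads the data.

```python
import re


# Hypothetical helper: parse SemEval-2010 Task 8 style text into
# (sentence_id, sentence, label) triples. Assumes blocks of the form
#   <id><TAB>"<sentence with <e1>..</e1> and <e2>..</e2>>"
#   <label>
#   Comment: <free text>
# separated by blank lines.
def parse_semeval(text):
    examples = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # skip malformed blocks
        sent_id, sentence = lines[0].split("\t", 1)
        sentence = sentence.strip().strip('"')
        examples.append((sent_id, sentence, lines[1]))
    return examples


# Made-up sample in the assumed layout (not taken from the dataset).
sample = (
    '1\t"The <e1>fire</e1> was caused by a <e2>spark</e2>."\n'
    "Cause-Effect(e2,e1)\n"
    "Comment:\n"
    "\n"
    '2\t"The <e1>key</e1> is in the <e2>drawer</e2>."\n'
    "Other\n"
    "Comment:\n"
)
for example in parse_semeval(sample):
    print(example)
```

The entity tags are left in the sentence on purpose, since a relation classifier needs to know where the two entities are.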

The `--include_directionality` flag trains a classifier using all 18 SemEval classes. The `--train_on_other_labels` and `--eval_on_other_labels` flags also include instances labeled as 'Other' in training and evaluation respectively. Include all three flags to be able to use the official evaluation script.
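
To make the class counts concrete, here is a small sketch of how the label set changes with these options. The helper `build_label_set` is hypothetical, collapses the separate train/eval 'Other' flags into one argument, and is not taken from `run_semeval.py`. The nine relation names are those of SemEval-2010 Task 8.

```python
# The nine relation types of SemEval-2010 Task 8.
RELATIONS = [
    "Cause-Effect", "Component-Whole", "Content-Container",
    "Entity-Destination", "Entity-Origin", "Instrument-Agency",
    "Member-Collection", "Message-Topic", "Product-Producer",
]


# Hypothetical helper illustrating the effect of the flags described above.
def build_label_set(include_directionality=True, include_other=True):
    if include_directionality:
        # Each relation appears once per argument order.
        labels = [f"{rel}({a},{b})" for rel in RELATIONS
                  for a, b in (("e1", "e2"), ("e2", "e1"))]
    else:
        labels = list(RELATIONS)
    if include_other:
        labels.append("Other")
    return labels


print(len(build_label_set(True, False)))   # 18 directed classes
print(len(build_label_set(True, True)))    # 19: the 18 directed classes plus 'Other'
print(len(build_label_set(False, False)))  # 9 undirected classes
```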

Note: although an F1 score is calculated in the Python code, two additional files are also written out at the checkpoint intervals (`{global_step}_semeval_results.tsv`) that may be used with the official SemEval evaluation script (supplied in the SemEval data archive). The dataset is licensed under the Creative Commons Attribution 3.0 Unported Licence (http://creativecommons.org/licenses/by/3.0/) and is available here.

For example, using BERT:

    ./semeval2010_task8_scorer-v1.2.pl $SEMEVAL_DIR/{global_step}_semeval_results.tsv $SEMEVAL_DIR/TEST_FILE_SEMEVAL_SCRIPT_FORMAT.tsv
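
The scorer compares a file of proposed answers (first argument) against an answer key (second argument). The sketch below assumes the format documented alongside the scorer, one `<sentence id><TAB><label>` pair per line, and shows how such a file could be produced from predictions. The helper `write_scorer_file` and the two example rows are hypothetical; the files written by `run_semeval.py` may be produced differently.

```python
import io


# Hypothetical helper: write one "<sentence id>\t<label>" row per prediction.
def write_scorer_file(rows, handle):
    for sent_id, label in rows:
        handle.write(f"{sent_id}\t{label}\n")


# Made-up predictions, written to an in-memory buffer for illustration.
predictions = [(8001, "Message-Topic(e1,e2)"), (8002, "Other")]
buffer = io.StringIO()
write_scorer_file(predictions, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass an open file handle, for example `open(path, "w", encoding="utf-8")`, in place of the buffer.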

The RoBERTa model can also be used with this head:

    python ./examples/run_semeval.py \
        --data_dir $SEMEVAL_DIR \
        --output_dir $RESULTS_DIR \
        --model_name_or_path roberta-large \
        --model_type roberta \
        --do_train \
        --do_eval \
        --overwrite_output_dir \
        --num_train_epochs 8.0 \
        --per_gpu_train_batch_size 16 \
        --per_gpu_eval_batch_size 16 \
        --learning_rate 2e-5 \
        --max_seq_length 128 \
        --task_name semeval2010_task8 \
        --train_on_other_labels \
        --eval_on_other_labels \
        --include_directionality
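
The same head works with either encoder because it only consumes per-token hidden states. As the R-BERT paper describes it, the classifier input combines the first-token (`[CLS]`) vector with the averaged hidden states of each entity's tokens. The sketch below illustrates only that pooling step in plain Python on toy vectors. The learned linear layers and the softmax are omitted, the function names are hypothetical, and this is not the code in this fork.

```python
import math


def mean_pool(hidden, start, end):
    """Average the token vectors hidden[start:end] (end exclusive)."""
    span = hidden[start:end]
    return [sum(col) / len(span) for col in zip(*span)]


def rbert_features(hidden, e1_span, e2_span):
    """Concatenate tanh-activated [CLS], entity-1 and entity-2 vectors.

    Simplification: the per-vector linear layers of R-BERT are left out.
    """
    cls_vec = hidden[0]
    e1_vec = mean_pool(hidden, *e1_span)
    e2_vec = mean_pool(hidden, *e2_span)
    return [math.tanh(x) for x in cls_vec + e1_vec + e2_vec]


# Toy hidden states: 4 tokens, hidden size 2 (made-up numbers).
hidden = [[0.0, 0.0], [1.0, 3.0], [3.0, 5.0], [2.0, 2.0]]
features = rbert_features(hidden, e1_span=(1, 3), e2_span=(3, 4))
print(len(features))  # 6 = 3 vectors x hidden size 2
```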
