This is the code we used in the following paper:
A Closer Look into the Robustness of Neural Dependency Parsers Using Better Adversarial Examples
Yuxuan Wang, Wanxiang Che, Ivan Titov, Shay B. Cohen, Zhilin Lei, Ting Liu
Findings of ACL 2021
Requirements: Python 3.6, PyTorch >= 1.3.1, ...
For the data format used in our implementation, please read the code (CoNLL-X format).
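As a quick illustration of the CoNLL-X layout (one token per line, ten tab-separated columns, blank lines between sentences), here is a minimal standalone reader sketch. It is not the repo's loader, and the helper name `read_conllx` is made up for illustration:

```python
# Minimal CoNLL-X reader sketch (illustrative; the repo has its own loader).
# CoNLL-X columns: ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL,
# PHEAD, PDEPREL. Sentences are separated by blank lines.

def read_conllx(lines):
    """Yield sentences as lists of (form, postag, head, deprel) tuples."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:                 # blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        sentence.append((cols[1], cols[4], int(cols[6]), cols[7]))
    if sentence:                     # flush the last sentence
        yield sentence

# Toy two-token sentence for demonstration.
sample = """1\tJohn\t_\tNNP\tNNP\t_\t2\tnsubj\t_\t_
2\tsleeps\t_\tVB\tVBZ\t_\t0\troot\t_\t_
""".splitlines()

sentences = list(read_conllx(sample))
print(sentences[0][0])  # first token: ('John', 'NNP', 2, 'nsubj')
```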
First, go to the experiments folder:
cd experiments
To train a Stack-Pointer parser, simply run
./scripts/train_stackptr.sh
Remember to set up the paths for data and embeddings.
To train a Deep BiAffine parser, simply run
./scripts/train_biaf.sh
Again, remember to set up the paths for data and embeddings.
First, go to the adversary folder:
cd adversary
- Get train_vocab.json or test_vocab.json from the training set or the test set:
  ./scripts/build_vocab_from_conll.py
- Get the candidate words of each token in the training set with the sememe-based method:
  ./scripts/lemma.py
  ./scripts/gen_candidates.py
- Get the candidate words of each token in the test set with the synonym-based method (note: the input vocab is test_vocab.json, not train_vocab.json):
  ./scripts/gen_synonym.py
- Get the candidate words of each token in the test set with the BERT-based method:
  ./scripts/gen_mlm_cands.sh
- Adjust the embeddings following <https://github.com/nmrksic/counter-fitting>. You may also need to compute the similarity matrix using compute_nn.sh and merge_nn.sh.
Finally, run ./scripts/preprocess.py to build the full adversarial cache for the test set or the training set.
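The similarity computation behind compute_nn.sh and merge_nn.sh boils down to cosine nearest neighbours over the (counter-fitted) embedding space. Below is a minimal pure-Python sketch of that idea with made-up toy vectors; it is not the actual scripts, and `nearest_neighbors` is a hypothetical helper name:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_neighbors(word, embeddings, k=2):
    """Rank all other words by cosine similarity to `word`; keep the top k."""
    sims = [(other, cosine(embeddings[word], vec))
            for other, vec in embeddings.items() if other != word]
    sims.sort(key=lambda x: -x[1])
    return [w for w, _ in sims[:k]]

# Toy "counter-fitted" embeddings, invented purely for illustration.
emb = {
    "good":  [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.0],
    "bad":   [-0.9, 0.1, 0.0],
}
print(nearest_neighbors("good", emb, k=1))  # ['great']
```

In the real pipeline the same ranking is done over the full vocabulary and the results are cached, which is why the computation is split into a compute step and a merge step.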
Then run the attack:
python ./pipeline.py --gpu 0 ./config/pipe.json ./config/stanfordtag.json output_dir
If you want to attack the ensemble model, add "--ensemble" to the command in ./scripts/pipeline.py.
To train a Deep BiAffine parser, simply run
./scripts/train_biaf.sh
This time, remember to set "add_path" so that the adversarial samples are added to training. (You can find the adversarial gold samples after running the adversarial attack above.)
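At its simplest, adding adversarial samples to the training data amounts to concatenating CoNLL-X files. The sketch below is hypothetical (`merge_conll` is not a repo function); the real handling of "add_path" lives in the training code:

```python
def merge_conll(paths, out_path):
    """Concatenate CoNLL-X files, keeping a blank line between sentences."""
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as f:
                text = f.read().strip()
            if text:
                out.write(text + "\n\n")
```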
Thanks to Ma et al.: the implementation is based on the dependency parser by Ma et al. (2018) (https://github.com/XuezheMax/NeuroNLP2) and reuses part of its code.