Code for the paper "LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training"
zmy/LUNA
Description

This anonymous repository contains the code for the core models and a description of the data used in LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training.

The folders number_encoder and number_tokenizer contain code for NumBed and NumTok (Sec. 3.1), respectively.

The folder phases/numerical_field contains the code for number pre-training (Sec. 3.2).

The folder phases/single_number contains the code for the toy task.

The folder phases/downstream_tasks contains the code for the downstream tasks (Sec. 4): TAT-QA, TabFact, and CrediTrans.

The folder phases/empirical_study contains the code for the empirical studies (Sec. E in the appendix), including visualization of attention maps and embeddings from different transformer layers. Before opening the notebook, rename attention.ipy123nb to attention.ipynb.
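The rename can be done from a shell. The snippet below is a minimal sketch: the helper name fix_notebook_name is hypothetical, and it assumes the notebook sits directly under phases/empirical_study, which the text above does not state explicitly.

```shell
# Hypothetical helper: restore the .ipynb extension so Jupyter recognizes the notebook.
# Assumption: attention.ipy123nb lives directly in the given directory
# (default: phases/empirical_study, relative to the repository root).
fix_notebook_name() {
    dir="${1:-phases/empirical_study}"
    if [ -f "$dir/attention.ipy123nb" ]; then
        mv "$dir/attention.ipy123nb" "$dir/attention.ipynb"
    fi
}
```

Calling fix_notebook_name from the repository root applies the default path; if the file has already been renamed, the call does nothing.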

Preparation

Unzip data.zip from the supplementary material and place the extracted data directory in this repository as LUNA/data.
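As a sketch of this step, the two helpers below extract the archive and then check that the data directory is in place. The function names place_data and check_data are hypothetical, and the sketch assumes that data.zip contains a single top-level data directory; if the archive is laid out differently, move the extracted files into LUNA/data by hand.

```shell
# Hypothetical helper: extract the archive into the repository root.
# Assumption: the archive contains a top-level "data" directory,
# so extraction yields <repo_dir>/data.
place_data() {
    archive="$1"
    repo_dir="$2"
    unzip -q "$archive" -d "$repo_dir"
}

# Hypothetical helper: succeed only if <repo_dir>/data exists and is not empty.
check_data() {
    repo_dir="${1:-.}"
    [ -d "$repo_dir/data" ] && [ -n "$(ls -A "$repo_dir/data")" ]
}
```

From the repository root, a typical call would be place_data /path/to/data.zip . followed by check_data . to confirm the layout before building the image.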

To build the Docker image, run:

docker build -t luna:1.0 .

To launch the Docker container, run:

docker run --rm -it --shm-size=8g luna:1.0 /bin/bash
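The build and launch commands can be combined so that the image is built only when it is missing. This is a minimal sketch rather than part of the repository: the helper names ensure_image and launch_luna are hypothetical, and the build is assumed to run from the repository root, where the Dockerfile is expected to be.

```shell
# Hypothetical helper: build the image only if it is not already present locally.
# "docker image inspect" exits non-zero when the image does not exist.
ensure_image() {
    image="${1:-luna:1.0}"
    docker image inspect "$image" >/dev/null 2>&1 || docker build -t "$image" .
}

# Hypothetical helper: start an interactive container with the same flags as above.
launch_luna() {
    image="${1:-luna:1.0}"
    ensure_image "$image" && docker run --rm -it --shm-size=8g "$image" /bin/bash
}
```

Running launch_luna from the repository root gives the same interactive shell as the two commands above and skips the build when luna:1.0 already exists.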

To run an experiment, see the README in the corresponding directory under phases (listed above).
