HW5 Transformer

Task description:

Use a Transformer to perform English-to-Chinese translation.

I use a Transformer with the following hyperparameters (a config sketch follows the list):


  • encoder/decoder_embed_dim = 1024
  • encoder/decoder_ffn_embed_dim = 4096
  • encoder/decoder_layers = 8
  • dropout = 0.3
  • encoder/decoder_attention_heads = 16
  • epochs = 50
  • training time = more than 2 days
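
The parameter names match fairseq's Transformer arguments, so assuming a fairseq-based setup, the configuration could look roughly like the sketch below (an illustration, not the notebook's exact code):

```python
# Minimal sketch of the model hyperparameters above, written as a
# fairseq-style Namespace. Attribute names follow fairseq's Transformer
# arguments; this is illustrative only.
from argparse import Namespace

arch_args = Namespace(
    encoder_embed_dim=1024,
    encoder_ffn_embed_dim=4096,
    encoder_layers=8,
    encoder_attention_heads=16,
    decoder_embed_dim=1024,
    decoder_ffn_embed_dim=4096,
    decoder_layers=8,
    decoder_attention_heads=16,
    dropout=0.3,
)
```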

Because of the GPU RAM constraint, I reduce max_tokens to 2048 and set accum_steps to 8 (see the sketch just below).
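
With max_tokens = 2048 and accum_steps = 8, each optimizer update still covers roughly 2048 × 8 = 16384 tokens even though every forward/backward pass fits in GPU memory. A minimal gradient-accumulation sketch; the toy model and batches here are placeholders, not the notebook's objects:

```python
import torch
import torch.nn as nn

# Toy stand-in objects purely for illustration; the real setup trains a
# Transformer on tokenized translation batches.
model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batches = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(32)]

accum_steps = 8  # gradients from 8 micro-batches are accumulated per update

optimizer.zero_grad()
for step, (x, y) in enumerate(batches, start=1):
    loss = nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()  # scale so the update averages over micro-batches
    if step % accum_steps == 0:      # with max_tokens=2048, one update covers ~16384 tokens
        optimizer.step()
        optimizer.zero_grad()
```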
I also use the monolingual data with a backward model (back-translation), which increases the amount of training data and leads to a higher score (a sketch of the idea follows).
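
A sketch of the back-translation idea, assuming the backward model translates Chinese back into English to create synthetic parallel pairs; the function and file names below are placeholders, not the notebook's code:

```python
# Back-translation sketch: turn monolingual target-side (Chinese) sentences
# into synthetic (English, Chinese) pairs with a backward zh->en model.

def translate_zh_to_en(sentence: str) -> str:
    """Placeholder for the trained backward model's beam-search translation."""
    raise NotImplementedError("plug in the trained backward model here")

def build_synthetic_pairs(mono_zh_path: str, out_en_path: str, out_zh_path: str) -> None:
    with open(mono_zh_path, encoding="utf-8") as src, \
         open(out_en_path, "w", encoding="utf-8") as out_en, \
         open(out_zh_path, "w", encoding="utf-8") as out_zh:
        for zh in src:
            zh = zh.strip()
            if not zh:
                continue
            en = translate_zh_to_en(zh)  # synthetic English source
            out_en.write(en + "\n")      # synthetic source side
            out_zh.write(zh + "\n")      # real target side

# The synthetic pairs are then concatenated with the original parallel data
# before training the final en->zh model.
```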

How to Run

Open Google Colab and upload 「HW05_ipynb」的副本.ipynb (the file name means "copy of HW05_ipynb").
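
If you keep the downloaded data on Google Drive (an assumption; the notes below only say the data must be downloaded separately), mounting the drive is a common way to make the notebook's file paths resolve:

```python
# Mount Google Drive inside Colab so the notebook can read the dataset.
# The path below is only an example; adjust it to wherever you place the data.
from google.colab import drive

drive.mount('/content/drive')
data_dir = '/content/drive/MyDrive/HW5/data'  # example location, not from the original repo
```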

Supplementary notes

  • You can't run the code directly; you need to download the data first.
  • Watch out for the file directory (the paths must match where you put the data).
  • Ensemble more than five checkpoints to get a higher score (a checkpoint-averaging sketch follows).
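
Ensembling is often implemented by averaging checkpoint weights before decoding; whether the notebook averages weights or decodes with several models jointly is not stated here, so the sketch below is just one way to do it, assuming fairseq-style checkpoints that store the weights under a "model" key:

```python
import torch

def average_checkpoints(paths, out_path):
    """Average model weights from several checkpoints into a single one.

    Assumes each checkpoint is a torch-saved dict whose "model" entry is a
    state dict (the fairseq convention); adjust the key for other formats.
    """
    avg_state = None
    for path in paths:
        state = torch.load(path, map_location="cpu")["model"]
        if avg_state is None:
            avg_state = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg_state[k] += v.float()
    for k in avg_state:
        avg_state[k] /= len(paths)
    torch.save({"model": avg_state}, out_path)

# Example (illustrative paths): average the last six epoch checkpoints.
# average_checkpoints(
#     [f"checkpoints/checkpoint{i}.pt" for i in range(45, 51)],
#     "checkpoints/avg_last6.pt",
# )
```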