fix README.md (facebookresearch#2735)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: facebookresearch#2735

Reviewed By: myleott

Differential Revision: D24343492

Pulled By: xianxl

fbshipit-source-id: c61c717756307036f9d89de5a8ded66784f1acf7
Xian Li authored and facebook-github-bot committed Oct 15, 2020
1 parent 573c2f4 commit 05a5232
examples/latent_depth/README.md: 4 additions & 4 deletions
@@ -1,19 +1,19 @@
# Deep Transformers with Latent Depth (Li et al., 2020)

-[https://arxiv.org/abs/2009.13102] (https://arxiv.org/abs/2009.13102).
+[https://arxiv.org/abs/2009.13102](https://arxiv.org/abs/2009.13102).

## Introduction

We present a probabilistic framework to automatically learn which layer(s) to use by learning the posterior distributions of layer selection. As an extension of this framework, we propose a novel method to train one shared Transformer network for multilingual machine translation with different layer selection posteriors for each language pair.
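
To make the mechanism concrete, here is a rough, illustrative PyTorch sketch of latent layer selection. The module name, the Gumbel-sigmoid (binary-concrete) relaxation, and the per-language-pair logits are simplifying assumptions made for this sketch; this is not the fairseq implementation.

```python
import torch
import torch.nn as nn


class LatentDepthDecoderSketch(nn.Module):
    """Toy illustration: each (language pair, layer) has a learned selection
    logit; a relaxed Bernoulli gate z blends the layer output with a skip
    connection, so each language pair learns how many layers to use."""

    def __init__(self, layers: nn.ModuleList, num_lang_pairs: int, temperature: float = 1.0):
        super().__init__()
        self.layers = layers
        # selection logits: one per (language pair, layer)
        self.select_logits = nn.Parameter(torch.zeros(num_lang_pairs, len(layers)))
        self.temperature = temperature

    def forward(self, x: torch.Tensor, lang_pair_idx: int) -> torch.Tensor:
        logits = self.select_logits[lang_pair_idx]
        for i, layer in enumerate(self.layers):
            if self.training:
                # Gumbel-sigmoid relaxation keeps the gate differentiable during training
                u = torch.rand_like(logits[i])
                noise = torch.log(u) - torch.log1p(-u)
                z = torch.sigmoid((logits[i] + noise) / self.temperature)
            else:
                # at inference, use the posterior mean (or threshold it to prune layers)
                z = torch.sigmoid(logits[i])
            # h_l = z_l * Layer_l(h_{l-1}) + (1 - z_l) * h_{l-1}
            x = z * layer(x) + (1.0 - z) * x
        return x
```

In this sketch the `layers` module list is shared across all language pairs, while each pair keeps its own row of selection logits, which is the gist of training one shared network with per-language-pair layer-selection posteriors.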

## Training a multilingual model with latent depth

-Below is an example of training with latent depth in decoder for one-to-many (O2M) related languages. We use the same preprocessed (numberized and binarized) TED8 dataset as in [Balancing Training for Multilingual Neural Machine Translation (Wang et al., 2020)] (https://github.com/cindyxinyiwang/multiDDS), which could be generated by [the script] (https://github.com/cindyxinyiwang/multiDDS/blob/multiDDS/util_scripts/prepare_multilingual_data.sh) the author provided.
+Below is an example of training with latent depth in decoder for one-to-many (O2M) related languages. We use the same preprocessed (numberized and binarized) TED8 dataset as in [Balancing Training for Multilingual Neural Machine Translation (Wang et al., 2020)](https://github.com/cindyxinyiwang/multiDDS), which could be generated by [the script](https://github.com/cindyxinyiwang/multiDDS/blob/multiDDS/util_scripts/prepare_multilingual_data.sh) the author provided.
```bash
lang_pairs_str="eng-aze,eng-bel,eng-ces,eng-glg,eng-por,eng-rus,eng-slk,eng-tur"
databin_dir=<path to binarized data>

-python fairseq_cli/train.py ${databin_dir} \
+fairseq-train ${databin_dir} \
--user-dir examples/latent_depth/src \
--lang-pairs "${lang_pairs_str}" \
--arch multilingual_transformer_iwslt_de_en \
@@ -50,7 +50,7 @@ src_lang=<source language to translate from>
tgt_lang=<target language to translate to>
gen_data=<name of data split, e.g. valid, test, etc>

-python fairseq_cli/generate.py ${databin_dir} \
+fairseq-generate ${databin_dir} \
--path ${model_path} \
--task multilingual_translation_latent_depth \
--decoder-latent-layer \
