Examples reorg (huggingface#11350)
* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <[email protected]>

* Address review comments and clean

Co-authored-by: Lysandre Debut <[email protected]>
sgugger and LysandreJik authored Apr 21, 2021
1 parent ca7ff64 commit dabeb15
Showing 105 changed files with 1,060 additions and 558 deletions.
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -306,12 +306,12 @@ jobs:
- v0.4-{{ checksum "setup.py" }}
- run: pip install --upgrade pip
- run: pip install .[sklearn,torch,sentencepiece,testing]
- run: pip install -r examples/_tests_requirements.txt
- run: pip install -r examples/pytorch/_tests_requirements.txt
- save_cache:
key: v0.4-torch_examples-{{ checksum "setup.py" }}
paths:
- '~/.cache/pip'
- run: TRANSFORMERS_IS_CI=1 python -m pytest -n 8 --dist=loadfile -s --make-reports=examples_torch ./examples/ | tee examples_output.txt
- run: TRANSFORMERS_IS_CI=1 python -m pytest -n 8 --dist=loadfile -s --make-reports=examples_torch ./examples/pytorch/ | tee examples_output.txt
- store_artifacts:
path: ~/transformers/examples_output.txt
- store_artifacts:
2 changes: 1 addition & 1 deletion .github/workflows/self-scheduled.yml
@@ -59,7 +59,7 @@ jobs:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
run: |
pip install -r examples/_tests_requirements.txt
pip install -r examples/pytorch/_tests_requirements.txt
python -m pytest -n 1 --dist=loadfile --make-reports=examples_torch_gpu examples
- name: Failure short reports
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -285,7 +285,7 @@ $ python -m pytest -n auto --dist=loadfile -s -v ./tests/
and for the examples:

```bash
$ pip install -r examples/requirements.txt # only needed the first time
$ pip install -r examples/xxx/requirements.txt # only needed the first time
$ python -m pytest -n auto --dist=loadfile -s -v ./examples/
```
In fact, that's how `make test` and `make test-examples` are implemented (sans the `pip install` line)!
2 changes: 1 addition & 1 deletion Makefile
@@ -73,7 +73,7 @@ test:
# Run tests for examples

test-examples:
python -m pytest -n auto --dist=loadfile -s -v ./examples/
python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/

# Run tests for SageMaker DLC release

2 changes: 1 addition & 1 deletion docker/transformers-pytorch-tpu/Dockerfile
@@ -53,7 +53,7 @@ RUN git clone https://github.com/huggingface/transformers.git && \
git checkout CI && \
cd .. && \
pip install ./transformers && \
pip install -r ./transformers/examples/requirements.txt && \
pip install -r ./transformers/examples/pytorch/_test_requirements.txt && \
pip install pytest

RUN python -c "import torch_xla; print(torch_xla.__version__)"
2 changes: 1 addition & 1 deletion docker/transformers-pytorch-tpu/bert-base-cased.jsonnet
@@ -27,7 +27,7 @@ local bertBaseCased = base.BaseTest {
},
command: utils.scriptCommand(
|||
python -m pytest -s transformers/examples/test_xla_examples.py -v
python -m pytest -s transformers/examples/pytorch/test_xla_examples.py -v
test_exit_code=$?
echo "\nFinished running commands.\n"
test $test_exit_code -eq 0
4 changes: 2 additions & 2 deletions docs/source/benchmarks.rst
@@ -65,10 +65,10 @@ respectively.
.. code-block:: bash
## PYTORCH CODE
python examples/benchmarking/run_benchmark.py --help
python examples/pytorch/benchmarking/run_benchmark.py --help
## TENSORFLOW CODE
python examples/benchmarking/run_benchmark_tf.py --help
python examples/tensorflow/benchmarking/run_benchmark_tf.py --help
An instantiated benchmark object can then simply be run by calling ``benchmark.run()``.
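To make that last sentence concrete, here is a minimal sketch of instantiating and running a PyTorch benchmark object, following the argument classes these scripts build on (the model name and sizes below are only illustrative):

```python
# Minimal sketch: build a benchmark object and run it.
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

args = PyTorchBenchmarkArguments(
    models=["bert-base-uncased"],   # illustrative checkpoint
    batch_sizes=[8],
    sequence_lengths=[8, 32, 128],
)
benchmark = PyTorchBenchmark(args)
results = benchmark.run()  # returns timing and memory measurements
```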
4 changes: 2 additions & 2 deletions docs/source/converting_tensorflow_models.rst
@@ -33,8 +33,8 @@ You can convert any TensorFlow checkpoint for BERT (in particular `the pre-train
This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
configuration file (\ ``bert_config.json``\ ), and creates a PyTorch model for this configuration, loads the weights
from the TensorFlow checkpoint in the PyTorch model and saves the resulting model in a standard PyTorch save file that
can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , `run_glue.py
<https://github.com/huggingface/transformers/blob/master/examples/text-classification/run_glue.py>`_\ ).
can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , :prefix_link:`run_glue.py
<examples/pytorch/text-classification/run_glue.py>` \ ).
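As a small sketch of that last point, assuming the converted weights were saved as `pytorch_model.bin` next to a `config.json` in a local directory (the path below is hypothetical), the result loads like any other local checkpoint:

```python
# Sketch: load a locally converted checkpoint; the directory is assumed to
# contain the config.json and pytorch_model.bin produced by the conversion CLI.
from transformers import BertForPreTraining

model = BertForPreTraining.from_pretrained("/path/to/converted_bert")
```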

You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow
checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\
4 changes: 2 additions & 2 deletions docs/source/installation.md
@@ -168,13 +168,13 @@ Here is an example of how this can be used on a filesystem that is shared betwee
On the instance with the normal network run your program which will download and cache models (and optionally datasets if you use 🤗 Datasets). For example:

```
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```

and then with the same filesystem you can now run the same program on a firewalled instance:
```
HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
and it should succeed without any hanging waiting to timeout.
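The same offline switches can also be set from Python before anything is imported, which is convenient in notebooks; a minimal sketch, assuming the cache was already populated on the online instance (the model name is only an example):

```python
# Sketch: offline mode from Python. The environment variables must be set
# before transformers/datasets are imported so they take effect.
import os
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"

from transformers import AutoTokenizer

# Served entirely from the local cache populated on the online instance.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
```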

10 changes: 5 additions & 5 deletions docs/source/main_classes/processors.rst
@@ -68,8 +68,8 @@ Additionally, the following method can be used to load values from a data file a
Example usage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An example using these processors is given in the `run_glue.py
<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_glue.py>`__ script.
An example using these processors is given in the :prefix_link:`run_glue.py
<examples/legacy/text-classification/run_glue.py>` script.


XNLI
@@ -89,8 +89,8 @@ This library hosts the processor to load the XNLI data:

Please note that since the gold labels are available on the test set, evaluation is performed on the test set.

An example using these processors is given in the `run_xnli.py
<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_xnli.py>`__ script.
An example using these processors is given in the :prefix_link:`run_xnli.py
<examples/legacy/text-classification/run_xnli.py>` script.


SQuAD
@@ -169,4 +169,4 @@ Using `tensorflow_datasets` is as easy as using a data file:
Another example using these processors is given in the :prefix_link:`run_squad.py
<examples/question-answering/run_squad.py>` script.
<examples/legacy/question-answering/run_squad.py>` script.
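To make the processor flow above concrete, here is a minimal sketch of loading SQuAD v1.1 examples from disk and converting them to model features (the data path and sizes are illustrative):

```python
# Sketch: load SQuAD v1.1 examples and convert them to features.
from transformers import (
    BertTokenizer,
    SquadV1Processor,
    squad_convert_examples_to_features,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
processor = SquadV1Processor()
examples = processor.get_train_examples("/path/to/squad")  # expects train-v1.1.json

features = squad_convert_examples_to_features(
    examples=examples,
    tokenizer=tokenizer,
    max_seq_length=384,
    doc_stride=128,
    max_query_length=64,
    is_training=True,
)
```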
14 changes: 7 additions & 7 deletions docs/source/main_classes/trainer.rst
@@ -338,7 +338,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:

.. code-block:: bash
python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir \
--do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -363,7 +363,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:

.. code-block:: bash
python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir \
--do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -540,7 +540,7 @@ Here is an example of running ``run_translation.py`` under DeepSpeed deploying a

.. code-block:: bash
deepspeed examples/seq2seq/run_translation.py \
deepspeed examples/pytorch/translation/run_translation.py \
--deepspeed tests/deepspeed/ds_config.json \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir --fp16 \
@@ -565,7 +565,7 @@ To deploy DeepSpeed with one GPU adjust the :class:`~transformers.Trainer` comma

.. code-block:: bash
deepspeed --num_gpus=1 examples/seq2seq/run_translation.py \
deepspeed --num_gpus=1 examples/pytorch/translation/run_translation.py \
--deepspeed tests/deepspeed/ds_config.json \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir --fp16 \
@@ -617,7 +617,7 @@ Notes:

.. code-block:: bash
deepspeed --include localhost:1 examples/seq2seq/run_translation.py ...
deepspeed --include localhost:1 examples/pytorch/translation/run_translation.py ...
In this example, we tell DeepSpeed to use GPU 1 (second gpu).

@@ -711,7 +711,7 @@ shell from a cell. For example, to use ``run_translation.py`` you would launch i
.. code-block::
!git clone https://github.com/huggingface/transformers
!cd transformers; deepspeed examples/seq2seq/run_translation.py ...
!cd transformers; deepspeed examples/pytorch/translation/run_translation.py ...
or with ``%%bash`` magic, where you can write a multi-line code for the shell program to run:

@@ -721,7 +721,7 @@ or with ``%%bash`` magic, where you can write a multi-line code for the shell pr
git clone https://github.com/huggingface/transformers
cd transformers
deepspeed examples/seq2seq/run_translation.py ...
deepspeed examples/pytorch/translation/run_translation.py ...
In such case you don't need any of the code presented at the beginning of this section.

2 changes: 1 addition & 1 deletion docs/source/model_doc/bart.rst
@@ -43,7 +43,7 @@ Examples
_______________________________________________________________________________________________________________________

- Examples and scripts for fine-tuning BART and other models for sequence to sequence tasks can be found in
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.
- An example of how to train :class:`~transformers.BartForConditionalGeneration` with a Hugging Face :obj:`datasets`
object can be found in this `forum discussion
<https://discuss.huggingface.co/t/train-bart-for-conditional-generation-e-g-summarization/1904>`__.
2 changes: 1 addition & 1 deletion docs/source/model_doc/barthez.rst
@@ -43,7 +43,7 @@ Examples
_______________________________________________________________________________________________________________________

- BARThez can be fine-tuned on sequence-to-sequence tasks in a similar way as BART, check:
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.


BarthezTokenizer
4 changes: 2 additions & 2 deletions docs/source/model_doc/distilbert.rst
@@ -44,8 +44,8 @@ Tips:
- DistilBERT doesn't have options to select the input positions (:obj:`position_ids` input). This could be added if
necessary though, just let us know if you need this option.

This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found `here
<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found
:prefix_link:`here <examples/research-projects/distillation>`.


DistilBertConfig
3 changes: 2 additions & 1 deletion docs/source/model_doc/pegasus.rst
@@ -53,7 +53,8 @@ Examples
_______________________________________________________________________________________________________________________

- :prefix_link:`Script <examples/research_projects/seq2seq-distillation/finetune_pegasus_xsum.sh>` to fine-tune pegasus
on the XSUM dataset. Data download instructions at :prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
on the XSUM dataset. Data download instructions at :prefix_link:`examples/pytorch/summarization/
<examples/pytorch/summarization/README.md>`.
- FP16 is not supported (help/ideas on this appreciated!).
- The adafactor optimizer is recommended for pegasus fine-tuning.

2 changes: 1 addition & 1 deletion docs/source/model_doc/retribert.rst
@@ -21,7 +21,7 @@ Question Answering <https://yjernite.github.io/lfqa.html>`__. RetriBERT is a sma
pair of BERT encoders with lower-dimension projection for dense semantic indexing of text.

This model was contributed by `yjernite <https://huggingface.co/yjernite>`__. Code to train and use the model can be
found `here <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
found :prefix_link:`here <examples/research-projects/distillation>`.


RetriBertConfig
2 changes: 1 addition & 1 deletion docs/source/model_doc/xlnet.rst
@@ -41,7 +41,7 @@ Tips:
using only a sub-set of the output tokens as target which are selected with the :obj:`target_mapping` input.
- To use XLNet for sequential decoding (i.e. not in fully bi-directional setting), use the :obj:`perm_mask` and
:obj:`target_mapping` inputs to control the attention span and outputs (see examples in
`examples/text-generation/run_generation.py`)
`examples/pytorch/text-generation/run_generation.py`)
- XLNet is one of the few models that has no sequence length limit.
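A short sketch of the ``perm_mask``/``target_mapping`` usage mentioned in the tip above, adapted from the model docstrings (the input sentence is arbitrary):

```python
# Sketch: use perm_mask and target_mapping so XLNet predicts only the last token.
import torch
from transformers import XLNetLMHeadModel, XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

input_ids = torch.tensor(
    tokenizer.encode("Hello, my dog is very <mask>", add_special_tokens=False)
).unsqueeze(0)

seq_len = input_ids.shape[1]
perm_mask = torch.zeros((1, seq_len, seq_len), dtype=torch.float)
perm_mask[:, :, -1] = 1.0  # no token may attend to the last token

target_mapping = torch.zeros((1, 1, seq_len), dtype=torch.float)
target_mapping[0, 0, -1] = 1.0  # predict only the last token

outputs = model(input_ids, perm_mask=perm_mask, target_mapping=target_mapping)
next_token_logits = outputs.logits  # shape: (1, 1, vocab_size)
```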

This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
3 changes: 2 additions & 1 deletion docs/source/model_summary.rst
@@ -682,7 +682,8 @@ The `mbart-large-en-ro checkpoint <https://huggingface.co/facebook/mbart-large-e
romanian translation.

The `mbart-large-cc25 <https://huggingface.co/facebook/mbart-large-cc25>`_ checkpoint can be finetuned for other
translation and summarization tasks, using code in ```examples/seq2seq/``` , but is not very useful without finetuning.
translation and summarization tasks, using code in ```examples/pytorch/translation/``` , but is not very useful without
finetuning.


ProphetNet
4 changes: 2 additions & 2 deletions docs/source/multilingual.rst
@@ -90,8 +90,8 @@ You can then feed it all as input to your model:
>>> outputs = model(input_ids, langs=langs)
The example :prefix_link:`run_generation.py <examples/text-generation/run_generation.py>` can generate text using the
CLM checkpoints from XLM, using the language embeddings.
The example :prefix_link:`run_generation.py <examples/pytorch/text-generation/run_generation.py>` can generate text
using the CLM checkpoints from XLM, using the language embeddings.

XLM without Language Embeddings
-----------------------------------------------------------------------------------------------------------------------
4 changes: 2 additions & 2 deletions docs/source/sagemaker.md
@@ -325,7 +325,7 @@ When you create a `HuggingFace` Estimator, you can specify a [training script th

If you are using `git_config` to run the [🤗 Transformers examples scripts](https://github.com/huggingface/transformers/tree/master/examples), keep in mind that you need to configure the right `'branch'` for your `transformers_version`, e.g. if you use `transformers_version='4.4.2'` you have to use `'branch':'v4.4.2'`.

As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/text-classification).
As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification).

_Tip: define `output_dir` as `/opt/ml/model` in the hyperparameter for the script to save your model to S3 after training._

@@ -338,7 +338,7 @@ git_config = {'repo': 'https://github.com/huggingface/transformers.git','branch'
# create the Estimator
huggingface_estimator = HuggingFace(
entry_point='run_glue.py',
source_dir='./examples/text-classification',
source_dir='./examples/pytorch/text-classification',
git_config=git_config,
instance_type='ml.p3.2xlarge',
instance_count=1,