Fix some writing issues in the docs (huggingface#14136)
* Fix some writing issues in the docs

* Run code quality check
h4iku authored Oct 25, 2021
1 parent 2ac6555 commit 3e04a41
Showing 9 changed files with 28 additions and 28 deletions.
2 changes: 1 addition & 1 deletion ISSUES.md
@@ -205,7 +205,7 @@ You are not required to read the following guidelines before opening an issue. H

If you really tried to make a short reproducible code but couldn't figure it out, it might be that having a traceback will give the developer enough information to know what's going on. But if it is not enough and we can't reproduce the problem, we can't really solve it.

-Do not dispair if you can't figure it out from the begining, just share what you can and perhaps someone else will be able to help you at the forums.
+Do not despair if you can't figure it out from the beginning, just share what you can and perhaps someone else will be able to help you at the forums.

If your setup involves any custom datasets, the best way to help us reproduce the problem is to create a [Google Colab notebook](https://colab.research.google.com/) that demonstrates the issue and once you verify that the issue still exists, include a link to that notebook in the Issue. Just make sure that you don't copy and paste the location bar url of the open notebook - as this is private and we won't be able to open it. Instead, you need to click on `Share` in the right upper corner of the notebook, select `Get Link` and then copy and paste the public link it will give to you.

6 changes: 3 additions & 3 deletions docs/source/add_new_model.rst
@@ -76,7 +76,7 @@ Let's take a look:

As you can see, we do make use of inheritance in 🤗 Transformers, but we keep the level of abstraction to an absolute
minimum. There are never more than two levels of abstraction for any model in the library. :obj:`BrandNewBertModel`
-inherits from :obj:`BrandNewBertPreTrainedModel` which in turn inherits from :class:`~transformres.PreTrainedModel` and
+inherits from :obj:`BrandNewBertPreTrainedModel` which in turn inherits from :class:`~transformers.PreTrainedModel` and
that's it. As a general rule, we want to make sure that a new model only depends on
:class:`~transformers.PreTrainedModel`. The important functionalities that are automatically provided to every new
model are :meth:`~transformers.PreTrainedModel.from_pretrained` and
@@ -271,7 +271,7 @@ logical components from one another and to have faster debugging cycles as inter
notebooks are often easier to share with other contributors, which might be very helpful if you want to ask the Hugging
Face team for help. If you are familiar with Jupiter notebooks, we strongly recommend you to work with them.

-The obvious disadvantage of Jupyther notebooks is that if you are not used to working with them you will have to spend
+The obvious disadvantage of Jupyter notebooks is that if you are not used to working with them you will have to spend
some time adjusting to the new programming environment and that you might not be able to use your known debugging tools
anymore, like ``ipdb``.

@@ -674,7 +674,7 @@ the ``input_ids`` (usually the word embeddings) are identical. And then work you
network. At some point, you will notice a difference between the two implementations, which should point you to the bug
in the 🤗 Transformers implementation. From our experience, a simple and efficient way is to add many print statements
in both the original implementation and 🤗 Transformers implementation, at the same positions in the network
-respectively, and to successively remove print statements showing the same values for intermediate presentions.
+respectively, and to successively remove print statements showing the same values for intermediate presentations.
When you're confident that both implementations yield the same output, verifying the outputs with
``torch.allclose(original_output, output, atol=1e-3)``, you're done with the most difficult part! Congratulations - the
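The ``torch.allclose`` comparison this hunk describes is easy to see in isolation. A standalone sketch with made-up tensors (nothing below is from the file itself):

.. code-block:: python

    import torch

    ref = torch.tensor([1.0000, 2.0000])   # output of the original implementation
    port = torch.tensor([1.0005, 2.0004])  # output of the 🤗 Transformers port

    # Within atol=1e-3: this layer matches, so its print statements can be removed.
    print(torch.allclose(ref, port, atol=1e-3))    # True

    buggy = torch.tensor([1.0500, 2.0000])  # a mismatch at some later layer
    print(torch.allclose(ref, buggy, atol=1e-3))   # False -> the bug is around here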
18 changes: 9 additions & 9 deletions docs/source/add_new_pipeline.rst
@@ -13,9 +13,9 @@ How to add a pipeline to 🤗 Transformers?
=======================================================================================================================

First and foremost, you need to decide the raw entries the pipeline will be able to take. It can be strings, raw bytes,
-dictionnaries or whatever seems to be the most likely desired input. Try to keep these inputs as pure Python as
-possible as it makes compatibility easier (even through other languages via JSON). Those will be the :obj:`inputs` of
-the pipeline (:obj:`preprocess`).
+dictionaries or whatever seems to be the most likely desired input. Try to keep these inputs as pure Python as possible
+as it makes compatibility easier (even through other languages via JSON). Those will be the :obj:`inputs` of the
+pipeline (:obj:`preprocess`).

Then define the :obj:`outputs`. Same policy as the :obj:`inputs`. The simpler, the better. Those will be the outputs of
:obj:`postprocess` method.
@@ -50,15 +50,15 @@ Start by inheriting the base class :obj:`Pipeline`. with the 4 methods needed to
return best_class
-The structure of this breakdown is to support relatively seemless support for CPU/GPU, while supporting doing
+The structure of this breakdown is to support relatively seamless support for CPU/GPU, while supporting doing
pre/postprocessing on the CPU on different threads

-:obj:`preprocess` will take the original defined inputs, and turn them something feedable to the model. It might
-contain more information and is usally a :obj:`Dict`.
+:obj:`preprocess` will take the originally defined inputs, and turn them into something feedable to the model. It might
+contain more information and is usually a :obj:`Dict`.

-:obj:`_forward` is the implementation detail and is not meant to be called directly :obj:`forward` is the preferred
+:obj:`_forward` is the implementation detail and is not meant to be called directly. :obj:`forward` is the preferred
called method as it contains safeguards to make sure everything is working on the expected device. If anything is
-linked to a real model it belongs in the :obj:`_forward` method, anything else is in the preprocess/postrocess.
+linked to a real model it belongs in the :obj:`_forward` method, anything else is in the preprocess/postprocess.

:obj:`postprocess` methods will take the output of :obj:`_forward` and turn it into the final output that were decided
earlier.
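For orientation, a minimal sketch of how the methods above fit together on a :obj:`Pipeline` subclass. The class name and the argmax in :obj:`postprocess` are illustrative assumptions, not the file's exact example (though note the ``return best_class`` context line above):

.. code-block:: python

    from transformers import Pipeline

    class MyClassificationPipeline(Pipeline):  # hypothetical pipeline
        def _sanitize_parameters(self, **kwargs):
            # Route extra kwargs to preprocess, _forward, and postprocess.
            return {}, {}, {}

        def preprocess(self, inputs):
            # Turn the raw inputs into something feedable to the model.
            return self.tokenizer(inputs, return_tensors=self.framework)

        def _forward(self, model_inputs):
            # Anything linked to the real model belongs here.
            return self.model(**model_inputs)

        def postprocess(self, model_outputs):
            # Turn model outputs into the simple output decided earlier.
            best_class = model_outputs.logits.argmax(-1)
            return best_class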
@@ -124,7 +124,7 @@ Create a new file ``tests/test_pipelines_MY_PIPELINE.py`` with example with the
The :obj:`run_pipeline_test` function will be very generic and run on small random models on every possible
architecture as defined by :obj:`model_mapping` and :obj:`tf_model_mapping`.

-This is very important to test future compatibilty, meaning if someone adds a new model for
+This is very important to test future compatibility, meaning if someone adds a new model for
:obj:`XXXForQuestionAnswering` then the pipeline test will attempt to run on it. Because the models are random it's
impossible to check for actual values, that's why There is a helper :obj:`ANY` that will simply attempt to match the
output of the pipeline TYPE.
2 changes: 1 addition & 1 deletion docs/source/community.md
@@ -36,7 +36,7 @@ This page regroups resources around 🤗 Transformers developed by the community
|[fine-tune a non-English GPT-2 Model with Trainer class](https://github.com/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb) | How to fine-tune a non-English GPT-2 Model with Trainer class | [Philipp Schmid](https://www.philschmid.de) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb)|
|[Fine-tune a DistilBERT Model for Multi Label Classification task](https://github.com/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb) | How to fine-tune a DistilBERT Model for Multi Label Classification task | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb)|
|[Fine-tune ALBERT for sentence-pair classification](https://github.com/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb) | How to fine-tune an ALBERT model or another BERT-based model for the sentence-pair classification task | [Nadir El Manouzi](https://github.com/NadirEM) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb)|
-|[Fine-tune Roberta for sentiment analysis](https://github.com/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb) | How to fine-tune an Roberta model for sentiment analysis | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb)|
+|[Fine-tune Roberta for sentiment analysis](https://github.com/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb) | How to fine-tune a Roberta model for sentiment analysis | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb)|
|[Evaluating Question Generation Models](https://github.com/flexudy-pipe/qugeev) | How accurate are the answers to questions generated by your seq2seq transformer model? | [Pascal Zoleko](https://github.com/zolekode) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1bpsSqCQU-iw_5nNoRm_crPq6FRuJthq_?usp=sharing)|
|[Classify text with DistilBERT and Tensorflow](https://github.com/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb) | How to fine-tune DistilBERT for text classification in TensorFlow | [Peter Bayerle](https://github.com/peterbayerle) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb)|
|[Leverage BERT for Encoder-Decoder Summarization on CNN/Dailymail](https://github.com/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb) | How to warm-start a *EncoderDecoderModel* with a *bert-base-uncased* checkpoint for summarization on CNN/Dailymail | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb)|
4 changes: 2 additions & 2 deletions docs/source/converting_tensorflow_models.rst
@@ -13,8 +13,8 @@
Converting Tensorflow Checkpoints
=======================================================================================================================

-A command-line interface is provided to convert original Bert/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints in models
-than be loaded using the ``from_pretrained`` methods of the library.
+A command-line interface is provided to convert original Bert/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints to models
+that can be loaded using the ``from_pretrained`` methods of the library.

.. note::
Since 2.3.0 the conversion script is now part of the transformers CLI (**transformers-cli**) available in any
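The ``transformers-cli`` invocation itself is collapsed in this view. As a rough Python-side sketch, a TF checkpoint can also be loaded directly via ``from_tf=True`` (the paths are placeholders, and whether the path must point at the ``.ckpt.index`` file varies by version):

.. code-block:: python

    from transformers import BertConfig, BertForPreTraining

    # Placeholder paths to a Google-style TF checkpoint and its config.
    config = BertConfig.from_json_file("bert_config.json")
    model = BertForPreTraining.from_pretrained(
        "bert_model.ckpt.index", from_tf=True, config=config
    )
    model.save_pretrained("converted_model")  # writes the PyTorch weights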
12 changes: 6 additions & 6 deletions docs/source/custom_datasets.rst
@@ -17,7 +17,7 @@ Fine-tuning with custom datasets

The datasets used in this tutorial are available and can be more easily accessed using the `🤗 Datasets library
<https://github.com/huggingface/datasets>`_. We do not use this library to access the datasets here since this
-tutorial meant to illustrate how to work with your own data. A brief of introduction can be found at the end of the
+tutorial meant to illustrate how to work with your own data. A brief introduction can be found at the end of the
tutorial in the section ":ref:`datasetslib`".

This tutorial will take you through several examples of using 🤗 Transformers models with your own datasets. The guide
@@ -74,8 +74,8 @@ read this in.
train_texts, train_labels = read_imdb_split('aclImdb/train')
test_texts, test_labels = read_imdb_split('aclImdb/test')
-We now have a train and test dataset, but let's also also create a validation set which we can use for for evaluation
-and tuning without tainting our test set results. Sklearn has a convenient utility for creating such splits:
+We now have a train and test dataset, but let's also create a validation set which we can use for for evaluation and
+tuning without tainting our test set results. Sklearn has a convenient utility for creating such splits:

.. code-block:: python
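The code block this hunk ends on is collapsed; a sketch of such a split using sklearn's documented API (the 20% hold-out size is an assumption):

.. code-block:: python

    from sklearn.model_selection import train_test_split

    # Hold out part of the training data for validation and tuning.
    train_texts, val_texts, train_labels, val_labels = train_test_split(
        train_texts, train_labels, test_size=.2
    )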
@@ -91,8 +91,8 @@ pre-trained DistilBert, so let's use the DistilBert tokenizer.
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
Now we can simply pass our texts to the tokenizer. We'll pass ``truncation=True`` and ``padding=True``, which will
-ensure that all of our sequences are padded to the same length and are truncated to be no longer model's maximum input
-length. This will allow us to feed batches of sequences into the model at the same time.
+ensure that all of our sequences are padded to the same length and are truncated to be no longer than model's maximum
+input length. This will allow us to feed batches of sequences into the model at the same time.

.. code-block:: python
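Again the code block is collapsed here; batch-encoding with these flags looks roughly like this (variable names carried over from earlier in the tutorial):

.. code-block:: python

    # Pad every sequence to the batch maximum, truncate to the model limit.
    train_encodings = tokenizer(train_texts, truncation=True, padding=True)
    val_encodings = tokenizer(val_texts, truncation=True, padding=True)
    test_encodings = tokenizer(test_texts, truncation=True, padding=True)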
@@ -213,7 +213,7 @@ instantiate a :class:`~transformers.Trainer`/:class:`~transformers.TFTrainer`.
Fine-tuning with native PyTorch/TensorFlow
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-We can also train use native PyTorch or TensorFlow:
+We can also train using native PyTorch or TensorFlow:

.. code-block:: python
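The native training loop is collapsed in this view. A sketch of what such a loop typically looks like, assuming a ``train_dataset`` built earlier in the tutorial and placeholder hyperparameters:

.. code-block:: python

    from torch.utils.data import DataLoader
    from transformers import AdamW, DistilBertForSequenceClassification

    model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
    model.train()

    loader = DataLoader(train_dataset, batch_size=16, shuffle=True)  # assumed dataset
    optim = AdamW(model.parameters(), lr=5e-5)

    for batch in loader:
        optim.zero_grad()
        outputs = model(batch['input_ids'],
                        attention_mask=batch['attention_mask'],
                        labels=batch['labels'])
        outputs.loss.backward()  # the model returns a loss when labels are passed
        optim.step()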
2 changes: 1 addition & 1 deletion docs/source/debugging.rst
@@ -154,7 +154,7 @@ input elements was ``6.27e+04`` and same for the output was ``inf``.
You can see here, that ``T5DenseGatedGeluDense.forward`` resulted in output activations, whose absolute max value was
around 62.7K, which is very close to fp16's top limit of 64K. In the next frame we have ``Dropout`` which renormalizes
the weights, after it zeroed some of the elements, which pushes the absolute max value to more than 64K, and we get an
-overlow (``inf``).
+overflow (``inf``).

As you can see it's the previous frames that we need to look into when the numbers start going into very large for fp16
numbers.
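The overflow mechanics are easy to reproduce standalone (an illustration, not from the file):

.. code-block:: python

    import torch

    print(torch.finfo(torch.float16).max)           # 65504.0, fp16's top limit
    x = torch.tensor(62700.0, dtype=torch.float16)  # like the 6.27e+04 activation
    print(x * 1.1)                                  # tensor(inf, dtype=torch.float16)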
2 changes: 1 addition & 1 deletion docs/source/model_sharing.rst
@@ -90,7 +90,7 @@ Directly push your model to the hub
picture-in-picture" allowfullscreen></iframe>

Once you have an API token (either stored in the cache or copied and pasted in your notebook), you can directly push a
-finetuned model you saved in :obj:`save_drectory` by calling:
+finetuned model you saved in :obj:`save_directory` by calling:

.. code-block:: python
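The call itself is collapsed here; a minimal sketch, assuming a model reloaded from that directory and a placeholder repository name:

.. code-block:: python

    from transformers import AutoModel

    model = AutoModel.from_pretrained(save_directory)  # the directory mentioned above
    model.push_to_hub("my-finetuned-model")            # placeholder repo name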