diff --git a/docs/source/quicktour.rst b/docs/source/quicktour.rst
index 2d645e8b8cc..8a074ac113e 100644
--- a/docs/source/quicktour.rst
+++ b/docs/source/quicktour.rst
@@ -88,7 +88,7 @@ In the rest of this quick-tour we will use this dataset to fine-tune a Bert mode
As you can see from the above features, the labels are a :class:`nlp.ClassLabel` instance with two classes: ``not_equivalent`` and ``equivalent``.
-We can print one example of each class using :func:`nlp.Dataset.filter`and a name-to-integer conversion method of the feature :class:`nlp.ClassLabel` called :func:`nlp.ClassLabel.str2int` (that we detail these methods in :doc:`processing ` and :doc:`exploring `):
+We can print one example of each class using :func:`nlp.Dataset.filter` and a name-to-integer conversion method of the feature :class:`nlp.ClassLabel` called :func:`nlp.ClassLabel.str2int` (we detail these methods in :doc:`processing <processing>` and :doc:`exploring <exploring>`):
.. code-block::
@@ -111,6 +111,7 @@ Let's import a pretrained Bert model and its tokenizer using 🤗transformers.
.. code-block::
+ >>> ## PYTORCH CODE
>>> from transformers import AutoModelForSequenceClassification, AutoTokenizer
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased')
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
@@ -119,6 +120,15 @@ Let's import a pretrained Bert model and its tokenizer using 🤗transformers.
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
>>> tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
+ >>> ## TENSORFLOW CODE
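+ >>> # the same checkpoint, loaded this time as a TensorFlow model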
+ >>> from transformers import TFAutoModelForSequenceClassification, AutoTokenizer
+ >>> model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
+ Some weights of the model checkpoint at bert-base-cased were not used when initializing TFBertForSequenceClassification: ['nsp___cls', 'mlm___cls']
+ - This IS expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
+ - This IS NOT expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ Some weights of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['dropout_37', 'classifier']
+ You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+ >>> tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
🤗transformers warns us that we should probably train this model on a downstream task before using it, which is exactly what we are going to do.
If you want more details on the models and tokenizers of 🤗transformers, you should refer to the documentation and tutorials of this library `which are available here `__.
@@ -172,8 +182,8 @@ Now that we have encoded our dataset, we want to use it in a ``torch.Dataloader`
To be able to train our model with this dataset and PyTorch, we will need to do three modifications:
-- rename our ``label`` column in ``labels`` which is the `expected input name for labels in BertForSequenceClassification `__,
-- get pytorch tensors out of our :class:`nlp.Dataset`, instead of python objects, and
+- rename our ``label`` column to ``labels``, which is the expected input name for labels in `BertForSequenceClassification `__ or `TFBertForSequenceClassification `__,
+- get PyTorch (or TensorFlow) tensors out of our :class:`nlp.Dataset`, instead of Python objects, and
- filter the columns to return only the subset of the columns that we need for our model inputs (``input_ids``, ``token_type_ids`` and ``attention_mask``).
.. note::
@@ -188,10 +198,11 @@ The first modification is just a matter of renaming the column as follow (we cou
The two other modifications can be handled by the :func:`nlp.Dataset.set_format` method, which converts, on the fly, the output returned by :func:`nlp.Dataset.__getitem__`: it filters out the unwanted columns and converts Python objects into PyTorch tensors.
-Here is how we can apply the right format to our dataset using :func:`nlp.Dataset.set_format` and wrap it in a ``torch.utils.data.DataLoader``:
+Here is how we can apply the right format to our dataset using :func:`nlp.Dataset.set_format` and wrap it in a ``torch.utils.data.DataLoader`` or a ``tf.data.Dataset``:
.. code-block::
+ >>> ## PYTORCH CODE
>>> import torch
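+ >>> # set_format makes __getitem__ return only the selected columns, already converted to torch tensors, so the dataset can be wrapped directly in a DataLoader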
>>> dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
>>> dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
@@ -218,11 +229,43 @@ Here is how we can apply the right format to our dataset using :func:`nlp.Datase
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]])}
+ >>> ## TENSORFLOW CODE
+ >>> import tensorflow as tf
+ >>> dataset.set_format(type='tensorflow', columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
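+ >>> # with the 'tensorflow' format, each column comes back as a ragged tensor; to_tensor() pads it with 0 up to the tokenizer's maximum length so the examples can be batched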
+ >>> features = {x: dataset[x].to_tensor(default_value=0, shape=[None, tokenizer.max_len]) for x in ['input_ids', 'token_type_ids', 'attention_mask']}
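+ >>> # pair the padded features with the labels and batch the resulting tf.data.Dataset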
+ >>> tfdataset = tf.data.Dataset.from_tensor_slices((features, dataset["labels"])).batch(32)
+ >>> next(iter(tfdataset))
+ ({'input_ids': <tf.Tensor: shape=(32, 512), ...>, 'token_type_ids': <tf.Tensor: shape=(32, 512), ...>, 'attention_mask': <tf.Tensor: shape=(32, 512), ...>}, <tf.Tensor: shape=(32,), ...>)
+
We are now ready to train our model. Let's write a simple training loop and start the training:
.. code-block::
+ >>> ## PYTORCH CODE
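+ >>> # a standard PyTorch fine-tuning loop: put the model in training mode, move it to the GPU if one is available, then iterate over the batches from the DataLoader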
>>> from tqdm import tqdm
>>> device = 'cuda' if torch.cuda.is_available() else 'cpu'
>>> model.train().to(device)
@@ -237,6 +280,12 @@ We are now ready to train our model. Let's write a simple training loop and a st
>>> optimizer.zero_grad()
>>> if i % 10 == 0:
>>> print(f"loss: {loss}")
+ >>> ## TENSORFLOW CODE
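+ >>> # with Keras there is no manual loop: compile the model with a loss and an optimizer, then call fit() on the tf.data.Dataset built above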
+ >>> loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
+ >>> opt = tf.keras.optimizers.Adam(learning_rate=3e-5)
+ >>> model.compile(optimizer=opt, loss=loss_fn, metrics=["accuracy"])
+ >>> model.fit(tfdataset, epochs=3)
+
Now this was a very simple tour; you should continue with either the detailed notebook which is `here `__ or the in-depth guides on