diff --git a/docs/source/quicktour.rst b/docs/source/quicktour.rst
index 2d645e8b8cc..8a074ac113e 100644
--- a/docs/source/quicktour.rst
+++ b/docs/source/quicktour.rst
@@ -88,7 +88,7 @@ In the rest of this quick-tour we will use this dataset to fine-tune a Bert mode
 
 As you can see from the above features, the labels are a :class:`nlp.ClassLabel` instance with two classes: ``not_equivalent`` and ``equivalent``.
 
-We can print one example of each class using :func:`nlp.Dataset.filter`and a name-to-integer conversion method of the feature :class:`nlp.ClassLabel` called :func:`nlp.ClassLabel.str2int` (that we detail these methods in :doc:`processing ` and :doc:`exploring `):
+We can print one example of each class using :func:`nlp.Dataset.filter` and a name-to-integer conversion method of the feature :class:`nlp.ClassLabel` called :func:`nlp.ClassLabel.str2int` (that we detail these methods in :doc:`processing ` and :doc:`exploring `):
 
 .. code-block::
 
@@ -111,6 +111,7 @@ Let's import a pretrained Bert model and its tokenizer using 🤗transformers.
 
 .. code-block::
 
+    >>> ## PYTORCH CODE
     >>> from transformers import AutoModelForSequenceClassification, AutoTokenizer
     >>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased')
     Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
@@ -119,6 +120,15 @@ Let's import a pretrained Bert model and its tokenizer using 🤗transformers.
     Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
     You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
     >>> tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
+    >>> ## TENSORFLOW CODE
+    >>> from transformers import TFAutoModelForSequenceClassification, AutoTokenizer
+    >>> model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
+    Some weights of the model checkpoint at bert-base-cased were not used when initializing TFBertForSequenceClassification: ['nsp___cls', 'mlm___cls']
+    - This IS expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
+    - This IS NOT expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+    Some weights of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['dropout_37', 'classifier']
+    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+    >>> tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
 
 🤗transformers warns us that we should probably train this model on a downstream task before using it which is exactly what we are going to do.
 If you want more details on the models and tokenizers of 🤗transformers, you should refer to the documentation and tutorials of this library `which are available here `__.
@@ -172,8 +182,8 @@ Now that we have encoded our dataset, we want to use it in a ``torch.Dataloader`
 
 To be able to train our model with this dataset and PyTorch, we will need to do three modifications:
 
-- rename our ``label`` column in ``labels`` which is the `expected input name for labels in BertForSequenceClassification `__,
-- get pytorch tensors out of our :class:`nlp.Dataset`, instead of python objects, and
+- rename our ``label`` column in ``labels`` which is the expected input name for labels in `BertForSequenceClassification `__ or `TFBertForSequenceClassification `__,
+- get pytorch (or tensorflow) tensors out of our :class:`nlp.Dataset`, instead of python objects, and
 - filter the columns to return only the subset of the columns that we need for our model inputs (``input_ids``, ``token_type_ids`` and ``attention_mask``).
 
 .. note::
@@ -188,10 +198,11 @@ The first modification is just a matter of renaming the column as follow (we cou
 
 The two other modifications can be handled by the :func:`nlp.Dataset.set_format` method which will convert, on the fly, the returned output from :func:`nlp.Dataset.__getitem__` to filter the unwanted columns and convert python objects in PyTorch tensors.
 
-Here is how we can apply the right format to our dataset using :func:`nlp.Dataset.set_format` and wrap it in a ``torch.utils.data.DataLoader``:
+Here is how we can apply the right format to our dataset using :func:`nlp.Dataset.set_format` and wrap it in a ``torch.utils.data.DataLoader`` or a ``tf.data.Dataset``:
 
 .. code-block::
 
+    >>> ## PYTORCH CODE
     >>> import torch
     >>> dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
     >>> dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
@@ -218,11 +229,43 @@ Here is how we can apply the right format to our dataset using :func:`nlp.Datase
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0]])}
+    >>> ## TENSORFLOW CODE
+    >>> import tensorflow as tf
+    >>> dataset.set_format(type='tensorflow', columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
+    >>> features = {x: dataset[x].to_tensor(default_value=0, shape=[None, tokenizer.max_len]) for x in ['input_ids', 'token_type_ids', 'attention_mask']}
+    >>> tfdataset = tf.data.Dataset.from_tensor_slices((features, dataset["labels"])).batch(32)
+    >>> next(iter(tfdataset))
+    ({'input_ids': , 'token_type_ids': , 'attention_mask': }, )
+
 
 We are now ready to train our model. Let's write a simple training loop and a start the training
 
 .. code-block::
 
+    >>> ## PYTORCH CODE
     >>> from tqdm import tqdm
     >>> device = 'cuda' if torch.cuda.is_available() else 'cpu'
     >>> model.train().to(device)
@@ -237,6 +280,12 @@ We are now ready to train our model. Let's write a simple training loop and a st
     >>> optimizer.zero_grad()
     >>> if i % 10 == 0:
    >>>     print(f"loss: {loss}")
+    >>> ## TENSORFLOW CODE
+    >>> loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
+    >>> opt = tf.keras.optimizers.Adam(learning_rate=3e-5)
+    >>> model.compile(optimizer=opt, loss=loss_fn, metrics=["accuracy"])
+    >>> model.fit(tfdataset, epochs=3)
+
 
 Now this was a very simple tour, you should continue with either the detailled notebook which is `here `__ or the in-depth guides on
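
For reference, here is a minimal, self-contained sketch of how the TensorFlow pieces added by this patch fit together end to end. The dataset loading, tokenization and ``label`` → ``labels`` renaming steps are not part of this diff and are assumed from the surrounding quicktour; the remaining calls mirror the added lines, using the ``nlp`` and ``transformers`` APIs current at the time of this patch.

.. code-block:: python

    import tensorflow as tf
    import nlp
    from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
    model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-cased')

    # Assumed setup (not shown in this diff): load MRPC, tokenize the sentence
    # pairs, and expose the label column under the name the model expects.
    dataset = nlp.load_dataset('glue', 'mrpc', split='train')
    dataset = dataset.map(
        lambda ex: tokenizer(ex['sentence1'], ex['sentence2'], truncation=True, padding='max_length'),
        batched=True)
    dataset = dataset.map(lambda ex: {'labels': ex['label']}, batched=True)

    # Steps added by this patch: return TensorFlow tensors from the dataset and
    # wrap the model inputs and labels in a batched tf.data.Dataset.
    dataset.set_format(type='tensorflow',
                       columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
    features = {x: dataset[x].to_tensor(default_value=0, shape=[None, tokenizer.max_len])
                for x in ['input_ids', 'token_type_ids', 'attention_mask']}
    tfdataset = tf.data.Dataset.from_tensor_slices((features, dataset['labels'])).batch(32)

    # Compile and fine-tune, as in the training hunk above.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
    opt = tf.keras.optimizers.Adam(learning_rate=3e-5)
    model.compile(optimizer=opt, loss=loss_fn, metrics=['accuracy'])
    model.fit(tfdataset, epochs=3)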