Commit 1581ccb

wip:add some explanations
1 parent 41389a9 commit 1581ccb

1 file changed, +10 -7 lines changed

docs/nlu/using-bert.rst

Lines changed: 10 additions & 7 deletions
@@ -81,16 +81,19 @@ config/config-heavy.yml
 
 Note the differences in these files.
 
-In the light configuration we have a CountVectorsFeaturizer, which we
-replace in the heavy variant with a HFTransformersNLP together with the
-LanguageModelTokenizer and LanguageModelFeaturizer. Notice that we're
-no longer using the original WhitespaceTokenizer because tokenization
-is now handled by Bert.
+In the light configuration we have :ref:`CountVectorsFeaturizer` which create bag-of-word
+representations for each incoming message(at word and character levels). The heavy configuration replaces it with a
+BERT model inside the pipeline. :ref:`HFTransformersNLP` is a utility component that does the heavy lifting work of loading the
+``BERT`` model in memory. Under the hood it leverages HuggingFace's `Transformers library <https://huggingface.co/transformers/>`_ to initialize the specified language model.
+Notice that we add two additional components :ref:`LanguageModelTokenizer` and :ref:`LanguageModelFeaturizer` which
+pick up the tokens and feature vectors respectively that are constructed by the utility component.
+
+We use the same :ref:`diet-classifier` model for combined intent classification and entity recognition in both cases.
 
 Run the Pipelines
 -----------------
 
-You can run both configuarions yourself.
+You can run both configurations yourself.
 
 .. code-block:: yaml
 
@@ -127,7 +130,7 @@ something to seriously consider.
 Results
 -------
 
-We've summerised the results into two charts, one for intents and
+We've summarised the results into two charts, one for intents and
 one for entities.
 
 