
Commit 7cade8c

Merge pull request lisa-lab#188 from kirkins/patch-1
fix typos/spelling
2 parents 6ef907b + 85962ee commit 7cade8c

8 files changed: +36 −36 lines changed


doc/DBN.txt

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ Deep Belief Networks
 .. note::
     This section assumes the reader has already read through :doc:`logreg`
     and :doc:`mlp` and :doc:`rbm`. Additionally it uses the following Theano
-    functions and concepts : `T.tanh`_, `shared variables`_, `basic arithmetic
+    functions and concepts: `T.tanh`_, `shared variables`_, `basic arithmetic
     ops`_, `T.grad`_, `Random numbers`_, `floatX`_. If you intend to run the
     code on GPU also read `GPU`_.

@@ -210,7 +210,7 @@ obtained over these sets.
 Putting it all together
 +++++++++++++++++++++++

-The few lines of code below constructs the deep belief network :
+The few lines of code below constructs the deep belief network:

 .. literalinclude:: ../code/DBN.py
     :start-after: # numpy random generator

doc/SdA.txt

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ Stacked Denoising Autoencoders (SdA)
 .. note::
     This section assumes you have already read through :doc:`logreg`
     and :doc:`mlp`. Additionally it uses the following Theano functions
-    and concepts : `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_. If you intend to run the code on GPU also read `GPU`_.
+    and concepts: `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_. If you intend to run the code on GPU also read `GPU`_.

 .. _T.tanh: http://deeplearning.net/software/theano/tutorial/examples.html?highlight=tanh

doc/dA.txt

Lines changed: 8 additions & 8 deletions
@@ -6,7 +6,7 @@ Denoising Autoencoders (dA)
 .. note::
     This section assumes the reader has already read through :doc:`logreg`
     and :doc:`mlp`. Additionally it uses the following Theano functions
-    and concepts : `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_. If you intend to run the code on GPU also read `GPU`_.
+    and concepts: `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_. If you intend to run the code on GPU also read `GPU`_.

 .. _T.tanh: http://deeplearning.net/software/theano/tutorial/examples.html?highlight=tanh

@@ -126,7 +126,7 @@ signal:
     :pyobject: dA.get_reconstructed_input

 And using these functions we can compute the cost and the updates of
-one stochastic gradient descent step :
+one stochastic gradient descent step:

 .. literalinclude:: ../code/dA.py
     :pyobject: dA.get_cost_updates
@@ -209,7 +209,7 @@ need to do is to add a stochastic corruption step operating on the input. The in
 corrupted in many ways, but in this tutorial we will stick to the original
 corruption mechanism of randomly masking entries of the input by making
 them zero. The code below
-does just that :
+does just that:

 .. literalinclude:: ../code/dA.py
     :pyobject: dA.get_corrupted_input
@@ -221,7 +221,7 @@ For this reason, the constructor of the ``dA`` also gets Theano variables
 pointing to the shared parameters. If those parameters are left to ``None``,
 new ones will be constructed.

-The final denoising autoencoder class becomes :
+The final denoising autoencoder class becomes:

 .. literalinclude:: ../code/dA.py
     :pyobject: dA
@@ -254,7 +254,7 @@ constant (weights are converted to values between 0 and 1).
 To plot our filters we will need the help of ``tile_raster_images`` (see
 :ref:`how-to-plot`) so we urge the reader to study it. Also
 using the help of the Python Image Library, the following lines of code will
-save the filters as an image :
+save the filters as an image:

 .. literalinclude:: ../code/dA.py
     :start-after: start-snippet-4
@@ -264,20 +264,20 @@ save the filters as an image :
 Running the Code
 ++++++++++++++++

-To run the code :
+To run the code:

 .. code-block:: bash

     python dA.py

-The resulted filters when we do not use any noise are :
+The resulted filters when we do not use any noise are:

 .. figure:: images/filters_corruption_0.png
     :align: center


-The filters for 30 percent noise :
+The filters for 30 percent noise:

 .. figure:: images/filters_corruption_30.png

doc/gettingstarted.txt

Lines changed: 8 additions & 8 deletions
@@ -85,7 +85,7 @@ MNIST Dataset
 variables and access it based on the minibatch index, given a fixed
 and known batch size. The reason behind shared variables is
 related to using the GPU. There is a large overhead when copying data
-into the GPU memory. If you would copy data on request ( each minibatch
+into the GPU memory. If you would copy data on request (each minibatch
 individually when needed) as the code will do if you do not use shared
 variables, due to this overhead, the GPU code will not be much faster
 then the CPU code (maybe even slower). If you have your data in
@@ -147,7 +147,7 @@ MNIST Dataset

 The data has to be stored as floats on the GPU ( the right
 ``dtype`` for storing on the GPU is given by ``theano.config.floatX``).
-To get around this shortcomming for the labels, we store them as float,
+To get around this shortcoming for the labels, we store them as float,
 and then cast it to int.

 .. note::
@@ -286,7 +286,7 @@ In this tutorial, :math:`f` is defined as:

     f(x) = {\rm argmax}_k P(Y=k | x, \theta)

-In python, using Theano this can be written as :
+In python, using Theano this can be written as:

 .. code-block:: python

@@ -316,7 +316,7 @@ The likelihood of the correct class is not the same as the
 number of right predictions, but from the point of view of a randomly
 initialized classifier they are pretty similar.
 Remember that likelihood and zero-one loss are different objectives;
-you should see that they are corralated on the validation set but
+you should see that they are correlated on the validation set but
 sometimes one will rise while the other falls, or vice-versa.

 Since we usually speak in terms of minimizing a loss function, learning will
@@ -331,7 +331,7 @@ The NLL of our classifier is a differentiable surrogate for the zero-one loss,
 and we use the gradient of this function over our training data as a
 supervised learning signal for deep learning of a classifier.

-This can be computed using the following line of code :
+This can be computed using the following line of code:

 .. code-block:: python

@@ -357,7 +357,7 @@ algorithm in which we repeatedly make small steps downward on an error
 surface defined by a loss function of some parameters.
 For the purpose of ordinary gradient descent we consider that the training
 data is rolled into the loss function. Then the pseudocode of this
-algorithm can be described as :
+algorithm can be described as:

 .. code-block:: python

@@ -421,11 +421,11 @@ but this choice is almost arbitrary (though harmless).
 because it controls the number of updates done to your parameters. Training the same model
 for 10 epochs using a batch size of 1 yields completely different results compared
 to training for the same 10 epochs but with a batchsize of 20. Keep this in mind when
-switching between batch sizes and be prepared to tweak all the other parameters acording
+switching between batch sizes and be prepared to tweak all the other parameters according
 to the batch size used.

 All code-blocks above show pseudocode of how the algorithm looks like. Implementing such
-algorithm in Theano can be done as follows :
+algorithm in Theano can be done as follows:

 .. code-block:: python

doc/logreg.txt

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@ within the DeepLearningTutorials folder:

     python code/logistic_sgd.py

-The output one should expect is of the form :
+The output one should expect is of the form:

 .. code-block:: bash

doc/lstm.txt

Lines changed: 12 additions & 12 deletions
@@ -75,10 +75,10 @@ previous state, as needed.
 .. figure:: images/lstm_memorycell.png
     :align: center

-    **Figure 1** : Illustration of an LSTM memory cell.
+    **Figure 1**: Illustration of an LSTM memory cell.

 The equations below describe how a layer of memory cells is updated at every
-timestep :math:`t`. In these equations :
+timestep :math:`t`. In these equations:

 * :math:`x_t` is the input to the memory cell layer at time :math:`t`
 * :math:`W_i`, :math:`W_f`, :math:`W_c`, :math:`W_o`, :math:`U_i`,
@@ -89,7 +89,7 @@ timestep :math:`t`. In these equations :

 First, we compute the values for :math:`i_t`, the input gate, and
 :math:`\widetilde{C_t}` the candidate value for the states of the memory
-cells at time :math:`t` :
+cells at time :math:`t`:

 .. math::
     :label: 1
@@ -102,7 +102,7 @@ cells at time :math:`t` :
     \widetilde{C_t} = tanh(W_c x_t + U_c h_{t-1} + b_c)

 Second, we compute the value for :math:`f_t`, the activation of the memory
-cells' forget gates at time :math:`t` :
+cells' forget gates at time :math:`t`:

 .. math::
     :label: 3
@@ -111,15 +111,15 @@ cells' forget gates at time :math:`t` :

 Given the value of the input gate activation :math:`i_t`, the forget gate
 activation :math:`f_t` and the candidate state value :math:`\widetilde{C_t}`,
-we can compute :math:`C_t` the memory cells' new state at time :math:`t` :
+we can compute :math:`C_t` the memory cells' new state at time :math:`t`:

 .. math::
     :label: 4

     C_t = i_t * \widetilde{C_t} + f_t * C_{t-1}

 With the new state of the memory cells, we can compute the value of their
-output gates and, subsequently, their outputs :
+output gates and, subsequently, their outputs:

 .. math::
     :label: 5
@@ -139,7 +139,7 @@ In this variant, the activation of a cell’s output gate does not depend on the
 memory cell’s state :math:`C_t`. This allows us to perform part of the
 computation more efficiently (see the implementation note, below, for
 details). This means that, in the variant we have implemented, there is no
-matrix :math:`V_o` and equation :eq:`5` is replaced by equation :eq:`5-alt` :
+matrix :math:`V_o` and equation :eq:`5` is replaced by equation :eq:`5-alt`:

 .. math::
     :label: 5-alt
@@ -170,7 +170,7 @@ concatenating the four matrices :math:`W_*` into a single weight matrix
 :math:`W` and performing the same concatenation on the weight matrices
 :math:`U_*` to produce the matrix :math:`U` and the bias vectors :math:`b_*`
 to produce the vector :math:`b`. Then, the pre-nonlinearity activations can
-be computed with :
+be computed with:

 .. math::

@@ -187,11 +187,11 @@ Code - Citations - Contact
 Code
 ====

-The LSTM implementation can be found in the two following files :
+The LSTM implementation can be found in the two following files:

-* `lstm.py <http://deeplearning.net/tutorial/code/lstm.py>`_ : Main script. Defines and train the model.
+* `lstm.py <http://deeplearning.net/tutorial/code/lstm.py>`_: Main script. Defines and train the model.

-* `imdb.py <http://deeplearning.net/tutorial/code/imdb.py>`_ : Secondary script. Handles the loading and preprocessing of the IMDB dataset.
+* `imdb.py <http://deeplearning.net/tutorial/code/imdb.py>`_: Secondary script. Handles the loading and preprocessing of the IMDB dataset.

 After downloading both scripts and putting both in the same folder, the user
 can run the code by calling:
@@ -202,7 +202,7 @@ can run the code by calling:

 The script will automatically download the data and decompress it.

-**Note** : The provided code supports the Stochastic Gradient Descent (SGD),
+**Note**: The provided code supports the Stochastic Gradient Descent (SGD),
 AdaDelta and RMSProp optimization methods. You are advised to use AdaDelta or
 RMSProp because SGD appears to performs poorly on this task with this
 particular model.

doc/mlp.txt

Lines changed: 2 additions & 2 deletions
@@ -178,13 +178,13 @@ The code below shows how this can be done, in a way which is analogous to our pr

 .. literalinclude:: ../code/mlp.py

-The user can then run the code by calling :
+The user can then run the code by calling:

 .. code-block:: bash

     python code/mlp.py

-The output one should expect is of the form :
+The output one should expect is of the form:

 .. code-block:: bash

doc/rbm.txt

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ Restricted Boltzmann Machines (RBM)
 .. note::
     This section assumes the reader has already read through :doc:`logreg`
     and :doc:`mlp`. Additionally it uses the following Theano functions
-    and concepts : `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_ and `scan`_. If you intend to run the code on GPU also read `GPU`_.
+    and concepts: `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_, `Random numbers`_, `floatX`_ and `scan`_. If you intend to run the code on GPU also read `GPU`_.

 .. _T.tanh: http://deeplearning.net/software/theano/tutorial/examples.html?highlight=tanh

@@ -573,7 +573,7 @@ The output was the following:
     ... plotting sample 8
     ... plotting sample 9

-The pictures below show the filters after 15 epochs :
+The pictures below show the filters after 15 epochs:

 .. figure:: images/filters_at_epoch_14.png
     :align: center
