
Commit

Typo fix
mbosnjak committed Apr 12, 2016
1 parent 452e7c7 commit 4a64cdb
Showing 1 changed file with 1 addition and 1 deletion.
optimization-2.md: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ $$
= \left( 1 - \sigma(x) \right) \sigma(x)
$$

- As we see, the gradient turns out to simplify and becomes surprisingly simple. For example, the sigmoid expression receives the input 1.0 and computes the ouput 0.73 during the forward pass. The derivation above shows that the *local* gradient would simply be (1 - 0.73) * 0.73 ~= 0.2, as the circuit computed before (see the image above), except this way it would be done with a single, simple and efficient expression (and with less numerical issues). Therefore, in any real practical application it would be very useful to group these operations into a single gate. Lets see the backprop for this neuron in code:
+ As we see, the gradient turns out to simplify and becomes surprisingly simple. For example, the sigmoid expression receives the input 1.0 and computes the output 0.73 during the forward pass. The derivation above shows that the *local* gradient would simply be (1 - 0.73) * 0.73 ~= 0.2, as the circuit computed before (see the image above), except this way it would be done with a single, simple and efficient expression (and with less numerical issues). Therefore, in any real practical application it would be very useful to group these operations into a single gate. Lets see the backprop for this neuron in code:

```python
w = [2,-3,-3] # assume some random weights and data
# ... (diff truncated here; rest of the listing not shown)
```
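
Since the diff cuts the listing off after its first line, here is a hedged sketch of what the complete forward and backward pass for this 2D sigmoid neuron could look like. The second input value and the variable names (`x`, `dot`, `f`, `ddot`, `dx`, `dw`) are illustrative assumptions, not necessarily the file's exact contents:

```python
import math

w = [2, -3, -3]   # assume some random weights and data; last entry acts as the bias
x = [-1, -2]      # assumed example inputs (illustrative)

# forward pass
dot = w[0]*x[0] + w[1]*x[1] + w[2]   # weighted sum plus bias: 2*(-1) + (-3)*(-2) + (-3) = 1.0
f = 1.0 / (1 + math.exp(-dot))       # sigmoid activation, ~0.73 for dot = 1.0

# backward pass through the single sigmoid gate
ddot = (1 - f) * f                            # local gradient, (1 - 0.73) * 0.73 ~= 0.2
dx = [w[0] * ddot, w[1] * ddot]               # backprop into the inputs x
dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]   # backprop into the weights w (last entry is the bias)
```

Grouping the whole sigmoid into one gate means the backward pass needs only the single expression `(1 - f) * f`, mirroring the derivation in the hunk above.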

0 comments on commit 4a64cdb
