Commit: cs231n add 2018-versions of A2 and A3

Yorko committed May 23, 2018
1 parent 16ec322 commit 98407d4
Showing 41 changed files with 8,450 additions and 6,957 deletions.
527 changes: 378 additions & 149 deletions class_cs231n/assignment2/BatchNormalization.ipynb

Large diffs are not rendered by default.

317 changes: 170 additions & 147 deletions class_cs231n/assignment2/ConvolutionalNetworks.ipynb

Large diffs are not rendered by default.

135 changes: 71 additions & 64 deletions class_cs231n/assignment2/Dropout.ipynb
@@ -1,19 +1,16 @@
{
"nbformat_minor": 0,
"nbformat_minor": 2,
"nbformat": 4,
"cells": [
{
"source": [
"# Dropout\n",
"Dropout [1] is a technique for regularizing neural networks by randomly setting some features to zero during the forward pass. In this exercise you will implement a dropout layer and modify your fully-connected network to optionally use dropout.\n",
"\n",
"[1] Geoffrey E. Hinton et al, \"Improving neural networks by preventing co-adaptation of feature detectors\", arXiv 2012"
"[1] [Geoffrey E. Hinton et al, \"Improving neural networks by preventing co-adaptation of feature detectors\", arXiv 2012](https://arxiv.org/abs/1207.0580)"
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
{
"execution_count": null,
@@ -45,9 +42,7 @@
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
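The setup cell collapsed in the hunk above defines the helpers used by the checks below, including `rel_error`. As a point of reference only, a minimal sketch of that helper, assuming the usual cs231n definition (the authoritative one lives in the hidden cell):

```python
import numpy as np

def rel_error(x, y):
    """Maximum relative error between two arrays; the 1e-8 floor in the
    denominator guards against division by zero."""
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))
```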
{
@@ -62,9 +57,7 @@
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
{
@@ -75,10 +68,7 @@
"Once you have done so, run the cell below to test your implementation."
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
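For orientation, a minimal sketch of what a `dropout_forward` along these lines could look like, assuming the updated convention (also stated in Inline Question 3 below) that `p` is the probability of *keeping* a unit; the graded implementation belongs in `cs231n/layers.py`:

```python
import numpy as np

def dropout_forward(x, dropout_param):
    """Sketch of an inverted-dropout forward pass.

    dropout_param holds 'p' (probability of keeping each unit), 'mode'
    ('train' or 'test'), and optionally 'seed' for reproducibility.
    """
    p, mode = dropout_param['p'], dropout_param['mode']
    if 'seed' in dropout_param:
        np.random.seed(dropout_param['seed'])

    mask, out = None, x
    if mode == 'train':
        # Keep each unit with probability p and rescale by 1/p so the
        # expected activation matches test time ("inverted" dropout).
        mask = (np.random.rand(*x.shape) < p) / p
        out = x * mask
    # In test mode dropout is the identity: the rescaling already
    # happened at training time.

    return out, (dropout_param, mask)
```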
{
"execution_count": null,
@@ -87,7 +77,7 @@
"np.random.seed(231)\n",
"x = np.random.randn(500, 500) + 10\n",
"\n",
"for p in [0.3, 0.6, 0.75]:\n",
"for p in [0.25, 0.4, 0.7]:\n",
" out, _ = dropout_forward(x, {'mode': 'train', 'p': p})\n",
" out_test, _ = dropout_forward(x, {'mode': 'test', 'p': p})\n",
"\n",
@@ -101,9 +91,7 @@
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
{
@@ -112,10 +100,7 @@
"In the file `cs231n/layers.py`, implement the backward pass for dropout. After doing so, run the following cell to numerically gradient-check your implementation."
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
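Again as a sketch only: the backward pass simply reuses the cached mask, so the gradient flows through the kept units with the same 1/p scaling already baked into the mask of the forward sketch above:

```python
def dropout_backward(dout, cache):
    """Sketch of the inverted-dropout backward pass matching the
    forward sketch above."""
    dropout_param, mask = cache
    if dropout_param['mode'] == 'train':
        return dout * mask   # zero gradient for dropped units, 1/p for kept
    return dout              # test mode: identity
```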
{
"execution_count": null,
@@ -125,30 +110,41 @@
"x = np.random.randn(10, 10) + 10\n",
"dout = np.random.randn(*x.shape)\n",
"\n",
"dropout_param = {'mode': 'train', 'p': 0.8, 'seed': 123}\n",
"dropout_param = {'mode': 'train', 'p': 0.2, 'seed': 123}\n",
"out, cache = dropout_forward(x, dropout_param)\n",
"dx = dropout_backward(dout, cache)\n",
"dx_num = eval_numerical_gradient_array(lambda xx: dropout_forward(xx, dropout_param)[0], x, dout)\n",
"\n",
"# Error should be around e-10 or less\n",
"print('dx relative error: ', rel_error(dx, dx_num))"
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
+ {
+ "source": [
+ "## Inline Question 1:\n",
+ "What happens if we do not divide the values being passed through inverse dropout by `p` in the dropout layer? Why does that happen?"
+ ],
+ "cell_type": "markdown",
+ "metadata": {}
+ },
+ {
+ "source": [
+ "## Answer:\n"
+ ],
+ "cell_type": "markdown",
+ "metadata": {}
+ },
{
"source": [
"# Fully-connected nets with Dropout\n",
"In the file `cs231n/classifiers/fc_net.py`, modify your implementation to use dropout. Specificially, if the constructor the the net receives a nonzero value for the `dropout` parameter, then the net should add dropout immediately after every ReLU nonlinearity. After doing so, run the following to numerically gradient-check your implementation."
"In the file `cs231n/classifiers/fc_net.py`, modify your implementation to use dropout. Specifically, if the constructor of the net receives a value that is not 1 for the `dropout` parameter, then the net should add dropout immediately after every ReLU nonlinearity. After doing so, run the following to numerically gradient-check your implementation."
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
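To make the layer ordering concrete, a hedged sketch of the hidden-layer forward chain the new text describes (a hypothetical helper, not the `fc_net.py` solution; the `W1`/`b1` parameter names follow the assignment's `params` dict):

```python
import numpy as np

def hidden_layers_forward(x, params, num_hidden, dropout_param=None):
    """Sketch: each hidden layer is affine -> ReLU, with dropout applied
    immediately after the ReLU whenever dropout is enabled."""
    h = x.reshape(x.shape[0], -1)
    for i in range(1, num_hidden + 1):
        W, b = params['W%d' % i], params['b%d' % i]
        h = np.maximum(0, h.dot(W) + b)                   # affine -> ReLU
        if dropout_param is not None and dropout_param['mode'] == 'train':
            p = dropout_param['p']                        # keep probability
            h = h * ((np.random.rand(*h.shape) < p) / p)  # dropout after ReLU
    return h
```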
{
"execution_count": null,
@@ -159,15 +155,17 @@
"X = np.random.randn(N, D)\n",
"y = np.random.randint(C, size=(N,))\n",
"\n",
"for dropout in [0, 0.25, 0.5]:\n",
"for dropout in [1, 0.75, 0.5]:\n",
" print('Running check with dropout = ', dropout)\n",
" model = FullyConnectedNet([H1, H2], input_dim=D, num_classes=C,\n",
" weight_scale=5e-2, dtype=np.float64,\n",
" dropout=dropout, seed=123)\n",
"\n",
" loss, grads = model.loss(X, y)\n",
" print('Initial loss: ', loss)\n",
"\n",
" \n",
" # Relative errors should be around e-6 or less; Note that it's fine\n",
" # if for dropout=1 you have W2 error be on the order of e-5.\n",
" for name in sorted(grads):\n",
" f = lambda _: model.loss(X, y)[0]\n",
" grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)\n",
@@ -176,21 +174,16 @@
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
{
"source": [
"# Regularization experiment\n",
"As an experiment, we will train a pair of two-layer networks on 500 training examples: one will use no dropout, and one will use a dropout probability of 0.75. We will then visualize the training and validation accuracies of the two networks over time."
"As an experiment, we will train a pair of two-layer networks on 500 training examples: one will use no dropout, and one will use a keep probability of 0.25. We will then visualize the training and validation accuracies of the two networks over time."
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
{
"execution_count": null,
@@ -207,7 +200,7 @@
"}\n",
"\n",
"solvers = {}\n",
"dropout_choices = [0, 0.75]\n",
"dropout_choices = [1, 0.25]\n",
"for dropout in dropout_choices:\n",
" model = FullyConnectedNet([500], dropout=dropout)\n",
" print(dropout)\n",
@@ -225,9 +218,7 @@
"outputs": [],
"metadata": {
"scrolled": false,
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
{
@@ -264,48 +255,64 @@
],
"outputs": [],
"metadata": {
"collapsed": false,
"editable": true,
"deletable": true
"collapsed": true
}
},
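The source of the plotting cell above is collapsed in this hunk. A hypothetical re-creation of what it might contain, assuming the `solvers` dict built in the training cell and the `train_acc_history`/`val_acc_history` attributes that the cs231n `Solver` records:

```python
import matplotlib.pyplot as plt

# Compare the accuracy histories recorded by the two Solver instances.
plt.subplot(2, 1, 1)
for dropout, solver in solvers.items():
    plt.plot(solver.train_acc_history, 'o-', label='%.2f dropout' % dropout)
plt.title('Train accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')

plt.subplot(2, 1, 2)
for dropout, solver in solvers.items():
    plt.plot(solver.val_acc_history, 'o-', label='%.2f dropout' % dropout)
plt.title('Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')

plt.gcf().set_size_inches(15, 15)
plt.show()
```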
{
"source": [
"# Question\n",
"Explain what you see in this experiment. What does it suggest about dropout?"
"## Inline Question 2:\n",
"Compare the validation and training accuracies with and without dropout -- what do your results suggest about dropout as a regularizer?"
],
"cell_type": "markdown",
"metadata": {
"editable": true,
"deletable": true
}
"metadata": {}
},
+ {
+ "source": [
+ "## Answer:\n"
+ ],
+ "cell_type": "markdown",
+ "metadata": {}
+ },
{
"source": [
"# Answer\n"
"## Inline Question 3:\n",
"Suppose we are training a deep fully-connected network for image classification, with dropout after hidden layers (parameterized by keep probability p). How should we modify p, if at all, if we decide to decrease the size of the hidden layers (that is, the number of nodes in each layer)?"
],
"cell_type": "markdown",
"metadata": {}
},
+ {
+ "source": [
+ "## Answer:\n"
+ ],
+ "cell_type": "markdown",
+ "metadata": {}
+ },
{
"execution_count": null,
"cell_type": "code",
"source": [],
"outputs": [],
"metadata": {
"editable": true,
"deletable": true
"collapsed": true
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"name": "python2",
"display_name": "Python 3",
"name": "python3",
"language": "python"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"name": "python",
"file_extension": ".py",
"version": "2.7.12+",
"pygments_lexer": "ipython2",
"version": "3.5.1",
"pygments_lexer": "ipython3",
"codemirror_mode": {
"version": 2,
"version": 3,
"name": "ipython"
}
}