add module readmes

csutjf · Jul 3, 2019 · b49271a
1 parent 1a4fc07
commit b49271a
Showing 8 changed files with 77 additions and 0 deletions.
diff --git a/neural_nets/initializers/README.md b/neural_nets/initializers/README.md
@@ -0,0 +1,4 @@
+# Initializers
+The `initializers.py` module contains objects for initializing optimizers,
+activation functions, weight initializers, and learning rate schedulers from
+strings or parameter dictionaries.
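A minimal sketch of what "initializing from strings or parameter dictionaries" can look like, shown for activation functions only. The names `init_activation`, `_FACTORIES`, and the `"id"` dictionary key are hypothetical and chosen for illustration; they are not taken from `initializers.py`.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_leaky_relu(alpha=0.01):
    # Returns a closure so that `alpha` can come from a parameter dictionary.
    return lambda x: x if x > 0 else alpha * x


# Hypothetical registry: lowercase name -> zero-or-more-argument factory.
_FACTORIES = {
    "relu": lambda: relu,
    "sigmoid": lambda: sigmoid,
    "tanh": lambda: math.tanh,
    "leaky_relu": make_leaky_relu,
}


def init_activation(spec):
    """Build an activation from a callable, a name, or a parameter dict.

    The dict form is assumed to look like {"id": "leaky_relu", "alpha": 0.1}.
    """
    if callable(spec):
        return spec
    if isinstance(spec, str):
        name, kwargs = spec, {}
    elif isinstance(spec, dict):
        kwargs = dict(spec)  # copy so the caller's dict is not mutated
        name = kwargs.pop("id")
    else:
        raise TypeError(f"unsupported activation spec: {spec!r}")
    try:
        factory = _FACTORIES[name.lower()]
    except KeyError:
        raise ValueError(f"unknown activation: {name!r}") from None
    return factory(**kwargs)
```

With this shape, `init_activation("ReLU")` and `init_activation({"id": "leaky_relu", "alpha": 0.1})` both return plain callables, so calling code does not need to know which form was used.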
diff --git a/neural_nets/layers/README.md b/neural_nets/layers/README.md
@@ -0,0 +1,19 @@
+# Layers
+The `layers.py` module implements common layers / layer-wise operations that can
+be composed to create larger neural networks. It includes:
+
+- Fully-connected layers
+- Sparse evolutionary layers ([Mocanu et al., 2018](https://www.nature.com/articles/s41467-018-04316-3))
+- Dot-product attention layers ([Luong, Pham, & Manning, 2015](https://arxiv.org/pdf/1508.04025.pdf); [Vaswani et al., 2017](https://arxiv.org/pdf/1706.03762.pdf))
+- 1D and 2D convolution (with stride, padding, and dilation) layers ([van den Oord et al., 2016](https://arxiv.org/pdf/1609.03499.pdf); [Yu & Koltun, 2016](https://arxiv.org/pdf/1511.07122.pdf))
+- 2D "deconvolution" (with stride and padding) layers ([Zeiler et al., 2010](https://www.matthewzeiler.com/mattzeiler/deconvolutionalnetworks.pdf))
+- Restricted Boltzmann machines (with CD-_n_ training) ([Smolensky, 1986](http://stanford.edu/~jlmcc/papers/PDP/Volume%201/Chap6_PDP86.pdf); [Carreira-Perpiñán & Hinton, 2005](http://www.cs.toronto.edu/~fritz/absps/cdmiguel.pdf))
+- Elementwise multiplication operation
+- Summation operation
+- Flattening operation
+- Softmax layer
+- Max and average pooling layers
+- 1D and 2D batch normalization layers ([Ioffe & Szegedy, 2015](http://proceedings.mlr.press/v37/ioffe15.pdf))
+- 1D and 2D layer normalization layers ([Ba, Kiros, & Hinton, 2016](https://arxiv.org/pdf/1607.06450.pdf))
+- Recurrent layers ([Elman, 1990](https://crl.ucsd.edu/~elman/Papers/fsit.pdf))
+- Long short-term memory (LSTM) layers ([Hochreiter & Schmidhuber, 1997](http://www.bioinf.jku.at/publications/older/2604.pdf))
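To illustrate the first item in the layers list, here is a rough sketch of the forward and backward passes of a fully-connected layer. It uses nested Python lists to stay dependency-free; the function names and argument layout are assumptions for illustration and do not reflect the interface in `layers.py`.

```python
def fc_forward(X, W, b):
    """Compute Y = X @ W + b.

    X: n x d_in (list of rows), W: d_in x d_out, b: length d_out.
    """
    return [
        [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]
        for x in X
    ]


def fc_backward(X, W, dY):
    """Given dL/dY (n x d_out), return (dL/dX, dL/dW, dL/db)."""
    n, d_in, d_out = len(X), len(W), len(W[0])
    # dW = X^T @ dY
    dW = [
        [sum(X[r][i] * dY[r][j] for r in range(n)) for j in range(d_out)]
        for i in range(d_in)
    ]
    # db = column sums of dY
    db = [sum(dY[r][j] for r in range(n)) for j in range(d_out)]
    # dX = dY @ W^T
    dX = [
        [sum(dY[r][j] * W[i][j] for j in range(d_out)) for i in range(d_in)]
        for r in range(n)
    ]
    return dX, dW, db
```

The other layers in the list follow the same forward/backward pattern, with the gradient expressions changing to match each operation.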
diff --git a/neural_nets/losses/README.md b/neural_nets/losses/README.md
@@ -0,0 +1,8 @@
+# Losses
+
+The `losses.py` module implements several common loss functions, including:
+
+- Squared error 
+- Cross-entropy
+- Variational lower bound for a binary (Bernoulli) VAE
+- WGAN-GP loss for generator and critic
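As a sketch of the first two losses listed above, the snippet below computes squared error and categorical cross-entropy for a single example. The function names, the factor of 0.5 in the squared error, and the `eps` clamp are assumptions made for illustration; `losses.py` may use different conventions.

```python
import math


def squared_error(y_true, y_pred):
    """0.5 * sum of squared differences (the 0.5 factor is an assumed convention)."""
    return 0.5 * sum((p - t) ** 2 for t, p in zip(y_true, y_pred))


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Categorical cross-entropy between a one-hot target and predicted probabilities.

    Probabilities are clamped below by `eps` to avoid log(0).
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))
```

For a one-hot target, the cross-entropy reduces to the negative log of the probability assigned to the correct class.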
diff --git a/neural_nets/models/README.md b/neural_nets/models/README.md
@@ -0,0 +1,8 @@
+# Models
+
+The `models` module implements popular, complete neural network architectures. It includes:
+
+- `vae.py`: A Bernoulli variational autoencoder ([Kingma & Welling, 2014](https://arxiv.org/abs/1312.6114))
+- `wgan_gp.py`: A Wasserstein generative adversarial network with gradient
+  penalty ([Gulrajani et al., 2017](https://arxiv.org/pdf/1704.00028.pdf);
+  [Goodfellow et al., 2014](https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf))
diff --git a/neural_nets/modules/README.md b/neural_nets/modules/README.md
@@ -0,0 +1,11 @@
+# Modules
+
+The `modules.py` module implements common multi-layer blocks that appear across
+many modern deep networks. It includes: 
+
+- Bidirectional LSTMs ([Schuster & Paliwal, 1997](https://pdfs.semanticscholar.org/4b80/89bc9b49f84de43acc2eb8900035f7d492b2.pdf))
+- ResNet-style "identity" (i.e., `same`-convolution) residual blocks ([He et al., 2015](https://arxiv.org/pdf/1512.03385.pdf))
+- ResNet-style "convolutional" (i.e., parametric) residual blocks ([He et al., 2015](https://arxiv.org/pdf/1512.03385.pdf))
+- WaveNet-style residual block with dilated causal convolutions ([van den Oord et al., 2016](https://arxiv.org/pdf/1609.03499.pdf))
+- Transformer-style multi-headed dot-product attention ([Vaswani et al., 2017](https://arxiv.org/pdf/1706.03762.pdf))
+
diff --git a/neural_nets/optimizers/README.md b/neural_nets/optimizers/README.md
@@ -0,0 +1,8 @@
+# Optimizers
+
+The `optimizers.py` module implements common modifications to stochastic gradient descent. It includes:
+
+- SGD with momentum ([Rumelhart, Hinton, & Williams, 1986](https://www.cs.princeton.edu/courses/archive/spring18/cos495/res/backprop_old.pdf))
+- AdaGrad ([Duchi, Hazan, & Singer, 2011](http://jmlr.org/papers/volume12/duchi11a/duchi11a.pdf))
+- RMSProp ([Tieleman & Hinton, 2012](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf))
+- Adam ([Kingma & Ba, 2015](https://arxiv.org/pdf/1412.6980v8.pdf))
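The update rules behind two of the optimizers listed above can be sketched as follows, operating on flat lists of parameters. The function names, argument order, and default hyperparameters are assumptions for illustration and are not the interface of `optimizers.py`.

```python
import math


def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update; returns (new_params, new_velocity)."""
    new_velocity = [momentum * v + lr * g for v, g in zip(velocity, grads)]
    new_params = [p - v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update at (1-indexed) step t; returns (new_params, new_m, new_v)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    params = [
        p - lr * mh / (math.sqrt(vh) + eps)
        for p, mh, vh in zip(params, m_hat, v_hat)
    ]
    return params, m, v
```

On the first Adam step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly `lr` regardless of gradient magnitude.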
diff --git a/neural_nets/utils/README.md b/neural_nets/utils/README.md
@@ -0,0 +1,14 @@
+# Utilities
+
+The `utils.py` module implements common, neural network-specific helper
+functions, primarily for dealing with CNNs. It includes:
+
+- `im2col` 
+- `col2im` 
+- `conv1D` 
+- `conv2D`
+- `dilate`
+- `deconv2D` 
+- `minibatch`
+- Various weight initialization utilities
+- Various padding and convolution arithmetic utilities
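The "convolution arithmetic" mentioned in the last item can be illustrated with a short sketch: computing the output length of a 1D convolution and using it in a naive strided, padded, dilated convolution. The names `conv_out_dim` and `conv1d` and their signatures are hypothetical, and `dilation=1` here denotes an ordinary convolution, which may differ from how `utils.py` counts dilation.

```python
def conv_out_dim(n_in, kernel, stride=1, pad=0, dilation=1):
    """Output length of a 1D convolution with symmetric zero padding."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (n_in + 2 * pad - effective_kernel) // stride + 1


def conv1d(x, w, stride=1, pad=0, dilation=1):
    """Naive 1D cross-correlation (no kernel flip) over a list of numbers."""
    padded = [0.0] * pad + list(x) + [0.0] * pad
    n_out = conv_out_dim(len(x), len(w), stride, pad, dilation)
    out = []
    for i in range(n_out):
        start = i * stride
        # Taps are spaced `dilation` positions apart in the padded input.
        out.append(sum(w[k] * padded[start + k * dilation] for k in range(len(w))))
    return out
```

For an odd kernel with stride 1 and dilation 1, `pad = (kernel - 1) // 2` keeps the output the same length as the input.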
diff --git a/neural_nets/wrappers/README.md b/neural_nets/wrappers/README.md
@@ -0,0 +1,5 @@
+# Wrappers
+
+The `wrappers.py` module implements wrappers for the layers in `layers.py`. It
+includes:
+- Dropout ([Srivastava et al., 2014](http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf))
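A minimal sketch of a dropout wrapper around a layer, using the common "inverted dropout" scaling so that no rescaling is needed at inference time. The class interface (`layer` as a callable, `forward(x, training)`, `seed`) is an assumption for illustration and is not taken from `wrappers.py`.

```python
import random


class Dropout:
    """Wrap a callable layer and apply inverted dropout to its output."""

    def __init__(self, layer, p=0.5, seed=None):
        if not 0.0 <= p < 1.0:
            raise ValueError("p must be in [0, 1)")
        self.layer = layer  # any callable mapping a list to a list
        self.p = p          # probability of dropping each unit
        self.rng = random.Random(seed)

    def forward(self, x, training=True):
        out = self.layer(x)
        if not training or self.p == 0.0:
            return list(out)
        keep = 1.0 - self.p
        # Kept units are scaled by 1 / keep so the expected output is unchanged.
        return [v / keep if self.rng.random() < keep else 0.0 for v in out]
```

Wrapping rather than subclassing keeps the dropout logic independent of the wrapped layer's own computation.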