Merge pull request Newmu#2 from Slater-Victoroff/master
Added ability to download mnist and updated README accordingly
Newmu committed Jan 23, 2015
2 parents ad67917 + 3d2e161 commit 49de4c5
Showing 3 changed files with 43 additions and 4 deletions.
12 changes: 12 additions & 0 deletions README.md
@@ -2,3 +2,15 @@ Theano-Tutorials
================

Bare bones introduction to machine learning from linear regression to convolutional neural networks using Theano.

***Dataset***
This repository assumes that you have a local copy of the MNIST dataset. The dataset is freely available from Yann LeCun's [personal website](http://yann.lecun.com/exdb/mnist/).

To automate the download, run the included script, which fetches the four MNIST files into `/media/datasets/mnist` and decompresses them:
`sudo ./download_mnist.sh`

***Known Issues***
`Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib`
This error results from a broken OpenSSL installation on macOS. It can be fixed by uninstalling and reinstalling OpenSSL with Homebrew:
`brew remove openssl`
`brew install openssl`
27 changes: 27 additions & 0 deletions download_mnist.sh
@@ -0,0 +1,27 @@
#!/bin/bash

mkdir -p /media/datasets/mnist

if ! [ -e /media/datasets/mnist/train-images-idx3-ubyte.gz ]
then
    wget -P /media/datasets/mnist/ http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
fi
gzip -d /media/datasets/mnist/train-images-idx3-ubyte.gz

if ! [ -e /media/datasets/mnist/train-labels-idx1-ubyte.gz ]
then
    wget -P /media/datasets/mnist/ http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
fi
gzip -d /media/datasets/mnist/train-labels-idx1-ubyte.gz

if ! [ -e /media/datasets/mnist/t10k-images-idx3-ubyte.gz ]
then
    wget -P /media/datasets/mnist/ http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
fi
gzip -d /media/datasets/mnist/t10k-images-idx3-ubyte.gz

if ! [ -e /media/datasets/mnist/t10k-labels-idx1-ubyte.gz ]
then
    wget -P /media/datasets/mnist/ http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
fi
gzip -d /media/datasets/mnist/t10k-labels-idx1-ubyte.gz
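
One thing to keep in mind when re-running the script above: it tests for the `.gz` archive, but `gzip -d` deletes the archive once it has been unpacked, so a second run would fetch every file again. A minimal Python sketch of the same download that checks for the unpacked file instead might look like the following. It is not part of this commit; the names `fetch_and_unpack`, `download_mnist`, and the injectable `retrieve` parameter are hypothetical, and only the URLs and the target directory are taken from the script.

```python
import gzip
import os
import shutil
import urllib.request

# Same host and file names that download_mnist.sh uses.
BASE_URL = "http://yann.lecun.com/exdb/mnist/"
FILES = [
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
]


def fetch_and_unpack(name, dest_dir, retrieve=urllib.request.urlretrieve):
    """Download name.gz into dest_dir and unpack it, unless name already exists.

    Returns True if a download happened, False if the unpacked file was found.
    `retrieve` is a hypothetical hook so the logic can run without a network.
    """
    target = os.path.join(dest_dir, name)
    if os.path.exists(target):
        return False  # already unpacked, nothing to do
    os.makedirs(dest_dir, exist_ok=True)
    archive = target + ".gz"
    retrieve(BASE_URL + name + ".gz", archive)
    # Stream the decompressed bytes to the target file.
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(archive)  # mirror `gzip -d`, which removes the archive
    return True


def download_mnist(dest_dir="/media/datasets/mnist"):
    """Fetch all four MNIST files, skipping any that are already unpacked."""
    for name in FILES:
        fetch_and_unpack(name, dest_dir)
```

Calling `download_mnist()` would then be safe to repeat; writing to `/media/datasets/mnist` still needs whatever permissions that directory requires.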
8 changes: 4 additions & 4 deletions load.py
@@ -13,19 +13,19 @@ def one_hot(x,n):
 
 def mnist(ntrain=60000,ntest=10000,onehot=True):
     data_dir = os.path.join(datasets_dir,'mnist/')
-    fd = open(os.path.join(data_dir,'train-images.idx3-ubyte'))
+    fd = open(os.path.join(data_dir,'train-images-idx3-ubyte'))
     loaded = np.fromfile(file=fd,dtype=np.uint8)
     trX = loaded[16:].reshape((60000,28*28)).astype(float)
 
-    fd = open(os.path.join(data_dir,'train-labels.idx1-ubyte'))
+    fd = open(os.path.join(data_dir,'train-labels-idx1-ubyte'))
     loaded = np.fromfile(file=fd,dtype=np.uint8)
     trY = loaded[8:].reshape((60000))
 
-    fd = open(os.path.join(data_dir,'t10k-images.idx3-ubyte'))
+    fd = open(os.path.join(data_dir,'t10k-images-idx3-ubyte'))
     loaded = np.fromfile(file=fd,dtype=np.uint8)
     teX = loaded[16:].reshape((10000,28*28)).astype(float)
 
-    fd = open(os.path.join(data_dir,'t10k-labels.idx1-ubyte'))
+    fd = open(os.path.join(data_dir,'t10k-labels-idx1-ubyte'))
     loaded = np.fromfile(file=fd,dtype=np.uint8)
     teY = loaded[8:].reshape((10000))
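
The offsets `16` and `8` in the loader skip the IDX header: a 4-byte magic number followed by one big-endian 32-bit size per dimension, which gives 16 bytes for the three-dimensional image files and 8 bytes for the one-dimensional label files. As a rough sketch (not part of this commit), a reader that parses the header instead of hard-coding the offsets could look like this; `read_idx` is a hypothetical helper name, and it only handles the unsigned-byte files used here.

```python
import struct

import numpy as np


def read_idx(path):
    """Read an unsigned-byte IDX file into an array shaped by its header."""
    with open(path, "rb") as fd:
        # Magic number: two zero bytes, a data-type code, the dimension count.
        zeros, dtype_code, ndim = struct.unpack(">HBB", fd.read(4))
        if zeros != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file: %s" % path)
        # One big-endian 32-bit size per dimension follows the magic number.
        shape = struct.unpack(">" + "I" * ndim, fd.read(4 * ndim))
        data = np.frombuffer(fd.read(), dtype=np.uint8)
    # reshape raises if the payload does not match the declared sizes.
    return data.reshape(shape)
```

With such a helper, the training images could be loaded as `read_idx(os.path.join(data_dir, 'train-images-idx3-ubyte')).reshape((60000, 28*28)).astype(float)`, and a truncated or mismatched file would fail at the reshape rather than silently producing shifted data.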
