The objective of this programme is to create an English word image generator, then feed its output to a machine learning model (preferably a neural network) that recognizes the word in the image.
For demonstration purposes, a character handwriting recognizer (instead of an English word image recognizer) is implemented. This character handwriting recognizer consists of two separate sections:
- Image generator
  - A. Random selection from holdout data
  - B. Generating an image using a Generative Adversarial Network (GAN) trained on samples of the respective character
    - Implementation using only the Keras package
    - Implementation using the Keras and Keras-Adversarial packages
- Recognition model - a convolutional neural network (CNN) built with different hyperparameters
Both models take handwritten images from the EMNIST dataset [1] as training and validation data. For simplicity, the EMNIST Balanced dataset is used, so class weighting is not required.
The full dataset can be found at http://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/matlab.zip. Alternatively, the emnist-balanced-small.mat file under the dataset folder can be used. Note, however, that this file is a direct subsample of the EMNIST Balanced dataset, so its classes become imbalanced.
According to [1], the Balanced dataset contains 47 classes covering digits and both upper- and lower-case letters. Since characters such as C, I, J, K, L, M, O, P, S, U, V, W, X, Y and Z have visually similar upper- and lower-case forms, the samples of each of these characters are merged into a single class (only the upper-case class is kept), giving 47 classes in total (26 upper-case letters, 11 remaining lower-case letters and 10 digits). The focus of this demonstration is on English characters, so the digit classes are removed, leaving 37 classes.
The EMNIST dataset, by default, comes as separate training and testing sets. However, since we are removing the digit classes and their distribution across the default sets is unknown, we shall:
- Consolidate all the training samples
- Remove samples relating to the digit classes
- Split the data into 4 portions:
  - Training Data (80% of total)
    - Training data for (2) (80% of Training Data, or 64% of total)
    - Validation data for (2) (20% of Training Data, or 16% of total)
  - Testing Data (20% of total)
    - Testing data for (2) (50% of Testing Data, or 10% of total)
    - Training data for (1A) (50% of Testing Data, or 10% of total)
Note that under the usual data-splitting practice, testing data accounts for 20% of the data (subject to the size of the dataset). However, since the nature of the data required by (1A) is similar to that of testing data, we take its samples from the testing portion.
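A minimal Python sketch of this consolidation and split, assuming the standard EMNIST MATLAB field layout (the variable names and random seed are illustrative, not the programme's actual code):

```python
# Sketch of the consolidation and split described above. Field access
# follows the EMNIST MATLAB layout; names and seed are illustrative.
import numpy as np
from scipy.io import loadmat
from sklearn.model_selection import train_test_split

mat = loadmat('./dataset/emnist-balanced-small.mat')
ds = mat['dataset']

# Consolidate the default EMNIST training and testing samples into one pool
x = np.concatenate([ds['train'][0, 0]['images'][0, 0],
                    ds['test'][0, 0]['images'][0, 0]])
y = np.concatenate([ds['train'][0, 0]['labels'][0, 0],
                    ds['test'][0, 0]['labels'][0, 0]]).ravel()

# Remove the digit classes (labels 0-9 in the Balanced mapping)
x, y = x[y >= 10], y[y >= 10]

# 80/20 train/test split, then 80/20 train/validation within training,
# and 50/50 test/(1A holdout) within testing
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=0)
x_tr, x_val, y_tr, y_val = train_test_split(
    x_train, y_train, test_size=0.2, stratify=y_train, random_state=0)
x_te, x_1a, y_te, y_1a = train_test_split(
    x_test, y_test, test_size=0.5, stratify=y_test, random_state=0)
```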
Note: Another data split is available in the code but requires manual uncommenting. This split gives the GAN models more samples to train on:
- Data for (1B) (50% of total)
- Data for (2) (50% of total)
  - Training Data for (2) (80% of 50%)
    - Training data for (2) (80% of Training Data)
    - Validation data for (2) (20% of Training Data)
  - Testing Data for (2) (20% of 50%)
- Keras-Only Implementation
  - Generator
    - 3 feed-forward layers (256, 512, 1024 neurons)
  - Discriminator
    - 2 feed-forward layers (512, 256 neurons)
- Keras and Keras-Adversarial Implementation
  - Generator
    - 3 feed-forward layers (256, 512, 1024 neurons)
  - Discriminator
    - 3 feed-forward layers (1024, 512, 256 neurons)
Note: All models use dropout, batch normalization and ReLU activations. For details, please refer to model_hyperparameters.txt.
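For illustration, a minimal Keras sketch of the Keras-only generator and discriminator shapes listed above. The latent dimension, output activation and dropout rate are assumptions; the actual values are in model_hyperparameters.txt:

```python
# Minimal sketch of the Keras-only GAN shapes listed above; the latent
# dimension (100), tanh output and 25% dropout are assumptions.
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization

latent_dim, img_dim = 100, 28 * 28

generator = Sequential([
    Dense(256, input_dim=latent_dim, activation='relu'),
    BatchNormalization(),
    Dense(512, activation='relu'),
    BatchNormalization(),
    Dense(1024, activation='relu'),
    BatchNormalization(),
    Dense(img_dim, activation='tanh'),   # flattened 28x28 image
])

discriminator = Sequential([
    Dense(512, input_dim=img_dim, activation='relu'),
    Dropout(0.25),
    Dense(256, activation='relu'),
    Dropout(0.25),
    Dense(1, activation='sigmoid'),      # probability the input is real
])
discriminator.compile(optimizer='adam', loss='binary_crossentropy')
```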
- Model 0
  - 1 convolutional + max-pooling layer with 32 filters
  - 1 feed-forward layer with 64 neurons
  - Dropout of 25%
- Model 1
  - 1 convolutional + max-pooling layer with 64 filters
  - 1 feed-forward layer with 64 neurons
  - Dropout of 25%
- Model 2
  - 2 convolutional + max-pooling layers with 64 then 32 filters
  - 1 feed-forward layer with 64 neurons
  - Dropout of 25%
- Model 3
  - 2 convolutional + max-pooling layers with 64 filters each
  - 1 feed-forward layer with 64 neurons
  - Dropout of 25%
- Model 4
  - 2 convolutional + max-pooling layers with 32 filters each
  - 1 feed-forward layer with 64 neurons
  - Dropout of 25%
- Model 5
  - 2 convolutional + max-pooling layers with 32 filters each
  - 1 feed-forward layer with 64 neurons
  - Dropout of varying ratios at different layers (75%, 50%, 25%)
- Model 6
  - 2 convolutional + max-pooling layers with 32 filters each
  - 1 feed-forward layer with 64 neurons
  - Dropout of varying ratios at different layers (75%, 75%, 50%)
- Model 7
  - 2 convolutional + max-pooling layers with 32 filters each
  - 2 feed-forward layers with 128 then 64 neurons
  - Dropout of varying ratios at different layers (75%, 75%, 50%, 50%)
- Model 8
  - 2 convolutional + max-pooling layers with 64 then 32 filters
  - 2 feed-forward layers with 128 then 64 neurons
  - Dropout of varying ratios at different layers (75%, 75%, 50%, 50%)
- Model 9
  - 2 convolutional + max-pooling layers with 64 filters each
  - 2 feed-forward layers with 128 then 64 neurons
  - Dropout of varying ratios at different layers (75%, 75%, 50%, 50%)
- Model 10
  - 2 convolutional + max-pooling layers with 64 filters each
  - 2 feed-forward layers with 256 then 128 neurons
  - Dropout of varying ratios at different layers (75%, 75%, 50%, 50%)
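As a concrete example, a minimal Keras sketch of Model 7 as described above. The kernel size, pool size and optimizer are assumptions; refer to model_hyperparameters.txt for the actual values:

```python
# Sketch of Model 7: two conv + max-pooling blocks (32 filters each),
# feed-forward layers of 128 and 64 neurons, dropout of 75/75/50/50%.
# Kernel size, pool size and optimizer are assumptions.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

n_classes = 37  # 47 Balanced classes minus the 10 digits

model_7 = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.75),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.75),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(n_classes, activation='softmax'),
])
model_7.compile(optimizer='adam',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
```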
The model with the best performance on the validation set is selected, and its performance on the testing set is reported as the final model performance.
The models' performances on the validation dataset are as follows:
- Model 0: Score: 51.17%
- Model 1: Score: 50.78%
- Model 2: Score: 39.42%
- Model 3: Score: 41.52%
- Model 4: Score: 40.50%
- Model 5: Score: 53.79%
- Model 6: Score: 77.15%
- Model 7: Score: 83.62%
- Model 8: Score: 76.51%
- Model 9: Score: 71.24%
- Model 10: Score: 57.59%
The best result comes from Model 7 (83.62%). This model's accuracy on the test set is 83.58%.
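The selection step itself is straightforward. A hypothetical sketch, reusing the data arrays from the split sketch above and assuming `models` maps model names to trained Keras models (none of these names are the programme's actual identifiers):

```python
# Hypothetical selection step: pick the model with the best validation
# accuracy, then report its accuracy on the held-out test set.
val_scores = {name: m.evaluate(x_val, y_val, verbose=0)[1]
              for name, m in models.items()}
best = max(val_scores, key=val_scores.get)            # e.g. 'model_7'
test_acc = models[best].evaluate(x_te, y_te, verbose=0)[1]
print('Best model: %s, test accuracy: %.2f%%' % (best, 100 * test_acc))
```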
- Keras (High-level execution of deep learning models)
- Keras-Adversarial (High-level execution of deep learning models) - installation files downloaded from GitHub instead of PyPI
- TensorFlow (Engine for deep learning models)
- Scikit-learn (For data pre-processing)
- SciPy (For processing the dataset)
- NumPy (For loading the dataset)
- h5py (For saving Keras models)
- Matplotlib (Optional) (For displaying images)
- Written using Python 3.6 (64-bit)
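For example, the dependencies can be installed with pip. The Keras-Adversarial repository location below is an assumption, since that package is installed from GitHub rather than PyPI:

pip install tensorflow keras scikit-learn scipy numpy h5py matplotlib
pip install git+https://github.com/bstriner/keras-adversarial.git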
- Install the required packages (see the example command above)
- After downloading the files from GitHub, open the folder and run:
python __init__.py ./dataset/emnist-balanced-small.mat -m ./model
Note: -m ./model is optional and can be disregarded if you do not wish to load the pre-built models.
- Upon reaching the Main Menu, you have various options to work on:
- Train
- Enter the filepaths to your specified model hyperparameters and model training parameters to train new models and add to the collection
- Predict
- Select from "basic", "keras", and "keras-adversarial" for the 3 implementations of the image generator
- Select a character for image generation and prediction
- Save
- Save the current model collection to a folder
- Load
- Load a model collection from a folder
- Loading a model collection immediately upon entering the Main Menu is equivalent to specifying -m ./model on the command line
- Results
- Display the validation scores of the models in the current model collection
- Exit
- Exit the programme and return to command line
Dataset
- [1] Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373
This program was built from scratch; however, similar pieces of work were later identified on the internet. These include:
- [2] https://github.com/Coopss/EMNIST/blob/master/training.py
- [3] https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
- [4] https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
- Building GAN using Keras
- Building GAN using Keras-Adversarial