
An algorithm that facilitates communication between a speech-impaired person and someone who doesn't understand sign language using Residual networks


abraham-21/keras-signs-resnet

 
 


The main benefit of a very deep network is that it can represent very complex functions. It can also learn features at many different levels of abstraction, from edges (at the lower layers) to very complex features (at the deeper layers). However, a huge barrier to training such networks is vanishing gradients: the gradient signal shrinks as it is backpropagated through many layers, so the earliest layers learn very slowly.

ResNets have two advantages:

  1. A "shortcut" or "skip connection" allows the gradient to be backpropagated directly to earlier layers, reducing the vanishing gradient problem. [Figure: shortcut]
  2. ResNet blocks with a shortcut make it very easy for a block to learn an identity function.
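
Both points can be illustrated with a toy calculation. The sketch below is not the repository's code: it assumes a chain of scalar, linear "layers" with a single hypothetical weight `w`, which is far simpler than a real convolutional block, but it shows why adding the input back keeps the gradient from collapsing and why a zero weight gives the identity.

```python
def plain_chain_gradient(w, depth):
    """d(output)/d(input) for `depth` stacked layers h -> w * h."""
    grad = 1.0
    for _ in range(depth):
        grad *= w  # each plain layer scales the gradient by w
    return grad


def residual_chain_gradient(w, depth):
    """Same derivative for layers h -> h + w * h (the shortcut adds h back)."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + w  # the shortcut contributes the extra "1"
    return grad


def residual_layer(h, w):
    """One toy residual layer: learned path w * h plus the shortcut h."""
    return h + w * h


depth = 50
plain = plain_chain_gradient(0.1, depth)        # shrinks toward zero
residual = residual_chain_gradient(0.1, depth)  # stays well away from zero
identity_out = residual_layer(3.0, 0.0)         # w = 0 leaves the input unchanged

print(plain, residual, identity_out)
```

In a real block the learned path is a stack of convolutions and the sum passes through a ReLU, so the identity argument holds for non-negative activations; the scalar version only captures the intuition.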

Training set: 1080 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (180 pictures per number)

Test set: 120 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (20 pictures per number)

Here are examples for each number, with the corresponding labels converted to one-hot vectors. [Figure: signs_dataset]
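
As a rough illustration of the one-hot conversion, here is a minimal sketch in plain Python. The function name `one_hot` is an assumption made for this example, not a name taken from the repository.

```python
def one_hot(labels, num_classes=6):
    """Convert integer labels (0..num_classes-1) into one-hot rows."""
    encoded = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        row = [0] * num_classes
        row[label] = 1  # mark the position of the true class
        encoded.append(row)
    return encoded


print(one_hot([0, 3, 5]))
# [[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1]]
```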

Architecture:

  1. The input is an image of size 64 x 64 x 3 (RGB), normalized by dividing each pixel value by 255.
  2. Model: [Figure: architecture], which is composed of identity blocks ([Figure: identity]) and convolution blocks ([Figure: convolution]).
  3. The last fully connected layer outputs the probability of the image belonging to each of the six classes.
  4. ReLU activation function, categorical cross-entropy loss, Adam optimizer.
  5. Mini-batch gradient descent with a minibatch_size of 32.
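
The preprocessing, loss, and batching steps listed above can be sketched in plain Python. The helper names are hypothetical, and the softmax on the final layer is an assumption (the usual pairing with categorical cross-entropy); the repository itself relies on Keras for these steps.

```python
import math


def normalize_pixels(pixels):
    """Scale 8-bit pixel values from [0, 255] to [0, 1]."""
    return [p / 255 for p in pixels]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_label, probs):
    """Loss for one example: -log of the probability of the true class."""
    return -sum(math.log(p) for y, p in zip(one_hot_label, probs) if y)


def minibatch_sizes(num_examples, batch_size=32):
    """Sizes of the mini-batches that cover the data set once."""
    full, rest = divmod(num_examples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


# An untrained model that scores all six classes equally
probs = softmax([0.0] * 6)
loss = categorical_cross_entropy([0, 0, 1, 0, 0, 0], probs)
sizes = minibatch_sizes(1080)

print(round(loss, 3))         # ln(6), the loss of a uniform guess
print(len(sizes), sizes[-1])  # number of batches per epoch, size of the last one
```

With 1080 training images and a batch size of 32, one epoch consists of 33 full batches plus a final batch of 24. The uniform-guess loss of ln(6) ≈ 1.79 gives a baseline against which the reported losses can be read.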

The model is CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK x 2 -> CONVBLOCK -> IDBLOCK x 3 -> CONVBLOCK -> IDBLOCK x 5 -> CONVBLOCK -> IDBLOCK x 2 -> AVGPOOL -> FLATTEN -> FC
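
To make the layer string easier to inspect, the sketch below expands it into a flat list and counts the blocks. The weight-layer count rests on an assumption that the README does not state: that every CONVBLOCK and IDBLOCK has three convolutions on its main path, as in the common bottleneck design.

```python
ARCHITECTURE = (
    "CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK x 2 -> "
    "CONVBLOCK -> IDBLOCK x 3 -> CONVBLOCK -> IDBLOCK x 5 -> "
    "CONVBLOCK -> IDBLOCK x 2 -> AVGPOOL -> FLATTEN -> FC"
)

# Assumption (not stated in the README): three convolutions per block main path
CONVS_PER_BLOCK = 3


def expand(architecture):
    """Expand 'NAME x N' tokens into N repeated layer names."""
    layers = []
    for token in architecture.split(" -> "):
        name, _, repeat = token.partition(" x ")
        layers.extend([name] * (int(repeat) if repeat else 1))
    return layers


layers = expand(ARCHITECTURE)
residual_blocks = layers.count("CONVBLOCK") + layers.count("IDBLOCK")
# Initial CONV2D + convolutions inside the blocks + final FC layer
weight_layers = 1 + residual_blocks * CONVS_PER_BLOCK + 1

print(len(layers), residual_blocks, weight_layers)
```

Under that assumption the count comes to 50 weight layers spread over 16 residual blocks, which matches the layout usually called ResNet-50.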

Outcome:

  1. Training cost graph: [Figure: cost]
  2. Train accuracy - 0.917
     Train loss - 0.59

     Test accuracy - 0.875
     Test loss - 0.77
  3. TODO: to overcome overfitting, add L2 or dropout regularization
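
For the L2 option in the TODO, one common formulation adds a penalty proportional to the sum of squared weights to the data cost. The sketch below uses the `lam / (2 * m)` scaling, which is one convention among several; the function names and the example values are hypothetical, and nothing here has been run against this model.

```python
def l2_penalty(weights, lam, m):
    """L2 term: lam / (2 * m) times the sum of squared weights."""
    return lam / (2 * m) * sum(w * w for w in weights)


def regularized_cost(data_cost, weights, lam, m):
    """Data cost (e.g. cross-entropy) plus the L2 penalty."""
    return data_cost + l2_penalty(weights, lam, m)


# Hypothetical values: two weights, lam = 0.1, mini-batch of 32 examples
penalty = l2_penalty([1.0, 2.0], lam=0.1, m=32)
cost = regularized_cost(0.59, [1.0, 2.0], lam=0.1, m=32)

print(penalty, cost)
```

Larger weights raise the cost, so the optimizer is pushed toward smaller weights, which tends to narrow the gap between train and test performance. Setting `lam` to 0 recovers the unregularized cost.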
