Models

Copy task

General

  • Batch size : 1
  • Architecture : Single NTM layer + Dense output layer
    • Memory shape : (128, 20)
    • Controller : Dense controller
    • Heads : 1 Read head + 1 Write head
  • Training examples
    • Size : 8
    • Minimum Length : 1
    • Maximum Length : 5
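
To make the training-example settings above concrete, here is a minimal sketch of a generator for one copy-task example with 8-bit vectors and a length drawn between 1 and 5. The function name `make_copy_example`, the extra delimiter channel, and the input/target layout are assumptions for illustration and are not taken from the repository.

```python
import numpy as np

def make_copy_example(size=8, min_length=1, max_length=5, rng=None):
    """Return (inputs, targets) for one copy-task example.

    Assumed layout (not taken from the repository): `size` data channels
    plus one delimiter channel. The sequence is presented, a delimiter is
    raised, and the targets then repeat the sequence.
    """
    rng = np.random.default_rng() if rng is None else rng
    length = int(rng.integers(min_length, max_length + 1))
    sequence = rng.integers(0, 2, size=(length, size)).astype(np.float32)

    total_steps = 2 * length + 1
    inputs = np.zeros((total_steps, size + 1), dtype=np.float32)
    targets = np.zeros((total_steps, size + 1), dtype=np.float32)

    inputs[:length, :size] = sequence        # presentation phase
    inputs[length, size] = 1.0               # delimiter flag
    targets[length + 1:, :size] = sequence   # recall phase
    return inputs, targets
```

With batch size 1, each call would produce the single sequence fed to the network for one training step.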

Optimization

  • Objective : Binary Cross-Entropy
    • Prediction truncated to [1e-10, 1 - 1e-10]
  • Learning algorithm : A. Graves' RMSProp
    • Learning rate : 1e-3
    • Chi : 0.95
    • Alpha : 0.9
    • Epsilon : 1e-4
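
The hyperparameters above can be read against a single update step. The sketch below assumes the RMSProp variant commonly attributed to A. Graves, with Chi as the decay rate of the running averages, Alpha as the momentum, and Epsilon as the stabiliser inside the square root. That mapping of names to roles, and the function name `graves_rmsprop_step`, are assumptions rather than the repository's code.

```python
import numpy as np

def graves_rmsprop_step(w, grad, state, learning_rate=1e-3, chi=0.95,
                        alpha=0.9, epsilon=1e-4):
    """Apply one update and return (new_w, new_state).

    `state` is a tuple (n, g, delta) of arrays shaped like `w`:
    running mean of squared gradients, running mean of gradients,
    and the previous update (momentum term).
    """
    n, g, delta = state
    n = chi * n + (1.0 - chi) * grad ** 2   # running mean of grad^2
    g = chi * g + (1.0 - chi) * grad        # running mean of grad
    # n - g**2 is a running variance estimate; epsilon keeps the root positive
    delta = alpha * delta - learning_rate * grad / np.sqrt(n - g ** 2 + epsilon)
    return w + delta, (n, g, delta)
```

A caller would start from `state = (zeros, zeros, zeros)` and thread the returned state through successive steps.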

Parameters

| | Parameter | W (init) | b (init) | nonlinearity |
|---|---|---|---|---|
| Controller | - | GlorotUniform() | Constant(0.) | rectify |
| Read Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| Write Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| | sign_add | None | - | - |
| | add | GlorotUniform() | Constant(0.) | rectify |
| | erase | GlorotUniform() | Constant(0.) | hard_sigmoid |
| Dense Layer | - | GlorotUniform() | Constant(0.) | sigmoid |
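
The nonlinearities in the table constrain each head parameter to a usable range: `rectify` keeps key and beta non-negative, `hard_sigmoid` keeps the gate in [0, 1], `softmax` makes the shift weights sum to 1, and `1. + rectify` keeps gamma at 1 or above. The NumPy sketch below illustrates those ranges only. The helper name `head_activations` is hypothetical, and the `hard_sigmoid` slope and offset (0.2 and 0.5) are assumed to follow Theano's definition.

```python
import numpy as np

def rectify(x):
    return np.maximum(x, 0.0)

def hard_sigmoid(x):
    # Piecewise-linear sigmoid approximation (assumed slope 0.2, offset 0.5)
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def softmax(x):
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()

def head_activations(pre):
    """Map raw pre-activations (dict of 1-D arrays) to head parameters."""
    return {
        "key": rectify(pre["key"]),
        "beta": rectify(pre["beta"]),            # key strength, >= 0
        "gate": hard_sigmoid(pre["gate"]),       # interpolation gate in [0, 1]
        "shift": softmax(pre["shift"]),          # shift distribution, sums to 1
        "gamma": 1.0 + rectify(pre["gamma"]),    # sharpening exponent, >= 1
    }
```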

Initialization

| | Initialization | Learn init? | Operation dropout |
|---|---|---|---|
| Memory | GlorotUniform() | False | - |
| Read Head | init.OneHot() | False | No |
| Write Head | init.OneHot() | False | No |
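
As a rough illustration of these initial values, the sketch below builds a (128, 20) memory drawn from a Glorot-style uniform distribution and two one-hot head weightings over the 128 slots. The function name `initial_state` is hypothetical, and placing the one-hot focus on slot 0 is an assumption, not something the table states.

```python
import numpy as np

def initial_state(memory_shape=(128, 20), rng=None):
    """Return (memory, read_weights, write_weights) as NumPy arrays."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = memory_shape

    # Glorot/Xavier uniform bound for a 2-D array: sqrt(6 / (rows + cols))
    limit = np.sqrt(6.0 / (rows + cols))
    memory = rng.uniform(-limit, limit, size=memory_shape)

    def one_hot():
        w = np.zeros(rows)
        w[0] = 1.0   # assumption: initial focus on the first memory slot
        return w

    return memory, one_hot(), one_hot()
```

Since "Learn init?" is False for all three, these values would stay fixed rather than being trained.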

Repeat Copy task

Git commit : 90d72d6

General

  • Batch size : 1
  • Architecture : Single NTM layer + Dense output layer
    • Memory shape : (128, 20)
    • Controller : Dense controller
    • Heads : 1 Read head + 1 Write head
  • Training examples
    • Size : 8
    • Minimum Length : 3
    • Maximum Length : 5
    • Minimum Repeat number : 1
    • Maximum Repeat number : 5
    • Unary : True

Optimization

  • Objective : Binary Cross-Entropy
    • Prediction truncated to [1e-10, 1 - 1e-10]
  • Learning algorithm : A. Graves' RMSProp
    • Learning rate : 1e-3
    • Chi : 0.95
    • Alpha : 0.9
    • Epsilon : 1e-4
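
The truncated objective listed above can be sketched in NumPy as follows. Clipping the predictions keeps both logarithms finite when the network outputs exactly 0 or 1. The function name `clipped_binary_crossentropy` is hypothetical, and whether the repository averages or sums the per-element losses is not stated here, so the sketch returns them unreduced.

```python
import numpy as np

def clipped_binary_crossentropy(predictions, targets, eps=1e-10):
    """Element-wise binary cross-entropy with predictions clipped to [eps, 1 - eps]."""
    # float64 is used so that 1 - 1e-10 stays distinct from 1.0
    p = np.clip(np.asarray(predictions, dtype=np.float64), eps, 1.0 - eps)
    t = np.asarray(targets, dtype=np.float64)
    return -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
```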

Parameters

| | Parameter | W (init) | b (init) | nonlinearity |
|---|---|---|---|---|
| Controller | - | GlorotUniform() | Constant(0.) | rectify |
| Read Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| Write Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| | sign_add | None | - | - |
| | add | GlorotUniform() | Constant(0.) | rectify |
| | erase | GlorotUniform() | Constant(0.) | hard_sigmoid |
| Dense Layer | - | GlorotUniform() | Constant(0.) | sigmoid |

Initialization

| | Initialization | Learn init? | Operation dropout |
|---|---|---|---|
| Memory | GlorotUniform() | False | - |
| Read Head | init.OneHot() | False | No |
| Write Head | init.OneHot() | False | No |

Associative Recall task

Git commit : 3bd7512

General

  • Batch size : 1
  • Architecture : Single NTM layer + Dense output layer
    • Memory shape : (128, 20)
    • Controller : Dense controller
    • Heads : 1 Read head + 1 Write head
  • Training examples
    • Size : 8
    • Minimum Item Length : 1
    • Maximum Item Length : 3
    • Minimum Number of Items : 2
    • Maximum Number of Items : 6

Optimization

  • Objective : Binary Cross-Entropy
    • Prediction truncated to [1e-10, 1 - 1e-10]
  • Learning algorithm : Adam
    • Learning rate : 1e-4
    • Beta1 : 0.9
    • Beta2 : 0.999
    • Epsilon : 1e-8
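
For reference, the Adam settings above correspond to the following textbook update step. This is a minimal NumPy sketch with a hypothetical function name `adam_step`. Library implementations may place epsilon slightly differently, so it should not be read as the exact code path used for training.

```python
import numpy as np

def adam_step(w, grad, state, learning_rate=1e-4, beta1=0.9,
              beta2=0.999, epsilon=1e-8):
    """Apply one Adam update and return (new_w, new_state).

    `state` is a tuple (m, v, t): first-moment estimate, second-moment
    estimate, and the number of steps taken so far.
    """
    m, v, t = state
    t = t + 1
    m = beta1 * m + (1.0 - beta1) * grad         # biased first moment
    v = beta2 * v + (1.0 - beta2) * grad ** 2    # biased second moment
    m_hat = m / (1.0 - beta1 ** t)               # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    new_w = w - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return new_w, (m, v, t)
```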

Parameters

| | Parameter | W (init) | b (init) | nonlinearity |
|---|---|---|---|---|
| Controller | - | GlorotUniform() | Constant(0.) | rectify |
| Read Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| Write Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| | sign_add | None | - | - |
| | add | GlorotUniform() | Constant(0.) | rectify |
| | erase | GlorotUniform() | Constant(0.) | hard_sigmoid |
| Dense Layer | - | GlorotUniform() | Constant(0.) | sigmoid |

Initialization

| | Initialization | Learn init? | Operation dropout |
|---|---|---|---|
| Memory | Constant(1e-6) | False | - |
| Read Head | init.OneHot() | False | No |
| Write Head | init.OneHot() | False | No |

Dyck Words task

Git commit : 873deec

General

  • Batch size : 1
  • Architecture : Single NTM layer + Dense output layer
    • Memory shape : (128, 20)
    • Controller : Dense controller
    • Heads : 1 Read head + 1 Write head
  • Training examples
    • Initial Maximum Semi-Length : 5
    • Curriculum : the maximum semi-length is doubled each time the mean loss over 500 samples falls below 1e-4, up to a maximum of 40
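
The length schedule described above can be sketched as a small helper that tracks recent losses and doubles the limit when their mean is low enough. The class name `SemiLengthCurriculum` is hypothetical, and two details are assumptions: the mean is taken over the most recent 500 samples, and the window is cleared after each increase.

```python
from collections import deque

class SemiLengthCurriculum:
    """Track per-sample losses and grow the maximum semi-length."""

    def __init__(self, initial=5, maximum=40, window=500, threshold=1e-4):
        self.max_semi_length = initial
        self.maximum = maximum
        self.threshold = threshold
        self.losses = deque(maxlen=window)

    def update(self, loss):
        """Record one loss value and return the current maximum semi-length."""
        self.losses.append(loss)
        window_full = len(self.losses) == self.losses.maxlen
        if window_full and sum(self.losses) / len(self.losses) < self.threshold:
            # Double the limit, capped at `maximum`
            self.max_semi_length = min(2 * self.max_semi_length, self.maximum)
            self.losses.clear()   # assumption: start a fresh window
        return self.max_semi_length
```

Starting from 5, the limit would pass through 10 and 20 before reaching the cap of 40.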

Optimization

  • Objective : Binary Cross-Entropy
    • Prediction truncated to [1e-10, 1 - 1e-10]
  • Learning algorithm : Adam
    • Learning rate : 1e-3
    • Beta1 : 0.9
    • Beta2 : 0.999
    • Epsilon : 1e-8

Parameters

| | Parameter | W (init) | b (init) | nonlinearity |
|---|---|---|---|---|
| Controller | - | GlorotUniform() | Constant(0.) | rectify |
| Read Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| Write Head | sign | None | - | - |
| | key | GlorotUniform() | Constant(0.) | rectify |
| | beta | GlorotUniform() | Constant(0.) | rectify |
| | gate | GlorotUniform() | Constant(0.) | hard_sigmoid |
| | shift | GlorotUniform() | Constant(0.) | softmax |
| | gamma | GlorotUniform() | Constant(0.) | 1. + rectify |
| | sign_add | None | - | - |
| | add | GlorotUniform() | Constant(0.) | rectify |
| | erase | GlorotUniform() | Constant(0.) | hard_sigmoid |
| Dense Layer | - | GlorotUniform() | Constant(0.) | sigmoid |

Initialization

| | Initialization | Learn init? | Operation dropout |
|---|---|---|---|
| Memory | Constant(1e-6) | False | - |
| Read Head | init.OneHot() | False | No |
| Write Head | init.OneHot() | False | No |