reference paper : http://arxiv.org/abs/1505.02074
dataset : http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/
theano
- simple : fc7 features with RNNs
- attention : conv5_4 features (similar to 'Show, Attend and Tell', http://arxiv.org/abs/1502.03044) with RNNs
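
A rough sketch of what the attention variant computes at one RNN step, in plain NumPy rather than theano. This is not the repository's code. It assumes conv5_4 comes from VGG-19 on a 224x224 input (14x14 = 196 regions of 512 dims) and uses a one-layer scoring MLP in the style of soft attention; the names `soft_attention`, `W_a`, `W_h`, `w` and the sizes `H`, `K` are hypothetical.

```python
import numpy as np

def soft_attention(regions, hidden, W_a, W_h, w):
    """Soft attention over image regions.

    regions: (L, D) conv feature vectors, one per spatial location
    hidden:  (H,)   current RNN hidden state
    Returns the attended context vector (D,) and the weights (L,).
    """
    # Score every region against the hidden state with a small MLP.
    scores = np.tanh(regions @ W_a + hidden @ W_h) @ w   # (L,)
    # Numerically stable softmax over the L regions.
    scores = scores - scores.max()
    alpha = np.exp(scores) / np.exp(scores).sum()
    # Context is the attention-weighted average of the region features.
    context = alpha @ regions                            # (D,)
    return context, alpha

rng = np.random.default_rng(0)
L, D = 196, 512      # assumed: 14x14 regions of 512-d conv5_4 features
H, K = 256, 128      # hypothetical hidden and attention sizes
regions = rng.standard_normal((L, D))   # random stand-in for real features
hidden = rng.standard_normal(H)
W_a = rng.standard_normal((D, K)) * 0.01
W_h = rng.standard_normal((H, K)) * 0.01
w = rng.standard_normal(K) * 0.01

context, alpha = soft_attention(regions, hidden, W_a, W_h, w)
```

The context vector would then be fed to the RNN together with the current question word, so the model can look at different regions as it reads the question.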
tensorflow
- simple : fc7 features with RNNs
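
A minimal sketch of the "simple" setup, again in plain NumPy rather than theano or tensorflow. It is an illustration under assumptions, not the repository's implementation: the 4096-d fc7 vector is projected to the word-embedding size and fed as the first input step, a plain tanh RNN stands in for whatever recurrent cell the code uses, and the last hidden state is mapped to answer scores. All names (`answer_logits`, `W_img`, `embed`, ...) and sizes are hypothetical.

```python
import numpy as np

def answer_logits(fc7, question_ids, params):
    """Score candidate answers from one image feature and a question.

    fc7:          (4096,) image feature
    question_ids: list of integer word indices
    """
    # Project the image feature into the word-embedding space so it can
    # be consumed by the RNN like an extra first "word" (assumed wiring).
    x0 = fc7 @ params["W_img"]                                   # (E,)
    inputs = [x0] + [params["embed"][i] for i in question_ids]
    h = np.zeros(params["W_hh"].shape[0])
    for x in inputs:
        # Plain tanh recurrence, used here as a stand-in for the real cell.
        h = np.tanh(x @ params["W_xh"] + h @ params["W_hh"] + params["b_h"])
    # Map the final hidden state to one score per candidate answer.
    return h @ params["W_out"] + params["b_out"]

rng = np.random.default_rng(0)
F, E, H, V, A = 4096, 64, 32, 50, 10   # hypothetical sizes
params = {
    "W_img": rng.standard_normal((F, E)) * 0.01,
    "embed": rng.standard_normal((V, E)) * 0.1,
    "W_xh": rng.standard_normal((E, H)) * 0.1,
    "W_hh": rng.standard_normal((H, H)) * 0.1,
    "b_h": np.zeros(H),
    "W_out": rng.standard_normal((H, A)) * 0.1,
    "b_out": np.zeros(A),
}
fc7 = rng.standard_normal(F)       # random stand-in for a real fc7 feature
question = [3, 17, 8, 42]          # made-up word indices
logits = answer_logits(fc7, question, params)
predicted = int(np.argmax(logits)) # index of the highest-scoring answer
```

Training would put a softmax cross-entropy loss on `logits`; that part is left out here.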