Image Captioning with attention Pytorch

Problem Definition

Image caption generation is an interesting artificial intelligence problem where a descriptive sentence is generated for a given image. It involves the dual techniques from computer vision to understand the content of the image and a language model from the field of natural language processing to turn the understanding of the image into words in the right order

Examples

a man sitting in front of a laptop computer

a close up of a red fire hydrant

a person on a surfboard in the water

Dataset

COCO dataset 80K training Image 40K val Image with 5 captioning for every image

Model-Architecture

Encoder

Transfer learning : encode the image (224 * 224 * 3 ) into 4096 feature vector using pre-trained VGG 16 Linear layer : to get small feature vector with shape of 512

Decoder

Its job to encode the feature vector and generate word by word
Since it’s generating sequence , it will need RNN, we will use LSTM

Attention

In a setting with Attention, we want the Decoder to be able to look at different parts of the image at different points in the sequence.

Beam-Search

The straightforward – and greedy – option would be to choose the word with the highest score and use it to predict the next word
It would be best if we could somehow not decide until we've finished decoding completely, and choose the sequence that has the highest overall score from a basket of candidate sequences.

Extra-Examples

pizza sitting on top of a wooden table

A man riding a wave on top of a surfboard

A close up of a plate of food with broccoli

A small boat on a body of water

A box filled with different type of donuts

A baseball player swinging a bat at ball

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
.idea		.idea
Img		Img
models		models
DataLoader.py		DataLoader.py
README.md		README.md
eval.py		eval.py
model.py		model.py
sample.py		sample.py
train.py		train.py
util.py		util.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Image Captioning with attention Pytorch

Problem Definition

Examples

a man sitting in front of a laptop computer

a close up of a red fire hydrant

a person on a surfboard in the water

Contents

Dataset

Model-Architecture

Encoder

Decoder

Attention

Beam-Search

Extra-Examples

About

Releases

Packages

Languages

ZiadAbdElrahman/Image-Captioning-Pytorch

Folders and files

Latest commit

History

Repository files navigation

Image Captioning with attention Pytorch

Problem Definition

Examples

a man sitting in front of a laptop computer

a close up of a red fire hydrant

a person on a surfboard in the water

Contents

Dataset

Model-Architecture

Encoder

Decoder

Attention

Beam-Search

Extra-Examples

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages