Demos that use deep convolutional neural networks to classify and caption a webcam stream in real time.
All demos run on the CPU, but adapting them to work with CUDA or OpenCL is straightforward.
Quick install on OS X:
brew install opencv3 --with-contrib
OpenCV_DIR=/usr/local/Cellar/opencv3/3.1.0/share/OpenCV luarocks install cv
brew install protobuf
luarocks install loadcaffe
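The Cellar path above hard-codes version 3.1.0, which may not match the version Homebrew actually installed. As a minimal sketch, assuming `brew --prefix opencv3` reports the formula's install prefix and that the CMake config files sit under `share/OpenCV` as in the path above, the directory could be derived instead of typed by hand. The helper names `opencv_dir_from_prefix` and `install_cv_bindings` are hypothetical and not part of the demos.

```shell
#!/bin/sh
# Hypothetical helper: build the OpenCV_DIR value from an install prefix.
# Assumption: the CMake config files live under <prefix>/share/OpenCV.
opencv_dir_from_prefix() {
    # ${1%/} drops a single trailing slash, if present
    printf '%s/share/OpenCV\n' "${1%/}"
}

# Hypothetical wrapper: ask Homebrew for the prefix, then install the bindings.
install_cv_bindings() {
    prefix="$(brew --prefix opencv3)" || return 1
    OpenCV_DIR="$(opencv_dir_from_prefix "$prefix")" luarocks install cv
}
```

After sourcing these definitions, `install_cv_bindings` would stand in for the `OpenCV_DIR=... luarocks install cv` line.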
On Linux you have to build OpenCV3 manually. Follow the instructions in
The demo takes a central crop from the webcam frame and classifies it with a small network pretrained on ImageNet. The top-5 predicted classes are shown at the top of the frame, with the most probable class listed first.
Run as th demo.lua
This demo uses two networks described here to predict the age and gender of the faces that it finds with a simple cascade detector.
Run as
th demo.lua `locate haarcascade_frontalface_default.xml`
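The `locate` call above can print several paths, or none at all, in which case the demo receives too many arguments or no cascade file. A minimal sketch of a more defensive invocation follows; it assumes `locate` prints one path per line, and the helper names `first_existing_file` and `run_age_gender_demo` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read candidate paths from stdin, one per line,
# and print the first one that is an existing regular file.
first_existing_file() {
    while IFS= read -r candidate; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical wrapper: run the demo only when a cascade file was found.
run_age_gender_demo() {
    cascade="$(locate haarcascade_frontalface_default.xml | first_existing_file)" || {
        echo "haarcascade_frontalface_default.xml not found" >&2
        return 1
    }
    th demo.lua "$cascade"
}
```

Calling `run_age_gender_demo` then passes exactly one existing path to `demo.lua`, or prints an error and stops.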
IMAGINE Lab gives an example:
This demo uses NeuralTalk2 captioning code from Andrej Karpathy:
The code captions a live webcam stream. Follow the installation instructions first, then run the demo as:
th videocaptioning.lua -gpuid -1 -model model_id1-501-1448236541_cpu.t7
The caption is displayed on top:
2016 Sergey Zagoruyko and Egor Burkov
Thanks to VisionLabs for putting up bindings!