Welcome to NVIDIA's deep learning inference workshop and end-to-end realtime object recognition library for Jetson TX1.
In this tutorial, you'll learn to deploy efficient neural networks using NVIDIA GPU Inference Engine and to stream from a live camera feed to perform realtime object recognition with the AlexNet and GoogLeNet networks.
- Table of Contents
- Introduction
- Building nvcaffe
- Installing GPU Inference Engine
- Compiling from Source
- Running the Recognition Demo
Note: this branch of the tutorial is verified against JetPack 2.2 / L4T R24.1 aarch64.
Deep-learning networks typically have two primary phases of development: training and inference.
During the training phase, the network learns from a large dataset of labeled examples. The weights of the neural network become optimized to recognize the patterns contained within the training dataset. Deep neural networks have many layers of neurons connected together. Deeper networks take increasingly longer to train and evaluate, but are ultimately able to encode more intelligence within them.
Throughout training, the network's inference performance is tested and refined using a trial dataset. Like the training dataset, the trial dataset is labeled with ground-truth so the network's accuracy can be evaluated, but its examples are not included in the training dataset. The network continues to train iteratively until it reaches a certain level of accuracy set by the user.
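As a rough illustration of the evaluation loop described above, the sketch below measures accuracy on a held-out labeled set and stops once a user-chosen target is reached. The `accuracy` and `train_until` helpers, the toy threshold "model", and the data are all hypothetical stand-ins; real training would be done in a framework such as Caffe through DIGITS.

```python
def accuracy(predict, examples):
    """Fraction of (input, label) pairs that `predict` labels correctly."""
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)


def train_until(step, predict_for, trial_set, target, max_iters=100):
    """Run training steps until trial-set accuracy reaches `target`.

    `step(state)` returns the next model state and `predict_for(state)`
    returns a prediction function for it. Both are hypothetical stand-ins
    for a real training framework.
    """
    state, acc = None, 0.0
    for iteration in range(1, max_iters + 1):
        state = step(state)
        acc = accuracy(predict_for(state), trial_set)
        if acc >= target:
            return state, iteration, acc
    return state, max_iters, acc


# Toy problem: the "model" is a single threshold t, predicting 1 when x >= t.
# The trial set is labeled with ground truth and is never used by `step`.
trial_set = [(1, 0), (3, 0), (4, 0), (6, 1), (8, 1), (9, 1)]

final_state, iterations, final_acc = train_until(
    step=lambda t: 1 if t is None else t + 1,   # "training" nudges the threshold
    predict_for=lambda t: (lambda x: int(x >= t)),
    trial_set=trial_set,
    target=1.0,
)
print(final_state, iterations, final_acc)
```

The point of keeping the trial set separate is that the stopping criterion then reflects how the network behaves on data it has not seen during training.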
Due to the size of the datasets and deep inference networks, training is typically very resource-intensive and can take weeks or months on traditional compute architectures. However, using GPUs vastly accelerates the process, bringing it down to days or hours.
Using DIGITS, anyone can easily get started and interactively train their networks with GPU acceleration.
DIGITS is an open-source project contributed by NVIDIA, located here: https://github.com/NVIDIA/DIGITS.
This tutorial will use DIGITS and Jetson TX1 together for training and deploying deep-learning networks, referred to as the DIGITS workflow.
Using its trained weights, the network evaluates live data at runtime. In this phase, called inference, the network predicts and applies reasoning based on the examples it learned. Due to the depth of deep learning networks, inference requires significant compute resources to process imagery and other sensor data in realtime. However, using NVIDIA's GPU Inference Engine, which uses Jetson's integrated NVIDIA GPU, inference can be deployed onboard embedded platforms. Applications in robotics like picking, autonomous navigation, agriculture, and industrial inspection have many uses for deploying deep inference, including:
- Image recognition
- Object detection
- Segmentation
- Image registration (homography estimation)
- Depth from raw stereo
- Signal analytics
- Others?
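To make the inference step concrete for the image recognition case, here is a minimal sketch of how a network's raw output scores might be turned into a predicted class and a confidence value. It assumes a softmax over the scores, which is how AlexNet and GoogLeNet produce class probabilities; the `softmax` and `classify` helpers, the scores, and the labels are hypothetical and are not part of GIE or this repo.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the (label, confidence) pair of the highest-probability class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical output scores for a three-class network.
label, confidence = classify([1.0, 3.0, 0.5], ["cat", "dog", "bird"])
print(label, round(confidence, 3))
```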
A special branch of Caffe that includes support for FP16 is used on TX1.
The code is released in the experimental/fp16 branch of NVIDIA's Caffe repo, located at https://github.com/NVIDIA/caffe. Start by installing the build dependencies:
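For readers unfamiliar with FP16, the following sketch shows what storing a value in half precision means: each value takes 2 bytes instead of 4 and is rounded to the nearest representable half-precision number. It uses only Python's standard `struct` module and has nothing to do with Caffe's own FP16 code path; the `to_fp16` helper is a hypothetical name.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Half precision uses 2 bytes per value, single precision uses 4.
print(struct.calcsize("<e"), struct.calcsize("<f"))

# Values that are not exactly representable pick up a small rounding error.
for x in (1.0, 0.1, 3.14159):
    print(x, to_fp16(x), abs(x - to_fp16(x)))
```

The trade-off is lower memory traffic and storage in exchange for reduced precision, which is why FP16 support has to be built into the framework explicitly.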
$ sudo apt-get install protobuf-compiler libprotobuf-dev cmake git libboost-thread1.55-dev libgflags-dev libgoogle-glog-dev libhdf5-dev libatlas-dev libatlas-base-dev libatlas3-base liblmdb-dev libleveldb-dev
The Snappy package needs a symbolic link created for Caffe to link correctly:
$ sudo ln -s /usr/lib/libsnappy.so.1 /usr/lib/libsnappy.so
$ sudo ldconfig
$ git clone -b experimental/fp16 https://github.com/NVIDIA/caffe
This checks out the repo to a local directory called caffe on your Jetson.
$ cd caffe
$ cp Makefile.config.example Makefile.config
$ sed -i 's/# NATIVE_FP16/NATIVE_FP16/g' Makefile.config
$ sed -i 's/# USE_CUDNN/USE_CUDNN/g' Makefile.config
$ sed -i 's/-gencode arch=compute_50,code=compute_50/-gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53/g' Makefile.config
$ make all
$ make test
$ make runtest
NVIDIA's GPU Inference Engine (GIE) is an optimized backend for evaluating deep inference networks in prototxt format.
First, extract the archive:
$ tar -zxvf gie.aarch64-cuda7.0-1.0-ea.tar.gz
The directory structure is as follows:
| \bin where the samples are built to
| \data sample network model / prototxt's
| \doc API documentation and User Guide
| \include
| \lib
| \samples
If you flashed your Jetson TX1 with JetPack or already have cuDNN installed, remove the version of cuDNN that comes with GIE:
$ cd GIE/lib
$ rm libcudnn*
$ cd ../../
$ cd GIE/samples/sampleMNIST
$ make TARGET=tx1
Compiling: sampleMNIST.cpp
Linking: ../../bin/sample_mnist_debug
Compiling: sampleMNIST.cpp
Linking: ../../bin/sample_mnist
$ cd ../sampleGoogleNet
$ make TARGET=tx1
Compiling: sampleGoogleNet.cpp
Linking: ../../bin/sample_googlenet_debug
Compiling: sampleGoogleNet.cpp
Linking: ../../bin/sample_googlenet
$ cd ../../../
$ cd GIE/bin
$ ./sample_mnist
@@@@@@@@@%+-: =@@@@@@@@@@@@
@@@@@@@%= -@@@**@@@@@@@
@@@@@@@ :%#@-#@@@. #@@@@@@
@@@@@@* +@@@@:*@@@ *@@@@@@
@@@@@@# +@@@@ @@@% @@@@@@@
@@@@@@@. :%@@.@@@. *@@@@@@@
@@@@@@@@- =@@@@. -@@@@@@@@
@@@@@@@@@%: +@- :@@@@@@@@@
@@@@@@@@@@@%. : -@@@@@@@@@@
@@@@@@@@@@@@@+ #@@@@@@@@@@
@@@@@@@@@@@@@@+ :@@@@@@@@@@
@@@@@@@@@@@@@@+ *@@@@@@@@@
@@@@@@@@@@@@@@: = @@@@@@@@@
@@@@@@@@@@@@@@ :@ @@@@@@@@@
@@@@@@@@@@@@@@ -@ @@@@@@@@@
@@@@@@@@@@@@@# +@ @@@@@@@@@
@@@@@@@@@@@@@* ++ @@@@@@@@@
@@@@@@@@@@@@@* *@@@@@@@@@
@@@@@@@@@@@@@# =@@@@@@@@@@
@@@@@@@@@@@@@@. +@@@@@@@@@@@
8: **********
The MNIST sample randomly selects an image of a numeral 0-9, which is then classified with the MNIST network using GIE. In this example, the network correctly recognized the image as #8.
This tutorial also provides examples of running GoogLeNet and AlexNet on a live camera feed for object recognition.
To obtain the repository, navigate to a folder of your choosing on the Jetson. First, make sure git and cmake are installed locally:
sudo apt-get install git cmake
Then clone the jetson-inference repo:
git clone https://github.com/dusty-nv/jetson-inference
When cmake is run, a special pre-installation script (CMakePreBuild.sh) is run and will automatically install any dependencies.
cd jetson-inference
mkdir build
cd build
cmake ../
Make sure you are still in the jetson-inference/build directory created above, then run make:
cd jetson-inference/build # omit if pwd is already /build from above
make
Depending on architecture, the package will be built to either armhf or aarch64, with the following directory structure:
\aarch64 (64-bit)
   \bin where the sample binaries are built to
   \include where the headers reside
   \lib where the libraries are built to
\armhf (32-bit)
   \bin where the sample binaries are built to
   \include where the headers reside
   \lib where the libraries are built to
For example, on aarch64 the binaries reside in aarch64/bin, headers in aarch64/include, and libraries in aarch64/lib.
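If a script needs to locate the build output, the architecture-dependent layout above can be handled with a small lookup. This is a sketch only: the `build_subdir` helper and its mapping from machine names to directories are assumptions for illustration, not how the project's CMake scripts actually choose the directory.

```python
import os
import platform


def build_subdir(machine=None):
    """Map a machine name (as reported by `uname -m`) to the build subdirectory."""
    machine = (machine or platform.machine()).lower()
    if machine in ("aarch64", "arm64"):
        return "aarch64"                    # 64-bit ARM
    if machine.startswith("arm"):
        return "armhf"                      # 32-bit ARM, e.g. armv7l
    raise ValueError("unsupported architecture: " + machine)


def bin_dir(build_root, machine=None):
    """Path to the sample binaries for the given (or current) architecture."""
    return os.path.join(build_root, build_subdir(machine), "bin")


print(bin_dir("jetson-inference/build", "aarch64"))
```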
The realtime image recognition demo is located in /aarch64/bin and is called imagenet-camera. It runs on a live camera stream and, depending on user arguments, loads GoogLeNet or AlexNet with GPU Inference Engine:
$ cd jetson-inference/build/aarch64/bin
$ ./imagenet-camera googlenet # to run using googlenet
$ ./imagenet-camera alexnet # to run using alexnet
The frames per second (FPS), the name of the object classified from the video, and the confidence of the classification are printed to the OpenGL window title bar. By default the application can recognize up to 1000 different types of objects, since GoogLeNet and AlexNet are trained on the ILSVRC12 ImageNet database, which contains 1000 classes of objects. The mapping of names for the 1000 types of objects is included in the repo under data/networks/ilsvrc12_synset_words.txt
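As a sketch of how the class-name mapping can be used, the snippet below parses text in the usual Caffe synset-words layout, where each line holds a WordNet ID followed by a space and a comma-separated list of names, and the line position is the class index. That layout is an assumption about ilsvrc12_synset_words.txt, the `parse_synset_words` helper is hypothetical, and the three sample lines are for illustration; to use the real file, read its contents and pass them to the same function.

```python
def parse_synset_words(text):
    """Return a list of (wordnet_id, names) tuples, indexed by class number."""
    labels = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                        # skip blank lines
        wnid, _, names = line.partition(" ")
        labels.append((wnid, names.strip()))
    return labels


# Illustrative sample in the assumed layout (the real file has 1000 lines).
SAMPLE = """\
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark
"""

labels = parse_synset_words(SAMPLE)
class_index = 1                             # e.g. the index of the top-scoring class
wnid, names = labels[class_index]
print(class_index, wnid, names)
```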