# Dockerized TensorFlow Object Detection API
Drivers and CUDA versions are a pain to deal with. Praise our lord and savior Docker.
This was tested on a host machine with the following characteristics.

```
user@machine:~/work/docker/tf_object_detection_api$ nvidia-smi
Wed Jan  8 13:04:47 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce MX250       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   44C    P0    N/A /  N/A |    190MiB /  2002MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
```
However, you should not need CUDA itself, only a driver for your GPU.

These prerequisites are only for the TensorFlow Object Detection API:

- an NVIDIA driver
- docker
- nvidia-docker
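A quick way to see whether these are in place is to check what is on your `PATH`. This is a hypothetical helper, not part of the repo; depending on how you installed the NVIDIA runtime, it may ship as `nvidia-docker` or as the newer `nvidia-container-cli`, so the script reports on both:

```shell
# check_prereqs.sh (hypothetical helper): report which prerequisites are on PATH.
# Depending on your install, the NVIDIA runtime may expose `nvidia-docker`
# or the newer `nvidia-container-cli`; either one is fine.
check_prereqs() {
  for cmd in docker nvidia-smi nvidia-docker nvidia-container-cli; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found: $cmd"
    else
      echo "missing: $cmd"
    fi
  done
}

check_prereqs
```

A `missing: docker` or `missing: nvidia-smi` line means the corresponding prerequisite still needs installing.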
```shell
# build the object detection API image
docker build -t odapi -f ./dockerfiles/odapi.dockerfile ./dockerfiles

# build the image for tensorflow js conversions
docker build -t tfjs -f ./dockerfiles/tfjs.dockerfile ./dockerfiles

# build the image for tensorflow lite conversions
docker build -t toco -f ./dockerfiles/toco.dockerfile ./dockerfiles
```
You can check that you built the 3 images by running `docker images`. Here is a summary of each:

- `odapi` runs the TensorFlow Object Detection API, which at the time of writing only runs on TensorFlow 1.x, so it pulls from the `tensorflow/tensorflow:1.15.0-gpu-py3` image as its base.
- `tfjs` encapsulates the `tensorflowjs` suite, and its function is to provide a TensorFlow JS converter. Since it does not require a GPU, and its latest version requires TensorFlow 2.x, it pulls from `tensorflow/tensorflow:2.1.0-py3` as its base image.
- finally, `toco` comes with TOCO (i.e. the TensorFlow Lite Converter) built with Bazel. It pulls from the same base image as `tfjs`, which is therefore cached.
A `docker-compose.yml` is in the works to handle ports, volumes, launch programs, etc. In the meantime, here are a few options.
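As a starting point, a compose file mirroring the `docker run` commands below might look like this. This is an untested sketch: the service names, the image tags, and the `/home/oceanus` mount are taken from this README, while the GPU reservation syntax is an assumption that requires a recent Docker Compose (older versions should stick to `docker run --gpus` as shown below):

```yaml
# docker-compose.yml (sketch, untested) - mirrors the `docker run` commands below.
version: "3.8"
services:
  odapi:
    image: odapi
    volumes:
      - .:/home/oceanus
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  tfjs:
    image: tfjs
    volumes:
      - .:/home/oceanus
  toco:
    image: toco
    volumes:
      - .:/home/oceanus
```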
## TensorFlow Object Detection API with GPU support

```shell
docker run --rm -it --gpus all -v $(pwd):/home/oceanus odapi
```

To run this container without a GPU, remove the `--gpus` flag:

```shell
docker run --rm -it -v $(pwd):/home/oceanus odapi
```

To start a container for TensorFlow JS conversions:

```shell
docker run --rm -it -v $(pwd):/home/oceanus tfjs
```
This section describes the scripts you have to run in order to get a model, in the relevant order.
It goes from a supervisely-formatted dataset (placed in `./input/supervisely`) to `./input/tf_csv`, and lastly to `./input/tf_records`.
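The directory skeleton the pipeline expects can be created up front. The paths below are taken from the TFRecord commands further down (`images/train` and `images/test` match the `--img_path` flags):

```shell
# Create the directory layout the conversion scripts read from and write to.
mkdir -p input/supervisely \
         input/tf_csv/images/train \
         input/tf_csv/images/test \
         input/tf_records
```

Place your supervisely export under `input/supervisely` before running the conversions.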
For these steps, you only need a Python installation with `tensorflow`, or you can use the `odapi` container in tty mode.

```shell
python ./input/supervisely2tf_csv.py  # will use defaults
```
After you have a `tf_csv`-formatted dataset, run the following to generate the TFRecords.
```shell
# Generate `train.record`
python input/generate_tfrecord.py \
  --csv_input=input/tf_csv/train.csv \
  --output_path=input/tf_records/train.record \
  --img_path=input/tf_csv/images/train \
  --label_map=input/tf_csv/label_map.pbtxt

# Generate `test.record`
python ./input/generate_tfrecord.py \
  --csv_input=./input/tf_csv/test.csv \
  --output_path=./input/tf_records/test.record \
  --img_path=./input/tf_csv/images/test \
  --label_map=./input/tf_csv/label_map.pbtxt
```
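Since the two invocations only differ in the split name, they can be wrapped in a loop. The `DRY_RUN` switch is a convenience added here (not part of the repo scripts); it defaults to printing the commands so you can inspect them before executing:

```shell
# Hypothetical wrapper around generate_tfrecord.py: one loop for both splits.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
for split in train test; do
  cmd="python input/generate_tfrecord.py \
    --csv_input=input/tf_csv/${split}.csv \
    --output_path=input/tf_records/${split}.record \
    --img_path=input/tf_csv/images/${split} \
    --label_map=input/tf_csv/label_map.pbtxt"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    eval "$cmd"
  fi
done
```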
This should have you set up with the dataset.
Boot up an `odapi` container. From within it, run

```shell
python model_training/train_cli.py
```

and follow the prompts.
After training the model, and before obtaining the tflite and `tensorflow_js` formats, you need to export from a `model.ckpt-*` binary to the other formats: `*graph.pb` and `saved_model`.

To do so, from an `odapi` container, run

```shell
python model_training/export_cli.py
```

and follow the interactive prompt. You'll pick the model and the respective checkpoint to export from.
For this to work, you need to have trained a model: the export does not work with the `model.ckpt` prefix nor with the `pipeline.config` (configured for the 90 classes of the COCO dataset) that come with the download from tensorflow.org. Besides, that `pipeline.config` includes a `batch_norm_trainable` field which the export scripts from the TensorFlow Object Detection API do not support. In other words, you need a `model.ckpt-*` and a `training.config` that match each other and are set up to work with the export scripts, and these are generated by training.
The resulting model of the previous part comes in three formats:

- `FrozenGraph`, deprecated in TensorFlow 2.
- `SavedModel`, which will be used to convert to TensorFlow JS.
- a tflite-compatible inference graph, used to obtain a `tflite` executable, saved in the files `tflite/tflite_graph.pb` and `tflite/tflite_graph.pbtxt`.
This is taken from the official guide to getting a tflite model, which you may want to read.
Boot the purpose-specific container:

```shell
docker run -it --rm -v $(pwd):/home/oceanus toco
# this is the only container that will run as root btw
```
Edit the `tflite_graph2tflite.sh` script so that `MODEL_NAME` matches that of your target output model. Once that is ensured, just run the script from within the container:

```shell
./container_scripts/tflite_graph2tflite.sh
```
The files for TensorFlow Lite will be placed in the `./output/$MODEL_NAME/export/tflite` directory.
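Rather than opening an editor, the `MODEL_NAME` assignment can also be patched with `sed`. This is a sketch that assumes the script contains a plain `MODEL_NAME=...` line (the same idea applies to `saved_model2tfjs.sh`); it is demonstrated on a stand-in file so the snippet is self-contained:

```shell
# Sketch: patch the MODEL_NAME assignment in a conversion script with sed.
# demo.sh stands in for container_scripts/tflite_graph2tflite.sh here.
printf 'MODEL_NAME=old_model\n' > demo.sh

# Rewrite the assignment line in place (GNU sed syntax).
sed -i 's/^MODEL_NAME=.*/MODEL_NAME=my_model/' demo.sh

cat demo.sh  # prints: MODEL_NAME=my_model
```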
Boot the corresponding container:

```shell
docker run -it --rm -v $(pwd):/home/oceanus tfjs
```
Edit the `saved_model2tfjs.sh` script so that `MODEL_NAME` matches that of your target output model. Once that is ensured, just execute the script from within the container:

```shell
./container_scripts/saved_model2tfjs.sh
```
The files for TensorFlow JS will be placed in the `./output/$MODEL_NAME/export/tensorflow_js` directory.
If you want to run a model on a Coral TPU, one option is to train a model that supports TPU training: any model marked with a star (☆) in the Model Zoo list (not tested).

Another option is to train an `ssd_mobilenet_v2_quantized_coco` model, which easily converts to a uint8-tflite-formatted model, as per the tflite conversion section above (tested).