Additional inception fixes
nealwu committed Mar 27, 2017
1 parent 9da4485 commit c539b46
Showing 7 changed files with 26 additions and 39 deletions.
37 changes: 13 additions & 24 deletions inception/README.md
@@ -111,15 +111,12 @@ ready to train or evaluate with the ImageNet data set.
intensive task and depending on your compute setup may take several days or even
weeks.

-*Before proceeding* please read the [Convolutional Neural Networks]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial in
-particular focus on [Training a Model Using Multiple GPU Cards]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards)
-. The model training method is nearly identical to that described in the
+*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
+particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
CIFAR-10 multi-GPU model training. Briefly, the model training

-* Places an individual model replica on each GPU. Split the batch across the
-GPUs.
+* Places an individual model replica on each GPU.
+* Splits the batch across the GPUs.
* Updates model parameters synchronously by waiting for all GPUs to finish
processing a batch of data.
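
The bulleted recipe above is ordinary synchronous data parallelism. A rough, framework-agnostic sketch of the idea, using a toy NumPy linear model rather than the actual Inception training graph (all names and sizes here are illustrative):

```python
import numpy as np

def replica_gradient(params, x, y):
    """Mean-squared-error gradient for one replica's shard of the batch."""
    pred = x @ params
    return 2.0 * x.T @ (pred - y) / len(x)

def synchronous_step(params, batch_x, batch_y, num_gpus=4, lr=0.01):
    # Split the batch across the (simulated) GPUs, one shard per replica.
    x_shards = np.array_split(batch_x, num_gpus)
    y_shards = np.array_split(batch_y, num_gpus)
    # Every replica computes gradients on its shard; we wait for all of them.
    grads = [replica_gradient(params, xs, ys)
             for xs, ys in zip(x_shards, y_shards)]
    # Average the per-replica gradients and apply one synchronous update.
    return params - lr * np.mean(grads, axis=0)

params = np.zeros(8)
batch_x = np.random.randn(32, 8)
batch_y = np.random.randn(32)
params = synchronous_step(params, batch_x, batch_y)
```

The key property is that the parameters advance exactly once per global batch, regardless of how many replicas participate.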

@@ -245,11 +242,9 @@ We term each machine that maintains model parameters a `ps`, short for
`ps` as the model parameters may be sharded across multiple machines.

Variables may be updated with synchronous or asynchronous gradient updates. One
-may construct a an [`Optimizer`]
-(https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
-that constructs the necessary graph for either case diagrammed below from
-TensorFlow [Whitepaper]
-(http://download.tensorflow.org/paper/whitepaper2015.pdf):
+may construct a an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
+that constructs the necessary graph for either case diagrammed below from the
+TensorFlow [Whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf):
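
To make the synchronous versus asynchronous distinction concrete before the diagrams, here is a minimal sketch in plain Python (this is not TensorFlow graph code; `grad_fns` stands in for gradients arriving from independent workers, and all names are illustrative):

```python
import numpy as np

def synchronous_update(params, grad_fns, lr=0.01):
    # Every worker computes its gradient against the same parameter values;
    # the ps waits for all of them, averages, and applies a single update.
    grads = [grad_fn(params) for grad_fn in grad_fns]
    return params - lr * np.mean(grads, axis=0)

def asynchronous_updates(params, grad_fns, lr=0.01):
    # Each worker reads the parameters once, and the ps applies every
    # gradient as it arrives, so later updates are computed from
    # parameters that have already moved (gradient staleness).
    snapshot = params.copy()
    for grad_fn in grad_fns:
        params = params - lr * grad_fn(snapshot)
    return params
```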

<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
@@ -380,10 +375,8 @@ training Inception in a distributed manner.
Evaluating an Inception v3 model on the ImageNet 2012 validation data set
requires running a separate binary.

-The evaluation procedure is nearly identical to [Evaluating a Model]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating-a-model)
-described in the [Convolutional Neural Network]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
+The evaluation procedure is nearly identical to [Evaluating a Model](https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating_a_model)
+described in the [Convolutional Neural Network](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.

**WARNING** Be careful not to run the evaluation and training binary on the same
GPU or else you might run out of memory. Consider running the evaluation on a
@@ -438,8 +431,7 @@ daisy, dandelion, roses, sunflowers, tulips
There is a single automated script that downloads the data set and converts it
to the TFRecord format. Much like the ImageNet data set, each record in the
TFRecord format is a serialized `tf.Example` proto whose entries include a
-JPEG-encoded string and an integer label. Please see [`parse_example_proto`]
-(inception/image_processing.py) for details.
+JPEG-encoded string and an integer label. Please see [`parse_example_proto`](inception/image_processing.py) for details.
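
For readers unfamiliar with the format, here is a minimal sketch of writing one such serialized `tf.Example`. It assumes the TensorFlow 1.x API this repository targets, and the feature keys, label value, and file names are illustrative rather than an exact copy of what `build_image_data.py` emits:

```python
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# Hypothetical input image; any JPEG file works for the sketch.
with open('roses/img_0001.jpg', 'rb') as f:
    encoded_jpeg = f.read()

example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes_feature(encoded_jpeg),     # JPEG-encoded string
    'image/format': _bytes_feature(b'JPEG'),
    'image/class/label': _int64_feature(3),            # integer label
    'image/class/text': _bytes_feature(b'roses'),      # human-readable label
}))

with tf.python_io.TFRecordWriter('train-00000-of-00002') as writer:
    writer.write(example.SerializeToString())
```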

The script just takes a few minutes to run depending your network connection
speed for downloading and processing the images. Your hard disk requires 200MB
@@ -471,14 +463,12 @@ and `validation-?????-of-00002`, respectively.
**NOTE** If you wish to prepare a custom image data set for transfer learning,
you will need to invoke [`build_image_data.py`](inception/data/build_image_data.py) on
your custom data set. Please see the associated options and assumptions behind
-this script by reading the comments section of [`build_image_data.py`]
-(inception/data/build_image_data.py). Also, if your custom data has a different
+this script by reading the comments section of [`build_image_data.py`](inception/data/build_image_data.py). Also, if your custom data has a different
number of examples or classes, you need to change the appropriate values in
[`imagenet_data.py`](inception/imagenet_data.py).

The second piece you will need is a trained Inception v3 image model. You have
-the option of either training one yourself (See [How to Train from Scratch]
-(#how-to-train-from-scratch) for details) or you can download a pre-trained
+the option of either training one yourself (See [How to Train from Scratch](#how-to-train-from-scratch) for details) or you can download a pre-trained
model like so:

```shell
@@ -806,8 +796,7 @@ comments in [`image_processing.py`](inception/image_processing.py) for more deta
#### The model runs out of CPU memory.

In lieu of buying more CPU memory, an easy fix is to decrease
-`--input_queue_memory_factor`. See [Adjusting Memory Demands]
-(#adjusting-memory-demands).
+`--input_queue_memory_factor`. See [Adjusting Memory Demands](#adjusting-memory-demands).

#### The model runs out of GPU memory.

9 changes: 4 additions & 5 deletions inception/inception/data/build_image_data.py
@@ -32,7 +32,7 @@
train_directory/train-00000-of-01024
train_directory/train-00001-of-01024
...
-train_directory/train-00127-of-01024
+train_directory/train-01023-of-01024
and
@@ -50,7 +50,7 @@
image/width: integer, image width in pixels
image/colorspace: string, specifying the colorspace, always 'RGB'
image/channels: integer, specifying the number of channels, always 3
-image/format: string, specifying the format, always'JPEG'
+image/format: string, specifying the format, always 'JPEG'
image/filename: string containing the basename of the image file
e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -60,7 +60,7 @@
image/class/text: string specifying the human-readable version of the label
e.g. 'dog'
-If you data set involves bounding boxes, please look at build_imagenet_data.py.
+If your data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
@@ -72,7 +72,6 @@
import sys
import threading


import numpy as np
import tensorflow as tf

@@ -306,7 +305,7 @@ def _process_image_files(name, filenames, texts, labels, num_shards):
spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
ranges = []
for i in range(len(spacing) - 1):
-ranges.append([spacing[i], spacing[i+1]])
+ranges.append([spacing[i], spacing[i + 1]])

# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
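
As an aside, the spacing computation in this snippet partitions the file list into contiguous, nearly equal index ranges, one per thread. A standalone NumPy illustration with small, made-up sizes:

```python
import numpy as np

num_threads = 4
num_files = 10
spacing = np.linspace(0, num_files, num_threads + 1).astype(int)
# spacing -> array([ 0,  2,  5,  7, 10])
ranges = [[spacing[i], spacing[i + 1]] for i in range(len(spacing) - 1)]
# ranges  -> [[0, 2], [2, 5], [5, 7], [7, 10]]
```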
9 changes: 4 additions & 5 deletions inception/inception/data/build_imagenet_data.py
@@ -36,7 +36,7 @@
train_directory/train-00000-of-01024
train_directory/train-00001-of-01024
...
-train_directory/train-00127-of-01024
+train_directory/train-01023-of-01024
and
@@ -54,7 +54,7 @@
image/width: integer, image width in pixels
image/colorspace: string, specifying the colorspace, always 'RGB'
image/channels: integer, specifying the number of channels, always 3
-image/format: string, specifying the format, always'JPEG'
+image/format: string, specifying the format, always 'JPEG'
image/filename: string containing the basename of the image file
e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -80,7 +80,7 @@
Note that the length of xmin is identical to the length of xmax, ymin and ymax
for each example.
-Running this script using 16 threads may take around ~2.5 hours on a HP Z420.
+Running this script using 16 threads may take around ~2.5 hours on an HP Z420.
"""
from __future__ import absolute_import
from __future__ import division
@@ -92,7 +92,6 @@
import sys
import threading


import numpy as np
import tensorflow as tf

@@ -435,7 +434,7 @@ def _process_image_files(name, filenames, synsets, labels, humans,
ranges = []
threads = []
for i in range(len(spacing) - 1):
-ranges.append([spacing[i], spacing[i+1]])
+ranges.append([spacing[i], spacing[i + 1]])

# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
inception/inception/data/download_and_preprocess_flowers.sh
@@ -35,7 +35,7 @@
set -e

if [ -z "$1" ]; then
-echo "usage download_and_preprocess_flowers.sh [data dir]"
+echo "Usage: download_and_preprocess_flowers.sh [data dir]"
exit
fi

inception/inception/data/download_and_preprocess_flowers_mac.sh
@@ -35,7 +35,7 @@
set -e

if [ -z "$1" ]; then
-echo "usage download_and_preprocess_flowers.sh [data dir]"
+echo "Usage: download_and_preprocess_flowers.sh [data dir]"
exit
fi

4 changes: 2 additions & 2 deletions inception/inception/data/download_and_preprocess_imagenet.sh
@@ -49,7 +49,7 @@
set -e

if [ -z "$1" ]; then
-echo "usage download_and_preprocess_imagenet.sh [data dir]"
+echo "Usage: download_and_preprocess_imagenet.sh [data dir]"
exit
fi

@@ -84,7 +84,7 @@ BOUNDING_BOX_FILE="${SCRATCH_DIR}/imagenet_2012_bounding_boxes.csv"
BOUNDING_BOX_DIR="${SCRATCH_DIR}bounding_boxes/"

"${BOUNDING_BOX_SCRIPT}" "${BOUNDING_BOX_DIR}" "${LABELS_FILE}" \
-| sort >"${BOUNDING_BOX_FILE}"
+| sort > "${BOUNDING_BOX_FILE}"
echo "Finished downloading and preprocessing the ImageNet data."

# Build the TFRecords version of the ImageNet data.
2 changes: 1 addition & 1 deletion inception/inception/data/download_imagenet.sh
@@ -24,7 +24,7 @@
# downloading the raw images.
#
# usage:
-# ./download_imagenet.sh [dirname]
+# ./download_imagenet.sh [dir name] [synsets file]
set -e

if [ "x$IMAGENET_ACCESS_KEY" == x -o "x$IMAGENET_USERNAME" == x ]; then
