fix a bug in cnnclassify, more improvements on convnet.md

liuliu committed Mar 7, 2014
1 parent 18331cd commit be8db89
Showing 3 changed files with 25 additions and 14 deletions.
6 changes: 3 additions & 3 deletions bin/image-net.c
@@ -44,7 +44,7 @@ int main(int argc, char** argv)
         .mini_batch = 256,
         .iterations = 5000,
         .symmetric = 1,
-        .color_gain = 0.001,
+        .color_gain = 0.01,
     };
     int i, c;
     while (getopt_long_only(argc, argv, "", image_net_options, &c) != -1)
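A note on what color_gain controls: as far as I can tell it scales the PCA-based color
augmentation from Krizhevsky et al.'s paper, so this change makes the per-image color
perturbation 10x stronger. Below is a minimal sketch of that augmentation scheme, assuming
interleaved RGB floats; the eigen-statistics and function names are illustrative placeholders,
not ccv's actual code:

#include <stdlib.h>
#include <math.h>

/* placeholder RGB eigen-decomposition of the training set's pixel
 * covariance; a real run would compute these from the data */
static const float eigvec[3][3] = {
    { -0.57f,  0.72f,  0.40f },
    { -0.58f, -0.00f, -0.81f },
    { -0.58f, -0.69f,  0.42f },
};
static const float eigval[3] = { 0.22f, 0.019f, 0.0045f };

/* standard-normal sample via Box-Muller */
static float gauss(void)
{
    float u = (rand() + 1.0f) / ((float)RAND_MAX + 2);
    float v = (rand() + 1.0f) / ((float)RAND_MAX + 2);
    return sqrtf(-2 * logf(u)) * cosf(2 * 3.1415926f * v);
}

/* shift every pixel along the color eigenvectors; color_gain is the
 * standard deviation of the per-image random coefficients alpha */
static void color_augment(float* rgb, int pixels, float color_gain)
{
    float alpha[3], shift[3] = { 0, 0, 0 };
    int i, k;
    for (k = 0; k < 3; k++)
        alpha[k] = gauss() * color_gain;
    for (k = 0; k < 3; k++)
        for (i = 0; i < 3; i++)
            shift[k] += eigvec[k][i] * eigval[i] * alpha[i];
    for (i = 0; i < pixels; i++)
        for (k = 0; k < 3; k++)
            rgb[3 * i + k] += shift[k];
}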
@@ -409,10 +409,10 @@ int main(int argc, char** argv)
     for (i = 0; i < 13; i++)
     {
         layer_params[i].w.decay = 0.0005;
-        layer_params[i].w.learn_rate = 0.00001;
+        layer_params[i].w.learn_rate = 0.0001;
         layer_params[i].w.momentum = 0.9;
         layer_params[i].bias.decay = 0;
-        layer_params[i].bias.learn_rate = 0.00001;
+        layer_params[i].bias.learn_rate = 0.0001;
         layer_params[i].bias.momentum = 0.9;
     }
     layer_params[10].dor = 0.5;
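The learn_rate change is a 10x bump while momentum and weight decay stay at the AlexNet-style
0.9 / 0.0005. For readers unfamiliar with how the three knobs interact, here is a minimal sketch
of the momentum-SGD update they drive (a generic formulation, not ccv's actual kernel; the
struct and function names are mine):

#include <stddef.h>

typedef struct {
    float learn_rate; /* step size; this commit raises it 10x */
    float momentum;   /* 0.9: fraction of the previous velocity kept */
    float decay;      /* 0.0005: L2 penalty pulling weights toward zero */
} sgd_param_t;

/* one momentum-SGD step over a flat weight vector w with velocity v:
 * v <- momentum * v - learn_rate * (grad + decay * w); w <- w + v */
static void sgd_step(float* w, float* v, const float* grad, size_t n, sgd_param_t p)
{
    size_t i;
    for (i = 0; i < n; i++)
    {
        v[i] = p.momentum * v[i] - p.learn_rate * (grad[i] + p.decay * w[i]);
        w[i] += v[i];
    }
}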
24 changes: 19 additions & 5 deletions doc/convnet.md
@@ -32,7 +32,9 @@ What about the performance?
 
 ConvNet on the very large scale is not extremely fast. There are a few implementations available
 for ConvNet that focused on speed performance, such as [Caffe from Berkeley](http://caffe.berkeleyvision.org/),
-or [OverFeat from NYU](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start).
+or [OverFeat from NYU](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start). Although not
+explicitly optimized for speed (ccv chooses correctness over speed in this preliminary implementation),
+the ConvNet implementation presented in ccv is, speed-wise, in line with the other implementations.
 
 Therefore, the analysis related to performance is implemented on ImageNet dataset and the network
 topology followed the exact specification detailed in the paper.
@@ -44,18 +44,30 @@ TODO:
 Speed-wise:
 
 The experiment conducted on a computer with Core i7 3770, NVIDIA TITAN graphic card at stock
-frequency, and Samsung MZ-7TE500BW 500GiB SSD with clang, libdispatch, GNU Scientific Library.
+frequency, and Samsung MZ-7TE500BW 500GiB SSD with clang, libdispatch, libatlas and GNU
+Scientific Library.
 
 The CPU version of forward pass (from RGB image input to the classification result) takes about
-350ms per image. This is achieved with multi-threaded convolutional kernel computation.
+350ms per image. This is achieved with multi-threaded convolutional kernel computation. Decaf
+(the CPU counterpart of Caffe) reported their forward pass at around 0.5s per image on
+unspecified hardware, over 10 patches (the same as ccv's cnnclassify implementation). I cannot
+get a sensible number out of OverFeat on my machine (it reports about 1.4s for a forward pass,
+which makes little sense). Their reported number is 1s per image with an unspecified
+configuration on unspecified hardware (I suspect that their unspecified configuration does
+much more than the 10-patch averaging that ccv or Decaf does).
 
 The GPU version does forward pass + backward error propagate for batch size of 256 in about 1.6s.
-Thus, training ImageNet convolutional network takes about 9 days with 100 epochs.
+Thus, training the ImageNet convolutional network takes about 9 days for 100 epochs. Caffe reported
+their forward pass + backward error propagate for batch size of 256 in about 1.8s on Tesla K20
+(known to be about 30% slower across the board than TITAN). In the paper, Alex reported 90 epochs
+within 6 days on two GeForce 580s, which suggests my timing is in line with these implementations.
 
 As a preliminary implementation, ccv didn't spend enough time to optimize these operations if any
 at all. For example, [cuda-convnet](http://code.google.com/p/cuda-convnet/) implements its
 functionalities in about 10,000 lines of code, Caffe implements with 14,000 lines of code, as of
-this release, ccv implements with about 3,700 lines of code.
+this release, ccv implements with about 3,700 lines of code. For the future, low-hanging
+optimization opportunities include using SIMD instructions, doing FFT in densely convolved
+layers, etc.
 
 How to train my own image classifier?
 -------------------------------------
Expand Down
9 changes: 3 additions & 6 deletions lib/ccv_convnet.c
@@ -514,7 +514,7 @@ void ccv_convnet_classify(ccv_convnet_t* convnet, ccv_dense_matrix_t** a, int sy
     assert(CCV_GET_CHANNEL(a[i]->type) == convnet->channels);
     assert(a[i]->rows == convnet->input.height);
     assert(a[i]->cols == convnet->input.width);
-    b[0] = a[i];
+    ccv_subtract(a[i], convnet->mean_activity, (ccv_matrix_t**)b, CCV_32F);
     // doing the first few layers until the first full connect layer
     int rows, cols;
     int previous_rows = convnet->input.height;
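This one-line change is the cnnclassify bug named in the commit message: the classification path
previously fed the raw image into the network (b[0] = a[i]), skipping the mean subtraction the
network was trained with. As a standalone fragment, the corrected preprocessing looks roughly
like this (same ccv_subtract call as in the diff; `image` and the surrounding scaffolding are
illustrative):

ccv_dense_matrix_t* input = 0;
/* input = image - mean_activity, promoted to 32-bit float; ccv_subtract
 * allocates the output when the handle passed in is zero */
ccv_subtract(image, convnet->mean_activity, (ccv_matrix_t**)&input, CCV_32F);
/* ... run the convolutional layers on input instead of the raw image ... */
ccv_matrix_free(input);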
@@ -531,11 +531,8 @@
         _ccv_convnet_layer_deduce_output_format(layer, &previous_rows, &previous_cols, &partition);
         layer->input.matrix.rows = rows;
         layer->input.matrix.cols = cols;
-        if (j > 0)
-        {
-            ccv_matrix_free(b[j]);
-            b[j] = 0;
-        }
+        ccv_matrix_free(b[j]);
+        b[j] = 0;
     }
     int c = (!!symmetric + 1) * 5;
     ccv_convnet_layer_t* start_layer = convnet->layers + last;
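The removed `if (j > 0)` guard is a direct consequence of the first hunk: b[0] used to alias the
caller's matrix a[i] and must not be freed, whereas now b[0] is a matrix that ccv_subtract
allocated, so every consumed b[j] belongs to this function. Annotated (comments mine):

/* release b[j] once the next layer has consumed it; safe at j == 0 now,
 * because b[0] is the ccv_subtract output, not the caller's input */
ccv_matrix_free(b[j]);
b[j] = 0;

Incidentally, the `int c = (!!symmetric + 1) * 5;` line just below is the "10 patches" mentioned
in convnet.md: 5 crops, doubled to 10 when symmetric mirroring is enabled.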
