fix a bug in cnnclassify, more improvements on convnet.md

liuliu committed Mar 7, 2014
1 parent 18331cd commit be8db89
Showing 3 changed files with 25 additions and 14 deletions.
6 changes: 3 additions & 3 deletions bin/image-net.c
@@ -44,7 +44,7 @@ int main(int argc, char** argv)
         .mini_batch = 256,
         .iterations = 5000,
         .symmetric = 1,
-        .color_gain = 0.001,
+        .color_gain = 0.01,
     };
     int i, c;
     while (getopt_long_only(argc, argv, "", image_net_options, &c) != -1)
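A note on what color_gain controls: as far as I can tell it scales the PCA-based color
augmentation from Krizhevsky et al.'s paper, so this change makes the per-image color
perturbation 10x stronger. Below is a minimal sketch of that augmentation scheme, assuming
interleaved RGB floats; the eigen-statistics and function names are illustrative placeholders,
not ccv's actual code:

#include <stdlib.h>
#include <math.h>

/* placeholder RGB eigen-decomposition of the training set's pixel
 * covariance; a real run would compute these from the data */
static const float eigvec[3][3] = {
    { -0.57f,  0.72f,  0.40f },
    { -0.58f, -0.00f, -0.81f },
    { -0.58f, -0.69f,  0.42f },
};
static const float eigval[3] = { 0.22f, 0.019f, 0.0045f };

/* standard-normal sample via Box-Muller */
static float gauss(void)
{
    float u = (rand() + 1.0f) / ((float)RAND_MAX + 2);
    float v = (rand() + 1.0f) / ((float)RAND_MAX + 2);
    return sqrtf(-2 * logf(u)) * cosf(2 * 3.1415926f * v);
}

/* shift every pixel along the color eigenvectors; color_gain is the
 * standard deviation of the per-image random coefficients alpha */
static void color_augment(float* rgb, int pixels, float color_gain)
{
    float alpha[3], shift[3] = { 0, 0, 0 };
    int i, k;
    for (k = 0; k < 3; k++)
        alpha[k] = gauss() * color_gain;
    for (k = 0; k < 3; k++)
        for (i = 0; i < 3; i++)
            shift[k] += eigvec[k][i] * eigval[i] * alpha[i];
    for (i = 0; i < pixels; i++)
        for (k = 0; k < 3; k++)
            rgb[3 * i + k] += shift[k];
}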
@@ -409,10 +409,10 @@ int main(int argc, char** argv)
     for (i = 0; i < 13; i++)
     {
         layer_params[i].w.decay = 0.0005;
-        layer_params[i].w.learn_rate = 0.00001;
+        layer_params[i].w.learn_rate = 0.0001;
         layer_params[i].w.momentum = 0.9;
         layer_params[i].bias.decay = 0;
-        layer_params[i].bias.learn_rate = 0.00001;
+        layer_params[i].bias.learn_rate = 0.0001;
         layer_params[i].bias.momentum = 0.9;
     }
     layer_params[10].dor = 0.5;
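The learn_rate change is a 10x bump while momentum and weight decay stay at the AlexNet-style
0.9 / 0.0005. For readers unfamiliar with how the three knobs interact, here is a minimal sketch
of the momentum-SGD update they drive (a generic formulation, not ccv's actual kernel; the
struct and function names are mine):

#include <stddef.h>

typedef struct {
    float learn_rate; /* step size; this commit raises it 10x */
    float momentum;   /* 0.9: fraction of the previous velocity kept */
    float decay;      /* 0.0005: L2 penalty pulling weights toward zero */
} sgd_param_t;

/* one momentum-SGD step over a flat weight vector w with velocity v:
 * v <- momentum * v - learn_rate * (grad + decay * w); w <- w + v */
static void sgd_step(float* w, float* v, const float* grad, size_t n, sgd_param_t p)
{
    size_t i;
    for (i = 0; i < n; i++)
    {
        v[i] = p.momentum * v[i] - p.learn_rate * (grad[i] + p.decay * w[i]);
        w[i] += v[i];
    }
}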
24 changes: 19 additions & 5 deletions doc/convnet.md
@@ -32,7 +32,9 @@ What about the performance?
 
 ConvNet on the very large scale is not extremely fast. There are a few implementations available
 for ConvNet that focused on speed performance, such as [Caffe from Berkeley](http://caffe.berkeleyvision.org/),
-or [OverFeat from NYU](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start).
+or [OverFeat from NYU](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start). Although not
+explicitly optimized for speed (ccv chooses correctness over speed in this preliminary implementation),
+the ConvNet implementation presented in ccv is, speed-wise, in line with the other implementations.
 
 Therefore, the analysis related to performance is implemented on ImageNet dataset and the network
 topology followed the exact specification detailed in the paper.
@@ -44,18 +44,30 @@ TODO:
 Speed-wise:
 
 The experiment conducted on a computer with Core i7 3770, NVIDIA TITAN graphic card at stock
-frequency, and Samsung MZ-7TE500BW 500GiB SSD with clang, libdispatch, GNU Scientific Library.
+frequency, and Samsung MZ-7TE500BW 500GiB SSD with clang, libdispatch, libatlas and GNU
+Scientific Library.
 
 The CPU version of forward pass (from RGB image input to the classification result) takes about
-350ms per image. This is achieved with multi-threaded convolutional kernel computation.
+350ms per image. This is achieved with multi-threaded convolutional kernel computation. Decaf
+(the CPU counterpart of Caffe) reported their forward pass at around 0.5s per image on
+unspecified hardware, over 10 patches (the same as ccv's cnnclassify implementation). I cannot
+get a sensible number out of OverFeat on my machine (it reports about 1.4s for a forward pass,
+which makes little sense). Their reported number is 1s per image with an unspecified
+configuration on unspecified hardware (I suspect that their unspecified configuration does
+much more than the 10-patch averaging that ccv or Decaf does).
 
 The GPU version does forward pass + backward error propagate for batch size of 256 in about 1.6s.
-Thus, training ImageNet convolutional network takes about 9 days with 100 epochs.
+Thus, training the ImageNet convolutional network takes about 9 days for 100 epochs. Caffe reported
+their forward pass + backward error propagate for batch size of 256 in about 1.8s on Tesla K20
+(known to be about 30% slower across the board than TITAN). In the paper, Alex reported 90 epochs
+within 6 days on two GeForce 580s, which suggests my timing is in line with these implementations.
 
 As a preliminary implementation, ccv didn't spend enough time to optimize these operations if any
 at all. For example, [cuda-convnet](http://code.google.com/p/cuda-convnet/) implements its
 functionalities in about 10,000 lines of code, Caffe implements with 14,000 lines of code, as of
-this release, ccv implements with about 3,700 lines of code.
+this release, ccv implements with about 3,700 lines of code. For the future, low-hanging
+optimization opportunities include using SIMD instructions, doing FFT in densely convolved
+layers, etc.
 
 How to train my own image classifier?
 -------------------------------------
Expand Down
9 changes: 3 additions & 6 deletions lib/ccv_convnet.c
@@ -514,7 +514,7 @@ void ccv_convnet_classify(ccv_convnet_t* convnet, ccv_dense_matrix_t** a, int sy
     assert(CCV_GET_CHANNEL(a[i]->type) == convnet->channels);
     assert(a[i]->rows == convnet->input.height);
     assert(a[i]->cols == convnet->input.width);
-    b[0] = a[i];
+    ccv_subtract(a[i], convnet->mean_activity, (ccv_matrix_t**)b, CCV_32F);
     // doing the first few layers until the first full connect layer
     int rows, cols;
     int previous_rows = convnet->input.height;
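This one-line change is the cnnclassify bug named in the commit message: the classification path
previously fed the raw image into the network (b[0] = a[i]), skipping the mean subtraction the
network was trained with. As a standalone fragment, the corrected preprocessing looks roughly
like this (same ccv_subtract call as in the diff; `image` and the surrounding scaffolding are
illustrative):

ccv_dense_matrix_t* input = 0;
/* input = image - mean_activity, promoted to 32-bit float; ccv_subtract
 * allocates the output when the handle passed in is zero */
ccv_subtract(image, convnet->mean_activity, (ccv_matrix_t**)&input, CCV_32F);
/* ... run the convolutional layers on input instead of the raw image ... */
ccv_matrix_free(input);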
@@ -531,11 +531,8 @@
         _ccv_convnet_layer_deduce_output_format(layer, &previous_rows, &previous_cols, &partition);
         layer->input.matrix.rows = rows;
         layer->input.matrix.cols = cols;
-        if (j > 0)
-        {
-            ccv_matrix_free(b[j]);
-            b[j] = 0;
-        }
+        ccv_matrix_free(b[j]);
+        b[j] = 0;
     }
     int c = (!!symmetric + 1) * 5;
     ccv_convnet_layer_t* start_layer = convnet->layers + last;
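The removed `if (j > 0)` guard is a direct consequence of the first hunk: b[0] used to alias the
caller's matrix a[i] and must not be freed, whereas now b[0] is a matrix that ccv_subtract
allocated, so every consumed b[j] belongs to this function. Annotated (comments mine):

/* release b[j] once the next layer has consumed it; safe at j == 0 now,
 * because b[0] is the ccv_subtract output, not the caller's input */
ccv_matrix_free(b[j]);
b[j] = 0;

Incidentally, the `int c = (!!symmetric + 1) * 5;` line just below is the "10 patches" mentioned
in convnet.md: 5 crops, doubled to 10 when symmetric mirroring is enabled.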
