updated and prepare for release

zxwglzi · Dec 16, 2014 · 95f4860
1 parent f3e7590

Showing 48 changed files with 4,900 additions and 67 deletions.
diff --git a/README.md b/README.md
@@ -10,17 +10,18 @@ Build Status
 Backstory
 ---------
 
-I set to build ccv with minimalism inspiration. That was back in 2010, out of
-frustration with the computer vision library then I was using, ccv was meant
-to be a much easier to deploy, simpler organized code with a bit caution of
-dependency hygiene. The simplicity and minimalistic nature at then, made it
-much easier to integrated into any server-side deployment environments.
+I set out to build ccv with minimalism as the inspiration. That was back in
+2010; out of frustration with the computer vision library I was using at the
+time, ccv was meant to be much easier to deploy, with simpler, better-organized
+code and some care for dependency hygiene. That simplicity and minimalism
+made it much easier to integrate into any server-side deployment
+environment.
 
 Portability and Embeddable
 --------------------------
 
 Fast forward to now, the world is quite different from then, but ccv adapts
-pretty well in the new, mobile-first environment. It now runs on Mac OSX,
+pretty well in this new, mobile-first environment. It now runs on Mac OSX,
 Linux, FreeBSD, Windows\*, iPhone, iPad, Android, Raspberry Pi. In fact,
 anything that has a proper C compiler probably can run ccv. The majority
 (with notable exception of convolutional networks, which requires a BLAS
@@ -29,15 +30,15 @@ library) of ccv will just work with no compilation flags or dependencies.
 Modern Computer Vision Algorithms
 ---------------------------------
 
-One core concept of ccv development is "application driven". As a result, ccv
-end up implementing a handful state-of-art algorithms. It includes
-a close to state-of-the-art image classifier, a state-of-the-art frontal face
-detector, reasonable collection of object detectors for pedestrians and cars.
-a useful text detection algorithm, a long term general object tracking algorithm,
+One core concept of ccv development is *application driven*. Thus, ccv ends
+up implementing a handful of state-of-the-art algorithms. It includes a close
+to state-of-the-art image classifier, a state-of-the-art frontal face detector,
+a reasonable collection of object detectors for pedestrians and cars, a useful
+text detection algorithm, a long-term general object tracking algorithm,
 and the long-standing feature point extraction algorithm.
 
-Cached Image Preprocessing
---------------------------
+Clean Interface with Cached Image Preprocessing
+-----------------------------------------------
 
 Many computer vision tasks nowadays consist of quite a few preprocessing
 layers: image pyramid generation, color space conversion etc. These potentially
@@ -50,3 +51,10 @@ implementation is what it lacks of. After years, we stuck in between either the
 high-performance, battle-tested but old algorithm implementations, or the new,
 shining but Matlab algorithms. ccv is my take on this problem, hope you enjoy
 it.
+
+License
+-------
+
+ccv's source code is distributed under the BSD 3-clause License.
+
+ccv's data models and documentation are distributed under the Creative Commons Attribution 4.0 International License.
diff --git a/bin/image-net.c b/bin/image-net.c
@@ -136,10 +136,10 @@ int main(int argc, char** argv)
 	for (i = 0; i < 21; i++)
 	{
 		layer_params[i].w.decay = 0.0005;
-		layer_params[i].w.learn_rate = 0.001;
+		layer_params[i].w.learn_rate = 0.0001;
 		layer_params[i].w.momentum = 0.9;
 		layer_params[i].bias.decay = 0;
-		layer_params[i].bias.learn_rate = 0.001;
+		layer_params[i].bias.learn_rate = 0.0001;
 		layer_params[i].bias.momentum = 0.9;
 	}
 	layer_params[18].dor = 0.5;

diff --git a/doc/convnet.md b/doc/convnet.md
@@ -157,11 +157,11 @@ is easy, for me on Ubuntu, it is about one line:
 Assuming you've downloaded devkit-1.0 from the above link, and found meta.mat file somewhere in that
 tarball, launching Octave interactive environment and run:
 
-	file = fopen('meta.txt', 'w+')
-	for i = 1:1000
-		fprintf(file, "%d %s %d\n", synsets(i).ILSVRC2010_ID, synsets(i).WNID, synsets(i).num_train_images)
-	endfor
-	fclose(file)
+	octave> file = fopen('meta.txt', 'w+')
+	octave> for i = 1:1000
+	octave> 	fprintf(file, "%d %s %d\n", synsets(i).ILSVRC2010_ID, synsets(i).WNID, synsets(i).num_train_images)
+	octave> endfor
+	octave> fclose(file)
 
 The newly created meta.txt file will give us the class id, the WordNet id, and the number of training
 image available for each class.
@@ -201,9 +201,9 @@ The generated image-net.sqlite3 file is about 600MiB in size because it contains
 and resume. You can either open this file with sqlite command-line tool (it is a vanilla sqlite database
 file), and do:
 
-	> drop table function_state;
-	> drop table momentum_data;
-	> vacuum;
+	sqlite> drop table function_state;
+	sqlite> drop table momentum_data;
+	sqlite> vacuum;
 
 The file size will shrink to about 200MiB. You can achieve further reduction in file size by rewrite it into
 half-precision, with ccv_convnet_write and write_param.half_precision = 1. The resulted image-net.sqlite3

diff --git a/site/0.6/doc/doc-bbf/index.html b/site/0.6/doc/doc-bbf/index.html
@@ -0,0 +1,198 @@
+<!doctype html>
+<html><head><meta charset="utf-8">
+<title>BBF: Brightness Binary Feature</title>
+<link rel="stylesheet" href="/stylesheets/styles.css">
+<link rel="stylesheet" href="/stylesheets/coderay.css">
+<script src="/javascripts/scale.fix.js"></script>
+<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
+<meta http-equiv="X-UA-Compatible" content="chrome=1">
+<!--[if lt IE 9]>
+<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
+<![endif]-->
+<script type="text/javascript">
+var _gaq = _gaq || [];
+_gaq.push(['_setAccount', 'UA-303081-6']);
+_gaq.push(['_trackPageview']);
+(function() {
+	var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+	ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+	var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+})();
+</script>
+</head><body><div class="wrapper">
+<header><h1><a href="/">ccv</a></h1>
+<p>A Modern Computer Vision Library</p>
+<p class="view"><a href="https://github.com/liuliu/ccv">View the Project on GitHub <small>liuliu/ccv</small></a></p>
+<ul>
+<li><a href="https://github.com/liuliu/ccv/zipball/stable">Download <strong>ZIP File</strong></a></li>
+<li><a href="https://github.com/liuliu/ccv/tarball/stable">Download <strong>TAR Ball</strong></a></li>
+<li><a href="https://github.com/liuliu/ccv">Fork On <strong>GitHub</strong></a></li>
+</ul>
+</header>
+<section><h1>BBF: Brightness Binary Feature</h1>
+<p><a href="/lib/ccv-bbf/">Library Reference: ccv_bbf.c</a></p>
+
+<h2 id="whats-bbf">What’s BBF?</h2>
+
+<p>The original paper refers to:
+YEF∗ Real-Time Object Detection, Yotam Abramson and Bruno Steux</p>
+
+<p>The improved version refers to:
+High-Performance Rotation Invariant Multiview Face Detection, Chang Huang, Haizhou Ai, Yuan Li and Shihong Lao</p>
+
+<h2 id="how-it-works">How does it work?</h2>
+
+<p>That’s a long story, please read the paper. But at least I can show you how to
+use the magic:</p>
+
+<pre><code>./bbfdetect &lt;Your Image contains Faces&gt; ../samples/face | ./bbfdraw.rb &lt;Your Image contains Faces&gt; output.png
+</code></pre>
+
+<p>Check out the output.png, now you get the idea.</p>
+
+<h2 id="what-about-the-performance">What about the performance?</h2>
+
+<p>The tests are performed with MIT+CMU face detection dataset
+(http://vasc.ri.cmu.edu/idb/html/face/frontal_images/index.html)</p>
+
+<p><strong>Setup</strong>:</p>
+
+<p>Download the tarball, copy the files in newtest/, test/ and test-low/ into a single
+folder, let’s say: all/. Since ccv doesn’t support the gif format, you need to convert
+the files on your own. If you have ImageMagick, it is handy:</p>
+
+<pre><code>for i in *.gif; do convert "$i" "$(basename "$i" .gif).png"; done
+</code></pre>
+
+<p>For the ground truth data, you can copy them out from
+http://vasc.ri.cmu.edu/idb/images/face/frontal_images/list.html Only Test Set A,
+B, C are needed.</p>
+
+<p>bbfdetect needs a list of files. You can generate it by running this command in the
+same directory as the bbfdetect binary:</p>
+
+<pre><code>find &lt;the directory of converted files&gt;/*.png &gt; filelist.txt
+</code></pre>
+
+<p><strong>Speed-wise</strong>:</p>
+
+<p>Run:</p>
+
+<pre><code>time ./bbfdetect filelist.txt ../samples/face &gt; result.txt
+</code></pre>
+
+<p>On my computer, it reports:</p>
+
+<pre><code>real    0m9.304s
+user    0m9.270s
+sys     0m0.010s
+</code></pre>
+
+<p>How about OpenCV’s face detector? I ran OpenCV with default settings on the same
+computer, and it reported:</p>
+
+<pre><code>real    0m27.977s
+user    0m27.860s
+sys     0m0.050s
+</code></pre>
+
+<p>You see the difference.</p>
+
+<p><strong>Accuracy-wise</strong>:</p>
+
+<p>I wrote a little script called bbfvldr.rb that checks the output of bbfdetect
+against the ground truth. Before running the script, you need to do some
+house-cleaning on result.txt:</p>
+
+<p>The result.txt file contains the full path to each file, but we only need
+the filename. Use your favorite editor to remove the directory information;
+for me, it is:</p>
+
+<pre><code>sed -i "s/\.\.\/test\/faces\///g" result.txt
+</code></pre>
+
+<p>Suppose you have copied the ground truth to truth.txt file, run the validator:</p>
+
+<pre><code>./bbfvldr.rb truth.txt result.txt
+</code></pre>
+
+<p>My result for bbfdetect is:</p>
+
+<pre><code>82.97% (12)
+</code></pre>
+
+<p>The former is the detection rate (how many faces are detected); the latter is
+the number of false alarms (how many non-face regions are detected as faces).</p>
+
+<p>The result for OpenCV default face detector is:</p>
+
+<pre><code>86.69% (15)
+</code></pre>
+
+<p>Well, we are a little behind, but you can train the detector yourself, just get
+a better data source!</p>
+
+<h2 id="how-to-train-my-own-detector">How to train my own detector?</h2>
+
+<p>In this chapter, I will go over how I trained the face detector myself. To be
+honest, I lost my face detector training data several years ago. Just like
+everyone else, I had to download it from somewhere. In the end, I settled on LFW
+(http://vis-www.cs.umass.edu/lfw/). Technically, it is a dataset for face
+recognition, so there are fewer variations. But that’s the largest dataset I could
+find to download. I downloaded the aligned data, cropped it with random rotation,
+translation and scale variations, and got 13125 faces at 24x24 size.</p>
+
+<p>bbfcreate also requires negative images. As it happens, I have about 8000
+natural scene images containing no faces, downloaded from Flickr. OK, now I
+have all the data, what’s next?</p>
+
+<p>First, you need to create a directory called data/ in the same directory as
+bbfcreate. Then, you need to create two file lists, one of positive data and one of
+negative images. For me, it is:</p>
+
+<pre><code>find ../data/faces/*.bmp &gt; faces.dat
+find ../data/negs/*.jpg &gt; negs.dat
+</code></pre>
+
+<p>That’s all! Just find a computer powerful enough and run the following line for several
+days:</p>
+
+<pre><code>./bbfcreate --positive-list faces.dat --background-list negs.dat --negative-count 26250 --working-dir data
+</code></pre>
+
+<p>The --negative-count parameter denotes how many negative samples are extracted for each round.
+Experimentally, it should be about twice the number of your positive ones.</p>
+
+<p>If you configure the makefile well, bbfcreate will use OpenMP to speed up, which will
+eat up all the CPUs. My own training process ran for about one week on an extremely
+powerful desktop PC; you should expect weeks for the result on a modest PC with so many
+samples.</p>
+
+<p>You can stop bbfcreate at any time; the most recent result will be saved
+in the data/ directory. Clean up the directory to restart.</p>
+
+<p>I will probably implement MPI support in the near future so that you can run this on
+many computers in parallel, but who nowadays has an OpenMPI setup besides supercomputing
+centers?</p>
+
+<h3><a href="/">&lsaquo;&nbsp;&nbsp;back&nbsp;</a></h3>
+<div id="disqus_thread"></div>
+<script type="text/javascript">
+	var disqus_shortname = 'libccv';
+	(function() {
+		var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+		dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+		(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+	})();
+</script>
+<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
+
+</section>
+<footer>
+<p>This project is maintained by <a href="https://liuliu.me/">liuliu</a></p>
+<p><small>Theme originated from <a href="https://github.com/orderedlist">orderedlist</a></small></p>
+</footer>
+</div>
+<!--[if !IE]><script>fixScale(document);</script><!--<![endif]-->
+</body>
+</html>
diff --git a/site/0.6/doc/doc-cache/index.html b/site/0.6/doc/doc-cache/index.html
@@ -0,0 +1,94 @@
+<!doctype html>
+<html><head><meta charset="utf-8">
+<title>Cache: We are Terrible Magicians</title>
+<link rel="stylesheet" href="/stylesheets/styles.css">
+<link rel="stylesheet" href="/stylesheets/coderay.css">
+<script src="/javascripts/scale.fix.js"></script>
+<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
+<meta http-equiv="X-UA-Compatible" content="chrome=1">
+<!--[if lt IE 9]>
+<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
+<![endif]-->
+<script type="text/javascript">
+var _gaq = _gaq || [];
+_gaq.push(['_setAccount', 'UA-303081-6']);
+_gaq.push(['_trackPageview']);
+(function() {
+	var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+	ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+	var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+})();
+</script>
+</head><body><div class="wrapper">
+<header><h1><a href="/">ccv</a></h1>
+<p>A Modern Computer Vision Library</p>
+<p class="view"><a href="https://github.com/liuliu/ccv">View the Project on GitHub <small>liuliu/ccv</small></a></p>
+<ul>
+<li><a href="https://github.com/liuliu/ccv/zipball/stable">Download <strong>ZIP File</strong></a></li>
+<li><a href="https://github.com/liuliu/ccv/tarball/stable">Download <strong>TAR Ball</strong></a></li>
+<li><a href="https://github.com/liuliu/ccv">Fork On <strong>GitHub</strong></a></li>
+</ul>
+</header>
+<section><h1>Cache: We are Terrible Magicians</h1>
+<p>ccv uses an application-wide transparent cache to de-duplicate matrix computations.
+In the following chapters, I will try to outline how that works, and expose you
+to the inner workings of ccv’s core functionality.</p>
+
+<h2 id="initial-signature">Initial Signature</h2>
+
+<p><strong>ccv_make_matrix_immutable</strong> computes the SHA-1 hash of the matrix raw data, and
+uses the first 64 bits as the signature for that matrix.</p>
+
+<h2 id="derived-signature">Derived Signature</h2>
+
+<p>A derived signature is computed from the specific operation that is going to be performed.
+For example, if matrix A and matrix B are used to generate matrix C through operation X,
+C’s signature is derived from A, B and X.</p>
+
+<h2 id="a-radix-tree-lru-cache">A Radix-tree LRU Cache</h2>
+
+<p>ccv uses a custom radix-tree implementation with generation information. It imposes
+a hard limit on memory usage of 64 MiB; you can adjust this value if you like.
+The custom radix-tree data structure is specifically designed to fit our 64-bit
+signature design. If compiled with jemalloc, it can be both fast and memory-efficient.</p>
+
+<h2 id="garbage-collection">Garbage Collection</h2>
+
+<p>The matrix signature is important. Every matrix that is freed with the <strong>ccv_matrix_free</strong>
+directive first has its signature checked. If it is a derived signature,
+<strong>ccv_matrix_free</strong> won’t free that matrix to the OS immediately; instead, it will put
+that matrix back into the application-wide cache. Sparse matrices and matrices without a
+signature or with an initial signature will be freed immediately.</p>
+
+<h2 id="shortcut">Shortcut</h2>
+
+<p>For operation X performed with matrix A and B, it will first generate the derived
+signature. The signature will be searched for in the application-wide cache in the hope
+of finding a result matrix. If such a matrix C is found, operation X will take
+a shortcut and return that matrix to the user. Otherwise, it will allocate such a matrix,
+set the proper signature on it and perform the operation honestly.</p>
+
+<p>After finishing this, I found that it may not be the most interesting bit of ccv.
+But still, I hope you found it otherwise :-)</p>
+
+<h3><a href="/">&lsaquo;&nbsp;&nbsp;back&nbsp;</a></h3>
+<div id="disqus_thread"></div>
+<script type="text/javascript">
+	var disqus_shortname = 'libccv';
+	(function() {
+		var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+		dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+		(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+	})();
+</script>
+<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
+
+</section>
+<footer>
+<p>This project is maintained by <a href="https://liuliu.me/">liuliu</a></p>
+<p><small>Theme originated from <a href="https://github.com/orderedlist">orderedlist</a></small></p>
+</footer>
+</div>
+<!--[if !IE]><script>fixScale(document);</script><!--<![endif]-->
+</body>
+</html>
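
The cache flow described in doc-cache/index.html (initial signature, derived signature, shortcut) can be illustrated with a small C sketch. This is not ccv's code and uses none of its API: every name here (`initial_signature`, `derived_signature`, `toy_cache_t`, `cached_dot`) is hypothetical, 64-bit FNV-1a stands in for the truncated SHA-1 hash, a fixed-size direct-mapped table stands in for the radix-tree LRU cache, and "operation X" is assumed to be a plain dot product of two int arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the initial signature: 64-bit FNV-1a over the raw data.
 * (The doc describes SHA-1 truncated to 64 bits; FNV-1a is an assumption
 * made here only to keep the sketch dependency-free.) */
static uint64_t initial_signature(const void *data, size_t size)
{
	const unsigned char *p = data;
	uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
	for (size_t i = 0; i < size; i++) {
		h ^= p[i];
		h *= 1099511628211ULL; /* FNV prime */
	}
	return h;
}

/* Derived signature of C = X(A, B): mixes the operation id with the
 * signatures of both operands. */
static uint64_t derived_signature(uint64_t op, uint64_t sig_a, uint64_t sig_b)
{
	uint64_t parts[3] = { op, sig_a, sig_b };
	return initial_signature(parts, sizeof(parts));
}

#define CACHE_SLOTS 64
#define OP_DOT 1 /* hypothetical id for "operation X" */

/* Toy direct-mapped cache; a colliding entry is simply overwritten. */
typedef struct {
	uint64_t sig[CACHE_SLOTS];
	long value[CACHE_SLOTS];
	int used[CACHE_SLOTS];
	int computed; /* how many times the operation actually ran */
} toy_cache_t;

/* Operation X with the shortcut: look up the derived signature first,
 * and only compute (and store) the result on a miss. */
static long cached_dot(toy_cache_t *cache, const int *a, const int *b, size_t n)
{
	assert(cache != NULL);
	uint64_t sig = derived_signature(OP_DOT,
		initial_signature(a, n * sizeof(int)),
		initial_signature(b, n * sizeof(int)));
	size_t slot = (size_t)(sig % CACHE_SLOTS);
	if (cache->used[slot] && cache->sig[slot] == sig)
		return cache->value[slot]; /* shortcut: reuse the cached result */
	long c = 0;
	for (size_t i = 0; i < n; i++)
		c += (long)a[i] * b[i];
	cache->computed++;
	cache->sig[slot] = sig;
	cache->value[slot] = c;
	cache->used[slot] = 1;
	return c;
}
```

Calling `cached_dot` twice with the same operands on a zero-initialized `toy_cache_t` should run the loop only once; the second call returns the stored value through the signature lookup. A real implementation would also need the eviction and memory-limit behavior the doc attributes to the radix-tree LRU cache, which this sketch leaves out.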