updated and prepare for release
liuliu committed Dec 16, 2014
1 parent f3e7590 commit 95f4860
Showing 48 changed files with 4,900 additions and 67 deletions.
34 changes: 21 additions & 13 deletions README.md
@@ -10,17 +10,18 @@ Build Status
Backstory
---------

-I set to build ccv with minimalism inspiration. That was back in 2010, out of
-frustration with the computer vision library then I was using, ccv was meant
-to be a much easier to deploy, simpler organized code with a bit caution of
-dependency hygiene. The simplicity and minimalistic nature at then, made it
-much easier to integrated into any server-side deployment environments.
+I set to build ccv with a minimalism inspiration. That was back in 2010, out
+of the frustration with the computer vision library then I was using, ccv
+was meant to be a much easier to deploy, simpler organized code with a bit
+caution with dependency hygiene. The simplicity and minimalistic nature at
+then, made it much easier to integrate into any server-side deployment
+environments.

Portability and Embeddable
--------------------------

Fast forward to now, the world is quite different from then, but ccv adapts
-pretty well in the new, mobile-first environment. It now runs on Mac OSX,
+pretty well in this new, mobile-first environment. It now runs on Mac OSX,
Linux, FreeBSD, Windows\*, iPhone, iPad, Android, Raspberry Pi. In fact,
anything that has a proper C compiler probably can run ccv. The majority
(with notable exception of convolutional networks, which requires a BLAS
@@ -29,15 +30,15 @@ library) of ccv will just work with no compilation flags or dependencies.
Modern Computer Vision Algorithms
---------------------------------

-One core concept of ccv development is "application driven". As a result, ccv
-end up implementing a handful state-of-art algorithms. It includes
-a close to state-of-the-art image classifier, a state-of-the-art frontal face
-detector, reasonable collection of object detectors for pedestrians and cars.
-a useful text detection algorithm, a long term general object tracking algorithm,
+One core concept of ccv development is *application driven*. Thus, ccv ends
+up implementing a handful state-of-art algorithms. It includes a close to
+state-of-the-art image classifier, a state-of-the-art frontal face detector,
+reasonable collection of object detectors for pedestrians and cars, a useful
+text detection algorithm, a long-term general object tracking algorithm,
and the long-standing feature point extraction algorithm.

-Cached Image Preprocessing
---------------------------
+Clean Interface with Cached Image Preprocessing
+-----------------------------------------------

Many computer vision tasks nowadays consist of quite a few preprocessing
layers: image pyramid generation, color space conversion etc. These potentially
@@ -50,3 +51,10 @@ implementation is what it lacks of. After years, we stuck in between either the
high-performance, battle-tested but old algorithm implementations, or the new,
shining but Matlab algorithms. ccv is my take on this problem, hope you enjoy
it.
+
+License
+-------
+
+ccv source code is distributed under BSD 3-clause License.
+
+ccv's data models and documentations are distributed under Creative Commons Attribution 4.0 International License.
4 changes: 2 additions & 2 deletions bin/image-net.c
@@ -136,10 +136,10 @@ int main(int argc, char** argv)
for (i = 0; i < 21; i++)
{
layer_params[i].w.decay = 0.0005;
-layer_params[i].w.learn_rate = 0.001;
+layer_params[i].w.learn_rate = 0.0001;
layer_params[i].w.momentum = 0.9;
layer_params[i].bias.decay = 0;
-layer_params[i].bias.learn_rate = 0.001;
+layer_params[i].bias.learn_rate = 0.0001;
layer_params[i].bias.momentum = 0.9;
}
layer_params[18].dor = 0.5;
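
For readers unfamiliar with the three fields touched in this hunk, here is a minimal sketch of what `decay`, `learn_rate` and `momentum` conventionally control in momentum SGD. The struct, the function name and the exact update rule are assumptions made for illustration; ccv's own implementation may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hyperparameters named after the fields set in bin/image-net.c.
 * The struct itself is hypothetical, not a ccv type. */
typedef struct {
	double decay;      /* L2 weight decay */
	double learn_rate; /* step size */
	double momentum;   /* fraction of the previous update carried over */
} sgd_params_t;

/* One conventional momentum SGD step over n parameters
 * (an assumed update rule, not necessarily the one ccv uses):
 *   v <- momentum * v - learn_rate * (grad + decay * w)
 *   w <- w + v */
void sgd_momentum_step(double* w, double* v, const double* grad, size_t n, sgd_params_t params)
{
	size_t i;
	assert(w && v && grad);
	for (i = 0; i < n; i++)
	{
		v[i] = params.momentum * v[i] - params.learn_rate * (grad[i] + params.decay * w[i]);
		w[i] += v[i];
	}
}
```

Under this rule, lowering `learn_rate` from 0.001 to 0.0001 makes the gradient term of every update ten times smaller, while the momentum term is unchanged.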
16 changes: 8 additions & 8 deletions doc/convnet.md
@@ -157,11 +157,11 @@ is easy, for me on Ubuntu, it is about one line:
Assuming you've downloaded devkit-1.0 from the above link, and found meta.mat file somewhere in that
tarball, launching Octave interactive environment and run:

-file = fopen('meta.txt', 'w+')
-for i = 1:1000
-fprintf(file, "%d %s %d\n", synsets(i).ILSVRC2010_ID, synsets(i).WNID, synsets(i).num_train_images)
-endfor
-fclose(file)
+octave> file = fopen('meta.txt', 'w+')
+octave> for i = 1:1000
+octave> fprintf(file, "%d %s %d\n", synsets(i).ILSVRC2010_ID, synsets(i).WNID, synsets(i).num_train_images)
+octave> endfor
+octave> fclose(file)

The newly created meta.txt file will give us the class id, the WordNet id, and the number of training
image available for each class.
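
To show how the `"%d %s %d\n"` rows written by the Octave loop can be consumed again, here is a minimal C sketch that reads meta.txt back. The type `meta_entry_t` and the functions `meta_parse_line` and `meta_read` are hypothetical names invented for this example, not part of ccv.

```c
#include <assert.h>
#include <stdio.h>

/* One row of meta.txt, in the order the Octave loop writes it. */
typedef struct {
	int class_id;         /* ILSVRC2010_ID */
	char wnid[16];        /* WordNet id; 15 characters is ample for ids such as n01440764 */
	int num_train_images; /* number of training images for this class */
} meta_entry_t;

/* Parses one "%d %s %d" line; returns 1 on success, 0 if the line is malformed. */
int meta_parse_line(const char* line, meta_entry_t* entry)
{
	assert(line && entry);
	/* %15s bounds the copy so an overlong id cannot overflow wnid. */
	return sscanf(line, "%d %15s %d", &entry->class_id, entry->wnid, &entry->num_train_images) == 3;
}

/* Reads up to max_entries rows from path, skipping malformed lines.
 * Returns the number of rows read, or -1 if the file cannot be opened. */
int meta_read(const char* path, meta_entry_t* entries, int max_entries)
{
	char line[256];
	int n = 0;
	FILE* f = fopen(path, "r");
	if (!f)
		return -1;
	while (n < max_entries && fgets(line, sizeof(line), f))
		if (meta_parse_line(line, &entries[n]))
			n++;
	fclose(f);
	return n;
}
```

Calling `meta_read("meta.txt", entries, 1000)` with an array of 1000 entries would load every class the loop above writes.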
@@ -201,9 +201,9 @@ The generated image-net.sqlite3 file is about 600MiB in size because it contains
and resume. You can either open this file with sqlite command-line tool (it is a vanilla sqlite database
file), and do:

-> drop table function_state;
-> drop table momentum_data;
-> vacuum;
+sqlite> drop table function_state;
+sqlite> drop table momentum_data;
+sqlite> vacuum;

The file size will shrink to about 200MiB. You can achieve further reduction in file size by rewrite it into
half-precision, with ccv_convnet_write and write_param.half_precision = 1. The resulted image-net.sqlite3
198 changes: 198 additions & 0 deletions site/0.6/doc/doc-bbf/index.html
@@ -0,0 +1,198 @@
<!doctype html>
<html><head><meta charset="utf-8">
<title>BBF: Brightness Binary Feature</title>
<link rel="stylesheet" href="/stylesheets/styles.css">
<link rel="stylesheet" href="/stylesheets/coderay.css">
<script src="/javascripts/scale.fix.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
<meta http-equiv="X-UA-Compatible" content="chrome=1">
<!--[if lt IE 9]>
<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-303081-6']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head><body><div class="wrapper">
<header><h1><a href="/">ccv</a></h1>
<p>A Modern Computer Vision Library</p>
<p class="view"><a href="https://github.com/liuliu/ccv">View the Project on GitHub <small>liuliu/ccv</small></a></p>
<ul>
<li><a href="https://github.com/liuliu/ccv/zipball/stable">Download <strong>ZIP File</strong></a></li>
<li><a href="https://github.com/liuliu/ccv/tarball/stable">Download <strong>TAR Ball</strong></a></li>
<li><a href="https://github.com/liuliu/ccv">Fork On <strong>GitHub</strong></a></li>
</ul>
</header>
<section><h1>BBF: Brightness Binary Feature</h1>
<p><a href="/lib/ccv-bbf/">Library Reference: ccv_bbf.c</a></p>

<h2 id="whats-bbf">What’s BBF?</h2>

<p>The original paper is:
YEF∗ Real-Time Object Detection, Yotam Abramson and Bruno Steux</p>

<p>The improved version is described in:
High-Performance Rotation Invariant Multiview Face Detection, Chang Huang, Haizhou Ai, Yuan Li and Shihong Lao</p>

<h2 id="how-it-works">How does it work?</h2>

<p>That’s a long story; please read the papers. But at least I can show you how to
use the magic:</p>

<pre><code>./bbfdetect &lt;Your Image contains Faces&gt; ../samples/face | ./bbfdraw.rb &lt;Your Image contains Faces&gt; output.png
</code></pre>

<p>Check out output.png and you will get the idea.</p>

<h2 id="what-about-the-performance">What about the performance?</h2>

<p>The tests are performed with the MIT+CMU face detection dataset
(http://vasc.ri.cmu.edu/idb/html/face/frontal_images/index.html)</p>

<p><strong>Setup</strong>:</p>

<p>Download the tarball and copy the files in newtest/, test/ and test-low/ into a single
folder, say all/. Since ccv doesn’t support the gif format, you need to convert the
files yourself. If you have ImageMagick, this is easy:</p>

<pre><code>for i in *.gif; do convert "$i" "$(basename "$i" .gif).png"; done
</code></pre>

<p>For the ground truth data, you can copy it from
http://vasc.ri.cmu.edu/idb/images/face/frontal_images/list.html (only Test Sets A,
B and C are needed).</p>

<p>bbfdetect needs a list of files. You can generate it by running this command in the
same directory as the bbfdetect binary:</p>

<pre><code>find &lt;the directory of converted files&gt;/*.png &gt; filelist.txt
</code></pre>

<p><strong>Speed-wise</strong>:</p>

<p>Run:</p>

<pre><code>time ./bbfdetect filelist.txt ../samples/face &gt; result.txt
</code></pre>

<p>On my computer, it reports:</p>

<pre><code>real 0m9.304s
user 0m9.270s
sys 0m0.010s
</code></pre>

<p>How about OpenCV’s face detector? I ran OpenCV with default settings on the same
computer, and it reports:</p>

<pre><code>real 0m27.977s
user 0m27.860s
sys 0m0.050s
</code></pre>

<p>You see the difference.</p>

<p><strong>Accuracy-wise</strong>:</p>

<p>I wrote a little script called bbfvldr.rb that checks the output of bbfdetect
against the ground truth. Before running the script, you need to do some
house-cleaning on result.txt:</p>

<p>result.txt contains the full path to each file, but only the filename is
needed. Use your favorite editor to remove the directory information; for me,
it is:</p>

<pre><code>sed -i "s/\.\.\/test\/faces\///g" result.txt
</code></pre>

<p>Suppose you have copied the ground truth to truth.txt file, run the validator:</p>

<pre><code>./bbfvldr.rb truth.txt result.txt
</code></pre>

<p>My result for bbfdetect is:</p>

<pre><code>82.97% (12)
</code></pre>

<p>The former is the detection rate (how many faces are detected); the latter is
the number of false alarms (how many non-face regions are detected as faces).</p>

<p>The result for OpenCV default face detector is:</p>

<pre><code>86.69% (15)
</code></pre>

<p>Well, we are a little behind, but you can train the detector yourself, just get
a better data source!</p>

<h2 id="how-to-train-my-own-detector">How to train my own detector?</h2>

<p>In this chapter, I will go over how I trained the face detector myself. To be
honest, I lost my face detector training data several years ago. Just like
everyone else, I had to download it from somewhere. In the end, I settled on LFW
(http://vis-www.cs.umass.edu/lfw/). Technically, it is a dataset for face
recognition, so there are fewer variations. But that’s the largest dataset I could
find to download. I downloaded the aligned data, cropped it with random rotation,
translation and scale variations, and got 13125 faces in 24x24 size.</p>

<p>bbfcreate also requires negative images. As it happens, I have about 8000
natural scene images that contain no faces, downloaded from Flickr. OK, now I
have all the data, what’s next?</p>

<p>First, you need to create a directory called data/ in the same directory as
bbfcreate. Then, you need to create two file lists, one of positive data and one of
negative images. For me, it is:</p>

<pre><code>find ../data/faces/*.bmp &gt; faces.dat
find ../data/negs/*.jpg &gt; negs.dat
</code></pre>

<p>That’s all! Just find a computer powerful enough and run the following line for several
days:</p>

<pre><code>./bbfcreate --positive-list faces.dat --background-list negs.dat --negative-count 26250 --working-dir data
</code></pre>

<p>The --negative-count parameter sets how many negative samples are extracted in each round.
Experimentally, it should be about twice the number of your positive samples.</p>

<p>If you configure the makefile well, bbfcreate will use OpenMP to speed things up, which will
eat up all the CPUs. My own training process ran for about one week on an extremely
powerful desktop PC; you should expect it to take weeks on a modest PC with this many
samples.</p>

<p>You can stop bbfcreate at any time you want; the most recent result is saved
in the data/ directory. Clean up that directory to restart from scratch.</p>

<p>I will probably implement MPI support in the near future so that you can run this on
many computers in parallel, but who nowadays has an OpenMPI setup besides supercomputing
centers?</p>

<h3><a href="/">&lsaquo;&nbsp;&nbsp;back&nbsp;</a></h3>
<div id="disqus_thread"></div>
<script type="text/javascript">
var disqus_shortname = 'libccv';
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>

</section>
<footer>
<p>This project is maintained by <a href="https://liuliu.me/">liuliu</a></p>
<p><small>Theme originated from <a href="https://github.com/orderedlist">orderedlist</a></small></p>
</footer>
</div>
<!--[if !IE]><script>fixScale(document);</script><!--<![endif]-->
</body>
</html>
94 changes: 94 additions & 0 deletions site/0.6/doc/doc-cache/index.html
@@ -0,0 +1,94 @@
<!doctype html>
<html><head><meta charset="utf-8">
<title>Cache: We are Terrible Magicians</title>
<link rel="stylesheet" href="/stylesheets/styles.css">
<link rel="stylesheet" href="/stylesheets/coderay.css">
<script src="/javascripts/scale.fix.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
<meta http-equiv="X-UA-Compatible" content="chrome=1">
<!--[if lt IE 9]>
<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-303081-6']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head><body><div class="wrapper">
<header><h1><a href="/">ccv</a></h1>
<p>A Modern Computer Vision Library</p>
<p class="view"><a href="https://github.com/liuliu/ccv">View the Project on GitHub <small>liuliu/ccv</small></a></p>
<ul>
<li><a href="https://github.com/liuliu/ccv/zipball/stable">Download <strong>ZIP File</strong></a></li>
<li><a href="https://github.com/liuliu/ccv/tarball/stable">Download <strong>TAR Ball</strong></a></li>
<li><a href="https://github.com/liuliu/ccv">Fork On <strong>GitHub</strong></a></li>
</ul>
</header>
<section><h1>Cache: We are Terrible Magicians</h1>
<p>ccv uses an application-wide transparent cache to de-duplicate matrix computations.
In the following chapters, I will try to outline how that works, and expose you
to the inner workings of ccv’s core functionality.</p>

<h2 id="initial-signature">Initial Signature</h2>

<p><strong>ccv_make_matrix_immutable</strong> computes the SHA-1 hash of the matrix’s raw data and
uses the first 64 bits as the signature for that matrix.</p>

<h2 id="derived-signature">Derived Signature</h2>

<p>A derived signature is computed from the specific operation that is about to be performed.
For example, if matrix A and matrix B are used to generate matrix C through operation X,
C’s signature is derived from A, B and X.</p>

<h2 id="a-radix-tree-lru-cache">A Radix-tree LRU Cache</h2>

<p>ccv uses a custom radix-tree implementation with generation information. It imposes
a hard limit on memory usage of 64 MiB; you can adjust this value if you like.
The custom radix-tree data structure is specifically designed for our 64-bit
signatures. If compiled with jemalloc, it can be both fast and memory-efficient.</p>

<h2 id="garbage-collection">Garbage Collection</h2>

<p>The matrix signature is important. When a matrix is freed with <strong>ccv_matrix_free</strong>,
its signature is checked first. If it is a derived signature,
<strong>ccv_matrix_free</strong> won’t return that matrix to the OS immediately; instead, it puts
the matrix back into the application-wide cache. Sparse matrices and matrices with no
signature or only an initial signature are freed immediately.</p>

<h2 id="shortcut">Shortcut</h2>

<p>When operation X is performed on matrices A and B, it first generates the derived
signature. That signature is looked up in the application-wide cache in the hope
of finding a result matrix. If such a matrix C is found, operation X takes
a shortcut and returns that matrix to the user. Otherwise, it allocates a new matrix,
sets the proper signature on it and performs the operation honestly.</p>

<p>After finishing this, I found that it may not be the most interesting bit of ccv.
But still, I hope you found it otherwise :-)</p>

<h3><a href="/">&lsaquo;&nbsp;&nbsp;back&nbsp;</a></h3>
<div id="disqus_thread"></div>
<script type="text/javascript">
var disqus_shortname = 'libccv';
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>

</section>
<footer>
<p>This project is maintained by <a href="https://liuliu.me/">liuliu</a></p>
<p><small>Theme originated from <a href="https://github.com/orderedlist">orderedlist</a></small></p>
</footer>
</div>
<!--[if !IE]><script>fixScale(document);</script><!--<![endif]-->
</body>
</html>
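
The signature and shortcut flow described in the doc-cache page above can be sketched in a few lines of C. This is only an illustration under stated assumptions: FNV-1a stands in for the SHA-1 prefix that ccv uses, a small direct-mapped table stands in for the radix-tree LRU cache, a single `double` stands in for a matrix, and every `sketch_` name is hypothetical rather than part of ccv's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the initial signature: 64-bit FNV-1a over the raw bytes.
 * ccv uses the first 64 bits of a SHA-1 hash; FNV-1a only keeps the sketch short. */
uint64_t sketch_initial_signature(const void* data, size_t size)
{
	const unsigned char* bytes = (const unsigned char*)data;
	uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
	size_t i;
	for (i = 0; i < size; i++)
	{
		h ^= bytes[i];
		h *= 0x100000001b3ULL; /* FNV prime */
	}
	return h;
}

/* Hypothetical derivation: fold the operation id and both input signatures
 * into one 64-bit value. 0 is reserved to mean "no signature". */
uint64_t sketch_derived_signature(uint64_t op, uint64_t sig_a, uint64_t sig_b)
{
	uint64_t parts[3];
	uint64_t sig;
	if (sig_a == 0 || sig_b == 0)
		return 0; /* assumption: an unsigned input makes the result uncacheable */
	parts[0] = op;
	parts[1] = sig_a;
	parts[2] = sig_b;
	sig = sketch_initial_signature(parts, sizeof(parts));
	return sig ? sig : 1;
}

#define SKETCH_SLOTS 64

typedef struct {
	uint64_t sig[SKETCH_SLOTS]; /* 0 marks an empty slot */
	double value[SKETCH_SLOTS]; /* stands in for a cached result matrix */
} sketch_cache_t;

/* Returns 1 and writes *out on a hit, 0 on a miss. */
int sketch_cache_get(const sketch_cache_t* cache, uint64_t sig, double* out)
{
	size_t slot = (size_t)(sig % SKETCH_SLOTS);
	if (sig == 0 || cache->sig[slot] != sig)
		return 0;
	*out = cache->value[slot];
	return 1;
}

/* Stores a result, overwriting whatever occupied the slot
 * (the real cache evicts by LRU under a memory limit instead). */
void sketch_cache_put(sketch_cache_t* cache, uint64_t sig, double value)
{
	size_t slot;
	if (sig == 0)
		return;
	slot = (size_t)(sig % SKETCH_SLOTS);
	cache->sig[slot] = sig;
	cache->value[slot] = value;
}

/* "Operation X" is addition here. *computed counts how often the work was really done. */
double sketch_add(sketch_cache_t* cache, double a, uint64_t sig_a, double b, uint64_t sig_b, int* computed)
{
	const uint64_t op_add = 1; /* hypothetical operation id */
	uint64_t sig;
	double c;
	assert(cache && computed);
	sig = sketch_derived_signature(op_add, sig_a, sig_b);
	if (sketch_cache_get(cache, sig, &c))
		return c; /* shortcut: reuse the cached result */
	c = a + b; /* perform the operation honestly */
	(*computed)++;
	sketch_cache_put(cache, sig, c);
	return c;
}
```

Calling `sketch_add` twice with the same signed inputs performs the addition once and serves the second call from the cache. Passing 0 as either signature always recomputes, mirroring the case of a matrix without a signature.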