Enjoy binary neural networks on mobile!
[English] [Chinese/中文]
Join chat at Gitter (English) or QQ Group (Chinese, 1021964010, answer: nndab)
Our ACM MM paper: https://arxiv.org/abs/1908.05858
Binary neural networks (BNNs) have great potential on edge devices since they replace float operations with efficient bit-wise operations. However, to leverage the efficiency of bit-wise operations, the convolution layer and other layers must be reimplemented.
To our best knowledge, dabnn is the first highly optimized binary neural network inference framework for mobile platforms. We implemented binary convolutions in ARM assembly. On Google Pixel 1, dabnn is 800%~2400% faster than BMXNet (to our best knowledge, the only other open-source BNN inference framework) on a single binary convolution, and about 700% faster on binarized ResNet-18.
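To illustrate why bit-wise operations are cheap, here is a minimal C++ sketch of a binary dot product computed with XOR and popcount. It is not dabnn's ARM assembly kernel; the function names `pack_signs` and `binary_dot` and the bit encoding (bit 1 for +1, bit 0 for -1) are assumptions made for this example only.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack a vector of +1/-1 values into 64-bit words (hypothetical helper).
// Encoding assumed here: bit = 1 means +1, bit = 0 means -1.
// Unused bits in the last word stay 0 in both operands, so they cancel under XOR.
std::vector<std::uint64_t> pack_signs(const std::vector<int>& v) {
    std::vector<std::uint64_t> words((v.size() + 63) / 64, 0);
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] > 0) words[i / 64] |= (std::uint64_t{1} << (i % 64));
    }
    return words;
}

// Dot product of two packed +1/-1 vectors of logical length n.
// Matching bits contribute +1 and differing bits contribute -1, so
// dot = n - 2 * popcount(a XOR b).
int binary_dot(const std::vector<std::uint64_t>& a,
               const std::vector<std::uint64_t>& b, std::size_t n) {
    std::size_t differing = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        differing += std::bitset<64>(a[i] ^ b[i]).count();
    }
    return static_cast<int>(n) - 2 * static_cast<int>(differing);
}
```

One 64-bit XOR plus one popcount here stands in for 64 float multiply-adds, which is the kind of saving a binary convolution kernel builds on.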
Benchmark result on Google Pixel 1 (single thread):
2019-05-06 10:36:48
Running data/local/tmp/dabnn_benchmark
Run on (4 X 1593.6 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------
dabnn_5x5_256 3661928 ns 3638192 ns 191 <--- input: 14*14*256, kernel: 256*5*5*256, output: 14*14*256, padding: 2
dabnn_3x3_64 1306391 ns 1281553 ns 546 <--- input: 56*56*64, kernel: 64*3*3*64, output: 56*56*64, padding: 1
dabnn_3x3_128 958388 ns 954754 ns 735 <--- input: 28*28*128, kernel: 128*3*3*128, output: 28*28*128, padding: 1
dabnn_3x3_256 975123 ns 969810 ns 691 <--- input: 14*14*256, kernel: 256*3*3*256, output: 14*14*256, padding: 1
dabnn_3x3_256_s2 268310 ns 267712 ns 2618 <--- input: 14*14*256, kernel: 256*3*3*256, output: 7*7*256, padding: 1, stride: 2
dabnn_3x3_512 1281832 ns 1253921 ns 588 <--- input: 7* 7*512, kernel: 512*3*3*512, output: 7* 7*512, padding: 1
dabnn_bireal18_imagenet 61920154 ns 61339185 ns 10 <--- Bi-Real Net 18, 56.4% top-1 on ImageNet
dabnn_bireal18_imagenet_stem 43294019 ns 41401923 ns 14 <--- Bi-Real Net 18 with stem module (The network structure is described in detail in [our paper](https://arxiv.org/abs/1908.05858)), 56.4% top-1 on ImageNet
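The input and output sizes annotated in the benchmark follow the standard convolution output-size formula. The sketch below checks a few of the listed configurations; `conv_out_size` is a hypothetical helper written for this example and is not part of dabnn.

```cpp
// Output size of a convolution along one spatial dimension.
// Integer division floors here because all operands are non-negative.
constexpr int conv_out_size(int in, int kernel, int padding, int stride = 1) {
    return (in + 2 * padding - kernel) / stride + 1;
}

// Checks against the shapes annotated in the benchmark output.
static_assert(conv_out_size(14, 5, 2) == 14, "dabnn_5x5_256");
static_assert(conv_out_size(56, 3, 1) == 56, "dabnn_3x3_64");
static_assert(conv_out_size(14, 3, 1, 2) == 7, "dabnn_3x3_256_s2");
static_assert(conv_out_size(7, 3, 1) == 7, "dabnn_3x3_512");
```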
The following is a comparison between dabnn and Caffe (full precision), TensorFlow Lite (full precision) and BMXNet (binary). Surprisingly, we observe that BMXNet is even slower than the full precision TensorFlow Lite. This suggests that the potential of binary neural networks was far from fully exploited before dabnn was published.
We provide pre-built onnx2bnn binaries and a dabnn Android package. However, you need to build dabnn yourself if you want to deploy BNNs on non-Android ARM devices.
We use the CMake build system, like most C++ projects. Check out docs/build.md for detailed instructions.
We provide a conversion tool, named onnx2bnn, to convert an ONNX model to a dabnn model. Pre-built onnx2bnn binaries for all platforms are available in GitHub Releases. For Linux users, the pre-built onnx2bnn binary is in AppImage format; see https://appimage.org for details.
Note: Binary convolution is a custom operator, so whether an ONNX model is dabnn-compatible depends heavily on how binary convolution is implemented in the training code. Please check out our wiki for further information.
After conversion, the generated dabnn model can be deployed on ARM devices (e.g., mobile phones and embedded devices). For Android developers, we provide an Android AAR package published on jcenter; for usage, please check out the example project.
We publish two pretrained binary neural network models based on Bi-Real Net on ImageNet. More pretrained models will be published in the future.
- Bi-Real Net 18, 56.4% top-1 on ImageNet, 61.3ms/image on Google Pixel 1 (single thread). [dabnn] [ONNX]
- Bi-Real Net 18 with Stem Module, 56.4% top-1 on ImageNet, 43.2ms/image on Google Pixel 1 (single thread). The detailed network structure is described in our paper. [dabnn] [ONNX]
- The Implementation of Binary Convolutions: docs/bconv.md
- Model Conversion: docs/onnx2bnn.md
For more details please read our ACM MM paper.
Android app demo: https://github.com/JDAI-CV/dabnn-example
The following two papers use dabnn to measure the latency of their binary networks on real devices:
Please cite daBNN in your publications if it helps your research:
@misc{zhang2019dabnn,
Author = {Jianhao Zhang and Yingwei Pan and Ting Yao and He Zhao and Tao Mei},
Title = {daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices},
Year = {2019},
Eprint = {arXiv:1908.05858},
}