Enjoy binary neural networks on mobile!
Join chat at Gitter (English) or QQ Group (Chinese, 1021964010, answer: nndab)
Binary neural networks (BNNs) have great potential on edge devices because they replace floating-point operations with efficient bit-wise operations. However, to exploit that efficiency, the convolution layer and several other layers must be reimplemented.
To the best of our knowledge, dabnn is the first highly optimized binary neural network inference framework for mobile platforms. We implemented binary convolutions in armv8 assembly. On Google Pixel 1, dabnn is 800%~2400% faster than BMXNet (to the best of our knowledge, the only other open-source BNN inference framework) on a single binary convolution, and about 700% faster on binarized ResNet-18.
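To illustrate why bit-wise operations can stand in for float arithmetic, here is a minimal Python sketch of the commonly used XNOR + popcount formulation of a binary dot product. It is not dabnn's armv8 kernel; the helper names `pack_signs`, `binary_dot` and `float_dot` are hypothetical and exist only for this example.

```python
import random


def pack_signs(values):
    """Pack a list of +1/-1 values into an int: bit i is 1 when values[i] is +1."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n             # agreements minus disagreements


def float_dot(a, b):
    """Reference dot product using ordinary multiply-accumulate."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
n = 3 * 3 * 64  # one output value of a 3x3 convolution over 64 channels
a = [random.choice((-1, 1)) for _ in range(n)]
w = [random.choice((-1, 1)) for _ in range(n)]

# The bit-wise result matches the multiply-accumulate result exactly.
assert binary_dot(pack_signs(a), pack_signs(w), n) == float_dot(a, w)
```

A single XNOR and popcount covers as many multiply-accumulates as there are bits in the machine word, which is where the potential speedup comes from. An optimized implementation would operate on fixed-width registers rather than Python integers.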
Benchmark result on Google Pixel 1 (single thread):
2019-05-02 18:00:29
Running data/local/tmp/dabnn_benchmark
Run on (4 X 1593.6 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------
dabnn_5x5_256 3658193 ns 3636875 ns 192 <--- input: 14*14*256, kernel: 256*5*5*256, output: 14*14*256, padding: 2
dabnn_3x3_64 1285949 ns 1261826 ns 552 <--- input: 56*56*64, kernel: 64*3*3*64, output: 56*56*64, padding: 1
dabnn_3x3_128 988757 ns 981547 ns 721 <--- input: 28*28*128, kernel: 128*3*3*128, output: 28*28*128, padding: 1
dabnn_3x3_256 1018918 ns 1008007 ns 689 <--- input: 14*14*256, kernel: 256*3*3*256, output: 14*14*256, padding: 1
dabnn_3x3_256_s2 269234 ns 268085 ns 2613 <--- input: 14*14*256, kernel: 256*3*3*256, output: 7*7*256, padding: 1, stride: 2
dabnn_3x3_512 1226245 ns 1203749 ns 579 <--- input: 7* 7*512, kernel: 512*3*3*512, output: 7* 7*512, padding: 1
dabnn_bireal18_imagenet 61809506 ns 61056865 ns 10 <--- Bi-Real Net 18, 56.4% top-1 on ImageNet
dabnn_bireal18_imagenet_stem 43279353 ns 41533009 ns 14 <--- Bi-Real Net 18 with stem module (The network structure will be described in detail in the coming paper), 56.4% top-1 on ImageNet
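The shapes annotated on the single-convolution rows above can be cross-checked with the standard convolution output-size formula. The Python sketch below does this and also estimates how many binary multiply-accumulates (MACs) each layer performs. The helper names are hypothetical, square inputs and kernels are assumed, and the MACs-per-nanosecond figure is a derived estimate based on the "Time" column, not a number reported by the benchmark.

```python
def conv_output_size(in_size, kernel, padding, stride=1):
    """Spatial output size of a convolution (floor division, no dilation)."""
    return (in_size + 2 * padding - kernel) // stride + 1


def binary_macs(out_h, out_w, out_c, kernel, in_c):
    """Multiply-accumulates needed for one forward pass of the layer."""
    return out_h * out_w * out_c * kernel * kernel * in_c


# (name, input height/width, channels, kernel, padding, stride, time in ns)
rows = [
    ("dabnn_5x5_256", 14, 256, 5, 2, 1, 3658193),
    ("dabnn_3x3_64", 56, 64, 3, 1, 1, 1285949),
    ("dabnn_3x3_128", 28, 128, 3, 1, 1, 988757),
    ("dabnn_3x3_256", 14, 256, 3, 1, 1, 1018918),
    ("dabnn_3x3_256_s2", 14, 256, 3, 1, 2, 269234),
    ("dabnn_3x3_512", 7, 512, 3, 1, 1, 1226245),
]

for name, size, ch, k, pad, stride, ns in rows:
    out = conv_output_size(size, k, pad, stride)
    macs = binary_macs(out, out, ch, k, ch)
    print(f"{name}: output {out}*{out}*{ch}, "
          f"{macs / 1e6:.1f} M MACs, {macs / ns:.1f} MACs/ns")
```

For example, the stride-2 row gives `(14 + 2*1 - 3) // 2 + 1 = 7`, matching the 7*7*256 output listed in the benchmark.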
The following is the comparison between dabnn and Caffe (full precision), TensorFlow Lite (full precision) and BMXNet (binary). Note that "Conv 64", "Conv 128", "Conv 256" and "Conv 512" have the same meaning as in the above benchmark. Surprisingly, BMXNet is even slower than the full-precision TensorFlow Lite, which suggests that the potential of binary neural networks was far from fully exploited before dabnn.
We provide a conversion tool, onnx2bnn, to convert an ONNX model to a dabnn model. Pre-built onnx2bnn binaries for all platforms are available in GitHub Releases. For Linux users, the pre-built binary is in AppImage format; see https://appimage.org for details.
Note: Binary convolution is a custom operator, so whether an ONNX model is dabnn-compatible depends heavily on how binary convolution is implemented in the training code. Please check out our wiki for further information.
After conversion, the generated dabnn model can be deployed on armv8 devices. For Android developers, we provide an Android AAR package published on jcenter; for usage, please check out the example project.
We publish two pretrained binary neural network models based on Bi-Real Net on ImageNet. More pretrained models will be published in the future.
- Bi-Real Net 18, 56.4% top-1 on ImageNet, 61.8ms/image on Google Pixel 1 (single thread).
- Bi-Real Net 18 with Stem Module, 56.4% top-1 on ImageNet, 43.2ms/image on Google Pixel 1 (single thread). The detailed network structure will be described in the coming paper.
We plan to participate in the ACM Multimedia 2019 Open Source Software Competition. Our implementation details will be presented in a 4-page short paper soon.
Android app demo: https://github.com/JDAI-CV/dabnn-example