Arm NN ports for MLPerf Inference benchmarks
Please follow the CK installation instructions.
$ ck pull repo --url=https://github.com/arm-software/armnn-mlperf
$ ck list repo:*armnn*
ck-armnn
armnn-mlperf
$ ck list repo:*mlperf*
ck-mlperf
armnn-mlperf
NB: Remember to refresh all the repositories after any updates (e.g. bug fixes):
$ ck pull all
$ ck install package --tags=lib,tflite,v1.13
To install ArmNN with full support (frontends: TF, TFLite, ONNX; backends: Reference, OpenCL, Neon):
$ ck install package --tags=lib,armnn,tf,tflite,onnx,neon,opencl,rel.19.02
NB: On a platform with only a couple of GB of RAM, you may wish to restrict the number of CPU build threads, e.g. as follows:
$ ck install package --tags=lib,armnn,tflite,neon,opencl,rel.19.02 \
--env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=4
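One way to choose that number is to cap the job count by the available memory. The sketch below is only a rule of thumb: the helper build_threads and its default of 1024 MB per build job are assumptions made for illustration, not figures measured for ArmNN builds.

```shell
# Suggest a build thread count: one job per MB_PER_JOB of RAM,
# capped at the number of cores and never fewer than one.
# Usage: build_threads CORES MEM_MB [MB_PER_JOB]
# NB: build_threads and the 1024 MB default are illustrative assumptions.
build_threads() {
    cores=$1
    mem_mb=$2
    mb_per_job=${3:-1024}
    n=$((mem_mb / mb_per_job))
    if [ "$n" -gt "$cores" ]; then n=$cores; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

# Example: a hypothetical 8-core board with 3 GB of RAM.
build_threads 8 3072   # prints 3

# On Linux, the inputs could be taken from the system instead:
#   cores=$(nproc)
#   mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
#   ck install package --tags=lib,armnn,tflite,neon,opencl,rel.19.02 \
#       --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS="$(build_threads "$cores" "$mem_mb")"
```

If the build still runs out of memory, passing a larger third argument (more MB per job) lowers the suggested count.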
To save time, you can build with TFLite frontend support only, using one of the backend combinations below. For more details, please refer to the CK-ArmNN repository.
$ ck install package --tags=lib,armnn,tflite,neon,opencl,rel.19.02
$ ck install package --tags=lib,armnn,tflite,neon,rel.19.02
$ ck install package --tags=lib,armnn,tflite,opencl,rel.19.02
$ ck install package --tags=lib,armnn,tflite,rel.19.02
Please follow the MLPerf image classification instructions to install dependencies such as Python packages:
- first, the common instructions;
- then, the TFLite instructions.
$ ck install package --tags=dataset,imagenet,val,original,full
$ ck install package --tags=dataset,imagenet,val,preprocessed,full
$ ck install package --tags=dataset,imagenet,val,original,min --no_tags=resized
$ ck install package --tags=dataset,imagenet,val,preprocessed
Install the MobileNet model:
$ ck install package --tags=model,tflite,mlperf,mobilenet,non-quantized
$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-tflite-accuracy-500 \
--tags=image-classification,mlperf,mobilenet,tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
NB: You can also run the same command on the full ImageNet validation dataset of 50,000 images (see below).
$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-tflite-accuracy-50000 \
--tags=image-classification,mlperf,mobilenet,tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-neon-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,neon,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-neon-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,neon,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-opencl-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,opencl,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,opencl,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
NB: This validation can run on x86 or Arm. However, it is completely unoptimized and hence extremely slow (e.g. 6.5 seconds per image on a Linaro HiKey960 board or 2.9 seconds per image on an Intel Xeon server).
$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
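The accuracy commands above differ only in the program name, the backend flag, the image count, and the matching record_uoa and tags values. As a rough sketch of that pattern, the helper below prints (but does not run) the command for one configuration. The function name accuracy_cmd and the backend label "reference" (for the run without a backend flag) are assumptions introduced here, not part of CK.

```shell
# Print the accuracy benchmark command for one configuration.
# Usage: accuracy_cmd MODEL BACKEND COUNT
#   MODEL:   model tag, e.g. mobilenet
#   BACKEND: tflite | neon | opencl | reference
#   COUNT:   number of images, e.g. 500 or 50000
# NB: accuracy_cmd is a hypothetical helper; it only echoes the command.
accuracy_cmd() {
    model=$1
    backend=$2
    count=$3
    case "$backend" in
        tflite)    program=image-classification-tflite;       impl=tflite;       flag="";                variant="" ;;
        neon)      program=image-classification-armnn-tflite; impl=armnn-tflite; flag="--env.USE_NEON";   variant="neon" ;;
        opencl)    program=image-classification-armnn-tflite; impl=armnn-tflite; flag="--env.USE_OPENCL"; variant="opencl" ;;
        reference) program=image-classification-armnn-tflite; impl=armnn-tflite; flag="";                variant="" ;;
        *) echo "unknown backend: $backend" >&2; return 1 ;;
    esac
    # The backend name is appended to the entry name and tags only when set.
    uoa="mlperf-$model-$impl-accuracy${variant:+-$variant}-$count"
    tags="image-classification,mlperf,$model,$impl,accuracy${variant:+,$variant},$count"
    printf '%s %s %s %s %s\n' \
        "ck benchmark program:$program${flag:+ $flag}" \
        "--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=$count" \
        "--record --record_repo=local --record_uoa=$uoa" \
        "--tags=$tags" \
        "--skip_print_timers --skip_stat_analysis --process_multi_keys"
}

# Print the three 500-image MobileNet commands used on the HiKey960 board.
for backend in tflite neon opencl; do
    accuracy_cmd mobilenet "$backend" 500
done
```

Printing rather than executing keeps the helper safe to try: the output can be compared with the commands above before anything is run.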
To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on two platforms:
- A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL (500 and 50,000 images).
- An Intel Xeon server (velociti): TFLite vs. ArmNN Reference (500 and 50,000 images).
The resulting experimental entries were archived e.g. as follows:
hikey$ ck list local:experiment:mlperf-mobilenet*accuracy*500
mlperf-mobilenet-tflite-accuracy-500
mlperf-mobilenet-armnn-tflite-accuracy-neon-500
mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
hikey$ ck zip local:experiment:mlperf-mobilenet*accuracy*500 \
--archive_name=mlperf-mobilenet-accuracy-500-hikey.zip
The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.
$ wget https://www.dropbox.com/s/9lz7yncy1xtqlvj/mlperf-mobilenet-accuracy-500-hikey.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-500-hikey.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-500-hikey --print_full
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500
...
{'epsilon': 1e-05,
'max_delta': 7.000000000034756e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
'max_delta': 9.000000000036756e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
'max_delta': 6.0000000000060005e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
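Each comparison ends with a summary dictionary. Judging from the figures above, num_mismatched_probabilities appears to count values that differ by more than epsilon, so a max_delta below epsilon goes together with zero mismatches. The sketch below pulls scalar fields out of saved output and checks that relationship. The helper name summary_field is hypothetical (it is not part of CK), and the sample text is copied from the TFLite vs. ArmNN Neon comparison above.

```shell
# Extract one scalar field from a saved `ck compare_experiments` summary.
# Usage: summary_field KEY < summary.txt
# NB: summary_field is a hypothetical helper; list-valued fields are not handled.
summary_field() {
    sed -n "s/.*'$1': \([^,}]*\).*/\1/p"
}

# Sample input: the TFLite vs. ArmNN Neon summary for 500 images.
summary="{'epsilon': 1e-05,
'max_delta': 7.000000000034756e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}"

epsilon=$(printf '%s\n' "$summary" | summary_field epsilon)
max_delta=$(printf '%s\n' "$summary" | summary_field max_delta)
classes=$(printf '%s\n' "$summary" | summary_field num_mismatched_classes)

# Compare the two floating-point values with awk, since the shell
# itself only does integer arithmetic.
within=$(awk -v d="$max_delta" -v e="$epsilon" \
    'BEGIN { print ((d + 0 <= e + 0) ? "yes" : "no") }')

echo "max_delta=$max_delta epsilon=$epsilon within_epsilon=$within mismatched_classes=$classes"
```

For real runs, the output of the comparison command can be redirected to a file and fed to summary_field in the same way.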
$ wget https://www.dropbox.com/s/3cdi3lx7jfxwse7/mlperf-mobilenet-accuracy-50000-hikey.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-50000-hikey.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-50000-hikey --print_full
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000
...
{'epsilon': 1e-05,
'max_delta': 1.3000000000040757e-05,
'num_mismatched_classes': 10,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 20,
'num_mismatched_probabilities': 19,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
'max_delta': 1.4000000000014001e-05,
'num_mismatched_classes': 8,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 20,
'num_mismatched_probabilities': 18,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
...
Checking ILSVRC2012_val_00033823.JPEG...
- mismatched classes at index 2: 137 != 136
- mismatched classes at index 3: 136 != 137
...
{'epsilon': 1e-05,
'max_delta': 8.000000000008e-06,
'num_mismatched_classes': 2,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 1,
'num_mismatched_probabilities': 0,
'return': 0}
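To put the 50,000-image counts in proportion, the mismatched files can be expressed as a share of the dataset. The short calculation below does this for the num_mismatched_files values reported above; the helper name mismatch_rate is an assumption made for this example.

```shell
# Print mismatched files as a percentage of all images.
# Usage: mismatch_rate MISMATCHED_FILES TOTAL_IMAGES
# NB: mismatch_rate is a hypothetical helper, not a CK command.
mismatch_rate() {
    awk -v n="$1" -v total="$2" 'BEGIN { printf "%.3f%%\n", 100 * n / total }'
}

mismatch_rate 20 50000   # TFLite vs. ArmNN Neon, and TFLite vs. ArmNN OpenCL
mismatch_rate 1 50000    # ArmNN Neon vs. ArmNN OpenCL
```

Both TFLite comparisons report 20 mismatched files, i.e. 0.040% of the images. The two ArmNN backends disagree on a single file (0.002%), where classes 136 and 137 swap ranks while no probability differs by more than epsilon.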
$ wget https://www.dropbox.com/s/j2rdh3uzhz3lqh7/mlperf-mobilenet-accuracy-500-velociti.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-500-velociti.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-500-velociti --print_full
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-500
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-tflite-accuracy-500
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-tflite-accuracy-500
...
{'epsilon': 1e-05,
'max_delta': 7.000000000090267e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
$ wget https://www.dropbox.com/s/z5bx7aeocwdyrww/mlperf-mobilenet-accuracy-50000-velociti.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-50000-velociti.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-50000-velociti --print_full
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-50000
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-tflite-accuracy-50000
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-tflite-accuracy-50000
...
{'epsilon': 1e-05,
'max_delta': 1.2000000000012001e-05,
'num_mismatched_classes': 2,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 14,
'num_mismatched_probabilities': 17,
'return': 0}
Install the ResNet model:
$ ck install package --tags=model,tflite,mlperf,resnet
More than one package or version found:
0) model-tflite-mlperf-resnet-no-argmax Version 1.5 (afb43014ef38f646)
1) model-tflite-mlperf-resnet Version 1.5 (d60d4e9a84151271)
2) model-tflite-convert-from-tf (35e84375ac48dcb1), Variations: resnet
Please select the package to install [ hit return for "0" ]:
Option 0 will download a TFLite model preconverted from the TF model. During the conversion, the ArgMax operator, which causes an issue with ArmNN v19.02, was excluded.
Option 1 will download a TFLite model preconverted from the TF model, but including the ArgMax operator. This variant can be used once the above issue is resolved.
Option 2 will download the TF model and convert it to TFLite, while excluding the ArgMax operator. Since the conversion relies on a prebuilt version of TF, this option is only viable on x86. (This constraint can be relaxed, but building TF on Arm is not officially supported.)
$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-tflite-accuracy-500 \
--tags=image-classification,mlperf,resnet,tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
NB: You can also run the same command on the full ImageNet validation dataset of 50,000 images (see below).
$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-tflite-accuracy-50000 \
--tags=image-classification,mlperf,resnet,tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-neon-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,neon,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-neon-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,neon,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-opencl-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,opencl,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-opencl-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,opencl,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
NB: This validation can run on x86 or Arm. However, it is completely unoptimized and hence extremely slow (e.g. 34.8 seconds per image on a Linaro HiKey960 board or 16.9 seconds per image on an Intel Xeon server).
$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on two platforms:
- A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL (500 and 50,000 images).
- An Intel Xeon server (velociti): TFLite vs. ArmNN Reference (500 images only).
The resulting experimental entries were archived e.g. as follows:
hikey$ ck list local:experiment:mlperf-resnet*accuracy*500
mlperf-resnet-tflite-accuracy-500
mlperf-resnet-armnn-tflite-accuracy-neon-500
mlperf-resnet-armnn-tflite-accuracy-opencl-500
hikey$ ck zip local:experiment:mlperf-resnet*accuracy*500 \
--archive_name=mlperf-resnet-accuracy-500-hikey.zip
The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.
$ wget https://www.dropbox.com/s/eod0bflxxzpudmr/mlperf-resnet-accuracy-500-hikey.zip
$ ck add repo --zip=mlperf-resnet-accuracy-500-hikey.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-500-hikey --print_full
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500
...
{'epsilon': 1e-05,
'max_delta': 1.0999999999983245e-05,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 2,
'num_mismatched_probabilities': 3,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
'max_delta': 1.0999999999983245e-05,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 3,
'num_mismatched_probabilities': 4,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
'max_delta': 6.0000000000060005e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
$ wget https://www.dropbox.com/s/1yzv6unriqs18yb/mlperf-resnet-accuracy-50000-hikey.zip
$ ck add repo --zip=mlperf-resnet-accuracy-50000-hikey.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-50000-hikey --print_full
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000
...
{'epsilon': 1e-05,
'max_delta': 2.4000000000024002e-05,
'num_mismatched_classes': 6,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 107,
'num_mismatched_probabilities': 153,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
'max_delta': 2.5000000000052758e-05,
'num_mismatched_classes': 4,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 147,
'num_mismatched_probabilities': 190,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
'max_delta': 1.0000000000010001e-05,
'num_mismatched_classes': 6,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 5,
'num_mismatched_probabilities': 2,
'return': 0}
$ wget https://www.dropbox.com/s/1jv4lpfp1ddr2j7/mlperf-resnet-accuracy-500-velociti.zip
$ ck add repo --zip=mlperf-resnet-accuracy-500-velociti.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-500-velociti --print_full
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-armnn-tflite-accuracy-500
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-tflite-accuracy-500
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-armnn-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-tflite-accuracy-500
...
{'epsilon': 1e-05,
'max_delta': 3.0000000000030003e-06,
'num_mismatched_classes': 0,
'num_mismatched_elementary_keys': 0,
'num_mismatched_files': 0,
'num_mismatched_probabilities': 0,
'return': 0}
Please follow the MLPerf object detection instructions to install dependencies such as Python packages:
- first, the common instructions;
- then, the TFLite instructions.
The SSD models require TFLite 1.13.1.
The COCO API (used to evaluate object detection accuracy on the COCO dataset) requires Python 3.
Since many embedded platforms use Python 2 by default (including HiKey960), care must be taken not to mix Python 3 and Python 2 packages.
Therefore, all benchmarking commands below use the CK_PYTHON=python3 prefix to ensure CK runs under Python 3.
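In the shell, an assignment placed before a command applies to that command only, which is why the prefix is repeated on every benchmarking line. The snippet below demonstrates this behaviour with a harmless child shell instead of ck; exporting the variable is shown as an alternative for a whole session.

```shell
# Run in a subshell so the demonstration does not touch the current environment.
(
    unset CK_PYTHON

    # The prefix sets CK_PYTHON for this one command...
    CK_PYTHON=python3 sh -c 'echo "inside: ${CK_PYTHON:-unset}"'

    # ...and leaves the calling shell unchanged.
    echo "after prefix: ${CK_PYTHON:-unset}"

    # Exporting makes the setting apply to every later command instead.
    export CK_PYTHON=python3
    sh -c 'echo "after export: ${CK_PYTHON:-unset}"'
)
```

This prints "inside: python3", "after prefix: unset" and "after export: python3". Keeping the per-command prefix avoids changing the interpreter for unrelated CK commands in the same session.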
$ ck install package --tags=object-detection,dataset,coco.2017,val,original,full
$ ck install package --tags=object-detection,dataset,coco.2017,preprocessed,full
Install the SSD-MobileNet model:
$ ck install package --tags=model,tflite,mlperf,object-detection,ssd-mobilenet
$ CK_PYTHON=python3 ck benchmark program:object-detection-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ CK_PYTHON=python3 ck benchmark program:object-detection-armnn-tflite --env.USE_NMS=regular --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon \
--tags=mlperf,object-detection,ssd-mobilenet,armnn-tflite,accuracy,neon \
--skip_print_timers --skip_stat_analysis --process_multi_keys
$ CK_PYTHON=python3 ck benchmark program:object-detection-armnn-tflite --env.USE_NMS=regular --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl \
--tags=mlperf,object-detection,ssd-mobilenet,armnn-tflite,accuracy,opencl \
--skip_print_timers --skip_stat_analysis --process_multi_keys
To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on:
- A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL.
The resulting experimental entries were archived e.g. as follows:
hikey$ ck list local:experiment:mlperf-object-detection-ssd-mobilenet*accuracy*
...
hikey$ ck zip local:experiment:mlperf-object-detection-ssd-mobilenet*accuracy* \
--archive_name=mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip
The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.
$ wget https://www.dropbox.com/s/jzpum9fedwgq8rd/mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip
$ ck add repo --zip=mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip
$ ck list --repo_uoa=mlperf-object-detection-ssd-mobilenet-accuracy-hikey --print_full
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon
...
{'delta_mAP': 0.0007913466136450775,
'delta_recall': 0.001550580396864898,
'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
'epsilon_score': 1e-05,
'max_delta_bbox': [596.98, 490.93, 606.36, 583.04],
'max_delta_score': 0.49000000000000005,
'num_mismatched_classes': 239,
'num_mismatched_files': 1166,
'num_mismatched_probabilities': 442,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
...
{'delta_mAP': 0.0007900779499596944,
'delta_recall': 0.0015451924658304583,
'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
'epsilon_score': 1e-05,
'max_delta_bbox': [596.98, 490.93, 606.36, 583.04],
'max_delta_score': 0.49000000000000005,
'num_mismatched_classes': 239,
'num_mismatched_files': 1167,
'num_mismatched_probabilities': 444,
'return': 0}
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
...
{'delta_mAP': -1.2686636853831423e-06,
'delta_recall': -5.387931034439575e-06,
'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
'epsilon_score': 1e-05,
'max_delta_bbox': [0.010000000000047748,
0.010000000000047748,
0.010000000000047748,
0.010000000000047748],
'max_delta_score': 0.0010000000000000009,
'num_mismatched_classes': 0,
'num_mismatched_files': 7,
'num_mismatched_probabilities': 7,
'return': 0}
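For object detection, the summaries report delta_mAP and delta_recall rather than a single max_delta. Assuming mAP and recall are expressed as fractions between 0 and 1 (the usual COCO convention, but not stated above), the sketch below converts a delta into absolute percentage points. The helper name abs_points is hypothetical.

```shell
# Convert a fractional delta into absolute percentage points.
# Usage: abs_points DELTA
# NB: abs_points is a hypothetical helper; it assumes DELTA is a fraction of 1.
abs_points() {
    awk -v d="$1" 'BEGIN { d = d * 100; if (d < 0) d = -d; printf "%.4f\n", d }'
}

abs_points 0.0007913466136450775     # delta_mAP, TFLite vs. ArmNN Neon
abs_points -1.2686636853831423e-06   # delta_mAP, ArmNN Neon vs. ArmNN OpenCL
```

Under that assumption, the ArmNN Neon results differ from TFLite by less than 0.1 percentage points of mAP (0.0791), and the two ArmNN backends are practically indistinguishable (0.0001), even though individual boxes and scores (max_delta_bbox, max_delta_score) can differ substantially between TFLite and ArmNN.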