ARM-software/armnn-mlperf


Arm NN ports for MLPerf Inference benchmarks

Getting started

Install CK

Please follow the CK installation instructions.

Pull CK repositories

$ ck pull repo --url=https://github.com/arm-software/armnn-mlperf
$ ck list repo:*armnn*
ck-armnn
armnn-mlperf
$ ck list repo:*mlperf*
ck-mlperf
armnn-mlperf

NB: Remember to refresh all the repositories after any updates (e.g. bug fixes):

$ ck pull all

Install TFLite

$ ck install package --tags=lib,tflite,v1.13

Install ArmNN

To install ArmNN with full support (frontends: TF, TFLite, ONNX; backends: Reference, OpenCL, Neon):

$ ck install package --tags=lib,armnn,tf,tflite,onnx,neon,opencl,rel.19.02

NB: On a platform with only a couple of GB of RAM, you may wish to restrict the number of CPU build threads, e.g. as follows:

$ ck install package --tags=lib,armnn,tflite,neon,opencl,rel.19.02 \
--env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=4

If you would like to save time, you can build ArmNN with TFLite frontend support only, choosing one of the backend options below. For more details, please refer to the CK-ArmNN repository.

Option 1: Install ArmNN with TFLite, Neon and OpenCL support (recommended)

$ ck install package --tags=lib,armnn,tflite,neon,opencl,rel.19.02

Option 2: Install ArmNN with TFLite and Neon support

$ ck install package --tags=lib,armnn,tflite,neon,rel.19.02

Option 3: Install ArmNN with TFLite and OpenCL support

$ ck install package --tags=lib,armnn,tflite,opencl,rel.19.02

Option 4: Install ArmNN with TFLite and Reference support

$ ck install package --tags=lib,armnn,tflite,rel.19.02
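The install commands above differ only in their tag list. As a minimal sketch, the tag string could be composed as follows. The helper name `armnn_install_command` is hypothetical and not part of CK; it uses only tags that appear in the commands above, and it assumes (judging from Option 4) that the Reference backend needs no tag of its own.

```python
def armnn_install_command(frontends, backends, release="rel.19.02"):
    """Compose a `ck install package` command line for ArmNN.

    Hypothetical helper, not part of CK. The Reference backend appears
    to need no tag: it is what remains when neither `neon` nor `opencl`
    is requested (assumption based on Option 4).
    """
    known_frontends = ("tf", "tflite", "onnx")
    known_backends = ("neon", "opencl")
    for name in frontends:
        if name not in known_frontends:
            raise ValueError("unknown frontend: " + name)
    for name in backends:
        if name not in known_backends:
            raise ValueError("unknown backend: " + name)
    tags = ["lib", "armnn"] + list(frontends) + list(backends) + [release]
    return "ck install package --tags=" + ",".join(tags)


# Option 1 (recommended): TFLite frontend with Neon and OpenCL backends.
print(armnn_install_command(["tflite"], ["neon", "opencl"]))
# Option 4: TFLite frontend with the Reference backend only.
print(armnn_install_command(["tflite"], []))
```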

Image classification

Please follow the MLPerf image classification instructions to install dependencies such as Python packages.

Download and preprocess the ImageNet 2012 validation dataset

Full (50,000 images)

$ ck install package --tags=dataset,imagenet,val,original,full
$ ck install package --tags=dataset,imagenet,val,preprocessed,full

Minimal (500 images)

$ ck install package --tags=dataset,imagenet,val,original,min --no_tags=resized
$ ck install package --tags=dataset,imagenet,val,preprocessed

MobileNet

Model

Install the MobileNet model:

$ ck install package --tags=model,tflite,mlperf,mobilenet,non-quantized

TFLite data (reference)

Run on 500 images

$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-tflite-accuracy-500 \
--tags=image-classification,mlperf,mobilenet,tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

NB: You can also run the same command on the full ImageNet validation dataset of 50,000 images (see below).

Run on 50,000 images

$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-tflite-accuracy-50000 \
--tags=image-classification,mlperf,mobilenet,tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN Neon data

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-neon-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,neon,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-neon-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,neon,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN OpenCL data

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-opencl-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,opencl,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,opencl,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN Reference data (NOT RECOMMENDED)

NB: This validation can run on x86 or Arm. However, it is completely unoptimized and hence extremely slow (e.g. 6.5 seconds per image on a Linaro HiKey960 board or 2.9 seconds per image on an Intel Xeon server).

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-500 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images

$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-mobilenet-armnn-tflite-accuracy-50000 \
--tags=image-classification,mlperf,mobilenet,armnn-tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys
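The accuracy commands above follow one pattern: they vary only in the program name, the optional backend flag, the image count, and the matching experiment name and tags. A minimal sketch of that pattern is given below. The function `accuracy_benchmark_command` is a hypothetical helper, not part of CK, and it emits only flags that appear in the commands above.

```python
def accuracy_benchmark_command(model, backend, images):
    """Compose an accuracy-mode `ck benchmark` command (hypothetical helper).

    backend is one of 'tflite', 'neon', 'opencl' or 'reference'.
    """
    if backend not in ("tflite", "neon", "opencl", "reference"):
        raise ValueError("unknown backend: " + backend)
    impl = "tflite" if backend == "tflite" else "armnn-tflite"
    # Only the optimized ArmNN backends add a flag and a name suffix;
    # the Reference backend is selected by passing neither.
    accel = [backend] if backend in ("neon", "opencl") else []
    name = "-".join(["mlperf", model, impl, "accuracy"] + accel + [str(images)])
    tags = ",".join(["image-classification", "mlperf", model, impl, "accuracy"]
                    + accel + [str(images)])
    args = ["ck", "benchmark", "program:image-classification-" + impl]
    args += ["--env.USE_" + a.upper() for a in accel]
    args += [
        "--repetitions=1",
        "--env.CK_BATCH_SIZE=1",
        "--env.CK_BATCH_COUNT=%d" % images,
        "--record",
        "--record_repo=local",
        "--record_uoa=" + name,
        "--tags=" + tags,
        "--skip_print_timers",
        "--skip_stat_analysis",
        "--process_multi_keys",
    ]
    return " ".join(args)


print(accuracy_benchmark_command("mobilenet", "neon", 500))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without CK installed.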

Validate experimental data

To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on two platforms:

  • A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL (500 and 50,000 images).
  • An Intel Xeon server (velociti): TFLite vs. ArmNN Reference (500 and 50,000 images).

The resulting experimental entries were archived e.g. as follows:

hikey$ ck list local:experiment:mlperf-mobilenet*accuracy*500
mlperf-mobilenet-tflite-accuracy-500
mlperf-mobilenet-armnn-tflite-accuracy-neon-500
mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
hikey$ ck zip local:experiment:mlperf-mobilenet*accuracy*500 \
                --archive_name=mlperf-mobilenet-accuracy-500-hikey.zip

The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.

hikey

500 images
$ wget https://www.dropbox.com/s/9lz7yncy1xtqlvj/mlperf-mobilenet-accuracy-500-hikey.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-500-hikey.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-500-hikey --print_full
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
TFLite vs. ArmNN Neon
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500
...
{'epsilon': 1e-05,
 'max_delta': 7.000000000034756e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}
TFLite vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
 'max_delta': 9.000000000036756e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}
ArmNN Neon vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-500 \
mlperf-mobilenet-accuracy-500-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
 'max_delta': 6.0000000000060005e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}
50,000 images
$ wget https://www.dropbox.com/s/3cdi3lx7jfxwse7/mlperf-mobilenet-accuracy-50000-hikey.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-50000-hikey.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-50000-hikey --print_full
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
TFLite vs. ArmNN Neon
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000
...
{'epsilon': 1e-05,
 'max_delta': 1.3000000000040757e-05,
 'num_mismatched_classes': 10,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 20,
 'num_mismatched_probabilities': 19,
 'return': 0}
TFLite vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
 'max_delta': 1.4000000000014001e-05,
 'num_mismatched_classes': 8,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 20,
 'num_mismatched_probabilities': 18,
 'return': 0}
ArmNN Neon vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-neon-50000 \
mlperf-mobilenet-accuracy-50000-hikey:experiment:mlperf-mobilenet-armnn-tflite-accuracy-opencl-50000
...
Checking ILSVRC2012_val_00033823.JPEG...
- mismatched classes at index 2: 137 != 136
- mismatched classes at index 3: 136 != 137
...
{'epsilon': 1e-05,
 'max_delta': 8.000000000008e-06,
 'num_mismatched_classes': 2,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 1,
 'num_mismatched_probabilities': 0,
 'return': 0}
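The last comparison reports two mismatched classes in one file even though no probability differs by more than epsilon. One plausible explanation is that two classes with nearly equal probabilities swap places in the ranking. The sketch below illustrates this effect; the probability values are hypothetical and chosen only for illustration, and only the class ids 136 and 137 and the epsilon value are taken from the output above.

```python
EPSILON = 1e-05  # same tolerance as in the summaries above

# Hypothetical probabilities for one image (class id -> probability).
neon = {10: 0.5, 20: 0.2, 137: 0.100004, 136: 0.100001, 30: 0.05}
opencl = {10: 0.5, 20: 0.2, 137: 0.100000, 136: 0.100002, 30: 0.05}


def ranked(probs):
    """Class ids ordered from most to least probable."""
    return sorted(probs, key=probs.get, reverse=True)


# Largest per-class probability difference between the two backends.
max_delta = max(abs(neon[c] - opencl[c]) for c in neon)

# Positions in the ranking where the two backends disagree.
mismatches = [
    (index, a, b)
    for index, (a, b) in enumerate(zip(ranked(neon), ranked(opencl)))
    if a != b
]

print("probabilities within epsilon:", max_delta <= EPSILON)
for index, a, b in mismatches:
    print("- mismatched classes at index %d: %d != %d" % (index, a, b))
```

With these values the two rankings differ at indices 2 and 3, while every probability still agrees to within epsilon.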

velociti

500 images
$ wget https://www.dropbox.com/s/j2rdh3uzhz3lqh7/mlperf-mobilenet-accuracy-500-velociti.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-500-velociti.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-500-velociti --print_full
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-500
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-tflite-accuracy-500
TFLite vs. ArmNN Reference
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-500 \
mlperf-mobilenet-accuracy-500-velociti:experiment:mlperf-mobilenet-tflite-accuracy-500
...
{'epsilon': 1e-05,
 'max_delta': 7.000000000090267e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}
50,000 images
$ wget https://www.dropbox.com/s/z5bx7aeocwdyrww/mlperf-mobilenet-accuracy-50000-velociti.zip
$ ck add repo --zip=mlperf-mobilenet-accuracy-50000-velociti.zip
$ ck list --repo_uoa=mlperf-mobilenet-accuracy-50000-velociti --print_full
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-50000
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-tflite-accuracy-50000
TFLite vs. ArmNN Reference
$ ck compare_experiments mlperf \
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-armnn-tflite-accuracy-50000 \
mlperf-mobilenet-accuracy-50000-velociti:experiment:mlperf-mobilenet-tflite-accuracy-50000
...
{'epsilon': 1e-05,
 'max_delta': 1.2000000000012001e-05,
 'num_mismatched_classes': 2,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 14,
 'num_mismatched_probabilities': 17,
 'return': 0}
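The summaries printed by `ck compare_experiments` are Python dictionary literals, so they can be post-processed directly. The sketch below parses the summary shown above and relates the mismatch counts to the dataset size. It assumes that `epsilon` is the tolerance against which `max_delta` is judged, which is inferred from the outputs rather than documented here; the function name `digest` is hypothetical.

```python
import ast

# Summary copied from the TFLite vs. ArmNN Reference comparison on 50,000 images.
SUMMARY_TEXT = """
{'epsilon': 1e-05,
 'max_delta': 1.2000000000012001e-05,
 'num_mismatched_classes': 2,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 14,
 'num_mismatched_probabilities': 17,
 'return': 0}
"""


def digest(summary_text, num_images):
    """Condense a classification comparison summary (hypothetical helper)."""
    summary = ast.literal_eval(summary_text.strip())
    return {
        # Assumption: a delta above epsilon counts as a probability mismatch.
        "within_epsilon": summary["max_delta"] <= summary["epsilon"],
        "mismatched_file_rate": summary["num_mismatched_files"] / num_images,
        "mismatched_classes": summary["num_mismatched_classes"],
    }


result = digest(SUMMARY_TEXT, 50000)
print(result)
```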

ResNet

Model

Install the ResNet model:

$ ck install package --tags=model,tflite,mlperf,resnet

More than one package or version found:

 0) model-tflite-mlperf-resnet-no-argmax  Version 1.5  (afb43014ef38f646)
 1) model-tflite-mlperf-resnet  Version 1.5  (d60d4e9a84151271)
 2) model-tflite-convert-from-tf (35e84375ac48dcb1), Variations: resnet

Please select the package to install [ hit return for "0" ]:

Option 0 will download a TFLite model preconverted from the TF model. During the conversion, the ArgMax operator, which causes an issue with ArmNN v19.02, was excluded.

Option 1 will download a TFLite model preconverted from the TF model, but including the ArgMax operator. This variant can be used once the above issue is resolved.

Option 2 will download the TF model and convert it to TFLite, while excluding the ArgMax operator. Since the conversion relies on a prebuilt version of TF, this option is only viable on x86. (This constraint can be relaxed, but building TF on Arm is not officially supported.)
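With the ArgMax operator excluded, the predicted class presumably has to be selected from the model output outside the network. The sketch below shows that selection in plain Python. It assumes the model returns a flat vector of per-class scores; the sample scores and the function names are hypothetical and not taken from the benchmark programs.

```python
def argmax(scores):
    """Index of the largest score (the first one wins on ties)."""
    best = 0
    for index, score in enumerate(scores):
        if score > scores[best]:
            best = index
    return best


def top_k(scores, k=5):
    """Indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical output for a five-class model.
scores = [0.01, 0.02, 0.90, 0.04, 0.03]
print(argmax(scores))
print(top_k(scores, k=3))
```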

TFLite data (reference)

Run on 500 images

$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-tflite-accuracy-500 \
--tags=image-classification,mlperf,resnet,tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

NB: You can also run the same command on the full ImageNet validation dataset of 50,000 images (see below).

Run on 50,000 images

$ ck benchmark program:image-classification-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-tflite-accuracy-50000 \
--tags=image-classification,mlperf,resnet,tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN Neon data

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-neon-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,neon,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_NEON \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-neon-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,neon,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN OpenCL data

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-opencl-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,opencl,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images

$ ck benchmark program:image-classification-armnn-tflite --env.USE_OPENCL \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-opencl-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,opencl,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN Reference data (NOT RECOMMENDED)

NB: This validation can run on x86 or Arm. However, it is completely unoptimized and hence extremely slow (e.g. 34.8 seconds per image on a Linaro HiKey960 board or 16.9 seconds per image on an Intel Xeon server).

Run on 500 images

$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=500 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-500 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,500 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Run on 50,000 images (NOT RUN)

$ ck benchmark program:image-classification-armnn-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50000 \
--record --record_repo=local --record_uoa=mlperf-resnet-armnn-tflite-accuracy-50000 \
--tags=image-classification,mlperf,resnet,armnn-tflite,accuracy,50000 \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Validate experimental data

To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on two platforms:

  • A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL (500 and 50,000 images).
  • An Intel Xeon server (velociti): TFLite vs. ArmNN Reference (500 images only).

The resulting experimental entries were archived e.g. as follows:

hikey$ ck list local:experiment:mlperf-resnet*accuracy*500
mlperf-resnet-tflite-accuracy-500
mlperf-resnet-armnn-tflite-accuracy-neon-500
mlperf-resnet-armnn-tflite-accuracy-opencl-500
hikey$ ck zip local:experiment:mlperf-resnet*accuracy*500 \
                --archive_name=mlperf-resnet-accuracy-500-hikey.zip

The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.

hikey

500 images
$ wget https://www.dropbox.com/s/eod0bflxxzpudmr/mlperf-resnet-accuracy-500-hikey.zip
$ ck add repo --zip=mlperf-resnet-accuracy-500-hikey.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-500-hikey --print_full
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500
TFLite vs. ArmNN Neon
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500
...
{'epsilon': 1e-05,
 'max_delta': 1.0999999999983245e-05,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 2,
 'num_mismatched_probabilities': 3,
 'return': 0}
TFLite vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
 'max_delta': 1.0999999999983245e-05,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 3,
 'num_mismatched_probabilities': 4,
 'return': 0}
ArmNN Neon vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-500 \
mlperf-resnet-accuracy-500-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-500
...
{'epsilon': 1e-05,
 'max_delta': 6.0000000000060005e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}
50,000 images
$ wget https://www.dropbox.com/s/1yzv6unriqs18yb/mlperf-resnet-accuracy-50000-hikey.zip
$ ck add repo --zip=mlperf-resnet-accuracy-50000-hikey.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-50000-hikey --print_full
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000
TFLite vs. ArmNN Neon
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000
...
{'epsilon': 1e-05,
 'max_delta': 2.4000000000024002e-05,
 'num_mismatched_classes': 6,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 107,
 'num_mismatched_probabilities': 153,
 'return': 0}
TFLite vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-tflite-accuracy-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
 'max_delta': 2.5000000000052758e-05,
 'num_mismatched_classes': 4,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 147,
 'num_mismatched_probabilities': 190,
 'return': 0}
ArmNN Neon vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-neon-50000 \
mlperf-resnet-accuracy-50000-hikey:experiment:mlperf-resnet-armnn-tflite-accuracy-opencl-50000
...
{'epsilon': 1e-05,
 'max_delta': 1.0000000000010001e-05,
 'num_mismatched_classes': 6,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 5,
 'num_mismatched_probabilities': 2,
 'return': 0}

velociti

500 images
$ wget https://www.dropbox.com/s/1jv4lpfp1ddr2j7/mlperf-resnet-accuracy-500-velociti.zip
$ ck add repo --zip=mlperf-resnet-accuracy-500-velociti.zip
$ ck list --repo_uoa=mlperf-resnet-accuracy-500-velociti --print_full
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-armnn-tflite-accuracy-500
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-tflite-accuracy-500
TFLite vs. ArmNN Reference
$ ck compare_experiments mlperf \
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-armnn-tflite-accuracy-500 \
mlperf-resnet-accuracy-500-velociti:experiment:mlperf-resnet-tflite-accuracy-500
...
{'epsilon': 1e-05,
 'max_delta': 3.0000000000030003e-06,
 'num_mismatched_classes': 0,
 'num_mismatched_elementary_keys': 0,
 'num_mismatched_files': 0,
 'num_mismatched_probabilities': 0,
 'return': 0}

Object detection

Please follow the MLPerf object detection instructions to install dependencies such as Python packages.

Caveats

TFLite

The SSD models require TFLite 1.13.1.

Python 3

The COCO API (used to evaluate object detection accuracy on the COCO dataset) requires Python 3. Since many embedded platforms use Python 2 by default (including HiKey960), care must be taken not to mix Python 3 and Python 2 packages. Therefore, all benchmarking commands below use the CK_PYTHON=python3 prefix to ensure CK runs under Python 3.

Download and preprocess the COCO 2017 validation dataset

$ ck install package --tags=object-detection,dataset,coco.2017,val,original,full
$ ck install package --tags=object-detection,dataset,coco.2017,preprocessed,full

SSD-MobileNet

Model

Install the SSD-MobileNet model:

$ ck install package --tags=model,tflite,mlperf,object-detection,ssd-mobilenet

TFLite data (reference)

$ CK_PYTHON=python3 ck benchmark program:object-detection-tflite \
--repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN Neon data

$ CK_PYTHON=python3 ck benchmark program:object-detection-armnn-tflite --env.USE_NMS=regular --env.USE_NEON \
--speed --repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon \
--tags=mlperf,object-detection,ssd-mobilenet,armnn-tflite,accuracy,neon \
--skip_print_timers --skip_stat_analysis --process_multi_keys

ArmNN OpenCL data

$ CK_PYTHON=python3 ck benchmark program:object-detection-armnn-tflite --env.USE_NMS=regular --env.USE_OPENCL \
--speed --repetitions=1 --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=5000 --env.CK_METRIC_TYPE=COCO \
--record --record_repo=local --record_uoa=mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl \
--tags=mlperf,object-detection,ssd-mobilenet,armnn-tflite,accuracy,opencl \
--skip_print_timers --skip_stat_analysis --process_multi_keys

Validate experimental data

To validate the equivalence of the optimized ArmNN implementation versus the reference TFLite one, we collected experimental data as above on:

  • A Linaro HiKey960 board (hikey): TFLite vs. ArmNN Neon vs. ArmNN OpenCL.

The resulting experimental entries were archived e.g. as follows:

hikey$ ck list local:experiment:mlperf-object-detection-ssd-mobilenet*accuracy*
...
hikey$ ck zip local:experiment:mlperf-object-detection-ssd-mobilenet*accuracy* \
                --archive_name=mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip

The archives were then uploaded to Dropbox. You can follow the instructions below to download the archives and validate the accuracy.

hikey

$ wget https://www.dropbox.com/s/jzpum9fedwgq8rd/mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip
$ ck add repo --zip=mlperf-object-detection-ssd-mobilenet-accuracy-hikey.zip
$ ck list --repo_uoa=mlperf-object-detection-ssd-mobilenet-accuracy-hikey --print_full
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy
TFLite vs. ArmNN Neon
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon
...
{'delta_mAP': 0.0007913466136450775,
 'delta_recall': 0.001550580396864898,
 'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
 'epsilon_dist': 100.0,
 'epsilon_prob': 1e-05,
 'max_delta_bbox': [596.98, 490.93, 606.36, 583.04],
 'max_delta_dist': 173810.622875,
 'max_delta_prob': 0.49000000000000005,
 'num_mismatched_classes': 239,
 'num_mismatched_distances': 1353,
 'num_mismatched_files': 1166,
 'num_mismatched_probabilities': 442,
 'return': 0}
TFLite vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-tflite-accuracy \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
...
{'delta_mAP': 0.0007900779499596944,
 'delta_recall': 0.0015451924658304583,
 'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
 'epsilon_dist': 100.0,
 'epsilon_prob': 1e-05,
 'max_delta_bbox': [596.98, 490.93, 606.36, 583.04],
 'max_delta_dist': 173810.622875,
 'max_delta_prob': 0.49000000000000005,
 'num_mismatched_classes': 239,
 'num_mismatched_distances': 1353,
 'num_mismatched_files': 1167,
 'num_mismatched_probabilities': 444,
 'return': 0}
ArmNN Neon vs. ArmNN OpenCL
$ ck compare_experiments mlperf \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-neon \
mlperf-object-detection-ssd-mobilenet-accuracy-hikey:experiment:mlperf-object-detection-ssd-mobilenet-armnn-tflite-accuracy-opencl
...
{'delta_mAP': -1.2686636853831423e-06,
 'delta_recall': -5.387931034439575e-06,
 'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
 'epsilon_dist': 100.0,
 'epsilon_prob': 1e-05,
 'max_delta_bbox': [0.010000000000047748,
                    0.010000000000047748,
                    0.010000000000047748,
                    0.010000000000047748],
 'max_delta_dist': 3.1188250000122935,
 'max_delta_prob': 0.0010000000000000009,
 'num_mismatched_classes': 0,
 'num_mismatched_distances': 0,
 'num_mismatched_files': 7,
 'num_mismatched_probabilities': 7,
 'return': 0}
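The object detection summaries pair each `max_delta_*` value with an `epsilon_*` value. A minimal sketch of reading them together is given below. It assumes that each `epsilon_*` field is the tolerance for the matching `max_delta_*` field, which is inferred from the outputs above rather than documented here; the function name `exceeded_tolerances` is hypothetical.

```python
import ast

# Summary copied from the ArmNN Neon vs. ArmNN OpenCL comparison above.
SUMMARY_TEXT = """
{'delta_mAP': -1.2686636853831423e-06,
 'delta_recall': -5.387931034439575e-06,
 'epsilon_bbox': [1.0, 1.0, 1.0, 1.0],
 'epsilon_dist': 100.0,
 'epsilon_prob': 1e-05,
 'max_delta_bbox': [0.010000000000047748,
                    0.010000000000047748,
                    0.010000000000047748,
                    0.010000000000047748],
 'max_delta_dist': 3.1188250000122935,
 'max_delta_prob': 0.0010000000000000009,
 'num_mismatched_classes': 0,
 'num_mismatched_distances': 0,
 'num_mismatched_files': 7,
 'num_mismatched_probabilities': 7,
 'return': 0}
"""


def exceeded_tolerances(summary):
    """Names of the quantities whose maximum delta exceeds its epsilon."""
    exceeded = []
    bbox_pairs = zip(summary["max_delta_bbox"], summary["epsilon_bbox"])
    if any(delta > eps for delta, eps in bbox_pairs):
        exceeded.append("bbox")
    if summary["max_delta_dist"] > summary["epsilon_dist"]:
        exceeded.append("dist")
    if summary["max_delta_prob"] > summary["epsilon_prob"]:
        exceeded.append("prob")
    return exceeded


summary = ast.literal_eval(SUMMARY_TEXT.strip())
print(exceeded_tolerances(summary))
```

Under this reading, the two ArmNN backends differ from each other only in detection probabilities, which is consistent with the zero counts for mismatched classes and distances.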
