Update OVIC README for CVPR2020 competition.

- Added Pixel 4 latency measurements for test and reference models.
- Added MobileNetV3 benchmarks.

PiperOrigin-RevId: 306954559
Change-Id: I67130537f73b7a2cb7639b54e51cfacc68630d6c

# OVIC Benchmarker for LPCV 2020

This folder contains the SDK for track one of the
[Low Power Computer Vision workshop at CVPR 2020](https://lpcv.ai/2020CVPR/ovic-track).

## Pre-requisite

Follow the steps [here](https://www.tensorflow.org/lite/demo_android) to install
Tensorflow, Bazel, and the Android NDK and SDK.
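
As a quick sanity check that the toolchain is in place (a minimal sketch; it
assumes `bazel` and `adb` are already on your `PATH`):

```sh
# Print versions; any output confirms the tools are reachable.
bazel version
adb --version
```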

## Test the benchmarker

The testing utilities help you (the developer) make sure that your submissions
in TfLite format will be processed as expected in the competition's
benchmarking system.

Note: for now the tests only provide correctness checks, i.e. the classifier
predicts the correct category on the test image, but no on-device latency
measurements. When exercising the latency-measurement functionality, the tests
print the latency of runs on a desktop computer, which is not indicative of
on-device run-time. We are releasing a benchmarker APK that allows developers
to measure latency on their own devices.

### Obtain the sample models

The test data (models and images) should be downloaded automatically for you by
Bazel. In case they are not, you can manually install them as below.

Note: all commands should be called from your tensorflow installation folder
(under this folder you should find `tensorflow/lite`).

*   Download the
    [testdata package](https://storage.googleapis.com/download.tensorflow.org/data/ovic_2019_04_30.zip):

```sh
curl -L https://storage.googleapis.com/download.tensorflow.org/data/ovic_2019_04_30.zip -o /tmp/ovic.zip
unzip -j /tmp/ovic.zip -d tensorflow/lite/java/ovic/src/testdata/
```
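
If the download succeeded, the models and test images should now be in the
testdata folder (a quick check; the exact file list comes from the package):

```sh
ls tensorflow/lite/java/ovic/src/testdata/
```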

### Run tests

You can run the tests with Bazel as below. This helps to ensure that the
installation is correct.

```sh
bazel test //tensorflow/lite/java/ovic:OvicClassifierTest --cxxopt=-Wno-all --test_output=all
bazel test //tensorflow/lite/java/ovic:OvicDetectorTest --cxxopt=-Wno-all --test_output=all
```

### Test your submissions

Once you have a submission that follows the instructions from the
[competition site](https://lpcv.ai/2020CVPR/ovic-track), you can verify it in
two ways:

#### Validate using randomly generated images

You can call the validator binary below to verify that your model fits the
format requirements. This often helps you to catch size mismatches (e.g. output
for classification should be [1, 1001] instead of [1,1,1,1001]). Say the
submission file is located at `/path/to/my_model.lite`, then call:

```sh
bazel build //tensorflow/lite/java/ovic:ovic_validator --cxxopt=-Wno-all
bazel-bin/tensorflow/lite/java/ovic/ovic_validator /path/to/my_model.lite classify
```

If successful, you should see:

```
Successfully validated /path/to/my_model.lite.
```

To validate detection models, use the same command but provide "detect" as the
second argument instead of "classify".
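
For example, reusing the hypothetical submission path from above:

```sh
bazel-bin/tensorflow/lite/java/ovic/ovic_validator /path/to/my_model.lite detect
```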

#### Test that the model produces sensible outcomes

You can go a step further to verify that the model produces results as expected.
This helps you catch bugs during TFLite conversion (e.g. using the wrong mean
and std values).
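
For instance, if you convert with the `tflite_convert` CLI, double-check that
the mean/std flags match the preprocessing your model was trained with. A
hypothetical quantized-model conversion (all paths, array names, and values
below are placeholders):

```sh
tflite_convert \
  --graph_def_file=/path/to/frozen_graph.pb \
  --output_file=/path/to/my_model.lite \
  --input_arrays=input \
  --output_arrays=output \
  --inference_type=QUANTIZED_UINT8 \
  --mean_values=128 \
  --std_dev_values=128  # wrong mean/std here silently skews predictions
```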

*   Move your submission to the testdata folder:

```sh
cp /path/to/my_model.lite tensorflow/lite/java/ovic/src/testdata/
```

*   Resize the test image to the resolutions that are expected by your
    submission:

The test images can be found at
`tensorflow/lite/java/ovic/src/testdata/test_image_*.jpg`. You may reuse these
images if your image resolutions are 128x128 or 224x224.
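
If your model expects a different input resolution, one way to prepare a
matching test image is with ImageMagick (an assumption; any image tool works,
and the 160x160 size and `my_test_image.jpg` name are placeholders):

```sh
# "!" forces the exact size, ignoring aspect ratio.
convert tensorflow/lite/java/ovic/src/testdata/test_image_224.jpg \
  -resize '160x160!' tensorflow/lite/java/ovic/src/testdata/my_test_image.jpg
```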

*   Add your model and test image to the BUILD rule at
    `tensorflow/lite/java/ovic/src/testdata/BUILD`:

```JSON
filegroup(
    name = "ovic_testdata",
    srcs = [
        ...
        "my_model.lite",
        "my_test_image.jpg",
    ],
)
```

*   For classification models, modify `OvicClassifierTest.java`:

    *   change `TEST_IMAGE_PATH` to `my_test_image.jpg`.

    *   change either `FLOAT_MODEL_PATH` or `QUANTIZED_MODEL_PATH` to
        `my_model.lite` depending on whether your model runs inference in float
        or
        [8-bit](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize).

    *   change `TEST_IMAGE_GROUNDTRUTH` (ImageNet class ID) to be consistent
        with your test image.

*   For detection models, modify `OvicDetectorTest.java`:

    *   change `TEST_IMAGE_PATH` to `my_test_image.jpg`.
    *   change `MODEL_PATH` to `my_model.lite`.
    *   change `GROUNDTRUTH` (COCO class ID) to be consistent with your test
        image.

Now you can run the bazel tests to catch any runtime issues with the
submission.
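
These are the same commands as in [Run tests](#run-tests), now also covering
your model and image:

```sh
bazel test //tensorflow/lite/java/ovic:OvicClassifierTest --cxxopt=-Wno-all --test_output=all
bazel test //tensorflow/lite/java/ovic:OvicDetectorTest --cxxopt=-Wno-all --test_output=all
```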

Note: Please make sure that your submission passes the test. If a submission
fails to pass the test it will not be processed by the submission server.

## Measure on-device latency

We provide two ways to measure the on-device latency of your submission. The
first is through our competition server, which is reliable and repeatable, but
is limited to a few trials per day. The second is through the benchmarker APK,
which requires a device and may not be as accurate as the server, but has a fast
turn-around and no access limitations. We recommend that participants use the
benchmarker APK for early development, and reserve the competition server for
evaluating promising submissions.

### Running the benchmarker app

Make sure that you have followed the instructions in
[Test your submissions](#test-your-submissions) to add your model to the
testdata folder and to the corresponding build rules.

Modify `tensorflow/lite/java/ovic/demo/app/OvicBenchmarkerActivity.java`:

*   Add your model to the benchmarker APK by changing `modelPath` and
    `testImagePath` to your submission and test image.

```
if (benchmarkClassification) {
  ...
} else { // Benchmarking detection.
...
```

If you are adding a detection model, simply modify `modelPath` and
`testImagePath` in the else block above.

*   Adjust the benchmark parameters when needed:

You can change the length of each experiment, and the processor affinity below.
`BIG_CORE_MASK` is an integer whose binary encoding represents the set of used
cores. This number is phone-specific. For example, Pixel 4 has 8 cores: the 4
little cores are represented by the 4 less significant bits, and the 4 big cores
by the 4 more significant bits. Therefore a mask value of 16, or in binary
`00010000`, represents using only the first big core. The mask 32, or in binary
`00100000`, represents using only the second big core, and should behave like
mask 16 because the big cores are interchangeable.

```
private static final double WALL_TIME = 3000;
/** Maximum number of iterations in each benchmarking experiment. */
private static final int MAX_ITERATIONS = 100;
/** Mask for binding to a single big core. Pixel 1 (4), Pixel 4 (16). */
private static final int BIG_CORE_MASK = 16;
```
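
To sanity-check a mask value, shell arithmetic is enough (assuming the
4-little/4-big bit layout described above):

```sh
echo $((1 << 4))  # 16  -> first big core only
echo $((1 << 5))  # 32  -> second big core only
echo $((0xF0))    # 240 -> all four big cores
```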

Then build and install the benchmarker app:

```sh
bazel build -c opt --cxxopt=-Wno-all //tensorflow/lite/java/ovic/demo/app:ovic_benchmarker_binary
adb install -r bazel-bin/tensorflow/lite/java/ovic/demo/app/ovic_benchmarker_binary.apk
```
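
To confirm the installation you can list the app's package (a sketch; the
package name is an assumption, check the demo app's build file for the actual
value):

```sh
adb shell pm list packages | grep -i ovic
```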

Start the app and pick a task by clicking either the `CLF` button for
classification or the `DET` button for detection. The button should turn bright
green, signaling that the experiment is running. The benchmarking results will
be displayed after about the `WALL_TIME` you specified above. For example:

```
my_model.lite: Average latency=158.6ms after 20 runs.
```

### Sample latencies

Note: the benchmarking results can be quite different depending on the
background processes running on the phone. A few things that help stabilize the
app's readings are placing the phone on a cooling plate, restarting the phone,
and shutting down internet access.
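
If the phone is attached over `adb`, part of that can be scripted (a sketch;
the airplane-mode settings key varies across Android versions and may need a
manual toggle to take effect):

```sh
adb reboot                                        # restart the phone
adb shell settings put global airplane_mode_on 1  # request airplane mode
```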

Classification Model | Pixel 1 | Pixel 2 | Pixel 4
-------------------- | :-----: | :-----: | :-----:
float_model.lite     | 97      | 113     | 37
quantized_model.lite | 73      | 61      | 13
low_res_model.lite   | 3       | 3       | 1

Detection Model        | Pixel 2 | Pixel 4
---------------------- | :-----: | :-----:
detect.lite            | 248     | 82
quantized_detect.lite  | 59      | 17
quantized_fpnlite.lite | 96      | 29

All latency numbers are in milliseconds. The Pixel 1 and Pixel 2 latency numbers
are measured on `Oct 17 2019` (Github commit hash
[I05def66f58fa8f2161522f318e00c1b520cf0606](https://github.com/tensorflow/tensorflow/commit/4b02bc0e0ff7a0bc02264bc87528253291b7c949#diff-4e94df4d2961961ba5f69bbd666e0552)).

The Pixel 4 latency numbers are measured on `Apr 14 2020` (Github commit hash
[4b2cb67756009dda843c6b56a8b320c8a54373e0](https://github.com/tensorflow/tensorflow/commit/4b2cb67756009dda843c6b56a8b320c8a54373e0)).

Since Pixel 4 has excellent support for 8-bit quantized models, we strongly
recommend checking out the
[Post-Training Quantization tutorial](https://www.tensorflow.org/lite/performance/post_training_quantization).

The detection models above are all single-shot models (i.e. no object proposal
generation) using TfLite's *fast* version of Non-Max-Suppression (NMS). The fast
NMS is significantly faster than the regular NMS (used by the ObjectDetectionAPI
in training) at the expense of about 1% mAP for the listed models.

### Latency table

We have compiled a latency table for common neural network operators such as
convolutions, separable convolutions, and matrix multiplications. The table of
results is available here:

*   https://storage.cloud.google.com/ovic-data/
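
An earlier revision of this page linked the table directly as
`latency_table.csv`; if that object is still present you can fetch it with curl
(the bucket may require a signed-in Google account):

```sh
curl -L https://storage.cloud.google.com/ovic-data/latency_table.csv -o /tmp/latency_table.csv
```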

The results were generated by creating a small network containing a single
operation, and running the op under the test harness. For more details see the
NetAdapt paper<sup>1</sup>. We plan to expand the table regularly as we test
with newer OS releases and updates to Tensorflow Lite.

### Sample benchmarks

Below are the baseline models (MobileNetV2, MnasNet, and MobileNetV3) used to
compute the reference accuracy for ImageNet classification. The naming
convention of the models is `[precision]_[model class]_[resolution]_[multiplier]`.
Pixel 2 latency (ms) is measured on a single Pixel 2 big core using the
competition server on `Oct 17 2019`, while Pixel 4 latency (ms) is measured on a
single Pixel 4 big core using the competition server on `Apr 14 2020`. You can
find these models on TFLite's
[hosted model page](https://www.tensorflow.org/lite/guide/hosted_models#image_classification).

Model                               | Pixel 2 | Pixel 4 | Top-1 Accuracy
:---------------------------------: | :-----: | :-----: | :------------:
quant_mobilenetv2_96_35             | 4       | 1       | 0.420
quant_mobilenetv2_96_50             | 5       | 1       | 0.478
quant_mobilenetv2_128_35            | 6       | 2       | 0.474
quant_mobilenetv2_128_50            | 8       | 2       | 0.546
quant_mobilenetv2_160_35            | 9       | 2       | 0.534
quant_mobilenetv2_96_75             | 8       | 2       | 0.560
quant_mobilenetv2_96_100            | 10      | 3       | 0.579
quant_mobilenetv2_160_50            | 12      | 3       | 0.583
quant_mobilenetv2_192_35            | 12      | 3       | 0.557
quant_mobilenetv2_128_75            | 13      | 3       | 0.611
quant_mobilenetv2_192_50            | 16      | 4       | 0.616
quant_mobilenetv2_128_100           | 16      | 4       | 0.629
quant_mobilenetv2_224_35            | 17      | 5       | 0.581
quant_mobilenetv2_160_75            | 20      | 5       | 0.646
float_mnasnet_96_100                | 21      | 7       | 0.625
quant_mobilenetv2_224_50            | 22      | 6       | 0.637
quant_mobilenetv2_160_100           | 25      | 6       | 0.674
quant_mobilenetv2_192_75            | 29      | 7       | 0.674
quant_mobilenetv2_192_100           | 35      | 9       | 0.695
float_mnasnet_224_50                | 35      | 12      | 0.679
quant_mobilenetv2_224_75            | 39      | 10      | 0.684
float_mnasnet_160_100               | 45      | 15      | 0.706
quant_mobilenetv2_224_100           | 48      | 12      | 0.704
float_mnasnet_224_75                | 55      | 18      | 0.718
float_mnasnet_192_100               | 62      | 20      | 0.724
float_mnasnet_224_100               | 84      | 27      | 0.742
float_mnasnet_224_130               | 126     | 40      | 0.758
float_v3-small-minimalistic_224_100 | -       | 5       | 0.620
quant_v3-small_224_100              | -       | 5       | 0.641
float_v3-small_224_75               | -       | 5       | 0.656
float_v3-small_224_100              | -       | 7       | 0.677
quant_v3-large_224_100              | -       | 12      | 0.728
float_v3-large_224_75               | -       | 15      | 0.735
float_v3-large-minimalistic_224_100 | -       | 17      | 0.722
float_v3-large_224_100              | -       | 20      | 0.753

### References

1.  **NetAdapt: Platform-Aware Neural Network Adaptation for Mobile
    Applications.** Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go,
    Mark Sandler, Vivienne Sze, and Hartwig Adam. In Proceedings of the European
    Conference on Computer Vision (ECCV), pp. 285-300. 2018<br />
    [[link]](https://arxiv.org/abs/1804.03230) arXiv:1804.03230, 2018.