After quantization-aware training launch: documentation updates.

PiperOrigin-RevId: 305804204
Change-Id: I4860198347ed3135128ec9b17d9153e70a4e81ee
Author: Alan Chiao <2020-04-09 18:41:33 -07:00>, committed by: TensorFlower Gardener
Commit: 734594bee9 (parent 993a6e4f0e)


@@ -86,12 +86,12 @@ a smaller model size and faster computation.
 The following types of quantization are available in TensorFlow Lite:
 
 Technique | Data requirements | Size reduction | Accuracy | Supported hardware
--------------------------------------------------------------------------------------------------------------- | -------------------------------- | -------------- | --------------------------- | ------------------
+------------------------------------------------------------------------------------------------------- | -------------------------------- | -------------- | --------------------------- | ------------------
 [Post-training float16 quantization](post_training_float16_quant.ipynb) | No data | Up to 50% | Insignificant accuracy loss | CPU, GPU
 [Post-training dynamic range quantization](post_training_quant.ipynb) | No data | Up to 75% | Accuracy loss | CPU
 [Post-training integer quantization](post_training_integer_quant.ipynb) | Unlabelled representative sample | Up to 75% | Smaller accuracy loss | CPU, EdgeTPU, Hexagon DSP
-[Quantization-aware training](https://github.com/tensorflow/tensorflow/tree/r1.13/tensorflow/contrib/quantize) | Labelled training data | Up to 75% | Smallest accuracy loss | CPU, EdgeTPU, Hexagon DSP
+[Quantization-aware training](http://www.tensorflow.org/model_optimization/guide/quantization/training) | Labelled training data | Up to 75% | Smallest accuracy loss | CPU, EdgeTPU, Hexagon DSP
 
 Below are the latency and accuracy results for post-training quantization and
 quantization-aware training on a few models. All latency numbers are measured on
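
The three post-training options in the table above share the standard `tf.lite.TFLiteConverter` flow. A minimal sketch of all three, assuming a SavedModel at a placeholder path and a hypothetical 224x224x3 input shape:

```python
import numpy as np
import tensorflow as tf

# Placeholder path; substitute your own SavedModel directory.
SAVED_MODEL_DIR = "/tmp/saved_model"

# Dynamic range quantization: no data required.
converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_range_model = converter.convert()

# Float16 quantization: no data required; weights are stored as float16.
converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
float16_model = converter.convert()

# Integer quantization: calibrate activation ranges with a small
# unlabelled representative sample (the shape here is an assumption).
def representative_dataset():
  for _ in range(100):
    yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
integer_model = converter.convert()
```
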
@@ -144,11 +144,9 @@ broadly applicable and does not require training data.
 For cases where the accuracy and latency targets are not met, or hardware
 accelerator support is important,
-[quantization-aware training](https://github.com/tensorflow/tensorflow/tree/r1.13/tensorflow/contrib/quantize){:.external}
+[quantization-aware training](https://www.tensorflow.org/model_optimization/guide/quantization/training){:.external}
 is the better option. See additional optimization techniques under the
 [Tensorflow Model Optimization Toolkit](https://www.tensorflow.org/model_optimization).
 
-Note: Quantization-aware training supports a subset of convolutional neural network architectures.
-
 If you want to further reduce your model size, you can try [pruning](#pruning)
 prior to quantizing your models.
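
The quantization-aware training page linked above is part of the TensorFlow Model Optimization Toolkit. A rough sketch of that workflow, where the toy Keras model and the `x_train`/`y_train` data are assumptions for illustration:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Toy Keras model (an assumption for illustration).
base_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(10),
])

# Insert fake-quantization ops so the weights learn to tolerate
# quantization error during training.
qat_model = tfmot.quantization.keras.quantize_model(base_model)

qat_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
# x_train / y_train: labelled training data (assumed to exist).
# qat_model.fit(x_train, y_train, epochs=1)

# Convert the quantization-aware model with the usual TFLite flow.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
```

Pruning can be applied in the same Keras workflow before this step (e.g. with `tfmot.sparsity.keras.prune_low_magnitude`), matching the suggestion above to prune prior to quantizing.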