# TensorFlow Lite Quantization Debugger
[TOC]
## Overview
When a quantized model is produced, tedious and manual custom code is usually required to debug it in order to:

- Verify that the quantized model is working as expected (spot errors, check accuracy, etc.).
- Compare the quantized model with the original float model.

This is now feasible using the TensorFlow Lite Quantization Debugger, as shown below.

Note: Currently, this workflow is only supported for full integer (int8) quantization. The debug model produced with this workflow should be used for debugging purposes only, not for inference.
## Analysis with quantized model only

### Produce a debug model
Modify the TFLite full integer (int8) quantization steps as shown below to produce a debug model (used for debugging purposes only, and not for inference).
#### How does this work?
With the help of the MLIR quantizer's debug mode feature, the debug model produced has both the original float operators (or ops) and the quantized ops. Additionally, `NumericVerify` ops are added to compare the outputs of the original float and quantized ops and to collect statistics. Each `NumericVerify` op has a name in the format `NumericVerify/{original tensor name}:{original tensor id}`.
```python
# for mlir_quantize
from tensorflow.lite.python import convert

# `converter` is assumed to be a tf.lite.TFLiteConverter set up as usual
# (e.g. created with tf.lite.TFLiteConverter.from_keras_model).
# Set full-integer quantization parameters as usual.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = calibration_gen

# Create a TFLite model with the new quantizer and NumericVerify ops. Rather
# than calling convert() only, calibrate the model first and call
# `mlir_quantize` to run the actual quantization, with `enable_numeric_verify`
# set to `True`.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter._experimental_calibrate_only = True
calibrated = converter.convert()
quant_debug_model = convert.mlir_quantize(calibrated, enable_numeric_verify=True)
```
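Conceptually, each `NumericVerify` op dequantizes the output of a quantized op and compares it element-wise against the output of the corresponding float op; the statistics shown later are aggregations over those differences. The following is a rough NumPy sketch of that idea (an illustration only, not the actual kernel implementation):

```python
import numpy as np

def numeric_verify_stats(float_output, quant_output, scale, zero_point):
  """Illustrative re-implementation of the default NumericVerify statistics."""
  # Dequantize the int8 output, then compare against the float reference.
  dequantized = (quant_output.astype(np.float32) - zero_point) * scale
  diffs = dequantized - float_output
  return {
      'max_abs_error': np.max(np.abs(diffs)),
      'mean_error': np.mean(diffs),
      'mean_square_error': np.mean(diffs ** 2),
      'num_elements': float(diffs.size),
      'stddev': np.std(diffs),
  }
```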
### Run debugger with debug model
Initialize the debugger with the debug model. This can be done in two ways.
```python
from tensorflow.lite.experimental.quantization_debugger import debugger

# `debug_dataset` accepts the same type as `converter.representative_dataset`.
quant_debugger = debugger.QuantizationDebugger(
    quant_debug_model_content=quant_debug_model,
    debug_dataset=data_gen)

# OR

quant_debugger = debugger.QuantizationDebugger(
    quant_debug_model_path='/path/to/debug_model.tflite',
    debug_dataset=data_gen)

quant_debugger.run()
```
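Note that `data_gen` (like `calibration_gen` above) is just a representative dataset generator: a callable that yields lists of input arrays. A minimal sketch is shown below; the input shape and dtype are assumptions for illustration and must match your model:

```python
import numpy as np

def data_gen():
  # Yield one sample per iteration, as a list of input tensors. The shape
  # [1, 28, 28] and float32 dtype here are placeholders for your model's input.
  for _ in range(100):
    yield [np.random.rand(1, 28, 28).astype(np.float32)]
```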
### Inspect statistics
When you call `quant_debugger.run()`, `quant_debugger.layer_statistics` is filled with aggregated statistics for each `NumericVerify` op. Some metrics (e.g. standard deviation, mean squared error) are calculated by default.
Example output
```python
# Each value in `quant_debugger.layer_statistics` is a defaultdict; convert it
# to a dict for readable output.
import pprint

for layer_name, metrics in quant_debugger.layer_statistics.items():
  print(layer_name)
  pprint.pprint(dict(metrics))
```
```
# ...
NumericVerify/sequential/dense/MatMul;sequential/dense/BiasAdd3:77
{'max_abs_error': 0.05089309,
 'mean_error': -0.00017149668,
 'mean_square_error': 0.00040816222,
 'num_elements': 256.0,
 'stddev': 0.02009948}
NumericVerify/sequential/dense_1/MatMul;sequential/dense_1/BiasAdd3:81
{'max_abs_error': 0.09744112,
 'mean_error': 0.0048679365,
 'mean_square_error': 0.0036721828,
 'num_elements': 10.0,
 'stddev': 0.055745363}
NumericVerify/Identity2:85
{'max_abs_error': 0.0036417267,
 'mean_error': -0.00068773015,
 'mean_square_error': 3.439951e-06,
 'num_elements': 10.0,
 'stddev': 0.0016223773}
# ...
```
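Since `quant_debugger.layer_statistics` is a plain mapping from layer name to metrics, suspicious layers can also be picked out programmatically, for example by sorting on one of the default metrics:

```python
# Rank layers by mean squared error to find the most lossy ones.
suspects = sorted(
    quant_debugger.layer_statistics.items(),
    key=lambda item: item[1]['mean_square_error'],
    reverse=True)
for layer_name, metrics in suspects[:5]:
  print(layer_name, metrics['mean_square_error'])
```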
#### Adding custom metrics
More metrics can be added by passing `QuantizationDebugOptions` to the initializer. For example, if you want to add mean absolute error, use the following snippet.
```python
import numpy as np

debug_options = debugger.QuantizationDebugOptions(
    layer_debug_metrics={
        'mean_abs_error': lambda diffs: np.mean(np.abs(diffs))
    })

quant_debugger = debugger.QuantizationDebugger(
    quant_debug_model_content=quant_debug_model,
    debug_dataset=data_gen,
    debug_options=debug_options
)
quant_debugger.run()
```
Now `quant_debugger.layer_statistics` includes the mean absolute error for each layer.
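For example, reusing the printing loop from earlier, the new metric appears alongside the default ones:

```python
for layer_name, metrics in quant_debugger.layer_statistics.items():
  print(layer_name, metrics['mean_abs_error'])
```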
## Analysis with float and quantized models
In addition to single-model analysis, the outputs of the original float model and the quantized model can be compared when both models are given. This can be done by providing a float model, and metrics to compare the outputs. The metric can be argmax comparison for classification models, but for more complex models like detection, more complicated logic should be given.
```python
# Functions for `model_debug_metrics` get all output tensors from the float and
# quantized models, and return a single metric value.
debug_options = debugger.QuantizationDebugOptions(
    model_debug_metrics={
        'argmax_accuracy': lambda f, q: np.argmax(f[0]) == np.argmax(q[0])
    })

float_model = converter.convert()  # converted without any optimizations.
quant_debugger = debugger.QuantizationDebugger(
    quant_debug_model_content=quant_debug_model,
    float_model_content=float_model,  # can pass `float_model_path` instead.
    debug_dataset=data_gen,
    debug_options=debug_options
)
quant_debugger.run()
```
The result is a single number per metric, so it's easier to inspect.
```
>>> quant_debugger.model_statistics
{'argmax_accuracy': 0.89}
```
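Layer metrics and model metrics are both options on `QuantizationDebugOptions`, so they can presumably be combined in a single debugger run; a sketch reusing the metrics defined above:

```python
debug_options = debugger.QuantizationDebugOptions(
    layer_debug_metrics={
        'mean_abs_error': lambda diffs: np.mean(np.abs(diffs))
    },
    model_debug_metrics={
        'argmax_accuracy': lambda f, q: np.argmax(f[0]) == np.argmax(q[0])
    })
```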
## Advanced usage: Export stats to csv, and import to pandas
The `quant_debugger.layer_statistics_dump` function accepts a file-like object and exports layer statistics to CSV. The output can be imported into other tools like pandas for further processing. The exported data also has the name of the op, the originating tensor ID, and the quantization parameters (scales and zero points) for each quantized layer.
Note: Scales and zero points are lists, and are imported into pandas as text by default. They need to be parsed before further processing.
```python
import pandas as pd
import yaml  # used to parse lists

with open('/path/to/stats.csv', 'w') as f:
  quant_debugger.layer_statistics_dump(f)

data = pd.read_csv(
    '/path/to/stats.csv',
    converters={'scales': yaml.safe_load, 'zero_points': yaml.safe_load})
```
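From here, typical pandas operations apply. For instance, the parsed `scales` lists can be pulled into a numeric column for quick filtering and sorting (column names other than `scales` and `zero_points` are not guaranteed here, so inspect `data.columns` first):

```python
# For per-tensor quantized layers the scales list usually holds a single value;
# extract it into its own numeric column for easier inspection.
data['first_scale'] = data['scales'].apply(lambda s: s[0] if s else float('nan'))
print(data.head())
```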