Out-of-process conversion was a workaround for the legacy converter,
which would generally crash the process when conversion failed. However,
out-of-process conversion also adds a good deal of complexity, so avoid
it when using the new conversion backend.
PiperOrigin-RevId: 312142994
Change-Id: I7ddc83df99ccf24be6e15f46d6a116dce8321933
This prepares the 16-bit activation quantization release. The data type
specified by this flag is applied only to the activations.
PiperOrigin-RevId: 311478782
Change-Id: I5f63f0508011cc0b1b47a0debb35c17d3284eae9
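As a rough illustration of what 16-bit activation quantization implies, here is a minimal sketch of a symmetric int16 scheme (zero point fixed at 0, the observed range mapped onto [-32767, 32767]). The function names are illustrative, not the converter's API:

```python
def int16_activation_qparams(max_abs):
    """Symmetric int16 quantization: zero point is fixed at 0 and the
    scale maps [-max_abs, max_abs] onto [-32767, 32767]."""
    scale = max_abs / 32767.0
    return scale, 0  # (scale, zero_point)

def quantize(x, scale, zero_point):
    # Round to the nearest step and clamp to the int16 range.
    q = round(x / scale) + zero_point
    return max(-32768, min(32767, q))

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale
```

A quantize/dequantize round trip then reproduces the input up to one scale step, which is why the wider 16-bit activations lose much less precision than int8.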
- Verify the given input and output names in the tf.entry_function in MLIR.
- Use input and output names with a colon in the saved model path.
PiperOrigin-RevId: 305810470
Change-Id: Id7f56ba216db2b60e6e1a11dbbcc0761a66b4635
If _experimental_new_quantizer is enabled, the converter calls the
calibration-only API of the post-training quantization calibrator, and then
invokes the MLIR model quantizer API to quantize the model.
The MLIR model quantizer is added via the TOCO pybind to reduce the binary size.
PiperOrigin-RevId: 305160381
Change-Id: Ib9f49dd36f3533f2e41b0565bbecb6591452c60c
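The calibrate-then-quantize split can be sketched in miniature as follows. The names are illustrative, and the real calibrator and MLIR quantizer operate on the TFLite model graph rather than on raw lists:

```python
def calibrate(representative_dataset):
    """Calibration-only pass: record the observed min/max over the
    representative data without rewriting the model (the calibrator's role)."""
    lo, hi = float("inf"), float("-inf")
    for batch in representative_dataset:
        lo = min(lo, min(batch))
        hi = max(hi, max(batch))
    return lo, hi

def quantize_params_int8(lo, hi):
    """Derive asymmetric int8 (scale, zero_point) from a calibrated range
    (the quantizer's role).  The range must include zero so that zero is
    exactly representable."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255.0
    zero_point = round(-128 - lo / scale)
    return scale, zero_point
```

Separating the two phases is what lets the calibration step stay in the existing post-training-quantization code while the actual rewrite moves to the MLIR quantizer.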
If _experimental_new_quantizer is enabled, the converter calls the
calibration-only API of the post-training quantization calibrator, and then
invokes the MLIR model quantizer API to quantize the model.
The MLIR model quantizer is added via the TOCO pybind to reduce the binary size.
PiperOrigin-RevId: 304482662
Change-Id: I1039cdc3e7f8fb244f9c2da73d96179f2a4f4985
The V1 converter requires inference_type, inference_input_type,
and quantized_input_stats for conversion, whereas the V2
converter uses fake-quant (FQ) ops inside the graph for conversion
and input information.
This CL does the following:
1. Move the input_stats check from the convert code into
the V1 converter, since it is now specific to it.
2. Improve the condition checking for post-training calibrate
vs. weight-only quantize vs. training-time quantize.
3. Actually handle training-time quantize by passing the
necessary flags to TOCO.
Important to note: this approach leaves the option for both
QAT and post-training calibrate quantize to be applied together
in the same conversion.
PiperOrigin-RevId: 298533518
Change-Id: I48ec5b8db8f20242522ca7af70dcbe339b79aa2f
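The condition checking in point 2 might be sketched as below. The mode names and flag shapes are illustrative stand-ins for the converter's internal logic; the sketch also shows how QAT and post-training calibration can coexist in one conversion:

```python
def quantize_modes(optimizations, representative_dataset, graph_has_fake_quant):
    """Classify which quantization paths apply for a conversion.
    Returns a set, because training-time (QAT) quantize and post-training
    calibration are not mutually exclusive."""
    modes = set()
    if not optimizations:
        return modes  # no quantization requested at all
    if graph_has_fake_quant:
        modes.add("training_time")  # QAT: FQ ops in the graph carry the ranges
    if representative_dataset is not None:
        modes.add("post_training_calibrate")
    if not modes:
        modes.add("weight_only")  # optimizations set, but nothing to calibrate
    return modes
```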
If either inference_type or inference_input_type is set to int8/uint8 and it is
not post-training quantization, then quantized_input_stats is required.
PiperOrigin-RevId: 291441023
Change-Id: Iaee998f10dc90c66ddafc392de250d0f9234388c
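The check described above can be sketched as a small validation function (illustrative only; the real check lives inside the converter's convert path):

```python
def check_quant_stats(inference_type, inference_input_type,
                      post_training_quantize, quantized_input_stats):
    """Raise if quantized inference is requested without the stats needed
    to derive input quantization parameters."""
    quantized_types = ("int8", "uint8")
    needs_stats = ((inference_type in quantized_types
                    or inference_input_type in quantized_types)
                   and not post_training_quantize)
    if needs_stats and not quantized_input_stats:
        raise ValueError(
            "quantized_input_stats is required when inference_type or "
            "inference_input_type is int8/uint8 and post-training "
            "quantization is not enabled.")
```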
This CL made the following changes:
- add int8 to all the related argument comments
- when inference_type is int8, Grappler optimization is disabled
- use inference_type, instead of inference_input_type, to verify that quant stats are specified when it is not post-training quantization
PiperOrigin-RevId: 285229735
Change-Id: Ie8da5c4d79fb60100c1041bd4573fe603cd304e6
quantized_input_stats is required to set quantization parameters
for quantized inputs to a TFLite model. Currently, it is
only allowed for uint8, but we need to enable it for
int8 as well to support the new quantization scheme.
PiperOrigin-RevId: 278971599
Change-Id: I035ec4dc1529575dcdf59fe8d132248ac7496fc6
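For reference, quantized_input_stats is a (mean, std_dev) pair per input, defined so that real_value = (quantized_value - mean) / std_dev. Under that definition the corresponding (scale, zero_point) fall out directly; this sketch assumes only that documented relationship:

```python
def stats_to_qparams(mean, std_dev):
    """Convert a quantized_input_stats entry (mean, std_dev) into the
    (scale, zero_point) pair used by the quantized input tensor:
    real = (quant - mean) / std_dev  =>  scale = 1/std_dev, zero_point = mean."""
    scale = 1.0 / std_dev
    zero_point = int(round(mean))
    return scale, zero_point
```

For example, a uint8 input covering real values [0, 1] uses stats (0, 255); the same real range on an int8 input shifts the zero point rather than the scale, which is why the stats mechanism extends to int8 unchanged.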
This CL added debug information support for the nodes in frozen graphs,
which are GraphDefs that will be sent to the new tf-tflite converter. A GraphDef
only serializes the node name from the original Graph object, so the whole
stack trace defining the node is lost. To collect the stack trace (debug
information) for the nodes in the GraphDef, this CL makes a few changes:
- For TFLiteConverter (v1), an experimental function, which creates GraphDebugInfo
from the original Graph object, is passed to the converter constructor
in addition to the GraphDef, so we can retrieve the stack trace for the nodes
in the GraphDef. (TFLiteConverterV2 isn't an issue because the function object
is passed to the constructor.)
- Propagate the original node name in the Grappler function-inlining pass, so
the original node name is stored in the GraphDef when a node is inlined; we
can then use the stored name to look up the stack trace in the original graph.
- When a node name is looked up in the original graph, we need to consider the
function library as well. For function libraries created by @tf.function
and @defun, we use the sub-graphs in the original graph. However, functions
created by @Defun only have FunctionDefs for the sub-graphs, so they aren't
supported by this CL.
PiperOrigin-RevId: 253932770
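The lookup described in the last two bullets can be sketched with plain dictionaries. This is an illustrative data layout, not the actual GraphDebugInfo representation; the naming convention for inlined nodes is likewise an assumption for the sketch:

```python
def lookup_stack_trace(node_name, graph_traces, func_traces):
    """Resolve a GraphDef node back to its defining stack trace.
    graph_traces: traces for nodes defined directly in the original graph.
    func_traces:  traces per function sub-graph (from @tf.function / @defun).
    An inlined node is assumed to carry a propagated name 'func/inner'."""
    if node_name in graph_traces:
        return graph_traces[node_name]
    if "/" in node_name:
        # Inlined node: split off the function name and search its sub-graph.
        func, inner = node_name.split("/", 1)
        return func_traces.get(func, {}).get(inner)
    return None  # e.g. @Defun bodies, which only exist as FunctionDefs
```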