Add Python examples for decode_json_example.

People keep mistaking this for a general-purpose JSON parsing op.

PiperOrigin-RevId: 342646224
Change-Id: I8b82c87ff185e681785c97dffb53ca38017dd1f8
Mark Daoust 2020-11-16 08:55:45 -08:00 committed by TensorFlower Gardener
parent 1ba764b368
commit 09c684a7a8
3 changed files with 95 additions and 13 deletions

@ -16,11 +16,14 @@ END
}
summary: "Convert JSON-encoded Example records to binary protocol buffer strings."
description: <<END
This op translates a tensor containing Example records, encoded using
the [standard JSON
mapping](https://developers.google.com/protocol-buffers/docs/proto3#json),
into a tensor containing the same records encoded as binary protocol
buffers. The resulting tensor can then be fed to any of the other
Example-parsing ops.
Note: This is **not** a general purpose JSON parsing op.
This op converts JSON-serialized
`tf.train.Example` (created with `json_format.MessageToJson`, following the
[standard JSON mapping](https://developers.google.com/protocol-buffers/docs/proto3#json))
to a binary-serialized `tf.train.Example` (equivalent to
`Example.SerializeToString()`) suitable for conversion to tensors with
`tf.io.parse_example`.
END
}

@ -1,10 +1,4 @@
op {
graph_op_name: "DecodeJSONExample"
endpoint {
name: "io.decode_json_example"
}
endpoint {
name: "decode_json_example"
deprecation_version: 2
}
visibility: HIDDEN
}

@ -1056,3 +1056,88 @@ def _assert_scalar(value, name):
return value
else:
raise ValueError("Input %s must be a scalar" % name)
@tf_export("io.decode_json_example",
v1=["decode_json_example", "io.decode_json_example"])
def decode_json_example(json_examples, name=None):
r"""Convert JSON-encoded Example records to binary protocol buffer strings.
Note: This is **not** a general purpose JSON parsing op.
This op converts a JSON-serialized `tf.train.Example` (for example, one created with
`json_format.MessageToJson`, following the
[standard JSON mapping](
https://developers.google.com/protocol-buffers/docs/proto3#json))
to a binary-serialized `tf.train.Example` (equivalent to
`Example.SerializeToString()`) suitable for conversion to tensors with
`tf.io.parse_example`.
Here is a `tf.train.Example` proto:
>>> example = tf.train.Example(
... features=tf.train.Features(
... feature={
... "a": tf.train.Feature(
... int64_list=tf.train.Int64List(
... value=[1, 1, 3]))}))
Here it is converted to JSON:
>>> from google.protobuf import json_format
>>> example_json = json_format.MessageToJson(example)
>>> print(example_json)
{
"features": {
"feature": {
"a": {
"int64List": {
"value": [
"1",
"1",
"3"
]
}
}
}
}
}
This op converts the above JSON string to a binary proto:
>>> example_binary = tf.io.decode_json_example(example_json)
>>> example_binary.numpy()
b'\n\x0f\n\r\n\x01a\x12\x08\x1a\x06\x08\x01\x08\x01\x08\x03'
The op works on string tensors of any shape:
>>> tf.io.decode_json_example([
... [example_json, example_json],
... [example_json, example_json]]).shape.as_list()
[2, 2]
The resulting binary string is equivalent to `Example.SerializeToString()`,
and can be converted to Tensors using `tf.io.parse_example` and related
functions:
>>> tf.io.parse_example(
... serialized=[example_binary.numpy(),
... example.SerializeToString()],
... features = {'a': tf.io.FixedLenFeature(shape=[3], dtype=tf.int64)})
{'a': <tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 1, 3],
[1, 1, 3]])>}
Args:
json_examples: A string tensor containing JSON-serialized `tf.Example`
protos.
name: A name for the op.
Returns:
A string Tensor containing the binary-serialized `tf.Example` protos.
Raises:
`tf.errors.InvalidArgumentError`: If the JSON could not be converted to a
`tf.Example`.
"""
return gen_parsing_ops.decode_json_example(json_examples, name=name)
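
The docstring above stresses that the op only accepts JSON that follows the proto3 mapping of `tf.train.Example`. As a rough illustration of what that shape looks like, the sketch below builds such a string with only the standard `json` module. The helper name `example_json_from_ints` is hypothetical and not part of TensorFlow. It assumes a single int64 feature and writes the values as decimal strings, matching the `MessageToJson` output shown in the docstring.

```python
import json


def example_json_from_ints(feature_name, values):
    """Hypothetical helper: JSON text shaped like a tf.train.Example.

    Builds the proto3 JSON layout for an Example holding one int64 feature.
    This is an illustration only; it is not a TensorFlow API.
    """
    payload = {
        "features": {
            "feature": {
                feature_name: {
                    # The proto3 JSON mapping writes int64 values as
                    # decimal strings, as in the docstring output above.
                    "int64List": {"value": [str(int(v)) for v in values]}
                }
            }
        }
    }
    return json.dumps(payload, indent=2)


# Same content as the docstring's example: feature "a" with values 1, 1, 3.
example_json = example_json_from_ints("a", [1, 1, 3])
decoded = json.loads(example_json)

# Valid JSON, but not shaped like an Example: this is the kind of input
# the "not a general-purpose JSON parsing op" note is about.
arbitrary_json = json.dumps({"a": [1, 1, 3]})

print(example_json)
```

A string like `example_json` is the kind of input `tf.io.decode_json_example` is meant for, whereas `arbitrary_json` has no `features` wrapper and is not an `Example` at all. In practice, `json_format.MessageToJson` on a real `tf.train.Example` is the more reliable way to produce the input, since it follows the mapping automatically.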