From b4c95671f2fa2e86a36d080aba233c880645e9f8 Mon Sep 17 00:00:00 2001 From: TensorFlow Release Automation Date: Mon, 22 Jun 2020 21:30:45 -0700 Subject: [PATCH 01/13] Insert release notes place-fill --- RELEASE.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/RELEASE.md b/RELEASE.md index f93626cc876..04f5eb710f9 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,5 +1,9 @@ # Release 2.3.0 + + +# Release 2.3.0 + ## Breaking Changes * `tf.image.extract_glimpse` has been updated to correctly process the case From 3b27581629cca683877a98864f46582b89a7cfbe Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Fri, 26 Jun 2020 07:38:48 -0700 Subject: [PATCH 02/13] Update RELEASE.md --- RELEASE.md | 194 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 184 insertions(+), 10 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 04f5eb710f9..2e7400d6147 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,19 +1,193 @@ # Release 2.3.0 - +## Major Features and Improvements + * `tf.data` adds two new mechanisms to solve input pipeline bottlenecks and save resources: + * [snapshot](https://www.tensorflow.org/api_docs/python/tf/data/experimental/snapshot) + * [tf.data service](https://www.tensorflow.org/api_docs/python/tf/data/experimental/service/distribute). -# Release 2.3.0 + In addition, check out the detailed [guide](https://www.tensorflow.org/guide/data_performance_analysis) on analyzing input pipeline performance with TF Profiler. + + * [`tf.distribute.TPUStrategy`](https://www.tensorflow.org/api_docs/python/tf/distribute/TPUStrategy) is now a stable API and no longer considered experimental for TensorFlow. (earlier `tf.distribute.experimental.TPUStrategy`). + + * TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level. + + * Introduces experimental support for Keras Preprocessing Layers API ([`tf.keras.layers.experimental.preprocessing.*`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing?version=nightly)) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers. + + * TFLite + + * Libtensorflow packages will be available in GCS starting this release. We have started to release a nightly version of these packages. ## Breaking Changes +* Increases the **minimum bazel version** required to build TF to **3.1.0**. +* `tf.data` + * Makes the following (breaking) changes to the `tf.data` C++ API: - `IteratorBase::RestoreInternal`, `IteratorBase::SaveInternal`, and `DatasetBase::CheckExternalState` become pure-virtual and subclasses are now expected to provide an implementation. + * The deprecated `DatasetBase::IsStateful` method is removed in favor of `DatasetBase::CheckExternalState`. + * Deprecated overrides of `DatasetBase::MakeIterator` and `MakeIteratorFromInputElement` are removed. + * The signature of `tensorflow::data::IteratorBase::SaveInternal` and `tensorflow::data::IteratorBase::SaveInput` has been extended with `SerializationContext` argument to enable overriding the default policy for the handling of external state during iterator checkpointing. This is not a backwards compatible change and all subclasses of `IteratorBase` *need to be updated* accordingly.
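As a quick, hedged illustration of the two new `tf.data` mechanisms highlighted at the top of this patch (the snapshot path and the service address below are invented placeholders, not values from these notes):

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(1000).map(lambda x: x * 2)

# snapshot: materialize the preprocessed pipeline on disk so later epochs
# or jobs can reuse it instead of recomputing (path is hypothetical).
dataset = dataset.apply(tf.data.experimental.snapshot("/tmp/tf_snapshot"))

# tf.data service: offload preprocessing to a shared worker pool; this needs
# a running dispatcher, so the address below is a placeholder.
# dataset = dataset.apply(tf.data.experimental.service.distribute(
#     processing_mode="parallel_epochs",
#     service="grpc://tf-data-dispatcher:5000"))

for element in dataset.take(2):
    print(element.numpy())
```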
+* `tf.keras` + * Add a new `BackupAndRestore` callback for handling distributed training failures & restarts. Please take a look at this [tutorial](https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras) for details on how to use the callback. +* `tf.image.extract_glimpse` has been updated to correctly process the case + where `centered=False` and `normalized=False`. This is a breaking change as + the output is different from (incorrect) previous versions. Note this + breaking change only impacts `tf.image.extract_glimpse` and + `tf.compat.v2.image.extract_glimpse` API endpoints. The behavior of + `tf.compat.v1.image.extract_glimpse` does not change. The behavior of + existing C++ kernel `ExtractGlimpse` does not change as well, so saved + models will not be impacted. + +## Bug Fixes and Other Changes + +### TF Core: + * Set `tf2_behavior` to 1 to enable V2 for early loading cases. + * Add a function to dynamically choose the implementation based on underlying device placement. + * Eager: + * Add `reduce_logsumexp` benchmark with experiment compile. + * Give `EagerTensor`s a meaningful `__array__` implementation. + * Add another version of defun matmul for performance analysis. + * `tf.function`/AutoGraph: + * `AutoGraph` now includes into TensorFlow loops any variables that are closed over by local functions. Previously, such variables were sometimes incorrectly ignored. + * Functions returned by the `get_concrete_function` method of `tf.function` objects can now be called with arguments consistent with the original arguments or type specs passed to `get_concrete_function`. This calling convention is now the preferred way to use concrete functions with nested values and composite tensors. Please check the [guide](https://www.tensorflow.org/guide/concrete_function) for more details on `concrete_function`. + * Update `tf.function`'s `experimental_relax_shapes` to handle composite tensors appropriately. + * Optimize `tf.function` invocation, by removing redundant list converter. + * `tf.function` will retrace when called with a different variable instead of simply using the `dtype` & `shape`. + * [Improve support](https://github.com/tensorflow/tensorflow/issues/33862) for dynamically-sized TensorArray inside `tf.function`. + * `tf.math`: + * Narrow down `argmin`/`argmax` contract to always return the smallest index for ties. + * `tf.math.reduce_variance` and `tf.math.reduce_std` return correct computation for complex types and no longer support integer types. + * Add Bessel functions of order 0,1 to `tf.math.special`. + * `tf.divide` now always returns a tensor to be consistent with documentation and other APIs. + * `tf.image`: + * Replaces [`tf.image.non_max_suppression_padded`](https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_padded?hl=en&version=nightly) with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be neglected. Existing usage with single inputs should still work as before. + * `tf.linalg` + * Add `tf.linalg.banded_triangular_solve`. + * `tf.random`: + * Add `tf.random.stateless_parameterized_truncated_normal`. + * `tf.ragged`: + * Add `tf.ragged.cross` and `tf.ragged.cross_hashed` operations. + * `tf.RaggedTensor`: + * `RaggedTensor.to_tensor()` now preserves static shape. + * Add `tf.strings.format()` and `tf.print()` to support RaggedTensors.
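A minimal sketch of wiring up the `BackupAndRestore` callback introduced in the `tf.keras` section above; the toy model, data, and backup directory are assumptions for illustration:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="sgd", loss="mse")

# Periodically checkpoints training state so an interrupted (e.g. preempted)
# multi-worker job can resume from the last completed epoch.
backup_cb = tf.keras.callbacks.experimental.BackupAndRestore(
    backup_dir="/tmp/train_backup")  # hypothetical directory

x = tf.random.normal((32, 4))
y = tf.random.normal((32, 1))
model.fit(x, y, epochs=2, callbacks=[backup_cb])
```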
+ * `tf.saved_model`: + * `@tf.function` from SavedModel no longer ignores args after a `RaggedTensor` when selecting the concrete function to run. + * Fix save model issue for ops with a list of functions. + * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/LoadOptions) as arg to choose the I/O device for saving and loading models and weights. + * GPU + * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. + * Profiler + * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. + * Others + * Retain parent namescope for ops added inside `tf.while_loop`/`tf.cond`/`tf.switch_case`. + * Update `tf.vectorized_map` to support vectorizing `tf.while_loop` and TensorList operations. + * `tf.custom_gradient` can now be applied to functions that accept nested structures of `tensors` as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with `tf.convert_to_tensor`. + * No lowering on gradient case op when input is `DeviceIndex` op. + * Fix in c_api `DEFINE_GETATTR`. + * Extend the ragged version of `tf.gather` to support `batch_dims` and `axis` args. + * Update `tf.map_fn` to support RaggedTensors and SparseTensors. + * Deprecate `tf.group`. It is not useful in eager mode. + * Add a new variant of `FTRL` allowing a learning rate of zero. + +### `tf.data: + * `tf.data.experimental.dense_to_ragged_batch` works correctly with tuples. + * `tf.data.experimental.dense_to_ragged_batch` to output variable ragged rank. + * `tf.data.experimental.cardinality` is now a method on `tf.data.Dataset`. + * `tf.data.Dataset` now supports `len(Dataset)` when the cardinality is finite. + +### `tf.distribute`: + * Add a `tf.distribute.cluster_resolver.TPUClusterResolver.connect` API to simplify TPU initialization. + * Allow var.assign on `MirroredVariables` with `aggregation=NONE` in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the `MirroredVariables` were in fact identical. + * `tf.distribute.experimental.MultiWorkerMirroredStrategy` adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. + * Improve the performance of reading metrics eagerly under `tf.distribute.experimental.MultiWorkerMirroredStrategy`. + * Fix the issue that `strategy.reduce()` inside `tf.function` may raise exceptions when the value to reduce are from loops or if-clauses. + * Fix the issue that `tf.distribute.MirroredStrategy` cannot be used together with `tf.distribute.experimental.MultiWorkerMirroredStrategy`. + +### `tf.keras`: + * Introduces experimental preprocessing layers API (`tf.keras.layers.experimental.preprocessing`) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API, and support composite tensor inputs. 
+ * Added **categorical data** processing layers: + * `IntegerLookup` & `StringLookup`: build an index of categorical feature values + * `CategoryEncoding`: turn integer-encoded categories into one-hot, multi-hot, or tf-idf encoded representations + * `CategoryCrossing`: create new categorical features representing co-occurrences of previous categorical feature values + * `Hashing`: the hashing trick, for large-vocabulary categorical features + * `Discretization`: turn continuous numerical features into categorical features by binning their values + * Improved **image preprocessing** layers: `CenterCrop`, `Rescaling` + * Improved **image augmentation** layers: `RandomCrop`, `RandomFlip`, `RandomTranslation`, `RandomRotation`, `RandomHeight`, `RandomWidth`, `RandomZoom`, `RandomContrast` + * Improved **`TextVectorization`** layer, which handles string tokenization, n-gram generation, and token encoding + * The `TextVectorization` layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before). + * Change the return value of `TextVectorization.get_vocabulary()` from `byte` to `string`. Users who previously were calling 'decode' on the output of this method should no longer need to do so. + * Introduce new Keras dataset generation utilities : + * **`image_dataset_from_directory`** is a utility based on `tf.data.Dataset`, meant to replace the legacy `ImageDataGenerator`. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers). + * **`text_dataset_from_directory`** takes you from a structured directory of text files to a labeled dataset, in one function call. + * **`timeseries_dataset_from_array`** is a `tf.data.Dataset`-based replacement of the legacy `TimeseriesGenerator`. It takes you from an array of timeseries data to a dataset of shifting windows with their targets. + * Added [`experimental_steps_per_execution`](https://www.tensorflow.org/api_docs/python/tf/keras/Model?version=nightly#compile) + arg to `model.compile` to indicate the number of batches to run per `tf.function` call. This can speed up Keras Models on TPUs up to 3x. + * Extends `tf.keras.layers.Lambda` layers to support multi-argument lambdas, and keyword arguments when calling the layer. + * Functional models now get constructed if *any* tensor in a layer call's arguments/keyword arguments comes from a keras input. Previously the functional api would only work if all of the elements in the first argument to the layer came from a keras input. + * Clean up `BatchNormalization` layer's `trainable` property to act like standard python state when it's used inside `tf.functions` (frozen at tracing time), instead of acting like a pseudo-variable whose updates *kind of sometimes* get reflected in already-traced `tf.function` traces. + * Add the `Conv1DTranspose` layer. + * Fix bug in `SensitivitySpecificityBase` derived metrics. + * Blacklist Case op from callback + +### `tf.lite`: + * Converter + * Restored `inference_input_type` and `inference_output_type` flags in TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post training full integer quantized models. + * CPU + * Fix an issue w/ dynamic weights and `Conv2D` on x86. 
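To make the preprocessing-layer list above concrete, a hedged sketch using two of the layers named there (vocabulary and data are invented; availability is assumed per the experimental namespace these notes describe):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers.experimental import preprocessing

# StringLookup builds an index of categorical string values; inputs outside
# the vocabulary fall into a reserved out-of-vocabulary slot.
lookup = preprocessing.StringLookup(vocabulary=["cat", "dog", "bird"])
print(lookup(np.array([["cat"], ["fish"]])))

# Normalization learns mean and variance from data via adapt().
norm = preprocessing.Normalization()
norm.adapt(np.array([[1.0], [2.0], [3.0]], dtype="float32"))
print(norm(np.array([[2.0]], dtype="float32")))
```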
+ * Add a runtime Android flag for enabling `XNNPACK` for optimized CPU performance. + * Add a runtime iOS flag for enabling `XNNPACK` for optimized CPU performance. + * Add a compiler flag to enable building a TFLite library that applies `XNNPACK` delegate automatically when the model has a `fp32` operation. + * GPU + * Allow GPU acceleration starting with internal graph nodes + * Experimental support for quantized models with the Android GPU delegate + * Add GPU delegate whitelist. + * Rename GPU whitelist -> compatibility (list). + * Improve GPU compatibility list entries from crash reports. + * NNAPI + * Set default value for `StatefulNnApiDelegate::Options::max_number_delegated_partitions` to 3. + * Add capability to disable `NNAPI` CPU and check `NNAPI` Errno. + * Fix crashes when using `NNAPI` with target accelerator specified with model containing Conv2d or FullyConnected or LSTM nodes with quantized weights. + * Fix `ANEURALNETWORKS_BAD_DATA` execution failures with `sum`/`max`/`min`/`reduce` operations with `scalar` inputs. + * Hexagon + * TFLite Hexagon Delegate out of experimental. + * Experimental `int8` support for most hexagon ops. + * Experimental per-channel quant support for `conv` in Hexagon delegate. + * Support dynamic batch size in C++ API. + * CoreML + * Opensource CoreML delegate + * Misc + * Enable building Android TFLite targets on Windows + * Add 3D support for TFLite `BatchToSpaceND`. + * Add 5D support for TFLite `BroadcastSub`. + * Add 5D support for TFLite `Maximum` `Minimum`. + * Add 5D support for TFLite `Transpose`. + * Add 5D support for `BroadcastDiv`. + * Rename `kTfLiteActRelu1` to `kTfLiteActReluN1To1`. + * Enable flex delegate on tensorflow.lite.Interpreter Python package. + * Add `Buckettize`, `SparseCross` and `BoostedTreesBucketize` to the flex whitelist. + * Selective registration for flex ops. + * Add missing kernels for flex delegate whitelisted ops. + * Fix issue when using direct `ByteBuffer` inputs with graphs that have dynamic shapes. + * Fix error checking supported operations in a model containing `HardSwish`. + +### TPU Enhancements + * 3D mesh support + * Added TPU code for `FTRL` with `multiply_linear_by_lr`. + * Silently adds a new file system registry at `gstpu`. + * Support `restartType` in cloud tpu client. + * Depend on a specific version of google-api-python-client. + * Fixes apiclient import. + +### XLA Support + * Implement stable `argmin` and `argmax` + +### Tracing and Debugging + * Add a `TFE_Py_Execute` traceme. 
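Since the Converter bullets above mention the restored `inference_input_type`/`inference_output_type` flags, here is a hedged post-training full-integer quantization sketch; the SavedModel path, input shape, and calibration data are assumptions:

```python
import tensorflow as tf

def representative_data_gen():
    # Yields calibration samples for quantization; the shape is hypothetical.
    for _ in range(10):
        yield [tf.random.normal((1, 224, 224, 3))]

converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8   # restored flag
converter.inference_output_type = tf.int8  # restored flag
tflite_model = converter.convert()
```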
+ + +## Thanks to our Contributors + +This release contains contributions from many people at Google, as well as: + +902449@58880@bigcat_chen@ASIC, Abdul Baseer Khan, Abhineet Choudhary, Abolfazl Shahbazi, Adam Hillier, ag.ramesh, Agoniii, Ajay P, Alex Hoffman, Alexander Bayandin, Alexander Grund, Alexandre Abadie, Alexey Rogachevskiy, amoitra, Andrew Stevens, Angus-Luo, Anshuman Tripathy, Anush Elangovan, Artem Mavrin, Ashutosh Hathidara, autoih, Ayushman Kumar, ayushmankumar7, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, bhack, Bharat Raghunathan, Biagio Montaruli, Bigcat-Himax, blueyi, Bryan Cutler, Byambaa, Carlos Hernandez-Vaquero, Chen Lei, Chris Knorowski, Christian Clauss, chuanqiw, CuiYifeng, Daniel Situnayake, Daria Zhuravleva, Dayananda-V, Deven Desai, Devi Sandeep Endluri, Dmitry Zakharov, Dominic Jack, Duncan Riach, Edgar Liberis, Ehsan Toosi, ekuznetsov139, Elena Zhelezina, Eugene Kuznetsov, Eugene Mikhantiev, Evgenii Zheltonozhskii, Fabio Di Domenico, Fausto Morales, Fei Sun, feihugis, Felix E. Klee, flyingcat, Frederic Bastien, Fredrik Knutsson, frreiss, fsx950223, ganler, Gaurav Singh, Georgios Pinitas, Gian Marco Iodice, Giorgio Arena, Giuseppe Rossini, Gregory Keith, Guozhong Zhuang, gurushantj, Hahn Anselm, Harald Husum, Harjyot Bagga, Hristo Vrigazov, Ilya Persky, Ir1d, Itamar Turner-Trauring, jacco, Jake Tae, Janosh Riebesell, Jason Zaman, jayanth, Jeff Daily, Jens Elofsson, Jinzhe Zeng, JLZ, Jonas Skog, Jonathan Dekhtiar, Josh Meyer, Joshua Chia, Judd, justkw, Kaixi Hou, Kam D Kasravi, Kamil Rakoczy, Karol Gugala, Kayou, Kazuaki Ishizaki, Keith Smiley, Khaled Besrour, Kilaru Yasaswi Sri Chandra Gandhi, Kim, Young Soo, Kristian Hartikainen, Kwabena W. Agyeman, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Geiger, Lutz Roeder, M\U00E5Ns Nilsson, Mahmoud Abuzaina, Manish, Marcel Koester, Marcin Sielski, marload, Martin Jul, Matt Conley, mdfaijul, Meng, Peng, Meteorix, Michael Käufl, Michael137, Milan Straka, Mitchell Vitez, Ml-0, Mokke Meguru, Mshr-H, nammbash, Nathan Luehr, naumkin, Neeraj Bhadani, ngc92, Nick Morgan, nihui, Niranjan Hasabnis, Niranjan Yadla, Nishidha Panpaliya, Oceania2018, oclyke, Ouyang Jin, OverLordGoldDragon, Owen Lyke, Patrick Hemmer, Paul Andrey, Peng Sun, periannath, Phil Pearl, Prashant Dandriyal, Prashant Kumar, Rahul Huilgol, Rajan Singh, Rajeshwar Reddy T, rangjiaheng, Rishit Dagli, Rohan Reddy, rpalakkal, rposts, Ruan Kunliang, Rushabh Vasani, Ryohei Ikegami, Semun Lee, Seo-Inyoung, Sergey Mironov, Sharada Shiddibhavi, ShengYang1, Shraiysh Vaishay, Shunya Ueta, shwetaoj, Siyavash Najafzade, Srinivasan Narayanamoorthy, Stephan Uphoff, storypku, sunchenggen, sunway513, Sven-Hendrik Haase, Swapnil Parekh, Tamas Bela Feher, Teng Lu, tigertang, tomas, Tomohiro Ubukata, tongxuan.ltx, Tony Tonev, Tzu-Wei Huang, Téo Bouvard, Uday Bondhugula, Vaibhav Jade, Vijay Tadikamalla, Vikram Dattu, Vincent Abriou, Vishnuvardhan Janapati, Vo Van Nghia, VoVAllen, Will Battel, William D. Irons, wyzhao, Xiaoming (Jason) Cui, Xiaoquan Kong, Xinan Jiang, xutianming, Yair Ehrenwald, Yasir Modak, Yasuhiro Matsumoto, Yixing Fu, Yong Tang, Yuan Tang, zhaozheng09, Zilin Zhu, zilinzhu, 张志豪 -* `tf.image.extract_glimpse` has been updated to correctly process the case - where `centered=False` and `normalized=False`. This is a breaking change as - the output is different from (incorrect) previous versions. Note this - breaking change only impacts `tf.image.extract_glimpse` and - `tf.compat.v2.image.extract_glimpse` API endpoints. 
The behavior of - `tf.compat.v1.image.extract_glimpse` does not change. The behavior of - existing C++ kernel `ExtractGlimpse` does not change as well, so saved - models will not be impacted. # Release 2.1.1 From 9310f2a1801a1c428b6092e5289a6c0fef6867c8 Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Fri, 26 Jun 2020 15:41:40 -0700 Subject: [PATCH 03/13] Update RELEASE.md --- RELEASE.md | 38 ++++++++++++++++++------------------ 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 2e7400d6147..53f5907f922 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -5,7 +5,7 @@ * [snapshot](https://www.tensorflow.org/api_docs/python/tf/data/experimental/snapshot) * [tf.data service](https://www.tensorflow.org/api_docs/python/tf/data/experimental/service/distribute). - In addition, check out the detailed [guide](https://www.tensorflow.org/guide/data_performance_analysis) on analyzing input pipeline performance with TF Profiler. + In addition, check out the detailed [guide](https://www.tensorflow.org/guide/data_performance_analysis) for analyzing input pipeline performance with TF Profiler. @@ -13,9 +13,9 @@ * Introduces experimental support for Keras Preprocessing Layers API ([`tf.keras.layers.experimental.preprocessing.*`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing?version=nightly)) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers. - * TFLite + * TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for [XNNPACK](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/xnnpack), a highly optimized set of CPU kernels, as well as opt-in support for [executing quantized models on the GPU](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/gpu_advanced.md#running-quantized-models-experimental). - * Libtensorflow packages will be available in GCS starting this release. We have started to release a nightly version of these packages. + * Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages. @@ -70,7 +70,8 @@ * `tf.saved_model`: * `@tf.function` from SavedModel no longer ignores args after a `RaggedTensor` when selecting the concrete function to run. * Fix save model issue for ops with a list of functions. - * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/LoadOptions) as arg to choose the I/O device for saving and loading models and weights. + * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/LoadOptions) as arg with default value `None` to choose the I/O device for loading models and weights. + * Update `tf.saved_model.SaveOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/SaveOptions?version=nightly) as arg with default value `None` to choose the I/O device for saving models and weights.
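A hedged sketch of the `experimental_io_device` argument added in the hunk above; the module, paths, and device string are illustrative assumptions:

```python
import tensorflow as tf

module = tf.Module()
module.v = tf.Variable(1.0)

# Route the file I/O for saving through a specific device (string assumed).
save_opts = tf.saved_model.SaveOptions(experimental_io_device="/job:localhost")
tf.saved_model.save(module, "/tmp/demo_model", options=save_opts)

load_opts = tf.saved_model.LoadOptions(experimental_io_device="/job:localhost")
restored = tf.saved_model.load("/tmp/demo_model", options=load_opts)
print(restored.v.numpy())
```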
* GPU * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. * Profiler @@ -93,12 +94,14 @@ * `tf.data.Dataset` now supports `len(Dataset)` when the cardinality is finite. ### `tf.distribute`: - * Add a `tf.distribute.cluster_resolver.TPUClusterResolver.connect` API to simplify TPU initialization. - * Allow var.assign on `MirroredVariables` with `aggregation=NONE` in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the `MirroredVariables` were in fact identical. + * Expose experimental [`tf.distribute.DistributedDataset`](https://www.tensorflow.org/api_docs/python/tf/distribute/DistributedDataset) and [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/api_docs/python/tf/distribute/DistributedIterator) to distribute input data when using `tf.distribute` to scale training on multiple devices. + * Added a `get_next_as_optional` method for `tf.distribute.DistributedIterator` class to return a `tf.experimental.Optional` instance that contains the next value for all replicas or none instead of raising an out of range error. Also see *new* [guide on input distribution](https://www.tensorflow.org/tutorials/distribute/input). + * Allow `var.assign` on `MirroredVariables` with `aggregation=NONE` in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the `MirroredVariables` were in fact identical. * `tf.distribute.experimental.MultiWorkerMirroredStrategy` adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. * Improve the performance of reading metrics eagerly under `tf.distribute.experimental.MultiWorkerMirroredStrategy`. - * Fix the issue that `strategy.reduce()` inside `tf.function` may raise exceptions when the value to reduce are from loops or if-clauses. + * Fix the issue that `strategy.reduce()` inside `tf.function` may raise exceptions when the values to reduce are from loops or if-clauses. * Fix the issue that `tf.distribute.MirroredStrategy` cannot be used together with `tf.distribute.experimental.MultiWorkerMirroredStrategy`. + * Add a `tf.distribute.cluster_resolver.TPUClusterResolver.connect` API to simplify TPU initialization. ### `tf.keras`: * Introduces experimental preprocessing layers API (`tf.keras.layers.experimental.preprocessing`) to handle data preprocessing operations such as categorical feature encoding, text vectorization, data normalization, and data discretization (binning). The newly added layers provide a replacement for the legacy feature column API, and support composite tensor inputs. @@ -114,9 +117,9 @@ * The `TextVectorization` layer now accounts for the mask_token as part of the vocabulary size when output_mode='int'. This means that, if you have a max_tokens value of 5000, your output will have 5000 unique values (not 5001 as before). * Change the return value of `TextVectorization.get_vocabulary()` from `byte` to `string`. Users who previously were calling 'decode' on the output of this method should no longer need to do so. * Introduce new Keras dataset generation utilities : - * **`image_dataset_from_directory`** is a utility based on `tf.data.Dataset`, meant to replace the legacy `ImageDataGenerator`. It takes you from a structured directory of images to a labeled dataset, in one function call. 
Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers). - * **`text_dataset_from_directory`** takes you from a structured directory of text files to a labeled dataset, in one function call. - * **`timeseries_dataset_from_array`** is a `tf.data.Dataset`-based replacement of the legacy `TimeseriesGenerator`. It takes you from an array of timeseries data to a dataset of shifting windows with their targets. + * **[`image_dataset_from_directory`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory)** is a utility based on `tf.data.Dataset`, meant to replace the legacy `ImageDataGenerator`. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers). + * **[`text_dataset_from_directory`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text_dataset_from_directory)** takes you from a structured directory of text files to a labeled dataset, in one function call. + * **[`timeseries_dataset_from_array`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/timeseries_dataset_from_array)** is a `tf.data.Dataset`-based replacement of the legacy `TimeseriesGenerator`. It takes you from an array of timeseries data to a dataset of shifting windows with their targets. * Added [`experimental_steps_per_execution`](https://www.tensorflow.org/api_docs/python/tf/keras/Model?version=nightly#compile) arg to `model.compile` to indicate the number of batches to run per `tf.function` call. This can speed up Keras Models on TPUs up to 3x. * Extends `tf.keras.layers.Lambda` layers to support multi-argument lambdas, and keyword arguments when calling the layer. @@ -129,6 +132,7 @@ ### `tf.lite`: * Converter * Restored `inference_input_type` and `inference_output_type` flags in TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post training full integer quantized models. + * Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime. * CPU * Fix an issue w/ dynamic weights and `Conv2D` on x86. * Add a runtime Android flag for enabling `XNNPACK` for optimized CPU performance. @@ -154,19 +158,18 @@ * Opensource CoreML delegate * Misc * Enable building Android TFLite targets on Windows - * Add 3D support for TFLite `BatchToSpaceND`. - * Add 5D support for TFLite `BroadcastSub`. - * Add 5D support for TFLite `Maximum` `Minimum`. - * Add 5D support for TFLite `Transpose`. - * Add 5D support for `BroadcastDiv`. + * Add support for `BatchMatMul`. + * Add support for `half_pixel_centers` with `ResizeNearestNeighbor`. + * Add 3D support for `BatchToSpaceND`. + * Add 5D support for `BroadcastSub`, `Maximum`, `Minimum`, `Transpose` and `BroadcastDiv`. * Rename `kTfLiteActRelu1` to `kTfLiteActReluN1To1`. * Enable flex delegate on tensorflow.lite.Interpreter Python package. * Add `Buckettize`, `SparseCross` and `BoostedTreesBucketize` to the flex whitelist. - * Selective registration for flex ops. + * Add support for selective registration of flex ops. * Add missing kernels for flex delegate whitelisted ops. * Fix issue when using direct `ByteBuffer` inputs with graphs that have dynamic shapes. 
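Circling back to the Keras dataset utilities linked earlier in this patch, a minimal sketch of `image_dataset_from_directory`; the directory layout (one subdirectory per class) and all parameter values are assumptions:

```python
import tensorflow as tf

# Expects e.g. photos/cats/*.jpg and photos/dogs/*.jpg on disk.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "photos",               # hypothetical root directory
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=(180, 180),
    batch_size=32)

for images, labels in train_ds.take(1):
    print(images.shape, labels.shape)
```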
* Fix error checking supported operations in a model containing `HardSwish`. - + ### TPU Enhancements * 3D mesh support * Added TPU code for `FTRL` with `multiply_linear_by_lr`. * Silently adds a new file system registry at `gstpu`. * Support `restartType` in cloud tpu client. * Depend on a specific version of google-api-python-client. * Fixes apiclient import. @@ -181,7 +184,6 @@ ### Tracing and Debugging * Add a `TFE_Py_Execute` traceme. - ## Thanks to our Contributors This release contains contributions from many people at Google, as well as: From d3dc6a2071f92a369e936e669803d7162d2b3f68 Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Fri, 26 Jun 2020 15:49:19 -0700 Subject: [PATCH 04/13] Update RELEASE.md --- RELEASE.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 53f5907f922..859fb7981a0 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -20,7 +20,8 @@ ## Breaking Changes * Increases the **minimum bazel version** required to build TF to **3.1.0**. * `tf.data` - * Makes the following (breaking) changes to the `tf.data` C++ API: - `IteratorBase::RestoreInternal`, `IteratorBase::SaveInternal`, and `DatasetBase::CheckExternalState` become pure-virtual and subclasses are now expected to provide an implementation. + * Makes the following (breaking) changes to the `tf.data`. + * C++ API: - `IteratorBase::RestoreInternal`, `IteratorBase::SaveInternal`, and `DatasetBase::CheckExternalState` become pure-virtual and subclasses are now expected to provide an implementation. * The deprecated `DatasetBase::IsStateful` method is removed in favor of `DatasetBase::CheckExternalState`. * Deprecated overrides of `DatasetBase::MakeIterator` and `MakeIteratorFromInputElement` are removed. * The signature of `tensorflow::data::IteratorBase::SaveInternal` and `tensorflow::data::IteratorBase::SaveInput` has been extended with `SerializationContext` argument to enable overriding the default policy for the handling of external state during iterator checkpointing. This is not a backwards compatible change and all subclasses of `IteratorBase` *need to be updated* accordingly. @@ -87,7 +88,7 @@ * Deprecate `tf.group`. It is not useful in eager mode. * Add a new variant of `FTRL` allowing a learning rate of zero. -### `tf.data: +### `tf.data`: * `tf.data.experimental.dense_to_ragged_batch` works correctly with tuples. * `tf.data.experimental.dense_to_ragged_batch` to output variable ragged rank. * `tf.data.experimental.cardinality` is now a method on `tf.data.Dataset`. * `tf.data.Dataset` now supports `len(Dataset)` when the cardinality is finite. From 61b2024a19037479f6df8db38eeb777ba03fd60d Mon Sep 17 00:00:00 2001 From: Austin Anderson Date: Mon, 6 Jul 2020 14:24:16 -0700 Subject: [PATCH 05/13] Added point about tf.sysconfig.get_build_info() --- RELEASE.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 859fb7981a0..e263ddb710f 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -169,7 +169,7 @@ * Add support for selective registration of flex ops. * Add missing kernels for flex delegate whitelisted ops. * Fix issue when using direct `ByteBuffer` inputs with graphs that have dynamic shapes. - * Fix error checking supported operations in a model containing `HardSwish`. + * Fix error checking supported operations in a model containing `HardSwish`. ### TPU Enhancements * 3D mesh support * Added TPU code for `FTRL` with `multiply_linear_by_lr`. * Silently adds a new file system registry at `gstpu`. * Support `restartType` in cloud tpu client. * Depend on a specific version of google-api-python-client. * Fixes apiclient import. @@ -183,7 +183,10 @@ * Implement stable `argmin` and `argmax` ### Tracing and Debugging - * Add a `TFE_Py_Execute` traceme. + * Add a `TFE_Py_Execute` traceme. + +### Packaging Support + * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support.
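A quick sketch of the new introspection call added above; the exact dictionary keys are assumptions, so inspect the returned dict rather than relying on these names:

```python
import tensorflow as tf

info = tf.sysconfig.get_build_info()
print(sorted(info.keys()))
# The key names below are assumptions for illustration only:
print(info.get("cuda_version"), info.get("cudnn_version"))
```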
## Thanks to our Contributors From 257447e193f77117c91075c3431c99fc651f9813 Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Mon, 6 Jul 2020 15:23:52 -0700 Subject: [PATCH 06/13] Update RELEASE.md --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index e263ddb710f..bdd30bef9c0 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -3,7 +3,7 @@ ## Major Features and Improvements * `tf.data` adds two new mechanisms to solve input pipeline bottlenecks and save resources: * [snapshot](https://www.tensorflow.org/api_docs/python/tf/data/experimental/snapshot) - * [tf.data service](https://www.tensorflow.org/api_docs/python/tf/data/experimental/service/distribute). + * [tf.data service](https://www.tensorflow.org/api_docs/python/tf/data/experimental/service). In addition, check out the detailed [guide](https://www.tensorflow.org/guide/data_performance_analysis) for analyzing input pipeline performance with TF Profiler. From 98a59288c8fa4441d3a5eb0ef2615ca71794ca0c Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Wed, 8 Jul 2020 18:22:59 -0700 Subject: [PATCH 07/13] Update RELEASE.md --- RELEASE.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index bdd30bef9c0..7ad2796083d 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -9,7 +9,7 @@ * [`tf.distribute.TPUStrategy`](https://www.tensorflow.org/api_docs/python/tf/distribute/TPUStrategy) is now a stable API and no longer considered experimental for TensorFlow. (earlier `tf.distribute.experimental.TPUStrategy`). - * TF Profiler introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and profile options to customize the host and device trace verbosity level. + * [TF Profiler](https://www.tensorflow.org/guide/profiler) introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and [profile options](https://tensorflow.org/guide/profiler#collect_performance_data) to customize the host and device trace verbosity level. @@ -33,8 +33,8 @@ breaking change only impacts `tf.image.extract_glimpse` and `tf.compat.v2.image.extract_glimpse` API endpoints. The behavior of `tf.compat.v1.image.extract_glimpse` does not change. The behavior of - existing C++ kernel `ExtractGlimpse` does not change as well, so saved - models will not be impacted. + existing C++ kernel `ExtractGlimpse` does not change either, so saved + models using `tf.raw_ops.ExtractGlimpse` will not be impacted. ## Bug Fixes and Other Changes ### TF Core: * Set `tf2_behavior` to 1 to enable V2 for early loading cases. @@ -58,7 +58,7 @@ * Add Bessel functions of order 0,1 to `tf.math.special`. * `tf.divide` now always returns a tensor to be consistent with documentation and other APIs.
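Two of the `tf.math` changes noted just above are easy to see interactively; this tiny sketch simply mirrors those bullets:

```python
import tensorflow as tf

# tf.divide now always returns a Tensor, even for plain Python numbers.
print(tf.divide(3, 2))            # tf.Tensor(1.5, ...)

# argmin/argmax now guarantee the smallest index on ties.
print(tf.math.argmax([1, 3, 3]))  # tf.Tensor(1, ...), not 2
```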
* `tf.image`: - * Replaces [`tf.image.non_max_suppression_padded`](https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_padded?hl=en&version=nightly) with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be neglected. Existing usage with single inputs should still work as before. + * Replaced [`tf.image.non_max_suppression_padded`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/image/non_max_suppression_padded?hl=en) with a new implementation that supports batched inputs, which is considerably faster on TPUs and GPUs. Boxes with area=0 will be ignored. Existing usage with single inputs should still work as before. * `tf.linalg` * Add `tf.linalg.banded_triangular_solve`. * `tf.random`: @@ -71,8 +71,8 @@ * `tf.saved_model`: * `@tf.function` from SavedModel no longer ignores args after a `RaggedTensor` when selecting the concrete function to run. * Fix save model issue for ops with a list of functions. - * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/LoadOptions) as arg with default value `None` to choose the I/O device for loading models and weights. - * Update `tf.saved_model.SaveOptions` with [`experimental_io_device`](https://www.tensorflow.org/api_docs/python/tf/saved_model/SaveOptions?version=nightly) as arg with default value `None` to choose the I/O device for saving models and weights. + * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/saved_model/LoadOptions?hl=en) as arg with default value `None` to choose the I/O device for loading models and weights. + * Update `tf.saved_model.SaveOptions` with [`experimental_io_device`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/saved_model/SaveOptions?hl=en) as arg with default value `None` to choose the I/O device for saving models and weights. * GPU * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. * Profiler @@ -95,8 +95,8 @@ * `tf.data.Dataset` now supports `len(Dataset)` when the cardinality is finite. ### `tf.distribute`: - * Expose experimental [`tf.distribute.DistributedDataset`](https://www.tensorflow.org/api_docs/python/tf/distribute/DistributedDataset) and [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/api_docs/python/tf/distribute/DistributedIterator) to distribute input data when using `tf.distribute` to scale training on multiple devices. - * Added a `get_next_as_optional` method for `tf.distribute.DistributedIterator` class to return a `tf.experimental.Optional` instance that contains the next value for all replicas or none instead of raising an out of range error. Also see *new* [guide on input distribution](https://www.tensorflow.org/tutorials/distribute/input). + * Expose experimental [`tf.distribute.DistributedDataset`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedDataset?hl=en) and [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator) to distribute input data when using `tf.distribute` to scale training on multiple devices. 
+ * Added a [`get_next_as_optional`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator?hl=en#get_next_as_optional) method for [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator?hl=en) class to return a `tf.experimental.Optional` instance that contains the next value for all replicas or none instead of raising an out of range error. Also see *new* [guide on input distribution](https://www.tensorflow.org/tutorials/distribute/input). * Allow `var.assign` on `MirroredVariables` with `aggregation=NONE` in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the `MirroredVariables` were in fact identical. * `tf.distribute.experimental.MultiWorkerMirroredStrategy` adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. * Improve the performance of reading metrics eagerly under `tf.distribute.experimental.MultiWorkerMirroredStrategy`. @@ -121,7 +121,7 @@ * **[`image_dataset_from_directory`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory)** is a utility based on `tf.data.Dataset`, meant to replace the legacy `ImageDataGenerator`. It takes you from a structured directory of images to a labeled dataset, in one function call. Note that it doesn't perform image data augmentation (which is meant to be done using preprocessing layers). * **[`text_dataset_from_directory`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text_dataset_from_directory)** takes you from a structured directory of text files to a labeled dataset, in one function call. * **[`timeseries_dataset_from_array`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/timeseries_dataset_from_array)** is a `tf.data.Dataset`-based replacement of the legacy `TimeseriesGenerator`. It takes you from an array of timeseries data to a dataset of shifting windows with their targets. - * Added [`experimental_steps_per_execution`](https://www.tensorflow.org/api_docs/python/tf/keras/Model?version=nightly#compile) + * Added [`experimental_steps_per_execution`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/Model?hl=en#compile) arg to `model.compile` to indicate the number of batches to run per `tf.function` call. This can speed up Keras Models on TPUs up to 3x. * Extends `tf.keras.layers.Lambda` layers to support multi-argument lambdas, and keyword arguments when calling the layer. * Functional models now get constructed if *any* tensor in a layer call's arguments/keyword arguments comes from a keras input. Previously the functional api would only work if all of the elements in the first argument to the layer came from a keras input. From de4c4425b789283be8a4297565e773ec1df2370c Mon Sep 17 00:00:00 2001 From: Shanqing Cai Date: Mon, 13 Jul 2020 21:47:19 -0400 Subject: [PATCH 08/13] Add mentioned Debugger V2 to r2.3 release notes --- RELEASE.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 7ad2796083d..0cbdec1f4eb 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -16,7 +16,9 @@ * TFLite now properly supports dynamic shapes during conversion and inference. 
We’ve also added opt-in support on Android and iOS for [XNNPACK](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/xnnpack), a highly optimized set of CPU kernels, as well as opt-in support for [executing quantized models on the GPU](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/gpu_advanced.md#running-quantized-models-experimental). * Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages. - + + * The experimental Python API [`tf.debugging.experimental.enable_dump_debug_info()`](https://www.tensorflow.org/api_docs/python/tf/debugging/experimental/enable_dump_debug_info) now allows you to instrument TensorFlow programs and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called [Debugger V2](https://www.tensorflow.org/tensorboard/debugger_v2), which reveals details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composition of tensors, as well as their code locations. + ## Breaking Changes * Increases the **minimum bazel version** required to build TF to **3.1.0**. * `tf.data` From 13c4eadd2557892d3a6b0049300c4caa28d1ebee Mon Sep 17 00:00:00 2001 From: Shanqing Cai Date: Mon, 13 Jul 2020 21:51:43 -0400 Subject: [PATCH 09/13] Grammar tweaks in the Debugger V2 bullet point --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 0cbdec1f4eb..b3a83f1c9fe 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -17,7 +17,7 @@ * Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages. - * The experimental Python API [`tf.debugging.experimental.enable_dump_debug_info()`](https://www.tensorflow.org/api_docs/python/tf/debugging/experimental/enable_dump_debug_info) now allows you to instrument TensorFlow programs and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called [Debugger V2](https://www.tensorflow.org/tensorboard/debugger_v2), which reveals details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composition of tensors, as well as their code locations. + * The experimental Python API [`tf.debugging.experimental.enable_dump_debug_info()`](https://www.tensorflow.org/api_docs/python/tf/debugging/experimental/enable_dump_debug_info) now allows you to instrument a TensorFlow program and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called [Debugger V2](https://www.tensorflow.org/tensorboard/debugger_v2), which reveals the details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composition of tensors, as well as their code locations. ## Breaking Changes * Increases the **minimum bazel version** required to build TF to **3.1.0**.
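A hedged sketch of enabling the dump-based instrumentation these two patches document; the dump directory is an assumption, and the mode and buffer arguments shown are just one plausible configuration:

```python
import tensorflow as tf

# Write debug events for everything this program executes; inspect them later
# in TensorBoard's Debugger V2 dashboard (tensorboard --logdir /tmp/tfdbg2).
tf.debugging.experimental.enable_dump_debug_info(
    "/tmp/tfdbg2",                    # hypothetical dump directory
    tensor_debug_mode="FULL_HEALTH",  # record dtype/shape/numeric summaries
    circular_buffer_size=-1)          # keep all events instead of a ring buffer

x = tf.constant([1.0, 0.0])
print(tf.math.log(x))  # the resulting -inf would surface in Debugger V2
```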
From 549064075e9c54dd338755960555e8c57dc93c5f Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Wed, 15 Jul 2020 14:58:17 -0700 Subject: [PATCH 10/13] Update RELEASE.md --- RELEASE.md | 34 ++++++++++++++++++---------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index b3a83f1c9fe..f123cff8efb 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -9,13 +9,13 @@ * [`tf.distribute.TPUStrategy`](https://www.tensorflow.org/api_docs/python/tf/distribute/TPUStrategy) is now a stable API and no longer considered experimental for TensorFlow. (earlier `tf.distribute.experimental.TPUStrategy`). - * [TF Profiler](https://www.tensorflow.org/guide/profiler) introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a python tracer which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and [profile options](https://tensorflow.org/guide/profiler#collect_performance_data) to customize the host and device trace verbosity level. + * [TF Profiler](https://www.tensorflow.org/guide/profiler) introduces two new tools: a memory profiler to visualize your model’s memory usage over time and a [python tracer](https://www.tensorflow.org/guide/profiler#events) which allows you to trace python function calls in your model. Usability improvements include better diagnostic messages and [profile options](https://tensorflow.org/guide/profiler#collect_performance_data) to customize the host and device trace verbosity level. * Introduces experimental support for Keras Preprocessing Layers API ([`tf.keras.layers.experimental.preprocessing.*`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing?version=nightly)) to handle data preprocessing operations, with support for composite tensor inputs. Please see below for additional details on these layers. * TFLite now properly supports dynamic shapes during conversion and inference. We’ve also added opt-in support on Android and iOS for [XNNPACK](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/xnnpack), a highly optimized set of CPU kernels, as well as opt-in support for [executing quantized models on the GPU](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/gpu_advanced.md#running-quantized-models-experimental). - * Libtensorflow packages are available in GCS starting this release. We have also started to release a nightly version of these packages. + * Libtensorflow packages are available in GCS starting this release. We have also started to [release a nightly version of these packages](https://github.com/tensorflow/tensorflow#official-builds). * The experimental Python API [`tf.debugging.experimental.enable_dump_debug_info()`](https://www.tensorflow.org/api_docs/python/tf/debugging/experimental/enable_dump_debug_info) now allows you to instrument a TensorFlow program and dump debugging information to a directory on the file system. The directory can be read and visualized by a new interactive dashboard in TensorBoard 2.3 called [Debugger V2](https://www.tensorflow.org/tensorboard/debugger_v2), which reveals the details of the TensorFlow program including graph structures, history of op executions at the Python (eager) and intra-graph levels, the runtime dtype, shape, and numerical composition of tensors, as well as their code locations. @@ -42,7 +42,7 @@ ### TF Core: * Set `tf2_behavior` to 1 to enable V2 for early loading cases.
- * Add a function to dynamically choose the implementation based on underlying device placement. + * Add execute_fn_for_device function to dynamically choose the implementation based on underlying device placement. * Eager: * Add `reduce_logsumexp` benchmark with experiment compile. * Give `EagerTensor`s a meaningful `__array__` implementation. @@ -77,8 +77,6 @@ * Update `tf.saved_model.SaveOptions` with [`experimental_io_device`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/saved_model/SaveOptions?hl=en) as arg with default value `None` to choose the I/O device for saving models and weights. * GPU * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. - * Profiler - * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. * Others * Retain parent namescope for ops added inside `tf.while_loop`/`tf.cond`/`tf.switch_case`. * Update `tf.vectorized_map` to support vectorizing `tf.while_loop` and TensorList operations. @@ -99,8 +97,8 @@ ### `tf.distribute`: * Expose experimental [`tf.distribute.DistributedDataset`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedDataset?hl=en) and [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator) to distribute input data when using `tf.distribute` to scale training on multiple devices. * Added a [`get_next_as_optional`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator?hl=en#get_next_as_optional) method for [`tf.distribute.DistributedIterator`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/distribute/DistributedIterator?hl=en) class to return a `tf.experimental.Optional` instance that contains the next value for all replicas or none instead of raising an out of range error. Also see *new* [guide on input distribution](https://www.tensorflow.org/tutorials/distribute/input). - * Allow `var.assign` on `MirroredVariables` with `aggregation=NONE` in replica context. Previously this would raise an error since there was no way to confirm that the values being assigned to the `MirroredVariables` were in fact identical. - * `tf.distribute.experimental.MultiWorkerMirroredStrategy` adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. + * Allow var.assign on MirroredVariables with aggregation=NONE in replica context. Previously this would raise an error. We now allow this because many users and library writers find using `.assign` in replica context to be more convenient, instead of having to use `Strategy.extended.update` which was the previous way of updating variables in this situation. + * `tf.distribute.experimental.MultiWorkerMirroredStrategy` adds support for partial batches. Workers running out of data now continue to participate in the training with empty inputs, instead of raising an error. Learn more about [partial batches here](https://www.tensorflow.org/tutorials/distribute/input#partial_batches). * Improve the performance of reading metrics eagerly under `tf.distribute.experimental.MultiWorkerMirroredStrategy`. * Fix the issue that `strategy.reduce()` inside `tf.function` may raise exceptions when the values to reduce are from loops or if-clauses. * Fix the issue that `tf.distribute.MirroredStrategy` cannot be used together with `tf.distribute.experimental.MultiWorkerMirroredStrategy`. 
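A hedged sketch of the relaxed replica-context `assign` described in the hunk above; the single-host `MirroredStrategy` and scalar variable are assumptions for illustration:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # aggregation=NONE: assignments in replica context are applied as-is,
    # with no cross-replica aggregation.
    v = tf.Variable(0.0, aggregation=tf.VariableAggregation.NONE)

def replica_fn():
    v.assign(1.0)  # previously raised; Strategy.extended.update was required

strategy.run(replica_fn)
print(v.numpy())
```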
@@ -129,13 +127,14 @@ * Functional models now get constructed if *any* tensor in a layer call's arguments/keyword arguments comes from a keras input. Previously the functional api would only work if all of the elements in the first argument to the layer came from a keras input. * Clean up `BatchNormalization` layer's `trainable` property to act like standard python state when it's used inside `tf.functions` (frozen at tracing time), instead of acting like a pseudo-variable whose updates *kind of sometimes* get reflected in already-traced `tf.function` traces. * Add the `Conv1DTranspose` layer. - * Fix bug in `SensitivitySpecificityBase` derived metrics. + * Refine the semantics of `SensitivitySpecificityBase` derived metrics. See the updated API docstrings for [`tf.keras.metrics.SensitivityAtSpecificity`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/metrics/SensitivityAtSpecificity) and [`tf.keras.metrics.SpecificityAtSensitivty`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/metrics/SpecificityAtSensitivity). * Blacklist Case op from callback ### `tf.lite`: * Converter * Restored `inference_input_type` and `inference_output_type` flags in TF 2.x TFLiteConverter (backward compatible with TF 1.x) to support integer (tf.int8, tf.uint8) input and output types in post training full integer quantized models. * Added support for converting and resizing models with dynamic (placeholder) dimensions. Previously, there was only limited support for dynamic batch size, and even that did not guarantee that the model could be properly resized at runtime. + * Enabled experimental support for a new quantization mode with 16-bit activations and 8-bit weights. See `lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8`. * CPU * Fix an issue w/ dynamic weights and `Conv2D` on x86. * Add a runtime Android flag for enabling `XNNPACK` for optimized CPU performance. @@ -172,23 +171,26 @@ * Add missing kernels for flex delegate whitelisted ops. * Fix issue when using direct `ByteBuffer` inputs with graphs that have dynamic shapes. * Fix error checking supported operations in a model containing `HardSwish`. - + +### Packaging Support + * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support. + +### Profiler + * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. + ### TPU Enhancements - * 3D mesh support + * Adds 3D mesh support in TPU configurations ops. * Added TPU code for `FTRL` with `multiply_linear_by_lr`. * Silently adds a new file system registry at `gstpu`. * Support `restartType` in cloud tpu client. * Depend on a specific version of google-api-python-client. * Fixes apiclient import. -### XLA Support - * Implement stable `argmin` and `argmax` - ### Tracing and Debugging * Add a `TFE_Py_Execute` traceme. - -### Packaging Support - * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support. 
+ +### XLA Support + * Implement stable `argmin` and `argmax` ## Thanks to our Contributors From 2b03d7b7a0d66b0fe08952ce2cd14f6bc45bb328 Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Wed, 15 Jul 2020 17:22:49 -0700 Subject: [PATCH 11/13] Update RELEASE.md --- RELEASE.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index f123cff8efb..324e17f22d7 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -42,7 +42,7 @@ ### TF Core: * Set `tf2_behavior` to 1 to enable V2 for early loading cases. - * Add execute_fn_for_device function to dynamically choose the implementation based on underlying device placement. + * Add `execute_fn_for_device function` to dynamically choose the implementation based on underlying device placement. * Eager: * Add `reduce_logsumexp` benchmark with experiment compile. * Give `EagerTensor`s a meaningful `__array__` implementation. @@ -82,11 +82,10 @@ * Update `tf.vectorized_map` to support vectorizing `tf.while_loop` and TensorList operations. * `tf.custom_gradient` can now be applied to functions that accept nested structures of `tensors` as inputs (instead of just a list of tensors). Note that Python structures such as tuples and lists now won't be treated as tensors, so if you still want them to be treated that way, you need to wrap them with `tf.convert_to_tensor`. * No lowering on gradient case op when input is `DeviceIndex` op. - * Fix in c_api `DEFINE_GETATTR`. * Extend the ragged version of `tf.gather` to support `batch_dims` and `axis` args. * Update `tf.map_fn` to support RaggedTensors and SparseTensors. * Deprecate `tf.group`. It is not useful in eager mode. - * Add a new variant of `FTRL` allowing a learning rate of zero. + * Add CPU and GPU implementation of modified variation of [`FTRL`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/raw_ops/ApplyFtrl)/[`FTRLV2`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/raw_ops/ApplyFtrlV2) that can triggerred by `multiply_linear_by_lr` allowing a learning rate of zero. ### `tf.data`: * `tf.data.experimental.dense_to_ragged_batch` works correctly with tuples. @@ -128,7 +127,6 @@ * Clean up `BatchNormalization` layer's `trainable` property to act like standard python state when it's used inside `tf.functions` (frozen at tracing time), instead of acting like a pseudo-variable whose updates *kind of sometimes* get reflected in already-traced `tf.function` traces. * Add the `Conv1DTranspose` layer. * Refine the semantics of `SensitivitySpecificityBase` derived metrics. See the updated API docstrings for [`tf.keras.metrics.SensitivityAtSpecificity`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/metrics/SensitivityAtSpecificity) and [`tf.keras.metrics.SpecificityAtSensitivty`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/metrics/SpecificityAtSensitivity). - * Blacklist Case op from callback ### `tf.lite`: * Converter @@ -176,7 +174,7 @@ * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support. ### Profiler - * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. + * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. ### TPU Enhancements * Adds 3D mesh support in TPU configurations ops. 
From f9233753f385a407dc3d92e5982af926d16cdf94 Mon Sep 17 00:00:00 2001 From: Austin Anderson Date: Tue, 21 Jul 2020 16:04:26 -0700 Subject: [PATCH 12/13] Reword tf.sysconfig.get_build_info notice An internal bug report revealed that "support" could be mistaken for "official support," which is not intended. --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 324e17f22d7..eb67d308648 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -171,7 +171,7 @@ * Fix error checking supported operations in a model containing `HardSwish`. ### Packaging Support - * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions that the package was built to support. + * Added `tf.sysconfig.get_build_info()`. Returns a dict that describes the build environment of the currently installed TensorFlow package, e.g. the NVIDIA CUDA and NVIDIA CuDNN versions used when TensorFlow was built. ### Profiler * Fix a subtle use-after-free issue in `XStatVisitor::RefValue()`. From dbbdcde0fde46a1005a31645a7039c0759419b08 Mon Sep 17 00:00:00 2001 From: Goldie Gadde Date: Wed, 22 Jul 2020 14:21:11 -0700 Subject: [PATCH 13/13] Update RELEASE.md --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index eb67d308648..c2885fbd985 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -76,7 +76,7 @@ * Add `tf.saved_model.LoadOptions` with [`experimental_io_device`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/saved_model/LoadOptions?hl=en) as arg with default value `None` to choose the I/O device for loading models and weights. * Update `tf.saved_model.SaveOptions` with [`experimental_io_device`](https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/saved_model/SaveOptions?hl=en) as arg with default value `None` to choose the I/O device for saving models and weights. * GPU - * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. + * No longer includes PTX kernels for GPU except for sm_70 to reduce binary size. On systems with NVIDIA® Ampere GPUs (CUDA architecture 8.0) or newer, kernels are JIT-compiled from PTX and TensorFlow can take over 30 minutes to start up. This overhead can be limited to the first start up by increasing the default JIT cache size with: `export CUDA_CACHE_MAXSIZE=2147483648`.: * Others * Retain parent namescope for ops added inside `tf.while_loop`/`tf.cond`/`tf.switch_case`. * Update `tf.vectorized_map` to support vectorizing `tf.while_loop` and TensorList operations.