From e69f71759adac4a794d5b159358af5253cb243bf Mon Sep 17 00:00:00 2001
From: Rohan Jain <rohan100jain@gmail.com>
Date: Wed, 5 Apr 2017 14:10:53 -0700
Subject: [PATCH] Branch 152232810 (#8988)

* Improve py_func error handling.

Automatically translate some Python errors into corresponding TF errors at runtime.
Change: 152156821
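
A minimal sketch of what this enables at runtime, assuming the TF 1.x `tf.py_func` API and that a Python `ValueError` is mapped to `tf.errors.InvalidArgumentError` (the exact exception-to-error mapping is an assumption here):

```python
import tensorflow as tf

def _fail(x):
  # A plain Python exception raised inside the wrapped function.
  raise ValueError("bad input: %s" % x)

inp = tf.constant(1.0)
out = tf.py_func(_fail, [inp], tf.float32)

with tf.Session() as sess:
  try:
    sess.run(out)
  except tf.errors.InvalidArgumentError as e:
    # Before this change the same failure surfaced as a less specific error;
    # now it arrives as a typed TF error that callers can catch.
    print("translated error:", e)
```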

* Update interaction with libpng so that we use the public API instead of
relying on internal libpng data structures.
Change: 152167754

* TensorBoard plugins now contain their own name/route prefix.
Change: 152167807

* Passes trainable flag to separable_conv2d biases.
Change: 152170239

* Saving resource variables with a caching device.
Change: 152171539

* Drop loss from estimator_spec.eval_metric_ops, as required by core Estimator.
Change: 152179924

* sample_stats.percentile DOCFIX.
Change: 152182295

* Added a memory optimizer to grappler.
Change: 152184170

* Change the default behavior of the TensorBoard runs selector:

- If there are fewer than 41 runs, enable them all by default
- If there are 41 runs or more, disable them all by default

This is in response to user complaints that enabling only the first ten runs by default was confusing, because it was not obvious to users that some runs had been disabled.
However, it still solves the initial user complaint that having very many runs simultaneously enabled would lag the UI.

I also changed the "toggle all runs" button to try to turn everything off before turning everything on.
Also, I improved the logic for detecting when the runs selection is back in the default state, so that we can avoid generating long URI strings wherever possible.
Change: 152188948

* Autogenerated Change: Change TensorBoard TAG to 52
Change: 152189000

* Remove a warning that only happens with config cuda.
Change: 152189205

* Make resource variable shared name consistent with non-resource variables.

Remove the colocation constraint between a resource variable's cached value
and the variable itself.
Change: 152192203

* Add a way to specify the optimization order; refactor and add constant folding to meta optimizer.
Change: 152193646
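
For reference, a hedged sketch of how the order might be specified from Python; the `optimizers` field on `RewriterConfig` and the registry names used below are assumptions based on the description above, not something this patch guarantees:

```python
from tensorflow.core.protobuf import rewriter_config_pb2

# Assumed field name and optimizer names ("constfold", "memory"): request
# constant folding followed by the memory optimizer, in that order.
rewriter_config = rewriter_config_pb2.RewriterConfig()
rewriter_config.optimizers.extend(["constfold", "memory"])
print(rewriter_config)
```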

* Backport fixes and improvements from external Keras.
Change: 152198296

* Merge changes from GitHub.
Change: 152200430

* Go: Update generated wrapper functions for TensorFlow ops.
Change: 152200754

* Update ops-related pbtxt files.
Change: 152203174

* Make ImportGraphDef() work with functions.

In addition to modifying graph_constructor.cc, this patch adds some other
functionality needed to import functions:
* Ability to add FunctionDefLibraries to Graphs and
  FunctionLibraryDefinitions (in addition to existing functions)
* FunctionDefsEqual() utility function
Change: 152205258

* Expand contrib test to more than just test targets.
Change: 152206822

* Preserve graph version during optimization
Change: 152213262

* Exclude enter and exit nodes from shape refiner's constant folding.
Change: 152213637

* Allow reshape_mover and algebraic_simplifier to make multiple mutations by avoiding the
short-circuiting std::any_of.
Change: 152232810
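
A small, self-contained Python analogue of the pitfall being fixed: `any()` over a generator short-circuits at the first truthy result, so passes on later computations never run:

```python
def simplify(name, visited):
  visited.append(name)  # side effect: records that the pass actually ran
  return True           # pretend every computation gets changed

computations = ["comp0", "comp1", "comp2"]

short = []
changed = any(simplify(c, short) for c in computations)
print(changed, short)   # True ['comp0'] -- comp1 and comp2 were skipped

full = []
changed = False
for c in computations:
  if simplify(c, full):
    changed = True
print(changed, full)    # True ['comp0', 'comp1', 'comp2']
```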

* Fix workspace.bzl

* Further fixes to workspace.bzl

* Fix tensorflow.bzl merge conflicts

* Fix typo in dnn.h

* Fix bad merge in dnn.h
---
 tensorflow/compiler/tests/nary_ops_test.py    |   15 +-
 .../xla/service/algebraic_simplifier.cc       |   15 +-
 .../compiler/xla/service/reshape_mover.cc     |   20 +-
 .../xla/service/reshape_mover_test.cc         |   51 +
 .../distributions/python/ops/sample_stats.py  |    9 +-
 .../contrib/keras/python/keras/__init__.py    |    2 +-
 .../contrib/keras/python/keras/activations.py |   24 +-
 .../python/keras/applications/resnet50.py     |    4 +-
 .../contrib/keras/python/keras/backend.py     |   94 +-
 .../keras/python/keras/engine/topology.py     |   30 +-
 .../keras/python/keras/engine/training.py     |   26 +-
 .../keras/python/keras/initializers.py        |    9 +-
 .../python/keras/layers/convolutional.py      |   36 +-
 .../keras/layers/convolutional_recurrent.py   |    2 +-
 .../contrib/keras/python/keras/layers/core.py |    6 +-
 .../keras/python/keras/layers/local.py        |   16 +-
 .../keras/python/keras/layers/merge.py        |  156 +-
 .../python/keras/layers/normalization.py      |    2 +-
 .../keras/python/keras/layers/pooling.py      |   16 +-
 .../keras/python/keras/layers/recurrent.py    |   26 +-
 .../keras/python/keras/layers/wrappers.py     |   50 +-
 .../contrib/keras/python/keras/metrics.py     |    9 +-
 .../contrib/keras/python/keras/models.py      |   23 +-
 .../keras/python/keras/preprocessing/image.py |    2 +-
 .../keras/python/keras/utils/generic_utils.py |    5 +-
 .../keras/python/keras/utils/layer_utils.py   |    2 +-
 .../python/keras/wrappers/scikit_learn.py     |   34 +-
 .../layers/python/layers/embedding_ops.py     |  113 +-
 .../python/layers/embedding_ops_test.py       |  104 +-
 .../contrib/layers/python/layers/layers.py    |    1 +
 .../layers/python/layers/layers_test.py       |   14 +
 .../dataframe/queues/feeding_functions.py     |    1 +
 .../learn/python/learn/estimators/model_fn.py |   11 +-
 .../python/learn/estimators/model_fn_test.py  |   38 +-
 .../python/learn/learn_io/generator_io.py     |   44 +-
 .../learn/learn_io/generator_io_test.py       |  168 +-
 .../opt/python/training/external_optimizer.py |   18 +-
 tensorflow/contrib/seq2seq/python/ops/loss.py |   26 +-
 .../core/common_runtime/gpu/gpu_device.cc     |    8 +-
 .../common_runtime/gpu/gpu_event_mgr_test.cc  |    4 +-
 .../core/common_runtime/shape_refiner.cc      |    7 +
 tensorflow/core/framework/function.cc         |   42 +
 tensorflow/core/framework/function.h          |    7 +
 tensorflow/core/framework/function_test.cc    |   32 +
 tensorflow/core/graph/graph.cc                |   42 +-
 tensorflow/core/graph/graph.h                 |    6 +
 tensorflow/core/graph/graph_constructor.cc    |   16 +-
 tensorflow/core/graph/graph_constructor.h     |    2 -
 .../core/graph/graph_constructor_test.cc      |  209 +-
 tensorflow/core/graph/graph_test.cc           |   56 +
 tensorflow/core/graph/mkl_layout_pass.cc      |  240 ++-
 tensorflow/core/graph/mkl_layout_pass_test.cc |   17 +-
 .../core/graph/mkl_tfconversion_pass.cc       |    2 +-
 .../core/graph/mkl_tfconversion_pass_test.cc  |    4 +-
 tensorflow/core/grappler/optimizers/BUILD     |   33 +
 .../grappler/optimizers/constant_folding.cc   |    3 +-
 .../grappler/optimizers/memory_optimizer.cc   |   83 +
 .../grappler/optimizers/memory_optimizer.h    |   42 +
 .../optimizers/memory_optimizer_test.cc       |   74 +
 .../grappler/optimizers/meta_optimizer.cc     |   65 +-
 .../core/grappler/optimizers/meta_optimizer.h |    1 +
 .../core/kernels/conv_grad_filter_ops.cc      |   69 +-
 .../core/kernels/conv_grad_input_ops.cc       |   40 +-
 tensorflow/core/kernels/cudnn_pooling_gpu.cc  |    7 +-
 tensorflow/core/kernels/maxpooling_op.cc      |   61 +-
 .../core/kernels/maxpooling_op_gpu.cu.cc      |   28 +-
 tensorflow/core/kernels/maxpooling_op_gpu.h   |   40 +-
 tensorflow/core/kernels/mkl_avgpooling_op.cc  |    9 +-
 .../core/kernels/mkl_conv_grad_bias_ops.cc    |    2 +-
 .../core/kernels/mkl_conv_grad_filter_ops.cc  |    2 +-
 .../core/kernels/mkl_conv_grad_input_ops.cc   |    4 +-
 tensorflow/core/kernels/mkl_conv_ops.cc       |   29 +-
 tensorflow/core/kernels/mkl_maxpooling_op.cc  |   22 +-
 .../core/kernels/mkl_pooling_ops_common.cc    |  228 +-
 .../core/kernels/mkl_pooling_ops_common.h     |    9 +-
 tensorflow/core/kernels/mkl_relu_op.cc        |  794 +++----
 tensorflow/core/kernels/pooling_ops_3d.cc     |   26 +-
 .../core/kernels/pooling_ops_3d_gpu.cu.cc     |    7 +-
 tensorflow/core/kernels/pooling_ops_common.cc |    2 +-
 tensorflow/core/kernels/xsmm_conv2d.cc        |  394 ++--
 tensorflow/core/lib/io/inputbuffer.cc         |    2 +-
 tensorflow/core/lib/png/png_io.cc             |   26 +-
 .../core/ops/compat/ops_history.v1.pbtxt      |  725 ++++++-
 tensorflow/core/ops/nn_grad.cc                |    1 -
 tensorflow/core/ops/ops.pbtxt                 |  320 ++-
 tensorflow/core/platform/cpu_info.cc          |   27 +-
 tensorflow/core/platform/windows/port.cc      |   27 +-
 .../core/protobuf/rewriter_config.proto       |    4 +
 tensorflow/core/util/equal_graph_def.cc       |   16 +-
 tensorflow/core/util/equal_graph_def.h        |    9 +
 tensorflow/core/util/equal_graph_def_test.cc  |   31 +-
 tensorflow/core/util/mkl_util.h               |   10 +-
 .../go/example_inception_inference_test.go    |    2 +-
 tensorflow/go/genop/internal/genop.go         |   12 +-
 tensorflow/go/op/wrappers.go                  | 1179 ++++++-----
 tensorflow/python/estimator/estimator.py      |    6 +-
 tensorflow/python/estimator/estimator_test.py |   14 +-
 .../inputs/queues/feeding_functions.py        |   11 +-
 .../python/framework/tensor_util_test.py      |  106 +-
 .../kernel_tests/pooling_ops_3d_test.py       |    2 +-
 .../python/kernel_tests/pooling_ops_test.py   |   58 +-
 .../python/kernel_tests/py_func_test.py       |   22 +
 .../resource_variable_ops_test.py             |   33 +
 tensorflow/python/lib/core/py_func.cc         |   64 +-
 tensorflow/python/ops/nn_grad.py              |  127 +-
 tensorflow/python/ops/nn_ops.py               |    2 +
 .../python/ops/resource_variable_ops.py       |   24 +-
 tensorflow/python/platform/control_imports.py |   27 +
 tensorflow/python/training/saver_test.py      |   12 +
 tensorflow/stream_executor/cuda/cuda_dnn.cc   |    7 +-
 tensorflow/stream_executor/dnn.h              |   38 +-
 tensorflow/tensorboard/TAG                    |    2 +-
 tensorflow/tensorboard/backend/BUILD          |    1 +
 tensorflow/tensorboard/backend/application.py |   31 +-
 .../tensorboard/backend/application_test.py   |   32 +-
 .../tf-multi-checkbox.html                    |   46 +-
 .../components/vz_line_chart/vz-line-chart.ts |    8 +-
 tensorflow/tensorboard/plugins/base_plugin.py |    6 +
 .../plugins/debugger/debugger_plugin.py       |    4 +-
 .../plugins/debugger/debugger_plugin_test.py  |    2 +-
 .../plugins/projector/projector_plugin.py     |    4 +-
 .../projector/projector_plugin_test.py        |    4 +-
 .../tensorboard/plugins/text/text_plugin.py   |    4 +-
 tensorflow/tensorflow.bzl                     | 1015 +++++----
 .../ci_build/linux/cpu/run_py3_contrib.sh     |    2 +-
 tensorflow/tools/pip_package/setup.py         |    5 +-
 tensorflow/tools/test/check_futures_test.py   |    1 +
 tensorflow/workspace.bzl                      | 1827 ++++++++---------
 third_party/libxsmm.BUILD                     |    3 +-
 129 files changed, 6280 insertions(+), 3651 deletions(-)
 create mode 100644 tensorflow/core/grappler/optimizers/memory_optimizer.cc
 create mode 100644 tensorflow/core/grappler/optimizers/memory_optimizer.h
 create mode 100644 tensorflow/core/grappler/optimizers/memory_optimizer_test.cc
 create mode 100644 tensorflow/python/platform/control_imports.py

diff --git a/tensorflow/compiler/tests/nary_ops_test.py b/tensorflow/compiler/tests/nary_ops_test.py
index e89c411d01f..2660e1d5728 100644
--- a/tensorflow/compiler/tests/nary_ops_test.py
+++ b/tensorflow/compiler/tests/nary_ops_test.py
@@ -116,13 +116,14 @@ class NAryOpsTest(XLATestCase):
                     np.array([1, 1], dtype=np.int32)],
                    expected=np.array([[], []], dtype=np.float32))
 
-    if (np.int64 in self.int_types):
-      self._testNAry(lambda x: array_ops.strided_slice(*x),
-                     [np.array([[], [], []], dtype=np.float32),
-                      np.array([1, 0], dtype=np.int64),
-                      np.array([3, 0], dtype=np.int64),
-                      np.array([1, 1], dtype=np.int64)],
-                     expected=np.array([[], []], dtype=np.float32))
+    if np.int64 in self.int_types:
+      self._testNAry(
+          lambda x: array_ops.strided_slice(*x), [
+              np.array([[], [], []], dtype=np.float32), np.array(
+                  [1, 0], dtype=np.int64), np.array([3, 0], dtype=np.int64),
+              np.array([1, 1], dtype=np.int64)
+          ],
+          expected=np.array([[], []], dtype=np.float32))
 
     self._testNAry(lambda x: array_ops.strided_slice(*x),
                    [np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
diff --git a/tensorflow/compiler/xla/service/algebraic_simplifier.cc b/tensorflow/compiler/xla/service/algebraic_simplifier.cc
index 4c058484b9f..415aafe69ac 100644
--- a/tensorflow/compiler/xla/service/algebraic_simplifier.cc
+++ b/tensorflow/compiler/xla/service/algebraic_simplifier.cc
@@ -1348,13 +1348,14 @@ Status AlgebraicSimplifierVisitor::HandleMinimum(HloInstruction* minimum,
 StatusOr<bool> AlgebraicSimplifier::Run(HloModule* module) {
   XLA_VLOG_LINES(2,
                  "AlgebraicSimplifier::Run(), before:\n" + module->ToString());
-  bool changed =
-      std::any_of(module->computations().begin(), module->computations().end(),
-                  [=](const std::unique_ptr<HloComputation>& computation) {
-                    return AlgebraicSimplifierVisitor::Run(
-                        computation.get(), is_layout_sensitive_,
-                        valid_bitcast_callback_, enable_dot_simplification_);
-                  });
+  bool changed = false;
+  for (auto& comp : module->computations()) {
+    if (AlgebraicSimplifierVisitor::Run(comp.get(), is_layout_sensitive_,
+                                        valid_bitcast_callback_,
+                                        enable_dot_simplification_)) {
+      changed = true;
+    }
+  }
   XLA_VLOG_LINES(2,
                  "AlgebraicSimplifier::Run(), after:\n" + module->ToString());
   return changed;
diff --git a/tensorflow/compiler/xla/service/reshape_mover.cc b/tensorflow/compiler/xla/service/reshape_mover.cc
index 3bff35544c8..b72ef95a6a7 100644
--- a/tensorflow/compiler/xla/service/reshape_mover.cc
+++ b/tensorflow/compiler/xla/service/reshape_mover.cc
@@ -234,17 +234,15 @@ bool TrySinkReshapeOrTranspose(HloComputation* computation,
 }  // namespace
 
 StatusOr<bool> ReshapeMover::Run(HloModule* module) {
-  return std::any_of(
-      module->computations().begin(), module->computations().end(),
-      [](const std::unique_ptr<HloComputation>& computation) {
-        std::list<HloInstruction*> postorder =
-            computation->MakeInstructionPostOrder();
-        return std::any_of(postorder.begin(), postorder.end(),
-                           [&computation](HloInstruction* instruction) {
-                             return TrySinkReshapeOrTranspose(computation.get(),
-                                                              instruction);
-                           });
-      });
+  bool changed = false;
+  for (const auto& comp : module->computations()) {
+    for (HloInstruction* instruction : comp->MakeInstructionPostOrder()) {
+      if (TrySinkReshapeOrTranspose(comp.get(), instruction)) {
+        changed = true;
+      }
+    }
+  }
+  return changed;
 }
 
 }  // namespace xla
diff --git a/tensorflow/compiler/xla/service/reshape_mover_test.cc b/tensorflow/compiler/xla/service/reshape_mover_test.cc
index 1862e2e992e..09a673ea809 100644
--- a/tensorflow/compiler/xla/service/reshape_mover_test.cc
+++ b/tensorflow/compiler/xla/service/reshape_mover_test.cc
@@ -202,5 +202,56 @@ TEST_F(ReshapeMoverTest, ScalarReshapeNotMovedAcrossSelect) {
   EXPECT_EQ(select, computation->root_instruction());
 }
 
+// Tree looks like this:
+//
+// add1
+// |
+// +- reshape2 - param2
+// |
+// +- reshape3 - add0
+//               |
+//               + reshape0 - param0
+//               |
+//               + reshape1 - param1
+//
+// We expect reshape{0,1} AND reshape{2,3} to be lifted.
+TEST_F(ReshapeMoverTest, MultiplePasses) {
+  auto shape1 = ShapeUtil::MakeShape(F32, {1, 8, 1, 7});
+  auto shape2 = ShapeUtil::MakeShape(F32, {8, 7, 1});
+  auto shape3 = ShapeUtil::MakeShape(F32, {8, 7});
+  HloComputation::Builder builder(TestName());
+  auto param0 = builder.AddInstruction(
+      HloInstruction::CreateParameter(0, shape1, "param0"));
+  auto param1 = builder.AddInstruction(
+      HloInstruction::CreateParameter(1, shape1, "param1"));
+  auto param2 = builder.AddInstruction(
+      HloInstruction::CreateParameter(2, shape2, "param2"));
+  auto reshape0 =
+      builder.AddInstruction(HloInstruction::CreateReshape(shape2, param0));
+  auto reshape1 =
+      builder.AddInstruction(HloInstruction::CreateReshape(shape2, param1));
+  auto add0 = builder.AddInstruction(HloInstruction::CreateBinary(
+      shape2, HloOpcode::kAdd, reshape0, reshape1));
+  auto reshape2 =
+      builder.AddInstruction(HloInstruction::CreateReshape(shape3, param2));
+  auto reshape3 =
+      builder.AddInstruction(HloInstruction::CreateReshape(shape3, add0));
+  auto add1 = builder.AddInstruction(HloInstruction::CreateBinary(
+      shape3, HloOpcode::kAdd, reshape2, reshape3));
+
+  auto module = MakeUnique<HloModule>(TestName());
+  auto computation = module->AddEntryComputation(builder.Build());
+  EXPECT_EQ(add1, computation->root_instruction());
+  EXPECT_TRUE(ReshapeMover().Run(module.get()).ValueOrDie());
+  EXPECT_EQ(HloOpcode::kReshape, computation->root_instruction()->opcode());
+  EXPECT_EQ(HloOpcode::kAdd,
+            computation->root_instruction()->operand(0)->opcode());
+  const auto& add_params =
+      computation->root_instruction()->operand(0)->operands();
+  EXPECT_EQ(2, add_params.size());
+  EXPECT_EQ(HloOpcode::kParameter, add_params[0]->opcode());
+  EXPECT_EQ(HloOpcode::kReshape, add_params[1]->opcode());
+}
+
 }  // namespace
 }  // namespace xla
diff --git a/tensorflow/contrib/distributions/python/ops/sample_stats.py b/tensorflow/contrib/distributions/python/ops/sample_stats.py
index 0b1ceefe7bf..26cf922d0af 100644
--- a/tensorflow/contrib/distributions/python/ops/sample_stats.py
+++ b/tensorflow/contrib/distributions/python/ops/sample_stats.py
@@ -44,7 +44,7 @@ def percentile(x,
                keep_dims=False,
                validate_args=False,
                name=None):
-  """Compute the `q`-th percentile of `x` along leading (sample) dimensions.
+  """Compute the `q`-th percentile of `x`.
 
   Given a vector `x`, the `q`-th percentile of `x` is the value `q / 100` of the
   way from the minimum to the maximum in in a sorted copy of `x`.
@@ -58,7 +58,7 @@ def percentile(x,
 
 
   ```python
-  # Get 30th percentile with default ('linear') interpolation.
+  # Get 30th percentile with default ('nearest') interpolation.
   x = [1., 2., 3., 4.]
   percentile(x, q=30.)
   ==> 2.0
@@ -91,11 +91,10 @@ def percentile(x,
     axis:  Optional `0-D` or `1-D` integer `Tensor` with constant values.
       The axis that hold independent samples over which to return the desired
       percentile.  If `None` (the default), treat every dimension as a sample
-      dimension, returning a scalar
+      dimension, returning a scalar.
     interpolation : {"lower", "higher", "nearest"}.  Default: "nearest"
       This optional parameter specifies the interpolation method to
-      use when the desired quantile lies between two data points
-      `i < j`:
+      use when the desired quantile lies between two data points `i < j`:
         * lower: `i`.
         * higher: `j`.
         * nearest: `i` or `j`, whichever is nearest.
diff --git a/tensorflow/contrib/keras/python/keras/__init__.py b/tensorflow/contrib/keras/python/keras/__init__.py
index cdfc40dff1d..ec316253dba 100644
--- a/tensorflow/contrib/keras/python/keras/__init__.py
+++ b/tensorflow/contrib/keras/python/keras/__init__.py
@@ -37,4 +37,4 @@ from tensorflow.contrib.keras.python.keras import utils
 from tensorflow.contrib.keras.python.keras import wrappers
 
 
-__version__ = '2.0.0-tf'
+__version__ = '2.0.2-tf'
diff --git a/tensorflow/contrib/keras/python/keras/activations.py b/tensorflow/contrib/keras/python/keras/activations.py
index 1eac52dfad6..67762c83ba2 100644
--- a/tensorflow/contrib/keras/python/keras/activations.py
+++ b/tensorflow/contrib/keras/python/keras/activations.py
@@ -24,18 +24,28 @@ from tensorflow.contrib.keras.python.keras import backend as K
 from tensorflow.contrib.keras.python.keras.utils.generic_utils import deserialize_keras_object
 
 
-def softmax(x):
+def softmax(x, axis=-1):
+  """Softmax activation function.
+
+  Arguments:
+      x : Tensor.
+      axis: Integer, axis along which the softmax normalization is applied.
+
+  Returns:
+      Tensor, output of softmax transformation.
+
+  Raises:
+      ValueError: In case `dim(x) == 1`.
+  """
   ndim = K.ndim(x)
   if ndim == 2:
     return K.softmax(x)
-  elif ndim == 3:
-    e = K.exp(x - K.max(x, axis=-1, keepdims=True))
-    s = K.sum(e, axis=-1, keepdims=True)
+  elif ndim > 2:
+    e = K.exp(x - K.max(x, axis=axis, keepdims=True))
+    s = K.sum(e, axis=axis, keepdims=True)
     return e / s
   else:
-    raise ValueError('Cannot apply softmax to a tensor '
-                     'that is not 2D or 3D. '
-                     'Here, ndim=' + str(ndim))
+    raise ValueError('Cannot apply softmax to a tensor that is 1D')
 
 
 def elu(x, alpha=1.0):
diff --git a/tensorflow/contrib/keras/python/keras/applications/resnet50.py b/tensorflow/contrib/keras/python/keras/applications/resnet50.py
index 546fcb9433a..12f7ca424ed 100644
--- a/tensorflow/contrib/keras/python/keras/applications/resnet50.py
+++ b/tensorflow/contrib/keras/python/keras/applications/resnet50.py
@@ -163,8 +163,8 @@ def ResNet50(include_top=True,
   specified in your Keras config file.
 
   Arguments:
-      include_top: whether to include the 3 fully-connected
-          layers at the top of the network.
+      include_top: whether to include the fully-connected
+          layer at the top of the network.
       weights: one of `None` (random initialization)
           or "imagenet" (pre-training on ImageNet).
       input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
diff --git a/tensorflow/contrib/keras/python/keras/backend.py b/tensorflow/contrib/keras/python/keras/backend.py
index 9769bce3b05..d7c646c19a7 100644
--- a/tensorflow/contrib/keras/python/keras/backend.py
+++ b/tensorflow/contrib/keras/python/keras/backend.py
@@ -22,7 +22,6 @@ from __future__ import division
 from __future__ import print_function
 
 from collections import defaultdict
-import errno
 import json
 import os
 import warnings
@@ -270,6 +269,7 @@ def clear_session():
   reset_uids()
   _SESSION = None
   phase = array_ops.placeholder(dtype='bool', name='keras_learning_phase')
+  _GRAPH_LEARNING_PHASES = {}
   _GRAPH_LEARNING_PHASES[ops.get_default_graph()] = phase
 
 
@@ -1257,6 +1257,34 @@ def prod(x, axis=None, keepdims=False):
   return math_ops.reduce_prod(x, reduction_indices=axis, keep_dims=keepdims)
 
 
+def cumsum(x, axis=0):
+  """Cumulative sum of the values in a tensor, alongside the specified axis.
+
+  Arguments:
+      x: A tensor or variable.
+      axis: An integer, the axis to compute the sum.
+
+  Returns:
+      A tensor of the cumulative sum of values of `x` along `axis`.
+  """
+  axis = _normalize_axis(axis, ndim(x))
+  return math_ops.cumsum(x, axis=axis)
+
+
+def cumprod(x, axis=0):
+  """Cumulative product of the values in a tensor, alongside the specified axis.
+
+  Arguments:
+      x: A tensor or variable.
+      axis: An integer, the axis to compute the product.
+
+  Returns:
+      A tensor of the cumulative product of values of `x` along `axis`.
+  """
+  axis = _normalize_axis(axis, ndim(x))
+  return math_ops.cumprod(x, axis=axis)
+
+
 def var(x, axis=None, keepdims=False):
   """Variance of a tensor, alongside the specified axis.
 
@@ -1330,8 +1358,7 @@ def any(x, axis=None, keepdims=False):
   """
   axis = _normalize_axis(axis, ndim(x))
   x = math_ops.cast(x, dtypes_module.bool)
-  x = math_ops.reduce_any(x, reduction_indices=axis, keep_dims=keepdims)
-  return math_ops.cast(x, dtypes_module.uint8)
+  return math_ops.reduce_any(x, reduction_indices=axis, keep_dims=keepdims)
 
 
 def all(x, axis=None, keepdims=False):
@@ -1347,8 +1374,7 @@ def all(x, axis=None, keepdims=False):
   """
   axis = _normalize_axis(axis, ndim(x))
   x = math_ops.cast(x, dtypes_module.bool)
-  x = math_ops.reduce_all(x, reduction_indices=axis, keep_dims=keepdims)
-  return math_ops.cast(x, dtypes_module.uint8)
+  return math_ops.reduce_all(x, reduction_indices=axis, keep_dims=keepdims)
 
 
 def argmax(x, axis=-1):
@@ -1645,7 +1671,7 @@ def normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=1e-3):
   """
   mean, var = nn.moments(
       x, reduction_axes, shift=None, name=None, keep_dims=False)
-  if sorted(reduction_axes) == range(ndim(x))[:-1]:
+  if sorted(reduction_axes) == list(range(ndim(x)))[:-1]:
     normed = nn.batch_normalization(x, mean, var, beta, gamma, epsilon)
   else:
     # need broadcasting
@@ -2324,8 +2350,8 @@ def rnn(step_function,
           (no time dimension),
           containing the initial values for the states used in
           the step function.
-      go_backwards: boolean. If True, do the iteration over
-          the time dimension in reverse order.
+      go_backwards: boolean. If True, do the iteration over the time
+          dimension in reverse order and return the reversed sequence.
       mask: binary tensor with shape `(samples, time, 1)`,
           with a zero for every element that is masked.
       constants: a list of constant values passed at each step.
@@ -2414,9 +2440,9 @@ def rnn(step_function,
         states = return_states
         successive_outputs.append(output)
         successive_states.append(states)
-        last_output = successive_outputs[-1]
-        new_states = successive_states[-1]
-        outputs = array_ops.stack(successive_outputs)
+      last_output = successive_outputs[-1]
+      new_states = successive_states[-1]
+      outputs = array_ops.stack(successive_outputs)
     else:
       for inp in input_list:
         output, states = step_function(inp, states + constants)
@@ -3534,19 +3560,19 @@ def ctc_decode(y_pred, input_length, greedy=True, beam_width=100, top_paths=1):
 # HIGH ORDER FUNCTIONS
 
 
-def map_fn(fn, elems, name=None):
+def map_fn(fn, elems, name=None, dtype=None):
   """Map the function fn over the elements elems and return the outputs.
 
   Arguments:
       fn: Callable that will be called upon each element in elems
       elems: tensor
       name: A string name for the map node in the graph
+      dtype: Output data type.
 
   Returns:
-      Tensor with first dimension equal to the elems and second depending on
-      fn
+      Tensor with dtype `dtype`.
   """
-  return functional_ops.map_fn(fn, elems, name=name)
+  return functional_ops.map_fn(fn, elems, name=name, dtype=dtype)
 
 
 def foldl(fn, elems, initializer=None, name=None):
@@ -3560,7 +3586,7 @@ def foldl(fn, elems, initializer=None, name=None):
       name: A string name for the foldl node in the graph
 
   Returns:
-      Same type and shape as initializer
+      Tensor with same type and shape as `initializer`.
   """
   return functional_ops.foldl(fn, elems, initializer=initializer, name=name)
 
@@ -3583,27 +3609,39 @@ def foldr(fn, elems, initializer=None, name=None):
 
 # Load Keras default configuration from config file if present.
 _keras_base_dir = os.path.expanduser('~')
-if not os.access(_keras_base_dir, os.W_OK):
-  _keras_base_dir = '/tmp'
 _keras_dir = os.path.join(_keras_base_dir, '.keras')
-if not os.path.exists(_keras_dir):
-  try:
-    os.makedirs(_keras_dir)
-  except OSError as e:
-    if e.errno == errno.EEXIST:
-      pass
-    else:
-      raise
 _config_path = os.path.expanduser(os.path.join(_keras_dir, 'keras.json'))
 if os.path.exists(_config_path):
-  _config = json.load(open(_config_path))
+  try:
+    _config = json.load(open(_config_path))
+  except json.decoder.JSONDecodeError:
+    _config = {}
   _floatx = _config.get('floatx', floatx())
   assert _floatx in {'float16', 'float32', 'float64'}
   _epsilon = _config.get('epsilon', epsilon())
   assert isinstance(_epsilon, float)
-  _backend = backend()
   _image_data_format = _config.get('image_data_format', image_data_format())
   assert _image_data_format in {'channels_last', 'channels_first'}
   set_floatx(_floatx)
   set_epsilon(_epsilon)
   set_image_data_format(_image_data_format)
+
+# Save config file.
+if os.access(_keras_base_dir, os.W_OK):
+  if not os.path.exists(_keras_dir):
+    try:
+      os.makedirs(_keras_dir)
+    except OSError:
+      # Except potential race conditions
+      # in multi-threaded environments.
+      pass
+
+  if not os.path.exists(_config_path):
+    _config = {
+        'floatx': floatx(),
+        'epsilon': epsilon(),
+        'backend': 'tensorflow',
+        'image_data_format': image_data_format()
+    }
+    with open(_config_path, 'w') as f:
+      f.write(json.dumps(_config, indent=4))
diff --git a/tensorflow/contrib/keras/python/keras/engine/topology.py b/tensorflow/contrib/keras/python/keras/engine/topology.py
index 0f506ff0a46..e33268235f0 100644
--- a/tensorflow/contrib/keras/python/keras/engine/topology.py
+++ b/tensorflow/contrib/keras/python/keras/engine/topology.py
@@ -295,8 +295,14 @@ class Layer(object):
     # are only applicable to input layers: do not pass these keywords
     # to non-input layers.
     allowed_kwargs = {
-        'input_shape', 'batch_input_shape', 'batch_size', 'dtype', 'name',
-        'trainable', 'weights'
+        'input_shape',
+        'batch_input_shape',
+        'batch_size',
+        'dtype',
+        'name',
+        'trainable',
+        'weights',
+        'input_dtype',  # legacy
     }
     for kwarg in kwargs:
       if kwarg not in allowed_kwargs:
@@ -320,8 +326,15 @@ class Layer(object):
           batch_size = None
         batch_input_shape = (batch_size,) + tuple(kwargs['input_shape'])
       self.batch_input_shape = batch_input_shape
-      dtype = kwargs.get('dtype', K.floatx())
+
+      # Set dtype.
+      dtype = kwargs.get('dtype')
+      if dtype is None:
+        dtype = kwargs.get('input_dtype')
+      if dtype is None:
+        dtype = K.floatx()
       self.dtype = dtype
+
     if 'weights' in kwargs:
       self._initial_weights = kwargs['weights']
     else:
@@ -485,11 +498,12 @@ class Layer(object):
                                  ': expected shape=' + str(spec.shape) +
                                  ', found shape=' + str(x_shape))
 
-  def call(self, inputs):
+  def call(self, inputs, **kwargs):  # pylint: disable=unused-argument
     """This is where the layer's logic lives.
 
     Arguments:
-        inputs: input tensor, or list/tuple of input tensors.
+        inputs: Input tensor, or list/tuple of input tensors.
+        **kwargs: Additional keyword arguments.
 
     Returns:
         A tensor or list/tuple of tensors.
@@ -518,6 +532,8 @@ class Layer(object):
         ValueError: in case the layer is missing shape information
             for its `build` call.
     """
+    if isinstance(inputs, list):
+      inputs = inputs[:]
     with K.name_scope(self.name):
       # Handle laying building (weight creating, input spec locking).
       if not self.built:
@@ -1417,7 +1433,7 @@ class Container(Layer):
       get_weights
       set_weights
       get_config
-      get_output_shape_for
+      compute_output_shape
 
   # Class Methods
       from_config
@@ -2029,7 +2045,7 @@ class Container(Layer):
       for i in range(len(input_shapes)):
         layer = self.input_layers[i]
         input_shape = input_shapes[i]
-        # It's an input layer: get_output_shape_for is identity,
+        # It's an input layer: compute_output_shape is identity,
         # and there is only one node and one tensor output.
         shape_key = layer.name + '_0_0'
         layers_to_output_shapes[shape_key] = input_shape
diff --git a/tensorflow/contrib/keras/python/keras/engine/training.py b/tensorflow/contrib/keras/python/keras/engine/training.py
index efd437f6f66..0097c4a1c2c 100644
--- a/tensorflow/contrib/keras/python/keras/engine/training.py
+++ b/tensorflow/contrib/keras/python/keras/engine/training.py
@@ -733,11 +733,12 @@ class Model(Container):
       loss_functions = []
       for name in self.output_names:
         if name not in loss:
-          warnings.warn('Output "' + name + '" missing from loss dictionary. '
-                        'We assume this was done on purpose, '
-                        'and we will not be expecting '
-                        'any data to be passed to "' + name +
-                        '" during training.')
+          warnings.warn(
+              'Output "' + name + '" missing from loss dictionary. '
+              'We assume this was done on purpose, '
+              'and we will not be expecting '
+              'any data to be passed to "' + name + '" during training.',
+              stacklevel=2)
         loss_functions.append(losses.get(loss.get(name)))
     elif isinstance(loss, list):
       if len(loss) != len(self.outputs):
@@ -1202,7 +1203,7 @@ class Model(Container):
       if batch_index == 0:
         for batch_out in batch_outs:
           shape = (samples,) + batch_out.shape[1:]
-          outs.append(np.zeros(shape, dtype=K.floatx()))
+          outs.append(np.zeros(shape, dtype=batch_out.dtype))
 
       for i, batch_out in enumerate(batch_outs):
         outs[i][batch_start:batch_end] = batch_out
@@ -1718,7 +1719,7 @@ class Model(Container):
             - a tuple (inputs, targets, sample_weights).
             All arrays should contain the same number of samples.
             The generator is expected to loop over its data
-            indefinitely. An epoch finishes when `samples_per_epoch`
+            indefinitely. An epoch finishes when `steps_per_epoch`
             samples have been seen by the model.
         steps_per_epoch: Total number of steps (batches of samples)
             to yield from `generator` before declaring one epoch
@@ -1767,7 +1768,7 @@ class Model(Container):
                 f.close()
 
         model.fit_generator(generate_arrays_from_file('/my_file.txt'),
-                            samples_per_epoch=10000, epochs=10)
+                            steps_per_epoch=10000, epochs=10)
     ```
 
     Raises:
@@ -2028,7 +2029,8 @@ class Model(Container):
                         steps,
                         max_q_size=10,
                         workers=1,
-                        pickle_safe=False):
+                        pickle_safe=False,
+                        verbose=0):
     """Generates predictions for the input samples from a data generator.
 
     The generator should return the same kind of data as accepted by
@@ -2048,6 +2050,7 @@ class Model(Container):
             non picklable arguments to the generator
             as they can't be passed
             easily to children processes.
+        verbose: verbosity mode, 0 or 1.
 
     Returns:
         Numpy array(s) of predictions.
@@ -2067,6 +2070,9 @@ class Model(Container):
       enqueuer = GeneratorEnqueuer(generator, pickle_safe=pickle_safe)
       enqueuer.start(workers=workers, max_q_size=max_q_size)
 
+      if verbose == 1:
+        progbar = Progbar(target=steps)
+
       while steps_done < steps:
         generator_output = None
         while enqueuer.is_running():
@@ -2103,6 +2109,8 @@ class Model(Container):
         for i, out in enumerate(outs):
           all_outs[i].append(out)
         steps_done += 1
+        if verbose == 1:
+          progbar.update(steps_done)
 
     finally:
       if enqueuer is not None:
diff --git a/tensorflow/contrib/keras/python/keras/initializers.py b/tensorflow/contrib/keras/python/keras/initializers.py
index 621069f424b..f9cb35e171e 100644
--- a/tensorflow/contrib/keras/python/keras/initializers.py
+++ b/tensorflow/contrib/keras/python/keras/initializers.py
@@ -45,14 +45,16 @@ class Initializer(object):
 
 
 class Zeros(Initializer):
-  """Initializer that generates tensors initialized to 0."""
+  """Initializer that generates tensors initialized to 0.
+  """
 
   def __call__(self, shape, dtype=None):
     return K.constant(0, shape=shape, dtype=dtype)
 
 
 class Ones(Initializer):
-  """Initializer that generates tensors initialized to 1."""
+  """Initializer that generates tensors initialized to 1.
+  """
 
   def __call__(self, shape, dtype=None):
     return K.constant(1, shape=shape, dtype=dtype)
@@ -130,7 +132,7 @@ class RandomUniform(Initializer):
 class TruncatedNormal(Initializer):
   """Initializer that generates a truncated normal distribution.
 
-  These values are similar to values from a `random_normal_initializer`
+  These values are similar to values from a `RandomNormal`
   except that values more than two standard deviations from the mean
   are discarded and re-drawn. This is the recommended initializer for
   neural network weights and filters.
@@ -161,6 +163,7 @@ class VarianceScaling(Initializer):
 
   With `distribution="normal"`, samples are drawn from a truncated normal
   distribution centered on zero, with `stddev = sqrt(scale / n)` where n is:
+
       - number of input units in the weight tensor, if mode = "fan_in"
       - number of output units, if mode = "fan_out"
       - average of the numbers of input and output units, if mode = "fan_avg"
diff --git a/tensorflow/contrib/keras/python/keras/layers/convolutional.py b/tensorflow/contrib/keras/python/keras/layers/convolutional.py
index 1a28399a28f..3b68022115a 100644
--- a/tensorflow/contrib/keras/python/keras/layers/convolutional.py
+++ b/tensorflow/contrib/keras/python/keras/layers/convolutional.py
@@ -244,7 +244,7 @@ class _Conv(Layer):
         'kernel_initializer':
             initializers.serialize(self.kernel_initializer),
         'bias_initializer':
-            initializers.serialize(self.kernel_initializer),
+            initializers.serialize(self.bias_initializer),
         'kernel_regularizer':
             regularizers.serialize(self.kernel_regularizer),
         'bias_regularizer':
@@ -289,7 +289,7 @@ class Conv1D(_Conv):
           any `dilation_rate` value != 1.
       padding: One of `"valid"`, `"causal"` or `"same"` (case-insensitive).
           `"causal"` results in causal (dilated) convolutions, e.g. output[t]
-          depends solely on input[:t-1]. Useful when modeling temporal data
+          does not depend on input[t+1:]. Useful when modeling temporal data
           where the model should not violate the temporal order.
           See [WaveNet: A Generative Model for Raw Audio, section
             2.1](https://arxiv.org/abs/1609.03499).
@@ -395,9 +395,9 @@ class Conv2D(_Conv):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -621,7 +621,7 @@ class Conv2DTranspose(Conv2D):
 
   Arguments:
       filters: Integer, the dimensionality of the output space
-          (i.e. the number output of filters in the convolution).
+          (i.e. the number of output filters in the convolution).
       kernel_size: An integer or tuple/list of 2 integers, specifying the
           width and height of the 2D convolution window.
           Can be a single integer to specify the same value for
@@ -637,9 +637,9 @@ class Conv2DTranspose(Conv2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -688,7 +688,7 @@ class Conv2DTranspose(Conv2D):
                kernel_size,
                strides=(1, 1),
                padding='valid',
-               data_format='channels_last',
+               data_format=None,
                activation=None,
                use_bias=True,
                kernel_initializer='glorot_uniform',
@@ -845,9 +845,9 @@ class SeparableConv2D(Conv2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -1079,9 +1079,9 @@ class UpSampling2D(Layer):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -1257,7 +1257,7 @@ class ZeroPadding2D(Layer):
           - If tuple of 2 ints:
               interpreted as two different
               symmetric padding values for height and width:
-              `(symmetric_height_pad, symmetrc_width_pad)`.
+              `(symmetric_height_pad, symmetric_width_pad)`.
           - If tuple of 2 tuples of 2 ints:
               interpreted as
               `((top_pad, bottom_pad), (left_pad, right_pad))`
@@ -1265,9 +1265,9 @@ class ZeroPadding2D(Layer):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -1498,7 +1498,7 @@ class Cropping2D(Layer):
           - If tuple of 2 ints:
               interpreted as two different
               symmetric cropping values for height and width:
-              `(symmetric_height_crop, symmetrc_width_crop)`.
+              `(symmetric_height_crop, symmetric_width_crop)`.
           - If tuple of 2 tuples of 2 ints:
               interpreted as
               `((top_crop, bottom_crop), (left_crop, right_crop))`
@@ -1506,9 +1506,9 @@ class Cropping2D(Layer):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
diff --git a/tensorflow/contrib/keras/python/keras/layers/convolutional_recurrent.py b/tensorflow/contrib/keras/python/keras/layers/convolutional_recurrent.py
index 4ed5046dc31..4d8ef44da7b 100644
--- a/tensorflow/contrib/keras/python/keras/layers/convolutional_recurrent.py
+++ b/tensorflow/contrib/keras/python/keras/layers/convolutional_recurrent.py
@@ -357,7 +357,7 @@ class ConvLSTM2D(ConvRecurrent2D):
       self.states = [None, None]
 
     if self.data_format == 'channels_first':
-      channel_axis = 1
+      channel_axis = 2
     else:
       channel_axis = -1
     if input_shape[channel_axis] is None:
diff --git a/tensorflow/contrib/keras/python/keras/layers/core.py b/tensorflow/contrib/keras/python/keras/layers/core.py
index 1207cc119f2..8dd55aaa2e6 100644
--- a/tensorflow/contrib/keras/python/keras/layers/core.py
+++ b/tensorflow/contrib/keras/python/keras/layers/core.py
@@ -88,7 +88,7 @@ class Dropout(Layer):
   """Applies Dropout to the input.
 
   Dropout consists in randomly setting
-  a fraction `p` of input units to 0 at each update during training time,
+  a fraction `rate` of input units to 0 at each update during training time,
   which helps prevent overfitting.
 
   Arguments:
@@ -140,7 +140,7 @@ class SpatialDropout1D(Dropout):
   between feature maps and should be used instead.
 
   Arguments:
-      p: float between 0 and 1. Fraction of the input units to drop.
+      rate: float between 0 and 1. Fraction of the input units to drop.
 
   Input shape:
       3D tensor with shape:
@@ -775,7 +775,7 @@ class Dense(Layer):
         'kernel_initializer':
             initializers.serialize(self.kernel_initializer),
         'bias_initializer':
-            initializers.serialize(self.kernel_initializer),
+            initializers.serialize(self.bias_initializer),
         'kernel_regularizer':
             regularizers.serialize(self.kernel_regularizer),
         'bias_regularizer':
diff --git a/tensorflow/contrib/keras/python/keras/layers/local.py b/tensorflow/contrib/keras/python/keras/layers/local.py
index 3bf5ee4f0fc..895d6e3727c 100644
--- a/tensorflow/contrib/keras/python/keras/layers/local.py
+++ b/tensorflow/contrib/keras/python/keras/layers/local.py
@@ -59,7 +59,8 @@ class LocallyConnected1D(Layer):
           specifying the stride length of the convolution.
           Specifying any stride value != 1 is incompatible with specifying
           any `dilation_rate` value != 1.
-      padding: One of `"valid"` or `"same"` (case-insensitive).
+      padding: Currently only supports `"valid"` (case-insensitive).
+          `"same"` may be supported in the future.
       activation: Activation function to use.
           If you don't specify anything, no activation is applied
           (ie. "linear" activation: `a(x) = x`).
@@ -188,7 +189,7 @@ class LocallyConnected1D(Layer):
         'kernel_initializer':
             initializers.serialize(self.kernel_initializer),
         'bias_initializer':
-            initializers.serialize(self.kernel_initializer),
+            initializers.serialize(self.bias_initializer),
         'kernel_regularizer':
             regularizers.serialize(self.kernel_regularizer),
         'bias_regularizer':
@@ -239,16 +240,15 @@ class LocallyConnected2D(Layer):
           specifying the strides of the convolution along the width and height.
           Can be a single integer to specify the same value for
           all spatial dimensions.
-          Specifying any stride value != 1 is incompatible with specifying
-          any `dilation_rate` value != 1.
-      padding: one of `"valid"` or `"same"` (case-insensitive).
+      padding: Currently only support `"valid"` (case-insensitive).
+          `"same"` will be supported in future.
       data_format: A string,
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -460,7 +460,7 @@ class LocallyConnected2D(Layer):
         'kernel_initializer':
             initializers.serialize(self.kernel_initializer),
         'bias_initializer':
-            initializers.serialize(self.kernel_initializer),
+            initializers.serialize(self.bias_initializer),
         'kernel_regularizer':
             regularizers.serialize(self.kernel_regularizer),
         'bias_regularizer':
diff --git a/tensorflow/contrib/keras/python/keras/layers/merge.py b/tensorflow/contrib/keras/python/keras/layers/merge.py
index eea4313d31c..d52bd2bbb3d 100644
--- a/tensorflow/contrib/keras/python/keras/layers/merge.py
+++ b/tensorflow/contrib/keras/python/keras/layers/merge.py
@@ -41,6 +41,44 @@ class _Merge(Layer):
   def _merge_function(self, inputs):
     raise NotImplementedError
 
+  def _compute_elemwise_op_output_shape(self, shape1, shape2):
+    """Computes the shape of the resultant of an elementwise operation.
+
+    Arguments:
+        shape1: tuple or None. Shape of the first tensor
+        shape2: tuple or None. Shape of the second tensor
+
+    Returns:
+        expected output shape when an element-wise operation is
+        carried out on 2 tensors with shapes shape1 and shape2.
+        tuple or None.
+
+    Raises:
+        ValueError: if shape1 and shape2 are not compatible for
+            element-wise operations.
+    """
+    if None in [shape1, shape2]:
+      return None
+    elif len(shape1) < len(shape2):
+      return self._compute_elemwise_op_output_shape(shape2, shape1)
+    elif not shape2:
+      return shape1
+    output_shape = list(shape1[:-len(shape2)])
+    for i, j in zip(shape1[-len(shape2):], shape2):
+      if i is None or j is None:
+        output_shape.append(None)
+      elif i == 1:
+        output_shape.append(j)
+      elif j == 1:
+        output_shape.append(i)
+      else:
+        if i != j:
+          raise ValueError('Operands could not be broadcast '
+                           'together with shapes ' + str(shape1) + ' ' + str(
+                               shape2))
+        output_shape.append(i)
+    return tuple(output_shape)
+
   def build(self, input_shape):
     # Used purely for shape validation.
     if not isinstance(input_shape, list):
@@ -49,23 +87,107 @@ class _Merge(Layer):
       raise ValueError('A merge layer should be called '
                        'on a list of at least 2 inputs. '
                        'Got ' + str(len(input_shape)) + ' inputs.')
-    if all([shape is None for shape in input_shape]):
-      return
-    input_shapes = [
-        tuple(tensor_shape.TensorShape(shape).as_list())
-        for shape in input_shape
-    ]
-    # TODO(fchollet): handle shapes with None entries.
-    input_shapes_set = set(input_shapes)
-    if None in input_shapes_set:
-      input_shapes_set.remove(None)
-    if len(input_shapes_set) > 1:
-      raise ValueError('Only tensors of same shape can '
-                       'be merged by layer' + self.name +
-                       ' Got input shapes: %s' % input_shapes)
+    batch_sizes = [s[0] for s in input_shape if s is not None]
+    batch_sizes = set(batch_sizes)
+    batch_sizes -= set([None])
+    if len(batch_sizes) > 1:
+      raise ValueError('Can not merge tensors with different '
+                       'batch sizes. Got tensors with shapes : ' + str(
+                           input_shape))
+    if input_shape[0] is None:
+      output_shape = None
+    else:
+      output_shape = input_shape[0][1:]
+    for i in range(1, len(input_shape)):
+      if input_shape[i] is None:
+        shape = None
+      else:
+        shape = input_shape[i][1:]
+      output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
+    # If the inputs have different ranks, we have to reshape them
+    # to make them broadcastable.
+    if None not in input_shape and len(set(map(len, input_shape))) == 1:
+      self._reshape_required = False
+    else:
+      self._reshape_required = True
 
   def call(self, inputs):
-    return self._merge_function(inputs)
+    if self._reshape_required:
+      reshaped_inputs = []
+      input_ndims = list(map(K.ndim, inputs))
+      if None not in input_ndims:
+        # If ranks of all inputs are available,
+        # we simply expand each of them at axis=1
+        # until all of them have the same rank.
+        max_ndim = max(input_ndims)
+        for x in inputs:
+          x_ndim = K.ndim(x)
+          for _ in range(max_ndim - x_ndim):
+            x = K.expand_dims(x, 1)
+          reshaped_inputs.append(x)
+        return self._merge_function(reshaped_inputs)
+      else:
+        # Transpose all inputs so that batch size is the last dimension.
+        # (batch_size, dim1, dim2, ... ) -> (dim1, dim2, ... , batch_size)
+        transposed = False
+        for x in inputs:
+          x_ndim = K.ndim(x)
+          if x_ndim is None:
+            x_shape = K.shape(x)
+            batch_size = x_shape[0]
+            new_shape = K.concatenate([x_shape[1:], K.expand_dims(batch_size)])
+            x_transposed = K.reshape(x,
+                                     K.stack([batch_size, K.prod(x_shape[1:])]))
+            x_transposed = K.permute_dimensions(x_transposed, (1, 0))
+            x_transposed = K.reshape(x_transposed, new_shape)
+            reshaped_inputs.append(x_transposed)
+            transposed = True
+          elif x_ndim > 1:
+            dims = list(range(1, x_ndim)) + [0]
+            reshaped_inputs.append(K.permute_dimensions(x, dims))
+            transposed = True
+          else:
+            # We don't transpose inputs if they are 1D vectors or scalars.
+            reshaped_inputs.append(x)
+        y = self._merge_function(reshaped_inputs)
+        y_ndim = K.ndim(y)
+        if transposed:
+          # If inputs have been transposed, we have to transpose the output too.
+          if y_ndim is None:
+            y_shape = K.shape(y)
+            y_ndim = K.shape(y_shape)[0]
+            batch_size = y_shape[y_ndim - 1]
+            new_shape = K.concatenate(
+                [K.expand_dims(batch_size), y_shape[:y_ndim - 1]])
+            y = K.reshape(y, (-1, batch_size))
+            y = K.permute_dimensions(y, (1, 0))
+            y = K.reshape(y, new_shape)
+          elif y_ndim > 1:
+            dims = [y_ndim - 1] + list(range(y_ndim - 1))
+            y = K.permute_dimensions(y, dims)
+        return y
+    else:
+      return self._merge_function(inputs)
+
+  def compute_output_shape(self, input_shape):
+    if input_shape[0] is None:
+      output_shape = None
+    else:
+      output_shape = input_shape[0][1:]
+    for i in range(1, len(input_shape)):
+      if input_shape[i] is None:
+        shape = None
+      else:
+        shape = input_shape[i][1:]
+      output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
+    batch_sizes = [s[0] for s in input_shape if s is not None]
+    batch_sizes = set(batch_sizes)
+    batch_sizes -= set([None])
+    if len(batch_sizes) == 1:
+      output_shape = (list(batch_sizes)[0],) + output_shape
+    else:
+      output_shape = (None,) + output_shape
+    return output_shape
 
   def compute_mask(self, inputs, mask=None):
     if mask is None:
@@ -219,8 +341,8 @@ class Concatenate(_Merge):
     for input_i, mask_i in zip(inputs, mask):
       if mask_i is None:
         # Input is unmasked. Append all 1s to masks,
-        # but cast it to uint8 first
-        masks.append(K.cast(K.ones_like(input_i), 'uint8'))
+        # but cast it to bool first
+        masks.append(K.cast(K.ones_like(input_i), 'bool'))
       elif K.ndim(mask_i) < K.ndim(input_i):
         # Mask is smaller than the input, expand it
         masks.append(K.expand_dims(mask_i))
diff --git a/tensorflow/contrib/keras/python/keras/layers/normalization.py b/tensorflow/contrib/keras/python/keras/layers/normalization.py
index 41c618cc79d..d429cd6d9ba 100644
--- a/tensorflow/contrib/keras/python/keras/layers/normalization.py
+++ b/tensorflow/contrib/keras/python/keras/layers/normalization.py
@@ -154,7 +154,7 @@ class BatchNormalization(Layer):
     broadcast_shape[self.axis] = input_shape[self.axis]
 
     # Determines whether broadcasting is needed.
-    needs_broadcasting = (sorted(reduction_axes) != range(ndim)[:-1])
+    needs_broadcasting = (sorted(reduction_axes) != list(range(ndim))[:-1])
 
     normed, mean, variance = K.normalize_batch_in_training(
         inputs, self.gamma, self.beta, reduction_axes, epsilon=self.epsilon)
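The `list(range(ndim))` fix above matters on Python 3, where `range` returns a lazy sequence object rather than a list, so the old comparison against a sorted list was always unequal and `needs_broadcasting` was always `True`. A quick illustration:

```python
# Python 3 behaviour the fix accounts for:
ndim = 4
reduction_axes = [0, 1, 2]

print(sorted(reduction_axes) != range(ndim)[:-1])        # True  -- list vs. range object
print(sorted(reduction_axes) != list(range(ndim))[:-1])  # False -- no broadcasting needed
```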
diff --git a/tensorflow/contrib/keras/python/keras/layers/pooling.py b/tensorflow/contrib/keras/python/keras/layers/pooling.py
index e31caed3ecc..47c88bf4d0b 100644
--- a/tensorflow/contrib/keras/python/keras/layers/pooling.py
+++ b/tensorflow/contrib/keras/python/keras/layers/pooling.py
@@ -199,9 +199,9 @@ class MaxPooling2D(_Pooling2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -255,9 +255,9 @@ class AveragePooling2D(_Pooling2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -542,9 +542,9 @@ class GlobalAveragePooling2D(_GlobalPooling2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
@@ -577,9 +577,9 @@ class GlobalMaxPooling2D(_GlobalPooling2D):
           one of `channels_last` (default) or `channels_first`.
           The ordering of the dimensions in the inputs.
           `channels_last` corresponds to inputs with shape
-          `(batch, width, height, channels)` while `channels_first`
+          `(batch, height, width, channels)` while `channels_first`
           corresponds to inputs with shape
-          `(batch, channels, width, height)`.
+          `(batch, channels, height, width)`.
           It defaults to the `image_data_format` value found in your
           Keras config file at `~/.keras/keras.json`.
           If you never set it, then it will be "channels_last".
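The docstring corrections above only swap `width` and `height` in the documented shapes; the two layouts themselves are unchanged. A purely illustrative NumPy sketch of what the corrected orderings mean:

```python
import numpy as np

batch, height, width, channels = 8, 32, 24, 3

x_channels_last = np.zeros((batch, height, width, channels))  # (batch, height, width, channels)
x_channels_first = np.moveaxis(x_channels_last, -1, 1)        # (batch, channels, height, width)
assert x_channels_first.shape == (batch, channels, height, width)
```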
diff --git a/tensorflow/contrib/keras/python/keras/layers/recurrent.py b/tensorflow/contrib/keras/python/keras/layers/recurrent.py
index 06986d3eaad..6301132f4d2 100644
--- a/tensorflow/contrib/keras/python/keras/layers/recurrent.py
+++ b/tensorflow/contrib/keras/python/keras/layers/recurrent.py
@@ -105,8 +105,16 @@ class Recurrent(Layer):
       # now model.output_shape == (None, 32)
       # note: `None` is the batch dimension.
 
-      # for subsequent layers, not need to specify the input size:
+      # for subsequent layers, no need to specify the input size:
       model.add(LSTM(16))
+
+      # to stack recurrent layers, you must use return_sequences=True
+      # on any recurrent layer that feeds into another recurrent layer.
+      # note that you only need to specify the input size on the first layer.
+      model = Sequential()
+      model.add(LSTM(64, input_dim=64, input_length=10, return_sequences=True))
+      model.add(LSTM(32, return_sequences=True))
+      model.add(LSTM(10))
   ```
 
   Arguments:
@@ -116,7 +124,8 @@ class Recurrent(Layer):
       return_sequences: Boolean. Whether to return the last output
           in the output sequence, or the full sequence.
       go_backwards: Boolean (default False).
-          If True, process the input sequence backwards.
+          If True, process the input sequence backwards and return the
+          reversed sequence.
       stateful: Boolean (default False). If True, the last state
           for each sample at index i in a batch will be used as initial
           state for the sample of index i in the following batch.
@@ -398,6 +407,6 @@ class SimpleRNN(Recurrent):
       units: Positive integer, dimensionality of the output space.
       activation: Activation function to use.
-          If you don't specify anything, no activation is applied
+          If you pass None, no activation is applied
           (ie. "linear" activation: `a(x) = x`).
       use_bias: Boolean, whether the layer uses a bias vector.
       kernel_initializer: Initializer for the `kernel` weights matrix,
@@ -547,7 +557,7 @@ class SimpleRNN(Recurrent):
 
   def get_constants(self, inputs, training=None):
     constants = []
-    if self.implementation == 0 and 0 < self.dropout < 1:
+    if self.implementation != 0 and 0 < self.dropout < 1:
       input_shape = K.int_shape(inputs)
       input_dim = input_shape[-1]
       ones = K.ones_like(K.reshape(inputs[:, 0, 0], (-1, 1)))
@@ -619,7 +629,7 @@ class GRU(Recurrent):
   Arguments:
       units: Positive integer, dimensionality of the output space.
       activation: Activation function to use.
-          If you don't specify anything, no activation is applied
+          If you pass None, no activation is applied
           (ie. "linear" activation: `a(x) = x`).
       recurrent_activation: Activation function to use
           for the recurrent step.
@@ -792,7 +802,7 @@ class GRU(Recurrent):
 
   def get_constants(self, inputs, training=None):
     constants = []
-    if self.implementation == 0 and 0 < self.dropout < 1:
+    if self.implementation != 0 and 0 < self.dropout < 1:
       input_shape = K.int_shape(inputs)
       input_dim = input_shape[-1]
       ones = K.ones_like(K.reshape(inputs[:, 0, 0], (-1, 1)))
@@ -861,7 +871,7 @@ class GRU(Recurrent):
         if self.use_bias:
           x_z = K.bias_add(x_z, self.bias_z)
           x_r = K.bias_add(x_r, self.bias_r)
-          x_h = K.bias_add(x_r, self.bias_h)
+          x_h = K.bias_add(x_h, self.bias_h)
       else:
         raise ValueError('Unknown `implementation` mode.')
       z = self.recurrent_activation(x_z + K.dot(h_tm1 * rec_dp_mask[0],
@@ -924,7 +934,7 @@ class LSTM(Recurrent):
   Arguments:
       units: Positive integer, dimensionality of the output space.
       activation: Activation function to use.
-          If you don't specify anything, no activation is applied
+          If you pass None, no activation is applied
           (ie. "linear" activation: `a(x) = x`).
       recurrent_activation: Activation function to use
           for the recurrent step.
@@ -1127,7 +1137,7 @@ class LSTM(Recurrent):
 
   def get_constants(self, inputs, training=None):
     constants = []
-    if self.implementation == 0 and 0 < self.dropout < 1:
+    if self.implementation != 0 and 0 < self.dropout < 1:
       input_shape = K.int_shape(inputs)
       input_dim = input_shape[-1]
       ones = K.ones_like(K.reshape(inputs[:, 0, 0], (-1, 1)))
diff --git a/tensorflow/contrib/keras/python/keras/layers/wrappers.py b/tensorflow/contrib/keras/python/keras/layers/wrappers.py
index 75b4810e40b..eeb67493ee3 100644
--- a/tensorflow/contrib/keras/python/keras/layers/wrappers.py
+++ b/tensorflow/contrib/keras/python/keras/layers/wrappers.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ==============================================================================
+# pylint: disable=protected-access
 """Wrapper layers: layers that augment the functionality of another layer.
 """
 from __future__ import absolute_import
@@ -19,6 +20,7 @@ from __future__ import division
 from __future__ import print_function
 
 import copy
+import inspect
 
 from tensorflow.contrib.keras.python.keras import backend as K
 from tensorflow.contrib.keras.python.keras.engine import InputSpec
@@ -70,9 +72,10 @@ class Wrapper(Layer):
     return dict(list(base_config.items()) + list(config.items()))
 
   @classmethod
-  def from_config(cls, config):
+  def from_config(cls, config, custom_objects=None):
     from tensorflow.contrib.keras.python.keras.layers import deserialize as deserialize_layer  # pylint: disable=g-import-not-at-top
-    layer = deserialize_layer(config.pop('layer'))
+    layer = deserialize_layer(
+        config.pop('layer'), custom_objects=custom_objects)
     return cls(layer, **config)
 
 
@@ -188,12 +191,15 @@ class Bidirectional(Wrapper):
           If None, the outputs will not be combined,
           they will be returned as a list.
 
+  Raises:
+      ValueError: In case of invalid `merge_mode` argument.
+
   Examples:
 
   ```python
       model = Sequential()
       model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(5,
-        10)))
+      10)))
       model.add(Bidirectional(LSTM(10)))
       model.add(Dense(5))
       model.add(Activation('softmax'))
@@ -242,29 +248,47 @@ class Bidirectional(Wrapper):
       shape = self.forward_layer._compute_output_shape(input_shape)  # pylint: disable=protected-access
       return [shape, copy.copy(shape)]
 
-  def call(self, inputs, mask=None):
-    y = self.forward_layer.call(inputs, mask)
-    y_rev = self.backward_layer.call(inputs, mask)
+  def call(self, inputs, training=None, mask=None):
+    kwargs = {}
+    func_args = inspect.getargspec(self.layer.call).args
+    if 'training' in func_args:
+      kwargs['training'] = training
+    if 'mask' in func_args:
+      kwargs['mask'] = mask
+
+    y = self.forward_layer.call(inputs, **kwargs)
+    y_rev = self.backward_layer.call(inputs, **kwargs)
     if self.return_sequences:
       y_rev = K.reverse(y_rev, 1)
     if self.merge_mode == 'concat':
-      return K.concatenate([y, y_rev])
+      output = K.concatenate([y, y_rev])
     elif self.merge_mode == 'sum':
-      return y + y_rev
+      output = y + y_rev
     elif self.merge_mode == 'ave':
-      return (y + y_rev) / 2
+      output = (y + y_rev) / 2
     elif self.merge_mode == 'mul':
-      return y * y_rev
+      output = y * y_rev
     elif self.merge_mode is None:
-      return [y, y_rev]
+      output = [y, y_rev]
+
+    # Properly set learning phase
+    if 0 < self.layer.dropout + self.layer.recurrent_dropout:
+      if self.merge_mode is None:
+        for out in output:
+          out._uses_learning_phase = True
+      else:
+        output._uses_learning_phase = True
+    return output
 
   def reset_states(self):
     self.forward_layer.reset_states()
     self.backward_layer.reset_states()
 
   def build(self, input_shape):
-    self.forward_layer.build(input_shape)
-    self.backward_layer.build(input_shape)
+    with K.name_scope(self.forward_layer.name):
+      self.forward_layer.build(input_shape)
+    with K.name_scope(self.backward_layer.name):
+      self.backward_layer.build(input_shape)
     self.built = True
 
   def compute_mask(self, inputs, mask):
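The reworked `Bidirectional.call` above forwards `training` and `mask` only when the wrapped layer's `call` actually declares them. A standalone sketch of that argument filtering; `inner_call` is a hypothetical stand-in for `self.layer.call`, and `getfullargspec` is used here instead of the deprecated `getargspec` the patch itself relies on:

```python
import inspect

def inner_call(inputs, training=None):  # hypothetical wrapped layer's call signature
  return inputs

def filter_call_kwargs(fn, training=None, mask=None):
  """Keep only the keyword arguments that `fn` declares."""
  accepted = inspect.getfullargspec(fn).args
  kwargs = {}
  if 'training' in accepted:
    kwargs['training'] = training
  if 'mask' in accepted:
    kwargs['mask'] = mask
  return kwargs

print(filter_call_kwargs(inner_call, training=True, mask=None))  # {'training': True}
```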
diff --git a/tensorflow/contrib/keras/python/keras/metrics.py b/tensorflow/contrib/keras/python/keras/metrics.py
index d7266c94cf7..59d380f73bd 100644
--- a/tensorflow/contrib/keras/python/keras/metrics.py
+++ b/tensorflow/contrib/keras/python/keras/metrics.py
@@ -43,12 +43,15 @@ def binary_accuracy(y_true, y_pred):
 
 
 def categorical_accuracy(y_true, y_pred):
-  return K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1))
+  return K.cast(
+      K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1)), K.floatx())
 
 
 def sparse_categorical_accuracy(y_true, y_pred):
-  return K.equal(
-      K.max(y_true, axis=-1), K.cast(K.argmax(y_pred, axis=-1), K.floatx()))
+  return K.cast(
+      K.equal(
+          K.max(y_true, axis=-1), K.cast(K.argmax(y_pred, axis=-1),
+                                         K.floatx())), K.floatx())
 
 
 def top_k_categorical_accuracy(y_true, y_pred, k=5):
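The added `K.cast(..., K.floatx())` is needed because `K.equal` yields a boolean tensor, while averaging the metric downstream expects floats. The same idea in plain NumPy:

```python
import numpy as np

y_true = np.array([[0, 0, 1], [0, 1, 0]])
y_pred = np.array([[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])

matches = np.equal(np.argmax(y_true, axis=-1), np.argmax(y_pred, axis=-1))  # bool array
accuracy = matches.astype(np.float32).mean()  # cast before averaging
print(accuracy)  # 0.5
```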
diff --git a/tensorflow/contrib/keras/python/keras/models.py b/tensorflow/contrib/keras/python/keras/models.py
index 2be4431d03d..5289bb732b1 100644
--- a/tensorflow/contrib/keras/python/keras/models.py
+++ b/tensorflow/contrib/keras/python/keras/models.py
@@ -207,7 +207,7 @@ def load_model(filepath, custom_objects=None):
       ValueError: In case of an invalid savefile.
   """
   if h5py is None:
-    raise ImportError('`save_model` requires h5py.')
+    raise ImportError('`load_model` requires h5py.')
 
   if not custom_objects:
     custom_objects = {}
@@ -1006,7 +1006,7 @@ class Sequential(Model):
         steps_per_epoch: Total number of steps (batches of samples)
             to yield from `generator` before declaring one epoch
             finished and starting the next epoch. It should typically
-            be equal to the number of unique samples if your dataset
+            be equal to the number of unique samples of your dataset
             divided by the batch size.
         epochs: Integer, total number of iterations on the data.
         verbose: Verbosity mode, 0, 1, or 2.
@@ -1017,8 +1017,10 @@ class Sequential(Model):
             - A tuple (inputs, targets, sample_weights).
         validation_steps: Only relevant if `validation_data`
             is a generator.
-            Number of samples to use from validation generator
-            at the end of every epoch.
+            Number of steps to yield from validation generator
+            at the end of every epoch. It should typically
+            be equal to the number of unique samples of your
+            validation dataset divided by the batch size.
         class_weight: Dictionary mapping class indices to a weight
             for the class.
         max_q_size: Maximum size for the generator queue
@@ -1050,7 +1052,7 @@ class Sequential(Model):
                     # and labels, from each line in the file
                     x, y = process_line(line)
                     yield (x, y)
                 f.close()
 
         model.fit_generator(generate_arrays_from_file('/my_file.txt'),
                             samples_per_epoch=10000, epochs=10)
@@ -1119,7 +1121,8 @@ class Sequential(Model):
                         steps,
                         max_q_size=10,
                         workers=1,
-                        pickle_safe=False):
+                        pickle_safe=False,
+                        verbose=0):
     """Generates predictions for the input samples from a data generator.
 
     The generator should return the same kind of data as accepted by
@@ -1136,6 +1139,7 @@ class Sequential(Model):
             relies on multiprocessing, you should not pass
             non picklable arguments to the generator
             as they can't be passed easily to children processes.
+        verbose: Verbosity mode, 0 or 1.
 
     Returns:
         A Numpy array of predictions.
@@ -1147,7 +1151,8 @@ class Sequential(Model):
         steps,
         max_q_size=max_q_size,
         workers=workers,
-        pickle_safe=pickle_safe)
+        pickle_safe=pickle_safe,
+        verbose=verbose)
 
   def get_config(self):
     config = []
@@ -1159,9 +1164,9 @@ class Sequential(Model):
     return copy.deepcopy(config)
 
   @classmethod
-  def from_config(cls, config):
+  def from_config(cls, config, custom_objects=None):
     model = cls()
     for conf in config:
-      layer = layer_module.deserialize(conf)
+      layer = layer_module.deserialize(conf, custom_objects=custom_objects)
       model.add(layer)
     return model
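The clarified `steps_per_epoch`/`validation_steps` wording boils down to simple arithmetic; a sketch with made-up dataset sizes (the variable names below are illustrative, not part of the API):

```python
num_train_samples = 50000
num_val_samples = 10000
batch_size = 32

steps_per_epoch = num_train_samples // batch_size  # batches drawn from the training generator per epoch
validation_steps = num_val_samples // batch_size   # batches drawn from the validation generator per epoch
print(steps_per_epoch, validation_steps)           # 1562 312
```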
diff --git a/tensorflow/contrib/keras/python/keras/preprocessing/image.py b/tensorflow/contrib/keras/python/keras/preprocessing/image.py
index 86c7650a073..de0749ae020 100644
--- a/tensorflow/contrib/keras/python/keras/preprocessing/image.py
+++ b/tensorflow/contrib/keras/python/keras/preprocessing/image.py
@@ -785,7 +785,7 @@ class Iterator(object):
           index_array = np.random.permutation(n)
 
       current_index = (self.batch_index * batch_size) % n
-      if n >= current_index + batch_size:
+      if n > current_index + batch_size:
         current_batch_size = batch_size
         self.batch_index += 1
       else:
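The strict `>` above changes behaviour only when the remaining samples exactly fill a batch: that batch now falls through to the `else` branch (truncated in this hunk), which in the surrounding code resets `batch_index` so the next epoch can be reshuffled. A tiny sketch of the comparison at the boundary:

```python
n, batch_size = 10, 5

for batch_index in (0, 1):
  current_index = (batch_index * batch_size) % n
  full_batch_remains_after = n > current_index + batch_size
  print(current_index, full_batch_remains_after)
# 0 True   -> take a full batch and keep incrementing batch_index
# 5 False  -> last batch of the epoch; with the old `>=` this case was missed
```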
diff --git a/tensorflow/contrib/keras/python/keras/utils/generic_utils.py b/tensorflow/contrib/keras/python/keras/utils/generic_utils.py
index c1e02968353..6e83fde2c90 100644
--- a/tensorflow/contrib/keras/python/keras/utils/generic_utils.py
+++ b/tensorflow/contrib/keras/python/keras/utils/generic_utils.py
@@ -172,7 +172,8 @@ def deserialize_keras_object(identifier,
     else:
       fn = module_objects.get(function_name)
       if fn is None:
-        raise ValueError('Unknown ' + printable_module_name, ':' + class_name)
+        raise ValueError('Unknown ' + printable_module_name,
+                         ':' + function_name)
     return fn
   else:
     raise ValueError('Could not interpret serialized ' + printable_module_name +
@@ -215,6 +216,8 @@ def func_load(code, defaults=None, closure=None, globs=None):
   """
   if isinstance(code, (tuple, list)):  # unpack previous dump
     code, defaults, closure = code
+    if isinstance(defaults, list):
+      defaults = tuple(defaults)
   code = marshal.loads(code.encode('raw_unicode_escape'))
   if globs is None:
     globs = globals()
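The `tuple(defaults)` coercion above is needed because `types.FunctionType` rejects a list of default values, which is what comes back after, e.g., a JSON round trip of a serialized function. A minimal sketch of the requirement (the function `f` is hypothetical):

```python
import marshal
import types

def f(a, b=2, c=3):
  return a + b + c

code = marshal.dumps(f.__code__)
defaults = list(f.__defaults__)  # e.g. what a JSON round trip would hand back

restored = types.FunctionType(
    marshal.loads(code), globals(), 'f_restored',
    tuple(defaults))             # must be a tuple, not a list
print(restored(1))               # 6
```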
diff --git a/tensorflow/contrib/keras/python/keras/utils/layer_utils.py b/tensorflow/contrib/keras/python/keras/utils/layer_utils.py
index 32e0de7d3dc..26878fdd57f 100644
--- a/tensorflow/contrib/keras/python/keras/utils/layer_utils.py
+++ b/tensorflow/contrib/keras/python/keras/utils/layer_utils.py
@@ -171,7 +171,7 @@ def count_total_params(layers, layer_set=None):
           [K.count_params(p) for p in layer.trainable_weights])
       non_trainable_count += np.sum(
           [K.count_params(p) for p in layer.non_trainable_weights])
-  return trainable_count, non_trainable_count
+  return int(trainable_count), int(non_trainable_count)
 
 
 def convert_all_kernels_in_model(model):
diff --git a/tensorflow/contrib/keras/python/keras/wrappers/scikit_learn.py b/tensorflow/contrib/keras/python/keras/wrappers/scikit_learn.py
index ecda890fec9..323c31aee83 100644
--- a/tensorflow/contrib/keras/python/keras/wrappers/scikit_learn.py
+++ b/tensorflow/contrib/keras/python/keras/wrappers/scikit_learn.py
@@ -194,6 +194,36 @@ class KerasClassifier(BaseWrapper):
   """Implementation of the scikit-learn classifier API for Keras.
   """
 
+  def fit(self, x, y, **kwargs):
+    """Constructs a new model with `build_fn` & fit the model to `(x, y)`.
+
+    Arguments:
+        x : array-like, shape `(n_samples, n_features)`
+            Training samples where n_samples is the number of samples
+            and n_features is the number of features.
+        y : array-like, shape `(n_samples,)` or `(n_samples, n_outputs)`
+            True labels for X.
+        **kwargs: dictionary arguments
+            Legal arguments are the arguments of `Sequential.fit`
+
+    Returns:
+        history : object
+            details about the training history at each epoch.
+
+    Raises:
+        ValueError: In case of invalid shape for `y` argument.
+    """
+    y = np.array(y)
+    if len(y.shape) == 2 and y.shape[1] > 1:
+      self.classes_ = np.arange(y.shape[1])
+    elif (len(y.shape) == 2 and y.shape[1] == 1) or len(y.shape) == 1:
+      self.classes_ = np.unique(y)
+      y = np.searchsorted(self.classes_, y)
+    else:
+      raise ValueError('Invalid shape for y: ' + str(y.shape))
+    self.n_classes_ = len(self.classes_)
+    return super(KerasClassifier, self).fit(x, y, **kwargs)
+
   def predict(self, x, **kwargs):
     """Returns the class predictions for the given test data.
 
@@ -210,7 +240,8 @@ class KerasClassifier(BaseWrapper):
             Class predictions.
     """
     kwargs = self.filter_sk_params(Sequential.predict_classes, kwargs)
-    return self.model.predict_classes(x, **kwargs)
+    classes = self.model.predict_classes(x, **kwargs)
+    return self.classes_[classes]
 
   def predict_proba(self, x, **kwargs):
     """Returns class probability estimates for the given test data.
@@ -261,6 +292,7 @@ class KerasClassifier(BaseWrapper):
             compute accuracy. You should pass `metrics=["accuracy"]` to
             the `.compile()` method of the model.
     """
+    y = np.searchsorted(self.classes_, y)
     kwargs = self.filter_sk_params(Sequential.evaluate, kwargs)
 
     loss_name = self.model.loss
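The new `KerasClassifier.fit` records the observed label set so that integer predictions can be mapped back to the caller's original labels, and so `score` can encode `y` the same way. The encode/decode round trip in isolation:

```python
import numpy as np

y = np.array(['cat', 'dog', 'cat', 'bird'])

classes_ = np.unique(y)                   # ['bird' 'cat' 'dog']
y_indices = np.searchsorted(classes_, y)  # [1 2 1 0] -- what the model is trained on

predicted_indices = np.array([2, 0, 1])
print(classes_[predicted_indices])        # ['dog' 'bird' 'cat'] -- back to original labels
```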
diff --git a/tensorflow/contrib/layers/python/layers/embedding_ops.py b/tensorflow/contrib/layers/python/layers/embedding_ops.py
index e42e885364c..b1a7f7ee59a 100644
--- a/tensorflow/contrib/layers/python/layers/embedding_ops.py
+++ b/tensorflow/contrib/layers/python/layers/embedding_ops.py
@@ -22,11 +22,13 @@ from six.moves import xrange  # pylint: disable=redefined-builtin
 from tensorflow.contrib.framework.python.framework import tensor_util as contrib_tensor_util
 from tensorflow.contrib.layers.python.ops import sparse_feature_cross_op
 
+from tensorflow.python.framework import constant_op
 from tensorflow.python.framework import dtypes
 from tensorflow.python.framework import ops
 from tensorflow.python.framework import sparse_tensor
 from tensorflow.python.framework import tensor_shape
 from tensorflow.python.ops import array_ops
+from tensorflow.python.ops import clip_ops
 from tensorflow.python.ops import control_flow_ops
 from tensorflow.python.ops import data_flow_ops
 from tensorflow.python.ops import embedding_ops
@@ -555,8 +557,13 @@ def _sampled_scattered_embedding_lookup_sparse(params,
                                          name=name_scope)
 
 
-def embedding_lookup_sparse_with_distributed_aggregation(params, sp_ids,
-    sp_weights, partition_strategy="mod", name=None, combiner=None,
+def embedding_lookup_sparse_with_distributed_aggregation(
+    params,
+    sp_ids,
+    sp_weights,
+    partition_strategy="mod",
+    name=None,
+    combiner=None,
     max_norm=None):
   """Computes embeddings for the given ids and weights.
 
@@ -638,8 +645,13 @@ def embedding_lookup_sparse_with_distributed_aggregation(params, sp_ids,
 
     weights = None if ignore_weights else sp_weights.values
     embeddings = _embedding_lookup_with_distributed_aggregation(
-        params, ids, partition_strategy=partition_strategy, max_norm=max_norm,
-        weights=weights, idx=idx, segment_ids=segment_ids)
+        params,
+        ids,
+        partition_strategy=partition_strategy,
+        max_norm=max_norm,
+        weights=weights,
+        idx=idx,
+        segment_ids=segment_ids)
     # Set weights to all one if ignore weights.
     if ignore_weights:
       weights = array_ops.fill([array_ops.shape(segment_ids)[0]], 1)
@@ -648,13 +660,13 @@ def embedding_lookup_sparse_with_distributed_aggregation(params, sp_ids,
     # Reshape weights.
     ones = array_ops.fill(
         array_ops.expand_dims(array_ops.rank(embeddings) - 1, 0), 1)
-    bcast_weights_shape = array_ops.concat([array_ops.shape(weights), ones],
-                                           0)
+    bcast_weights_shape = array_ops.concat([array_ops.shape(weights), ones], 0)
     orig_weights_shape = weights.get_shape()
     weights = array_ops.reshape(weights, bcast_weights_shape)
     if embeddings.get_shape().ndims is not None:
-      weights.set_shape(orig_weights_shape.concatenate(
-          [1 for _ in range(embeddings.get_shape().ndims - 1)]))
+      weights.set_shape(
+          orig_weights_shape.concatenate(
+              [1 for _ in range(embeddings.get_shape().ndims - 1)]))
 
     if combiner == "mean":
       weight_sum = math_ops.segment_sum(weights, segment_ids)
@@ -677,16 +689,23 @@ def _do_gather(params, ids, validate_indices=True, name=None):
       params, ids, name=name, validate_indices=validate_indices)
 
 
-def _embedding_lookup_with_distributed_aggregation(params, ids,
-    partition_strategy="mod", name=None, validate_indices=True, max_norm=None,
-    weights=None, idx=None, segment_ids=None):
-  """ Lookup helper for embedding_lookup_sparse_with_distributed_aggregation."""
+def _embedding_lookup_with_distributed_aggregation(params,
+                                                   ids,
+                                                   partition_strategy="mod",
+                                                   name=None,
+                                                   validate_indices=True,
+                                                   max_norm=None,
+                                                   weights=None,
+                                                   idx=None,
+                                                   segment_ids=None):
+  """Lookup helper for embedding_lookup_sparse_with_distributed_aggregation."""
   if params is None or params == []:  # pylint: disable=g-explicit-bool-comparison
     raise ValueError("Need at least one param")
   if isinstance(params, variables.PartitionedVariable):
     params = list(params)  # Iterate to get the underlying Variables.
   if not isinstance(params, list):
     params = [params]
+
   def maybe_normalize(x):
     if max_norm is not None:
       if x.get_shape().ndims is not None:
@@ -695,18 +714,18 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
         ndims = array_ops.size(array_ops.shape(x))
       return clip_ops.clip_by_norm(x, max_norm, axes=list(range(1, ndims)))
     return x
+
   with ops.name_scope(name, "embedding_lookup_with_distributed_aggregation",
-      params + [ids]) as name:
+                      params + [ids]) as name:
     np = len(params)  # Number of partitions
     # Preserve the resource variable status to avoid accidental dense reads.
-    if not any(isinstance(p, resource_variable_ops.ResourceVariable)
-               for p in params):
+    if not any(
+        isinstance(p, resource_variable_ops.ResourceVariable) for p in params):
       params = ops.convert_n_to_tensor_or_indexed_slices(params, name="params")
     if np == 1:
       with ops.colocate_with(params[0]):
         ret = maybe_normalize(
-            _do_gather(
-                params[0], ids, validate_indices=validate_indices))
+            _do_gather(params[0], ids, validate_indices=validate_indices))
         ignore_weights = weights is None
         if not ignore_weights:
           if weights.dtype != ret.dtype:
@@ -720,8 +739,9 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
           weights = array_ops.reshape(weights, bcast_weights_shape)
           # Set weights shape after reshape
           if ret.get_shape().ndims is not None:
-            weights.set_shape(orig_weights_shape.concatenate(
-                [1 for _ in range(ret.get_shape().ndims - 1)]))
+            weights.set_shape(
+                orig_weights_shape.concatenate(
+                    [1 for _ in range(ret.get_shape().ndims - 1)]))
           ret *= weights
           return math_ops.segment_sum(ret, segment_ids, name=name)
         else:
@@ -757,18 +777,16 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
         ids_per_partition = num_total_ids // np
         extras = num_total_ids % np
 
-        p_assignments = math_ops.maximum(
-            flat_ids // (ids_per_partition + 1),
-            (flat_ids - extras) // ids_per_partition)
+        p_assignments = math_ops.maximum(flat_ids // (ids_per_partition + 1), (
+            flat_ids - extras) // ids_per_partition)
 
         # Emulate a conditional using a boolean indicator tensor
-        is_in_first_extras_partitions = math_ops.cast(
-            p_assignments < extras, flat_ids.dtype)
-        new_ids = (
-            is_in_first_extras_partitions * (
-                flat_ids % (ids_per_partition + 1)) +
-            (1 - is_in_first_extras_partitions) * (
-                (flat_ids - extras) % ids_per_partition))
+        is_in_first_extras_partitions = math_ops.cast(p_assignments < extras,
+                                                      flat_ids.dtype)
+        new_ids = (is_in_first_extras_partitions * (flat_ids %
+                                                    (ids_per_partition + 1)) +
+                   (1 - is_in_first_extras_partitions) * (
+                       (flat_ids - extras) % ids_per_partition))
       else:
         raise ValueError("Unrecognized partition strategy: " +
                          partition_strategy)
@@ -786,8 +804,8 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
       for p in xrange(np):
         with ops.colocate_with(params[p]):
           partitioned_result.append(
-              _do_gather(params[p], gather_ids[p],
-                         validate_indices=validate_indices))
+              _do_gather(
+                  params[p], gather_ids[p], validate_indices=validate_indices))
 
       ignore_weights = weights is None
       if not ignore_weights:
@@ -802,17 +820,21 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
       if element_shape.is_fully_defined():
         for p in xrange(np):
           with ops.colocate_with(params[p]):
-            partitioned_result[p] = array_ops.reshape(partitioned_result[p],
-                array_ops.concat(
-                    [array_ops.shape(pindices[p]), element_shape], 0))
+            partitioned_result[p] = array_ops.reshape(
+                partitioned_result[p],
+                array_ops.concat([array_ops.shape(pindices[p]), element_shape],
+                                 0))
       else:
         with ops.colocate_with(params[0]):
           params_shape = array_ops.shape(params[0])
         for p in xrange(np):
           with ops.colocate_with(params[p]):
-            partitioned_result[p] = array_ops.reshape(partitioned_result[p],
-                array_ops.concat([array_ops.shape(pindices[p]),
-                    array_ops.slice(params_shape, [1], [-1])], 0))
+            partitioned_result[p] = array_ops.reshape(
+                partitioned_result[p],
+                array_ops.concat([
+                    array_ops.shape(pindices[p]), array_ops.slice(
+                        params_shape, [1], [-1])
+                ], 0))
       # Normalize each partition result.
       for p in xrange(np):
         with ops.colocate_with(params[p]):
@@ -823,7 +845,7 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
           with ops.colocate_with(params[p]):
             if partitioned_weight[p].dtype != partitioned_result[p].dtype:
               partitioned_weight[p] = math_ops.cast(partitioned_weight[p],
-                  partitioned_result[p].dtype)
+                                                    partitioned_result[p].dtype)
             # Reshape partition weights.
             ones = array_ops.fill(
                 array_ops.expand_dims(
@@ -834,9 +856,12 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
             partitioned_weight[p] = array_ops.reshape(partitioned_weight[p],
                                                       bcast_weights_shape)
             if partitioned_result[p].get_shape().ndims is not None:
-              partitioned_weight[p].set_shape(orig_weights_shape.concatenate(
-                  [1 for _ in range(
-                      partitioned_result[p].get_shape().ndims - 1)]))
+              partitioned_weight[p].set_shape(
+                  orig_weights_shape.concatenate([
+                      1
+                      for _ in range(partitioned_result[p].get_shape().ndims -
+                                     1)
+                  ]))
             partitioned_result[p] *= partitioned_weight[p]
       partitioned_segment_ids = []
       for p in xrange(np):
@@ -874,5 +899,7 @@ def _embedding_lookup_with_distributed_aggregation(params, ids,
       concat_segment_ids = array_ops.concat(partitioned_segment_ids, 0)
       concat_partitioned_result = array_ops.concat(partitioned_result, 0)
       return math_ops.unsorted_segment_sum(
-          concat_partitioned_result, concat_segment_ids,
-          math_ops.reduce_max(concat_segment_ids) + 1, name=name)
+          concat_partitioned_result,
+          concat_segment_ids,
+          math_ops.reduce_max(concat_segment_ids) + 1,
+          name=name)
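Most of the embedding_ops hunk above is mechanical reformatting, but the `'div'` partition arithmetic it touches is worth spelling out: with `np` partitions (renamed `num_partitions` below to avoid clashing with NumPy), the first `extras` partitions each hold one extra id. A NumPy sketch of the same expressions with toy sizes:

```python
import numpy as np

num_total_ids, num_partitions = 10, 3
flat_ids = np.arange(num_total_ids)

ids_per_partition = num_total_ids // num_partitions   # 3
extras = num_total_ids % num_partitions                # 1: partition 0 holds one extra id

p_assignments = np.maximum(flat_ids // (ids_per_partition + 1),
                           (flat_ids - extras) // ids_per_partition)
is_in_first_extras = (p_assignments < extras).astype(flat_ids.dtype)
new_ids = (is_in_first_extras * (flat_ids % (ids_per_partition + 1)) +
           (1 - is_in_first_extras) * ((flat_ids - extras) % ids_per_partition))

print(p_assignments)  # [0 0 0 0 1 1 1 2 2 2]
print(new_ids)        # [0 1 2 3 0 1 2 0 1 2]
```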
diff --git a/tensorflow/contrib/layers/python/layers/embedding_ops_test.py b/tensorflow/contrib/layers/python/layers/embedding_ops_test.py
index eb38d70c52c..bf251449820 100644
--- a/tensorflow/contrib/layers/python/layers/embedding_ops_test.py
+++ b/tensorflow/contrib/layers/python/layers/embedding_ops_test.py
@@ -31,8 +31,9 @@ from tensorflow.python.framework import dtypes
 from tensorflow.python.framework import errors_impl
 from tensorflow.python.framework import random_seed
 from tensorflow.python.framework import sparse_tensor as sparse_tensor_lib
-from tensorflow.python.ops import init_ops
+from tensorflow.python.ops import array_ops
 from tensorflow.python.ops import gradient_checker
+from tensorflow.python.ops import init_ops
 from tensorflow.python.ops import math_ops
 from tensorflow.python.ops import partitioned_variables
 from tensorflow.python.platform import test
@@ -145,8 +146,8 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
       self.assertAllClose(
           embedding_lookup_result,
           [(embedding_weights[0][0] + embedding_weights[0][1]) / 2.0, [0] * 4,
-           [0] * 4, embedding_weights[0][2],
-           (embedding_weights[0][0] + embedding_weights[0][1]) / 2.0])
+           [0] * 4, embedding_weights[0][2], (
+               embedding_weights[0][0] + embedding_weights[0][1]) / 2.0])
 
   def test_safe_embedding_lookup_sparse_partitioned(self):
     with self.test_session():
@@ -171,8 +172,8 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
       self.assertRaises(ValueError, embedding_ops.safe_embedding_lookup_sparse,
                         embedding_weights, sparse_ids)
       embedding_weights = [
-          constant_op.constant(
-              w, dtype=dtypes.float64) for w in embedding_weights
+          constant_op.constant(w, dtype=dtypes.float64)
+          for w in embedding_weights
       ]
       self.assertRaises(ValueError, embedding_ops.safe_embedding_lookup_sparse,
                         embedding_weights, sparse_ids, sparse_weights)
@@ -185,11 +186,10 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
       embedding_lookup_result = (embedding_ops.safe_embedding_lookup_sparse(
           embedding_weights, sparse_ids, sparse_weights).eval())
 
-      self.assertAllClose(
-          embedding_lookup_result,
-          [[(1.0 * embedding_weights[0][0] + 2.0 * embedding_weights[0][1]) /
-            3.0, [0] * 4, [0] * 4],
-           [embedding_weights[0][2], [0] * 4, [0] * 4]])
+      self.assertAllClose(embedding_lookup_result, [[
+          (1.0 * embedding_weights[0][0] + 2.0 * embedding_weights[0][1]) / 3.0,
+          [0] * 4, [0] * 4
+      ], [embedding_weights[0][2], [0] * 4, [0] * 4]])
 
   def test_safe_embedding_lookup_sparse_3d_return_special_vector(self):
     with self.test_session():
@@ -215,14 +215,13 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
       embedding_lookup_result = (embedding_ops.safe_embedding_lookup_sparse(
           embedding_weights, sparse_ids, None).eval())
 
-      self.assertAllClose(
-          embedding_lookup_result,
-          [[(embedding_weights[0][0] + embedding_weights[0][1]) / 2.0, [0] * 4,
-            [0] * 4], [
-                embedding_weights[0][2],
-                (embedding_weights[0][0] + embedding_weights[0][1]) / 2.0,
-                [0] * 4
-            ]])
+      self.assertAllClose(embedding_lookup_result, [[(
+          embedding_weights[0][0] + embedding_weights[0][1]) / 2.0, [0] * 4, [
+              0
+          ] * 4], [
+              embedding_weights[0][2],
+              (embedding_weights[0][0] + embedding_weights[0][1]) / 2.0, [0] * 4
+          ]])
 
   def test_safe_embedding_lookup_sparse_3d_partitioned(self):
     with self.test_session():
@@ -233,13 +232,12 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
           embedding_weights, sparse_ids, None).eval())
 
       embedding_weights = list(itertools.chain(*embedding_weights))
-      self.assertAllClose(embedding_lookup_result,
-                          [[(embedding_weights[0] + embedding_weights[1]) / 2.0,
-                            [0] * 4, [0] * 4], [
-                                embedding_weights[2],
-                                (embedding_weights[0] + embedding_weights[1]) /
-                                2.0, [0] * 4
-                            ]])
+      self.assertAllClose(embedding_lookup_result, [[
+          (embedding_weights[0] + embedding_weights[1]) / 2.0, [0] * 4, [0] * 4
+      ], [
+          embedding_weights[2],
+          (embedding_weights[0] + embedding_weights[1]) / 2.0, [0] * 4
+      ]])
 
   def test_safe_embedding_lookup_sparse_3d_partitioned_inconsistent_weights(
       self):
@@ -251,8 +249,8 @@ class SafeEmbeddingLookupSparseTest(test.TestCase):
       self.assertRaises(ValueError, embedding_ops.safe_embedding_lookup_sparse,
                         embedding_weights, sparse_ids)
       embedding_weights = [
-          constant_op.constant(
-              w, dtype=dtypes.float64) for w in embedding_weights
+          constant_op.constant(w, dtype=dtypes.float64)
+          for w in embedding_weights
       ]
       self.assertRaises(ValueError, embedding_ops.safe_embedding_lookup_sparse,
                         embedding_weights, sparse_ids, sparse_weights)
@@ -301,8 +299,8 @@ class ScatteredEmbeddingLookupTest(test.TestCase):
       self.assertAllEqual(embedding_lookup_result[0],
                           embedding_lookup_result[1])
       # Different embedding expected for different value.
-      embedding_diff = np.min((embedding_lookup_result[2] -
-                               embedding_lookup_result[0])**2)
+      embedding_diff = np.min(
+          (embedding_lookup_result[2] - embedding_lookup_result[0])**2)
       self.assertGreater(embedding_diff, 0)
 
   def test_scattered_embedding_coverage(self):
@@ -320,8 +318,8 @@ class ScatteredEmbeddingLookupTest(test.TestCase):
   def test_scattered_embedding_multi_dimension(self):
     with self.test_session():
       embedding_weights = self._random_weights()
-      values = constant_op.constant(
-          [["foo", "bar", "bar"], ["bar", "bar", "foo"]])
+      values = constant_op.constant([["foo", "bar", "bar"],
+                                     ["bar", "bar", "foo"]])
 
       embedding_lookup_result = embedding_ops.scattered_embedding_lookup(
           embedding_weights, values, dimension=10).eval()
@@ -340,8 +338,8 @@ class ScatteredEmbeddingLookupTest(test.TestCase):
 
       embedding_lookup_result = (
           embedding_ops.scattered_embedding_lookup_sparse(
-              embedding_weights, sparse_tensor, dimension=5, combiner="mean")
-          .eval())
+              embedding_weights, sparse_tensor, dimension=5,
+              combiner="mean").eval())
 
       self.assertAllEqual(embedding_lookup_result.shape, [5, 5])
       # Same non-zero embedding for the empty rows filled with a default value.
@@ -433,8 +431,8 @@ class SampledScatteredEmbeddingLookupTest(test.TestCase):
   def test_hashed_embedding_multi_dimension(self):
     with self.test_session():
       embedding_weights = self._random_weights()
-      values = constant_op.constant(
-          [["foo", "bar", "bar"], ["bar", "bar", "foo"]])
+      values = constant_op.constant([["foo", "bar", "bar"],
+                                     ["bar", "bar", "foo"]])
       sampled_candidates = constant_op.constant(
           [[[1, 3, 4, 6], [1, 7, 8, 9], [1, 7, 8, 9]],
            [[1, 7, 8, 9], [1, 7, 8, 9], [1, 3, 4, 6]]])
@@ -491,8 +489,8 @@ class SampledScatteredEmbeddingLookupSparseTest(test.TestCase):
       result = embedding_ops._sampled_scattered_embedding_lookup_sparse(
           params, sp_values, dimension=5, hash_key=self._hash_key)
 
-      self.assertAllClose(result.eval(), [[0., 0., 0., 0., 0.],
-                                          [.3, .2, .2, .3, .1],
+      self.assertAllClose(result.eval(), [[0., 0., 0., 0.,
+                                           0.], [.3, .2, .2, .3, .1],
                                           [0., 0., 0., 0., 0.]])
 
   def test_output_values_with_sampled_candidates(self):
@@ -631,8 +629,8 @@ def _EmbeddingResult(params,
         else:
           partition = extras + (i - threshold) // ids_per_partition
           offset = (i - threshold) % ids_per_partition
-        val = np.copy(params[_PName(partition) + ":0"][
-            offset, :]) * weight_value
+        val = np.copy(
+            params[_PName(partition) + ":0"][offset, :]) * weight_value
       else:
         assert False
       if value_aggregation is None:
@@ -707,19 +705,19 @@ class EmbeddingLookupSparseWithDistributedAggregationTest(test.TestCase):
     grouped_ignored_weights = self._GroupByBatchEntry(
         np.ones(np.sum(vals_per_batch_entry)), vals_per_batch_entry)
 
-    for num_shards, combiner, dtype, ignore_weights in itertools.product([1, 5],
-        ["sum", "mean", "sqrtn"], [dtypes.float32, dtypes.float64],
-        [True, False]):
+    for num_shards, combiner, dtype, ignore_weights in itertools.product(
+        [1, 5], ["sum", "mean", "sqrtn"], [dtypes.float32,
+                                           dtypes.float64], [True, False]):
 
       with self.test_session():
         p, params, feed_dict = _EmbeddingParams(
             num_shards, vocab_size, shape=param_shape, dtype=dtype)
         embedding_sum = \
             embedding_ops.embedding_lookup_sparse_with_distributed_aggregation(
-            p,
-            sp_ids,
-            None if ignore_weights else sp_weights,
-            combiner=combiner)
+                p,
+                sp_ids,
+                None if ignore_weights else sp_weights,
+                combiner=combiner)
 
         self.assertEqual(embedding_sum.get_shape().as_list(),
                          expected_lookup_result_shape)
@@ -731,8 +729,8 @@ class EmbeddingLookupSparseWithDistributedAggregationTest(test.TestCase):
             grouped_ids,
             num_shards,
             vocab_size,
-            weight_vals=grouped_ignored_weights if ignore_weights else
-            grouped_weights)
+            weight_vals=grouped_ignored_weights
+            if ignore_weights else grouped_weights)
         if combiner == "mean":
           np_embedding_sum /= np.reshape(np_weight_sum, (batch_size, 1, 1))
         if combiner == "sqrtn":
@@ -744,12 +742,12 @@ class EmbeddingLookupSparseWithDistributedAggregationTest(test.TestCase):
     vocab_size = 12
     batch_size = 4
     param_shape = [2, 3]
-    sp_ids, sp_weights, _, _, _ = (
-        self._RandomIdsAndWeights(batch_size, vocab_size))
+    sp_ids, sp_weights, _, _, _ = (self._RandomIdsAndWeights(
+        batch_size, vocab_size))
 
-    for num_shards, combiner, dtype, ignore_weights in itertools.product([1, 3],
-        ["sum", "mean", "sqrtn"], [dtypes.float32, dtypes.float64],
-        [True, False]):
+    for num_shards, combiner, dtype, ignore_weights in itertools.product(
+        [1, 3], ["sum", "mean", "sqrtn"], [dtypes.float32,
+                                           dtypes.float64], [True, False]):
       with self.test_session():
         x, params, _ = _EmbeddingParams(
             num_shards, vocab_size, shape=param_shape, dtype=dtype)
diff --git a/tensorflow/contrib/layers/python/layers/layers.py b/tensorflow/contrib/layers/python/layers/layers.py
index 65dcf8577f0..0140f6d0d3e 100644
--- a/tensorflow/contrib/layers/python/layers/layers.py
+++ b/tensorflow/contrib/layers/python/layers/layers.py
@@ -1942,6 +1942,7 @@ def separable_convolution2d(
                                             dtype=dtype,
                                             initializer=biases_initializer,
                                             regularizer=biases_regularizer,
+                                            trainable=trainable,
                                             collections=biases_collections)
           outputs = nn.bias_add(outputs, biases)
 
diff --git a/tensorflow/contrib/layers/python/layers/layers_test.py b/tensorflow/contrib/layers/python/layers/layers_test.py
index 3bc31a26249..2b170e92ba1 100644
--- a/tensorflow/contrib/layers/python/layers/layers_test.py
+++ b/tensorflow/contrib/layers/python/layers/layers_test.py
@@ -2979,6 +2979,20 @@ class SeparableConv2dTest(test.TestCase):
       sess.run(init_op)
       sess.run(net, feed_dict={images_placeholder: images})
 
+  def testTrainableFlagIsPassedOn(self):
+    for trainable in [True, False]:
+      for num_filters in [None, 8]:
+        with ops.Graph().as_default():
+          input_size = [5, 10, 12, 3]
+
+          images = random_ops.random_uniform(input_size, seed=1)
+          layers_lib.separable_conv2d(
+              images, num_filters, [3, 3], 1, trainable=trainable)
+          model_variables = variables.get_model_variables()
+          trainable_variables = variables_lib.trainable_variables()
+          for model_variable in model_variables:
+            self.assertEqual(trainable, model_variable in trainable_variables)
+
 
 class ScaleGradientTests(test.TestCase):
   """Simple tests of the scale_gradient function."""
diff --git a/tensorflow/contrib/learn/python/learn/dataframe/queues/feeding_functions.py b/tensorflow/contrib/learn/python/learn/dataframe/queues/feeding_functions.py
index e71ad9b50b4..dfe08bb8633 100644
--- a/tensorflow/contrib/learn/python/learn/dataframe/queues/feeding_functions.py
+++ b/tensorflow/contrib/learn/python/learn/dataframe/queues/feeding_functions.py
@@ -22,6 +22,6 @@ from __future__ import print_function
 # pylint: disable=unused-import
 from tensorflow.python.estimator.inputs.queues.feeding_functions import _ArrayFeedFn
 from tensorflow.python.estimator.inputs.queues.feeding_functions import _enqueue_data as enqueue_data
+from tensorflow.python.estimator.inputs.queues.feeding_functions import _GeneratorFeedFn
 from tensorflow.python.estimator.inputs.queues.feeding_functions import _OrderedDictNumpyFeedFn
 from tensorflow.python.estimator.inputs.queues.feeding_functions import _PandasFeedFn
-from tensorflow.python.estimator.inputs.queues.feeding_functions import _GeneratorFeedFn
diff --git a/tensorflow/contrib/learn/python/learn/estimators/model_fn.py b/tensorflow/contrib/learn/python/learn/estimators/model_fn.py
index 6d15f83ef53..c0a39185493 100644
--- a/tensorflow/contrib/learn/python/learn/estimators/model_fn.py
+++ b/tensorflow/contrib/learn/python/learn/estimators/model_fn.py
@@ -26,6 +26,7 @@ import six
 from tensorflow.contrib import framework as contrib_framework
 from tensorflow.contrib.framework import get_graph_from_inputs
 from tensorflow.contrib.learn.python.learn.estimators import constants
+from tensorflow.contrib.learn.python.learn.estimators import metric_key
 from tensorflow.contrib.learn.python.learn.estimators import prediction_key
 from tensorflow.python.estimator import model_fn as core_model_fn_lib
 from tensorflow.python.estimator.export import export_output as core_export_lib
@@ -255,12 +256,20 @@ class ModelFnOps(
       export_outputs_dict = {key: _export_output(*val) for key, val in
                              output_alternatives.items()}
 
+    def _get_eval_metric_ops():
+      """Returns self.eval_metric_ops without loss metric."""
+      result = {}
+      for key, value in six.iteritems(self.eval_metric_ops):
+        if key != metric_key.MetricKey.LOSS:
+          result[key] = value
+      return result
+
     return core_model_fn_lib.EstimatorSpec(
         mode=mode,
         predictions=self.predictions,
         loss=self.loss,
         train_op=self.train_op,
-        eval_metric_ops=self.eval_metric_ops,
+        eval_metric_ops=_get_eval_metric_ops(),
         export_outputs=export_outputs_dict,
         training_chief_hooks=self.training_chief_hooks,
         training_hooks=self.training_hooks,
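`_get_eval_metric_ops` above simply drops the loss entry before building the core `EstimatorSpec`, since core `Estimator` reports loss separately. Equivalent standalone filtering over an illustrative dict (the real key comes from `metric_key.MetricKey.LOSS`):

```python
LOSS_KEY = 'loss'  # stand-in for metric_key.MetricKey.LOSS

eval_metric_ops = {
    'accuracy': ('value_tensor', 'update_op'),
    'loss': ('value_tensor', 'update_op'),
}

filtered = {k: v for k, v in eval_metric_ops.items() if k != LOSS_KEY}
print(sorted(filtered))  # ['accuracy']
```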
diff --git a/tensorflow/contrib/learn/python/learn/estimators/model_fn_test.py b/tensorflow/contrib/learn/python/learn/estimators/model_fn_test.py
index fe8b3a1b346..51b32359a33 100644
--- a/tensorflow/contrib/learn/python/learn/estimators/model_fn_test.py
+++ b/tensorflow/contrib/learn/python/learn/estimators/model_fn_test.py
@@ -18,6 +18,7 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
+import six
 
 from tensorflow.contrib.learn.python.learn.estimators import constants
 from tensorflow.contrib.learn.python.learn.estimators import model_fn
@@ -51,19 +52,26 @@ class ModelFnopsTest(test.TestCase):
         predictions=predictions,
         loss=constant_op.constant([1]),
         train_op=control_flow_ops.no_op(),
-        eval_metric_ops={"metric_key": (control_flow_ops.no_op(),
-                                        control_flow_ops.no_op())},
+        eval_metric_ops={
+            "metric_key": (constant_op.constant(1.), control_flow_ops.no_op()),
+            "loss": (constant_op.constant(1.), control_flow_ops.no_op()),
+        },
         # zzz
         training_chief_hooks=[basic_session_run_hooks.StepCounterHook()],
         training_hooks=[basic_session_run_hooks.StepCounterHook()],
         output_alternatives=output_alternatives,
         scaffold=monitored_session.Scaffold())
 
-  def assertEquals_except_export(self, model_fn_ops, estimator_spec):
+  def assertEquals_except_export_and_eval_loss(
+      self, model_fn_ops, estimator_spec):
+    expected_eval_metric_ops = {}
+    for key, value in six.iteritems(model_fn_ops.eval_metric_ops):
+      if key != "loss":
+        expected_eval_metric_ops[key] = value
     self.assertEqual(model_fn_ops.predictions, estimator_spec.predictions)
     self.assertEqual(model_fn_ops.loss, estimator_spec.loss)
     self.assertEqual(model_fn_ops.train_op, estimator_spec.train_op)
-    self.assertEqual(model_fn_ops.eval_metric_ops,
+    self.assertEqual(expected_eval_metric_ops,
                      estimator_spec.eval_metric_ops)
     self.assertEqual(model_fn_ops.training_chief_hooks,
                      estimator_spec.training_chief_hooks)
@@ -75,7 +83,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, None)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
   def testEstimatorSpec_export_regression_with_scores(self):
     predictions = self.create_predictions()
@@ -84,7 +92,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       regression_output = estimator_spec.export_outputs["regression_head"]
@@ -103,7 +111,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       regression_output = estimator_spec.export_outputs["regression_head"]
@@ -119,7 +127,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       classification_output = estimator_spec.export_outputs[
@@ -140,7 +148,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       classification_output = estimator_spec.export_outputs[
@@ -162,7 +170,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       classification_output = estimator_spec.export_outputs[
@@ -182,7 +190,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       classification_output = estimator_spec.export_outputs[
@@ -203,7 +211,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       classification_output = estimator_spec.export_outputs[
@@ -221,7 +229,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       logistic_output = estimator_spec.export_outputs["logistic_head"]
@@ -240,7 +248,7 @@ class ModelFnopsTest(test.TestCase):
     model_fn_ops = self.create_model_fn_ops(predictions, output_alternatives)
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER)
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       unspecified_output = estimator_spec.export_outputs["unspecified_head"]
@@ -259,7 +267,7 @@ class ModelFnopsTest(test.TestCase):
 
     estimator_spec = model_fn_ops.estimator_spec(model_fn.ModeKeys.INFER,
                                                  "regression_head")
-    self.assertEquals_except_export(model_fn_ops, estimator_spec)
+    self.assertEquals_except_export_and_eval_loss(model_fn_ops, estimator_spec)
 
     with session.Session():
       regression_output = estimator_spec.export_outputs["regression_head"]
diff --git a/tensorflow/contrib/learn/python/learn/learn_io/generator_io.py b/tensorflow/contrib/learn/python/learn/learn_io/generator_io.py
index 7d08f9b4523..4a70f00407e 100644
--- a/tensorflow/contrib/learn/python/learn/learn_io/generator_io.py
+++ b/tensorflow/contrib/learn/python/learn/learn_io/generator_io.py
@@ -18,8 +18,9 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
-from types import FunctionType, GeneratorType
 from collections import Container
+from types import FunctionType
+from types import GeneratorType
 
 from tensorflow.contrib.learn.python.learn.dataframe.queues import feeding_functions
 
@@ -33,7 +34,7 @@ def generator_input_fn(x,
                        num_threads=1):
   """Returns input function that would dicts of numpy arrays
        yielded from a generator.
-  
+
   It is assumed that every dict yielded from the generator represents
   a single sample. The generator should consume a single epoch of the data.
 
@@ -82,47 +83,44 @@ def generator_input_fn(x,
     KeyError: `key` mismatch between dicts emitted from `x()`
   """
   if not isinstance(x, FunctionType):
-    raise TypeError('x must be generator function; got {}'.format(
-        type(x).__name__))
+    raise TypeError(
+        'x must be generator function; got {}'.format(type(x).__name__))
   generator = x()
   if not isinstance(generator, GeneratorType):
-    raise TypeError('x() must be generator; got {}'.format(
-        type(generator).__name__))
+    raise TypeError(
+        'x() must be generator; got {}'.format(type(generator).__name__))
   data = next(generator)
   if not isinstance(data, dict):
-    raise TypeError('x() must yield dict; got {}'.format(
-        type(data).__name__))
+    raise TypeError('x() must yield dict; got {}'.format(type(data).__name__))
   input_keys = sorted(next(x()).keys())
   if target_key is not None:
     if isinstance(target_key, str):
       target_key = [target_key]
-    elif isinstance(target_key,  Container):
+    elif isinstance(target_key, Container):
       for item in target_key:
         if not isinstance(item, str):
-          raise TypeError(
-              'target_key must be str or Container of str; got {}'.format(
-                  type(item).__name__))
+          raise TypeError('target_key must be str or Container of str; got {}'.
+                          format(type(item).__name__))
         if item not in input_keys:
           raise KeyError(
               'target_key not in yielded dict. Expected {} keys; got {}'.format(
                   input_keys, item))
     else:
-      raise TypeError(
-          'target_key must be str or Container of str; got {}'.format(
-              type(target_key).__name__))
+      raise TypeError('target_key must be str or Container of str; got {}'.
+                      format(type(target_key).__name__))
 
   def _generator_input_fn():
     """generator input function."""
     queue = feeding_functions.enqueue_data(
-      x,
-      queue_capacity,
-      shuffle=shuffle,
-      num_threads=num_threads,
-      enqueue_size=batch_size,
-      num_epochs=num_epochs)
+        x,
+        queue_capacity,
+        shuffle=shuffle,
+        num_threads=num_threads,
+        enqueue_size=batch_size,
+        num_epochs=num_epochs)
 
-    features = (queue.dequeue_many(batch_size) if num_epochs is None
-                else queue.dequeue_up_to(batch_size))
+    features = (queue.dequeue_many(batch_size)
+                if num_epochs is None else queue.dequeue_up_to(batch_size))
     if not isinstance(features, list):
       features = [features]
     features = dict(zip(input_keys, features))
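
For orientation, a minimal usage sketch of the reformatted generator_input_fn
above, mirroring the tests that follow (illustrative only; assumes this
revision of the tf.contrib.learn generator_io module):

    import numpy as np
    from tensorflow.contrib.learn.python.learn.learn_io import generator_io

    def sample_generator():
      # Every yielded dict is one sample; its keys become feature names.
      for index in range(2):
        yield {'a': np.ones(1) * index, 'label': np.ones(1) * index - 32}

    # Builds an input_fn that enqueues the generator's dicts and dequeues
    # batches; 'label' is split out as the target tensor.
    input_fn = generator_io.generator_input_fn(
        sample_generator, target_key='label', batch_size=2, shuffle=False,
        num_epochs=1)
    features, target = input_fn()
    # features['a'] and target are batch tensors; evaluating them requires
    # starting queue runners, as the tests below do.
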
diff --git a/tensorflow/contrib/learn/python/learn/learn_io/generator_io_test.py b/tensorflow/contrib/learn/python/learn/learn_io/generator_io_test.py
index 8d3cdec819c..ae68e35c219 100644
--- a/tensorflow/contrib/learn/python/learn/learn_io/generator_io_test.py
+++ b/tensorflow/contrib/learn/python/learn/learn_io/generator_io_test.py
@@ -35,17 +35,24 @@ from tensorflow.python.training import queue_runner_impl
 
 
 class GeneratorIoTest(test.TestCase):
+
   def testGeneratorInputFn(self):
 
     def generator():
       for index in range(2):
-        yield {'a': np.ones(1) * index,
-               'b': np.ones(1) * index + 32,
-               'label': np.ones(1) * index - 32}
+        yield {
+            'a': np.ones(1) * index,
+            'b': np.ones(1) * index + 32,
+            'label': np.ones(1) * index - 32
+        }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key='label', batch_size=2, shuffle=False, num_epochs=1)
+          generator,
+          target_key='label',
+          batch_size=2,
+          shuffle=False,
+          num_epochs=1)
       features, target = input_fn()
 
       coord = coordinator.Coordinator()
@@ -71,7 +78,7 @@ class GeneratorIoTest(test.TestCase):
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
+          generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
       features = input_fn()
 
       coord = coordinator.Coordinator()
@@ -91,15 +98,20 @@ class GeneratorIoTest(test.TestCase):
 
     def generator():
       for index in range(2):
-        yield {'a': np.ones(1) * index,
-               'b': np.ones(1) * index + 32,
-               'label': np.ones(1) * index - 32,
-               'label2': np.ones(1) * index - 64,
-               }
+        yield {
+            'a': np.ones(1) * index,
+            'b': np.ones(1) * index + 32,
+            'label': np.ones(1) * index - 32,
+            'label2': np.ones(1) * index - 64,
+        }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key=['label','label2'], batch_size=2, shuffle=False, num_epochs=1)
+          generator,
+          target_key=['label', 'label2'],
+          batch_size=2,
+          shuffle=False,
+          num_epochs=1)
       features, target = input_fn()
 
       coord = coordinator.Coordinator()
@@ -108,8 +120,10 @@ class GeneratorIoTest(test.TestCase):
       res = session.run([features, target])
       self.assertAllEqual(res[0]['a'], np.asarray([0, 1]).reshape(-1, 1))
       self.assertAllEqual(res[0]['b'], np.asarray([32, 33]).reshape(-1, 1))
-      self.assertAllEqual(res[1]['label'], np.asarray([-32, -31]).reshape(-1, 1))
-      self.assertAllEqual(res[1]['label2'], np.asarray([-64, -63]).reshape(-1, 1))
+      self.assertAllEqual(res[1]['label'], np.asarray([-32, -31]).reshape(
+          -1, 1))
+      self.assertAllEqual(res[1]['label2'],
+                          np.asarray([-64, -63]).reshape(-1, 1))
 
       session.run([features])
       with self.assertRaises(errors.OutOfRangeError):
@@ -122,22 +136,34 @@ class GeneratorIoTest(test.TestCase):
 
     def generator():
       for index in range(100):
-        yield {'a': np.ones((10, 10)) * index,
-               'b': np.ones((5, 5)) * index + 32,
-               'label': np.ones((3, 3)) * index - 32}
+        yield {
+            'a': np.ones((10, 10)) * index,
+            'b': np.ones((5, 5)) * index + 32,
+            'label': np.ones((3, 3)) * index - 32
+        }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key="label", batch_size=2, shuffle=False, num_epochs=1)
+          generator,
+          target_key='label',
+          batch_size=2,
+          shuffle=False,
+          num_epochs=1)
       features, target = input_fn()
 
       coord = coordinator.Coordinator()
       threads = queue_runner_impl.start_queue_runners(session, coord=coord)
 
       res = session.run([features, target])
-      self.assertAllEqual(res[0]['a'], np.vstack((np.zeros((10, 10)), np.ones((10, 10)))).reshape(2, 10, 10))
-      self.assertAllEqual(res[0]['b'], np.vstack((np.zeros((5, 5)), np.ones((5, 5)))).reshape(2, 5, 5) + 32)
-      self.assertAllEqual(res[1], np.vstack((np.zeros((3, 3)), np.ones((3, 3)))).reshape(2, 3, 3) - 32)
+      self.assertAllEqual(res[0]['a'],
+                          np.vstack((np.zeros((10, 10)), np.ones(
+                              (10, 10)))).reshape(2, 10, 10))
+      self.assertAllEqual(res[0]['b'],
+                          np.vstack((np.zeros((5, 5)), np.ones(
+                              (5, 5)))).reshape(2, 5, 5) + 32)
+      self.assertAllEqual(res[1],
+                          np.vstack((np.zeros((3, 3)), np.ones(
+                              (3, 3)))).reshape(2, 3, 3) - 32)
 
       coord.request_stop()
       coord.join(threads)
@@ -147,82 +173,97 @@ class GeneratorIoTest(test.TestCase):
     with self.test_session():
       with self.assertRaisesRegexp(TypeError, 'x must be generator function'):
         failing_input_fn = generator_io.generator_input_fn(
-          x, batch_size=2, shuffle=False, num_epochs=1)
+            x, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFnWithXAsNonGenerator(self):
+
     def generator():
       return np.arange(32, 36)
+
     with self.test_session():
-      with self.assertRaisesRegexp(TypeError, "x\(\) must be generator"):
+      with self.assertRaisesRegexp(TypeError, 'x\(\) must be generator'):
         failing_input_fn = generator_io.generator_input_fn(
-          generator, batch_size=2, shuffle=False, num_epochs=1)
+            generator, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFnWithXAsNonGeneratorYieldingDicts(self):
+
     def generator():
       yield np.arange(32, 36)
+
     with self.test_session():
-      with self.assertRaisesRegexp(TypeError, "x\(\) must yield dict"):
+      with self.assertRaisesRegexp(TypeError, 'x\(\) must yield dict'):
         failing_input_fn = generator_io.generator_input_fn(
-          generator, batch_size=2, shuffle=False, num_epochs=1)
+            generator, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFNWithTargetLabelNotString(self):
+
     def generator():
       for index in range(2):
-        yield {'a': np.ones((10, 10)) * index,
-               'b': np.ones((5, 5)) * index + 32,
-               'label': np.ones((3, 3)) * index - 32}
+        yield {
+            'a': np.ones((10, 10)) * index,
+            'b': np.ones((5, 5)) * index + 32,
+            'label': np.ones((3, 3)) * index - 32
+        }
 
     y = np.arange(32, 36)
     with self.test_session():
       with self.assertRaisesRegexp(TypeError, 'target_key must be str or'
-                                              ' Container of str'):
+                                   ' Container of str'):
         failing_input_fn = generator_io.generator_input_fn(
-          generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
+            generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFNWithTargetLabelListNotString(self):
+
     def generator():
       for index in range(2):
-        yield {'a': np.ones((10, 10)) * index,
-               'b': np.ones((5, 5)) * index + 32,
-               'label': np.ones((3, 3)) * index - 32}
+        yield {
+            'a': np.ones((10, 10)) * index,
+            'b': np.ones((5, 5)) * index + 32,
+            'label': np.ones((3, 3)) * index - 32
+        }
 
-    y = ["label", np.arange(10)]
+    y = ['label', np.arange(10)]
     with self.test_session():
       with self.assertRaisesRegexp(TypeError, 'target_key must be str or'
-                                              ' Container of str'):
+                                   ' Container of str'):
         failing_input_fn = generator_io.generator_input_fn(
-          generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
+            generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFNWithTargetLabelNotInDict(self):
+
     def generator():
       for index in range(2):
-        yield {'a': np.ones((10, 10)) * index,
-               'b': np.ones((5, 5)) * index + 32,
-               'label': np.ones((3, 3)) * index - 32}
+        yield {
+            'a': np.ones((10, 10)) * index,
+            'b': np.ones((5, 5)) * index + 32,
+            'label': np.ones((3, 3)) * index - 32
+        }
 
-    y = ["label", "target"]
+    y = ['label', 'target']
     with self.test_session():
-      with self.assertRaisesRegexp(KeyError,
-                                   'target_key not in yielded dict'):
+      with self.assertRaisesRegexp(KeyError, 'target_key not in yielded dict'):
         failing_input_fn = generator_io.generator_input_fn(
-          generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
+            generator, target_key=y, batch_size=2, shuffle=False, num_epochs=1)
         failing_input_fn()
 
   def testGeneratorInputFnWithNoTargetKey(self):
+
     def generator():
       for index in range(2):
-        yield {'a': np.ones(1) * index,
-               'b': np.ones(1) * index + 32,
-               'label': np.ones(1) * index - 32}
+        yield {
+            'a': np.ones(1) * index,
+            'b': np.ones(1) * index + 32,
+            'label': np.ones(1) * index - 32
+        }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
+          generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
       features = input_fn()
 
       coord = coordinator.Coordinator()
@@ -241,15 +282,18 @@ class GeneratorIoTest(test.TestCase):
       coord.join(threads)
 
   def testGeneratorInputFnWithBatchLargerthanData(self):
+
     def generator():
       for index in range(2):
-        yield {'a': np.ones(1) * index,
-               'b': np.ones(1) * index + 32,
-               'label': np.ones(1) * index - 32}
+        yield {
+            'a': np.ones(1) * index,
+            'b': np.ones(1) * index + 32,
+            'label': np.ones(1) * index - 32
+        }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key=None, batch_size=4, shuffle=False, num_epochs=1)
+          generator, target_key=None, batch_size=4, shuffle=False, num_epochs=1)
       features = input_fn()
 
       coord = coordinator.Coordinator()
@@ -268,19 +312,24 @@ class GeneratorIoTest(test.TestCase):
       coord.join(threads)
 
   def testGeneratorInputFnWithMismatchinGeneratorKeys(self):
+
     def generator():
       index = 0
-      yield {'a': np.ones(1) * index,
-             'b': np.ones(1) * index + 32,
-             'label': np.ones(1) * index - 32}
+      yield {
+          'a': np.ones(1) * index,
+          'b': np.ones(1) * index + 32,
+          'label': np.ones(1) * index - 32
+      }
       index = 1
-      yield {'a': np.ones(1) * index,
-             'c': np.ones(1) * index + 32,
-             'label': np.ones(1) * index - 32}
+      yield {
+          'a': np.ones(1) * index,
+          'c': np.ones(1) * index + 32,
+          'label': np.ones(1) * index - 32
+      }
 
     with self.test_session() as session:
       input_fn = generator_io.generator_input_fn(
-        generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
+          generator, target_key=None, batch_size=2, shuffle=False, num_epochs=1)
       features = input_fn()
 
       coord = coordinator.Coordinator()
@@ -290,9 +339,10 @@ class GeneratorIoTest(test.TestCase):
         session.run([features])
 
       with self.assertRaisesRegex(KeyError, 'key mismatch between dicts emitted'
-                                            ' by GenFunExpected'):
+                                  ' by GenFunExpected'):
         coord.request_stop()
         coord.join(threads)
 
+
 if __name__ == '__main__':
   test.main()
diff --git a/tensorflow/contrib/opt/python/training/external_optimizer.py b/tensorflow/contrib/opt/python/training/external_optimizer.py
index db04cd25607..0909760b383 100644
--- a/tensorflow/contrib/opt/python/training/external_optimizer.py
+++ b/tensorflow/contrib/opt/python/training/external_optimizer.py
@@ -99,8 +99,13 @@ class ExternalOptimizerInterface(object):
         slice(start, end) for start, end in zip(accumulated_dims[:-1],
                                                 accumulated_dims[1:])]
 
-  def minimize(self, session=None, feed_dict=None, fetches=None,
-               step_callback=None, loss_callback=None, **run_kwargs):
+  def minimize(self,
+               session=None,
+               feed_dict=None,
+               fetches=None,
+               step_callback=None,
+               loss_callback=None,
+               **run_kwargs):
     """Minimize a scalar `Tensor`.
 
     Variables subject to optimization are updated in-place at the end of
@@ -120,7 +125,7 @@ class ExternalOptimizerInterface(object):
         flattened into a single vector.
       loss_callback: A function to be called every time the loss and gradients
         are computed, with evaluated fetches supplied as positional arguments.
-      run_kwargs: kwargs to pass to `session.run`.
+      **run_kwargs: kwargs to pass to `session.run`.
     """
     session = session or ops.get_default_session()
     feed_dict = feed_dict or {}
@@ -161,9 +166,10 @@ class ExternalOptimizerInterface(object):
                 for packing_slice in self._packing_slices]
 
     # Set optimization variables to their new values.
-    session.run(self._var_updates,
-                feed_dict=dict(zip(self._update_placeholders, var_vals)),
-                **run_kwargs)
+    session.run(
+        self._var_updates,
+        feed_dict=dict(zip(self._update_placeholders, var_vals)),
+        **run_kwargs)
 
   def _minimize(self, initial_val, loss_grad_func, equality_funcs,
                 equality_grad_funcs, inequality_funcs, inequality_grad_funcs,
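
A small usage sketch for the reformatted minimize signature above, using the
ScipyOptimizerInterface subclass defined later in this module (a sketch under
the assumption that tf.contrib.opt.ScipyOptimizerInterface is available at
this revision; extra keyword arguments are forwarded to session.run per the
corrected **run_kwargs doc):

    import tensorflow as tf

    x = tf.Variable([7.0, -3.0])
    loss = tf.reduce_sum(tf.square(x))
    optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, method='L-BFGS-B')

    with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      # Evaluated fetches are handed to loss_callback as positional args;
      # any **run_kwargs (none here) would be passed through to session.run.
      optimizer.minimize(sess,
                         fetches=[loss],
                         loss_callback=lambda loss_value: None)
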
diff --git a/tensorflow/contrib/seq2seq/python/ops/loss.py b/tensorflow/contrib/seq2seq/python/ops/loss.py
index 7e67c5f8a40..39a6d2f58b1 100644
--- a/tensorflow/contrib/seq2seq/python/ops/loss.py
+++ b/tensorflow/contrib/seq2seq/python/ops/loss.py
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ==============================================================================
-
 """Seq2seq loss operations for use in sequence models.
 """
 
@@ -28,16 +27,21 @@ from tensorflow.python.ops import nn_ops
 __all__ = ["sequence_loss"]
 
 
-def sequence_loss(logits, targets, weights,
-                  average_across_timesteps=True, average_across_batch=True,
-                  softmax_loss_function=None, name=None):
-  """Weighted cross-entropy loss for a sequence of logits. Depending on the
-  values of `average_across_timesteps` and `average_across_batch`, the return
-  Tensor will have rank 0, 1, or 2 as these arguments reduce the cross-entropy
-  at each target, which has shape `[batch_size, sequence_length]`, over their
-  respective dimensions. For example, if `average_across_timesteps` is `True`
-  and `average_across_batch` is `False`, then the return Tensor will have shape
-  `[batch_size]`.
+def sequence_loss(logits,
+                  targets,
+                  weights,
+                  average_across_timesteps=True,
+                  average_across_batch=True,
+                  softmax_loss_function=None,
+                  name=None):
+  """Weighted cross-entropy loss for a sequence of logits.
+
+  Depending on the values of `average_across_timesteps` and
+  `average_across_batch`, the return Tensor will have rank 0, 1, or 2 as these
+  arguments reduce the cross-entropy at each target, which has shape
+  `[batch_size, sequence_length]`, over their respective dimensions. For
+  example, if `average_across_timesteps` is `True` and `average_across_batch`
+  is `False`, then the return Tensor will have shape `[batch_size]`.
 
   Args:
     logits: A Tensor of shape
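
To make the reworked docstring concrete, a hedged sketch of the shape
behaviour it describes (assumes tf.contrib.seq2seq.sequence_loss is
importable at this revision):

    import tensorflow as tf

    batch_size, seq_len, vocab_size = 4, 5, 10
    logits = tf.random_normal([batch_size, seq_len, vocab_size])
    targets = tf.zeros([batch_size, seq_len], dtype=tf.int32)
    weights = tf.ones([batch_size, seq_len])

    # Averaging over both time and batch reduces the loss to a scalar.
    scalar_loss = tf.contrib.seq2seq.sequence_loss(logits, targets, weights)

    # Averaging over time but not over batch leaves shape [batch_size],
    # matching the example in the docstring.
    per_example_loss = tf.contrib.seq2seq.sequence_loss(
        logits, targets, weights, average_across_batch=False)
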
diff --git a/tensorflow/core/common_runtime/gpu/gpu_device.cc b/tensorflow/core/common_runtime/gpu/gpu_device.cc
index a863b69cef0..8b9f4684875 100644
--- a/tensorflow/core/common_runtime/gpu/gpu_device.cc
+++ b/tensorflow/core/common_runtime/gpu/gpu_device.cc
@@ -274,14 +274,14 @@ Status BaseGPUDevice::FillContextMap(const Graph* graph,
                                      DeviceContextMap* device_context_map) {
   VLOG(2) << "FillContextMap";
 
-  const auto num_streams = streams_.size();
+  const size_t num_streams = streams_.size();
   // Special case for single stream.
   if (num_streams == 1) {
     return Status::OK();
   }
   const int64 before = Env::Default()->NowMicros();
   gpu_stream_util::AssignStreamsOpts opts;
-  opts.max_streams = num_streams;
+  opts.max_streams = static_cast<int32>(num_streams);
   std::unordered_map<int, int> node_to_stream_id;
   TF_RETURN_IF_ERROR(
       gpu_stream_util::AssignStreams(graph, opts, &node_to_stream_id));
@@ -519,7 +519,7 @@ void BaseGPUDevice::ReinitializeGpuDevice(OpKernelContext* context,
 Status BaseGPUDeviceFactory::CreateDevices(const SessionOptions& options,
                                            const string& name_prefix,
                                            std::vector<Device*>* devices) {
-  int n = INT_MAX;
+  size_t n = INT_MAX;
   auto iter = options.config.device_count().find("GPU");
   if (iter != options.config.device_count().end()) {
     n = iter->second;
@@ -971,7 +971,7 @@ Status BaseGPUDeviceFactory::GetValidDeviceIds(
       continue;
     }
 
-    int new_id = ids->size();
+    size_t new_id = ids->size();
     ids->push_back(visible_gpu_id);
 
     LOG(INFO) << "Creating TensorFlow device (/gpu:" << new_id << ") -> "
diff --git a/tensorflow/core/common_runtime/gpu/gpu_event_mgr_test.cc b/tensorflow/core/common_runtime/gpu/gpu_event_mgr_test.cc
index e4cd79bc7f0..8226cc035c8 100644
--- a/tensorflow/core/common_runtime/gpu/gpu_event_mgr_test.cc
+++ b/tensorflow/core/common_runtime/gpu/gpu_event_mgr_test.cc
@@ -37,12 +37,12 @@ class TEST_EventMgrHelper {
     StopPollingLoop();
   }
 
-  int queue_size() {
+  size_t queue_size() {
     mutex_lock l(em_->mu_);
     return em_->used_events_.size();
   }
 
-  int free_size() {
+  size_t free_size() {
     mutex_lock l(em_->mu_);
     return em_->free_events_.size();
   }
diff --git a/tensorflow/core/common_runtime/shape_refiner.cc b/tensorflow/core/common_runtime/shape_refiner.cc
index f58faefa9fb..f2dff0bf75c 100644
--- a/tensorflow/core/common_runtime/shape_refiner.cc
+++ b/tensorflow/core/common_runtime/shape_refiner.cc
@@ -299,6 +299,13 @@ Status ShapeRefiner::ExtractConstantSubgraph(
       return Status::OK();
     }
 
+    // Don't constant fold enter/exit currently either, as it's easy to end
+    // up with a partial frame.
+    if (IsEnter(current_node) || IsExit(current_node)) {
+      *is_constant_graph = false;
+      return Status::OK();
+    }
+
     // If there is nothing more to recurse down, see if
     // the generator node is a constant.
     if (current_node->num_inputs() == 0) {
diff --git a/tensorflow/core/framework/function.cc b/tensorflow/core/framework/function.cc
index e1603330eba..002929d58a2 100644
--- a/tensorflow/core/framework/function.cc
+++ b/tensorflow/core/framework/function.cc
@@ -24,6 +24,7 @@ limitations under the License.
 #include "tensorflow/core/lib/core/errors.h"
 #include "tensorflow/core/lib/gtl/inlined_vector.h"
 #include "tensorflow/core/lib/gtl/map_util.h"
+#include "tensorflow/core/util/equal_graph_def.h"
 
 namespace tensorflow {
 
@@ -652,6 +653,36 @@ string DebugStringWhole(const GraphDef& gdef) {
   return ret;
 }
 
+bool FunctionDefsEqual(const FunctionDef& f1, const FunctionDef& f2) {
+  // NOTE(skyewm): Using MessageDifferencer would be better here, but that is
+  // currently not included in tensorflow/core/platform/default/protobuf.h, so
+  // play fast and loose here.  I don't see anything in OpDef that should allow
+  // multiple equivalent string serializations, with the exception of
+  // AttrValues, which can vary for tensor values (see AreAttrValuesEqual()
+  // comments).
+  string sig1, sig2;
+  f1.signature().SerializeToString(&sig1);
+  f2.signature().SerializeToString(&sig2);
+  if (sig1 != sig2) return false;
+
+  if (f1.attr().size() != f2.attr().size()) return false;
+  for (auto iter1 : f1.attr()) {
+    auto iter2 = f2.attr().find(iter1.first);
+    if (iter2 == f2.attr().end()) return false;
+    if (!AreAttrValuesEqual(iter1.second, iter2->second)) return false;
+  }
+
+  if (!EqualRepeatedNodeDef(f1.node_def(), f2.node_def(), nullptr)) {
+    return false;
+  }
+
+  std::map<string, string> ret1(f1.ret().begin(), f1.ret().end());
+  std::map<string, string> ret2(f2.ret().begin(), f2.ret().end());
+  if (ret1 != ret2) return false;
+
+  return true;
+}
+
 string Canonicalize(const string& funcname,
                     const InstantiateAttrValueMap& attrs) {
   std::vector<string> entries;
@@ -802,6 +833,17 @@ Status FunctionLibraryDefinition::AddLibrary(
   return Status::OK();
 }
 
+Status FunctionLibraryDefinition::AddLibrary(
+    const FunctionDefLibrary& lib_def) {
+  for (const FunctionDef& fdef : lib_def.function()) {
+    TF_RETURN_IF_ERROR(AddFunctionDef(fdef));
+  }
+  for (const GradientDef& grad : lib_def.gradient()) {
+    TF_RETURN_IF_ERROR(AddGradientDef(grad));
+  }
+  return Status::OK();
+}
+
 string FunctionLibraryDefinition::FindGradient(const string& func) const {
   return gtl::FindWithDefault(func_grad_, func, "");
 }
diff --git a/tensorflow/core/framework/function.h b/tensorflow/core/framework/function.h
index e27311041fd..63c868ac9b8 100644
--- a/tensorflow/core/framework/function.h
+++ b/tensorflow/core/framework/function.h
@@ -230,6 +230,10 @@ string DebugString(const GraphDef& instantiated_func_def);
 // its supporting functions defined in its library).
 string DebugStringWhole(const GraphDef& gdef);
 
+// Returns true if f1 == f2. Compares all fields, including descriptions. Order
+// of NodeDefs doesn't matter.
+bool FunctionDefsEqual(const FunctionDef& f1, const FunctionDef& f2);
+
 // Returns a canonicalized string for the instantiation of the
 // function of the given "name" and attributes "attrs".
 //
@@ -303,6 +307,9 @@ class FunctionLibraryDefinition : public OpRegistryInterface {
   // Adds the functions and gradients in 'other' to this function library.
   Status AddLibrary(const FunctionLibraryDefinition& other);
 
+  // Adds the functions and gradients in 'lib_def' to this function library.
+  Status AddLibrary(const FunctionDefLibrary& lib_def);
+
   // If the gradient function for 'func' is specified explicitly in
   // the library, returns the gradient function name.  Otherwise,
   // returns an empty string.
diff --git a/tensorflow/core/framework/function_test.cc b/tensorflow/core/framework/function_test.cc
index 414b0979978..4cd9dce0def 100644
--- a/tensorflow/core/framework/function_test.cc
+++ b/tensorflow/core/framework/function_test.cc
@@ -1107,4 +1107,36 @@ TEST(FunctionLibraryDefinitionTest, GetAttr_Gradient) {
   EXPECT_EQ(annotation, false);  // WXPlusB has no custom gradient.
 }
 
+// TODO(skyewm): this could be more thorough
+TEST(FunctionDefsEqualTest, TestFunctionDefsEqual) {
+  // Equal functions
+  FunctionDef fdef1 = test::function::XTimesTwo();
+  FunctionDef fdef2 = test::function::XTimesTwo();
+  EXPECT_TRUE(FunctionDefsEqual(fdef1, fdef2));
+
+  // Different functions
+  fdef2 = test::function::XTimesFour();
+  EXPECT_FALSE(FunctionDefsEqual(fdef1, fdef2));
+
+  // Different signatures
+  fdef2 = test::function::XTimesTwo();
+  fdef2.mutable_signature()->mutable_input_arg(0)->set_name("foo");
+  EXPECT_FALSE(FunctionDefsEqual(fdef1, fdef2));
+
+  // Descriptions must be equal
+  fdef2 = test::function::XTimesTwo();
+  fdef2.mutable_signature()->mutable_input_arg(0)->set_description("foo");
+  EXPECT_FALSE(FunctionDefsEqual(fdef1, fdef2));
+
+  // Different NodeDefs
+  fdef2 = test::function::XTimesTwo();
+  *fdef2.add_node_def() = fdef2.node_def(0);
+  EXPECT_FALSE(FunctionDefsEqual(fdef1, fdef2));
+
+  // Different return values
+  fdef2 = test::function::XTimesTwo();
+  (*fdef2.mutable_ret())["y"] = "y:z:1";  // originally is "y:z:0"
+  EXPECT_FALSE(FunctionDefsEqual(fdef1, fdef2));
+}
+
 }  // end namespace tensorflow
diff --git a/tensorflow/core/graph/graph.cc b/tensorflow/core/graph/graph.cc
index 65baf4cd856..c19764d0824 100644
--- a/tensorflow/core/graph/graph.cc
+++ b/tensorflow/core/graph/graph.cc
@@ -360,6 +360,45 @@ void Graph::RemoveEdge(const Edge* e) {
   free_edges_.push_back(del);
 }
 
+Status Graph::AddFunctionLibrary(const FunctionDefLibrary& fdef_lib) {
+  for (const FunctionDef& fdef : fdef_lib.function()) {
+    const FunctionDef* preexisting_fdef = ops_.Find(fdef.signature().name());
+    if (preexisting_fdef != nullptr) {
+      if (!FunctionDefsEqual(*preexisting_fdef, fdef)) {
+        return errors::InvalidArgument(
+            "Cannot add function '", fdef.signature().name(),
+            "' because a different function with the same name already "
+            "exists.");
+      }
+      // Ignore duplicate FunctionDefs
+      continue;
+    }
+    // TODO(skyewm): fix test breakages and reenable this check
+    // const OpDef* op_def;
+    // if (ops_.LookUpOpDef(fdef.signature().name(), &op_def).ok()) {
+    //   return errors::InvalidArgument(
+    //       "Cannot add function '", fdef.signature().name(),
+    //       "' because an op with the same name already exists.");
+    // }
+    TF_RETURN_IF_ERROR(ops_.AddFunctionDef(fdef));
+  }
+  for (const GradientDef& grad : fdef_lib.gradient()) {
+    string preexisting_grad_func = ops_.FindGradient(grad.function_name());
+    if (!preexisting_grad_func.empty()) {
+      if (preexisting_grad_func != grad.gradient_func()) {
+        return errors::InvalidArgument(
+            "Cannot assign gradient function '", grad.gradient_func(), "' to '",
+            grad.function_name(), "' because it already has gradient function ",
+            "'", preexisting_grad_func, "'");
+      }
+      // Ignore duplicate GradientDefs
+      continue;
+    }
+    TF_RETURN_IF_ERROR(ops_.AddGradientDef(grad));
+  }
+  return Status::OK();
+}
+
 namespace {
 
 void AddInput(NodeDef* dst, StringPiece src_name, int src_slot) {
@@ -380,7 +419,8 @@ void Graph::ToGraphDef(GraphDef* graph_def) const {
 
 void Graph::ToGraphDefSubRange(GraphDef* graph_def, int from_node_id) const {
   graph_def->Clear();
-  graph_def->mutable_versions()->CopyFrom(versions());
+  *graph_def->mutable_versions() = versions();
+  *graph_def->mutable_library() = ops_.ToProto();
   std::vector<const Edge*>
       inputs;  // Construct this outside the loop for speed.
   for (auto id = from_node_id; id < num_node_ids(); ++id) {
diff --git a/tensorflow/core/graph/graph.h b/tensorflow/core/graph/graph.h
index 4af4b0bb109..83bb797d1be 100644
--- a/tensorflow/core/graph/graph.h
+++ b/tensorflow/core/graph/graph.h
@@ -324,6 +324,12 @@ class Graph {
   // REQUIRES: The edge must exist.
   void RemoveEdge(const Edge* edge);
 
+  // Adds the function and gradient definitions in `fdef_lib` to this graph's op
+  // registry. Ignores duplicate functions, and returns a bad status if an
+  // imported function differs from an existing function or op with the same
+  // name.
+  Status AddFunctionLibrary(const FunctionDefLibrary& fdef_lib);
+
   // The number of live nodes in the graph.
   //
   // Because nodes can be removed from the graph, num_nodes() is often
diff --git a/tensorflow/core/graph/graph_constructor.cc b/tensorflow/core/graph/graph_constructor.cc
index 6b27e4e2945..6b3b5d36047 100644
--- a/tensorflow/core/graph/graph_constructor.cc
+++ b/tensorflow/core/graph/graph_constructor.cc
@@ -22,6 +22,7 @@ limitations under the License.
 #include <vector>
 
 #include "tensorflow/core/common_runtime/shape_refiner.h"
+#include "tensorflow/core/framework/function.h"
 #include "tensorflow/core/framework/function.pb.h"
 #include "tensorflow/core/framework/graph.pb.h"
 #include "tensorflow/core/framework/node_def_util.h"
@@ -604,6 +605,10 @@ void GraphConstructor::AddPrefixToNodeDef(
 }
 
 Status GraphConstructor::Convert() {
+  // Import functions before adding nodes, since imported nodes may refer to
+  // functions
+  TF_RETURN_IF_ERROR(g_->AddFunctionLibrary(gdef_->library()));
+
   std::vector<InputInfo> inputs;
   int processed = 0;
   // Process the NodeDefs in topological order.
@@ -705,7 +710,12 @@ Status GraphConstructor::Convert() {
         TF_RETURN_IF_ERROR(MakeEdge(inputs[i].node, inputs[i].index, node, i));
       }
     }
-    TF_RETURN_IF_ERROR(ValidateShape(node));
+
+    // TODO(skyewm): remove conditional when b/35715995 ("Functions lack shape
+    // inference") is resolved.
+    if (g_->flib_def().Find(node_def->name()) == nullptr) {
+      TF_RETURN_IF_ERROR(ValidateShape(node));
+    }
 
     // Update pending_count_ for outputs.
     for (size_t i = 0; i < outputs_[o].size(); ++i) {
@@ -847,10 +857,6 @@ Status ImportGraphDef(const ImportGraphDefOptions& opts, const GraphDef& gdef,
           return_tensors->size(), ")");
     }
   }
-  if (gdef.library().function_size() != 0) {
-    return errors::Unimplemented(
-        "Importing GraphDefs containing functions not yet implemented");
-  }
   return GraphConstructor::Construct(opts, &gdef, g, refiner, return_tensors);
 }
 
diff --git a/tensorflow/core/graph/graph_constructor.h b/tensorflow/core/graph/graph_constructor.h
index 4252b08e48c..186859d132a 100644
--- a/tensorflow/core/graph/graph_constructor.h
+++ b/tensorflow/core/graph/graph_constructor.h
@@ -113,8 +113,6 @@ struct ImportGraphDefOptions {
   // with ops that are not defined in the binary calling ImportGraphDef.
   // Similar to the producer_op_list argument to import_graph_def in the
   // python API.
-
-  // TODO(skyewm): Enable importing functions
 };
 
 // Each `return_tensors` entry is the requested node and output index. The index
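
The graph_constructor changes above drop the "functions not yet implemented"
error, so a GraphDef carrying a function library can now be imported. A rough
Python sketch of the user-visible effect, mirroring the Defun-based snippets
quoted in the tests below (the import path and decorator are the internal ones
used by those tests, assumed stable at this revision):

    import tensorflow as tf
    from tensorflow.python.framework import function

    @function.Defun(tf.float32, tf.float32)
    def Foo(x, y):
      return x + y

    # Build a graph whose GraphDef includes Foo in its function library.
    g1 = tf.Graph()
    with g1.as_default():
      p1 = tf.placeholder(tf.float32)
      p2 = tf.placeholder(tf.float32)
      Foo(p1, p2)
    gdef = g1.as_graph_def()

    # Importing into a fresh graph now also registers the function library
    # instead of failing with errors::Unimplemented.
    g2 = tf.Graph()
    with g2.as_default():
      tf.import_graph_def(gdef, name='imported')
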
diff --git a/tensorflow/core/graph/graph_constructor_test.cc b/tensorflow/core/graph/graph_constructor_test.cc
index 7c847916d12..e20dabc8910 100644
--- a/tensorflow/core/graph/graph_constructor_test.cc
+++ b/tensorflow/core/graph/graph_constructor_test.cc
@@ -31,6 +31,7 @@ limitations under the License.
 #include "tensorflow/core/platform/logging.h"
 #include "tensorflow/core/platform/protobuf.h"
 #include "tensorflow/core/platform/test.h"
+#include "tensorflow/core/public/session.h"
 #include "tensorflow/core/public/version.h"
 
 // TODO(josh11b): Test InitCostModel().
@@ -2008,30 +2009,196 @@ TEST_F(GraphConstructorTest, ImportGraphDef_ErrorsDoNoChangeTheGraph) {
 #undef EXPECT_IMPORT_FAILURE
 }
 
-TEST_F(GraphConstructorTest, ImportGraphDef_ErrorFunctionDefsUnimplemented) {
-  ExpectError(
+TEST_F(GraphConstructorTest, ImportGraphDef_FunctionDefs) {
+  // Import a graph def containing a function. The graph def was generated using
+  // this python code:
+  // @function.Defun(tf.float32, tf.float32, tf.float32)
+  // def FooGrad(x, y, dz): return dz, dz
+  //
+  // @function.Defun(tf.float32, tf.float32, grad_func=FooGrad)
+  // def Foo(x, y): return x + y
+  //
+  // p1 = tf.placeholder(tf.float32)
+  // p2 = tf.placeholder(tf.float32)
+  // foo = Foo(p1, p2)
+  ImportGraphDefOptions opts;
+  ExpectOK(
       R"EOF(
-library {
-  function {
-    signature {
-      name: "Foo_cc661786"
-      input_arg {
-        name: "x"
-        type: DT_FLOAT
+      node {
+        name: "Placeholder" op: "Placeholder"
+        attr { key: "dtype" value { type: DT_FLOAT } }
+        attr { key: "shape" value { shape { } } }
       }
-      output_arg {
-        name: "x"
-        type: DT_FLOAT
+      node {
+        name: "Placeholder_1" op: "Placeholder"
+        attr { key: "dtype" value { type: DT_FLOAT } }
+        attr { key: "shape" value { shape { } } }
       }
-    }
-    ret {
-      key: "x"
-      value: "x:0"
-    }
-  }
-})EOF",
-      ImportGraphDefOptions(),
-      {"Importing GraphDefs containing functions not yet implemented"});
+      node {
+        name: "Foo_d03c39a3" op: "Foo_d03c39a3"
+        input: "Placeholder" input: "Placeholder_1"
+      }
+      library {
+        function {
+          signature {
+            name: "Foo_d03c39a3"
+            input_arg { name: "x" type: DT_FLOAT }
+            input_arg { name: "y" type: DT_FLOAT }
+            output_arg { name: "add" type: DT_FLOAT }
+          }
+          node_def {
+            name: "add" op: "Add" input: "x" input: "y"
+            attr { key: "T" value { type: DT_FLOAT } }
+          }
+          ret { key: "add" value: "add:z:0" }
+        }
+        function {
+          signature {
+            name: "FooGrad_dc60abc8"
+            input_arg { name: "x" type: DT_FLOAT }
+            input_arg { name: "y" type: DT_FLOAT }
+            input_arg { name: "dz" type: DT_FLOAT }
+            output_arg { name: "dz" type: DT_FLOAT }
+            output_arg { name: "dz_U0" type: DT_FLOAT }
+          }
+          ret { key: "dz" value: "dz:0" }
+          ret { key: "dz_U0" value: "dz:0" }
+        }
+        gradient {
+          function_name: "Foo_d03c39a3" gradient_func: "FooGrad_dc60abc8"
+        }
+      }
+      versions { producer: 21 min_consumer: 12 }
+      )EOF",
+      opts);
+
+  EXPECT_TRUE(HasNode("Placeholder"));
+  EXPECT_TRUE(HasNode("Placeholder_1"));
+  EXPECT_TRUE(HasNode("Foo_d03c39a3"));
+  // Check that Foo and FooGrad have been imported
+  const OpDef* op_def;
+  TF_ASSERT_OK(graph_.op_registry()->LookUpOpDef("Foo_d03c39a3", &op_def));
+  TF_ASSERT_OK(graph_.op_registry()->LookUpOpDef("FooGrad_dc60abc8", &op_def));
+
+  // Re-serialize and run the graph. This tests that re-serialized functions can
+  // be imported again and that imported functions can be run.
+  GraphDef gdef;
+  graph_.ToGraphDef(&gdef);
+  EXPECT_EQ(gdef.library().function_size(), 2);
+  EXPECT_EQ(gdef.library().gradient_size(), 1);
+  EXPECT_EQ(gdef.library().gradient()[0].function_name(), "Foo_d03c39a3");
+  EXPECT_EQ(gdef.library().gradient()[0].gradient_func(), "FooGrad_dc60abc8");
+
+  std::unique_ptr<Session> sess(NewSession(SessionOptions()));
+  TF_ASSERT_OK(sess->Create(gdef));
+
+  Tensor p1(DT_FLOAT, TensorShape({1}));
+  p1.scalar<float>()() = 1.0;
+  Tensor p2(DT_FLOAT, TensorShape({1}));
+  p2.scalar<float>()() = 2.0;
+  std::vector<std::pair<string, Tensor>> inputs = {{"Placeholder", p1},
+                                                   {"Placeholder_1", p2}};
+  std::vector<string> output_names = {"Foo_d03c39a3"};
+  std::vector<string> target_names;
+  std::vector<Tensor> outputs;
+  TF_ASSERT_OK(sess->Run(inputs, output_names, target_names, &outputs));
+
+  ASSERT_EQ(outputs.size(), 1);
+  EXPECT_EQ(outputs[0].scalar<float>()(), 3.0);
+}
+
+TEST_F(GraphConstructorTest, ImportGraphDef_NestedFunctionDefs) {
+  // Import a graph def containing a function. The graph def was generated using
+  // this python code:
+  //   @function.Defun(tf.float32, tf.float32)
+  //   def Inner(x, y): return x + y
+  //
+  //   @function.Defun(tf.float32, tf.float32)
+  //   def Outer(x, y): return Inner(x, y)
+  //
+  //   p1 = tf.placeholder(tf.float32)
+  //   p2 = tf.placeholder(tf.float32)
+  //   Outer(p1, p2)
+  ImportGraphDefOptions opts;
+  ExpectOK(
+      R"EOF(
+      node {
+        name: "Placeholder" op: "Placeholder"
+        attr { key: "dtype" value { type: DT_FLOAT } }
+        attr { key: "shape" value { shape { } } }
+      }
+      node {
+        name: "Placeholder_1" op: "Placeholder"
+        attr { key: "dtype" value { type: DT_FLOAT } }
+        attr { key: "shape" value { shape { } } }
+      }
+      node {
+        name: "Outer_966fa13d" op: "Outer_966fa13d"
+        input: "Placeholder" input: "Placeholder_1"
+      }
+      library {
+        function {
+          signature {
+            name: "Outer_966fa13d"
+            input_arg { name: "x" type: DT_FLOAT }
+            input_arg { name: "y" type: DT_FLOAT }
+            output_arg { name: "Inner_d03c39a3" type: DT_FLOAT }
+          }
+          node_def {
+            name: "Inner_d03c39a3" op: "Inner_d03c39a3" input: "x" input: "y"
+          }
+          ret { key: "Inner_d03c39a3" value: "Inner_d03c39a3:add:0" }
+        }
+        function {
+          signature {
+            name: "Inner_d03c39a3"
+            input_arg { name: "x" type: DT_FLOAT }
+            input_arg { name: "y" type: DT_FLOAT }
+            output_arg { name: "add" type: DT_FLOAT }
+          }
+          node_def {
+            name: "add" op: "Add" input: "x" input: "y"
+            attr { key: "T" value { type: DT_FLOAT } }
+          }
+          ret { key: "add" value: "add:z:0" }
+        }
+      }
+      versions { producer: 21 min_consumer: 12 }
+      )EOF",
+      opts);
+
+  EXPECT_TRUE(HasNode("Placeholder"));
+  EXPECT_TRUE(HasNode("Placeholder_1"));
+  EXPECT_TRUE(HasNode("Outer_966fa13d"));
+  // Check that Inner and Outer have been imported
+  const OpDef* op_def;
+  Status s = graph_.op_registry()->LookUpOpDef("Inner_d03c39a3", &op_def);
+  ASSERT_TRUE(s.ok()) << s.error_message();
+  s = graph_.op_registry()->LookUpOpDef("Outer_966fa13d", &op_def);
+  ASSERT_TRUE(s.ok()) << s.error_message();
+
+  // Re-serialize and run the graph. This tests that re-serialized functions can
+  // be imported again and that imported functions can be run.
+  GraphDef gdef;
+  graph_.ToGraphDef(&gdef);
+  std::unique_ptr<Session> sess(NewSession(SessionOptions()));
+  s = sess->Create(gdef);
+  ASSERT_TRUE(s.ok()) << s.error_message();
+
+  Tensor p1(DT_FLOAT, TensorShape({1}));
+  p1.scalar<float>()() = 1.0;
+  Tensor p2(DT_FLOAT, TensorShape({1}));
+  p2.scalar<float>()() = 2.0;
+  std::vector<std::pair<string, Tensor>> inputs = {{"Placeholder", p1},
+                                                   {"Placeholder_1", p2}};
+  std::vector<string> output_names = {"Outer_966fa13d"};
+  std::vector<string> target_names;
+  std::vector<Tensor> outputs;
+  s = sess->Run(inputs, output_names, target_names, &outputs);
+  ASSERT_TRUE(s.ok()) << s.error_message();
+
+  ASSERT_EQ(outputs.size(), 1);
+  EXPECT_EQ(outputs[0].scalar<float>()(), 3.0);
 }
 
 TEST_F(GraphConstructorTest, CopyGraph) {
diff --git a/tensorflow/core/graph/graph_test.cc b/tensorflow/core/graph/graph_test.cc
index f5ed7a83e47..739ad90efd2 100644
--- a/tensorflow/core/graph/graph_test.cc
+++ b/tensorflow/core/graph/graph_test.cc
@@ -17,6 +17,7 @@ limitations under the License.
 
 #include <set>
 #include <vector>
+#include "tensorflow/core/framework/function_testlib.h"
 #include "tensorflow/core/graph/graph_constructor.h"
 #include "tensorflow/core/graph/node_builder.h"
 #include "tensorflow/core/kernels/ops_util.h"
@@ -387,6 +388,61 @@ TEST_F(GraphTest, InputEdges) {
   TF_EXPECT_OK(b->input_edges(&edges));
 }
 
+TEST_F(GraphTest, AddFunctionLibrary) {
+  // Basic functionality
+  FunctionDefLibrary proto;
+  *proto.add_function() = test::function::XTimesTwo();
+  *proto.add_function() = test::function::XTimesFour();
+  TF_EXPECT_OK(graph_.AddFunctionLibrary(proto));
+  EXPECT_TRUE(graph_.flib_def().Find("XTimesTwo") != nullptr);
+  EXPECT_TRUE(graph_.flib_def().Find("XTimesFour") != nullptr);
+
+  // Duplicate functions are ignored
+  TF_EXPECT_OK(graph_.AddFunctionLibrary(proto));
+  EXPECT_TRUE(graph_.flib_def().Find("XTimesTwo") != nullptr);
+  EXPECT_TRUE(graph_.flib_def().Find("XTimesFour") != nullptr);
+
+  // Duplicate names corresponding to different functions trigger an error
+  FunctionDefLibrary error_proto = proto;
+  *error_proto.mutable_function(0)->add_node_def() =
+      error_proto.function(0).node_def(0);
+  Status s = graph_.AddFunctionLibrary(error_proto);
+  EXPECT_FALSE(s.ok());
+  EXPECT_EQ(s.error_message(),
+            "Cannot add function 'XTimesTwo' because a different function with "
+            "the same name already exists.");
+
+  // TODO(skyewm): reenable along with duplicate op check
+  // Function with same name as an existing op triggers an error
+  // error_proto = proto;
+  // error_proto.mutable_function(0)->mutable_signature()->set_name("Add");
+  // s = graph_.AddFunctionLibrary(error_proto);
+  // EXPECT_FALSE(s.ok());
+  // EXPECT_EQ(s.error_message(),
+  //           "Cannot add function 'Add' because an op with the same name "
+  //           "already exists.");
+
+  // Adding a gradient function to an existing function is ok
+  GradientDef* grad = proto.add_gradient();
+  grad->set_function_name("XTimesTwo");
+  grad->set_gradient_func("Undefined");  // undefined funcs in grads are ok
+  TF_EXPECT_OK(graph_.AddFunctionLibrary(proto));
+  EXPECT_EQ(graph_.flib_def().FindGradient("XTimesTwo"), "Undefined");
+
+  // Duplicate gradients are ignored
+  TF_EXPECT_OK(graph_.AddFunctionLibrary(proto));
+  EXPECT_EQ(graph_.flib_def().FindGradient("XTimesTwo"), "Undefined");
+
+  // Conflicting gradient triggers an error
+  error_proto = proto;
+  error_proto.mutable_gradient(0)->set_gradient_func("Undefined2");
+  s = graph_.AddFunctionLibrary(error_proto);
+  EXPECT_FALSE(s.ok());
+  EXPECT_EQ(s.error_message(),
+            "Cannot assign gradient function 'Undefined2' to 'XTimesTwo' "
+            "because it already has gradient function 'Undefined'");
+}
+
 REGISTER_OP("Input").Output("o: float");
 REGISTER_OP("In2Out1").Input("a: float").Input("b: float").Output("o: float");
 
diff --git a/tensorflow/core/graph/mkl_layout_pass.cc b/tensorflow/core/graph/mkl_layout_pass.cc
index 6deaa79485d..309c4cd774c 100644
--- a/tensorflow/core/graph/mkl_layout_pass.cc
+++ b/tensorflow/core/graph/mkl_layout_pass.cc
@@ -255,47 +255,47 @@ static size_t kNodeMergeContextMaxDepth = 10;
 class MklLayoutRewritePass : public GraphOptimizationPass {
  public:
   MklLayoutRewritePass() {
-    csinfo_.conv2d            = "Conv2D";
-    csinfo_.mklconv2d         = "MklConv2D";
+    csinfo_.conv2d = "Conv2D";
+    csinfo_.mklconv2d = "MklConv2D";
     csinfo_.mklconv2dwithbias = "MklConv2DWithBias";
     csinfo_.mklconv2dwithbiasbackpropbias = "MklConv2DWithBiasBackpropBias";
-    csinfo_.biasadd           = "BiasAdd";
-    csinfo_.matmul            = "MatMul";
-    csinfo_.biasaddgrad       = "BiasAddGrad";
-    csinfo_.relu              = "Relu";
-    csinfo_.relugrad          = "ReluGrad";
-    csinfo_.maxpool           = "MaxPool";
-    csinfo_.maxpoolgrad       = "MaxPoolGrad";
-    csinfo_.avgpool           = "AvgPool";
-    csinfo_.avgpoolgrad       = "AvgPoolGrad";
-    csinfo_.conv2dgradinput   = "Conv2DBackpropInput";
-    csinfo_.conv2dgradfilter  = "Conv2DBackpropFilter";
+    csinfo_.biasadd = "BiasAdd";
+    csinfo_.matmul = "MatMul";
+    csinfo_.biasaddgrad = "BiasAddGrad";
+    csinfo_.relu = "Relu";
+    csinfo_.relugrad = "ReluGrad";
+    csinfo_.maxpool = "MaxPool";
+    csinfo_.maxpoolgrad = "MaxPoolGrad";
+    csinfo_.avgpool = "AvgPool";
+    csinfo_.avgpoolgrad = "AvgPoolGrad";
+    csinfo_.conv2dgradinput = "Conv2DBackpropInput";
+    csinfo_.conv2dgradfilter = "Conv2DBackpropFilter";
 
-    rinfo_.push_back({csinfo_.conv2d,   csinfo_.mklconv2d,
-                      2, CopyAttrsConv2D, AlwaysRewrite});
+    rinfo_.push_back(
+        {csinfo_.conv2d, csinfo_.mklconv2d, 2, CopyAttrsConv2D, AlwaysRewrite});
     rinfo_.push_back({csinfo_.conv2dgradfilter,
-        GetMklOpName(csinfo_.conv2dgradfilter),
-                      3, CopyAttrsConv2D, AlwaysRewrite});
+                      GetMklOpName(csinfo_.conv2dgradfilter), 3,
+                      CopyAttrsConv2D, AlwaysRewrite});
     rinfo_.push_back({csinfo_.conv2dgradinput,
-        GetMklOpName(csinfo_.conv2dgradinput),
-                      3, CopyAttrsConv2D, AlwaysRewrite});
-    rinfo_.push_back({csinfo_.relu, GetMklOpName(csinfo_.relu),
-                      1, CopyAttrsRelu, AlwaysRewrite});
-    rinfo_.push_back({csinfo_.maxpool, GetMklOpName(csinfo_.maxpool),
-                      1, CopyAttrsPooling, AlwaysRewrite});
-    rinfo_.push_back({csinfo_.maxpoolgrad, GetMklOpName(csinfo_.maxpoolgrad),
-                      3, CopyAttrsPooling, AlwaysRewrite});
-    rinfo_.push_back({csinfo_.avgpool, GetMklOpName(csinfo_.avgpool),
-                      1, CopyAttrsPooling, AlwaysRewrite});
-    rinfo_.push_back({csinfo_.avgpoolgrad, GetMklOpName(csinfo_.avgpoolgrad),
-                      2, CopyAttrsPooling, AlwaysRewrite});
+                      GetMklOpName(csinfo_.conv2dgradinput), 3, CopyAttrsConv2D,
+                      AlwaysRewrite});
+    rinfo_.push_back({csinfo_.relu, GetMklOpName(csinfo_.relu), 1,
+                      CopyAttrsRelu, AlwaysRewrite});
+    rinfo_.push_back({csinfo_.maxpool, GetMklOpName(csinfo_.maxpool), 1,
+                      CopyAttrsPooling, AlwaysRewrite});
+    rinfo_.push_back({csinfo_.maxpoolgrad, GetMklOpName(csinfo_.maxpoolgrad), 3,
+                      CopyAttrsPooling, AlwaysRewrite});
+    rinfo_.push_back({csinfo_.avgpool, GetMklOpName(csinfo_.avgpool), 1,
+                      CopyAttrsPooling, AlwaysRewrite});
+    rinfo_.push_back({csinfo_.avgpoolgrad, GetMklOpName(csinfo_.avgpoolgrad), 2,
+                      CopyAttrsPooling, AlwaysRewrite});
 
     // Add info about which ops to add workspace edge to and the slots.
     wsinfo_.push_back({csinfo_.maxpool, csinfo_.maxpoolgrad, 0, 1, 2, 6});
 
     // Add a rule for merging nodes
-    minfo_.push_back({csinfo_.mklconv2d, csinfo_.biasadd, 0,
-                      csinfo_.mklconv2dwithbias});
+    minfo_.push_back(
+        {csinfo_.mklconv2d, csinfo_.biasadd, 0, csinfo_.mklconv2dwithbias});
 
     // We use maxhop of 10 based on empirical observations. Also, these are
     // maxhops in backward data-flow graph. Since input of forward nodes
@@ -322,13 +322,13 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
   /// the number of inputs to the original op, and the function to be used
   /// to copy attributes for the op
   typedef struct {
-    string name;   // Original name of the op in the graph
-    string newname;   // New name of op in the graph
-    int    numins;  // Number of inputs to the original op
+    string name;     // Original name of the op in the graph
+    string newname;  // New name of op in the graph
+    int numins;      // Number of inputs to the original op
     // Function handler to copy attributes from old node to new node.
     std::function<void(const Node*, NodeBuilder*)> copyattrs;
     std::function<bool(const Node*)> rewriterule;  // Rule under which to
-                    // rewrite this node.
+                                                   // rewrite this node.
   } RewriteInfo;
 
   /// Structure to specify forward op, backward op, and the slot numbers
@@ -348,18 +348,18 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
 
   /// Structure to specify information used in node merge
   typedef struct {
-    string pred;  // Predecessor node string
-    string succ;  // Successor node string
-    int    op;    // What operand no the predecessor node corresponds
-                  // to successor node?
+    string pred;     // Predecessor node string
+    string succ;     // Successor node string
+    int op;          // Which operand of the successor node the
+                     // predecessor node corresponds to.
     string newnode;  // Name of the node after merge
   } MergeInfo;
 
   /// Structure to specify the context information used in node rewrite rule
   typedef struct {
-    string node;  // Name of the node to be rewritten
-    string fwd;  // Node name in forward pass that this node
-                 // corresponds to
+    string node;    // Name of the node to be rewritten
+    string fwd;     // Node name in forward pass that this node
+                    // corresponds to
     size_t maxhop;  // Maximum number of hops the fwd is located
                     // from this node. If fwd is farther than maxhop
                     // then we do not rewrite the node.
@@ -418,9 +418,7 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
   inline void MarkRewrittenNode(Node* n) { visited_nodes_.insert(n); }
 
   // Clear all visited nodes
-  inline void UnMarkRewrittenNodes() {
-    visited_nodes_.clear();
-  }
+  inline void UnMarkRewrittenNodes() { visited_nodes_.clear(); }
 
   // Get the name of Mkl op from original TensorFlow op
   // We prefix 'Mkl' to the original op to get Mkl op.
@@ -455,7 +453,7 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
   // We check for 2 scenarios for rewrite.
   //
   // @return RewriteInfo* for the applicable rewrite rule
-  const RewriteInfo* CheckForNodeRewrite(const Node *n) const;
+  const RewriteInfo* CheckForNodeRewrite(const Node* n) const;
 
   // Default rewrite rule to be used in scenario 1 for rewrite.
   // @return - true (since we want to always rewrite)
@@ -512,7 +510,7 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
   // NodeBuilder 'nb' for the new node provided. If 'orign' does not dictate
   // adding workspace edge then do not add it.
   void AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g, Node* orign,
-      NodeBuilder* nb);
+                                NodeBuilder* nb);
 
   // Functions specific to operators to copy attributes
   // We need operator-specific function to copy attributes because the framework
@@ -528,10 +526,9 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
   void GetDummyMklTensorNode(std::unique_ptr<Graph>* g, Node** out,
                              Node* orign);
   void GetDummyWorkspaceTensorNode(std::unique_ptr<Graph>* g, Node** out,
-                             Node* orign);
+                                   Node* orign);
 };
 
-
 std::vector<MklLayoutRewritePass::ContextInfo> MklLayoutRewritePass::cinfo_;
 
 // We register Mkl rewrite pass for phase 1 in pre-placement group.
@@ -539,7 +536,6 @@ std::vector<MklLayoutRewritePass::ContextInfo> MklLayoutRewritePass::cinfo_;
 REGISTER_OPTIMIZATION(OptimizationPassRegistry::PRE_PLACEMENT, 1,
                       MklLayoutRewritePass);
 
-
 //////////////////////////////////////////////////////////////////////////
 //           Helper functions for creating new node
 //////////////////////////////////////////////////////////////////////////
@@ -578,13 +574,14 @@ void MklLayoutRewritePass::GetDummyMklTensorNode(std::unique_ptr<Graph>* g,
                            8);
   TensorShape dummy_shape({8});
   dummy_shape.AsProto(proto.mutable_tensor_shape());
-  TF_CHECK_OK(NodeBuilder((*g)->NewName("DMT"), "Const")
-                 .Attr("value", proto)
-                 .Attr("dtype", dt)
-                 .Device(orign->def().device())  // We place this node on same
-                                             // device as device of original
-                                             // node.
-                 .Finalize(&**g, out));
+  TF_CHECK_OK(
+      NodeBuilder((*g)->NewName("DMT"), "Const")
+          .Attr("value", proto)
+          .Attr("dtype", dt)
+          .Device(orign->def().device())  // We place this node on same
+                                          // device as device of original
+                                          // node.
+          .Finalize(&**g, out));
   (*out)->set_assigned_device_name(orign->assigned_device_name());
 }
 
@@ -653,29 +650,30 @@ void MklLayoutRewritePass::GetDummyWorkspaceTensorNode(
   TensorProto proto;
   proto.set_dtype(dt);
   float zero[1] = {0};
-  proto.set_tensor_content(const_cast<const void*>(
-      static_cast<void*>(&zero)), 4);
+  proto.set_tensor_content(const_cast<const void*>(static_cast<void*>(&zero)),
+                           4);
   TensorShape dummy_shape({1});
   dummy_shape.AsProto(proto.mutable_tensor_shape());
-  TF_CHECK_OK(NodeBuilder((*g)->NewName("DMT"), "Const")
-                 .Attr("value", proto)
-                 .Attr("dtype", dt)
-                 .Device(orign->def().device())  // We place this node on same
-                                             // device as device of original
-                                             // node.
-                 .Finalize(&**g, out));
+  TF_CHECK_OK(
+      NodeBuilder((*g)->NewName("DMT"), "Const")
+          .Attr("value", proto)
+          .Attr("dtype", dt)
+          .Device(orign->def().device())  // We place this node on same
+                                          // device as device of original
+                                          // node.
+          .Finalize(&**g, out));
   (*out)->set_assigned_device_name(orign->assigned_device_name());
 }
 
 void MklLayoutRewritePass::AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g,
-    Node* orign, NodeBuilder* nb) {
+                                                    Node* orign,
+                                                    NodeBuilder* nb) {
   bool workspace_edge_added = false;
   DataType T;
   TF_CHECK_OK(GetNodeAttr(orign->def(), "T", &T));
   for (auto ws : wsinfo_) {
     if (orign->type_string() == ws.fwdop &&
-        mkl_layer_registry::IsMklLayer(
-          GetMklOpName(orign->type_string()), T)) {
+        mkl_layer_registry::IsMklLayer(GetMklOpName(orign->type_string()), T)) {
       // If this op is a fwd op, then we need to check if there is an
       // edge from this node's fwdslot to bwdop's bwdslot. If there is
       // an edge, then we just add an attribute on this node for setting
@@ -701,8 +699,8 @@ void MklLayoutRewritePass::AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g,
         nb->Attr("workspace_enabled", false);
       }
     } else if (orign->type_string() == ws.bwdop &&
-          mkl_layer_registry::IsMklLayer(
-            GetMklOpName(orign->type_string()), T)) {
+               mkl_layer_registry::IsMklLayer(
+                   GetMklOpName(orign->type_string()), T)) {
       // If this op is a bwd op, then we need to add workspace edge and
       // its Mkl tensor edge between its corresponding fwd op and this
       // op. Corresponding fwd op is specified in 'fwdop' field of
@@ -721,7 +719,7 @@ void MklLayoutRewritePass::AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g,
           // Add workspace edge between fwd op and bwd op.
           nb->Input(e->src(), ws.wsfwdslot);
           // Add Mkl tensor edge for workspace edge between fwd op and bwd op.
-          nb->Input(e->src(), ws.wsfwdslot+1);
+          nb->Input(e->src(), ws.wsfwdslot + 1);
           // In terms of input ordering, we add these calls to add Input
           // here because workspace edge (and its Mkl tensor) is the last
           // edge in the fwdop and bwdop. So all inputs before workspace
@@ -740,17 +738,17 @@ void MklLayoutRewritePass::AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g,
       // workspace_enabled to false.
       if (!workspace_edge_added) {
         nb->Attr("workspace_enabled", false);
-        Node* dmt_ws = nullptr;  // Dummy tensor for workspace
+        Node* dmt_ws = nullptr;      // Dummy tensor for workspace
         Node* dmt_mkl_ws = nullptr;  // Dummy Mkl tensor for workspace
         GetDummyWorkspaceTensorNode(g, &dmt_ws, orign);
         GetDummyMklTensorNode(g, &dmt_mkl_ws, orign);
         CHECK_NOTNULL(dmt_ws);
         CHECK_NOTNULL(dmt_mkl_ws);
-        nb->Input(dmt_ws, 0);  // We add dummy tensor as workspace tensor.
+        nb->Input(dmt_ws, 0);      // We add dummy tensor as workspace tensor.
         nb->Input(dmt_mkl_ws, 0);  // We add dummy tensor as Mkl
-                             // tensor for workspace tensor.
+                                   // tensor for workspace tensor.
         VLOG(1) << "MklLayoutRewritePass: dummy workspace_enabled for "
-              << orign->type_string();
+                << orign->type_string();
       }
     } else {
       // If this node does not match any workspace info, then we do not
@@ -763,8 +761,7 @@ void MklLayoutRewritePass::AddWorkSpaceEdgeIfNeeded(std::unique_ptr<Graph>* g,
 // Op-specific functions to copy attributes from old node to new node
 //////////////////////////////////////////////////////////////////////////
 
-void MklLayoutRewritePass::CopyAttrsConv2D(const Node* orign,
-    NodeBuilder* nb) {
+void MklLayoutRewritePass::CopyAttrsConv2D(const Node* orign, NodeBuilder* nb) {
   DataType T;
   string data_format;
   string padding;
@@ -787,7 +784,7 @@ void MklLayoutRewritePass::CopyAttrsConv2D(const Node* orign,
 }
 
 void MklLayoutRewritePass::CopyAttrsBiasAddGrad(const Node* orign,
-    NodeBuilder* nb) {
+                                                NodeBuilder* nb) {
   DataType T;
   string data_format;
   std::vector<int32> strides;
@@ -804,7 +801,7 @@ void MklLayoutRewritePass::CopyAttrsBiasAddGrad(const Node* orign,
 }
 
 void MklLayoutRewritePass::CopyAttrsPooling(const Node* orign,
-    NodeBuilder* nb) {
+                                            NodeBuilder* nb) {
   DataType T;
   string data_format;
   string padding;
@@ -864,7 +861,7 @@ Node* MklLayoutRewritePass::CheckForNodeMerge(const Node* a) const {
     FillInputs(a, &a_control_edges, &a_in);
 
     // Get operand op of the operator
-    Node *b = nullptr;
+    Node* b = nullptr;
     b = a_in[mi->op].first;
     if (b == nullptr || (b->type_string() != mi->pred)) {
       // NOTE: Should the first check be assert?
@@ -887,8 +884,8 @@ Node* MklLayoutRewritePass::CheckForNodeMerge(const Node* a) const {
   return nullptr;
 }
 
-Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
-                                     Node* succ, Node* pred) {
+Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g, Node* succ,
+                                       Node* pred) {
   CHECK_NOTNULL(succ);
   CHECK_NOTNULL(pred);
 
@@ -906,15 +903,14 @@ Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
     TF_CHECK_OK(GetNodeAttr(pred->def(), "strides", &strides));
     TF_CHECK_OK(GetNodeAttr(pred->def(), "data_format", &data_format_pred));
     TF_CHECK_OK(GetNodeAttr(succ->def(), "data_format", &data_format_succ));
-    TF_CHECK_OK(GetNodeAttr(pred->def(), "use_cudnn_on_gpu",
-                            &use_cudnn_on_gnu));
+    TF_CHECK_OK(
+        GetNodeAttr(pred->def(), "use_cudnn_on_gpu", &use_cudnn_on_gnu));
     // We check to ensure that data formats of both succ and pred are same.
     // We expect them to be same, so we can enforce this as assert.
     // But assert can be too strict, so we enforce this as a check.
     // If the check fails, then we do not merge two nodes.
     // We also do same check for devices.
-    if (data_format_pred != data_format_succ ||
-        T_pred != T_succ ||
+    if (data_format_pred != data_format_succ || T_pred != T_succ ||
         pred->assigned_device_name() != succ->assigned_device_name() ||
         pred->def().device() != succ->def().device()) {
       return Status(error::Code::INVALID_ARGUMENT,
@@ -940,11 +936,11 @@ Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
                     "Will skip node merge optimization");
     }
 
-    for (const Edge *e : pred->out_edges()) {
+    for (const Edge* e : pred->out_edges()) {
       if (e->dst() != succ) {
         return Status(error::Code::INVALID_ARGUMENT,
-                    "Conv2D does not feed to BiasAdd."
-                    "Will skip node merge optimization");
+                      "Conv2D does not feed to BiasAdd."
+                      "Will skip node merge optimization");
       }
     }
 
@@ -955,8 +951,8 @@ Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
     // Get operand 1 of add_bias
     // BiasAdd must have 2 inputs: Conv, bias
     CHECK_EQ(succ->in_edges().size(), 2);
-    Node* oper3_mkl    = nullptr;  // Mkl tensor corresponding to oper3
-    int oper3_mkl_slot = 0;  // For dummy MKL tensor node, output slot is 0.
+    Node* oper3_mkl = nullptr;  // Mkl tensor corresponding to oper3
+    int oper3_mkl_slot = 0;     // For dummy MKL tensor node, output slot is 0.
     GetDummyMklTensorNode(g, &oper3_mkl, succ);  // Get dummy Mkl tensor node
     // as BiasAdd does not have Mkl tensor as input.
     CHECK_NOTNULL(oper3_mkl);
@@ -997,8 +993,8 @@ Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
     newn->set_assigned_device_name(pred->assigned_device_name());
 
     VLOG(1) << "MklLayoutRewritePass: Merged old node:" << pred->DebugString()
-            << ", and node: " << succ->DebugString() << ", into node:"
-            << newn->DebugString();
+            << ", and node: " << succ->DebugString()
+            << ", into node:" << newn->DebugString();
 
     (*g)->RemoveNode(succ);
     (*g)->RemoveNode(pred);
@@ -1015,8 +1011,8 @@ Status MklLayoutRewritePass::MergeNode(std::unique_ptr<Graph>* g,
 //           Helper functions for node rewrite
 //////////////////////////////////////////////////////////////////////////
 
-Status MklLayoutRewritePass::RewriteNode(
-    std::unique_ptr<Graph>* g, Node* orign, const RewriteInfo* ri) {
+Status MklLayoutRewritePass::RewriteNode(std::unique_ptr<Graph>* g, Node* orign,
+                                         const RewriteInfo* ri) {
   CHECK_NOTNULL(ri);
   CHECK_NOTNULL(orign);
 
@@ -1044,9 +1040,10 @@ Status MklLayoutRewritePass::RewriteNode(
       if (orig_data_format != ctx_data_format || orig_T != ctx_T ||
           orign->assigned_device_name() != fwdn->assigned_device_name() ||
           orign->def().device() != fwdn->def().device()) {
-        return Status(error::Code::INVALID_ARGUMENT,
-                    "data_format or T attribute or devices of BiasAddGrad and "
-                    "Conv2D do not match. Will skip node rewrite optimization");
+        return Status(
+            error::Code::INVALID_ARGUMENT,
+            "data_format or T attribute or devices of BiasAddGrad and "
+            "Conv2D do not match. Will skip node rewrite optimization");
       }
     }
   }
@@ -1077,7 +1074,7 @@ Status MklLayoutRewritePass::RewriteNode(
       ri->copyattrs(fwdn, &nb);
     } else {
       return Status(error::Code::UNIMPLEMENTED,
-                "Unimplemented case for node rewrite optimization.");
+                    "Unimplemented case for node rewrite optimization.");
     }
   } else {
     ri->copyattrs(const_cast<const Node*>(orign), &nb);
@@ -1106,8 +1103,8 @@ Status MklLayoutRewritePass::RewriteNode(
     if (e->src_output() < 0) {
       (*g)->AddEdge(newn, e->src_output(), e->dst(), e->dst_input());
     } else {
-      (*g)->AddEdge(newn, GetTensorDataIndex(e->src_output()),
-                  e->dst(), e->dst_input());
+      (*g)->AddEdge(newn, GetTensorDataIndex(e->src_output()), e->dst(),
+                    e->dst_input());
     }
   }
 
@@ -1123,8 +1120,7 @@ Status MklLayoutRewritePass::RewriteNode(
 }
 
 const MklLayoutRewritePass::ContextInfo*
-MklLayoutRewritePass::SearchMatchingContext(const Node* n,
-    const Node** fwdn) {
+MklLayoutRewritePass::SearchMatchingContext(const Node* n, const Node** fwdn) {
   CHECK_NOTNULL(n);
   CHECK_NOTNULL(fwdn);
   *fwdn = nullptr;
@@ -1144,8 +1140,8 @@ MklLayoutRewritePass::SearchMatchingContext(const Node* n,
     return nullptr;
   }
 
-  VLOG(1) << "MklLayoutRewritePass: Searching graph for: "
-          << n->type_string() << " in backwards.";
+  VLOG(1) << "MklLayoutRewritePass: Searching graph for: " << n->type_string()
+          << " in backwards.";
 
   // Now we will check for forward op name for context info in data
   // flow graph. Get the max hops we should search for the fwd node.
@@ -1164,13 +1160,12 @@ MklLayoutRewritePass::SearchMatchingContext(const Node* n,
     nqueue.pop();
 
     std::set<const Node*> visited_nodes;
-    curr_node  = curr_pair.first;
+    curr_node = curr_pair.first;
     curr_depth = curr_pair.second;
     CHECK_NOTNULL(curr_node);
 
     VLOG(1) << "MklLayoutRewritePass: Visiting node: "
-            << curr_node->type_string()
-            << " at depth: " << curr_depth
+            << curr_node->type_string() << " at depth: " << curr_depth
             << " for node: " << n->type_string();
 
     // If we find a match, we return immediately.
@@ -1186,9 +1181,9 @@ MklLayoutRewritePass::SearchMatchingContext(const Node* n,
     for (const Edge* e : curr_node->in_edges()) {
       // We do not visit already visited node.
       if (visited_nodes.find(e->src()) == visited_nodes.end()) {
-         // Depth of these nodes is 1 more than the depth of current node.
-         nqueue.push(std::make_pair(e->src(), curr_depth+1));
-         visited_nodes.insert(e->src());
+        // Depth of these nodes is 1 more than the depth of current node.
+        nqueue.push(std::make_pair(e->src(), curr_depth + 1));
+        visited_nodes.insert(e->src());
       }
     }
   } /* while */
@@ -1202,8 +1197,7 @@ bool MklLayoutRewritePass::ContextMatchRewrite(const Node* n) {
 }
 
 const MklLayoutRewritePass::RewriteInfo*
-MklLayoutRewritePass::CheckForNodeRewrite(
-    const Node *n) const {
+MklLayoutRewritePass::CheckForNodeRewrite(const Node* n) const {
   CHECK_NOTNULL(n);
 
   // First check if node along with its type is supported by MKL layer.
@@ -1238,8 +1232,7 @@ MklLayoutRewritePass::CheckForNodeRewrite(
 //              Run function for the pass
 ///////////////////////////////////////////////////////////////////////////////
 
-bool MklLayoutRewritePass::RunPass(
-    std::unique_ptr<Graph>* g) {
+bool MklLayoutRewritePass::RunPass(std::unique_ptr<Graph>* g) {
   bool result = false;
   CHECK_NOTNULL(g);
 
@@ -1265,22 +1258,21 @@ bool MklLayoutRewritePass::RunPass(
               << " layout optimization.";
 
       if (RewriteNode(g, n, ri) == Status::OK()) {
-          VLOG(1) << "MklLayoutRewritePass: rewrote node "
-                  << node_name << " with op " << op_name
-                  << " for Mkl layout optimization.";
-          result = true;
+        VLOG(1) << "MklLayoutRewritePass: rewrote node " << node_name
+                << " with op " << op_name << " for Mkl layout optimization.";
+        result = true;
       }
     } else if ((predn = CheckForNodeMerge(n)) != nullptr) {
       // Otherwise, we will check if the node is to be merged.
       string n1_name = n->name();
       string n2_name = predn->name();
 
-      VLOG(1) << "MklLayoutRewritePass: Scheduled nodes "
-              << n1_name << " and " << n2_name << " for merging";
+      VLOG(1) << "MklLayoutRewritePass: Scheduled nodes " << n1_name << " and "
+              << n2_name << " for merging";
 
       if (MergeNode(g, n, predn) == Status::OK()) {
-        VLOG(1) << "MklLayoutRewritePass: Merged nodes " << n1_name
-              << " and " << n2_name;
+        VLOG(1) << "MklLayoutRewritePass: Merged nodes " << n1_name << " and "
+                << n2_name;
         result = true;
       }
     }
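The dummy-tensor helpers reformatted above share one pattern: a small Const node is built with NodeBuilder and pinned to the same requested and assigned device as the original node, so the placer keeps the pair together. A minimal sketch of that pattern follows; it is illustration only (not part of the patch), and the helper name and the "DMT_example" prefix are hypothetical.

#include <string>
#include "tensorflow/core/framework/tensor.pb.h"
#include "tensorflow/core/framework/tensor_shape.h"
#include "tensorflow/core/framework/types.pb.h"
#include "tensorflow/core/graph/graph.h"
#include "tensorflow/core/graph/node_builder.h"
#include "tensorflow/core/lib/core/errors.h"

namespace tensorflow {

// Sketch: add a dummy uint8 Const of shape {8} that lives on the same device
// as 'orig', mirroring GetDummyMklTensorNode above.
Status AddDummyConstOnSameDevice(Graph* g, Node* orig, Node** out) {
  TensorProto proto;
  proto.set_dtype(DT_UINT8);
  proto.set_tensor_content(std::string(8, '\0'));  // eight zero bytes
  TensorShape({8}).AsProto(proto.mutable_tensor_shape());
  TF_RETURN_IF_ERROR(NodeBuilder(g->NewName("DMT_example"), "Const")
                         .Attr("value", proto)
                         .Attr("dtype", DT_UINT8)
                         .Device(orig->def().device())  // same requested device
                         .Finalize(g, out));
  (*out)->set_assigned_device_name(orig->assigned_device_name());
  return Status::OK();
}

}  // namespace tensorflow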
diff --git a/tensorflow/core/graph/mkl_layout_pass_test.cc b/tensorflow/core/graph/mkl_layout_pass_test.cc
index dd7ee45a705..142d60d6112 100644
--- a/tensorflow/core/graph/mkl_layout_pass_test.cc
+++ b/tensorflow/core/graph/mkl_layout_pass_test.cc
@@ -18,9 +18,9 @@ limitations under the License.
 #include "tensorflow/core/graph/mkl_layout_pass.h"
 #include "tensorflow/core/util/mkl_util.h"
 
-#include <vector>
-#include <string>
 #include <algorithm>
+#include <string>
+#include <vector>
 
 #include "tensorflow/core/framework/op.h"
 #include "tensorflow/core/framework/tensor.h"
@@ -112,8 +112,7 @@ class MklLayoutPassTest : public ::testing::Test {
 REGISTER_OP("Input").Output("o: float").SetIsStateful();
 REGISTER_OP("HalfInput").Output("o: half").SetIsStateful();
 REGISTER_OP("MklInput").Output("o: uint8").SetIsStateful();
-REGISTER_OP("MklInput2").Output("o: uint8")
-                        .Output("o1: uint8").SetIsStateful();
+REGISTER_OP("MklInput2").Output("o: uint8").Output("o1: uint8").SetIsStateful();
 
 /////////////////////////////////////////////////////////////////////
 //  Unit tests related to node merge optimization
@@ -240,7 +239,7 @@ TEST_F(MklLayoutPassTest, NodeMerge_Conv2DWithBias_Negative_NoAddBias) {
       " input: ['A', 'M', 'B', 'N']}");
   EXPECT_EQ(DoMklLayoutOptimizationPass(),
             "A(Input);B(Input);C(MklConv2D);M(MklInput);N(MklInput)|"
-             "A->C;B->C:2;M->C:1;N->C:3");
+            "A->C;B->C:2;M->C:1;N->C:3");
 }
 
 // MklConv2D output does not go to BiasAdd.
@@ -372,7 +371,7 @@ TEST_F(MklLayoutPassTest, NodeMerge_Conv2DBackprop_Negative_NoConv2D) {
       " input: ['D'] }");
   EXPECT_EQ(DoMklLayoutOptimizationPass(),
             "A(Input);B(Input);C(Add);D(Sub);E(BiasAddGrad)|"
-             "A->C;A->D:1;B->C:1;C->D;D->E");
+            "A->C;A->D:1;B->C:1;C->D;D->E");
 }
 
 // No Conv2D in the context for BiasAddGrad, but MatMul in context.
@@ -396,7 +395,7 @@ TEST_F(MklLayoutPassTest, NodeMerge_Conv2DBackprop_Negative_NoConv2D_MatMul) {
       " input: ['D'] }");
   EXPECT_EQ(DoMklLayoutOptimizationPass(),
             "A(Input);B(Input);C(MatMul);D(Sub);E(BiasAddGrad)|"
-             "A->C;A->D:1;B->C:1;C->D;D->E");
+            "A->C;A->D:1;B->C:1;C->D;D->E");
 }
 
 // Test set 3: MatMul..BiasAddGrad -> BiasAddGrad rewrite tests
@@ -419,7 +418,7 @@ TEST_F(MklLayoutPassTest, NodeMerge_MatMulBiasAddGrad_Positive) {
       " input: ['D'] }");
   EXPECT_EQ(DoMklLayoutOptimizationPass(),
             "A(Input);B(Input);C(MatMul);D(Sub);E(BiasAddGrad)|"
-             "A->C;A->D:1;B->C:1;C->D;D->E");
+            "A->C;A->D:1;B->C:1;C->D;D->E");
 }
 
 // No MatMul in the context for BiasAddGrad. No rewrite should happen.
@@ -440,7 +439,7 @@ TEST_F(MklLayoutPassTest, NodeMerge_MatMulBiasAddGrad_Negative_NoMatMul) {
       " input: ['D'] }");
   EXPECT_EQ(DoMklLayoutOptimizationPass(),
             "A(Input);B(Input);C(Add);D(Sub);E(BiasAddGrad)|"
-             "A->C;A->D:1;B->C:1;C->D;D->E");
+            "A->C;A->D:1;B->C:1;C->D;D->E");
 }
 
 /////////////////////////////////////////////////////////////////////
diff --git a/tensorflow/core/graph/mkl_tfconversion_pass.cc b/tensorflow/core/graph/mkl_tfconversion_pass.cc
index 2097d432be7..7c3836b3089 100644
--- a/tensorflow/core/graph/mkl_tfconversion_pass.cc
+++ b/tensorflow/core/graph/mkl_tfconversion_pass.cc
@@ -212,7 +212,7 @@ bool MklToTfConversionPass::RunPass(std::unique_ptr<Graph>* g) {
     // Check if src is Mkl-compliant, while dst is not Mkl-compliant.
 
     if (IsMklSupportedOp(src->type_string(), src_datatype) &&
-       !IsMklSupportedOp(dst->type_string(), dst_datatype)) {
+        !IsMklSupportedOp(dst->type_string(), dst_datatype)) {
       VLOG(1) << "MklToTfConversionPass: Scheduled nodes " << src->name()
               << " and " << dst->name() << " for inserting conversion nodes";
       candidate_edges.push_back(const_cast<Edge*>(e));
diff --git a/tensorflow/core/graph/mkl_tfconversion_pass_test.cc b/tensorflow/core/graph/mkl_tfconversion_pass_test.cc
index 4e211980d7f..7d9237f8454 100644
--- a/tensorflow/core/graph/mkl_tfconversion_pass_test.cc
+++ b/tensorflow/core/graph/mkl_tfconversion_pass_test.cc
@@ -17,9 +17,9 @@ limitations under the License.
 
 #include "tensorflow/core/graph/mkl_tfconversion_pass.h"
 
-#include <vector>
-#include <string>
 #include <algorithm>
+#include <string>
+#include <vector>
 
 #include "tensorflow/core/framework/op.h"
 #include "tensorflow/core/framework/tensor.h"
diff --git a/tensorflow/core/grappler/optimizers/BUILD b/tensorflow/core/grappler/optimizers/BUILD
index d09a3c4e304..bd96e2b33cc 100644
--- a/tensorflow/core/grappler/optimizers/BUILD
+++ b/tensorflow/core/grappler/optimizers/BUILD
@@ -17,6 +17,7 @@ filegroup(
     srcs = glob(
         [
             "*_optimizer.*",
+            "constant_folding.*",
             "model_pruner.*",
             "graph_rewriter.*",
         ],
@@ -117,6 +118,37 @@ cc_test(
     ],
 )
 
+cc_library(
+    name = "memory_optimizer",
+    srcs = ["memory_optimizer.cc"],
+    hdrs = [
+        "memory_optimizer.h",
+    ],
+    visibility = ["//visibility:public"],
+    deps = [
+        ":graph_optimizer",
+        ":graph_rewriter",
+        "//tensorflow/core:protos_all_cc",
+        "//tensorflow/core/grappler:grappler_item",
+        "//tensorflow/core/grappler:utils",
+    ],
+)
+
+cc_test(
+    name = "memory_optimizer_test",
+    srcs = ["memory_optimizer_test.cc"],
+    deps = [
+        ":memory_optimizer",
+        "//tensorflow/cc:cc_ops",
+        "//tensorflow/core:protos_all_cc",
+        "//tensorflow/core:test",
+        "//tensorflow/core:test_main",
+        "//tensorflow/core/grappler:grappler_item",
+        "//tensorflow/core/grappler:utils",
+        "//tensorflow/core/grappler/inputs:trivial_test_graph_input_yielder",
+    ],
+)
+
 cc_library(
     name = "layout_optimizer",
     srcs = ["layout_optimizer.cc"],
@@ -144,6 +176,7 @@ cc_library(
     ],
     visibility = ["//visibility:public"],
     deps = [
+        ":constant_folding",
         ":graph_optimizer",
         ":layout_optimizer",
         ":model_pruner",
diff --git a/tensorflow/core/grappler/optimizers/constant_folding.cc b/tensorflow/core/grappler/optimizers/constant_folding.cc
index 7cddedef2e2..8f79c55810b 100644
--- a/tensorflow/core/grappler/optimizers/constant_folding.cc
+++ b/tensorflow/core/grappler/optimizers/constant_folding.cc
@@ -72,8 +72,7 @@ class DeviceSimple : public DeviceBase {
                              Tensor* tensor) override {
     Tensor parsed(tensor_proto.dtype());
     if (!parsed.FromProto(cpu_allocator(), tensor_proto)) {
-      return errors::InvalidArgument("Cannot parse tensor from proto: ",
-                                     tensor_proto.DebugString());
+      return errors::InvalidArgument("Cannot parse tensor from tensor_proto.");
     }
     *tensor = parsed;
     return Status::OK();
diff --git a/tensorflow/core/grappler/optimizers/memory_optimizer.cc b/tensorflow/core/grappler/optimizers/memory_optimizer.cc
new file mode 100644
index 00000000000..24c6ab12efc
--- /dev/null
+++ b/tensorflow/core/grappler/optimizers/memory_optimizer.cc
@@ -0,0 +1,83 @@
+/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#include "tensorflow/core/grappler/optimizers/memory_optimizer.h"
+#include <unordered_set>
+#include "tensorflow/core/framework/attr_value.pb.h"
+#include "tensorflow/core/framework/node_def.pb.h"
+#include "tensorflow/core/grappler/grappler_item.h"
+#include "tensorflow/core/grappler/optimizers/graph_rewriter.h"
+#include "tensorflow/core/grappler/utils.h"
+
+namespace tensorflow {
+namespace grappler {
+
+std::pair<NodeDef*, NodeDef*> BuildSwapPair(NodeDef* node, int input_to_swap,
+                                            GraphDef* graph) {
+  string tensor_to_swap = strings::StrCat(node->name(), "_", input_to_swap);
+
+  // Force the tensor to be copied to cpu.
+  NodeDef* swap_out_node = graph->add_node();
+  swap_out_node->set_name(strings::StrCat("swap_out_", tensor_to_swap));
+  swap_out_node->set_op("Identity");
+  swap_out_node->set_device("/CPU");
+
+  // Force the tensor to be restored to the device.
+  NodeDef* swap_in_node = graph->add_node();
+  swap_in_node->set_name(strings::StrCat("swap_in_", tensor_to_swap));
+  swap_in_node->set_op("Identity");
+  *swap_in_node->add_input() = swap_out_node->name();
+
+  // Colocate the swap_in_ node with the node itself.
+  string coloc_group = strings::StrCat("loc:@", tensor_to_swap);
+  (*swap_in_node->mutable_attr())["_class"].mutable_list()->add_s(coloc_group);
+  (*node->mutable_attr())["_class"].mutable_list()->add_s(coloc_group);
+
+  return std::make_pair(swap_out_node, swap_in_node);
+}
+
+Status MemoryOptimizer::Optimize(Cluster* cluster, const GrapplerItem& item,
+                                 GraphDef* optimized_graph) {
+  *optimized_graph = item.graph;
+
+  for (auto& node : *optimized_graph->mutable_node()) {
+    if (node.attr().count("swap_to_host") == 0) {
+      continue;
+    }
+
+    // Swap all the tensors that are marked with the 'swap_to_host' attribute.
+    for (int input_id : node.attr().at("swap_to_host").list().i()) {
+      std::pair<NodeDef*, NodeDef*> swap_nodes =
+          BuildSwapPair(&node, input_id, optimized_graph);
+      *swap_nodes.first->add_input() = node.input(input_id);
+      *node.mutable_input(input_id) = swap_nodes.second->name();
+
+      // TODO(bsteiner): Make sure the tensor isn't swapped back in right away
+      // by adding a control dependency to delay the execution of the swap.
+      // string trigger;
+      //*swap_nodes.second->add_input() = strings::StrCat("^", trigger);
+    }
+  }
+
+  return Status::OK();
+}
+
+void MemoryOptimizer::Feedback(Cluster* cluster, const GrapplerItem& item,
+                               const GraphDef& optimized_graph, double result) {
+  // Nothing to do for MemoryOptimizer.
+}
+
+}  // end namespace grappler
+}  // end namespace tensorflow
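A note on how the new pass is driven: Optimize() only touches nodes that carry a "swap_to_host" list attribute naming the inputs to spill, so an earlier stage has to tag those nodes. The sketch below shows that tagging step in isolation (it mirrors what memory_optimizer_test.cc does further down); the helper name MarkForSwap is hypothetical and not part of the patch.

#include "tensorflow/core/framework/attr_value.pb.h"
#include "tensorflow/core/framework/node_def.pb.h"

namespace tensorflow {
namespace grappler {

// Sketch: request that input 'input_id' of 'node' be swapped to host memory.
// MemoryOptimizer::Optimize() will then splice in the swap_out_/swap_in_
// Identity pair produced by BuildSwapPair().
void MarkForSwap(NodeDef* node, int input_id) {
  AttrValue& val = (*node->mutable_attr())["swap_to_host"];
  val.mutable_list()->add_i(input_id);
}

}  // namespace grappler
}  // namespace tensorflow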
diff --git a/tensorflow/core/grappler/optimizers/memory_optimizer.h b/tensorflow/core/grappler/optimizers/memory_optimizer.h
new file mode 100644
index 00000000000..463067738b3
--- /dev/null
+++ b/tensorflow/core/grappler/optimizers/memory_optimizer.h
@@ -0,0 +1,42 @@
+/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#ifndef TENSORFLOW_GRAPPLER_OPTIMIZERS_MEMORY_OPTIMIZER_H_
+#define TENSORFLOW_GRAPPLER_OPTIMIZERS_MEMORY_OPTIMIZER_H_
+
+#include "tensorflow/core/grappler/optimizers/graph_optimizer.h"
+
+namespace tensorflow {
+namespace grappler {
+
+// Swap tensors in and out of device memory.
+class MemoryOptimizer : public GraphOptimizer {
+ public:
+  MemoryOptimizer() {}
+  ~MemoryOptimizer() override {}
+
+  string name() const override { return "memory_optimizer"; }
+
+  Status Optimize(Cluster* cluster, const GrapplerItem& item,
+                  GraphDef* pruned_graph) override;
+
+  void Feedback(Cluster* cluster, const GrapplerItem& item,
+                const GraphDef& pruned_graph, double result) override;
+};
+
+}  // end namespace grappler
+}  // end namespace tensorflow
+
+#endif  // TENSORFLOW_GRAPPLER_OPTIMIZERS_MEMORY_OPTIMIZER_H_
diff --git a/tensorflow/core/grappler/optimizers/memory_optimizer_test.cc b/tensorflow/core/grappler/optimizers/memory_optimizer_test.cc
new file mode 100644
index 00000000000..9defa72cffb
--- /dev/null
+++ b/tensorflow/core/grappler/optimizers/memory_optimizer_test.cc
@@ -0,0 +1,74 @@
+/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#include "tensorflow/core/grappler/optimizers/memory_optimizer.h"
+#include "tensorflow/cc/ops/standard_ops.h"
+#include "tensorflow/core/framework/node_def.pb.h"
+#include "tensorflow/core/grappler/grappler_item.h"
+#include "tensorflow/core/grappler/utils.h"
+#include "tensorflow/core/lib/core/status_test_util.h"
+#include "tensorflow/core/platform/test.h"
+
+namespace tensorflow {
+namespace grappler {
+namespace {
+
+class MemoryOptimizerTest : public ::testing::Test {};
+
+TEST_F(MemoryOptimizerTest, SimpleSwapping) {
+  // Build a simple graph with an op that's marked for swapping.
+  tensorflow::Scope s = tensorflow::Scope::NewRootScope();
+
+  Output a = ops::Const(s.WithOpName("a"), 0.0f, {10, 10});
+  Output b = ops::AddN(s.WithOpName("b"), {a});
+  Output c = ops::AddN(s.WithOpName("c"), {b});
+  Output d = ops::AddN(s.WithOpName("d"), {c});
+  Output e = ops::AddN(s.WithOpName("e"), {b, d});
+
+  GrapplerItem item;
+  TF_CHECK_OK(s.ToGraphDef(&item.graph));
+
+  EXPECT_EQ(5, item.graph.node_size());
+  EXPECT_EQ(NodeName(e.name()), item.graph.node(4).name());
+  AttrValue& val =
+      (*item.graph.mutable_node(4)->mutable_attr())["swap_to_host"];
+  val.mutable_list()->add_i(0);
+
+  MemoryOptimizer optimizer;
+  GraphDef output;
+  Status status = optimizer.Optimize(nullptr, item, &output);
+  TF_EXPECT_OK(status);
+
+  EXPECT_EQ(7, output.node_size());
+  const NodeDef& new_e = output.node(4);
+  EXPECT_EQ(NodeName(e.name()), new_e.name());
+
+  EXPECT_EQ(2, new_e.input_size());
+  EXPECT_EQ(NodeName(d.name()), new_e.input(1));
+  EXPECT_EQ("swap_in_e_0", new_e.input(0));
+
+  const NodeDef& swap_out = output.node(5);
+  EXPECT_EQ("swap_out_e_0", swap_out.name());
+
+  const NodeDef& swap_in = output.node(6);
+  EXPECT_EQ("swap_in_e_0", swap_in.name());
+
+  EXPECT_EQ(NodeName(b.name()), swap_out.input(0));
+  EXPECT_EQ(NodeName(swap_out.name()), swap_in.input(0));
+}
+
+}  // namespace
+}  // namespace grappler
+}  // namespace tensorflow
diff --git a/tensorflow/core/grappler/optimizers/meta_optimizer.cc b/tensorflow/core/grappler/optimizers/meta_optimizer.cc
index 44a1f5bab92..67ffa7a4b6e 100644
--- a/tensorflow/core/grappler/optimizers/meta_optimizer.cc
+++ b/tensorflow/core/grappler/optimizers/meta_optimizer.cc
@@ -14,6 +14,9 @@ limitations under the License.
 ==============================================================================*/
 
 #include "tensorflow/core/grappler/optimizers/meta_optimizer.h"
+#include "tensorflow/core/framework/versions.pb.h"
+#include "tensorflow/core/grappler/optimizers/constant_folding.h"
+#include "tensorflow/core/grappler/optimizers/graph_optimizer.h"
 #include "tensorflow/core/grappler/optimizers/layout_optimizer.h"
 #include "tensorflow/core/grappler/optimizers/model_pruner.h"
 #include "tensorflow/core/lib/core/status.h"
@@ -21,25 +24,67 @@ limitations under the License.
 namespace tensorflow {
 namespace grappler {
 
+std::unique_ptr<GraphOptimizer> MetaOptimizer::NewOptimizer(
+    const string& optimizer) {
+  VLOG(1) << "Adding graph optimization pass: " << optimizer;
+  std::unique_ptr<GraphOptimizer> graph_optimizer;
+  if (optimizer == "pruning") {
+    graph_optimizer.reset(new ModelPruner());
+  }
+  if (optimizer == "constfold") {
+    graph_optimizer.reset(new ConstantFolding());
+  }
+  if (optimizer == "layout") {
+    graph_optimizer.reset(new LayoutOptimizer());
+  }
+  return graph_optimizer;
+}
+
 Status MetaOptimizer::Optimize(Cluster* cluster, const GrapplerItem& item,
                                GraphDef* optimized_graph) {
-  bool already_optimized = false;
-  if (!cfg_.disable_model_pruning()) {
-    already_optimized = true;
-    ModelPruner pruner;
-    TF_RETURN_IF_ERROR(pruner.Optimize(nullptr, item, optimized_graph));
+  std::vector<std::unique_ptr<GraphOptimizer>> optimizers;
+  if (cfg_.optimizers().empty()) {
+    if (!cfg_.disable_model_pruning()) {
+      optimizers.push_back(std::unique_ptr<GraphOptimizer>(new ModelPruner()));
+    }
+    if (cfg_.constant_folding()) {
+      optimizers.push_back(
+          std::unique_ptr<GraphOptimizer>(new ConstantFolding()));
+    }
+    if (cfg_.optimize_tensor_layout()) {
+      optimizers.push_back(
+          std::unique_ptr<GraphOptimizer>(new LayoutOptimizer()));
+    }
+  } else {
+    std::set<string> available_optimizers = {"pruning", "constfold", "layout"};
+    for (const auto& optimizer : cfg_.optimizers()) {
+      if (available_optimizers.find(optimizer) != available_optimizers.end()) {
+        optimizers.push_back(NewOptimizer(optimizer));
+      }
+    }
   }
-  if (cfg_.optimize_tensor_layout()) {
-    LayoutOptimizer layout_optimizer;
+
+  if (optimizers.empty()) {
+    *optimized_graph = item.graph;
+    return Status::OK();
+  }
+
+  bool already_optimized = false;
+  for (const auto& optimizer : optimizers) {
     if (!already_optimized) {
-      return layout_optimizer.Optimize(nullptr, item, optimized_graph);
+      TF_RETURN_IF_ERROR(optimizer->Optimize(nullptr, item, optimized_graph));
+      already_optimized = true;
     } else {
       GrapplerItem optimized_item = item;
       optimized_item.graph = *optimized_graph;
-      return layout_optimizer.Optimize(nullptr, optimized_item,
-                                       optimized_graph);
+      TF_RETURN_IF_ERROR(
+          optimizer->Optimize(nullptr, optimized_item, optimized_graph));
     }
   }
+
+  // Copy the graph version.
+  *optimized_graph->mutable_versions() = item.graph.versions();
+
   return Status::OK();
 }
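With the rewrite above, the pass list can be selected two ways: implicitly through the disable_model_pruning / constant_folding / optimize_tensor_layout flags, or explicitly through the ordered optimizers list, which takes precedence when non-empty. A minimal sketch of the explicit form follows; it is not part of the patch, and the MetaOptimizer constructor taking a RewriterConfig is an assumption based on meta_optimizer.h.

#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/grappler/grappler_item.h"
#include "tensorflow/core/grappler/optimizers/meta_optimizer.h"
#include "tensorflow/core/protobuf/rewriter_config.pb.h"

namespace tensorflow {
namespace grappler {

// Sketch: run the passes in an explicit, user-chosen order.
Status RunOrderedPasses(const GrapplerItem& item, GraphDef* optimized_graph) {
  RewriterConfig cfg;
  cfg.add_optimizers("pruning");    // ModelPruner
  cfg.add_optimizers("constfold");  // the new ConstantFolding pass
  cfg.add_optimizers("layout");     // LayoutOptimizer
  MetaOptimizer meta(cfg);  // assumed constructor; see meta_optimizer.h
  return meta.Optimize(/*cluster=*/nullptr, item, optimized_graph);
}

}  // namespace grappler
}  // namespace tensorflow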
 
diff --git a/tensorflow/core/grappler/optimizers/meta_optimizer.h b/tensorflow/core/grappler/optimizers/meta_optimizer.h
index d7ff03f5907..9def2cd711f 100644
--- a/tensorflow/core/grappler/optimizers/meta_optimizer.h
+++ b/tensorflow/core/grappler/optimizers/meta_optimizer.h
@@ -39,6 +39,7 @@ class MetaOptimizer : public GraphOptimizer {
                 const GraphDef& optimized_graph, double result) override;
 
  private:
+  std::unique_ptr<GraphOptimizer> NewOptimizer(const string& optimizer);
   RewriterConfig cfg_;
 };
 
diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc
index a59277f18e2..20394cad432 100644
--- a/tensorflow/core/kernels/conv_grad_filter_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc
@@ -99,25 +99,26 @@ struct LaunchXsmmBackwardFilter {
                   typename TTypes<T, 4>::Tensor kernel,
                   typename TTypes<T, 4>::ConstTensor output_backward,
                   int input_rows, int input_cols, int row_stride,
-                  int col_stride, int pad_h, int pad_w, TensorFormat data_format) const {
+                  int col_stride, int pad_h, int pad_w,
+                  TensorFormat data_format) const {
     return false;
   }
 };
- 
+
 template <>
 struct LaunchXsmmBackwardFilter<CPUDevice, float> {
   bool operator()(OpKernelContext* context, const CPUDevice& d,
                   typename TTypes<float, 4>::ConstTensor input,
                   typename TTypes<float, 4>::Tensor filter,
-                  typename TTypes<float, 4>::ConstTensor output,
-                  int input_rows, int input_cols, int row_stride,
-                  int col_stride,int pad_h, int pad_w,  TensorFormat data_format) const {
+                  typename TTypes<float, 4>::ConstTensor output, int input_rows,
+                  int input_cols, int row_stride, int col_stride, int pad_h,
+                  int pad_w, TensorFormat data_format) const {
     auto batch = input.dimension(0);
     auto in_depth = input.dimension(3);
     auto out_depth = output.dimension(3);
     auto filter_rows = filter.dimension(0);
     auto filter_cols = filter.dimension(1);
- 
+
     auto num_threads =
         context->device()->tensorflow_cpu_worker_threads()->num_threads;
     // See libxsmm_dnn.h for this struct definition.
@@ -144,13 +145,11 @@ struct LaunchXsmmBackwardFilter<CPUDevice, float> {
     desc.fuse_ops = LIBXSMM_DNN_CONV_FUSE_NONE;
     desc.options = LIBXSMM_DNN_CONV_OPTION_NONE;
     desc.datatype = LIBXSMM_DNN_DATATYPE_F32;
- 
- 
+
     if (!CanUseXsmmConv2D(desc, data_format)) {
       return false;
     }
- 
- 
+
     auto input_ptr = input.data();
     auto filter_ptr = filter.data();
     auto output_ptr = output.data();
@@ -161,8 +160,6 @@ struct LaunchXsmmBackwardFilter<CPUDevice, float> {
 };
 #endif
 
-
-
 template <typename Device, class T>
 class Conv2DFastBackpropFilterOp : public OpKernel {
  public:
@@ -210,8 +207,7 @@ class Conv2DFastBackpropFilterOp : public OpKernel {
     OP_REQUIRES_OK(context,
                    context->allocate_output(0, filter_shape, &filter_backprop));
 
-    #if defined TENSORFLOW_USE_LIBXSMM && defined TENSORFLOW_USE_LIBXSMM_BACKWARD
- 
+#if defined TENSORFLOW_USE_LIBXSMM && defined TENSORFLOW_USE_LIBXSMM_BACKWARD
     int64 pad_top, pad_bottom;
     int64 pad_left, pad_right;
     OP_REQUIRES_OK(
@@ -226,22 +222,20 @@ class Conv2DFastBackpropFilterOp : public OpKernel {
             dims.spatial_dims[1].input_size, dims.spatial_dims[1].filter_size,
             dims.spatial_dims[1].stride, padding_,
             &dims.spatial_dims[1].output_size, &pad_left, &pad_right));
- 
-    if ( pad_left == pad_right && pad_top == pad_bottom ) {
- 
+
+    if (pad_left == pad_right && pad_top == pad_bottom) {
       if (LaunchXsmmBackwardFilter<Device, T>()(
-            context, context->eigen_device<Device>(),
-            input.tensor<T, 4>(),filter_backprop->tensor<T, 4>(),
-            out_backprop.tensor<T, 4>(),  dims.spatial_dims[0].input_size,
-            dims.spatial_dims[1].input_size, (int)dims.spatial_dims[0].stride,
-            (int)dims.spatial_dims[1].stride,(int)pad_top, (int)pad_left, data_format_)) {
-      return;
+              context, context->eigen_device<Device>(), input.tensor<T, 4>(),
+              filter_backprop->tensor<T, 4>(), out_backprop.tensor<T, 4>(),
+              dims.spatial_dims[0].input_size, dims.spatial_dims[1].input_size,
+              static_cast<int>(dims.spatial_dims[0].stride),
+              static_cast<int>(dims.spatial_dims[1].stride),
+              static_cast<int>(pad_top), static_cast<int>(pad_left),
+              data_format_)) {
+        return;
       }
     }
-    #endif
-
-
-
+#endif
 
     functor::SpatialConvolutionBackwardKernel<Device, T>()(
         context->eigen_device<Device>(), filter_backprop->tensor<T, 4>(),
@@ -321,19 +315,20 @@ class Conv2DCustomBackpropFilterOp : public OpKernel {
             dims.spatial_dims[1].input_size, dims.spatial_dims[1].filter_size,
             dims.spatial_dims[1].stride, padding_,
             &dims.spatial_dims[1].output_size, &pad_left, &pad_right));
-  #if defined TENSORFLOW_USE_LIBXSMM && defined TENSORFLOW_USE_LIBXSMM_BACKWARD
-    if ( pad_left == pad_right && pad_top == pad_bottom ) {
- 
+#if defined TENSORFLOW_USE_LIBXSMM && defined TENSORFLOW_USE_LIBXSMM_BACKWARD
+    if (pad_left == pad_right && pad_top == pad_bottom) {
       if (LaunchXsmmBackwardFilter<Device, T>()(
-            context, context->eigen_device<Device>(),
-            input.tensor<T, 4>(),filter_backprop->tensor<T, 4>(),
-            out_backprop.tensor<T, 4>(),  dims.spatial_dims[0].input_size,
-            dims.spatial_dims[1].input_size, (int)dims.spatial_dims[0].stride,
-            (int)dims.spatial_dims[1].stride,(int)pad_top, (int)pad_left, data_format_)) {
-      return;
+              context, context->eigen_device<Device>(), input.tensor<T, 4>(),
+              filter_backprop->tensor<T, 4>(), out_backprop.tensor<T, 4>(),
+              dims.spatial_dims[0].input_size, dims.spatial_dims[1].input_size,
+              static_cast<int>(dims.spatial_dims[0].stride),
+              static_cast<int>(dims.spatial_dims[1].stride),
+              static_cast<int>(pad_top), static_cast<int>(pad_left),
+              data_format_)) {
+        return;
       }
     }
-  #endif
+#endif
 
     // The total dimension size of each kernel.
     const int filter_total_size = dims.spatial_dims[0].filter_size *
diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc
index 7e0912b4dbc..9a50431a2fa 100644
--- a/tensorflow/core/kernels/conv_grad_input_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_input_ops.cc
@@ -131,7 +131,8 @@ struct LaunchXsmmBackwardInputConvolution {
                   typename TTypes<T, 4>::ConstTensor kernel,
                   typename TTypes<T, 4>::ConstTensor output_backward,
                   int input_rows, int input_cols, int row_stride,
-                  int col_stride, int pad_h, int pad_w, TensorFormat data_format) const {
+                  int col_stride, int pad_h, int pad_w,
+                  TensorFormat data_format) const {
     return false;
   }
 };
@@ -143,7 +144,8 @@ struct LaunchXsmmBackwardInputConvolution<CPUDevice, float> {
                   typename TTypes<float, 4>::ConstTensor kernel,
                   typename TTypes<float, 4>::ConstTensor output_backward,
                   int input_rows, int input_cols, int row_stride,
-                  int col_stride, int pad_h, int pad_w, TensorFormat data_format) const {
+                  int col_stride, int pad_h, int pad_w,
+                  TensorFormat data_format) const {
     auto batch = input_backward.dimension(0);
     auto in_depth = input_backward.dimension(3);
     auto out_depth = output_backward.dimension(3);
@@ -251,13 +253,16 @@ class Conv2DFastBackpropInputOp : public OpKernel {
             dims.spatial_dims[1].stride, padding_,
             &dims.spatial_dims[1].output_size, &pad_left, &pad_right));
 
-    if ( pad_left == pad_right && pad_top == pad_bottom ) {
+    if (pad_left == pad_right && pad_top == pad_bottom) {
       if (LaunchXsmmBackwardInputConvolution<Device, T>()(
-            context, context->eigen_device<Device>(),
-            in_backprop->tensor<T, 4>(), filter.tensor<T, 4>(),
-            out_backprop.tensor<T, 4>(), dims.spatial_dims[0].input_size,
-            dims.spatial_dims[1].input_size, (int)dims.spatial_dims[0].stride,
-            (int)dims.spatial_dims[1].stride, (int)pad_top, (int)pad_left, data_format_)) {
+              context, context->eigen_device<Device>(),
+              in_backprop->tensor<T, 4>(), filter.tensor<T, 4>(),
+              out_backprop.tensor<T, 4>(), dims.spatial_dims[0].input_size,
+              dims.spatial_dims[1].input_size,
+              static_cast<int>(dims.spatial_dims[0].stride),
+              static_cast<int>(dims.spatial_dims[1].stride),
+              static_cast<int>(pad_top), static_cast<int>(pad_left),
+              data_format_)) {
         return;
       }
     }
@@ -326,8 +331,8 @@ class Conv2DCustomBackpropInputOp : public OpKernel {
     OP_REQUIRES_OK(context,
                    context->allocate_output(0, input_shape, &in_backprop));
 
-    // TODO(andydavis) Consider moving code shared with
-    // Conv2DCustomBackpropFilterOp into a shared helper function.
+// TODO(andydavis) Consider moving code shared with
+// Conv2DCustomBackpropFilterOp into a shared helper function.
 #if defined TENSORFLOW_USE_LIBXSMM && defined TENSORFLOW_USE_LIBXSMM_BACKWARD
     int64 pad_top, pad_bottom;
     int64 pad_left, pad_right;
@@ -344,13 +349,16 @@ class Conv2DCustomBackpropInputOp : public OpKernel {
             dims.spatial_dims[1].stride, padding_,
             &dims.spatial_dims[1].output_size, &pad_left, &pad_right));
 
-    if ( pad_left == pad_right && pad_top == pad_bottom ) {
+    if (pad_left == pad_right && pad_top == pad_bottom) {
       if (LaunchXsmmBackwardInputConvolution<Device, T>()(
-            context, context->eigen_device<Device>(),
-            in_backprop->tensor<T, 4>(), filter.tensor<T, 4>(),
-            out_backprop.tensor<T, 4>(), dims.spatial_dims[0].input_size,
-            dims.spatial_dims[1].input_size, (int)dims.spatial_dims[0].stride,
-            (int)dims.spatial_dims[1].stride, (int)pad_top, (int)pad_left, data_format_)) {
+              context, context->eigen_device<Device>(),
+              in_backprop->tensor<T, 4>(), filter.tensor<T, 4>(),
+              out_backprop.tensor<T, 4>(), dims.spatial_dims[0].input_size,
+              dims.spatial_dims[1].input_size,
+              static_cast<int>(dims.spatial_dims[0].stride),
+              static_cast<int>(dims.spatial_dims[1].stride),
+              static_cast<int>(pad_top), static_cast<int>(pad_left),
+              data_format_)) {
         return;
       }
     }
diff --git a/tensorflow/core/kernels/cudnn_pooling_gpu.cc b/tensorflow/core/kernels/cudnn_pooling_gpu.cc
index 93efc93957b..5939ecdf62b 100644
--- a/tensorflow/core/kernels/cudnn_pooling_gpu.cc
+++ b/tensorflow/core/kernels/cudnn_pooling_gpu.cc
@@ -243,11 +243,10 @@ void DnnPooling3dGradOp<T>::Compute(
   }
 }
 
-#define DEFINE_DNN_OPS(T)                       \
-  template class DnnPooling3dOp<T>;               \
+#define DEFINE_DNN_OPS(T)           \
+  template class DnnPooling3dOp<T>; \
   template class DnnPooling3dGradOp<T>;
-TF_CALL_float(DEFINE_DNN_OPS)
-TF_CALL_half(DEFINE_DNN_OPS)
+TF_CALL_float(DEFINE_DNN_OPS) TF_CALL_half(DEFINE_DNN_OPS)
 #undef DEFINE_DNN_OPS
 
 #endif  // GOOGLE_CUDA
diff --git a/tensorflow/core/kernels/maxpooling_op.cc b/tensorflow/core/kernels/maxpooling_op.cc
index 3a0f19ffb0c..eb590280c9e 100644
--- a/tensorflow/core/kernels/maxpooling_op.cc
+++ b/tensorflow/core/kernels/maxpooling_op.cc
@@ -295,8 +295,8 @@ static void MaxPoolingBackwardCustomKernel(
       params.tensor_in_rows, params.tensor_in_cols, params.depth,
       params.out_height, params.out_width, params.window_rows,
       params.window_cols, params.row_stride, params.col_stride, params.pad_rows,
-      params.pad_cols, out_backprop.flat<T>().data(),
-      output->flat<T>().data(), context->eigen_device<Eigen::GpuDevice>());
+      params.pad_cols, out_backprop.flat<T>().data(), output->flat<T>().data(),
+      context->eigen_device<Eigen::GpuDevice>());
 }
 
 template <class T>
@@ -474,8 +474,7 @@ class MaxPoolingGradGradOp : public OpKernel {
     //    tensor_out_as_matrix with the corresponding values in
     //    top_diff_as_matrix.
     auto shard = [&params, &in_mat, &out_mat, &top_diff_mat, &bottom_diff_mat](
-        int64 start, int64 limit) {
-
+                     int64 start, int64 limit) {
       const int32 depth = params.depth;
       const int32 in_rows = params.tensor_in_rows;
       const int32 in_cols = params.tensor_in_cols;
@@ -1010,38 +1009,34 @@ TF_CALL_GPU_NUMBER_TYPES(REGISTER_GPU_MAX_POOL_KERNELS);
 // default Eigen implementation so we are using the custom kernel as the
 // default. However, you can explicitly invoke the eigen version using
 // kernel_label_map.
-#define REGISTER_GPU_ONLY_POOL_KERNELS(T)                        \
-  REGISTER_KERNEL_BUILDER(                                       \
-      Name("MaxPool")                                            \
-          .Device(DEVICE_GPU)                                    \
-          .TypeConstraint<T>("T")                                \
-          .Label("eigen_tensor"),                                \
-      MaxPoolingOp<GPUDevice, T>);                               \
-  REGISTER_KERNEL_BUILDER(                                       \
-      Name("MaxPool").Device(DEVICE_GPU).TypeConstraint<T>("T"), \
-      MaxPoolingNoMaskOp<GPUDevice, T>);                         \
-  REGISTER_KERNEL_BUILDER(                                       \
-      Name("MaxPoolWithArgmax")                                  \
-          .Device(DEVICE_GPU)                                    \
-          .TypeConstraint<int64>("Targmax")                      \
-          .TypeConstraint<T>("T"),                               \
-      MaxPoolingWithArgmaxOp<GPUDevice, T>);                     \
-  REGISTER_KERNEL_BUILDER(                                       \
-      Name("MaxPoolGradWithArgmax")                              \
-          .Device(DEVICE_GPU)                                    \
-          .TypeConstraint<T>("T")                                \
-          .TypeConstraint<int64>("Targmax"),                     \
-      MaxPoolingGradWithArgmaxOp<GPUDevice, T>);                 \
-  REGISTER_KERNEL_BUILDER(                                       \
-      Name("MaxPoolGradGradWithArgmax")                          \
-          .Device(DEVICE_GPU)                                    \
-          .TypeConstraint<T>("T")                                \
-          .TypeConstraint<int64>("Targmax"),                     \
-      MaxPoolingGradGradWithArgmaxOp<GPUDevice, T>);
+#define REGISTER_GPU_ONLY_POOL_KERNELS(T)                            \
+  REGISTER_KERNEL_BUILDER(Name("MaxPool")                            \
+                              .Device(DEVICE_GPU)                    \
+                              .TypeConstraint<T>("T")                \
+                              .Label("eigen_tensor"),                \
+                          MaxPoolingOp<GPUDevice, T>);               \
+  REGISTER_KERNEL_BUILDER(                                           \
+      Name("MaxPool").Device(DEVICE_GPU).TypeConstraint<T>("T"),     \
+      MaxPoolingNoMaskOp<GPUDevice, T>);                             \
+  REGISTER_KERNEL_BUILDER(Name("MaxPoolWithArgmax")                  \
+                              .Device(DEVICE_GPU)                    \
+                              .TypeConstraint<int64>("Targmax")      \
+                              .TypeConstraint<T>("T"),               \
+                          MaxPoolingWithArgmaxOp<GPUDevice, T>);     \
+  REGISTER_KERNEL_BUILDER(Name("MaxPoolGradWithArgmax")              \
+                              .Device(DEVICE_GPU)                    \
+                              .TypeConstraint<T>("T")                \
+                              .TypeConstraint<int64>("Targmax"),     \
+                          MaxPoolingGradWithArgmaxOp<GPUDevice, T>); \
+  REGISTER_KERNEL_BUILDER(Name("MaxPoolGradGradWithArgmax")          \
+                              .Device(DEVICE_GPU)                    \
+                              .TypeConstraint<T>("T")                \
+                              .TypeConstraint<int64>("Targmax"),     \
+                          MaxPoolingGradGradWithArgmaxOp<GPUDevice, T>);
 TF_CALL_GPU_NUMBER_TYPES(REGISTER_GPU_ONLY_POOL_KERNELS);
 #undef REGISTER_GPU_ONLY_POOL_KERNELS
 
-#endif // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA
 
 #undef REGISTER_MAX_POOL_KERNELS
 
diff --git a/tensorflow/core/kernels/maxpooling_op_gpu.cu.cc b/tensorflow/core/kernels/maxpooling_op_gpu.cu.cc
index 5462c6401da..32b210ecb7f 100644
--- a/tensorflow/core/kernels/maxpooling_op_gpu.cu.cc
+++ b/tensorflow/core/kernels/maxpooling_op_gpu.cu.cc
@@ -333,11 +333,11 @@ namespace functor {
 
 template <typename T>
 bool MaxPoolForwardWithOptionalArgmax<T>::operator()(
-    const T* bottom_data, const int batch, const int height,
-    const int width, const int channels, const int pooled_height,
-    const int pooled_width, const int kernel_h, const int kernel_w,
-    const int stride_h, const int stride_w, const int pad_t, const int pad_l,
-    T* top_data, int64* mask, const Eigen::GpuDevice& d) {
+    const T* bottom_data, const int batch, const int height, const int width,
+    const int channels, const int pooled_height, const int pooled_width,
+    const int kernel_h, const int kernel_w, const int stride_h,
+    const int stride_w, const int pad_t, const int pad_l, T* top_data,
+    int64* mask, const Eigen::GpuDevice& d) {
   const int kThreadsPerBlock = 1024;
   const int output_size = batch * channels * pooled_height * pooled_width;
 
@@ -351,14 +351,11 @@ bool MaxPoolForwardWithOptionalArgmax<T>::operator()(
 
 template <typename T>
 bool MaxPoolBackwardNoMask<T>::operator()(
-    const T* bottom_data, const int batch,
-    const int height, const int width,
-    const int channels, const int pooled_height,
-    const int pooled_width, const int kernel_h,
-    const int kernel_w, const int stride_h,
-    const int stride_w, const int pad_t, const int pad_l,
-    const T* top_diff, T* bottom_diff,
-    const Eigen::GpuDevice& d) {
+    const T* bottom_data, const int batch, const int height, const int width,
+    const int channels, const int pooled_height, const int pooled_width,
+    const int kernel_h, const int kernel_w, const int stride_h,
+    const int stride_w, const int pad_t, const int pad_l, const T* top_diff,
+    T* bottom_diff, const Eigen::GpuDevice& d) {
   const int kThreadsPerBlock = 1024;
   const int bottom_size = batch * channels * height * width;
   const int top_size = batch * channels * pooled_height * pooled_width;
@@ -377,9 +374,8 @@ bool MaxPoolBackwardNoMask<T>::operator()(
 
 template <typename T>
 bool MaxPoolBackwardWithArgmax<T>::operator()(
-    const int output_size, const int input_size,
-    const T* top_diff, const int64* mask,
-    const int top_offset, const int bottom_offset,
+    const int output_size, const int input_size, const T* top_diff,
+    const int64* mask, const int top_offset, const int bottom_offset,
     T* bottom_diff, const Eigen::GpuDevice& d) {
   const int kThreadsPerBlock = 1024;
   SetZero<<<(input_size + kThreadsPerBlock - 1) / kThreadsPerBlock,
diff --git a/tensorflow/core/kernels/maxpooling_op_gpu.h b/tensorflow/core/kernels/maxpooling_op_gpu.h
index 99e2b73d0c9..d2029f5719a 100644
--- a/tensorflow/core/kernels/maxpooling_op_gpu.h
+++ b/tensorflow/core/kernels/maxpooling_op_gpu.h
@@ -36,38 +36,36 @@ template <typename T>
 struct MaxPoolForwardWithOptionalArgmax {
   bool operator()(const T* bottom_data, const int batch, const int height,
                   const int width, const int channels, const int pooled_height,
-                  const int pooled_width, const int kernel_h, const int kernel_w,
-                  const int stride_h, const int stride_w, const int pad_t, const int pad_l,
-                  T* top_data, int64* mask, const Eigen::GpuDevice& d);
+                  const int pooled_width, const int kernel_h,
+                  const int kernel_w, const int stride_h, const int stride_w,
+                  const int pad_t, const int pad_l, T* top_data, int64* mask,
+                  const Eigen::GpuDevice& d);
 };
 
-
 template <typename T>
 struct MaxPoolBackwardWithArgmax {
   bool operator()(const int output_size, const int input_size,
-                  const T* top_diff, const int64* mask,
-                  const int top_offset, const int bottom_offset,
-                  T* bottom_diff, const Eigen::GpuDevice& d);
+                  const T* top_diff, const int64* mask, const int top_offset,
+                  const int bottom_offset, T* bottom_diff,
+                  const Eigen::GpuDevice& d);
 };
 
 template <typename T>
 struct MaxPoolBackwardNoMask {
-  bool operator()(const T* bottom_data, const int batch,
-                  const int height, const int width,
-                  const int channels, const int pooled_height,
+  bool operator()(const T* bottom_data, const int batch, const int height,
+                  const int width, const int channels, const int pooled_height,
                   const int pooled_width, const int kernel_h,
-                  const int kernel_w, const int stride_h,
-                  const int stride_w, const int pad_t, const int pad_l,
-                  const T* top_diff, T* bottom_diff,
-                  const Eigen::GpuDevice& d);
+                  const int kernel_w, const int stride_h, const int stride_w,
+                  const int pad_t, const int pad_l, const T* top_diff,
+                  T* bottom_diff, const Eigen::GpuDevice& d);
 };
 
 template <typename T>
 struct MaxPoolGradBackwardWithArgmax {
   bool operator()(const int output_size, const int input_size,
-                  const T* top_diff, const int64* mask,
-                  const int top_offset, const int bottom_offset,
-                  T* bottom_diff, const Eigen::GpuDevice& d);
+                  const T* top_diff, const int64* mask, const int top_offset,
+                  const int bottom_offset, T* bottom_diff,
+                  const Eigen::GpuDevice& d);
 };
 
 template <typename T>
@@ -75,12 +73,10 @@ struct MaxPoolGradBackwardNoMask {
   bool operator()(TensorFormat data_format, const T* bottom_data,
                   const T* output_data, const int batch,
                   const int pooled_height, const int pooled_width,
-                  const int channels, const int height,
-                  const int width, const int kernel_h,
-                  const int kernel_w, const int stride_h,
+                  const int channels, const int height, const int width,
+                  const int kernel_h, const int kernel_w, const int stride_h,
                   const int stride_w, const int pad_t, const int pad_l,
-                  const T* top_diff, T* bottom_diff,
-                  const Eigen::GpuDevice& d);
+                  const T* top_diff, T* bottom_diff, const Eigen::GpuDevice& d);
 };
 
 }  // namespace functor
diff --git a/tensorflow/core/kernels/mkl_avgpooling_op.cc b/tensorflow/core/kernels/mkl_avgpooling_op.cc
index f1f6a9ce53a..71918fe269c 100644
--- a/tensorflow/core/kernels/mkl_avgpooling_op.cc
+++ b/tensorflow/core/kernels/mkl_avgpooling_op.cc
@@ -336,10 +336,11 @@ class MklAvgPoolingGradOp : public OpKernel {
       if (!outbackprop_in_mkl_format) {
         // For avgpooling, tensor_in_shape should have 1 dimension, and 4
         // elements.
-        OP_REQUIRES(context, tensor_in_shape.dims() == 1 &&
-                                 tensor_in_shape.NumElements() == 4,
-                    errors::InvalidArgument("original input shape must be "
-                                            "1-dimensional and 4 elements"));
+        OP_REQUIRES(
+            context,
+            tensor_in_shape.dims() == 1 && tensor_in_shape.NumElements() == 4,
+            errors::InvalidArgument("original input shape must be "
+                                    "1-dimensional and 4 elements"));
 
         // For avgpooling, out_backprop should have 4 dimensions.
         OP_REQUIRES(context, out_backprop.dims() == 4,
diff --git a/tensorflow/core/kernels/mkl_conv_grad_bias_ops.cc b/tensorflow/core/kernels/mkl_conv_grad_bias_ops.cc
index 90b9f7ba90b..627fd83b0d7 100644
--- a/tensorflow/core/kernels/mkl_conv_grad_bias_ops.cc
+++ b/tensorflow/core/kernels/mkl_conv_grad_bias_ops.cc
@@ -38,9 +38,9 @@ limitations under the License.
 #include "tensorflow/core/util/use_cudnn.h"
 #include "tensorflow/core/util/work_sharder.h"
 
-#include "tensorflow/core/util/mkl_util.h"
 #include "third_party/mkl/include/mkl_dnn.h"
 #include "third_party/mkl/include/mkl_dnn_types.h"
+#include "tensorflow/core/util/mkl_util.h"
 
 namespace tensorflow {
 
diff --git a/tensorflow/core/kernels/mkl_conv_grad_filter_ops.cc b/tensorflow/core/kernels/mkl_conv_grad_filter_ops.cc
index 266f433e703..85198d89f56 100644
--- a/tensorflow/core/kernels/mkl_conv_grad_filter_ops.cc
+++ b/tensorflow/core/kernels/mkl_conv_grad_filter_ops.cc
@@ -37,9 +37,9 @@ limitations under the License.
 #include "tensorflow/core/util/use_cudnn.h"
 #include "tensorflow/core/util/work_sharder.h"
 
-#include "tensorflow/core/util/mkl_util.h"
 #include "third_party/mkl/include/mkl_dnn.h"
 #include "third_party/mkl/include/mkl_dnn_types.h"
+#include "tensorflow/core/util/mkl_util.h"
 
 namespace tensorflow {
 
diff --git a/tensorflow/core/kernels/mkl_conv_grad_input_ops.cc b/tensorflow/core/kernels/mkl_conv_grad_input_ops.cc
index c5ebe8024ed..c7d95c86bcd 100644
--- a/tensorflow/core/kernels/mkl_conv_grad_input_ops.cc
+++ b/tensorflow/core/kernels/mkl_conv_grad_input_ops.cc
@@ -23,6 +23,8 @@ limitations under the License.
 #define EIGEN_USE_THREADS
 #include <algorithm>
 #include <vector>
+#include "third_party/mkl/include/mkl_dnn.h"
+#include "third_party/mkl/include/mkl_dnn_types.h"
 #include "tensorflow/core/framework/numeric_op.h"
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/framework/register_types.h"
@@ -40,8 +42,6 @@ limitations under the License.
 #include "tensorflow/core/util/tensor_format.h"
 #include "tensorflow/core/util/use_cudnn.h"
 #include "tensorflow/core/util/work_sharder.h"
-#include "third_party/mkl/include/mkl_dnn.h"
-#include "third_party/mkl/include/mkl_dnn_types.h"
 
 namespace tensorflow {
 
diff --git a/tensorflow/core/kernels/mkl_conv_ops.cc b/tensorflow/core/kernels/mkl_conv_ops.cc
index acd37786ff1..e5c4c21a10a 100644
--- a/tensorflow/core/kernels/mkl_conv_ops.cc
+++ b/tensorflow/core/kernels/mkl_conv_ops.cc
@@ -107,10 +107,10 @@ class MklConv2DOp : public OpKernel {
     const int64 input_depth =
         input_in_mkl_format ? GetMklTensorDim(mkl_context.input_shape, 'C')
                             : GetTensorDim(input, data_format_, 'C');
-    OP_REQUIRES(
-        context, input_depth == filter.dim_size(2),
-        errors::InvalidArgument("input and filter must have the same depth: ",
-                                input_depth, " vs ", filter.dim_size(2)));
+    OP_REQUIRES(context, input_depth == filter.dim_size(2),
+                errors::InvalidArgument(
+                    "input and filter must have the same depth: ", input_depth,
+                    " vs ", filter.dim_size(2)));
     // The last dimension for filter is out_depth.
     const int out_depth = static_cast<int>(filter.dim_size(3));
 
@@ -119,9 +119,10 @@ class MklConv2DOp : public OpKernel {
     const int64 input_rows_raw =
         input_in_mkl_format ? GetMklTensorDim(mkl_context.input_shape, 'H')
                             : GetTensorDim(input, data_format_, 'H');
-    OP_REQUIRES(context, FastBoundsCheck(input_rows_raw,
-                                         std::numeric_limits<int>::max()),
-                errors::InvalidArgument("Input rows too large"));
+    OP_REQUIRES(
+        context,
+        FastBoundsCheck(input_rows_raw, std::numeric_limits<int>::max()),
+        errors::InvalidArgument("Input rows too large"));
     const int input_rows = static_cast<int>(input_rows_raw);
     const int filter_rows = static_cast<int>(filter.dim_size(0));
 
@@ -130,9 +131,10 @@ class MklConv2DOp : public OpKernel {
     const int64 input_cols_raw =
         input_in_mkl_format ? GetMklTensorDim(mkl_context.input_shape, 'W')
                             : GetTensorDim(input, data_format_, 'W');
-    OP_REQUIRES(context, FastBoundsCheck(input_cols_raw,
-                                         std::numeric_limits<int>::max()),
-                errors::InvalidArgument("Input cols too large"));
+    OP_REQUIRES(
+        context,
+        FastBoundsCheck(input_cols_raw, std::numeric_limits<int>::max()),
+        errors::InvalidArgument("Input cols too large"));
     const int input_cols = static_cast<int>(input_cols_raw);
     const int filter_cols = static_cast<int>(filter.dim_size(1));
 
@@ -140,9 +142,10 @@ class MklConv2DOp : public OpKernel {
     const int64 input_batch_raw =
         input_in_mkl_format ? GetMklTensorDim(mkl_context.input_shape, 'N')
                             : GetTensorDim(input, data_format_, 'N');
-    OP_REQUIRES(context, FastBoundsCheck(input_batch_raw,
-                                         std::numeric_limits<int>::max()),
-                errors::InvalidArgument("batch is too large"));
+    OP_REQUIRES(
+        context,
+        FastBoundsCheck(input_batch_raw, std::numeric_limits<int>::max()),
+        errors::InvalidArgument("batch is too large"));
     const int batch = static_cast<int>(input_batch_raw);
 
     // For now we take the stride from the second and third dimensions only (we
diff --git a/tensorflow/core/kernels/mkl_maxpooling_op.cc b/tensorflow/core/kernels/mkl_maxpooling_op.cc
index 7e3efdcc06c..9d6cfb0c97d 100644
--- a/tensorflow/core/kernels/mkl_maxpooling_op.cc
+++ b/tensorflow/core/kernels/mkl_maxpooling_op.cc
@@ -393,18 +393,19 @@ class MklMaxPoolingGradOp : public OpKernel {
       if (workspace_enabled == false) {
         if (convert_input != nullptr) {
           if (input_in_mkl_format == false) {
-            CHECK_EQ(
-                dnnConversionExecute_F32(
-                    convert_input, const_cast<void*>(static_cast<const void*>(
-                                       tensor_in.flat<T>().data())),
-                    input_buf),
-                E_SUCCESS);
+            CHECK_EQ(dnnConversionExecute_F32(
+                         convert_input,
+                         const_cast<void*>(static_cast<const void*>(
+                             tensor_in.flat<T>().data())),
+                         input_buf),
+                     E_SUCCESS);
             CHECK_EQ(dnnDelete_F32(convert_input), E_SUCCESS);
             convert_input = nullptr;
           } else {
             input_shape.GetConvertedFlatData(
-                lt_input_prim, const_cast<void*>(static_cast<const void*>(
-                                   tensor_in.flat<T>().data())),
+                lt_input_prim,
+                const_cast<void*>(
+                    static_cast<const void*>(tensor_in.flat<T>().data())),
                 input_buf);
           }
           pooling_resfwd[dnnResourceSrc] = input_buf;
@@ -449,8 +450,9 @@ class MklMaxPoolingGradOp : public OpKernel {
           CHECK_EQ(dnnDelete_F32(convert_outbackprop), E_SUCCESS);
         } else {
           output_backprop_shape.GetConvertedFlatData(
-              lt_outbackprop_prim, const_cast<void*>(static_cast<const void*>(
-                                       out_backprop.flat<T>().data())),
+              lt_outbackprop_prim,
+              const_cast<void*>(
+                  static_cast<const void*>(out_backprop.flat<T>().data())),
               outbackprop_buf);
         }
         pooling_res[dnnResourceDiffDst] = outbackprop_buf;
diff --git a/tensorflow/core/kernels/mkl_pooling_ops_common.cc b/tensorflow/core/kernels/mkl_pooling_ops_common.cc
index 3eb472d7e30..d88bd4c640d 100644
--- a/tensorflow/core/kernels/mkl_pooling_ops_common.cc
+++ b/tensorflow/core/kernels/mkl_pooling_ops_common.cc
@@ -14,153 +14,137 @@ limitations under the License.
 ==============================================================================*/
 
 #ifdef INTEL_MKL
-#include <vector>
-#include "tensorflow/core/framework/common_shape_fns.h"
 #include "tensorflow/core/kernels/mkl_pooling_ops_common.h"
+#include <vector>
 #include "tensorflow/core/common_runtime/device.h"
+#include "tensorflow/core/framework/common_shape_fns.h"
 
 namespace tensorflow {
 
-  // Initialization for TensorFlow format
-  void MklPoolParameters::Init(OpKernelContext* context,
-                               const std::vector<int32>& ksize,
-                               const std::vector<int32>& stride,
-                               Padding padding,
-                               TensorFormat data_format,
-                               const TensorShape& tensor_in_shape) {
-    // For maxpooling, tensor_in should have 4 dimensions.
-    OP_REQUIRES(context, tensor_in_shape.dims() == 4,
-                errors::InvalidArgument("tensor_in must be 4-dimensional"));
+// Initialization for TensorFlow format
+void MklPoolParameters::Init(OpKernelContext* context,
+                             const std::vector<int32>& ksize,
+                             const std::vector<int32>& stride, Padding padding,
+                             TensorFormat data_format,
+                             const TensorShape& tensor_in_shape) {
+  // For maxpooling, tensor_in should have 4 dimensions.
+  OP_REQUIRES(context, tensor_in_shape.dims() == 4,
+              errors::InvalidArgument("tensor_in must be 4-dimensional"));
 
-    depth = GetTensorDim(tensor_in_shape, data_format, 'C');
-    tensor_in_cols = GetTensorDim(tensor_in_shape, data_format, 'W');
-    tensor_in_rows = GetTensorDim(tensor_in_shape, data_format, 'H');
-    tensor_in_batch = GetTensorDim(tensor_in_shape, data_format, 'N');
+  depth = GetTensorDim(tensor_in_shape, data_format, 'C');
+  tensor_in_cols = GetTensorDim(tensor_in_shape, data_format, 'W');
+  tensor_in_rows = GetTensorDim(tensor_in_shape, data_format, 'H');
+  tensor_in_batch = GetTensorDim(tensor_in_shape, data_format, 'N');
 
-    Init(context, ksize, stride, padding, data_format);
-  }
+  Init(context, ksize, stride, padding, data_format);
+}
 
-  // Initialization for MKL format
-  void MklPoolParameters::Init(OpKernelContext* context,
-                               const std::vector<int32>& ksize,
-                               const std::vector<int32>& stride,
-                               Padding padding,
-                               TensorFormat data_format,
-                               const MklShape* mklInputShape) {
-    // Get the input sizes
-    depth = mklInputShape->GetSizes()[2];
-    tensor_in_cols = mklInputShape->GetSizes()[0];
-    tensor_in_rows = mklInputShape->GetSizes()[1];
-    tensor_in_batch = mklInputShape->GetSizes()[3];
+// Initialization for MKL format
+void MklPoolParameters::Init(OpKernelContext* context,
+                             const std::vector<int32>& ksize,
+                             const std::vector<int32>& stride, Padding padding,
+                             TensorFormat data_format,
+                             const MklShape* mklInputShape) {
+  // Get the input sizes
+  depth = mklInputShape->GetSizes()[2];
+  tensor_in_cols = mklInputShape->GetSizes()[0];
+  tensor_in_rows = mklInputShape->GetSizes()[1];
+  tensor_in_batch = mklInputShape->GetSizes()[3];
 
-    Init(context, ksize, stride, padding, data_format);
-  }
+  Init(context, ksize, stride, padding, data_format);
+}
 
-  // Common Initialization for TensorFlow and MKL formats
-  void MklPoolParameters::Init(OpKernelContext* context,
-                               const std::vector<int32>& ksize,
-                               const std::vector<int32>& stride,
-                               Padding padding,
-                               TensorFormat data_format) {
-    // Get the data format
-    this->data_format = data_format;
+// Common Initialization for TensorFlow and MKL formats
+void MklPoolParameters::Init(OpKernelContext* context,
+                             const std::vector<int32>& ksize,
+                             const std::vector<int32>& stride, Padding padding,
+                             TensorFormat data_format) {
+  // Get the data format
+  this->data_format = data_format;
 
-    // Get the output sizes
-    window_rows = GetTensorDim(ksize, data_format, 'H');
-    window_cols = GetTensorDim(ksize, data_format, 'W');
-    depth_window = GetTensorDim(ksize, data_format, 'C');
+  // Get the output sizes
+  window_rows = GetTensorDim(ksize, data_format, 'H');
+  window_cols = GetTensorDim(ksize, data_format, 'W');
+  depth_window = GetTensorDim(ksize, data_format, 'C');
 
-    // Get the strides
-    row_stride = GetTensorDim(stride, data_format, 'H');
-    col_stride = GetTensorDim(stride, data_format, 'W');
-    depth_stride = GetTensorDim(stride, data_format, 'C');
+  // Get the strides
+  row_stride = GetTensorDim(stride, data_format, 'H');
+  col_stride = GetTensorDim(stride, data_format, 'W');
+  depth_stride = GetTensorDim(stride, data_format, 'C');
 
-    // We only support 2D pooling across width/height and depthwise
-    // pooling, not a combination.
-    OP_REQUIRES(context,
-                (depth_window == 1 || (window_rows == 1 && window_cols == 1)),
-                errors::Unimplemented(
+  // We only support 2D pooling across width/height and depthwise
+  // pooling, not a combination.
+  OP_REQUIRES(context,
+              (depth_window == 1 || (window_rows == 1 && window_cols == 1)),
+              errors::Unimplemented(
                   "MaxPooling supports exactly one of pooling across depth "
                   "or pooling across width/height."));
 
-    if (depth_window == 1) {
-      OP_REQUIRES_OK(context,
-                     GetWindowedOutputSizeVerbose(tensor_in_rows,
-                                                  window_rows,
-                                                  row_stride,
-                                                  padding,
-                                                  &out_height,
-                                                  &pad_top,
-                                                  &pad_bottom));
+  if (depth_window == 1) {
+    OP_REQUIRES_OK(context, GetWindowedOutputSizeVerbose(
+                                tensor_in_rows, window_rows, row_stride,
+                                padding, &out_height, &pad_top, &pad_bottom));
 
-      OP_REQUIRES_OK(context,
-                     GetWindowedOutputSizeVerbose(tensor_in_cols,
-                                                  window_cols,
-                                                  col_stride,
-                                                  padding,
-                                                  &out_width,
-                                                  &pad_left,
-                                                  &pad_right));
-    } else {
-      // Our current version of depthwise max pooling does not support
-      // any padding, and expects the depth_window to equal the depth
-      // stride (no overlapping).
-      OP_REQUIRES(context, depth % depth_window == 0,
-                  errors::Unimplemented("Depthwise max pooling requires the"
-                                        " depth window to evenly divide the"
-                                        " input depth"));
-      OP_REQUIRES(context, depth_stride == depth_window,
-                  errors::Unimplemented("Depthwise max pooling requires the"
-                                        " depth window to equal the depth"
-                                        " stride"));
+    OP_REQUIRES_OK(context, GetWindowedOutputSizeVerbose(
+                                tensor_in_cols, window_cols, col_stride,
+                                padding, &out_width, &pad_left, &pad_right));
+  } else {
+    // Our current version of depthwise max pooling does not support
+    // any padding, and expects the depth_window to equal the depth
+    // stride (no overlapping).
+    OP_REQUIRES(context, depth % depth_window == 0,
+                errors::Unimplemented("Depthwise max pooling requires the"
+                                      " depth window to evenly divide the"
+                                      " input depth"));
+    OP_REQUIRES(context, depth_stride == depth_window,
+                errors::Unimplemented("Depthwise max pooling requires the"
+                                      " depth window to equal the depth"
+                                      " stride"));
 
-      // The current version of depthwise max is only implemented on CPU.
-      OP_REQUIRES(context,
-                  (DeviceType(static_cast<Device*>(context->device())
-                              ->attributes()
-                              .device_type()) == DeviceType(DEVICE_CPU)),
-                  errors::Unimplemented("Depthwise max pooling is currently "
-                                        "only implemented for CPU devices."));
+    // The current version of depthwise max is only implemented on CPU.
+    OP_REQUIRES(context,
+                (DeviceType(static_cast<Device*>(context->device())
+                                ->attributes()
+                                .device_type()) == DeviceType(DEVICE_CPU)),
+                errors::Unimplemented("Depthwise max pooling is currently "
+                                      "only implemented for CPU devices."));
 
-      pad_depth = 0;
-      out_depth = depth / depth_window;
-    }
+    pad_depth = 0;
+    out_depth = depth / depth_window;
   }
+}
 
-  // Transfers the right parameters for pooling to the op parameters
-  // Updates context->status if there is an invalid input.
-  void ExtractMklOpParams(OpKernelContext* context,
-                          TensorFormat data_format,
-                          const MklPoolParameters &params,
-                          MklPoolingOpParams *mkl_params) {
-    mkl_params->in_sizes[0] = params.tensor_in_cols;
-    mkl_params->in_sizes[1] = params.tensor_in_rows;
-    mkl_params->in_sizes[2] = params.depth;
-    mkl_params->in_sizes[3] = params.tensor_in_batch;
+// Transfers the right parameters for pooling to the op parameters
+// Updates context->status if there is an invalid input.
+void ExtractMklOpParams(OpKernelContext* context, TensorFormat data_format,
+                        const MklPoolParameters& params,
+                        MklPoolingOpParams* mkl_params) {
+  mkl_params->in_sizes[0] = params.tensor_in_cols;
+  mkl_params->in_sizes[1] = params.tensor_in_rows;
+  mkl_params->in_sizes[2] = params.depth;
+  mkl_params->in_sizes[3] = params.tensor_in_batch;
 
-    GetStridesFromSizes(data_format,
-                        mkl_params->in_strides,
-                        mkl_params->in_sizes);
+  GetStridesFromSizes(data_format, mkl_params->in_strides,
+                      mkl_params->in_sizes);
 
-    mkl_params->out_sizes[0] = params.out_width;
-    mkl_params->out_sizes[1] = params.out_height;
-    mkl_params->out_sizes[2] = params.depth;
-    mkl_params->out_sizes[3] = params.tensor_in_batch;
+  mkl_params->out_sizes[0] = params.out_width;
+  mkl_params->out_sizes[1] = params.out_height;
+  mkl_params->out_sizes[2] = params.depth;
+  mkl_params->out_sizes[3] = params.tensor_in_batch;
 
-    GetStridesFromSizes(data_format,
-                        mkl_params->out_strides,
-                        mkl_params->out_sizes);
+  GetStridesFromSizes(data_format, mkl_params->out_strides,
+                      mkl_params->out_sizes);
 
-    mkl_params->in_offset[0] = -params.pad_left;
-    mkl_params->in_offset[1] = -params.pad_top;
-    mkl_params->in_offset[2] = -params.pad_right;
-    mkl_params->in_offset[3] = -params.pad_bottom;
+  mkl_params->in_offset[0] = -params.pad_left;
+  mkl_params->in_offset[1] = -params.pad_top;
+  mkl_params->in_offset[2] = -params.pad_right;
+  mkl_params->in_offset[3] = -params.pad_bottom;
 
-    mkl_params->kernel_stride[0] = params.col_stride;
-    mkl_params->kernel_stride[1] = params.row_stride;
+  mkl_params->kernel_stride[0] = params.col_stride;
+  mkl_params->kernel_stride[1] = params.row_stride;
 
-    mkl_params->kernel_size[0] = params.window_cols;
-    mkl_params->kernel_size[1] = params.window_rows;
-  }
-}       // namespace tensorflow
+  mkl_params->kernel_size[0] = params.window_cols;
+  mkl_params->kernel_size[1] = params.window_rows;
+}
+}  // namespace tensorflow
 #endif  // INTEL_MKL
diff --git a/tensorflow/core/kernels/mkl_pooling_ops_common.h b/tensorflow/core/kernels/mkl_pooling_ops_common.h
index 0a7c4dd15eb..92ea2beb25a 100644
--- a/tensorflow/core/kernels/mkl_pooling_ops_common.h
+++ b/tensorflow/core/kernels/mkl_pooling_ops_common.h
@@ -76,17 +76,16 @@ typedef struct {
   size_t in_strides[4];
   size_t out_sizes[4];
   size_t out_strides[4];
-  int    in_offset[4];
+  int in_offset[4];
   size_t kernel_stride[2];
   size_t kernel_size[2];
 } MklPoolingOpParams;
 
 // Transfers the right parameters for pooling to the op parameters
 // Updates context->status if there is an invalid input.
-void ExtractMklOpParams(OpKernelContext* context,
-                        TensorFormat data_format,
-                        const MklPoolParameters &params,
-                        MklPoolingOpParams *mkl_params);
+void ExtractMklOpParams(OpKernelContext* context, TensorFormat data_format,
+                        const MklPoolParameters& params,
+                        MklPoolingOpParams* mkl_params);
 }  // namespace tensorflow
 
 #endif  // INTEL_MKL
diff --git a/tensorflow/core/kernels/mkl_relu_op.cc b/tensorflow/core/kernels/mkl_relu_op.cc
index 63c0374981f..7809711524c 100644
--- a/tensorflow/core/kernels/mkl_relu_op.cc
+++ b/tensorflow/core/kernels/mkl_relu_op.cc
@@ -1,397 +1,397 @@
-/* Copyright 2015 The TensorFlow Authors. All Rights Reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-==============================================================================*/
-
-// See docs in ../ops/nn_ops.cc.
-#ifdef INTEL_MKL
-
-#include "tensorflow/core/framework/numeric_op.h"
-#include "tensorflow/core/framework/op_kernel.h"
-#include "tensorflow/core/framework/register_types.h"
-#include "tensorflow/core/framework/tensor.h"
-#include "tensorflow/core/lib/core/errors.h"
-#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
-
-#include "tensorflow/core/platform/default/logging.h"
-#include "tensorflow/core/util/mkl_util.h"
-#include "third_party/mkl/include/mkl_dnn.h"
-#include "third_party/mkl/include/mkl_dnn_types.h"
-
-namespace tensorflow {
-
-typedef Eigen::ThreadPoolDevice CPUDevice;
-
-struct MklReluHelpers {
-  static void ValidateSameSizeHelper(OpKernelContext* context, const Tensor& g,
-                                     const Tensor& a) {
-    OP_REQUIRES(context, a.IsSameSize(g),
-                errors::InvalidArgument("g and a must be the same size"));
-  }
-  static bool ValidateSameSize(OpKernelContext* context, const Tensor& g,
-                               const Tensor& a) {
-    ValidateSameSizeHelper(context, g, a);
-    return context->status().ok();
-  }
-};
-
-template <typename Device, typename T>
-class MklReluOp : public OpKernel {
- public:
-  ~MklReluOp() {}
-
-  explicit MklReluOp(OpKernelConstruction* context) : OpKernel(context) {}
-
-  void Compute(OpKernelContext* context) override {
-    MklReluOpContext mkl_context;
-
-    const Tensor& input = MklGetInput(context, 0);
-    GetMklShape(context, 0, &mkl_context.input_shape);
-    void* user_i = static_cast<void*>(const_cast<T*>(input.flat<T>().data()));
-    bool input_in_mkl_format = mkl_context.input_shape.IsMklTensor();
-    if (!input_in_mkl_format && !input.dims()) {  // handle the case of a scalar
-      const TensorShape& o_shape = input.shape();
-      Tensor* out_tensor = nullptr;
-      mkl_context.output_shape.SetMklTensor(false);
-      AllocateOutputSetMklshape(context, 0, &out_tensor, o_shape,
-                                mkl_context.output_shape);
-      void* out_o = static_cast<void*>(out_tensor->flat<T>().data());
-      (static_cast<T*>(out_o))[0] =
-          std::max((static_cast<T*>(user_i))[0], static_cast<T>(0));
-      return;
-    }
-
-    // Generate size, stride for input if input is in MKL format.
-    if (input_in_mkl_format) {
-      mkl_context.in_dims = mkl_context.input_shape.GetDimension();
-      mkl_context.in_sizes = new size_t[mkl_context.in_dims];
-      mkl_context.in_strides = new size_t[mkl_context.in_dims];
-      for (int i = 0; i < mkl_context.in_dims; i++) {
-        mkl_context.in_sizes[i] = mkl_context.input_shape.GetSizes()[i];
-        mkl_context.in_strides[i] = mkl_context.input_shape.GetStrides()[i];
-      }
-    } else {
-      mkl_context.in_dims = input.dims();
-      mkl_context.in_sizes = new size_t[mkl_context.in_dims];
-      mkl_context.in_strides = new size_t[mkl_context.in_dims];
-      for (int i = 0; i < mkl_context.in_dims; i++) {
-        mkl_context.in_sizes[i] = input.dim_size((mkl_context.in_dims - 1) - i);
-      }
-      mkl_context.in_strides[0] = 1;
-      for (int i = 1; i < mkl_context.in_dims; i++) {
-        mkl_context.in_strides[i] =
-            mkl_context.in_strides[i - 1] * mkl_context.in_sizes[i - 1];
-      }
-    }
-
-    float negative_slope = 0.0;
-    mkl_context.MklCreateInputLayouts(context);
-    CHECK_EQ(dnnReLUCreateForward_F32(&mkl_context.prim_relu_fwd, NULL,
-                                      mkl_context.lt_input, negative_slope),
-             E_SUCCESS);
-
-    Tensor* output = nullptr;
-
-    if (input_in_mkl_format) {
-      TensorShape tf_shape;
-      mkl_context.output_shape.SetMklTensor(true);
-      mkl_context.output_shape.SetMklLayout(mkl_context.prim_relu_fwd,
-                                            dnnResourceDst);
-      mkl_context.output_shape.SetTfLayout(
-          mkl_context.in_dims, mkl_context.in_sizes, mkl_context.in_strides);
-      mkl_context.output_shape.SetTfDimOrder(
-          mkl_context.in_dims, mkl_context.input_shape.GetTfToMklDimMap());
-      tf_shape.AddDim(dnnLayoutGetMemorySize_F32(static_cast<dnnLayout_t>(
-                          mkl_context.output_shape.GetMklLayout())) /
-                      sizeof(T));
-      AllocateOutputSetMklshape(context, 0, &output, tf_shape,
-                                mkl_context.output_shape);
-    } else {
-      const TensorShape& o_shape = input.shape();
-      mkl_context.output_shape.SetMklTensor(false);
-      AllocateOutputSetMklshape(context, 0, &output, o_shape,
-                                mkl_context.output_shape);
-    }
-
-    void* user_o = static_cast<void*>(const_cast<T*>(output->flat<T>().data()));
-
-    mkl_context.relu_res[dnnResourceDst] = user_o;
-    mkl_context.relu_res[dnnResourceSrc] = user_i;
-    CHECK_EQ(dnnExecute_F32(mkl_context.prim_relu_fwd, mkl_context.relu_res),
-             E_SUCCESS);
-    mkl_context.MklCleanup();
-  }
-
- private:
-  typedef struct {
-    int in_dims;
-    size_t* in_sizes;
-    size_t* in_strides;
-    MklShape input_shape, output_shape;
-    dnnPrimitive_t prim_relu_fwd = nullptr;
-    void* relu_res[dnnResourceNumber];
-    dnnLayout_t lt_input = nullptr;
-
-    void MklCleanup() {
-      bool input_in_mkl_format = input_shape.IsMklTensor();
-      if (!input_in_mkl_format) {
-        dnnLayoutDelete_F32(lt_input);
-        free(in_sizes);
-        free(in_strides);
-      }
-      dnnDelete_F32(prim_relu_fwd);
-    }
-
-    void MklCreateInputLayouts(OpKernelContext* context) {
-      bool input_in_mkl_format = input_shape.IsMklTensor();
-      if (!input_in_mkl_format) {
-        CHECK_EQ(dnnLayoutCreate_F32(&lt_input, in_dims, in_sizes, in_strides),
-                 E_SUCCESS);
-      } else {
-        lt_input = static_cast<dnnLayout_t>(input_shape.GetCurLayout());
-      }
-    }
-  } MklReluOpContext;
-};
-
-template <typename Device, typename T>
-class MklReluGradOp : public OpKernel {
- public:
-  ~MklReluGradOp() {}
-
-  explicit MklReluGradOp(OpKernelConstruction* context) : OpKernel(context) {}
-
-  void Compute(OpKernelContext* context) override;
-
- private:
-  typedef struct {
-    int in_dims;
-    size_t* in_sizes;
-    size_t* in_strides;
-    MklShape input_shape, grad_shape, output_shape;
-    void* relu_res[dnnResourceNumber];
-    dnnPrimitive_t prim_relu_bwd;
-    dnnLayout_t lt_input, lt_grad;
-
-    void MklPrepareReluGradInputs(OpKernelContext* context,
-                                  Tensor* mkl_tmp_grad_buf_tensor,
-                                  Tensor* mkl_tmp_input_buf_tensor) {
-      dnnPrimitive_t cv_user_to_reluB_input, cv_user_to_reluB_grad;
-      dnnLayout_t mkl_lt_internal_input, mkl_lt_internal_grad;
-
-      const Tensor& g = MklGetInput(context, 0);
-      const Tensor& a = MklGetInput(context, 1);
-
-      void* user_i = static_cast<void*>(const_cast<T*>(a.flat<T>().data()));
-      void* user_g = static_cast<void*>(const_cast<T*>(g.flat<T>().data()));
-
-      CHECK_EQ(dnnLayoutCreateFromPrimitive_F32(
-                   &mkl_lt_internal_grad, prim_relu_bwd, dnnResourceDiffDst),
-               E_SUCCESS);
-
-      CHECK_EQ(dnnLayoutCreateFromPrimitive_F32(&mkl_lt_internal_input,
-                                                prim_relu_bwd, dnnResourceSrc),
-               E_SUCCESS);
-
-      if (!dnnLayoutCompare_F32(mkl_lt_internal_grad, lt_grad)) {
-        AllocTmpBuffer(context, mkl_tmp_grad_buf_tensor, mkl_lt_internal_grad,
-                       &relu_res[dnnResourceDiffDst]);
-        CHECK_EQ(dnnConversionCreate_F32(&cv_user_to_reluB_grad, lt_grad,
-                                         mkl_lt_internal_grad),
-                 E_SUCCESS);
-        CHECK_EQ(dnnConversionExecute_F32(cv_user_to_reluB_grad, user_g,
-                                          relu_res[dnnResourceDiffDst]),
-                 E_SUCCESS);
-        dnnDelete_F32(cv_user_to_reluB_grad);
-      } else {
-        relu_res[dnnResourceDiffDst] = user_g;
-      }
-
-      if (!dnnLayoutCompare_F32(mkl_lt_internal_input, lt_input)) {
-        AllocTmpBuffer(context, mkl_tmp_input_buf_tensor, mkl_lt_internal_input,
-                       &relu_res[dnnResourceSrc]);
-        CHECK_EQ(dnnConversionCreate_F32(&cv_user_to_reluB_input, lt_input,
-                                         mkl_lt_internal_input),
-                 E_SUCCESS);
-        CHECK_EQ(dnnConversionExecute_F32(cv_user_to_reluB_input, user_i,
-                                          relu_res[dnnResourceSrc]),
-                 E_SUCCESS);
-        dnnDelete_F32(cv_user_to_reluB_input);
-      } else {
-        relu_res[dnnResourceSrc] = user_i;
-      }
-
-      dnnLayoutDelete_F32(mkl_lt_internal_input);
-      dnnLayoutDelete_F32(mkl_lt_internal_grad);
-    }
-
-    void MklCreateInputLayouts(OpKernelContext* context) {
-      bool grad_is_mkl = grad_shape.IsMklTensor();
-      bool input_is_mkl = input_shape.IsMklTensor();
-      if (!input_is_mkl) {
-        CHECK_EQ(dnnLayoutCreate_F32(&lt_input, in_dims, in_sizes, in_strides),
-                 E_SUCCESS);
-      } else {
-        lt_input = static_cast<dnnLayout_t>(input_shape.GetCurLayout());
-      }
-
-      if (!grad_is_mkl) {
-        CHECK_EQ(dnnLayoutCreate_F32(&lt_grad, in_dims, in_sizes, in_strides),
-                 E_SUCCESS);
-      } else {
-        lt_grad = static_cast<dnnLayout_t>(grad_shape.GetCurLayout());
-      }
-    }
-
-    void MklCleanup() {
-      bool grad_is_mkl = grad_shape.IsMklTensor();
-      bool input_is_mkl = input_shape.IsMklTensor();
-      dnnDelete_F32(prim_relu_bwd);
-      if (!input_is_mkl) {
-        dnnLayoutDelete_F32(lt_input);
-        free(in_sizes);
-        free(in_strides);
-      }
-      if (!grad_is_mkl) {
-        dnnLayoutDelete_F32(lt_grad);
-      }
-    }
-  } MklReluGradOpContext;
-};
-
-template <typename Device, typename T>
-
-void MklReluGradOp<Device, T>::Compute(OpKernelContext* context) {
-  MklReluGradOpContext mkl_context;
-  const Tensor& g = MklGetInput(context, 0);
-  const Tensor& a = MklGetInput(context, 1);
-
-  void* user_i = static_cast<void*>(const_cast<T*>(a.flat<T>().data()));
-  void* user_g = static_cast<void*>(const_cast<T*>(g.flat<T>().data()));
-
-  GetMklShape(context, 0, &mkl_context.grad_shape);
-  GetMklShape(context, 1, &mkl_context.input_shape);
-
-  bool grad_is_mkl = mkl_context.grad_shape.IsMklTensor();
-  bool input_is_mkl = mkl_context.input_shape.IsMklTensor();
-  if (!input_is_mkl && !grad_is_mkl &&
-      !MklReluHelpers::ValidateSameSize(context, g, a))
-    return;
-  Tensor* output = nullptr;
-  if (!input_is_mkl && !grad_is_mkl &&
-      !a.dims()) {  // handle the case of a scalar
-    // Allocate space for g and
-    const TensorShape& g_shape = g.shape();
-    mkl_context.output_shape.SetMklTensor(false);
-    AllocateOutputSetMklshape(context, 0, &output, g_shape,
-                              mkl_context.output_shape);
-    void* out_o = static_cast<void*>(output->flat<T>().data());
-    (static_cast<T*>(out_o))[0] =
-        (static_cast<T*>(user_g))[0] * ((static_cast<T*>(user_i))[0] > 0);
-    return;
-  }
-
-  // Generate size, stride for input if input/grad is in MKL format.
-  if (grad_is_mkl || input_is_mkl) {
-    const MklShape* tmp_mkl_shape =
-        (grad_is_mkl) ? &mkl_context.grad_shape : &mkl_context.input_shape;
-
-    mkl_context.in_dims = tmp_mkl_shape->GetDimension();
-    mkl_context.in_strides = new size_t[mkl_context.in_dims];
-    mkl_context.in_sizes = new size_t[mkl_context.in_dims];
-    for (int i = 0; i < mkl_context.in_dims; i++) {
-      mkl_context.in_sizes[i] = tmp_mkl_shape->GetSizes()[i];
-      mkl_context.in_strides[i] = tmp_mkl_shape->GetStrides()[i];
-    }
-  } else {
-    mkl_context.in_dims = g.dims();
-    mkl_context.in_strides = new size_t[mkl_context.in_dims];
-    mkl_context.in_sizes = new size_t[mkl_context.in_dims];
-
-    for (int i = 0; i < mkl_context.in_dims; i++) {
-      mkl_context.in_sizes[i] = g.dim_size((mkl_context.in_dims - 1) - i);
-    }
-    mkl_context.in_strides[0] = 1;
-    for (int i = 1; i < mkl_context.in_dims; i++) {
-      mkl_context.in_strides[i] =
-          mkl_context.in_strides[i - 1] * mkl_context.in_sizes[i - 1];
-    }
-  }
-
-  mkl_context.MklCreateInputLayouts(context);
-  float negative_slope = 0.0;
-  CHECK_EQ(dnnReLUCreateBackward_F32(&mkl_context.prim_relu_bwd, NULL,
-                                     mkl_context.lt_grad, mkl_context.lt_input,
-                                     negative_slope),
-           E_SUCCESS);
-  Tensor mkl_tmp_grad_buf_tensor, mkl_tmp_input_buf_tensor;
-  mkl_context.MklPrepareReluGradInputs(context, &mkl_tmp_grad_buf_tensor,
-                                       &mkl_tmp_input_buf_tensor);
-
-  if (input_is_mkl ||
-      grad_is_mkl) { /*if  grad or input are MKL leave it in MKL*/
-    TensorShape tf_shape;
-    mkl_context.output_shape.SetMklTensor(true);
-    mkl_context.output_shape.SetMklLayout(mkl_context.prim_relu_bwd,
-                                          dnnResourceDiffSrc);
-    mkl_context.output_shape.SetTfLayout(
-        mkl_context.in_dims, mkl_context.in_sizes, mkl_context.in_strides);
-    // If input_is_mkl or grad_is_mkl, then we copy strides and sizes from Mkl
-    // shape of one that is in MKL layout.
-    if (grad_is_mkl == true) {
-      mkl_context.output_shape.SetTfDimOrder(
-          mkl_context.in_dims, mkl_context.grad_shape.GetTfToMklDimMap());
-    } else {
-      mkl_context.output_shape.SetTfDimOrder(
-          mkl_context.in_dims, mkl_context.input_shape.GetTfToMklDimMap());
-    }
-
-    tf_shape.AddDim(dnnLayoutGetMemorySize_F32(static_cast<dnnLayout_t>(
-                        mkl_context.output_shape.GetMklLayout())) /
-                    sizeof(T));
-    AllocateOutputSetMklshape(context, 0, &output, tf_shape,
-                              mkl_context.output_shape);
-
-  } else {
-    const TensorShape& o_shape = g.shape();
-    mkl_context.output_shape.SetMklTensor(false);
-    AllocateOutputSetMklshape(context, 0, &output, o_shape,
-                              mkl_context.output_shape);
-  }
-
-  mkl_context.relu_res[dnnResourceDiffSrc] =
-      static_cast<void*>(output->flat<T>().data());
-
-  CHECK_EQ(dnnExecute_F32(mkl_context.prim_relu_bwd, mkl_context.relu_res),
-           E_SUCCESS);
-  mkl_context.MklCleanup();
-}
-
-/* Register DNN kernels for supported operations and supported types - right now
- * it is only Relu and f32*/
-#define REGISTER_RELU_MKL_SUPPORTED_KERNELS_TYPES(type)                   \
-  REGISTER_KERNEL_BUILDER(Name("MklRelu")                                 \
-                              .Device(DEVICE_CPU)                         \
-                              .TypeConstraint<type>("T")                  \
-                              .Label(mkl_layer_registry::kMklLayerLabel), \
-                          MklReluOp<CPUDevice, type>);                    \
-  REGISTER_KERNEL_BUILDER(Name("MklReluGrad")                             \
-                              .Device(DEVICE_CPU)                         \
-                              .TypeConstraint<type>("T")                  \
-                              .Label(mkl_layer_registry::kMklLayerLabel), \
-                          MklReluGradOp<CPUDevice, type>);
-TF_CALL_float(REGISTER_RELU_MKL_SUPPORTED_KERNELS_TYPES);
-
-}  // namespace tensorflow
-
-#endif  // INTEL_MKL
+/* Copyright 2015 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+// See docs in ../ops/nn_ops.cc.
+#ifdef INTEL_MKL
+
+#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
+#include "tensorflow/core/framework/numeric_op.h"
+#include "tensorflow/core/framework/op_kernel.h"
+#include "tensorflow/core/framework/register_types.h"
+#include "tensorflow/core/framework/tensor.h"
+#include "tensorflow/core/lib/core/errors.h"
+
+#include "third_party/mkl/include/mkl_dnn.h"
+#include "third_party/mkl/include/mkl_dnn_types.h"
+#include "tensorflow/core/platform/default/logging.h"
+#include "tensorflow/core/util/mkl_util.h"
+
+namespace tensorflow {
+
+typedef Eigen::ThreadPoolDevice CPUDevice;
+
+struct MklReluHelpers {
+  static void ValidateSameSizeHelper(OpKernelContext* context, const Tensor& g,
+                                     const Tensor& a) {
+    OP_REQUIRES(context, a.IsSameSize(g),
+                errors::InvalidArgument("g and a must be the same size"));
+  }
+  static bool ValidateSameSize(OpKernelContext* context, const Tensor& g,
+                               const Tensor& a) {
+    ValidateSameSizeHelper(context, g, a);
+    return context->status().ok();
+  }
+};
+
+template <typename Device, typename T>
+class MklReluOp : public OpKernel {
+ public:
+  ~MklReluOp() {}
+
+  explicit MklReluOp(OpKernelConstruction* context) : OpKernel(context) {}
+
+  void Compute(OpKernelContext* context) override {
+    MklReluOpContext mkl_context;
+
+    const Tensor& input = MklGetInput(context, 0);
+    GetMklShape(context, 0, &mkl_context.input_shape);
+    void* user_i = static_cast<void*>(const_cast<T*>(input.flat<T>().data()));
+    bool input_in_mkl_format = mkl_context.input_shape.IsMklTensor();
+    if (!input_in_mkl_format && !input.dims()) {  // handle the case of a scalar
+      const TensorShape& o_shape = input.shape();
+      Tensor* out_tensor = nullptr;
+      mkl_context.output_shape.SetMklTensor(false);
+      AllocateOutputSetMklshape(context, 0, &out_tensor, o_shape,
+                                mkl_context.output_shape);
+      void* out_o = static_cast<void*>(out_tensor->flat<T>().data());
+      (static_cast<T*>(out_o))[0] =
+          std::max((static_cast<T*>(user_i))[0], static_cast<T>(0));
+      return;
+    }
+
+    // Generate size, stride for input if input is in MKL format.
+    if (input_in_mkl_format) {
+      mkl_context.in_dims = mkl_context.input_shape.GetDimension();
+      mkl_context.in_sizes = new size_t[mkl_context.in_dims];
+      mkl_context.in_strides = new size_t[mkl_context.in_dims];
+      for (int i = 0; i < mkl_context.in_dims; i++) {
+        mkl_context.in_sizes[i] = mkl_context.input_shape.GetSizes()[i];
+        mkl_context.in_strides[i] = mkl_context.input_shape.GetStrides()[i];
+      }
+    } else {
+      mkl_context.in_dims = input.dims();
+      mkl_context.in_sizes = new size_t[mkl_context.in_dims];
+      mkl_context.in_strides = new size_t[mkl_context.in_dims];
+      for (int i = 0; i < mkl_context.in_dims; i++) {
+        mkl_context.in_sizes[i] = input.dim_size((mkl_context.in_dims - 1) - i);
+      }
+      mkl_context.in_strides[0] = 1;
+      for (int i = 1; i < mkl_context.in_dims; i++) {
+        mkl_context.in_strides[i] =
+            mkl_context.in_strides[i - 1] * mkl_context.in_sizes[i - 1];
+      }
+    }
+
+    float negative_slope = 0.0;
+    mkl_context.MklCreateInputLayouts(context);
+    CHECK_EQ(dnnReLUCreateForward_F32(&mkl_context.prim_relu_fwd, NULL,
+                                      mkl_context.lt_input, negative_slope),
+             E_SUCCESS);
+
+    Tensor* output = nullptr;
+
+    if (input_in_mkl_format) {
+      TensorShape tf_shape;
+      mkl_context.output_shape.SetMklTensor(true);
+      mkl_context.output_shape.SetMklLayout(mkl_context.prim_relu_fwd,
+                                            dnnResourceDst);
+      mkl_context.output_shape.SetTfLayout(
+          mkl_context.in_dims, mkl_context.in_sizes, mkl_context.in_strides);
+      mkl_context.output_shape.SetTfDimOrder(
+          mkl_context.in_dims, mkl_context.input_shape.GetTfToMklDimMap());
+      tf_shape.AddDim(dnnLayoutGetMemorySize_F32(static_cast<dnnLayout_t>(
+                          mkl_context.output_shape.GetMklLayout())) /
+                      sizeof(T));
+      AllocateOutputSetMklshape(context, 0, &output, tf_shape,
+                                mkl_context.output_shape);
+    } else {
+      const TensorShape& o_shape = input.shape();
+      mkl_context.output_shape.SetMklTensor(false);
+      AllocateOutputSetMklshape(context, 0, &output, o_shape,
+                                mkl_context.output_shape);
+    }
+
+    void* user_o = static_cast<void*>(const_cast<T*>(output->flat<T>().data()));
+
+    mkl_context.relu_res[dnnResourceDst] = user_o;
+    mkl_context.relu_res[dnnResourceSrc] = user_i;
+    CHECK_EQ(dnnExecute_F32(mkl_context.prim_relu_fwd, mkl_context.relu_res),
+             E_SUCCESS);
+    mkl_context.MklCleanup();
+  }
+
+ private:
+  typedef struct {
+    int in_dims;
+    size_t* in_sizes;
+    size_t* in_strides;
+    MklShape input_shape, output_shape;
+    dnnPrimitive_t prim_relu_fwd = nullptr;
+    void* relu_res[dnnResourceNumber];
+    dnnLayout_t lt_input = nullptr;
+
+    void MklCleanup() {
+      bool input_in_mkl_format = input_shape.IsMklTensor();
+      if (!input_in_mkl_format) {
+        dnnLayoutDelete_F32(lt_input);
+        free(in_sizes);
+        free(in_strides);
+      }
+      dnnDelete_F32(prim_relu_fwd);
+    }
+
+    void MklCreateInputLayouts(OpKernelContext* context) {
+      bool input_in_mkl_format = input_shape.IsMklTensor();
+      if (!input_in_mkl_format) {
+        CHECK_EQ(dnnLayoutCreate_F32(&lt_input, in_dims, in_sizes, in_strides),
+                 E_SUCCESS);
+      } else {
+        lt_input = static_cast<dnnLayout_t>(input_shape.GetCurLayout());
+      }
+    }
+  } MklReluOpContext;
+};
+
+template <typename Device, typename T>
+class MklReluGradOp : public OpKernel {
+ public:
+  ~MklReluGradOp() {}
+
+  explicit MklReluGradOp(OpKernelConstruction* context) : OpKernel(context) {}
+
+  void Compute(OpKernelContext* context) override;
+
+ private:
+  typedef struct {
+    int in_dims;
+    size_t* in_sizes;
+    size_t* in_strides;
+    MklShape input_shape, grad_shape, output_shape;
+    void* relu_res[dnnResourceNumber];
+    dnnPrimitive_t prim_relu_bwd;
+    dnnLayout_t lt_input, lt_grad;
+
+    void MklPrepareReluGradInputs(OpKernelContext* context,
+                                  Tensor* mkl_tmp_grad_buf_tensor,
+                                  Tensor* mkl_tmp_input_buf_tensor) {
+      dnnPrimitive_t cv_user_to_reluB_input, cv_user_to_reluB_grad;
+      dnnLayout_t mkl_lt_internal_input, mkl_lt_internal_grad;
+
+      const Tensor& g = MklGetInput(context, 0);
+      const Tensor& a = MklGetInput(context, 1);
+
+      void* user_i = static_cast<void*>(const_cast<T*>(a.flat<T>().data()));
+      void* user_g = static_cast<void*>(const_cast<T*>(g.flat<T>().data()));
+
+      CHECK_EQ(dnnLayoutCreateFromPrimitive_F32(
+                   &mkl_lt_internal_grad, prim_relu_bwd, dnnResourceDiffDst),
+               E_SUCCESS);
+
+      CHECK_EQ(dnnLayoutCreateFromPrimitive_F32(&mkl_lt_internal_input,
+                                                prim_relu_bwd, dnnResourceSrc),
+               E_SUCCESS);
+
+      if (!dnnLayoutCompare_F32(mkl_lt_internal_grad, lt_grad)) {
+        AllocTmpBuffer(context, mkl_tmp_grad_buf_tensor, mkl_lt_internal_grad,
+                       &relu_res[dnnResourceDiffDst]);
+        CHECK_EQ(dnnConversionCreate_F32(&cv_user_to_reluB_grad, lt_grad,
+                                         mkl_lt_internal_grad),
+                 E_SUCCESS);
+        CHECK_EQ(dnnConversionExecute_F32(cv_user_to_reluB_grad, user_g,
+                                          relu_res[dnnResourceDiffDst]),
+                 E_SUCCESS);
+        dnnDelete_F32(cv_user_to_reluB_grad);
+      } else {
+        relu_res[dnnResourceDiffDst] = user_g;
+      }
+
+      if (!dnnLayoutCompare_F32(mkl_lt_internal_input, lt_input)) {
+        AllocTmpBuffer(context, mkl_tmp_input_buf_tensor, mkl_lt_internal_input,
+                       &relu_res[dnnResourceSrc]);
+        CHECK_EQ(dnnConversionCreate_F32(&cv_user_to_reluB_input, lt_input,
+                                         mkl_lt_internal_input),
+                 E_SUCCESS);
+        CHECK_EQ(dnnConversionExecute_F32(cv_user_to_reluB_input, user_i,
+                                          relu_res[dnnResourceSrc]),
+                 E_SUCCESS);
+        dnnDelete_F32(cv_user_to_reluB_input);
+      } else {
+        relu_res[dnnResourceSrc] = user_i;
+      }
+
+      dnnLayoutDelete_F32(mkl_lt_internal_input);
+      dnnLayoutDelete_F32(mkl_lt_internal_grad);
+    }
+
+    void MklCreateInputLayouts(OpKernelContext* context) {
+      bool grad_is_mkl = grad_shape.IsMklTensor();
+      bool input_is_mkl = input_shape.IsMklTensor();
+      if (!input_is_mkl) {
+        CHECK_EQ(dnnLayoutCreate_F32(&lt_input, in_dims, in_sizes, in_strides),
+                 E_SUCCESS);
+      } else {
+        lt_input = static_cast<dnnLayout_t>(input_shape.GetCurLayout());
+      }
+
+      if (!grad_is_mkl) {
+        CHECK_EQ(dnnLayoutCreate_F32(&lt_grad, in_dims, in_sizes, in_strides),
+                 E_SUCCESS);
+      } else {
+        lt_grad = static_cast<dnnLayout_t>(grad_shape.GetCurLayout());
+      }
+    }
+
+    void MklCleanup() {
+      bool grad_is_mkl = grad_shape.IsMklTensor();
+      bool input_is_mkl = input_shape.IsMklTensor();
+      dnnDelete_F32(prim_relu_bwd);
+      if (!input_is_mkl) {
+        dnnLayoutDelete_F32(lt_input);
+        free(in_sizes);
+        free(in_strides);
+      }
+      if (!grad_is_mkl) {
+        dnnLayoutDelete_F32(lt_grad);
+      }
+    }
+  } MklReluGradOpContext;
+};
+
+template <typename Device, typename T>
+
+void MklReluGradOp<Device, T>::Compute(OpKernelContext* context) {
+  MklReluGradOpContext mkl_context;
+  const Tensor& g = MklGetInput(context, 0);
+  const Tensor& a = MklGetInput(context, 1);
+
+  void* user_i = static_cast<void*>(const_cast<T*>(a.flat<T>().data()));
+  void* user_g = static_cast<void*>(const_cast<T*>(g.flat<T>().data()));
+
+  GetMklShape(context, 0, &mkl_context.grad_shape);
+  GetMklShape(context, 1, &mkl_context.input_shape);
+
+  bool grad_is_mkl = mkl_context.grad_shape.IsMklTensor();
+  bool input_is_mkl = mkl_context.input_shape.IsMklTensor();
+  if (!input_is_mkl && !grad_is_mkl &&
+      !MklReluHelpers::ValidateSameSize(context, g, a))
+    return;
+  Tensor* output = nullptr;
+  if (!input_is_mkl && !grad_is_mkl &&
+      !a.dims()) {  // handle the case of a scalar
+    // Allocate space for the output and compute the scalar ReLU gradient.
+    const TensorShape& g_shape = g.shape();
+    mkl_context.output_shape.SetMklTensor(false);
+    AllocateOutputSetMklshape(context, 0, &output, g_shape,
+                              mkl_context.output_shape);
+    void* out_o = static_cast<void*>(output->flat<T>().data());
+    (static_cast<T*>(out_o))[0] =
+        (static_cast<T*>(user_g))[0] * ((static_cast<T*>(user_i))[0] > 0);
+    return;
+  }
+
+  // Generate size, stride for input if input/grad is in MKL format.
+  if (grad_is_mkl || input_is_mkl) {
+    const MklShape* tmp_mkl_shape =
+        (grad_is_mkl) ? &mkl_context.grad_shape : &mkl_context.input_shape;
+
+    mkl_context.in_dims = tmp_mkl_shape->GetDimension();
+    mkl_context.in_strides = new size_t[mkl_context.in_dims];
+    mkl_context.in_sizes = new size_t[mkl_context.in_dims];
+    for (int i = 0; i < mkl_context.in_dims; i++) {
+      mkl_context.in_sizes[i] = tmp_mkl_shape->GetSizes()[i];
+      mkl_context.in_strides[i] = tmp_mkl_shape->GetStrides()[i];
+    }
+  } else {
+    mkl_context.in_dims = g.dims();
+    mkl_context.in_strides = new size_t[mkl_context.in_dims];
+    mkl_context.in_sizes = new size_t[mkl_context.in_dims];
+
+    for (int i = 0; i < mkl_context.in_dims; i++) {
+      mkl_context.in_sizes[i] = g.dim_size((mkl_context.in_dims - 1) - i);
+    }
+    mkl_context.in_strides[0] = 1;
+    for (int i = 1; i < mkl_context.in_dims; i++) {
+      mkl_context.in_strides[i] =
+          mkl_context.in_strides[i - 1] * mkl_context.in_sizes[i - 1];
+    }
+  }
+
+  mkl_context.MklCreateInputLayouts(context);
+  float negative_slope = 0.0;
+  CHECK_EQ(dnnReLUCreateBackward_F32(&mkl_context.prim_relu_bwd, NULL,
+                                     mkl_context.lt_grad, mkl_context.lt_input,
+                                     negative_slope),
+           E_SUCCESS);
+  Tensor mkl_tmp_grad_buf_tensor, mkl_tmp_input_buf_tensor;
+  mkl_context.MklPrepareReluGradInputs(context, &mkl_tmp_grad_buf_tensor,
+                                       &mkl_tmp_input_buf_tensor);
+
+  if (input_is_mkl ||
+      grad_is_mkl) { /* if grad or input is in MKL format, keep MKL format */
+    TensorShape tf_shape;
+    mkl_context.output_shape.SetMklTensor(true);
+    mkl_context.output_shape.SetMklLayout(mkl_context.prim_relu_bwd,
+                                          dnnResourceDiffSrc);
+    mkl_context.output_shape.SetTfLayout(
+        mkl_context.in_dims, mkl_context.in_sizes, mkl_context.in_strides);
+    // If input_is_mkl or grad_is_mkl, then we copy strides and sizes from Mkl
+    // shape of one that is in MKL layout.
+    if (grad_is_mkl == true) {
+      mkl_context.output_shape.SetTfDimOrder(
+          mkl_context.in_dims, mkl_context.grad_shape.GetTfToMklDimMap());
+    } else {
+      mkl_context.output_shape.SetTfDimOrder(
+          mkl_context.in_dims, mkl_context.input_shape.GetTfToMklDimMap());
+    }
+
+    tf_shape.AddDim(dnnLayoutGetMemorySize_F32(static_cast<dnnLayout_t>(
+                        mkl_context.output_shape.GetMklLayout())) /
+                    sizeof(T));
+    AllocateOutputSetMklshape(context, 0, &output, tf_shape,
+                              mkl_context.output_shape);
+
+  } else {
+    const TensorShape& o_shape = g.shape();
+    mkl_context.output_shape.SetMklTensor(false);
+    AllocateOutputSetMklshape(context, 0, &output, o_shape,
+                              mkl_context.output_shape);
+  }
+
+  mkl_context.relu_res[dnnResourceDiffSrc] =
+      static_cast<void*>(output->flat<T>().data());
+
+  CHECK_EQ(dnnExecute_F32(mkl_context.prim_relu_bwd, mkl_context.relu_res),
+           E_SUCCESS);
+  mkl_context.MklCleanup();
+}
+
+/* Register DNN kernels for supported operations and types; currently only
+ * Relu and float32 are supported. */
+#define REGISTER_RELU_MKL_SUPPORTED_KERNELS_TYPES(type)                   \
+  REGISTER_KERNEL_BUILDER(Name("MklRelu")                                 \
+                              .Device(DEVICE_CPU)                         \
+                              .TypeConstraint<type>("T")                  \
+                              .Label(mkl_layer_registry::kMklLayerLabel), \
+                          MklReluOp<CPUDevice, type>);                    \
+  REGISTER_KERNEL_BUILDER(Name("MklReluGrad")                             \
+                              .Device(DEVICE_CPU)                         \
+                              .TypeConstraint<type>("T")                  \
+                              .Label(mkl_layer_registry::kMklLayerLabel), \
+                          MklReluGradOp<CPUDevice, type>);
+TF_CALL_float(REGISTER_RELU_MKL_SUPPORTED_KERNELS_TYPES);
+
+}  // namespace tensorflow
+
+#endif  // INTEL_MKL
diff --git a/tensorflow/core/kernels/pooling_ops_3d.cc b/tensorflow/core/kernels/pooling_ops_3d.cc
index 44861d9595d..538dca24ae6 100644
--- a/tensorflow/core/kernels/pooling_ops_3d.cc
+++ b/tensorflow/core/kernels/pooling_ops_3d.cc
@@ -580,8 +580,7 @@ struct LaunchMaxPooling3dGradGradOp<CPUDevice, T> {
         *(context->device()->tensorflow_cpu_worker_threads());
 
     auto shard = [&params, &in_mat, &out_mat, &top_diff_mat, &bottom_diff_mat](
-        int64 start, int64 limit) {
-
+                     int64 start, int64 limit) {
       const int32 depth = params.depth;
       const int32 in_planes = params.tensor_in_planes;
       const int32 in_rows = params.tensor_in_rows;
@@ -682,10 +681,9 @@ class MaxPooling3dGradGradOp : public OpKernel {
                     "Pooling is not yet supported on the batch dimension."));
     const int32 ksize_c = GetTensorDim(ksize_, data_format_, 'C');
     const int32 stride_c = GetTensorDim(stride_, data_format_, 'C');
-    OP_REQUIRES(
-        context, ksize_c == 1 && stride_c == 1,
-        errors::Unimplemented(
-            "MaxPooling3dGradGrad is not yet supported on the depth dimension."));
+    OP_REQUIRES(context, ksize_c == 1 && stride_c == 1,
+                errors::Unimplemented("MaxPooling3dGradGrad is not yet "
+                                      "supported on the depth dimension."));
   }
 
   void Compute(OpKernelContext* context) override {
@@ -703,7 +701,7 @@ class MaxPooling3dGradGradOp : public OpKernel {
         context, out_grad_backprop.dims() == 5,
         errors::InvalidArgument("out_grad_backprop must be 5-dimensional"));
 
-    Pool3dParameters params{context, ksize_, stride_,
+    Pool3dParameters params{context,  ksize_,       stride_,
                             padding_, data_format_, tensor_in.shape()};
 
     Tensor* output = nullptr;
@@ -736,12 +734,11 @@ class MaxPooling3dGradGradOp : public OpKernel {
   REGISTER_KERNEL_BUILDER(                                                 \
       Name("AvgPool3D").Device(DEVICE_##D).TypeConstraint<T>("T"),         \
       Pooling3DOp<D##Device, T, AVG>);                                     \
-  REGISTER_KERNEL_BUILDER(                                                 \
-      Name("AvgPool3DGrad")                                                \
-          .Device(DEVICE_##D)                                              \
-          .TypeConstraint<T>("T")                                          \
-          .HostMemory("orig_input_shape"),                                 \
-      AvgPooling3dGradOp<D##Device, T>);
+  REGISTER_KERNEL_BUILDER(Name("AvgPool3DGrad")                            \
+                              .Device(DEVICE_##D)                          \
+                              .TypeConstraint<T>("T")                      \
+                              .HostMemory("orig_input_shape"),             \
+                          AvgPooling3dGradOp<D##Device, T>);
 
 #define REGISTER_CPU_KERNELS(T) REGISTER_KERNELS(CPU, T)
 TF_CALL_float(REGISTER_CPU_KERNELS);
@@ -835,8 +832,7 @@ struct LaunchMaxPooling3dGradGradOp<GPUDevice, T> {
 };
 
 #define REGISTER_GPU_KERNELS(T) REGISTER_KERNELS(GPU, T)
-TF_CALL_float(REGISTER_GPU_KERNELS)
-TF_CALL_half(REGISTER_GPU_KERNELS)
+TF_CALL_float(REGISTER_GPU_KERNELS) TF_CALL_half(REGISTER_GPU_KERNELS)
 #undef REGISTER_GPU_KERNELS
 
 #endif  // GOOGLE_CUDA
diff --git a/tensorflow/core/kernels/pooling_ops_3d_gpu.cu.cc b/tensorflow/core/kernels/pooling_ops_3d_gpu.cu.cc
index 08af188b282..341a43c368e 100644
--- a/tensorflow/core/kernels/pooling_ops_3d_gpu.cu.cc
+++ b/tensorflow/core/kernels/pooling_ops_3d_gpu.cu.cc
@@ -17,8 +17,8 @@ limitations under the License.
 
 #define EIGEN_USE_GPU
 
-#include "tensorflow/core/kernels/pooling_ops_3d_gpu.h"
 #include "tensorflow/core/framework/register_types.h"
+#include "tensorflow/core/kernels/pooling_ops_3d_gpu.h"
 #include "tensorflow/core/util/cuda_kernel_helper.h"
 #include "tensorflow/core/util/tensor_format.h"
 
@@ -159,12 +159,11 @@ bool MaxPool3dGradBackward<T>::operator()(
         bottom_diff);
   }
   return d.ok();
-};
+}
 
 }  // namespace functor
 
-#define DEFINE_GPU_SPECS(T) \
-  template struct functor::MaxPool3dGradBackward<T>;
+#define DEFINE_GPU_SPECS(T) template struct functor::MaxPool3dGradBackward<T>;
 TF_CALL_GPU_NUMBER_TYPES(DEFINE_GPU_SPECS);
 #undef DEFINE_GPU_SPECS
 
diff --git a/tensorflow/core/kernels/pooling_ops_common.cc b/tensorflow/core/kernels/pooling_ops_common.cc
index 9e7314cad5e..37747a31999 100644
--- a/tensorflow/core/kernels/pooling_ops_common.cc
+++ b/tensorflow/core/kernels/pooling_ops_common.cc
@@ -373,7 +373,7 @@ void DnnPoolingGradOp<T>::Compute(
   }
 }
 
-#define DEFINE_DNN_OPS(T)       \
+#define DEFINE_DNN_OPS(T)         \
   template class DnnPoolingOp<T>; \
   template class DnnPoolingGradOp<T>;
 TF_CALL_GPU_NUMBER_TYPES(DEFINE_DNN_OPS)
diff --git a/tensorflow/core/kernels/xsmm_conv2d.cc b/tensorflow/core/kernels/xsmm_conv2d.cc
index d3a29c2e3ea..7936cbcd46f 100644
--- a/tensorflow/core/kernels/xsmm_conv2d.cc
+++ b/tensorflow/core/kernels/xsmm_conv2d.cc
@@ -35,9 +35,9 @@ void dummy_xsmm_conv2d_ensure_file_is_not_empty(void);
 #include "tensorflow/core/lib/core/blocking_counter.h"
 #include "tensorflow/core/lib/core/threadpool.h"
 
+#include "libxsmm_main.h"  // TODO(bsteiner): API to avoid incl. header from src/
 #include "include/libxsmm_cpuid.h"
 #include "include/libxsmm_malloc.h"
-#include "libxsmm_main.h" // TODO: API to avoid incl. header from src/
 
 namespace tensorflow {
 
@@ -72,7 +72,6 @@ bool CanUseXsmmConv2D(const libxsmm_dnn_conv_desc& desc,
   return true;
 }
 
-
 typedef Eigen::ThreadPoolDevice CPUDevice;
 
 namespace functor {
@@ -83,25 +82,34 @@ static void chk_libxsmm_err(libxsmm_dnn_err_t status, string msg) {
   }
 }
 
-LIBXSMM_INLINE void copy_RSCK_to_custom(const float* rsck, float *kcrs, int R, int S, int C, int K,int blocksifm, int blocksofm, int ifmblock,int ofmblock, int start, int end)
-{
-  LIBXSMM_VLA_DECL(4, const      float, input, rsck, S, C,K);
-  LIBXSMM_VLA_DECL(6, float, output, kcrs, blocksifm,R,S,ifmblock, ofmblock);
-  int r, s, k,c, v1,v2;
-  
-  for (k = start; k < end ; k++ ) { 
-    for(c = 0; c < blocksifm;c++){
-      for ( r = 0; r < R; r++ ) {
-        for ( s = 0; s < S; s++ ){
-          for ( v1 = c*ifmblock; v1 < std::min(C,(c+1)*ifmblock) ; v1++ ) {
-            for ( v2 = k*ofmblock; v2 < std::min(K, (k+1)*ofmblock); v2++ )
-              LIBXSMM_VLA_ACCESS(6,  output, k,c, r, s,v1- c*ifmblock,v2-k*ofmblock, blocksifm, R, S,ifmblock,ofmblock) = LIBXSMM_VLA_ACCESS(4, input, r, s, v1, v2,  S, C, K);
-            for ( v2 = K; v2 < (k+1)*ofmblock ; v2++ )
-              LIBXSMM_VLA_ACCESS(6,  output, k,c, r, s,v1- c*ifmblock,v2-k*ofmblock, blocksifm, R, S,ifmblock,ofmblock) = 0.0f; 
-            }
-          for ( v1 = C; v1 < (c+1)*ifmblock ; v1++ ) {
-            for ( v2 = k*ofmblock; v2 < (k+1)*ofmblock; v2++ )
-              LIBXSMM_VLA_ACCESS(6,  output, k,c, r, s,v1- c*ifmblock,v2-k*ofmblock, blocksifm, R, S,ifmblock,ofmblock) = 0.0f;
+LIBXSMM_INLINE void copy_RSCK_to_custom(const float* rsck, float* kcrs, int R,
+                                        int S, int C, int K, int blocksifm,
+                                        int blocksofm, int ifmblock,
+                                        int ofmblock, int start, int end) {
+  LIBXSMM_VLA_DECL(4, const float, input, rsck, S, C, K);
+  LIBXSMM_VLA_DECL(6, float, output, kcrs, blocksifm, R, S, ifmblock, ofmblock);
+  int r, s, k, c, v1, v2;
+
+  for (k = start; k < end; k++) {
+    for (c = 0; c < blocksifm; c++) {
+      for (r = 0; r < R; r++) {
+        for (s = 0; s < S; s++) {
+          for (v1 = c * ifmblock; v1 < std::min(C, (c + 1) * ifmblock); v1++) {
+            for (v2 = k * ofmblock; v2 < std::min(K, (k + 1) * ofmblock); v2++)
+              LIBXSMM_VLA_ACCESS(6, output, k, c, r, s, v1 - c * ifmblock,
+                                 v2 - k * ofmblock, blocksifm, R, S, ifmblock,
+                                 ofmblock) =
+                  LIBXSMM_VLA_ACCESS(4, input, r, s, v1, v2, S, C, K);
+            for (v2 = K; v2 < (k + 1) * ofmblock; v2++)
+              LIBXSMM_VLA_ACCESS(6, output, k, c, r, s, v1 - c * ifmblock,
+                                 v2 - k * ofmblock, blocksifm, R, S, ifmblock,
+                                 ofmblock) = 0.0f;
+          }
+          for (v1 = C; v1 < (c + 1) * ifmblock; v1++) {
+            for (v2 = k * ofmblock; v2 < (k + 1) * ofmblock; v2++)
+              LIBXSMM_VLA_ACCESS(6, output, k, c, r, s, v1 - c * ifmblock,
+                                 v2 - k * ofmblock, blocksifm, R, S, ifmblock,
+                                 ofmblock) = 0.0f;
           }
         }
       }
@@ -109,47 +117,28 @@ LIBXSMM_INLINE void copy_RSCK_to_custom(const float* rsck, float *kcrs, int R, i
   }
 }
 
- 
+class libxsmm_dnn_conv_desc_wrap {
+ public:
+  const libxsmm_dnn_conv_desc d;
 
-class libxsmm_dnn_conv_desc_wrap{
-  public:
-    const libxsmm_dnn_conv_desc d;
- 
-    libxsmm_dnn_conv_desc_wrap(const libxsmm_dnn_conv_desc &d_) : d(d_){
-    }
-    bool operator==(const libxsmm_dnn_conv_desc_wrap  &w) const{
-      return( d.N == w.d.N &&
-              d.C == w.d.C &&
-              d.H == w.d.H &&
-              d.W == w.d.W &&
-              d.K == w.d.K &&
-              d.R == w.d.R &&
-              d.S == w.d.S &&
-              d.u == w.d.u &&
-              d.v == w.d.v &&
-              d.pad_h == w.d.pad_h &&
-              d.pad_w == w.d.pad_w
-            );
-    }
+  libxsmm_dnn_conv_desc_wrap(const libxsmm_dnn_conv_desc& d_) : d(d_) {}
+  bool operator==(const libxsmm_dnn_conv_desc_wrap& w) const {
+    return (d.N == w.d.N && d.C == w.d.C && d.H == w.d.H && d.W == w.d.W &&
+            d.K == w.d.K && d.R == w.d.R && d.S == w.d.S && d.u == w.d.u &&
+            d.v == w.d.v && d.pad_h == w.d.pad_h && d.pad_w == w.d.pad_w);
+  }
 };
- 
- 
-struct HashFunction{
-  std::size_t operator()(const libxsmm_dnn_conv_desc_wrap & w) const{
-   
-    
 
+struct HashFunction {
+  std::size_t operator()(const libxsmm_dnn_conv_desc_wrap& w) const {
+    // unsigned char ptr[sizeof(&w.d)];
 
-    //unsigned char ptr[sizeof(&w.d)];
-
-    
-    //memcpy(ptr, (unsigned char *)&w.d, sizeof(&w.d))
-                                       
+    // memcpy(ptr, (unsigned char *)&w.d, sizeof(&w.d))
 
     //
     /*
     std::ostringstream N,C,H,W,K,R,S,u,v,padh,padw;
- 
+
     N << w.d.N; C << w.d.C;
     H << w.d.H; W << w.d.W;
     K << w.d.K; R << w.d.R;
@@ -167,47 +156,53 @@ struct HashFunction{
     //
     //
     */
-    return ( std::hash<unsigned long long>()((unsigned long long)&(w.d)));
+    return (std::hash<unsigned long long>()((unsigned long long)&(w.d)));
   }
 };
- 
-class handles{
-  public:
-    libxsmm_dnn_layer* find( const libxsmm_dnn_conv_desc_wrap &w) {
-      std::unordered_map<libxsmm_dnn_conv_desc_wrap , libxsmm_dnn_layer*, HashFunction>::iterator i = libxsmm_handles.find(w);
-      if (i == libxsmm_handles.end()){
-        libxsmm_dnn_err_t status;
-        libxsmm_dnn_layer* libxsmm_handle = libxsmm_dnn_create_conv_layer(w.d, &status);
-        chk_libxsmm_err(status, "Create handle");
-        libxsmm_handles.insert(std::make_pair(w, libxsmm_handle));
-        return libxsmm_handle;
-      }
-      else
-        return i->second;
+
+class handles {
+ public:
+  libxsmm_dnn_layer* find(const libxsmm_dnn_conv_desc_wrap& w) {
+    std::unordered_map<libxsmm_dnn_conv_desc_wrap, libxsmm_dnn_layer*,
+                       HashFunction>::iterator i = libxsmm_handles.find(w);
+    if (i == libxsmm_handles.end()) {
+      libxsmm_dnn_err_t status;
+      libxsmm_dnn_layer* libxsmm_handle =
+          libxsmm_dnn_create_conv_layer(w.d, &status);
+      chk_libxsmm_err(status, "Create handle");
+      libxsmm_handles.insert(std::make_pair(w, libxsmm_handle));
+      return libxsmm_handle;
+    } else {
+      return i->second;
     }
-   ~handles(){
-    std::unordered_map<libxsmm_dnn_conv_desc_wrap , libxsmm_dnn_layer*, HashFunction>::iterator i;
-    for (i= libxsmm_handles.begin(); i != libxsmm_handles.end(); i++)
+  }
+  ~handles() {
+    std::unordered_map<libxsmm_dnn_conv_desc_wrap, libxsmm_dnn_layer*,
+                       HashFunction>::iterator i;
+    for (i = libxsmm_handles.begin(); i != libxsmm_handles.end(); i++)
       chk_libxsmm_err(libxsmm_dnn_destroy_conv_layer(i->second),
-                    "Destroy handle");
-    }
-  private:
- 
-    std::unordered_map<libxsmm_dnn_conv_desc_wrap , libxsmm_dnn_layer*, HashFunction> libxsmm_handles;
- 
+                      "Destroy handle");
+  }
+
+ private:
+  std::unordered_map<libxsmm_dnn_conv_desc_wrap, libxsmm_dnn_layer*,
+                     HashFunction>
+      libxsmm_handles;
 };
 
 static handles libxsmm_handles;
 
-//#define LIBXSMM_DETAILED_TIMING
+// #define LIBXSMM_DETAILED_TIMING
 
 template <typename InputPtr, typename FilterPtr, typename OutputPtr>
 static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
                                    const libxsmm_dnn_conv_desc& desc,
-                                   libxsmm_dnn_compute_kind kind, InputPtr input,
-                                   FilterPtr filter, OutputPtr output) {
+                                   libxsmm_dnn_compute_kind kind,
+                                   InputPtr input, FilterPtr filter,
+                                   OutputPtr output) {
 #if defined(LIBXSMM_DETAILED_TIMING)
-  unsigned long long l_tick1, l_tick2, l_tick3, l_tick4, l_tick5, l_tick6, l_tick7, l_tick8, l_tick9, l_tick10;
+  unsigned long long l_tick1, l_tick2, l_tick3, l_tick4, l_tick5, l_tick6,
+      l_tick7, l_tick8, l_tick9, l_tick10;
   l_tick1 = libxsmm_timer_tick();
 #endif
   // setup scoped allocator, which adopts the allocator from the context
@@ -216,14 +211,14 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
   libxsmm_dnn_layer* libxsmm_handle;
   libxsmm_dnn_conv_desc_wrap w(desc);
   void* scratch;
- 
-  //if(kind == LIBXSMM_DNN_COMPUTE_KIND_FWD)
+
+  // if(kind == LIBXSMM_DNN_COMPUTE_KIND_FWD)
   libxsmm_handle = libxsmm_handles.find(w);
-  //else{
+  // else{
   //  libxsmm_handle = libxsmm_dnn_create_conv_layer(desc, &status);
   //  chk_libxsmm_err(status, "Create handle");
   //}
-  
+
   status = libxsmm_dnn_get_codegen_success(libxsmm_handle, kind);
   if (status == LIBXSMM_DNN_WARN_FALLBACK) {
     chk_libxsmm_err(libxsmm_dnn_destroy_conv_layer(libxsmm_handle),
@@ -241,12 +236,16 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 #endif
 
   int ifmblock = (libxsmm_handle->ifmblock);
-  int ofmblock = (libxsmm_handle->ofmblock); 
+  int ofmblock = (libxsmm_handle->ofmblock);
 
-  int blocksifm = desc.C%ifmblock ==0 ? desc.C/ifmblock :desc.C/ifmblock + 1;           
-  int blocksofm = desc.K%ofmblock ==0 ? desc.K/ofmblock :desc.K/ofmblock + 1;
-  float *native_filter = (float*)libxsmm_aligned_scratch( blocksofm*blocksifm*desc.R*desc.S*ifmblock*ofmblock*sizeof(float), 2097152);
- 
+  int blocksifm =
+      desc.C % ifmblock == 0 ? desc.C / ifmblock : desc.C / ifmblock + 1;
+  int blocksofm =
+      desc.K % ofmblock == 0 ? desc.K / ofmblock : desc.K / ofmblock + 1;
+  float* native_filter =
+      (float*)libxsmm_aligned_scratch(blocksofm * blocksifm * desc.R * desc.S *
+                                          ifmblock * ofmblock * sizeof(float),
+                                      2097152);
 
   const DeviceBase::CpuWorkerThreads* worker_threads =
       ctx->device()->tensorflow_cpu_worker_threads();
@@ -254,90 +253,111 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
   int num_threads = worker_threads->num_threads;
 
 #if 1
-  if(kind ==  LIBXSMM_DNN_COMPUTE_KIND_FWD || kind ==  LIBXSMM_DNN_COMPUTE_KIND_BWD){
-    if(blocksofm > num_threads){
+  if (kind == LIBXSMM_DNN_COMPUTE_KIND_FWD ||
+      kind == LIBXSMM_DNN_COMPUTE_KIND_BWD) {
+    if (blocksofm > num_threads) {
       int work = blocksofm;
       BlockingCounter count(num_threads);
       for (int i = 0; i < num_threads; ++i) {
         worker_threads->workers->Schedule([=, &count]() {
-        int start = work/num_threads*i;
-        int end =  (start + work/num_threads) > work ? work: start + work/num_threads;
-        copy_RSCK_to_custom(filter, native_filter, desc.R, desc.S,desc.C, desc.K,blocksifm,blocksofm,ifmblock,ofmblock,start, end);
-        count.DecrementCount();
+          int start = work / num_threads * i;
+          int end = (start + work / num_threads) > work
+                        ? work
+                        : start + work / num_threads;
+          copy_RSCK_to_custom(filter, native_filter, desc.R, desc.S, desc.C,
+                              desc.K, blocksifm, blocksofm, ifmblock, ofmblock,
+                              start, end);
+          count.DecrementCount();
         });
       }
       count.Wait();
-    }
-    else{
- 
+    } else {
       int work = blocksofm;
       int num_threads = work;
- 
+
       BlockingCounter count(num_threads);
       for (int i = 0; i < num_threads; ++i) {
         worker_threads->workers->Schedule([=, &count]() {
-        int start = i;
-        int end =  i+1;
-        copy_RSCK_to_custom(filter, native_filter, desc.R, desc.S,desc.C, desc.K,blocksifm,blocksofm,ifmblock,ofmblock, start, end);
-        count.DecrementCount();
+          int start = i;
+          int end = i + 1;
+          copy_RSCK_to_custom(filter, native_filter, desc.R, desc.S, desc.C,
+                              desc.K, blocksifm, blocksofm, ifmblock, ofmblock,
+                              start, end);
+          count.DecrementCount();
         });
       }
       count.Wait();
     }
-  }
-  //Added: for weight update
-  else if (kind == LIBXSMM_DNN_COMPUTE_KIND_UPD){
-    libxsmm_filter = libxsmm_dnn_link_filter(libxsmm_handle, LIBXSMM_DNN_FILTER, filter, LIBXSMM_DNN_TENSOR_FORMAT_RSCK_PTR, &status);
-    chk_libxsmm_err(status, "Link filter");//weight update is in RSCK as filter should be returned in RSCK format
+  } else if (kind == LIBXSMM_DNN_COMPUTE_KIND_UPD) {
+    // Added: for weight update
+    libxsmm_filter =
+        libxsmm_dnn_link_filter(libxsmm_handle, LIBXSMM_DNN_FILTER, filter,
+                                LIBXSMM_DNN_TENSOR_FORMAT_RSCK_PTR, &status);
+    chk_libxsmm_err(status,
+                    "Link filter");  // weight update is in RSCK as
+                                     // filter should be returned in RSCK
+                                     // format
   }
 #else
-  memset( native_filter, 0, blocksofm*blocksifm*desc.R*desc.S*ifmblock*ofmblock*sizeof(float));
+  memset(native_filter, 0,
+         blocksofm * blocksifm * desc.R * desc.S * ifmblock * ofmblock *
+             sizeof(float));
 #endif
 
 #if defined(LIBXSMM_DETAILED_TIMING)
   l_tick3 = libxsmm_timer_tick();
 #endif
 
-  libxsmm_input = libxsmm_dnn_link_buffer(
-      libxsmm_handle, LIBXSMM_DNN_INPUT, input, LIBXSMM_DNN_TENSOR_FORMAT_NHWC_PTR, &status);
+  libxsmm_input =
+      libxsmm_dnn_link_buffer(libxsmm_handle, LIBXSMM_DNN_INPUT, input,
+                              LIBXSMM_DNN_TENSOR_FORMAT_NHWC_PTR, &status);
   chk_libxsmm_err(status, "Link input buffer");
-  libxsmm_output = libxsmm_dnn_link_buffer(
-      libxsmm_handle, LIBXSMM_DNN_OUTPUT, output, LIBXSMM_DNN_TENSOR_FORMAT_NHWC_PTR, &status);
+  libxsmm_output =
+      libxsmm_dnn_link_buffer(libxsmm_handle, LIBXSMM_DNN_OUTPUT, output,
+                              LIBXSMM_DNN_TENSOR_FORMAT_NHWC_PTR, &status);
   chk_libxsmm_err(status, "Link output buffer");
-  if(kind == LIBXSMM_DNN_COMPUTE_KIND_FWD || kind == LIBXSMM_DNN_COMPUTE_KIND_BWD){
-  libxsmm_filter = libxsmm_dnn_link_filter(
-      libxsmm_handle, LIBXSMM_DNN_FILTER, native_filter, LIBXSMM_DNN_TENSOR_FORMAT_LIBXSMM_PTR, &status);
-  chk_libxsmm_err(status, "Link filter");
+  if (kind == LIBXSMM_DNN_COMPUTE_KIND_FWD ||
+      kind == LIBXSMM_DNN_COMPUTE_KIND_BWD) {
+    libxsmm_filter = libxsmm_dnn_link_filter(
+        libxsmm_handle, LIBXSMM_DNN_FILTER, native_filter,
+        LIBXSMM_DNN_TENSOR_FORMAT_LIBXSMM_PTR, &status);
+    chk_libxsmm_err(status, "Link filter");
   }
   if (kind == LIBXSMM_DNN_COMPUTE_KIND_FWD) {
     chk_libxsmm_err(libxsmm_dnn_zero_buffer(libxsmm_output), "Zero output");
 
-    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input, LIBXSMM_DNN_REGULAR_INPUT),
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input,
+                                            LIBXSMM_DNN_REGULAR_INPUT),
                     "Bind input forward");
-    chk_libxsmm_err(
-        libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output, LIBXSMM_DNN_REGULAR_OUTPUT),
-        "Bind output forward");
-    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter, LIBXSMM_DNN_REGULAR_FILTER),
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output,
+                                            LIBXSMM_DNN_REGULAR_OUTPUT),
+                    "Bind output forward");
+    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter,
+                                            LIBXSMM_DNN_REGULAR_FILTER),
                     "Bind filter forward");
   } else if (kind == LIBXSMM_DNN_COMPUTE_KIND_BWD) {
     chk_libxsmm_err(libxsmm_dnn_zero_buffer(libxsmm_input), "Zero input");
 
-    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input, LIBXSMM_DNN_GRADIENT_INPUT),
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input,
+                                            LIBXSMM_DNN_GRADIENT_INPUT),
                     "Bind input backward");
-    chk_libxsmm_err(
-        libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output, LIBXSMM_DNN_GRADIENT_OUTPUT),
-        "Bind output backward");
-    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter, LIBXSMM_DNN_REGULAR_FILTER),
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output,
+                                            LIBXSMM_DNN_GRADIENT_OUTPUT),
+                    "Bind output backward");
+    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter,
+                                            LIBXSMM_DNN_REGULAR_FILTER),
                     "Bind filter backward");
   } else if (kind == LIBXSMM_DNN_COMPUTE_KIND_UPD) {
     chk_libxsmm_err(libxsmm_dnn_zero_filter(libxsmm_filter), "Zero filter");
 
-    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input, LIBXSMM_DNN_REGULAR_INPUT),
-                    "Bind input weight udpate");
-    chk_libxsmm_err(
-        libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output, LIBXSMM_DNN_GRADIENT_OUTPUT),
-        "Bind output weight update");
-    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter, LIBXSMM_DNN_GRADIENT_FILTER),
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_input,
+                                            LIBXSMM_DNN_REGULAR_INPUT),
+                    "Bind input weight update");
+    chk_libxsmm_err(libxsmm_dnn_bind_buffer(libxsmm_handle, libxsmm_output,
+                                            LIBXSMM_DNN_GRADIENT_OUTPUT),
+                    "Bind output weight update");
+    chk_libxsmm_err(libxsmm_dnn_bind_filter(libxsmm_handle, libxsmm_filter,
+                                            LIBXSMM_DNN_GRADIENT_FILTER),
                     "Bind filter weight update");
   } else {
     /* shouldn't happen */
@@ -348,9 +368,14 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 #endif
 
   /* bind scratch */
-  scratch = (void*)libxsmm_aligned_scratch( libxsmm_dnn_get_scratch_size( libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL, &status ), 2097152);
-  chk_libxsmm_err( status, "scratch allocation" );
-  chk_libxsmm_err( libxsmm_dnn_bind_scratch( libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL, scratch ), "binding scratch" );
+  scratch = (void*)libxsmm_aligned_scratch(
+      libxsmm_dnn_get_scratch_size(libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL,
+                                   &status),
+      2097152);
+  chk_libxsmm_err(status, "scratch allocation");
+  chk_libxsmm_err(libxsmm_dnn_bind_scratch(
+                      libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL, scratch),
+                  "binding scratch");
 
 #if defined(LIBXSMM_DETAILED_TIMING)
   l_tick5 = libxsmm_timer_tick();
@@ -366,7 +391,7 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 
 #if 1
   BlockingCounter counter(num_threads);
-  
+
   for (int i = 0; i < num_threads; ++i) {
     worker_threads->workers->Schedule([=, &counter]() {
       chk_libxsmm_err(libxsmm_dnn_execute_st(libxsmm_handle, kind, 0, i),
@@ -376,9 +401,11 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
   }
   counter.Wait();
 #else
-  #pragma omp parallel
+#pragma omp parallel
   {
-    chk_libxsmm_err(libxsmm_dnn_execute_st(libxsmm_handle, kind, 0, omp_get_thread_num()), "Worker");
+    chk_libxsmm_err(
+        libxsmm_dnn_execute_st(libxsmm_handle, kind, 0, omp_get_thread_num()),
+        "Worker");
   }
 #endif
 
@@ -387,7 +414,7 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 #endif
 
   if (kind == LIBXSMM_DNN_COMPUTE_KIND_UPD) {
-    libxsmm_dnn_reduce_wu_filters( libxsmm_handle, LIBXSMM_DNN_GRADIENT_FILTER );
+    libxsmm_dnn_reduce_wu_filters(libxsmm_handle, LIBXSMM_DNN_GRADIENT_FILTER);
   }
 
 #if defined(LIBXSMM_DETAILED_TIMING)
@@ -395,19 +422,39 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 #endif
 
   /* clean up */
-  chk_libxsmm_err( libxsmm_dnn_release_scratch( libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL ), "release scratch" );
+  chk_libxsmm_err(
+      libxsmm_dnn_release_scratch(libxsmm_handle, LIBXSMM_DNN_COMPUTE_KIND_ALL),
+      "release scratch");
   if (kind == LIBXSMM_DNN_COMPUTE_KIND_FWD) {
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_REGULAR_INPUT ), "release input" );
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_REGULAR_OUTPUT ), "release output" );
-    chk_libxsmm_err( libxsmm_dnn_release_filter( libxsmm_handle, LIBXSMM_DNN_REGULAR_FILTER ), "release filter" );
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_REGULAR_INPUT),
+        "release input");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_REGULAR_OUTPUT),
+        "release output");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_filter(libxsmm_handle, LIBXSMM_DNN_REGULAR_FILTER),
+        "release filter");
   } else if (kind == LIBXSMM_DNN_COMPUTE_KIND_BWD) {
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_GRADIENT_INPUT ), "release input" );
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_GRADIENT_OUTPUT ), "release output" );
-    chk_libxsmm_err( libxsmm_dnn_release_filter( libxsmm_handle, LIBXSMM_DNN_REGULAR_FILTER ), "release filter" );
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_GRADIENT_INPUT),
+        "release input");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_GRADIENT_OUTPUT),
+        "release output");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_filter(libxsmm_handle, LIBXSMM_DNN_REGULAR_FILTER),
+        "release filter");
   } else if (kind == LIBXSMM_DNN_COMPUTE_KIND_UPD) {
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_REGULAR_INPUT ), "release input" );
-    chk_libxsmm_err( libxsmm_dnn_release_buffer( libxsmm_handle, LIBXSMM_DNN_GRADIENT_OUTPUT ), "release output" );
-    chk_libxsmm_err( libxsmm_dnn_release_filter( libxsmm_handle, LIBXSMM_DNN_GRADIENT_FILTER ), "release filter" );
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_REGULAR_INPUT),
+        "release input");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_buffer(libxsmm_handle, LIBXSMM_DNN_GRADIENT_OUTPUT),
+        "release output");
+    chk_libxsmm_err(
+        libxsmm_dnn_release_filter(libxsmm_handle, LIBXSMM_DNN_GRADIENT_FILTER),
+        "release filter");
   } else {
     /* shouldn't happen */
   }
@@ -418,9 +465,9 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 #if defined(LIBXSMM_DETAILED_TIMING)
   l_tick9 = libxsmm_timer_tick();
 #endif
-  
-  //if(kind != LIBXSMM_DNN_COMPUTE_KIND_FWD)
-  //chk_libxsmm_err(libxsmm_dnn_destroy_conv_layer(libxsmm_handle),
+
+  // if(kind != LIBXSMM_DNN_COMPUTE_KIND_FWD)
+  // chk_libxsmm_err(libxsmm_dnn_destroy_conv_layer(libxsmm_handle),
   //               "Destroy handle");
 
   libxsmm_free(native_filter);
@@ -428,17 +475,20 @@ static bool CallLibxsmmConvGeneric(OpKernelContext* ctx,
 
 #if defined(LIBXSMM_DETAILED_TIMING)
   l_tick10 = libxsmm_timer_tick();
-  printf("time for convolution (%i, %i, %i, %i, %i): %f, %f, %f, %f, %f, %f, %f, %f, %f, %f\n", desc.N, desc.C, desc.K, desc.R, desc.S, 
-                                                                                      libxsmm_timer_duration(l_tick1, l_tick2),
-                                                                                      libxsmm_timer_duration(l_tick2, l_tick3),
-                                                                                      libxsmm_timer_duration(l_tick3, l_tick4),
-                                                                                      libxsmm_timer_duration(l_tick4, l_tick5),
-                                                                                      libxsmm_timer_duration(l_tick5, l_tick6),
-                                                                                      libxsmm_timer_duration(l_tick6, l_tick7),
-                                                                                      libxsmm_timer_duration(l_tick7, l_tick8),
-                                                                                      libxsmm_timer_duration(l_tick8, l_tick9),
-                                                                                      libxsmm_timer_duration(l_tick9, l_tick10),
-                                                                                      libxsmm_timer_duration(l_tick1, l_tick10)  );
+  printf(
+      "time for convolution (%i, %i, %i, %i, %i): %f, %f, %f, %f, %f, %f, %f, "
+      "%f, %f, %f\n",
+      desc.N, desc.C, desc.K, desc.R, desc.S,
+      libxsmm_timer_duration(l_tick1, l_tick2),
+      libxsmm_timer_duration(l_tick2, l_tick3),
+      libxsmm_timer_duration(l_tick3, l_tick4),
+      libxsmm_timer_duration(l_tick4, l_tick5),
+      libxsmm_timer_duration(l_tick5, l_tick6),
+      libxsmm_timer_duration(l_tick6, l_tick7),
+      libxsmm_timer_duration(l_tick7, l_tick8),
+      libxsmm_timer_duration(l_tick8, l_tick9),
+      libxsmm_timer_duration(l_tick9, l_tick10),
+      libxsmm_timer_duration(l_tick1, l_tick10));
 #endif
 
   return true;  // Succeeded
@@ -448,8 +498,8 @@ template <typename T>
 struct XsmmFwdConv2D<CPUDevice, T> {
   bool operator()(OpKernelContext* ctx, const libxsmm_dnn_conv_desc& desc,
                   const T* input, const T* filter, T* output) {
-    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_FWD, input,
-                                  filter, output);
+    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_FWD,
+                                  input, filter, output);
   }
 };
 
@@ -457,8 +507,8 @@ template <typename T>
 struct XsmmBkwInputConv2D<CPUDevice, T> {
   bool operator()(OpKernelContext* ctx, const libxsmm_dnn_conv_desc& desc,
                   T* input, const T* filter, const T* output) {
-    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_BWD, input,
-                                  filter, output);
+    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_BWD,
+                                  input, filter, output);
   }
 };
 
@@ -466,8 +516,8 @@ template <typename T>
 struct XsmmBkwFilterConv2D<CPUDevice, T> {
   bool operator()(OpKernelContext* ctx, const libxsmm_dnn_conv_desc& desc,
                   const T* input, T* filter, const T* output) {
-    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_UPD, input,
-                                  filter, output);
+    return CallLibxsmmConvGeneric(ctx, desc, LIBXSMM_DNN_COMPUTE_KIND_UPD,
+                                  input, filter, output);
   }
 };
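
The xsmm_conv2d hunks above only reformat existing logic, but the logic itself is a recurring pattern: shard a range of filter blocks [0, work) across a fixed pool of workers, have each worker copy its [start, end) slice, and block until all workers have signalled completion. A minimal standalone sketch of that pattern, with std::thread standing in for worker_threads->workers->Schedule, join() standing in for BlockingCounter::Wait, and ceiling division mirroring the blocksifm/blocksofm computation; all names below are hypothetical, not taken from the patch:

    #include <algorithm>
    #include <functional>
    #include <thread>
    #include <vector>

    // Placeholder for the real per-shard work (copy_RSCK_to_custom in the patch).
    void CopyBlocks(std::vector<int>& out, int start, int end) {
      for (int k = start; k < end; ++k) out[k] = k;
    }

    int main() {
      const int work = 10;        // e.g. blocksofm
      const int num_threads = 4;  // e.g. worker_threads->num_threads
      std::vector<int> out(work, -1);
      std::vector<std::thread> workers;
      // Ceiling division so the last shard also covers any remainder.
      const int chunk = (work + num_threads - 1) / num_threads;
      for (int i = 0; i < num_threads; ++i) {
        const int start = std::min(work, i * chunk);
        const int end = std::min(work, start + chunk);
        workers.emplace_back(CopyBlocks, std::ref(out), start, end);
      }
      for (auto& t : workers) t.join();  // wait for all shards to finish
      return 0;
    }
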
 
diff --git a/tensorflow/core/lib/io/inputbuffer.cc b/tensorflow/core/lib/io/inputbuffer.cc
index 750737a62d0..7efe2dc5434 100644
--- a/tensorflow/core/lib/io/inputbuffer.cc
+++ b/tensorflow/core/lib/io/inputbuffer.cc
@@ -47,7 +47,7 @@ Status InputBuffer::ReadLine(string* result) {
   Status s;
   do {
     size_t buf_remain = limit_ - pos_;
-    char* newline = (char*)memchr(pos_, '\n', buf_remain);
+    char* newline = static_cast<char*>(memchr(pos_, '\n', buf_remain));
     if (newline != nullptr) {
       size_t result_len = newline - pos_;
       result->append(pos_, result_len);
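
The one-line change above swaps a C-style cast for static_cast: memchr returns void*, and static_cast makes the pointer conversion explicit without permitting const-stripping or reinterpretation. A tiny self-contained sketch of the same scan step, with a hypothetical helper name:

    #include <cstring>
    #include <string>

    // Hypothetical helper mirroring the newline scan in InputBuffer::ReadLine.
    static const char* FindNewline(const char* pos, std::size_t remain) {
      return static_cast<const char*>(std::memchr(pos, '\n', remain));
    }

    int main() {
      std::string buf = "first line\nsecond line";
      const char* nl = FindNewline(buf.data(), buf.size());
      return (nl != nullptr && *nl == '\n') ? 0 : 1;
    }
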
diff --git a/tensorflow/core/lib/png/png_io.cc b/tensorflow/core/lib/png/png_io.cc
index bdc39e5d6f7..961a78f83b1 100644
--- a/tensorflow/core/lib/png/png_io.cc
+++ b/tensorflow/core/lib/png/png_io.cc
@@ -17,6 +17,7 @@ limitations under the License.
 
 #include <string.h>
 #include <sys/types.h>
+#include <zlib.h>
 #include <string>
 #include <utility>
 #include <vector>
@@ -152,7 +153,10 @@ bool DecodeHeader(StringPiece png_string, int* width, int* height,
   if (components != NULL) {
     switch (context.color_type) {
       case PNG_COLOR_TYPE_PALETTE:
-        *components = (context.info_ptr->valid & PNG_INFO_tRNS) ? 4 : 3;
+        *components =
+            (png_get_valid(context.png_ptr, context.info_ptr, PNG_INFO_tRNS))
+                ? 4
+                : 3;
         break;
       case PNG_COLOR_TYPE_GRAY:
         *components = 1;
@@ -176,8 +180,11 @@ bool DecodeHeader(StringPiece png_string, int* width, int* height,
   }
   if (metadata != NULL) {
     metadata->clear();
-    for (int i = 0; i < context.info_ptr->num_text; i++) {
-      const png_text& text = context.info_ptr->text[i];
+    png_textp text_ptr = NULL;
+    int num_text = 0;
+    png_get_text(context.png_ptr, context.info_ptr, &text_ptr, &num_text);
+    for (int i = 0; i < num_text; i++) {
+      const png_text& text = text_ptr[i];
       metadata->push_back(std::make_pair(text.key, text.text));
     }
   }
@@ -228,9 +235,10 @@ bool CommonInitDecode(StringPiece png_string, int desired_channels,
     return false;
   }
   if (context->channels == 0) {  // Autodetect number of channels
-    context->channels = context->info_ptr->channels;
+    context->channels = png_get_channels(context->png_ptr, context->info_ptr);
   }
-  const bool has_tRNS = (context->info_ptr->valid & PNG_INFO_tRNS) != 0;
+  const bool has_tRNS =
+      (png_get_valid(context->png_ptr, context->info_ptr, PNG_INFO_tRNS)) != 0;
   const bool has_alpha = (context->color_type & PNG_COLOR_MASK_ALPHA) != 0;
   if ((context->channels & 1) == 0) {  // We desire alpha
     if (has_alpha) {                   // There is alpha
@@ -268,7 +276,9 @@ bool CommonInitDecode(StringPiece png_string, int desired_channels,
   const bool want_gray = (context->channels < 3);
   const bool is_gray = !(context->color_type & PNG_COLOR_MASK_COLOR);
   if (is_gray) {  // upconvert gray to 8-bit if needed.
-    if (context->bit_depth < 8) png_set_gray_1_2_4_to_8(context->png_ptr);
+    if (context->bit_depth < 8) {
+      png_set_expand_gray_1_2_4_to_8(context->png_ptr);
+    }
   }
   if (want_gray) {  // output is grayscale
     if (!is_gray)
@@ -301,7 +311,9 @@ bool CommonFinishDecode(png_bytep data, int row_bytes, DecodeContext* context) {
     }
   }
 
-  context->info_ptr->valid |= PNG_INFO_IDAT;
+  // Marks iDAT as valid.
+  png_set_rows(context->png_ptr, context->info_ptr,
+               png_get_rows(context->png_ptr, context->info_ptr));
   png_read_end(context->png_ptr, context->info_ptr);
 
   // Clean up.
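
The png_io.cc hunks above replace direct reads of libpng's internal info_ptr fields with the public accessor API, which is required once png_info becomes an opaque type in newer libpng releases. A minimal sketch of the accessors involved (png_get_valid, png_get_channels, png_get_text are the public libpng API; the helper functions wrapping them here are hypothetical and assume png_ptr/info_ptr were already populated by png_read_info):

    #include <png.h>

    // 4 channels if a transparency (tRNS) chunk is present, otherwise 3.
    int ComponentsForPalette(png_structp png_ptr, png_infop info_ptr) {
      return png_get_valid(png_ptr, info_ptr, PNG_INFO_tRNS) ? 4 : 3;
    }

    // Channel count as recorded in the header, instead of info_ptr->channels.
    int ChannelCount(png_structp png_ptr, png_infop info_ptr) {
      return png_get_channels(png_ptr, info_ptr);
    }

    // Text metadata, instead of iterating info_ptr->text directly.
    void VisitTextMetadata(png_structp png_ptr, png_infop info_ptr) {
      png_textp text_ptr = nullptr;
      int num_text = 0;
      png_get_text(png_ptr, info_ptr, &text_ptr, &num_text);
      for (int i = 0; i < num_text; ++i) {
        // text_ptr[i].key and text_ptr[i].text hold one key/value entry.
      }
    }
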
diff --git a/tensorflow/core/ops/compat/ops_history.v1.pbtxt b/tensorflow/core/ops/compat/ops_history.v1.pbtxt
index 2af4e8692b2..c538a278a30 100644
--- a/tensorflow/core/ops/compat/ops_history.v1.pbtxt
+++ b/tensorflow/core/ops/compat/ops_history.v1.pbtxt
@@ -1782,6 +1782,69 @@ op {
     }
   }
 }
+op {
+  name: "AvgPool"
+  input_arg {
+    name: "value"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
 op {
   name: "AvgPool3D"
   input_arg {
@@ -2097,6 +2160,73 @@ op {
     }
   }
 }
+op {
+  name: "AvgPoolGrad"
+  input_arg {
+    name: "orig_input_shape"
+    type: DT_INT32
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
 op {
   name: "Barrier"
   output_arg {
@@ -9755,6 +9885,72 @@ op {
     }
   }
 }
+op {
+  name: "MaxPool"
+  input_arg {
+    name: "input"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "T"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+}
 op {
   name: "MaxPool3D"
   input_arg {
@@ -10017,6 +10213,181 @@ op {
     }
   }
 }
+op {
+  name: "MaxPool3DGrad"
+  input_arg {
+    name: "orig_input"
+    type_attr: "TInput"
+  }
+  input_arg {
+    name: "orig_output"
+    type_attr: "TInput"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NDHWC"
+    }
+    allowed_values {
+      list {
+        s: "NDHWC"
+        s: "NCDHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT64
+        type: DT_INT32
+        type: DT_UINT8
+        type: DT_UINT16
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_COMPLEX64
+        type: DT_COMPLEX128
+        type: DT_QINT8
+        type: DT_QUINT8
+        type: DT_QINT32
+        type: DT_HALF
+      }
+    }
+  }
+  attr {
+    name: "TInput"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT64
+        type: DT_INT32
+        type: DT_UINT8
+        type: DT_UINT16
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_COMPLEX64
+        type: DT_COMPLEX128
+        type: DT_QINT8
+        type: DT_QUINT8
+        type: DT_QINT32
+        type: DT_HALF
+      }
+    }
+  }
+}
+op {
+  name: "MaxPool3DGradGrad"
+  input_arg {
+    name: "orig_input"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "orig_output"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NDHWC"
+    }
+    allowed_values {
+      list {
+        s: "NDHWC"
+        s: "NCDHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
 op {
   name: "MaxPoolGrad"
   input_arg {
@@ -10084,6 +10455,219 @@ op {
     }
   }
 }
+op {
+  name: "MaxPoolGrad"
+  input_arg {
+    name: "orig_input"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "orig_output"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
+op {
+  name: "MaxPoolGradGrad"
+  input_arg {
+    name: "orig_input"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "orig_output"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
+op {
+  name: "MaxPoolGradGradWithArgmax"
+  input_arg {
+    name: "input"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "argmax"
+    type_attr: "Targmax"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "Targmax"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
 op {
   name: "MaxPoolGradWithArgmax"
   input_arg {
@@ -10148,6 +10732,137 @@ op {
     }
   }
 }
+op {
+  name: "MaxPoolGradWithArgmax"
+  input_arg {
+    name: "input"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    type_attr: "T"
+  }
+  input_arg {
+    name: "argmax"
+    type_attr: "Targmax"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "Targmax"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+}
+op {
+  name: "MaxPoolWithArgmax"
+  input_arg {
+    name: "input"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "argmax"
+    type_attr: "Targmax"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "Targmax"
+    type: "type"
+    default_value {
+      type: DT_INT64
+    }
+    allowed_values {
+      list {
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_HALF
+      }
+    }
+  }
+}
 op {
   name: "MaxPoolWithArgmax"
   input_arg {
@@ -10200,12 +10915,16 @@ op {
   attr {
     name: "T"
     type: "type"
-    default_value {
-      type: DT_FLOAT
-    }
     allowed_values {
       list {
         type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
         type: DT_HALF
       }
     }
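
The widened "T" lists recorded in these ops_history entries (float, double, int32, int64, uint8, int16, int8, uint16, half) correspond to TensorFlow's "realnumbertype" attribute alias. As a sketch of how such a pbtxt entry arises from an op registration — this spelled-out form is an assumption matching the MaxPoolGradGrad entry above, not text copied from the patch:

    #include "tensorflow/core/framework/op.h"

    // Sketch: attributes written out literally; the real registration may use
    // helper attr strings, but the resulting OpDef is equivalent.
    REGISTER_OP("MaxPoolGradGrad")
        .Input("orig_input: T")
        .Input("orig_output: T")
        .Input("grad: T")
        .Output("output: T")
        .Attr("ksize: list(int) >= 4")
        .Attr("strides: list(int) >= 4")
        .Attr("padding: {'SAME', 'VALID'}")
        .Attr("data_format: {'NHWC', 'NCHW'} = 'NHWC'")
        .Attr("T: realnumbertype");
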
diff --git a/tensorflow/core/ops/nn_grad.cc b/tensorflow/core/ops/nn_grad.cc
index 8a86f90e5c3..05ad635f587 100644
--- a/tensorflow/core/ops/nn_grad.cc
+++ b/tensorflow/core/ops/nn_grad.cc
@@ -181,7 +181,6 @@ Status MaxPoolGrad(const AttrSlice& attrs, FunctionDef* g) {
 }
 REGISTER_OP_GRADIENT("MaxPool", MaxPoolGrad);
 
-
 Status MaxPoolGradGrad(const AttrSlice& attrs, FunctionDef* g) {
   // clang-format off
   *g = FDH::Define(
diff --git a/tensorflow/core/ops/ops.pbtxt b/tensorflow/core/ops/ops.pbtxt
index d1f9bbb391a..4a644b55e6f 100644
--- a/tensorflow/core/ops/ops.pbtxt
+++ b/tensorflow/core/ops/ops.pbtxt
@@ -2034,8 +2034,14 @@ op {
     allowed_values {
       list {
         type: DT_FLOAT
-        type: DT_HALF
         type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
       }
     }
   }
@@ -2259,8 +2265,14 @@ op {
     allowed_values {
       list {
         type: DT_FLOAT
-        type: DT_HALF
         type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
       }
     }
   }
@@ -8465,7 +8477,7 @@ op {
     }
   }
   summary: "Gather slices from `params` according to `indices`."
-  description: "`indices` must be an integer tensor of any dimension (usually 0-D or 1-D).\nProduces an output tensor with shape `indices.shape + params.shape[1:]` where:\n\n```python\n    # Scalar indices\n    output[:, ..., :] = params[indices, :, ... :]\n\n    # Vector indices\n    output[i, :, ..., :] = params[indices[i], :, ... :]\n\n    # Higher rank indices\n    output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :]\n```\n\nIf `indices` is a permutation and `len(indices) == params.shape[0]` then\nthis operation will permute `params` accordingly.\n\n<div style=\"width:70%; margin:auto; margin-bottom:10px; margin-top:20px;\">\n<img style=\"width:100%\" src=\"../../images/Gather.png\" alt>\n</div>"
+  description: "`indices` must be an integer tensor of any dimension (usually 0-D or 1-D).\nProduces an output tensor with shape `indices.shape + params.shape[1:]` where:\n\n```python\n    # Scalar indices\n    output[:, ..., :] = params[indices, :, ... :]\n\n    # Vector indices\n    output[i, :, ..., :] = params[indices[i], :, ... :]\n\n    # Higher rank indices\n    output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :]\n```\n\nIf `indices` is a permutation and `len(indices) == params.shape[0]` then\nthis operation will permute `params` accordingly.\n\n`validate_indices`: DEPRECATED. If this operation is assigned to CPU, values in\n`indices` are always validated to be within range. If assigned to GPU,\nout-of-bound indices result in unspecified behavior (currently the result is\n`0`, but this may become an error in the future).\n\n<div style=\"width:70%; margin:auto; margin-bottom:10px; margin-top:20px;\">\n<img style=\"width:100%\" src=\"../../images/Gather.png\" alt>\n</div>"
 }
 op {
   name: "GatherNd"
@@ -10574,6 +10586,13 @@ op {
     allowed_values {
       list {
         type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
         type: DT_HALF
       }
     }
@@ -10699,12 +10718,12 @@ op {
   input_arg {
     name: "orig_input"
     description: "The original input tensor."
-    type: DT_FLOAT
+    type_attr: "TInput"
   }
   input_arg {
     name: "orig_output"
     description: "The original output tensor."
-    type: DT_FLOAT
+    type_attr: "TInput"
   }
   input_arg {
     name: "grad"
@@ -10757,6 +10776,34 @@ op {
   attr {
     name: "T"
     type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT64
+        type: DT_INT32
+        type: DT_UINT8
+        type: DT_UINT16
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_COMPLEX64
+        type: DT_COMPLEX128
+        type: DT_QINT8
+        type: DT_QUINT8
+        type: DT_QINT32
+        type: DT_HALF
+      }
+    }
+  }
+  attr {
+    name: "TInput"
+    type: "type"
+    default_value {
+      type: DT_FLOAT
+    }
     allowed_values {
       list {
         type: DT_FLOAT
@@ -10778,6 +10825,86 @@ op {
   }
   summary: "Computes gradients of max pooling function."
 }
+op {
+  name: "MaxPool3DGradGrad"
+  input_arg {
+    name: "orig_input"
+    description: "The original input tensor."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "orig_output"
+    description: "The original output tensor."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    description: "Output backprop of shape `[batch, depth, rows, cols, channels]`."
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    description: "Gradients of gradients w.r.t. the input to `max_pool`."
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    description: "1-D tensor of length 5. The size of the window for each dimension of\nthe input tensor. Must have `ksize[0] = ksize[4] = 1`."
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    description: "1-D tensor of length 5. The stride of the sliding window for each\ndimension of `input`. Must have `strides[0] = strides[4] = 1`."
+    has_minimum: true
+    minimum: 5
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    description: "The type of padding algorithm to use."
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NDHWC"
+    }
+    description: "The data format of the input and output data. With the\ndefault format \"NDHWC\", the data is stored in the order of:\n    [batch, in_depth, in_height, in_width, in_channels].\nAlternatively, the format could be \"NCDHW\", the data storage order is:\n    [batch, in_channels, in_depth, in_height, in_width]."
+    allowed_values {
+      list {
+        s: "NDHWC"
+        s: "NCDHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+  summary: "Computes second-order gradients of the maxpooling function."
+}
 op {
   name: "MaxPoolGrad"
   input_arg {
@@ -10848,12 +10975,175 @@ op {
     allowed_values {
       list {
         type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
         type: DT_HALF
       }
     }
   }
   summary: "Computes gradients of the maxpooling function."
 }
+op {
+  name: "MaxPoolGradGrad"
+  input_arg {
+    name: "orig_input"
+    description: "The original input tensor."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "orig_output"
+    description: "The original output tensor."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    description: "4-D.  Gradients of gradients w.r.t. the input of `max_pool`."
+    type_attr: "T"
+  }
+  output_arg {
+    name: "output"
+    description: "Gradients of gradients w.r.t. the input to `max_pool`."
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    description: "The size of the window for each dimension of the input tensor."
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    description: "The stride of the sliding window for each dimension of the\ninput tensor."
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    description: "The type of padding algorithm to use."
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "data_format"
+    type: "string"
+    default_value {
+      s: "NHWC"
+    }
+    description: "Specify the data format of the input and output data. With the\ndefault format \"NHWC\", the data is stored in the order of:\n    [batch, in_height, in_width, in_channels].\nAlternatively, the format could be \"NCHW\", the data storage order of:\n    [batch, in_channels, in_height, in_width]."
+    allowed_values {
+      list {
+        s: "NHWC"
+        s: "NCHW"
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+  summary: "Computes second-order gradients of the maxpooling function."
+}
+op {
+  name: "MaxPoolGradGradWithArgmax"
+  input_arg {
+    name: "input"
+    description: "The original input."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "grad"
+    description: "4-D with shape `[batch, height, width, channels]`.  Gradients w.r.t. the\ninput of `max_pool`."
+    type_attr: "T"
+  }
+  input_arg {
+    name: "argmax"
+    description: "The indices of the maximum values chosen for each output of `max_pool`."
+    type_attr: "Targmax"
+  }
+  output_arg {
+    name: "output"
+    description: "Gradients of gradients w.r.t. the input of `max_pool`."
+    type_attr: "T"
+  }
+  attr {
+    name: "ksize"
+    type: "list(int)"
+    description: "The size of the window for each dimension of the input tensor."
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "strides"
+    type: "list(int)"
+    description: "The stride of the sliding window for each dimension of the\ninput tensor."
+    has_minimum: true
+    minimum: 4
+  }
+  attr {
+    name: "padding"
+    type: "string"
+    description: "The type of padding algorithm to use."
+    allowed_values {
+      list {
+        s: "SAME"
+        s: "VALID"
+      }
+    }
+  }
+  attr {
+    name: "Targmax"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
+        type: DT_HALF
+      }
+    }
+  }
+  summary: "Computes second-order gradients of the maxpooling function."
+}
 op {
   name: "MaxPoolGradWithArgmax"
   input_arg {
@@ -10914,12 +11204,16 @@ op {
   attr {
     name: "T"
     type: "type"
-    default_value {
-      type: DT_FLOAT
-    }
     allowed_values {
       list {
         type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
         type: DT_HALF
       }
     }
@@ -10984,12 +11278,16 @@ op {
   attr {
     name: "T"
     type: "type"
-    default_value {
-      type: DT_FLOAT
-    }
     allowed_values {
       list {
         type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+        type: DT_UINT8
+        type: DT_INT16
+        type: DT_INT8
+        type: DT_UINT16
         type: DT_HALF
       }
     }
diff --git a/tensorflow/core/platform/cpu_info.cc b/tensorflow/core/platform/cpu_info.cc
index 451b7209bb7..9edf2de64ca 100644
--- a/tensorflow/core/platform/cpu_info.cc
+++ b/tensorflow/core/platform/cpu_info.cc
@@ -69,10 +69,6 @@ int GetXCR0EAX() {
 // Structure for basic CPUID info
 class CPUIDInfo {
 public:
-  string vendor_str;
-  int family;
-  int model_num;
-
   CPUIDInfo()
       : have_adx_(0),
         have_aes_(0),
@@ -121,9 +117,9 @@ public:
 
     // Get vendor string (issue CPUID with eax = 0)
     GETCPUID(eax, ebx, ecx, edx, 0, 0);
-    cpuid->vendor_str.append(reinterpret_cast<char *>(&ebx), 4);
-    cpuid->vendor_str.append(reinterpret_cast<char *>(&edx), 4);
-    cpuid->vendor_str.append(reinterpret_cast<char *>(&ecx), 4);
+    cpuid->vendor_str_.append(reinterpret_cast<char *>(&ebx), 4);
+    cpuid->vendor_str_.append(reinterpret_cast<char *>(&edx), 4);
+    cpuid->vendor_str_.append(reinterpret_cast<char *>(&ecx), 4);
 
     // To get general information and extended features we send eax = 1 and
     // ecx = 0 to cpuid.  The response is returned in eax, ebx, ecx and edx.
@@ -131,8 +127,8 @@ public:
     // Volume 2A: Instruction Set Reference, A-M CPUID).
     GETCPUID(eax, ebx, ecx, edx, 1, 0);
 
-    cpuid->model_num = static_cast<int>((eax >> 4) & 0xf);
-    cpuid->family = static_cast<int>((eax >> 8) & 0xf);
+    cpuid->model_num_ = static_cast<int>((eax >> 4) & 0xf);
+    cpuid->family_ = static_cast<int>((eax >> 8) & 0xf);
 
     cpuid->have_aes_ = (ecx >> 25) & 0x1;
     cpuid->have_cmov_ = (edx >> 15) & 0x1;
@@ -254,6 +250,10 @@ public:
     return false;
   }
 
+  string vendor_str() const { return vendor_str_; }
+  int family() const { return family_; }
+  int model_num() { return model_num_; }
+
  private:
   int highest_eax_;
   int have_adx_ : 1;
@@ -293,6 +293,9 @@ public:
   int have_sse4_2_ : 1;
   int have_ssse3_ : 1;
   int have_hypervisor_ : 1;
+  string vendor_str_;
+  int family_;
+  int model_num_;
 };
 
 std::once_flag cpuid_once_flag;
@@ -318,7 +321,7 @@ bool TestCPUFeature(CPUFeature feature) {
 std::string CPUVendorIDString() {
 #ifdef PLATFORM_IS_X86
   InitCPUIDInfo();
-  return cpuid->vendor_str;
+  return cpuid->vendor_str();
 #else
   return "";
 #endif
@@ -327,7 +330,7 @@ std::string CPUVendorIDString() {
 int CPUFamily() {
 #ifdef PLATFORM_IS_X86
   InitCPUIDInfo();
-  return cpuid->family;
+  return cpuid->family();
 #else
   return 0;
 #endif
@@ -336,7 +339,7 @@ int CPUFamily() {
 int CPUModelNum() {
 #ifdef PLATFORM_IS_X86
   InitCPUIDInfo();
-  return cpuid->model_num;
+  return cpuid->model_num();
 #else
   return 0;
 #endif
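
For context, a minimal sketch (not part of the patch) of how these platform queries are typically consumed; it assumes the `tensorflow::port` namespace and the `cpu_info.h` header, and is illustrative only:

#include <iostream>

#include "tensorflow/core/platform/cpu_info.h"

int main() {
  // vendor_str()/family()/model_num() are now reached only through these
  // free functions; on non-x86 builds they return "" and 0 respectively.
  std::cout << "CPU vendor: " << tensorflow::port::CPUVendorIDString() << "\n"
            << "family: " << tensorflow::port::CPUFamily()
            << ", model: " << tensorflow::port::CPUModelNum() << std::endl;
  return 0;
}
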
diff --git a/tensorflow/core/platform/windows/port.cc b/tensorflow/core/platform/windows/port.cc
index b29f978f661..85b53e07c43 100644
--- a/tensorflow/core/platform/windows/port.cc
+++ b/tensorflow/core/platform/windows/port.cc
@@ -58,21 +58,20 @@ int NumSchedulableCPUs() {
 
 void* AlignedMalloc(size_t size, int minimum_alignment) {
 #ifdef TENSORFLOW_USE_JEMALLOC
-    void* ptr = NULL;
-    // posix_memalign requires that the requested alignment be at least
-    // sizeof(void*). In this case, fall back on malloc which should return
-    // memory aligned to at least the size of a pointer.
-    const int required_alignment = sizeof(void*);
-    if (minimum_alignment < required_alignment) return Malloc(size);
-    int err = jemalloc_posix_memalign(&ptr, minimum_alignment, size);
-    if (err != 0) {
-        return NULL;
-    }
-    else {
-        return ptr;
-    }
+  void* ptr = NULL;
+  // posix_memalign requires that the requested alignment be at least
+  // sizeof(void*). In this case, fall back on malloc which should return
+  // memory aligned to at least the size of a pointer.
+  const int required_alignment = sizeof(void*);
+  if (minimum_alignment < required_alignment) return Malloc(size);
+  int err = jemalloc_posix_memalign(&ptr, minimum_alignment, size);
+  if (err != 0) {
+    return NULL;
+  } else {
+    return ptr;
+  }
 #else
-    return _aligned_malloc(size, minimum_alignment);
+  return _aligned_malloc(size, minimum_alignment);
 #endif
 }
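
A hedged usage sketch (not part of the patch): callers reach this path through `tensorflow::port::AlignedMalloc`/`AlignedFree`, declared in `tensorflow/core/platform/mem.h`; the size and alignment values below are arbitrary.

#include "tensorflow/core/platform/mem.h"

void AlignedBufferDemo() {
  // Request 64-byte alignment; under TENSORFLOW_USE_JEMALLOC this routes
  // through jemalloc_posix_memalign, otherwise through _aligned_malloc on
  // Windows. Alignments below sizeof(void*) fall back to plain Malloc.
  void* buf = tensorflow::port::AlignedMalloc(4096, 64);
  if (buf != nullptr) {
    tensorflow::port::AlignedFree(buf);
  }
}
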
 
diff --git a/tensorflow/core/protobuf/rewriter_config.proto b/tensorflow/core/protobuf/rewriter_config.proto
index aef69461d88..6e9eff62254 100644
--- a/tensorflow/core/protobuf/rewriter_config.proto
+++ b/tensorflow/core/protobuf/rewriter_config.proto
@@ -9,4 +9,8 @@ option java_package = "org.tensorflow.framework";
 message RewriterConfig {
   bool optimize_tensor_layout = 1;
   bool disable_model_pruning = 2;
+  bool constant_folding = 3;
+  // If non-empty, will use this as an alternative way to specify a list of
+  // optimizations to turn on and the order of the optimizations.
+  repeated string optimizers = 100;
 }
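
A minimal sketch of how the two new fields might be populated from C++ using the generated proto API; the optimizer names are illustrative assumptions, not values defined by this patch:

#include "tensorflow/core/protobuf/rewriter_config.pb.h"

tensorflow::RewriterConfig MakeRewriterConfig() {
  tensorflow::RewriterConfig config;
  // Turn on the new constant-folding pass.
  config.set_constant_folding(true);
  // Alternatively, pin an explicit pass order via the repeated field;
  // "constfold" and "memory" are hypothetical pass names here.
  config.add_optimizers("constfold");
  config.add_optimizers("memory");
  return config;
}
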
diff --git a/tensorflow/core/util/equal_graph_def.cc b/tensorflow/core/util/equal_graph_def.cc
index 7e7a3f52236..2db026da56c 100644
--- a/tensorflow/core/util/equal_graph_def.cc
+++ b/tensorflow/core/util/equal_graph_def.cc
@@ -28,13 +28,18 @@ bool EqualGraphDef(const GraphDef& actual, const GraphDef& expected,
                    string* diff, const EqualGraphDefOptions& options) {
   // Intentionally do not check that versions match so that this routine can
   // be used for less brittle golden file tests.
+  return EqualRepeatedNodeDef(actual.node(), expected.node(), diff, options);
+}
 
+bool EqualRepeatedNodeDef(const protobuf::RepeatedPtrField<NodeDef>& actual,
+                          const protobuf::RepeatedPtrField<NodeDef>& expected,
+                          string* diff, const EqualGraphDefOptions& options) {
   std::unordered_map<string, const NodeDef*> actual_index;
-  for (const NodeDef& node : actual.node()) {
+  for (const NodeDef& node : actual) {
     actual_index[node.name()] = &node;
   }
 
-  for (const NodeDef& expected_node : expected.node()) {
+  for (const NodeDef& expected_node : expected) {
     auto actual_iter = actual_index.find(expected_node.name());
     if (actual_iter == actual_index.end()) {
       if (diff != nullptr) {
@@ -53,10 +58,9 @@ bool EqualGraphDef(const GraphDef& actual, const GraphDef& expected,
 
   if (!actual_index.empty()) {
     if (diff != nullptr) {
-      *diff = strings::StrCat("Found unexpected node '",
-                              SummarizeNodeDef(*actual_index.begin()->second),
-                              "' not in expected graph:\n",
-                              SummarizeGraphDef(expected));
+      *diff =
+          strings::StrCat("Found unexpected node '",
+                          SummarizeNodeDef(*actual_index.begin()->second), "'");
     }
     return false;
   }
diff --git a/tensorflow/core/util/equal_graph_def.h b/tensorflow/core/util/equal_graph_def.h
index 82f8bd0713b..1ce6181c2e7 100644
--- a/tensorflow/core/util/equal_graph_def.h
+++ b/tensorflow/core/util/equal_graph_def.h
@@ -18,6 +18,7 @@ limitations under the License.
 
 #include "tensorflow/core/framework/graph.pb.h"
 #include "tensorflow/core/framework/graph_def_util.h"
+#include "tensorflow/core/platform/protobuf.h"
 #include "tensorflow/core/platform/types.h"
 
 namespace tensorflow {
@@ -44,6 +45,14 @@ bool EqualGraphDef(const GraphDef& actual, const GraphDef& expected,
 bool EqualNodeDef(const NodeDef& actual, const NodeDef& expected, string* diff,
                   const EqualGraphDefOptions& options = {});
 
+// Determines if actual and expected are equal, ignoring ordering. If they're
+// different and diff != nullptr, *diff is set to an explanation of the
+// difference.
+bool EqualRepeatedNodeDef(const protobuf::RepeatedPtrField<NodeDef>& actual,
+                          const protobuf::RepeatedPtrField<NodeDef>& expected,
+                          string* diff,
+                          const EqualGraphDefOptions& options = {});
+
 #define TF_EXPECT_GRAPH_EQ(expected, actual)                  \
   do {                                                        \
     string diff;                                              \
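
A small sketch of the new helper in use (the wrapper function below is hypothetical; it assumes the declarations above in the `tensorflow` namespace):

#include <string>

#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/util/equal_graph_def.h"

// Compare only the node lists of two graphs, ignoring node order.
bool SameNodes(const tensorflow::GraphDef& a, const tensorflow::GraphDef& b) {
  // On mismatch, `diff` holds a short explanation such as
  // "Found unexpected node 'B = Input[]()'".
  std::string diff;
  return tensorflow::EqualRepeatedNodeDef(a.node(), b.node(), &diff);
}
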
diff --git a/tensorflow/core/util/equal_graph_def_test.cc b/tensorflow/core/util/equal_graph_def_test.cc
index 9ce951e6eff..af870c5c607 100644
--- a/tensorflow/core/util/equal_graph_def_test.cc
+++ b/tensorflow/core/util/equal_graph_def_test.cc
@@ -47,8 +47,7 @@ class EqualGraphDefTest : public ::testing::Test {
  protected:
   EqualGraphDefTest()
       : e_(GraphDefBuilder::kFailImmediately),
-        a_(GraphDefBuilder::kFailImmediately) {
-  }
+        a_(GraphDefBuilder::kFailImmediately) {}
 
   bool Match() {
     GraphDef expected;
@@ -89,11 +88,7 @@ TEST_F(EqualGraphDefTest, ExtraNode) {
   Input(a_.opts().WithName("A"));
   Input(a_.opts().WithName("B"));
   EXPECT_FALSE(Match());
-  EXPECT_EQ(strings::StrCat(
-                "Found unexpected node 'B = Input[]()' not in expected graph:\n"
-                "versions = producer: ",
-                TF_GRAPH_DEF_VERSION, ";\n", "A = Input[]();\n"),
-            diff_);
+  EXPECT_EQ("Found unexpected node 'B = Input[]()'", diff_);
 }
 
 TEST_F(EqualGraphDefTest, NodeOrder) {
@@ -169,21 +164,23 @@ TEST_F(EqualGraphDefTest, ControlInputOrder) {
   Node* b = Input(e_.opts().WithName("B"));
   Node* c = Input(e_.opts().WithName("C"));
   Node* d = Input(e_.opts().WithName("D"));
-  Combine(a, a, e_.opts()
-                    .WithName("E")
-                    .WithControlInput(b)
-                    .WithControlInput(c)
-                    .WithControlInput(d));
+  Combine(a, a,
+          e_.opts()
+              .WithName("E")
+              .WithControlInput(b)
+              .WithControlInput(c)
+              .WithControlInput(d));
 
   a = Input(a_.opts().WithName("A"));
   b = Input(a_.opts().WithName("B"));
   c = Input(a_.opts().WithName("C"));
   d = Input(a_.opts().WithName("D"));
-  Combine(a, a, a_.opts()
-                    .WithName("E")
-                    .WithControlInput(c)
-                    .WithControlInput(d)
-                    .WithControlInput(b));
+  Combine(a, a,
+          a_.opts()
+              .WithName("E")
+              .WithControlInput(c)
+              .WithControlInput(d)
+              .WithControlInput(b));
   EXPECT_TRUE(Match()) << diff_;
 }
 
diff --git a/tensorflow/core/util/mkl_util.h b/tensorflow/core/util/mkl_util.h
index 4556fae2a30..ebbe195bbc9 100644
--- a/tensorflow/core/util/mkl_util.h
+++ b/tensorflow/core/util/mkl_util.h
@@ -17,8 +17,8 @@ limitations under the License.
 #define TENSORFLOW_CORE_UTIL_MKL_UTIL_H_
 #ifdef INTEL_MKL
 
-#include <vector>
 #include <string>
+#include <vector>
 
 #include "third_party/mkl/include/mkl_dnn.h"
 #include "third_party/mkl/include/mkl_dnn_types.h"
@@ -63,9 +63,7 @@ class MklShape {
 
   void SetMklTensor(const bool isMklTensor) { isMklTensor_ = isMklTensor; }
 
-  void SetDimensions(const size_t dimension) {
-    dimension_ = dimension;
-  }
+  void SetDimensions(const size_t dimension) { dimension_ = dimension; }
 
   void SetMklLayout(const void* primitive, size_t resourceType) {
     CHECK_EQ(
@@ -408,8 +406,8 @@ static inline bool IsMklLayer(const std::string& op_name, DataType T) {
   // the type is float. Actually, we should query kernel registration and
   // find out if op is supported for type T. But there is no API to query
   // kernel registration using name and type.
-  bool result = (kernel.find(kMklLayerLabelPattern) != string::npos) &&
-                (T == DT_FLOAT);
+  bool result =
+      (kernel.find(kMklLayerLabelPattern) != string::npos) && (T == DT_FLOAT);
   if (result == true) {
     VLOG(1) << "mkl_layer_registry::" << op_name << " is " << kMklLayerLabel;
   }
diff --git a/tensorflow/go/example_inception_inference_test.go b/tensorflow/go/example_inception_inference_test.go
index 87056b85a27..682bd245cc7 100644
--- a/tensorflow/go/example_inception_inference_test.go
+++ b/tensorflow/go/example_inception_inference_test.go
@@ -28,8 +28,8 @@ import (
 	"os"
 	"path/filepath"
 
-	tf "github.com/tensorflow/tensorflow/tensorflow/go"
 	"github.com/tensorflow/tensorflow/tensorflow/go/op"
+	tf "github.com/tensorflow/tensorflow/tensorflow/go"
 )
 
 func Example() {
diff --git a/tensorflow/go/genop/internal/genop.go b/tensorflow/go/genop/internal/genop.go
index 7c9c5a4d6ea..dec08dee1ca 100644
--- a/tensorflow/go/genop/internal/genop.go
+++ b/tensorflow/go/genop/internal/genop.go
@@ -158,12 +158,12 @@ func makeOutputList(op *tf.Operation, start int, output string) ([]tf.Output, in
 `))
 
 	tmplOp = template.Must(template.New("op").Funcs(template.FuncMap{
-		"MakeComment": makeComment,
-		"GoType":      goType,
-		"CamelCase":   camelCase,
-		"Identifier":  identifier,
-		"IsListArg":   isListArg,
-		"IsListAttr":  isListAttr,
+		"MakeComment":       makeComment,
+		"GoType":            goType,
+		"CamelCase":         camelCase,
+		"Identifier":        identifier,
+		"IsListArg":         isListArg,
+		"IsListAttr":        isListAttr,
 		"StripLeadingColon": stripLeadingColon,
 	}).Parse(`
 {{if .OptionalAttrs -}}
diff --git a/tensorflow/go/op/wrappers.go b/tensorflow/go/op/wrappers.go
index bbf687e3229..877961d0aaf 100644
--- a/tensorflow/go/op/wrappers.go
+++ b/tensorflow/go/op/wrappers.go
@@ -4841,6 +4841,59 @@ func DynamicStitch(scope *Scope, indices []tf.Output, data []tf.Output) (merged
 	return op.Output(0)
 }
 
+// CropAndResizeGradBoxesAttr is an optional argument to CropAndResizeGradBoxes.
+type CropAndResizeGradBoxesAttr func(optionalAttr)
+
+// CropAndResizeGradBoxesMethod sets the optional method attribute to value.
+//
+// value: A string specifying the interpolation method. Only 'bilinear' is
+// supported for now.
+// If not specified, defaults to "bilinear"
+func CropAndResizeGradBoxesMethod(value string) CropAndResizeGradBoxesAttr {
+	return func(m optionalAttr) {
+		m["method"] = value
+	}
+}
+
+// Computes the gradient of the crop_and_resize op wrt the input boxes tensor.
+//
+// Arguments:
+//	grads: A 4-D tensor of shape `[num_boxes, crop_height, crop_width, depth]`.
+//	image: A 4-D tensor of shape `[batch, image_height, image_width, depth]`.
+// Both `image_height` and `image_width` need to be positive.
+//	boxes: A 2-D tensor of shape `[num_boxes, 4]`. The `i`-th row of the tensor
+// specifies the coordinates of a box in the `box_ind[i]` image and is specified
+// in normalized coordinates `[y1, x1, y2, x2]`. A normalized coordinate value of
+// `y` is mapped to the image coordinate at `y * (image_height - 1)`, so as the
+// `[0, 1]` interval of normalized image height is mapped to
+// `[0, image_height - 1]` in image height coordinates. We do allow y1 > y2, in
+// which case the sampled crop is an up-down flipped version of the original
+// image. The width dimension is treated similarly. Normalized coordinates
+// outside the `[0, 1]` range are allowed, in which case we use
+// `extrapolation_value` to extrapolate the input image values.
+//	box_ind: A 1-D tensor of shape `[num_boxes]` with int32 values in `[0, batch)`.
+// The value of `box_ind[i]` specifies the image that the `i`-th box refers to.
+//
+// Returns A 2-D tensor of shape `[num_boxes, 4]`.
+func CropAndResizeGradBoxes(scope *Scope, grads tf.Output, image tf.Output, boxes tf.Output, box_ind tf.Output, optional ...CropAndResizeGradBoxesAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "CropAndResizeGradBoxes",
+		Input: []tf.Input{
+			grads, image, boxes, box_ind,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
 // Computes softmax cross entropy cost and gradients to backpropagate.
 //
 // Unlike `SoftmaxCrossEntropyWithLogits`, this operation does not accept
@@ -5205,10 +5258,10 @@ func FusedBatchNormGrad(scope *Scope, y_backprop tf.Output, x tf.Output, scale t
 	return op.Output(0), op.Output(1), op.Output(2), op.Output(3), op.Output(4)
 }
 
-// AvgPool3DAttr is an optional argument to AvgPool3D.
-type AvgPool3DAttr func(optionalAttr)
+// MaxPool3DGradGradAttr is an optional argument to MaxPool3DGradGrad.
+type MaxPool3DGradGradAttr func(optionalAttr)
 
-// AvgPool3DDataFormat sets the optional data_format attribute to value.
+// MaxPool3DGradGradDataFormat sets the optional data_format attribute to value.
 //
 // value: The data format of the input and output data. With the
 // default format "NDHWC", the data is stored in the order of:
@@ -5216,24 +5269,26 @@ type AvgPool3DAttr func(optionalAttr)
 // Alternatively, the format could be "NCDHW", the data storage order is:
 //     [batch, in_channels, in_depth, in_height, in_width].
 // If not specified, defaults to "NDHWC"
-func AvgPool3DDataFormat(value string) AvgPool3DAttr {
+func MaxPool3DGradGradDataFormat(value string) MaxPool3DGradGradAttr {
 	return func(m optionalAttr) {
 		m["data_format"] = value
 	}
 }
 
-// Performs 3D average pooling on the input.
+// Computes second-order gradients of the maxpooling function.
 //
 // Arguments:
-//	input: Shape `[batch, depth, rows, cols, channels]` tensor to pool over.
+//	orig_input: The original input tensor.
+//	orig_output: The original output tensor.
+//	grad: Output backprop of shape `[batch, depth, rows, cols, channels]`.
 //	ksize: 1-D tensor of length 5. The size of the window for each dimension of
 // the input tensor. Must have `ksize[0] = ksize[4] = 1`.
 //	strides: 1-D tensor of length 5. The stride of the sliding window for each
 // dimension of `input`. Must have `strides[0] = strides[4] = 1`.
 //	padding: The type of padding algorithm to use.
 //
-// Returns The average pooled output tensor.
-func AvgPool3D(scope *Scope, input tf.Output, ksize []int64, strides []int64, padding string, optional ...AvgPool3DAttr) (output tf.Output) {
+// Returns Gradients of gradients w.r.t. the input to `max_pool`.
+func MaxPool3DGradGrad(scope *Scope, orig_input tf.Output, orig_output tf.Output, grad tf.Output, ksize []int64, strides []int64, padding string, optional ...MaxPool3DGradGradAttr) (output tf.Output) {
 	if scope.Err() != nil {
 		return
 	}
@@ -5242,9 +5297,9 @@ func AvgPool3D(scope *Scope, input tf.Output, ksize []int64, strides []int64, pa
 		a(attrs)
 	}
 	opspec := tf.OpSpec{
-		Type: "AvgPool3D",
+		Type: "MaxPool3DGradGrad",
 		Input: []tf.Input{
-			input,
+			orig_input, orig_output, grad,
 		},
 		Attrs: attrs,
 	}
@@ -5252,35 +5307,6 @@ func AvgPool3D(scope *Scope, input tf.Output, ksize []int64, strides []int64, pa
 	return op.Output(0)
 }
 
-// Produces the max pool of the input tensor for quantized types.
-//
-// Arguments:
-//	input: The 4D (batch x rows x cols x depth) Tensor to MaxReduce over.
-//	min_input: The float value that the lowest quantized input value represents.
-//	max_input: The float value that the highest quantized input value represents.
-//	ksize: The size of the window for each dimension of the input tensor.
-// The length must be 4 to match the number of dimensions of the input.
-//	strides: The stride of the sliding window for each dimension of the input
-// tensor. The length must be 4 to match the number of dimensions of the input.
-//	padding: The type of padding algorithm to use.
-//
-// Returns The float value that the lowest quantized output value represents.The float value that the highest quantized output value represents.
-func QuantizedMaxPool(scope *Scope, input tf.Output, min_input tf.Output, max_input tf.Output, ksize []int64, strides []int64, padding string) (output tf.Output, min_output tf.Output, max_output tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{"ksize": ksize, "strides": strides, "padding": padding}
-	opspec := tf.OpSpec{
-		Type: "QuantizedMaxPool",
-		Input: []tf.Input{
-			input, min_input, max_input,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0), op.Output(1), op.Output(2)
-}
-
 // FakeQuantWithMinMaxArgsGradientAttr is an optional argument to FakeQuantWithMinMaxArgsGradient.
 type FakeQuantWithMinMaxArgsGradientAttr func(optionalAttr)
 
@@ -5403,6 +5429,82 @@ func Equal(scope *Scope, x tf.Output, y tf.Output) (z tf.Output) {
 	return op.Output(0)
 }
 
+// AvgPool3DAttr is an optional argument to AvgPool3D.
+type AvgPool3DAttr func(optionalAttr)
+
+// AvgPool3DDataFormat sets the optional data_format attribute to value.
+//
+// value: The data format of the input and output data. With the
+// default format "NDHWC", the data is stored in the order of:
+//     [batch, in_depth, in_height, in_width, in_channels].
+// Alternatively, the format could be "NCDHW", the data storage order is:
+//     [batch, in_channels, in_depth, in_height, in_width].
+// If not specified, defaults to "NDHWC"
+func AvgPool3DDataFormat(value string) AvgPool3DAttr {
+	return func(m optionalAttr) {
+		m["data_format"] = value
+	}
+}
+
+// Performs 3D average pooling on the input.
+//
+// Arguments:
+//	input: Shape `[batch, depth, rows, cols, channels]` tensor to pool over.
+//	ksize: 1-D tensor of length 5. The size of the window for each dimension of
+// the input tensor. Must have `ksize[0] = ksize[4] = 1`.
+//	strides: 1-D tensor of length 5. The stride of the sliding window for each
+// dimension of `input`. Must have `strides[0] = strides[4] = 1`.
+//	padding: The type of padding algorithm to use.
+//
+// Returns The average pooled output tensor.
+func AvgPool3D(scope *Scope, input tf.Output, ksize []int64, strides []int64, padding string, optional ...AvgPool3DAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{"ksize": ksize, "strides": strides, "padding": padding}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "AvgPool3D",
+		Input: []tf.Input{
+			input,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
+// Produces the max pool of the input tensor for quantized types.
+//
+// Arguments:
+//	input: The 4D (batch x rows x cols x depth) Tensor to MaxReduce over.
+//	min_input: The float value that the lowest quantized input value represents.
+//	max_input: The float value that the highest quantized input value represents.
+//	ksize: The size of the window for each dimension of the input tensor.
+// The length must be 4 to match the number of dimensions of the input.
+//	strides: The stride of the sliding window for each dimension of the input
+// tensor. The length must be 4 to match the number of dimensions of the input.
+//	padding: The type of padding algorithm to use.
+//
+// Returns The float value that the lowest quantized output value represents. The float value that the highest quantized output value represents.
+func QuantizedMaxPool(scope *Scope, input tf.Output, min_input tf.Output, max_input tf.Output, ksize []int64, strides []int64, padding string) (output tf.Output, min_output tf.Output, max_output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{"ksize": ksize, "strides": strides, "padding": padding}
+	opspec := tf.OpSpec{
+		Type: "QuantizedMaxPool",
+		Input: []tf.Input{
+			input, min_input, max_input,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0), op.Output(1), op.Output(2)
+}
+
 // Conv3DBackpropInputV2Attr is an optional argument to Conv3DBackpropInputV2.
 type Conv3DBackpropInputV2Attr func(optionalAttr)
 
@@ -5959,64 +6061,6 @@ func BiasAddV1(scope *Scope, value tf.Output, bias tf.Output) (output tf.Output)
 	return op.Output(0)
 }
 
-// FractionalAvgPoolGradAttr is an optional argument to FractionalAvgPoolGrad.
-type FractionalAvgPoolGradAttr func(optionalAttr)
-
-// FractionalAvgPoolGradOverlapping sets the optional overlapping attribute to value.
-//
-// value: When set to True, it means when pooling, the values at the boundary
-// of adjacent pooling cells are used by both cells. For example:
-//
-// `index  0  1  2  3  4`
-//
-// `value  20 5  16 3  7`
-//
-// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
-// The result would be [41/3, 26/3] for fractional avg pooling.
-// If not specified, defaults to false
-func FractionalAvgPoolGradOverlapping(value bool) FractionalAvgPoolGradAttr {
-	return func(m optionalAttr) {
-		m["overlapping"] = value
-	}
-}
-
-// Computes gradient of the FractionalAvgPool function.
-//
-// Unlike FractionalMaxPoolGrad, we don't need to find arg_max for
-// FractionalAvgPoolGrad, we just need to evenly back-propagate each element of
-// out_backprop to those indices that form the same pooling cell. Therefore, we
-// just need to know the shape of original input tensor, instead of the whole
-// tensor.
-//
-// Arguments:
-//	orig_input_tensor_shape: Original input tensor shape for `fractional_avg_pool`
-//	out_backprop: 4-D with shape `[batch, height, width, channels]`.  Gradients
-// w.r.t. the output of `fractional_avg_pool`.
-//	row_pooling_sequence: row pooling sequence, form pooling region with
-// col_pooling_sequence.
-//	col_pooling_sequence: column pooling sequence, form pooling region with
-// row_pooling sequence.
-//
-// Returns 4-D.  Gradients w.r.t. the input of `fractional_avg_pool`.
-func FractionalAvgPoolGrad(scope *Scope, orig_input_tensor_shape tf.Output, out_backprop tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output, optional ...FractionalAvgPoolGradAttr) (output tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "FractionalAvgPoolGrad",
-		Input: []tf.Input{
-			orig_input_tensor_shape, out_backprop, row_pooling_sequence, col_pooling_sequence,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0)
-}
-
 // Conv2DBackpropInputAttr is an optional argument to Conv2DBackpropInput.
 type Conv2DBackpropInputAttr func(optionalAttr)
 
@@ -6219,104 +6263,6 @@ func Sigmoid(scope *Scope, x tf.Output) (y tf.Output) {
 	return op.Output(0)
 }
 
-// FractionalMaxPoolGradAttr is an optional argument to FractionalMaxPoolGrad.
-type FractionalMaxPoolGradAttr func(optionalAttr)
-
-// FractionalMaxPoolGradOverlapping sets the optional overlapping attribute to value.
-//
-// value: When set to True, it means when pooling, the values at the boundary
-// of adjacent pooling cells are used by both cells. For example:
-//
-// `index  0  1  2  3  4`
-//
-// `value  20 5  16 3  7`
-//
-// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
-// The result would be [20, 16] for fractional max pooling.
-// If not specified, defaults to false
-func FractionalMaxPoolGradOverlapping(value bool) FractionalMaxPoolGradAttr {
-	return func(m optionalAttr) {
-		m["overlapping"] = value
-	}
-}
-
-// Computes gradient of the FractionalMaxPool function.
-//
-// Arguments:
-//	orig_input: Original input for `fractional_max_pool`
-//	orig_output: Original output for `fractional_max_pool`
-//	out_backprop: 4-D with shape `[batch, height, width, channels]`.  Gradients
-// w.r.t. the output of `fractional_max_pool`.
-//	row_pooling_sequence: row pooling sequence, form pooling region with
-// col_pooling_sequence.
-//	col_pooling_sequence: column pooling sequence, form pooling region with
-// row_pooling sequence.
-//
-// Returns 4-D.  Gradients w.r.t. the input of `fractional_max_pool`.
-func FractionalMaxPoolGrad(scope *Scope, orig_input tf.Output, orig_output tf.Output, out_backprop tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output, optional ...FractionalMaxPoolGradAttr) (output tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "FractionalMaxPoolGrad",
-		Input: []tf.Input{
-			orig_input, orig_output, out_backprop, row_pooling_sequence, col_pooling_sequence,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0)
-}
-
-// ResourceApplyAdagradDAAttr is an optional argument to ResourceApplyAdagradDA.
-type ResourceApplyAdagradDAAttr func(optionalAttr)
-
-// ResourceApplyAdagradDAUseLocking sets the optional use_locking attribute to value.
-//
-// value: If True, updating of the var and accum tensors will be protected by
-// a lock; otherwise the behavior is undefined, but may exhibit less contention.
-// If not specified, defaults to false
-func ResourceApplyAdagradDAUseLocking(value bool) ResourceApplyAdagradDAAttr {
-	return func(m optionalAttr) {
-		m["use_locking"] = value
-	}
-}
-
-// Update '*var' according to the proximal adagrad scheme.
-//
-// Arguments:
-//	var_: Should be from a Variable().
-//	gradient_accumulator: Should be from a Variable().
-//	gradient_squared_accumulator: Should be from a Variable().
-//	grad: The gradient.
-//	lr: Scaling factor. Must be a scalar.
-//	l1: L1 regularization. Must be a scalar.
-//	l2: L2 regularization. Must be a scalar.
-//	global_step: Training step number. Must be a scalar.
-//
-// Returns the created operation.
-func ResourceApplyAdagradDA(scope *Scope, var_ tf.Output, gradient_accumulator tf.Output, gradient_squared_accumulator tf.Output, grad tf.Output, lr tf.Output, l1 tf.Output, l2 tf.Output, global_step tf.Output, optional ...ResourceApplyAdagradDAAttr) (o *tf.Operation) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "ResourceApplyAdagradDA",
-		Input: []tf.Input{
-			var_, gradient_accumulator, gradient_squared_accumulator, grad, lr, l1, l2, global_step,
-		},
-		Attrs: attrs,
-	}
-	return scope.AddOperation(opspec)
-}
-
 // ComputeAccidentalHitsAttr is an optional argument to ComputeAccidentalHits.
 type ComputeAccidentalHitsAttr func(optionalAttr)
 
@@ -9304,6 +9250,64 @@ func FractionalMaxPool(scope *Scope, value tf.Output, pooling_ratio []float32, o
 	return op.Output(0), op.Output(1), op.Output(2)
 }
 
+// FractionalAvgPoolGradAttr is an optional argument to FractionalAvgPoolGrad.
+type FractionalAvgPoolGradAttr func(optionalAttr)
+
+// FractionalAvgPoolGradOverlapping sets the optional overlapping attribute to value.
+//
+// value: When set to True, it means when pooling, the values at the boundary
+// of adjacent pooling cells are used by both cells. For example:
+//
+// `index  0  1  2  3  4`
+//
+// `value  20 5  16 3  7`
+//
+// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
+// The result would be [41/3, 26/3] for fractional avg pooling.
+// If not specified, defaults to false
+func FractionalAvgPoolGradOverlapping(value bool) FractionalAvgPoolGradAttr {
+	return func(m optionalAttr) {
+		m["overlapping"] = value
+	}
+}
+
+// Computes gradient of the FractionalAvgPool function.
+//
+// Unlike FractionalMaxPoolGrad, we don't need to find arg_max for
+// FractionalAvgPoolGrad; we just need to evenly back-propagate each element of
+// out_backprop to the indices that form the same pooling cell. Therefore, we
+// only need the shape of the original input tensor, instead of the whole
+// tensor.
+//
+// Arguments:
+//	orig_input_tensor_shape: Original input tensor shape for `fractional_avg_pool`
+//	out_backprop: 4-D with shape `[batch, height, width, channels]`.  Gradients
+// w.r.t. the output of `fractional_avg_pool`.
+//	row_pooling_sequence: row pooling sequence, form pooling region with
+// col_pooling_sequence.
+//	col_pooling_sequence: column pooling sequence, form pooling region with
+// row_pooling sequence.
+//
+// Returns 4-D.  Gradients w.r.t. the input of `fractional_avg_pool`.
+func FractionalAvgPoolGrad(scope *Scope, orig_input_tensor_shape tf.Output, out_backprop tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output, optional ...FractionalAvgPoolGradAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "FractionalAvgPoolGrad",
+		Input: []tf.Input{
+			orig_input_tensor_shape, out_backprop, row_pooling_sequence, col_pooling_sequence,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
 // Reorders a SparseTensor into the canonical, row-major ordering.
 //
 // Note that by convention, all sparse ops preserve the canonical ordering along
@@ -9769,6 +9773,35 @@ func RequantizationRange(scope *Scope, input tf.Output, input_min tf.Output, inp
 	return op.Output(0), op.Output(1)
 }
 
+// Computes second-order gradients of the maxpooling function.
+//
+// Arguments:
+//	input: The original input.
+//	grad: 4-D with shape `[batch, height, width, channels]`.  Gradients w.r.t. the
+// input of `max_pool`.
+//	argmax: The indices of the maximum values chosen for each output of `max_pool`.
+//	ksize: The size of the window for each dimension of the input tensor.
+//	strides: The stride of the sliding window for each dimension of the
+// input tensor.
+//	padding: The type of padding algorithm to use.
+//
+// Returns Gradients of gradients w.r.t. the input of `max_pool`.
+func MaxPoolGradGradWithArgmax(scope *Scope, input tf.Output, grad tf.Output, argmax tf.Output, ksize []int64, strides []int64, padding string) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{"ksize": ksize, "strides": strides, "padding": padding}
+	opspec := tf.OpSpec{
+		Type: "MaxPoolGradGradWithArgmax",
+		Input: []tf.Input{
+			input, grad, argmax,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
 // DepthwiseConv2dNativeBackpropInputAttr is an optional argument to DepthwiseConv2dNativeBackpropInput.
 type DepthwiseConv2dNativeBackpropInputAttr func(optionalAttr)
 
@@ -10539,6 +10572,104 @@ func ResourceSparseApplyAdagradDA(scope *Scope, var_ tf.Output, gradient_accumul
 	return scope.AddOperation(opspec)
 }
 
+// ResourceApplyAdagradDAAttr is an optional argument to ResourceApplyAdagradDA.
+type ResourceApplyAdagradDAAttr func(optionalAttr)
+
+// ResourceApplyAdagradDAUseLocking sets the optional use_locking attribute to value.
+//
+// value: If True, updating of the var and accum tensors will be protected by
+// a lock; otherwise the behavior is undefined, but may exhibit less contention.
+// If not specified, defaults to false
+func ResourceApplyAdagradDAUseLocking(value bool) ResourceApplyAdagradDAAttr {
+	return func(m optionalAttr) {
+		m["use_locking"] = value
+	}
+}
+
+// Update '*var' according to the proximal adagrad scheme.
+//
+// Arguments:
+//	var_: Should be from a Variable().
+//	gradient_accumulator: Should be from a Variable().
+//	gradient_squared_accumulator: Should be from a Variable().
+//	grad: The gradient.
+//	lr: Scaling factor. Must be a scalar.
+//	l1: L1 regularization. Must be a scalar.
+//	l2: L2 regularization. Must be a scalar.
+//	global_step: Training step number. Must be a scalar.
+//
+// Returns the created operation.
+func ResourceApplyAdagradDA(scope *Scope, var_ tf.Output, gradient_accumulator tf.Output, gradient_squared_accumulator tf.Output, grad tf.Output, lr tf.Output, l1 tf.Output, l2 tf.Output, global_step tf.Output, optional ...ResourceApplyAdagradDAAttr) (o *tf.Operation) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "ResourceApplyAdagradDA",
+		Input: []tf.Input{
+			var_, gradient_accumulator, gradient_squared_accumulator, grad, lr, l1, l2, global_step,
+		},
+		Attrs: attrs,
+	}
+	return scope.AddOperation(opspec)
+}
+
+// FractionalMaxPoolGradAttr is an optional argument to FractionalMaxPoolGrad.
+type FractionalMaxPoolGradAttr func(optionalAttr)
+
+// FractionalMaxPoolGradOverlapping sets the optional overlapping attribute to value.
+//
+// value: When set to True, it means when pooling, the values at the boundary
+// of adjacent pooling cells are used by both cells. For example:
+//
+// `index  0  1  2  3  4`
+//
+// `value  20 5  16 3  7`
+//
+// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
+// The result would be [20, 16] for fractional max pooling.
+// If not specified, defaults to false
+func FractionalMaxPoolGradOverlapping(value bool) FractionalMaxPoolGradAttr {
+	return func(m optionalAttr) {
+		m["overlapping"] = value
+	}
+}
+
+// Computes gradient of the FractionalMaxPool function.
+//
+// Arguments:
+//	orig_input: Original input for `fractional_max_pool`
+//	orig_output: Original output for `fractional_max_pool`
+//	out_backprop: 4-D with shape `[batch, height, width, channels]`.  Gradients
+// w.r.t. the output of `fractional_max_pool`.
+//	row_pooling_sequence: row pooling sequence, form pooling region with
+// col_pooling_sequence.
+//	col_pooling_sequence: column pooling sequence, form pooling region with
+// row_pooling sequence.
+//
+// Returns 4-D.  Gradients w.r.t. the input of `fractional_max_pool`.
+func FractionalMaxPoolGrad(scope *Scope, orig_input tf.Output, orig_output tf.Output, out_backprop tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output, optional ...FractionalMaxPoolGradAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "FractionalMaxPoolGrad",
+		Input: []tf.Input{
+			orig_input, orig_output, out_backprop, row_pooling_sequence, col_pooling_sequence,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
 // AvgPool3DGradAttr is an optional argument to AvgPool3DGrad.
 type AvgPool3DGradAttr func(optionalAttr)
 
@@ -12160,6 +12291,118 @@ func SparseDenseCwiseMul(scope *Scope, sp_indices tf.Output, sp_values tf.Output
 	return op.Output(0)
 }
 
+// NonMaxSuppressionAttr is an optional argument to NonMaxSuppression.
+type NonMaxSuppressionAttr func(optionalAttr)
+
+// NonMaxSuppressionIouThreshold sets the optional iou_threshold attribute to value.
+//
+// value: A float representing the threshold for deciding whether boxes
+// overlap too much with respect to IOU.
+// If not specified, defaults to 0.5
+func NonMaxSuppressionIouThreshold(value float32) NonMaxSuppressionAttr {
+	return func(m optionalAttr) {
+		m["iou_threshold"] = value
+	}
+}
+
+// Greedily selects a subset of bounding boxes in descending order of score,
+//
+// pruning away boxes that have high intersection-over-union (IOU) overlap
+// with previously selected boxes.  Bounding boxes are supplied as
+// [y1, x1, y2, x2], where (y1, x1) and (y2, x2) are the coordinates of any
+// diagonal pair of box corners and the coordinates can be provided as normalized
+// (i.e., lying in the interval [0, 1]) or absolute.  Note that this algorithm
+// is agnostic to where the origin is in the coordinate system.  Note that this
+// algorithm is invariant to orthogonal transformations and translations
+// of the coordinate system; thus translating or reflections of the coordinate
+// system result in the same boxes being selected by the algorithm.
+//
+// The output of this operation is a set of integers indexing into the input
+// collection of bounding boxes representing the selected boxes.  The bounding
+// box coordinates corresponding to the selected indices can then be obtained
+// using the `tf.gather` operation.  For example:
+//
+//   selected_indices = tf.image.non_max_suppression(
+//       boxes, scores, max_output_size, iou_threshold)
+//   selected_boxes = tf.gather(boxes, selected_indices)
+//
+// Arguments:
+//	boxes: A 2-D float tensor of shape `[num_boxes, 4]`.
+//	scores: A 1-D float tensor of shape `[num_boxes]` representing a single
+// score corresponding to each box (each row of boxes).
+//	max_output_size: A scalar integer tensor representing the maximum number of
+// boxes to be selected by non max suppression.
+//
+// Returns A 1-D integer tensor of shape `[M]` representing the selected
+// indices from the boxes tensor, where `M <= max_output_size`.
+func NonMaxSuppression(scope *Scope, boxes tf.Output, scores tf.Output, max_output_size tf.Output, optional ...NonMaxSuppressionAttr) (selected_indices tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "NonMaxSuppression",
+		Input: []tf.Input{
+			boxes, scores, max_output_size,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
+// ResourceApplyAdadeltaAttr is an optional argument to ResourceApplyAdadelta.
+type ResourceApplyAdadeltaAttr func(optionalAttr)
+
+// ResourceApplyAdadeltaUseLocking sets the optional use_locking attribute to value.
+//
+// value: If True, updating of the var, accum and update_accum tensors will be protected by
+// a lock; otherwise the behavior is undefined, but may exhibit less contention.
+// If not specified, defaults to false
+func ResourceApplyAdadeltaUseLocking(value bool) ResourceApplyAdadeltaAttr {
+	return func(m optionalAttr) {
+		m["use_locking"] = value
+	}
+}
+
+// Update '*var' according to the adadelta scheme.
+//
+// accum = rho() * accum + (1 - rho()) * grad.square();
+// update = (update_accum + epsilon).sqrt() * (accum + epsilon()).rsqrt() * grad;
+// update_accum = rho() * update_accum + (1 - rho()) * update.square();
+// var -= update;
+//
+// Arguments:
+//	var_: Should be from a Variable().
+//	accum: Should be from a Variable().
+//	accum_update: Should be from a Variable().
+//	lr: Scaling factor. Must be a scalar.
+//	rho: Decay factor. Must be a scalar.
+//	epsilon: Constant factor. Must be a scalar.
+//	grad: The gradient.
+//
+// Returns the created operation.
+func ResourceApplyAdadelta(scope *Scope, var_ tf.Output, accum tf.Output, accum_update tf.Output, lr tf.Output, rho tf.Output, epsilon tf.Output, grad tf.Output, optional ...ResourceApplyAdadeltaAttr) (o *tf.Operation) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "ResourceApplyAdadelta",
+		Input: []tf.Input{
+			var_, accum, accum_update, lr, rho, epsilon, grad,
+		},
+		Attrs: attrs,
+	}
+	return scope.AddOperation(opspec)
+}
+
 // Shuffle dimensions of x according to a permutation.
 //
 // The output `y` has the same rank as `x`. The shapes of `x` and `y` satisfy:
@@ -12861,6 +13104,11 @@ func GatherValidateIndices(value bool) GatherAttr {
 // If `indices` is a permutation and `len(indices) == params.shape[0]` then
 // this operation will permute `params` accordingly.
 //
+// `validate_indices`: DEPRECATED. If this operation is assigned to CPU, values in
+// `indices` are always validated to be within range. If assigned to GPU,
+// out-of-bound indices result in unspecified behavior (currently the result is
+// `0`, but this may become an error in the future).
+//
 // <div style="width:70%; margin:auto; margin-bottom:10px; margin-top:20px;">
 // <img style="width:100%" src="../../images/Gather.png" alt>
 // </div>
@@ -13435,118 +13683,6 @@ func RandomPoisson(scope *Scope, shape tf.Output, rate tf.Output, optional ...Ra
 	return op.Output(0)
 }
 
-// ResourceApplyAdadeltaAttr is an optional argument to ResourceApplyAdadelta.
-type ResourceApplyAdadeltaAttr func(optionalAttr)
-
-// ResourceApplyAdadeltaUseLocking sets the optional use_locking attribute to value.
-//
-// value: If True, updating of the var, accum and update_accum tensors will be protected by
-// a lock; otherwise the behavior is undefined, but may exhibit less contention.
-// If not specified, defaults to false
-func ResourceApplyAdadeltaUseLocking(value bool) ResourceApplyAdadeltaAttr {
-	return func(m optionalAttr) {
-		m["use_locking"] = value
-	}
-}
-
-// Update '*var' according to the adadelta scheme.
-//
-// accum = rho() * accum + (1 - rho()) * grad.square();
-// update = (update_accum + epsilon).sqrt() * (accum + epsilon()).rsqrt() * grad;
-// update_accum = rho() * update_accum + (1 - rho()) * update.square();
-// var -= update;
-//
-// Arguments:
-//	var_: Should be from a Variable().
-//	accum: Should be from a Variable().
-//	accum_update: Should be from a Variable().
-//	lr: Scaling factor. Must be a scalar.
-//	rho: Decay factor. Must be a scalar.
-//	epsilon: Constant factor. Must be a scalar.
-//	grad: The gradient.
-//
-// Returns the created operation.
-func ResourceApplyAdadelta(scope *Scope, var_ tf.Output, accum tf.Output, accum_update tf.Output, lr tf.Output, rho tf.Output, epsilon tf.Output, grad tf.Output, optional ...ResourceApplyAdadeltaAttr) (o *tf.Operation) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "ResourceApplyAdadelta",
-		Input: []tf.Input{
-			var_, accum, accum_update, lr, rho, epsilon, grad,
-		},
-		Attrs: attrs,
-	}
-	return scope.AddOperation(opspec)
-}
-
-// NonMaxSuppressionAttr is an optional argument to NonMaxSuppression.
-type NonMaxSuppressionAttr func(optionalAttr)
-
-// NonMaxSuppressionIouThreshold sets the optional iou_threshold attribute to value.
-//
-// value: A float representing the threshold for deciding whether boxes
-// overlap too much with respect to IOU.
-// If not specified, defaults to 0.5
-func NonMaxSuppressionIouThreshold(value float32) NonMaxSuppressionAttr {
-	return func(m optionalAttr) {
-		m["iou_threshold"] = value
-	}
-}
-
-// Greedily selects a subset of bounding boxes in descending order of score,
-//
-// pruning away boxes that have high intersection-over-union (IOU) overlap
-// with previously selected boxes.  Bounding boxes are supplied as
-// [y1, x1, y2, x2], where (y1, x1) and (y2, x2) are the coordinates of any
-// diagonal pair of box corners and the coordinates can be provided as normalized
-// (i.e., lying in the interval [0, 1]) or absolute.  Note that this algorithm
-// is agnostic to where the origin is in the coordinate system.  Note that this
-// algorithm is invariant to orthogonal transformations and translations
-// of the coordinate system; thus translating or reflections of the coordinate
-// system result in the same boxes being selected by the algorithm.
-//
-// The output of this operation is a set of integers indexing into the input
-// collection of bounding boxes representing the selected boxes.  The bounding
-// box coordinates corresponding to the selected indices can then be obtained
-// using the `tf.gather operation`.  For example:
-//
-//   selected_indices = tf.image.non_max_suppression(
-//       boxes, scores, max_output_size, iou_threshold)
-//   selected_boxes = tf.gather(boxes, selected_indices)
-//
-// Arguments:
-//	boxes: A 2-D float tensor of shape `[num_boxes, 4]`.
-//	scores: A 1-D float tensor of shape `[num_boxes]` representing a single
-// score corresponding to each box (each row of boxes).
-//	max_output_size: A scalar integer tensor representing the maximum number of
-// boxes to be selected by non max suppression.
-//
-// Returns A 1-D integer tensor of shape `[M]` representing the selected
-// indices from the boxes tensor, where `M <= max_output_size`.
-func NonMaxSuppression(scope *Scope, boxes tf.Output, scores tf.Output, max_output_size tf.Output, optional ...NonMaxSuppressionAttr) (selected_indices tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "NonMaxSuppression",
-		Input: []tf.Input{
-			boxes, scores, max_output_size,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0)
-}
-
 // Applies softmax to a batched N-D `SparseTensor`.
 //
 // The inputs represent an N-D SparseTensor  with logical shape `[..., B, C]`
@@ -15015,170 +15151,6 @@ func NotEqual(scope *Scope, x tf.Output, y tf.Output) (z tf.Output) {
 	return op.Output(0)
 }
 
-// FractionalAvgPoolAttr is an optional argument to FractionalAvgPool.
-type FractionalAvgPoolAttr func(optionalAttr)
-
-// FractionalAvgPoolPseudoRandom sets the optional pseudo_random attribute to value.
-//
-// value: When set to True, generates the pooling sequence in a
-// pseudorandom fashion, otherwise, in a random fashion. Check paper [Benjamin
-// Graham, Fractional Max-Pooling](http://arxiv.org/abs/1412.6071) for
-// difference between pseudorandom and random.
-// If not specified, defaults to false
-func FractionalAvgPoolPseudoRandom(value bool) FractionalAvgPoolAttr {
-	return func(m optionalAttr) {
-		m["pseudo_random"] = value
-	}
-}
-
-// FractionalAvgPoolOverlapping sets the optional overlapping attribute to value.
-//
-// value: When set to True, it means when pooling, the values at the boundary
-// of adjacent pooling cells are used by both cells. For example:
-//
-// `index  0  1  2  3  4`
-//
-// `value  20 5  16 3  7`
-//
-// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
-// The result would be [41/3, 26/3] for fractional avg pooling.
-// If not specified, defaults to false
-func FractionalAvgPoolOverlapping(value bool) FractionalAvgPoolAttr {
-	return func(m optionalAttr) {
-		m["overlapping"] = value
-	}
-}
-
-// FractionalAvgPoolDeterministic sets the optional deterministic attribute to value.
-//
-// value: When set to True, a fixed pooling region will be used when
-// iterating over a FractionalAvgPool node in the computation graph. Mainly used
-// in unit test to make FractionalAvgPool deterministic.
-// If not specified, defaults to false
-func FractionalAvgPoolDeterministic(value bool) FractionalAvgPoolAttr {
-	return func(m optionalAttr) {
-		m["deterministic"] = value
-	}
-}
-
-// FractionalAvgPoolSeed sets the optional seed attribute to value.
-//
-// value: If either seed or seed2 are set to be non-zero, the random number
-// generator is seeded by the given seed.  Otherwise, it is seeded by a
-// random seed.
-// If not specified, defaults to 0
-func FractionalAvgPoolSeed(value int64) FractionalAvgPoolAttr {
-	return func(m optionalAttr) {
-		m["seed"] = value
-	}
-}
-
-// FractionalAvgPoolSeed2 sets the optional seed2 attribute to value.
-//
-// value: An second seed to avoid seed collision.
-// If not specified, defaults to 0
-func FractionalAvgPoolSeed2(value int64) FractionalAvgPoolAttr {
-	return func(m optionalAttr) {
-		m["seed2"] = value
-	}
-}
-
-// Performs fractional average pooling on the input.
-//
-// Fractional average pooling is similar to Fractional max pooling in the pooling
-// region generation step. The only difference is that after pooling regions are
-// generated, a mean operation is performed instead of a max operation in each
-// pooling region.
-//
-// Arguments:
-//	value: 4-D with shape `[batch, height, width, channels]`.
-//	pooling_ratio: Pooling ratio for each dimension of `value`, currently only
-// supports row and col dimension and should be >= 1.0. For example, a valid
-// pooling ratio looks like [1.0, 1.44, 1.73, 1.0]. The first and last elements
-// must be 1.0 because we don't allow pooling on batch and channels
-// dimensions. 1.44 and 1.73 are pooling ratio on height and width dimensions
-// respectively.
-//
-// Returns output tensor after fractional avg pooling.row pooling sequence, needed to calculate gradient.column pooling sequence, needed to calculate gradient.
-func FractionalAvgPool(scope *Scope, value tf.Output, pooling_ratio []float32, optional ...FractionalAvgPoolAttr) (output tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{"pooling_ratio": pooling_ratio}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "FractionalAvgPool",
-		Input: []tf.Input{
-			value,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0), op.Output(1), op.Output(2)
-}
-
-// RandomCropAttr is an optional argument to RandomCrop.
-type RandomCropAttr func(optionalAttr)
-
-// RandomCropSeed sets the optional seed attribute to value.
-//
-// value: If either seed or seed2 are set to be non-zero, the random number
-// generator is seeded by the given seed.  Otherwise, it is seeded by a
-// random seed.
-// If not specified, defaults to 0
-func RandomCropSeed(value int64) RandomCropAttr {
-	return func(m optionalAttr) {
-		m["seed"] = value
-	}
-}
-
-// RandomCropSeed2 sets the optional seed2 attribute to value.
-//
-// value: An second seed to avoid seed collision.
-// If not specified, defaults to 0
-func RandomCropSeed2(value int64) RandomCropAttr {
-	return func(m optionalAttr) {
-		m["seed2"] = value
-	}
-}
-
-// Randomly crop `image`.
-//
-// DEPRECATED at GraphDef version 8: Random crop is now pure Python
-//
-// `size` is a 1-D int64 tensor with 2 elements representing the crop height and
-// width.  The values must be non negative.
-//
-// This Op picks a random location in `image` and crops a `height` by `width`
-// rectangle from that location.  The random location is picked so the cropped
-// area will fit inside the original image.
-//
-// Arguments:
-//	image: 3-D of shape `[height, width, channels]`.
-//	size: 1-D of length 2 containing: `crop_height`, `crop_width`..
-//
-// Returns 3-D of shape `[crop_height, crop_width, channels].`
-func RandomCrop(scope *Scope, image tf.Output, size tf.Output, optional ...RandomCropAttr) (output tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "RandomCrop",
-		Input: []tf.Input{
-			image, size,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0)
-}
-
 // Returns immutable tensor from memory region.
 //
 // The current implementation memmaps the tensor from a file.
@@ -15493,6 +15465,170 @@ func IdentityReaderV2(scope *Scope, optional ...IdentityReaderV2Attr) (reader_ha
 	return op.Output(0)
 }
 
+// RandomCropAttr is an optional argument to RandomCrop.
+type RandomCropAttr func(optionalAttr)
+
+// RandomCropSeed sets the optional seed attribute to value.
+//
+// value: If either seed or seed2 are set to be non-zero, the random number
+// generator is seeded by the given seed.  Otherwise, it is seeded by a
+// random seed.
+// If not specified, defaults to 0
+func RandomCropSeed(value int64) RandomCropAttr {
+	return func(m optionalAttr) {
+		m["seed"] = value
+	}
+}
+
+// RandomCropSeed2 sets the optional seed2 attribute to value.
+//
+// value: A second seed to avoid seed collision.
+// If not specified, defaults to 0
+func RandomCropSeed2(value int64) RandomCropAttr {
+	return func(m optionalAttr) {
+		m["seed2"] = value
+	}
+}
+
+// Randomly crop `image`.
+//
+// DEPRECATED at GraphDef version 8: Random crop is now pure Python
+//
+// `size` is a 1-D int64 tensor with 2 elements representing the crop height and
+// width.  The values must be non negative.
+//
+// This Op picks a random location in `image` and crops a `height` by `width`
+// rectangle from that location.  The random location is picked so the cropped
+// area will fit inside the original image.
+//
+// Arguments:
+//	image: 3-D of shape `[height, width, channels]`.
+//	size: 1-D of length 2 containing: `crop_height`, `crop_width`.
+//
+// Returns 3-D of shape `[crop_height, crop_width, channels]`.
+func RandomCrop(scope *Scope, image tf.Output, size tf.Output, optional ...RandomCropAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "RandomCrop",
+		Input: []tf.Input{
+			image, size,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
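As the deprecation note above says, random crop is now done in pure Python; a minimal usage sketch of the Python-level equivalent (sizes are illustrative, and `tf.random_crop` takes a full `[crop_height, crop_width, channels]` size rather than this op's 2-element `size`):

```python
import tensorflow as tf

# Sketch: crop a random 227x227 window from a 300x400 RGB image.
image = tf.zeros([300, 400, 3])
crop = tf.random_crop(image, size=[227, 227, 3], seed=0)

with tf.Session() as sess:
    print(sess.run(crop).shape)  # (227, 227, 3)
```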
+// FractionalAvgPoolAttr is an optional argument to FractionalAvgPool.
+type FractionalAvgPoolAttr func(optionalAttr)
+
+// FractionalAvgPoolPseudoRandom sets the optional pseudo_random attribute to value.
+//
+// value: When set to True, generates the pooling sequence in a
+// pseudorandom fashion, otherwise, in a random fashion. Check paper [Benjamin
+// Graham, Fractional Max-Pooling](http://arxiv.org/abs/1412.6071) for
+// the difference between pseudorandom and random.
+// If not specified, defaults to false
+func FractionalAvgPoolPseudoRandom(value bool) FractionalAvgPoolAttr {
+	return func(m optionalAttr) {
+		m["pseudo_random"] = value
+	}
+}
+
+// FractionalAvgPoolOverlapping sets the optional overlapping attribute to value.
+//
+// value: When set to True, it means when pooling, the values at the boundary
+// of adjacent pooling cells are used by both cells. For example:
+//
+// `index  0  1  2  3  4`
+//
+// `value  20 5  16 3  7`
+//
+// If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used twice.
+// The result would be [41/3, 26/3] for fractional avg pooling.
+// If not specified, defaults to false
+func FractionalAvgPoolOverlapping(value bool) FractionalAvgPoolAttr {
+	return func(m optionalAttr) {
+		m["overlapping"] = value
+	}
+}
+
+// FractionalAvgPoolDeterministic sets the optional deterministic attribute to value.
+//
+// value: When set to True, a fixed pooling region will be used when
+// iterating over a FractionalAvgPool node in the computation graph. Mainly used
+// in unit test to make FractionalAvgPool deterministic.
+// If not specified, defaults to false
+func FractionalAvgPoolDeterministic(value bool) FractionalAvgPoolAttr {
+	return func(m optionalAttr) {
+		m["deterministic"] = value
+	}
+}
+
+// FractionalAvgPoolSeed sets the optional seed attribute to value.
+//
+// value: If either seed or seed2 are set to be non-zero, the random number
+// generator is seeded by the given seed.  Otherwise, it is seeded by a
+// random seed.
+// If not specified, defaults to 0
+func FractionalAvgPoolSeed(value int64) FractionalAvgPoolAttr {
+	return func(m optionalAttr) {
+		m["seed"] = value
+	}
+}
+
+// FractionalAvgPoolSeed2 sets the optional seed2 attribute to value.
+//
+// value: A second seed to avoid seed collision.
+// If not specified, defaults to 0
+func FractionalAvgPoolSeed2(value int64) FractionalAvgPoolAttr {
+	return func(m optionalAttr) {
+		m["seed2"] = value
+	}
+}
+
+// Performs fractional average pooling on the input.
+//
+// Fractional average pooling is similar to Fractional max pooling in the pooling
+// region generation step. The only difference is that after pooling regions are
+// generated, a mean operation is performed instead of a max operation in each
+// pooling region.
+//
+// Arguments:
+//	value: 4-D with shape `[batch, height, width, channels]`.
+//	pooling_ratio: Pooling ratio for each dimension of `value`, currently only
+// supports row and col dimension and should be >= 1.0. For example, a valid
+// pooling ratio looks like [1.0, 1.44, 1.73, 1.0]. The first and last elements
+// must be 1.0 because we don't allow pooling on batch and channels
+// dimensions. 1.44 and 1.73 are pooling ratio on height and width dimensions
+// respectively.
+//
+// Returns the output tensor after fractional avg pooling, along with the row and column pooling sequences needed to calculate gradients.
+func FractionalAvgPool(scope *Scope, value tf.Output, pooling_ratio []float32, optional ...FractionalAvgPoolAttr) (output tf.Output, row_pooling_sequence tf.Output, col_pooling_sequence tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{"pooling_ratio": pooling_ratio}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "FractionalAvgPool",
+		Input: []tf.Input{
+			value,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0), op.Output(1), op.Output(2)
+}
+
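A short usage sketch of the corresponding Python API (assuming the standard `tf.nn.fractional_avg_pool` signature); the trailing comment reproduces the overlapping-pooling arithmetic from the `FractionalAvgPoolOverlapping` doc above:

```python
import tensorflow as tf

# Sketch: ~1.44x and ~1.73x reduction on height and width; the batch and
# channel ratios must stay at 1.0.
value = tf.random_uniform([1, 20, 30, 3])
output, row_seq, col_seq = tf.nn.fractional_avg_pool(
    value,
    pooling_ratio=[1.0, 1.44, 1.73, 1.0],
    pseudo_random=True,
    overlapping=True,
    seed=1)

# Overlapping example from the doc: for values [20, 5, 16, 3, 7] and pooling
# sequence [0, 2, 4], the boundary value 16 is shared by both cells, giving
# averages (20 + 5 + 16) / 3 = 41/3 and (16 + 3 + 7) / 3 = 26/3.
```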
 // Produces the average pool of the input tensor for quantized types.
 //
 // Arguments:
@@ -18157,59 +18293,6 @@ func Bincount(scope *Scope, arr tf.Output, size tf.Output, weights tf.Output) (b
 	return op.Output(0)
 }
 
-// CropAndResizeGradBoxesAttr is an optional argument to CropAndResizeGradBoxes.
-type CropAndResizeGradBoxesAttr func(optionalAttr)
-
-// CropAndResizeGradBoxesMethod sets the optional method attribute to value.
-//
-// value: A string specifying the interpolation method. Only 'bilinear' is
-// supported for now.
-// If not specified, defaults to "bilinear"
-func CropAndResizeGradBoxesMethod(value string) CropAndResizeGradBoxesAttr {
-	return func(m optionalAttr) {
-		m["method"] = value
-	}
-}
-
-// Computes the gradient of the crop_and_resize op wrt the input boxes tensor.
-//
-// Arguments:
-//	grads: A 4-D tensor of shape `[num_boxes, crop_height, crop_width, depth]`.
-//	image: A 4-D tensor of shape `[batch, image_height, image_width, depth]`.
-// Both `image_height` and `image_width` need to be positive.
-//	boxes: A 2-D tensor of shape `[num_boxes, 4]`. The `i`-th row of the tensor
-// specifies the coordinates of a box in the `box_ind[i]` image and is specified
-// in normalized coordinates `[y1, x1, y2, x2]`. A normalized coordinate value of
-// `y` is mapped to the image coordinate at `y * (image_height - 1)`, so as the
-// `[0, 1]` interval of normalized image height is mapped to
-// `[0, image_height - 1] in image height coordinates. We do allow y1 > y2, in
-// which case the sampled crop is an up-down flipped version of the original
-// image. The width dimension is treated similarly. Normalized coordinates
-// outside the `[0, 1]` range are allowed, in which case we use
-// `extrapolation_value` to extrapolate the input image values.
-//	box_ind: A 1-D tensor of shape `[num_boxes]` with int32 values in `[0, batch)`.
-// The value of `box_ind[i]` specifies the image that the `i`-th box refers to.
-//
-// Returns A 2-D tensor of shape `[num_boxes, 4]`.
-func CropAndResizeGradBoxes(scope *Scope, grads tf.Output, image tf.Output, boxes tf.Output, box_ind tf.Output, optional ...CropAndResizeGradBoxesAttr) (output tf.Output) {
-	if scope.Err() != nil {
-		return
-	}
-	attrs := map[string]interface{}{}
-	for _, a := range optional {
-		a(attrs)
-	}
-	opspec := tf.OpSpec{
-		Type: "CropAndResizeGradBoxes",
-		Input: []tf.Input{
-			grads, image, boxes, box_ind,
-		},
-		Attrs: attrs,
-	}
-	op := scope.AddOperation(opspec)
-	return op.Output(0)
-}
-
 // Reshapes a quantized tensor as per the Reshape op.
 //
 // ```
@@ -20075,6 +20158,54 @@ func TensorArrayGatherV3(scope *Scope, handle tf.Output, indices tf.Output, flow
 	return op.Output(0)
 }
 
+// MaxPoolGradGradAttr is an optional argument to MaxPoolGradGrad.
+type MaxPoolGradGradAttr func(optionalAttr)
+
+// MaxPoolGradGradDataFormat sets the optional data_format attribute to value.
+//
+// value: Specify the data format of the input and output data. With the
+// default format "NHWC", the data is stored in the order of:
+//     [batch, in_height, in_width, in_channels].
+// Alternatively, the format could be "NCHW", the data storage order of:
+//     [batch, in_channels, in_height, in_width].
+// If not specified, defaults to "NHWC"
+func MaxPoolGradGradDataFormat(value string) MaxPoolGradGradAttr {
+	return func(m optionalAttr) {
+		m["data_format"] = value
+	}
+}
+
+// Computes second-order gradients of the maxpooling function.
+//
+// Arguments:
+//	orig_input: The original input tensor.
+//	orig_output: The original output tensor.
+//	grad: 4-D.  Gradients of gradients w.r.t. the input of `max_pool`.
+//	ksize: The size of the window for each dimension of the input tensor.
+//	strides: The stride of the sliding window for each dimension of the
+// input tensor.
+//	padding: The type of padding algorithm to use.
+//
+// Returns Gradients of gradients w.r.t. the input to `max_pool`.
+func MaxPoolGradGrad(scope *Scope, orig_input tf.Output, orig_output tf.Output, grad tf.Output, ksize []int64, strides []int64, padding string, optional ...MaxPoolGradGradAttr) (output tf.Output) {
+	if scope.Err() != nil {
+		return
+	}
+	attrs := map[string]interface{}{"ksize": ksize, "strides": strides, "padding": padding}
+	for _, a := range optional {
+		a(attrs)
+	}
+	opspec := tf.OpSpec{
+		Type: "MaxPoolGradGrad",
+		Input: []tf.Input{
+			orig_input, orig_output, grad,
+		},
+		Attrs: attrs,
+	}
+	op := scope.AddOperation(opspec)
+	return op.Output(0)
+}
+
 // 3D real-valued fast Fourier transform.
 //
 // Computes the 3-dimensional discrete Fourier transform of a real-valued signal
diff --git a/tensorflow/python/estimator/estimator.py b/tensorflow/python/estimator/estimator.py
index c20a24b2ee3..36918af5529 100644
--- a/tensorflow/python/estimator/estimator.py
+++ b/tensorflow/python/estimator/estimator.py
@@ -266,7 +266,11 @@ class Estimator(object):
         checkpoint_path=checkpoint_path,
         name=name)
 
-  def predict(self, input_fn, predict_keys=None, hooks=None, checkpoint_path=None):
+  def predict(self,
+              input_fn,
+              predict_keys=None,
+              hooks=None,
+              checkpoint_path=None):
     """Returns predictions for given features.
 
     Args:
diff --git a/tensorflow/python/estimator/estimator_test.py b/tensorflow/python/estimator/estimator_test.py
index 889ba3cf38e..a1659156a62 100644
--- a/tensorflow/python/estimator/estimator_test.py
+++ b/tensorflow/python/estimator/estimator_test.py
@@ -627,7 +627,10 @@ class EstimatorPredictTest(test.TestCase):
   def test_no_trained_model_invalid_checkpoint_path(self):
     est = estimator.Estimator(model_fn=model_fn_global_step_incrementer)
     with self.assertRaises(ValueError):
-      next(est.predict(dummy_input_fn, checkpoint_path=saver.latest_checkpoint("fakedir")))
+      next(
+          est.predict(
+              dummy_input_fn,
+              checkpoint_path=saver.latest_checkpoint('fakedir')))
 
   def test_tensor_predictions(self):
 
@@ -848,9 +851,12 @@ class EstimatorPredictTest(test.TestCase):
     est1 = estimator.Estimator(model_fn=_model_fn)
     est1.train(dummy_input_fn, steps=1)
     est2 = estimator.Estimator(model_fn=_model_fn, model_dir=est1.model_dir)
-    self.assertEqual([32.], next(est2.predict(
-      dummy_input_fn,
-      checkpoint_path=saver.latest_checkpoint(est1.model_dir))))
+    self.assertEqual(
+        [32.],
+        next(
+            est2.predict(
+                dummy_input_fn,
+                checkpoint_path=saver.latest_checkpoint(est1.model_dir))))
 
   def test_scaffold_is_used(self):
 
diff --git a/tensorflow/python/estimator/inputs/queues/feeding_functions.py b/tensorflow/python/estimator/inputs/queues/feeding_functions.py
index dab8ffea757..a6f5157680f 100644
--- a/tensorflow/python/estimator/inputs/queues/feeding_functions.py
+++ b/tensorflow/python/estimator/inputs/queues/feeding_functions.py
@@ -20,9 +20,9 @@ from __future__ import print_function
 
 import collections
 import random
+import types as tp
 import numpy as np
 import six
-import types as tp
 
 from tensorflow.python.estimator.inputs.queues import feeding_queue_runner as fqr
 from tensorflow.python.framework import dtypes
@@ -245,8 +245,8 @@ class _GeneratorFeedFn(object):
 
   def __call__(self):
     if self._num_epochs and self._epoch >= self._num_epochs:
-      raise errors.OutOfRangeError(
-          None, None, "Already emitted %s epochs." % self._epoch)
+      raise errors.OutOfRangeError(None, None,
+                                   "Already emitted %s epochs." % self._epoch)
     list_dict = {}
     list_dict_size = 0
     while list_dict_size < self._batch_size:
@@ -258,8 +258,9 @@ class _GeneratorFeedFn(object):
         data_row = next(self._iterator)
       for index, key in enumerate(self._keys):
         if key not in data_row.keys():
-          raise KeyError('key mismatch between dicts emitted by GenFun'
-              'Expected {} keys; got {}'.format( self._keys, data_row.keys()))
+          raise KeyError("key mismatch between dicts emitted by GenFun. "
+                         "Expected {} keys; got {}".format(
+                             self._keys, data_row.keys()))
         list_dict.setdefault(self._col_placeholders[index],
                              list()).append(data_row[key])
         list_dict_size += 1
diff --git a/tensorflow/python/framework/tensor_util_test.py b/tensorflow/python/framework/tensor_util_test.py
index 727438a56df..5eb5230404d 100644
--- a/tensorflow/python/framework/tensor_util_test.py
+++ b/tensorflow/python/framework/tensor_util_test.py
@@ -18,8 +18,8 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
-import numpy as np
 import sys
+import numpy as np
 
 from tensorflow.python.framework import constant_op
 from tensorflow.python.framework import dtypes
@@ -48,13 +48,13 @@ class TensorUtilTest(test.TestCase):
 
   def testFloatN(self):
     t = tensor_util.make_tensor_proto([10.0, 20.0, 30.0])
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
         tensor_shape { dim { size: 3 } }
@@ -66,12 +66,12 @@ class TensorUtilTest(test.TestCase):
 
   def testFloatTyped(self):
     t = tensor_util.make_tensor_proto([10.0, 20.0, 30.0], dtype=dtypes.float32)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
+        """, t)
     else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
@@ -84,13 +84,13 @@ class TensorUtilTest(test.TestCase):
 
   def testFloatTypeCoerce(self):
     t = tensor_util.make_tensor_proto([10, 20, 30], dtype=dtypes.float32)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
         tensor_shape { dim { size: 3 } }
@@ -103,13 +103,13 @@ class TensorUtilTest(test.TestCase):
   def testFloatTypeCoerceNdarray(self):
     arr = np.asarray([10, 20, 30], dtype="int")
     t = tensor_util.make_tensor_proto(arr, dtype=dtypes.float32)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
-    else: 
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
         tensor_shape { dim { size: 3 } }
@@ -121,13 +121,13 @@ class TensorUtilTest(test.TestCase):
 
   def testFloatSizes(self):
     t = tensor_util.make_tensor_proto([10.0, 20.0, 30.0], shape=[1, 3])
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 1 } dim { size: 3 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
         tensor_shape { dim { size: 1 } dim { size: 3 } }
@@ -139,13 +139,13 @@ class TensorUtilTest(test.TestCase):
 
   def testFloatSizes2(self):
     t = tensor_util.make_tensor_proto([10.0, 20.0, 30.0], shape=[3, 1])
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_FLOAT  
         tensor_shape { dim { size: 3 } dim { size: 1 } }  
         tensor_content: "A \000\000A\240\000\000A\360\000\000"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_FLOAT
         tensor_shape { dim { size: 3 } dim { size: 1 } }
@@ -167,13 +167,13 @@ class TensorUtilTest(test.TestCase):
   def testFloatNpArrayFloat64(self):
     t = tensor_util.make_tensor_proto(
         np.array([[10.0, 20.0, 30.0]], dtype=np.float64))
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_DOUBLE  
         tensor_shape { dim { size: 1 } dim { size: 3 } }  
         tensor_content: "@$\000\000\000\000\000\000@4\000\000\000\000\000\000@>\000\000\000\000\000\000"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_DOUBLE
         tensor_shape { dim { size: 1 } dim { size: 3 } }
@@ -258,13 +258,13 @@ class TensorUtilTest(test.TestCase):
 
   def testIntNDefaultType(self):
     t = tensor_util.make_tensor_proto([10, 20, 30, 40], shape=[2, 2])
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_INT32  
         tensor_shape { dim { size: 2 } dim { size: 2 } }  
         tensor_content: "\000\000\000\\n\000\000\000\024\000\000\000\036\000\000\000("  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_INT32
         tensor_shape { dim { size: 2 } dim { size: 2 } }
@@ -328,13 +328,13 @@ class TensorUtilTest(test.TestCase):
   def testLongN(self):
     t = tensor_util.make_tensor_proto(
         [10, 20, 30], shape=[1, 3], dtype=dtypes.int64)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_INT64  
         tensor_shape { dim { size: 1 } dim { size: 3 } }  
         tensor_content: "\000\000\000\000\000\000\000\\n\000\000\000\000\000\000\000\024\000\000\000\000\000\000\000\036"  
-        """, t)  
-    else: 
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_INT64
         tensor_shape { dim { size: 1 } dim { size: 3 } }
@@ -346,13 +346,13 @@ class TensorUtilTest(test.TestCase):
 
   def testLongNpArray(self):
     t = tensor_util.make_tensor_proto(np.array([10, 20, 30]))
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_INT64  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "\000\000\000\000\000\000\000\\n\000\000\000\000\000\000\000\024\000\000\000\000\000\000\000\036"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_INT64
         tensor_shape { dim { size: 3 } }
@@ -367,13 +367,13 @@ class TensorUtilTest(test.TestCase):
     data = [(21,), (22,), (23,)]
 
     t = tensor_util.make_tensor_proto(data, dtype=dtypes.qint32)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_QINT32  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "\000\000\000\025\000\000\000\026\000\000\000\027"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_QINT32
         tensor_shape { dim { size: 3 } }
@@ -404,13 +404,13 @@ class TensorUtilTest(test.TestCase):
     self.assertAllEqual(np.array(data, dtype=a.dtype), a)
 
     t = tensor_util.make_tensor_proto(data, dtype=dtypes.quint16)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_QUINT16  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "\000\025\000\026\000\027"  
-        """, t)  
-    else:  
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_QUINT16
         tensor_shape { dim { size: 3 } }
@@ -421,13 +421,13 @@ class TensorUtilTest(test.TestCase):
     self.assertAllEqual(np.array(data, dtype=a.dtype), a)
 
     t = tensor_util.make_tensor_proto(data, dtype=dtypes.qint16)
-    if sys.byteorder == "big":  
-      self.assertProtoEquals("""  
+    if sys.byteorder == "big":
+      self.assertProtoEquals("""
         dtype: DT_QINT16  
         tensor_shape { dim { size: 3 } }  
         tensor_content: "\000\025\000\026\000\027"  
-        """, t)  
-    else: 
+        """, t)
+    else:
       self.assertProtoEquals("""
         dtype: DT_QINT16
         tensor_shape { dim { size: 3 } }
@@ -669,7 +669,9 @@ class TensorUtilTest(test.TestCase):
     self.assertFalse(tensor_util.ShapeEquals(t, [4]))
 
   def testMockArray(self):
+
     class MockArray(object):
+
       def __init__(self, array):
         self.array = array
 
diff --git a/tensorflow/python/kernel_tests/pooling_ops_3d_test.py b/tensorflow/python/kernel_tests/pooling_ops_3d_test.py
index 5e9b7766a77..fa1553a3f6b 100644
--- a/tensorflow/python/kernel_tests/pooling_ops_3d_test.py
+++ b/tensorflow/python/kernel_tests/pooling_ops_3d_test.py
@@ -261,7 +261,7 @@ class PoolingTest(test.TestCase):
           padding=padding,
           data_format=data_format,
           name=func_name)
-      t_g = gradients_impl.gradients(t ** 2, input_tensor)[0]
+      t_g = gradients_impl.gradients(t**2, input_tensor)[0]
 
       err_g = gradient_checker.compute_gradient_error(
           input_tensor,
diff --git a/tensorflow/python/kernel_tests/pooling_ops_test.py b/tensorflow/python/kernel_tests/pooling_ops_test.py
index 85b01be2663..1b6c8bef986 100644
--- a/tensorflow/python/kernel_tests/pooling_ops_test.py
+++ b/tensorflow/python/kernel_tests/pooling_ops_test.py
@@ -24,10 +24,10 @@ from tensorflow.python.framework import constant_op
 from tensorflow.python.framework import dtypes
 from tensorflow.python.framework import errors_impl
 from tensorflow.python.framework import test_util
-from tensorflow.python.ops import gradients_impl
 from tensorflow.python.ops import array_ops
 from tensorflow.python.ops import gen_nn_ops
 from tensorflow.python.ops import gradient_checker
+from tensorflow.python.ops import gradients_impl
 from tensorflow.python.ops import nn_ops
 import tensorflow.python.ops.nn_grad  # pylint: disable=unused-import
 from tensorflow.python.platform import test
@@ -97,7 +97,7 @@ class PoolingTest(test.TestCase):
     # Initializes the input tensor with array containing incrementing
     # numbers from 1.
     x = [f * 1.0 for f in range(1, total_size + 1)]
-    with self.test_session(use_gpu=use_gpu) as sess:
+    with self.test_session(use_gpu=use_gpu):
       t = constant_op.constant(x, shape=input_sizes, dtype=data_type)
       if data_format == "NCHW":
         t = test_util.NHWCToNCHW(t)
@@ -497,7 +497,7 @@ class PoolingTest(test.TestCase):
                                          strides,
                                          error_msg,
                                          use_gpu=False):
-    with self.test_session(use_gpu=use_gpu) as sess:
+    with self.test_session(use_gpu=use_gpu):
       t = constant_op.constant(1.0, shape=in_size)
       with self.assertRaisesRegexp(errors_impl.UnimplementedError, error_msg):
         t = nn_ops.max_pool(
@@ -562,7 +562,8 @@ class PoolingTest(test.TestCase):
         self.assertShapeEqual(cpu_val, out_op)
       # The CPU version accumulates its gradient on fp16, so it's less
       # accurate than the GPU version that does the accumulation on fp32
-      self.assertAllCloseAccordingToType(cpu_val, gpu_val, half_rtol=0.01, half_atol=0.01)
+      self.assertAllCloseAccordingToType(
+          cpu_val, gpu_val, half_rtol=0.01, half_atol=0.01)
 
   def _CompareMaxPoolingGradBk(self, input_shape, output_shape, ksize, strides,
                                padding):
@@ -570,14 +571,13 @@ class PoolingTest(test.TestCase):
       # Generate numbers in a narrow range, so that there are many duplicates
       # in the input.
       tensor_input = np.random.random_integers(0, 3, input_shape).astype(dtype)
-      tensor_output = np.random.rand(*output_shape).astype(dtype)
       with self.test_session(use_gpu=True):
         t = constant_op.constant(tensor_input, shape=input_shape)
         _, argmax_op = nn_ops.max_pool_with_argmax(t, ksize, strides, padding)
         argmax = argmax_op.eval()
         grad_in = constant_op.constant(tensor_input, shape=input_shape)
-        out_op = gen_nn_ops._max_pool_grad_grad_with_argmax(t, grad_in, argmax,
-                                                            ksize, strides, padding)
+        out_op = gen_nn_ops._max_pool_grad_grad_with_argmax(
+            t, grad_in, argmax, ksize, strides, padding)
         gpu_val = out_op.eval()
         self.assertShapeEqual(gpu_val, out_op)
       with self.test_session(use_gpu=False):
@@ -585,13 +585,14 @@ class PoolingTest(test.TestCase):
         out_op = nn_ops.max_pool(t, ksize, strides, padding)
         orig_out = out_op.eval()
         grad_in = constant_op.constant(tensor_input, shape=input_shape)
-        out_op = gen_nn_ops._max_pool_grad_grad(t, orig_out, grad_in, ksize, strides,
-                                                padding)
+        out_op = gen_nn_ops._max_pool_grad_grad(t, orig_out, grad_in, ksize,
+                                                strides, padding)
         cpu_val = out_op.eval()
         self.assertShapeEqual(cpu_val, out_op)
       # The CPU version accumulates its gradient on fp16, so it's less
       # accurate than the GPU version that does the accumulation on fp32
-      self.assertAllCloseAccordingToType(cpu_val, gpu_val, half_rtol=0.01, half_atol=0.01)
+      self.assertAllCloseAccordingToType(
+          cpu_val, gpu_val, half_rtol=0.01, half_atol=0.01)
 
   def testMaxPoolingWithArgmax(self):
     # MaxPoolWithArgMax is implemented only on CUDA.
@@ -619,7 +620,7 @@ class PoolingTest(test.TestCase):
     orig_input = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0]
     tensor_input = [11.0, 12.0, 13.0, 14.0]
     tensor_argmax = list(np.array([0, 1, 3, 5], dtype=np.int64))
-    with self.test_session(use_gpu=True) as sess:
+    with self.test_session(use_gpu=True):
       orig_in = constant_op.constant(orig_input, shape=[1, 3, 3, 1])
       t = constant_op.constant(tensor_input, shape=[1, 2, 2, 1])
       argmax = constant_op.constant(
@@ -642,7 +643,7 @@ class PoolingTest(test.TestCase):
     orig_input = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0]
     tensor_input = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]
     tensor_argmax = list(np.array([0, 1, 3, 5], dtype=np.int64))
-    with self.test_session(use_gpu=True) as sess:
+    with self.test_session(use_gpu=True):
       orig_in = constant_op.constant(orig_input, shape=[1, 3, 3, 1])
       t = constant_op.constant(tensor_input, shape=[1, 3, 3, 1])
       argmax = constant_op.constant(
@@ -655,8 +656,7 @@ class PoolingTest(test.TestCase):
           strides=[1, 1, 1, 1],
           padding="VALID")
       out = out_op.eval().flatten()
-      self.assertAllClose(out,
-                          [11.0, 12.0, 14.0, 16.0])
+      self.assertAllClose(out, [11.0, 12.0, 14.0, 16.0])
 
   def _ConstructAndTestGradient(self,
                                 pool_func,
@@ -791,16 +791,16 @@ class PoolingTest(test.TestCase):
         strides = [1, row_stride, col_stride, 1]
         t = input_tensor
       t = pool_func(
-        t,
-        ksize=ksize,
-        strides=strides,
-        padding=padding,
-        data_format=data_format,
-        name=func_name)
+          t,
+          ksize=ksize,
+          strides=strides,
+          padding=padding,
+          data_format=data_format,
+          name=func_name)
       if data_format == "NCHW":
         t = test_util.NHWCToNCHW(t)
 
-      t_g = gradients_impl.gradients(t ** 2, input_tensor)[0]
+      t_g = gradients_impl.gradients(t**2, input_tensor)[0]
       err = gradient_checker.compute_gradient_error(
           input_tensor,
           input_sizes,
@@ -952,7 +952,7 @@ class PoolingTest(test.TestCase):
                              expected_input_backprop, input_sizes, output_sizes,
                              window_rows, window_cols, row_stride, col_stride,
                              padding, use_gpu):
-    with self.test_session(use_gpu=use_gpu) as sess:
+    with self.test_session(use_gpu=use_gpu):
       input_tensor = constant_op.constant(input_data, shape=input_sizes)
       output_tensor = nn_ops.max_pool(input_tensor,
                                       [1, window_rows, window_cols, 1],
@@ -1312,8 +1312,10 @@ class PoolingTest(test.TestCase):
       A Tensor.
     """
     return gen_nn_ops._max_pool_grad_grad(orig_input, orig_output, grad,
-                                          [1, window_rows, window_cols, 1],
-                                          [1, row_stride, col_stride, 1], padding)
+                                          [1, window_rows, window_cols,
+                                           1], [1, row_stride, col_stride,
+                                                1], padding)
+
   def testAvgPoolGrad(self):
     for (data_format, use_gpu) in GetTestConfigs():
       self._testAvgPoolGradValidPadding1_1(data_format, use_gpu)
@@ -1501,7 +1503,9 @@ def GetMaxPoolGradTest(input_size, filter_size, output_size, strides, padding):
 
   return Test
 
-def GetMaxPoolGradGradTest(input_size, filter_size, output_size, strides, padding):
+
+def GetMaxPoolGradGradTest(input_size, filter_size, output_size, strides,
+                           padding):
 
   def Test(self):
     # MaxPoolWithArgMax is implemented only on CUDA.
@@ -1522,6 +1526,6 @@ if __name__ == "__main__":
             GetMaxPoolGradTest(input_size_, filter_size_, output_size_, stride_,
                                padding_))
     setattr(PoolingTest, "testMaxPoolGradGrad_" + name_,
-            GetMaxPoolGradGradTest(input_size_, filter_size_, output_size_, stride_,
-                                   padding_))
+            GetMaxPoolGradGradTest(input_size_, filter_size_, output_size_,
+                                   stride_, padding_))
   test.main()
diff --git a/tensorflow/python/kernel_tests/py_func_test.py b/tensorflow/python/kernel_tests/py_func_test.py
index f1bb3bdc228..c7fc7dd5826 100644
--- a/tensorflow/python/kernel_tests/py_func_test.py
+++ b/tensorflow/python/kernel_tests/py_func_test.py
@@ -283,6 +283,28 @@ class PyOpTest(test.TestCase):
     with self.test_session() as sess:
       self.assertEqual(sess.run(f), [])
 
+  def _testExceptionHandling(self, py_exp, tf_exp):
+
+    def raise_exception():
+      raise py_exp("blah")  # pylint: disable=not-callable
+
+    f = script_ops.py_func(raise_exception, [], [])
+    with self.test_session() as sess:
+      with self.assertRaisesRegexp(tf_exp, "blah"):
+        sess.run(f)
+
+  def testExceptionHandling(self):
+    self._testExceptionHandling(ValueError, errors.InvalidArgumentError)
+    self._testExceptionHandling(TypeError, errors.InvalidArgumentError)
+    self._testExceptionHandling(StopIteration, errors.OutOfRangeError)
+    self._testExceptionHandling(MemoryError, errors.ResourceExhaustedError)
+    self._testExceptionHandling(NotImplementedError, errors.UnimplementedError)
+
+    class WeirdError(Exception):
+      pass
+
+    self._testExceptionHandling(WeirdError, errors.UnknownError)
+
 
 if __name__ == "__main__":
   test.main()
diff --git a/tensorflow/python/kernel_tests/resource_variable_ops_test.py b/tensorflow/python/kernel_tests/resource_variable_ops_test.py
index 0b81dcb8afe..2fba15801cb 100644
--- a/tensorflow/python/kernel_tests/resource_variable_ops_test.py
+++ b/tensorflow/python/kernel_tests/resource_variable_ops_test.py
@@ -195,6 +195,39 @@ class ResourceVariableOpsTest(test_util.TensorFlowTestCase):
     self.assertIsInstance(w.dtype, dtypes.DType)
     self.assertEqual(v.dtype, w.dtype)
 
+  def testCachingDevice(self):
+    with ops.device("/job:server/task:1"):
+      v = resource_variable_ops.ResourceVariable(
+          2.0, caching_device="/job:localhost")
+      self.assertEqual("/job:localhost", v.value().device)
+      with self.assertRaisesRegexp(ValueError, "No attr named '_class'"):
+        _ = v.value().op.get_attr("_class")
+
+    with ops.colocate_with(v.op):
+      w = resource_variable_ops.ResourceVariable(
+          2.0, caching_device="/job:localhost")
+      self.assertEqual("/job:localhost", w.value().device)
+      with self.assertRaisesRegexp(ValueError, "No attr named '_class'"):
+        _ = w.value().op.get_attr("_class")
+
+  def testSharedName(self):
+    with self.test_session():
+      v = resource_variable_ops.ResourceVariable(300.0, name="var1")
+      v.initializer.run()
+
+      w = resource_variable_ops.var_handle_op(dtype=v.dtype.base_dtype,
+                                              shape=v.get_shape(),
+                                              shared_name="var1")
+      w_read = resource_variable_ops.read_variable_op(w, v.dtype.base_dtype)
+      self.assertEqual(300.0, w_read.eval())
+
+      x = resource_variable_ops.var_handle_op(dtype=v.dtype.base_dtype,
+                                              shape=v.get_shape(),
+                                              shared_name="var1/")
+      x_read = resource_variable_ops.read_variable_op(x, v.dtype.base_dtype)
+      with self.assertRaisesOpError("Resource .*/var1//.* does not exist"):
+        _ = x_read.eval()
+
 
 if __name__ == "__main__":
   test.main()
diff --git a/tensorflow/python/lib/core/py_func.cc b/tensorflow/python/lib/core/py_func.cc
index f6062aa03d9..fe0557b535b 100644
--- a/tensorflow/python/lib/core/py_func.cc
+++ b/tensorflow/python/lib/core/py_func.cc
@@ -22,6 +22,7 @@ limitations under the License.
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/lib/core/errors.h"
 #include "tensorflow/core/lib/core/threadpool.h"
+#include "tensorflow/core/lib/strings/strcat.h"
 #include "tensorflow/core/platform/macros.h"
 #include "tensorflow/core/platform/mutex.h"
 #include "tensorflow/core/platform/types.h"
@@ -172,6 +173,48 @@ bool IsSingleNone(PyObject* obj) {
   return item == Py_None;
 }
 
+// py.__class__.__name__
+const char* ClassName(PyObject* py) {
+/* PyPy doesn't have a separate C API for old-style classes. */
+#if PY_MAJOR_VERSION < 3 && !defined(PYPY_VERSION)
+  if (PyClass_Check(py))
+    return PyString_AS_STRING(
+        CHECK_NOTNULL(reinterpret_cast<PyClassObject*>(py)->cl_name));
+  if (PyInstance_Check(py))
+    return PyString_AS_STRING(CHECK_NOTNULL(
+        reinterpret_cast<PyInstanceObject*>(py)->in_class->cl_name));
+#endif
+  if (Py_TYPE(py) == &PyType_Type) {
+    return reinterpret_cast<PyTypeObject*>(py)->tp_name;
+  }
+  return Py_TYPE(py)->tp_name;
+}
+
+string PyExcFetch() {
+  CHECK(PyErr_Occurred()) << "Must only call PyExcFetch after an exception.";
+  PyObject* ptype;
+  PyObject* pvalue;
+  PyObject* ptraceback;
+  PyErr_Fetch(&ptype, &pvalue, &ptraceback);
+  PyErr_NormalizeException(&ptype, &pvalue, &ptraceback);
+  string err = ClassName(ptype);
+  if (pvalue) {
+    PyObject* str = PyObject_Str(pvalue);
+    if (str) {
+#if PY_MAJOR_VERSION < 3
+      strings::StrAppend(&err, ": ", PyString_AS_STRING(str));
+#else
+      strings::StrAppend(&err, ": ", PyUnicode_AsUTF8(str));
+#endif
+      Py_DECREF(str);
+    }
+    Py_DECREF(pvalue);
+  }
+  Py_DECREF(ptype);
+  Py_XDECREF(ptraceback);
+  return err;
+}
+
 // Calls the registered py function through the trampoline.
 Status DoCallPyFunc(PyCall* call) {
   PyObject* trampoline = GetPyTrampoline();
@@ -189,11 +232,24 @@ Status DoCallPyFunc(PyCall* call) {
   Py_DECREF(args);
   if (result == nullptr) {
     if (PyErr_Occurred()) {
-      // TODO(zhifengc): Consider pretty-print error using LOG(STDERR).
-      PyErr_Print();
+      if (PyErr_ExceptionMatches(PyExc_ValueError) ||
+          PyErr_ExceptionMatches(PyExc_TypeError)) {
+        return errors::InvalidArgument(PyExcFetch());
+      } else if (PyErr_ExceptionMatches(PyExc_StopIteration)) {
+        return errors::OutOfRange(PyExcFetch());
+      } else if (PyErr_ExceptionMatches(PyExc_MemoryError)) {
+        return errors::ResourceExhausted(PyExcFetch());
+      } else if (PyErr_ExceptionMatches(PyExc_NotImplementedError)) {
+        return errors::Unimplemented(PyExcFetch());
+      } else {
+        // TODO(ebrevdo): Check if exception is an OpError and use the
+        // OpError.error_code property to map it back in the Status.
+        return errors::Unknown(PyExcFetch());
+      }
+    } else {
+      return errors::Internal("Failed to run py callback ", call->token,
+                              ": see error log.");
     }
-    return errors::Internal("Failed to run py callback ", call->token,
-                            ": see error log.");
   }
 
   // Process the return values and converts them to tf Tensors.
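From the Python side, the mapping above means an exception raised inside a `py_func` body now surfaces as the corresponding TF error at `sess.run` time rather than only being printed; a rough sketch mirroring the test added in py_func_test.py earlier in this patch:

```python
import tensorflow as tf

def raise_value_error():
    # ValueError/TypeError -> InvalidArgumentError, StopIteration -> OutOfRangeError,
    # MemoryError -> ResourceExhaustedError, NotImplementedError -> UnimplementedError,
    # any other exception -> UnknownError.
    raise ValueError("blah")

f = tf.py_func(raise_value_error, [], [])
with tf.Session() as sess:
    try:
        sess.run(f)
    except tf.errors.InvalidArgumentError as e:
        print(e.message)  # contains "ValueError: blah"
```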
diff --git a/tensorflow/python/ops/nn_grad.py b/tensorflow/python/ops/nn_grad.py
index 93f6f97ee4a..9b765390b3e 100644
--- a/tensorflow/python/ops/nn_grad.py
+++ b/tensorflow/python/ops/nn_grad.py
@@ -21,10 +21,10 @@ from __future__ import print_function
 from tensorflow.python.framework import dtypes
 from tensorflow.python.framework import ops
 from tensorflow.python.ops import array_ops
+from tensorflow.python.ops import gen_nn_ops
 from tensorflow.python.ops import math_ops
 from tensorflow.python.ops import nn_ops
 from tensorflow.python.ops import sparse_ops
-from tensorflow.python.ops import gen_nn_ops
 
 
 @ops.RegisterGradient("Conv2DBackpropInput")
@@ -132,12 +132,12 @@ def _AvgPool3DGrad(op, grad):
 
 @ops.RegisterGradient("AvgPool3DGrad")
 def _AvgPool3DGradGrad(op, grad):
-  return (array_ops.stop_gradient(op.inputs[0]),
-          gen_nn_ops.avg_pool3d(grad,
-                                op.get_attr("ksize"),
-                                op.get_attr("strides"),
-                                op.get_attr("padding"),
-                                data_format=op.get_attr("data_format")))
+  return (array_ops.stop_gradient(op.inputs[0]), gen_nn_ops.avg_pool3d(
+      grad,
+      op.get_attr("ksize"),
+      op.get_attr("strides"),
+      op.get_attr("padding"),
+      data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("MaxPool3D")
@@ -154,32 +154,34 @@ def _MaxPool3DGrad(op, grad):
 
 @ops.RegisterGradient("MaxPool3DGrad")
 def _MaxPool3DGradGrad(op, grad):
-  return (array_ops.zeros(shape=array_ops.shape(op.inputs[0]),
-                          dtype=op.inputs[0].dtype),
-          array_ops.zeros(shape=array_ops.shape(op.inputs[1]),
-                          dtype=op.inputs[1].dtype),
-          gen_nn_ops._max_pool3d_grad_grad(op.inputs[0],
-                                           op.inputs[1],
-                                           grad,
-                                           op.get_attr("ksize"),
-                                           op.get_attr("strides"),
-                                           padding=op.get_attr("padding"),
-                                           data_format=op.get_attr("data_format")))
+  return (array_ops.zeros(
+      shape=array_ops.shape(op.inputs[0]),
+      dtype=op.inputs[0].dtype), array_ops.zeros(
+          shape=array_ops.shape(op.inputs[1]), dtype=op.inputs[1].dtype),
+          gen_nn_ops._max_pool3d_grad_grad(
+              op.inputs[0],
+              op.inputs[1],
+              grad,
+              op.get_attr("ksize"),
+              op.get_attr("strides"),
+              padding=op.get_attr("padding"),
+              data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("MaxPool3DGradGrad")
 def _MaxPool3DGradGradGrad(op, grad):
-  return (array_ops.zeros(shape=array_ops.shape(op.inputs[0]),
-                          dtype=op.inputs[0].dtype),
-          array_ops.zeros(shape=array_ops.shape(op.inputs[1]),
-                          dtype=op.inputs[1].dtype),
-          gen_nn_ops._max_pool3d_grad(op.inputs[0],
-                                      op.inputs[1],
-                                      grad,
-                                      op.get_attr("ksize"),
-                                      op.get_attr("strides"),
-                                      padding=op.get_attr("padding"),
-                                      data_format=op.get_attr("data_format")))
+  return (array_ops.zeros(
+      shape=array_ops.shape(op.inputs[0]),
+      dtype=op.inputs[0].dtype), array_ops.zeros(
+          shape=array_ops.shape(op.inputs[1]), dtype=op.inputs[1].dtype),
+          gen_nn_ops._max_pool3d_grad(
+              op.inputs[0],
+              op.inputs[1],
+              grad,
+              op.get_attr("ksize"),
+              op.get_attr("strides"),
+              padding=op.get_attr("padding"),
+              data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("Softmax")
@@ -328,7 +330,7 @@ def _EluGradGrad(op, grad):
   return (gen_nn_ops._elu_grad(grad, op.outputs[0]),
           array_ops.where(
               x < 0., gen_nn_ops._elu_grad(grad, op.outputs[0] + 1),
-              array_ops.zeros(shape = array_ops.shape(x), dtype = x.dtype)))
+              array_ops.zeros(shape=array_ops.shape(x), dtype=x.dtype)))
 
 
 @ops.RegisterGradient("Relu6")
@@ -385,12 +387,13 @@ def _SoftmaxCrossEntropyWithLogitsGrad(op, grad_loss, grad_grad):
   softmax_grad = op.outputs[1]
   grad = _BroadcastMul(grad_loss, softmax_grad)
 
-  if grad_grad.op.type not in ('ZerosLike', 'Zeros'):
+  if grad_grad.op.type not in ("ZerosLike", "Zeros"):
     logits = op.inputs[0]
     softmax = nn_ops.softmax(logits)
 
-    grad += ((grad_grad - array_ops.squeeze(math_ops.matmul(grad_grad[:, None, :],
-                                                              softmax[:, :, None]), axis=1)) * softmax)
+    grad += ((grad_grad - array_ops.squeeze(
+        math_ops.matmul(grad_grad[:, None, :],
+                        softmax[:, :, None]), axis=1)) * softmax)
 
   return grad, None
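For reference, the reformatted second-order term above is the usual softmax Jacobian-vector product: with softmax $s$ and incoming gradient $v = \text{grad\_grad}$, per example,

$$(\operatorname{diag}(s) - s s^{\top})\, v \;=\; s \odot \bigl(v - \langle v, s\rangle \mathbf{1}\bigr),$$

which is what `(grad_grad - squeeze(matmul(grad_grad[:, None, :], softmax[:, :, None]), axis=1)) * softmax` computes batch-wise.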
 
@@ -482,12 +485,12 @@ def _AvgPoolGrad(op, grad):
 
 @ops.RegisterGradient("AvgPoolGrad")
 def _AvgPoolGradGrad(op, grad):
-  return (array_ops.stop_gradient(op.inputs[0]),
-          gen_nn_ops._avg_pool(grad,
-                               op.get_attr("ksize"),
-                               op.get_attr("strides"),
-                               op.get_attr("padding"),
-                               data_format=op.get_attr("data_format")))
+  return (array_ops.stop_gradient(op.inputs[0]), gen_nn_ops._avg_pool(
+      grad,
+      op.get_attr("ksize"),
+      op.get_attr("strides"),
+      op.get_attr("padding"),
+      data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("MaxPool")
@@ -503,32 +506,34 @@ def _MaxPoolGrad(op, grad):
 
 @ops.RegisterGradient("MaxPoolGrad")
 def _MaxPoolGradGrad(op, grad):
-  return (array_ops.zeros(shape=array_ops.shape(op.inputs[0]),
-                          dtype=op.inputs[0].dtype),
-          array_ops.zeros(shape=array_ops.shape(op.inputs[1]),
-                          dtype=op.inputs[1].dtype),
-          gen_nn_ops._max_pool_grad_grad(op.inputs[0],
-                                         op.inputs[1],
-                                         grad,
-                                         op.get_attr("ksize"),
-                                         op.get_attr("strides"),
-                                         padding=op.get_attr("padding"),
-                                         data_format=op.get_attr("data_format")))
+  return (array_ops.zeros(
+      shape=array_ops.shape(op.inputs[0]),
+      dtype=op.inputs[0].dtype), array_ops.zeros(
+          shape=array_ops.shape(op.inputs[1]), dtype=op.inputs[1].dtype),
+          gen_nn_ops._max_pool_grad_grad(
+              op.inputs[0],
+              op.inputs[1],
+              grad,
+              op.get_attr("ksize"),
+              op.get_attr("strides"),
+              padding=op.get_attr("padding"),
+              data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("MaxPoolGradGrad")
 def _MaxPoolGradGradGrad(op, grad):
-  return (array_ops.zeros(shape=array_ops.shape(op.inputs[0]),
-                          dtype=op.inputs[0].dtype),
-          array_ops.zeros(shape=array_ops.shape(op.inputs[1]),
-                          dtype=op.inputs[1].dtype),
-          gen_nn_ops._max_pool_grad(op.inputs[0],
-                                    op.inputs[1],
-                                    grad,
-                                    op.get_attr("ksize"),
-                                    op.get_attr("strides"),
-                                    padding=op.get_attr("padding"),
-                                    data_format=op.get_attr("data_format")))
+  return (array_ops.zeros(
+      shape=array_ops.shape(op.inputs[0]),
+      dtype=op.inputs[0].dtype), array_ops.zeros(
+          shape=array_ops.shape(op.inputs[1]), dtype=op.inputs[1].dtype),
+          gen_nn_ops._max_pool_grad(
+              op.inputs[0],
+              op.inputs[1],
+              grad,
+              op.get_attr("ksize"),
+              op.get_attr("strides"),
+              padding=op.get_attr("padding"),
+              data_format=op.get_attr("data_format")))
 
 
 @ops.RegisterGradient("FractionalMaxPool")
diff --git a/tensorflow/python/ops/nn_ops.py b/tensorflow/python/ops/nn_ops.py
index b2ccec0a9d2..99d29a37193 100644
--- a/tensorflow/python/ops/nn_ops.py
+++ b/tensorflow/python/ops/nn_ops.py
@@ -39,6 +39,8 @@ from tensorflow.python.ops.gen_nn_ops import *
 # Aliases for some automatically-generated names.
 local_response_normalization = gen_nn_ops.lrn
 
+# pylint: disable=protected-access
+
 
 def _non_atrous_convolution(input, filter, padding, data_format=None,  # pylint: disable=redefined-builtin
                             strides=None, name=None):
diff --git a/tensorflow/python/ops/resource_variable_ops.py b/tensorflow/python/ops/resource_variable_ops.py
index e84fa21868f..86e0cae27ac 100644
--- a/tensorflow/python/ops/resource_variable_ops.py
+++ b/tensorflow/python/ops/resource_variable_ops.py
@@ -159,16 +159,15 @@ class ResourceVariable(object):
     with ops.control_dependencies(None):
       with ops.name_scope(name, "Variable", [] if init_from_fn else
                           [initial_value]) as name:
+        # pylint: disable=protected-access
+        true_name = ops._name_from_scope_name(name)
         if init_from_fn:
           # Use attr_scope and device(None) to simulate the behavior of
           # colocate_with when the variable we want to colocate with doesn't
           # yet exist.
-          # pylint: disable=protected-access
-          true_name = ops._name_from_scope_name(name)
           attr = attr_value_pb2.AttrValue(
               list=attr_value_pb2.AttrValue.ListValue(
                   s=[compat.as_bytes("loc:@%s" % true_name)]))
-          # pylint: disable=protected-access
           with ops.get_default_graph()._attr_scope({"_class": attr}):
             with ops.name_scope("Initializer"), ops.device(None):
               self._initial_value = ops.convert_to_tensor(
@@ -176,7 +175,8 @@ class ResourceVariable(object):
             self._handle = gen_resource_variable_ops.var_handle_op(
                 shape=self._initial_value.get_shape(),
                 dtype=self._initial_value.dtype.base_dtype,
-                shared_name=name, name=name)
+                shared_name=true_name, name=name)
+        # pylint: enable=protected-access
 
         # Or get the initial value from a Tensor or Python object.
         else:
@@ -185,7 +185,7 @@ class ResourceVariable(object):
           self._handle = gen_resource_variable_ops.var_handle_op(
               shape=self._initial_value.get_shape(),
               dtype=self._initial_value.dtype.base_dtype,
-              shared_name=name, name=name)
+              shared_name=true_name, name=name)
 
         self._dtype = self._initial_value.dtype.base_dtype
 
@@ -201,8 +201,16 @@ class ResourceVariable(object):
               self._handle, dtype=self._dtype)
           self._graph_element = value
           if caching_device is not None:
-            with ops.device(caching_device):
-              self._cached_value = array_ops.identity(value)
+            # Variables may be created in a tf.device() or ops.colocate_with()
+            # context. At the same time, users would expect caching device to be
+            # independent of this context, and/or would not expect the current
+            # device context to be merged with the caching device spec.
+            # Therefore we reset the colocation stack before creating the cached
+            # value. Note that resetting the colocation stack will also reset
+            # the device stack.
+            with ops.colocate_with(None, ignore_existing=True):
+              with ops.device(caching_device):
+                self._cached_value = array_ops.identity(value)
           else:
             self._cached_value = None
           ops.add_to_collections(collections, self)
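A minimal sketch of what the caching-device comment above describes, mirroring `testCachingDevice` earlier in this patch (the job names are just the ones used in that test):

```python
import tensorflow as tf
from tensorflow.python.ops import resource_variable_ops

# The variable is placed by the surrounding device scope, but reads go through
# a cached copy on the caching device; the cached read is neither placed on nor
# colocated with (no "_class" attr) the variable op.
with tf.device("/job:server/task:1"):
    v = resource_variable_ops.ResourceVariable(2.0, caching_device="/job:localhost")

print(v.value().device)  # "/job:localhost"
```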
@@ -422,6 +430,8 @@ def _dense_var_to_tensor(var, dtype=None, name=None, as_ref=False):
   if dtype is not None and dtype != var.value().dtype:
     print("trying to switch the dtype to ", dtype, " from ", var.value().dtype)
     return NotImplemented
+  if as_ref:
+    return var.read_value().op.inputs[0]
   return var.value()
 # pylint: enable=unused-argument,protected-access
 
diff --git a/tensorflow/python/platform/control_imports.py b/tensorflow/python/platform/control_imports.py
new file mode 100644
index 00000000000..b8e8e78ef3b
--- /dev/null
+++ b/tensorflow/python/platform/control_imports.py
@@ -0,0 +1,27 @@
+# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Switch between Google or open source dependencies."""
+# Switch between Google and OSS dependencies
+USE_OSS = True
+
+# Per-dependency switches determining whether each dependency is ready
+# to be replaced by its OSS equivalence.
+# TODO(danmane,mrry,opensource): Flip these switches, then remove them
+OSS_APP = True
+OSS_FLAGS = True
+OSS_GFILE = True
+OSS_GOOGLETEST = True
+OSS_LOGGING = True
+OSS_PARAMETERIZED = True
diff --git a/tensorflow/python/training/saver_test.py b/tensorflow/python/training/saver_test.py
index 3172a1d5ba9..1a980f93113 100644
--- a/tensorflow/python/training/saver_test.py
+++ b/tensorflow/python/training/saver_test.py
@@ -156,6 +156,18 @@ class SaverTest(test.TestCase):
   def testResourceBasic(self):
     self.basicSaveRestore(resource_variable_ops.ResourceVariable)
 
+  def testResourceSaveRestoreCachingDevice(self):
+    save_path = os.path.join(self.get_temp_dir(), "resource_cache")
+    v = resource_variable_ops.ResourceVariable([1], caching_device="/cpu:0")
+    with self.test_session() as sess:
+      variables.global_variables_initializer().run()
+      save = saver_module.Saver()
+      save.save(sess, save_path)
+    with self.test_session() as sess:
+      save2 = saver_module.Saver()
+      save2.restore(sess, save_path)
+      self.assertEquals(v.eval(), [1])
+
   def testSaveCopyRestoreWithSaveRelativePaths(self):
     """Save, copy checkpoint dir and restore from copied dir.
 
diff --git a/tensorflow/stream_executor/cuda/cuda_dnn.cc b/tensorflow/stream_executor/cuda/cuda_dnn.cc
index eed0d43a3c1..b6d841f3653 100644
--- a/tensorflow/stream_executor/cuda/cuda_dnn.cc
+++ b/tensorflow/stream_executor/cuda/cuda_dnn.cc
@@ -1202,7 +1202,8 @@ class CudnnRnnSequenceTensorDescriptor
     // Only the first one needs to be destroyed. All others are the same.
     cudnnStatus_t status =
         wrap::cudnnDestroyTensorDescriptor(parent_, handles_[0]);
-    CUDNN_RETURN_IF_FAIL(status, "Failed to destroy sequence tensor descriptor");
+    CUDNN_RETURN_IF_FAIL(status,
+                         "Failed to destroy sequence tensor descriptor");
   }
 
   const cudnnTensorDescriptor_t* handles() const {
@@ -3089,7 +3090,7 @@ bool CudnnSupport::DoPoolForward(
     DeviceMemory<double>* output_data) {
   mutex_lock lock{dnn_handle_mutex_};
   auto status = wrap::cudnnSetStream(parent_, ToHandle(dnn_handle_),
-                                        AsCUDAStreamValue(stream));
+                                     AsCUDAStreamValue(stream));
   if (status != CUDNN_STATUS_SUCCESS) {
     LOG(ERROR) << "failed to set stream for cudnn handle: " << ToString(status);
     return false;
@@ -3195,7 +3196,7 @@ bool CudnnSupport::DoPoolBackward(
     DeviceMemory<double>* output_diff_data) {
   mutex_lock lock{dnn_handle_mutex_};
   auto status = wrap::cudnnSetStream(parent_, ToHandle(dnn_handle_),
-                                        AsCUDAStreamValue(stream));
+                                     AsCUDAStreamValue(stream));
   if (status != CUDNN_STATUS_SUCCESS) {
     LOG(ERROR) << "failed to set stream for cudnn handle: " << ToString(status);
     return false;
diff --git a/tensorflow/stream_executor/dnn.h b/tensorflow/stream_executor/dnn.h
index 2cd4eda4119..c5805064f3c 100644
--- a/tensorflow/stream_executor/dnn.h
+++ b/tensorflow/stream_executor/dnn.h
@@ -1273,13 +1273,6 @@ class DnnSupport {
   // the input. The output width and height can be different.
   //
   // See PoolingDescriptor for how to configure the pooling operation.
-  virtual bool DoPoolForward(Stream* stream,
-                             const dnn::PoolingDescriptor& pooling_dimensions,
-                             const dnn::BatchDescriptor& input_dimensions,
-                             const DeviceMemory<double>& input_data,
-                             const dnn::BatchDescriptor& output_dimensions,
-                             DeviceMemory<double>* output_data) = 0;
-
   virtual bool DoPoolForward(Stream* stream,
                              const dnn::PoolingDescriptor& pooling_dimensions,
                              const dnn::BatchDescriptor& input_dimensions,
@@ -1287,12 +1280,25 @@ class DnnSupport {
                              const dnn::BatchDescriptor& output_dimensions,
                              DeviceMemory<float>* output_data) = 0;
 
+  virtual bool DoPoolForward(Stream* stream,
+                             const dnn::PoolingDescriptor& pooling_dimensions,
+                             const dnn::BatchDescriptor& input_dimensions,
+                             const DeviceMemory<double>& input_data,
+                             const dnn::BatchDescriptor& output_dimensions,
+                             DeviceMemory<double>* output_data) {
+    LOG(FATAL) << "DoPoolForward not implemented for double.";
+    return false;
+  }
+
   virtual bool DoPoolForward(Stream* stream,
                              const dnn::PoolingDescriptor& pooling_dimensions,
                              const dnn::BatchDescriptor& input_dimensions,
                              const DeviceMemory<Eigen::half>& input_data,
                              const dnn::BatchDescriptor& output_dimensions,
-                             DeviceMemory<Eigen::half>* output_data) = 0;
+                             DeviceMemory<Eigen::half>* output_data) {
+    LOG(FATAL) << "DoPoolForward not implemented for float16.";
+    return false;
+  }
 
   // Performs differentiation of the pooling operation.
   virtual bool DoPoolBackward(Stream* stream,
@@ -1302,7 +1308,10 @@ class DnnSupport {
                               const dnn::BatchDescriptor& output_dimensions,
                               const DeviceMemory<double>& output_data,
                               const DeviceMemory<double>& input_diff_data,
-                              DeviceMemory<double>* output_diff_data) = 0;
+                              DeviceMemory<double>* output_diff_data) {
+    LOG(FATAL) << "DoPoolBackward not implemented.";
+    return false;
+  }
 
   virtual bool DoPoolBackward(Stream* stream,
                               const dnn::PoolingDescriptor& pooling_dimensions,
@@ -1311,7 +1320,10 @@ class DnnSupport {
                               const dnn::BatchDescriptor& output_dimensions,
                               const DeviceMemory<float>& output_data,
                               const DeviceMemory<float>& input_diff_data,
-                              DeviceMemory<float>* output_diff_data) = 0;
+                              DeviceMemory<float>* output_diff_data) {
+    LOG(FATAL) << "DoPoolBackward not implemented.";
+    return false;
+  }
 
   virtual bool DoPoolBackward(Stream* stream,
                               const dnn::PoolingDescriptor& pooling_dimensions,
@@ -1320,7 +1332,10 @@ class DnnSupport {
                               const dnn::BatchDescriptor& output_dimensions,
                               const DeviceMemory<Eigen::half>& output_data,
                               const DeviceMemory<Eigen::half>& input_diff_data,
-                              DeviceMemory<Eigen::half>* output_diff_data) = 0;
+                              DeviceMemory<Eigen::half>* output_diff_data) {
+    LOG(FATAL) << "DoPoolBackward not implemented.";
+    return false;
+  }
 
   // Applies local response normalization to the values from
   // input_data and writes the result to output_data. See comments on
@@ -1900,4 +1915,3 @@ class DnnSupport {
 }  // namespace perftools
 
 #endif  // TENSORFLOW_STREAM_EXECUTOR_DNN_H_
-
diff --git a/tensorflow/tensorboard/TAG b/tensorflow/tensorboard/TAG
index 82cced27d7b..0691f67b202 100644
--- a/tensorflow/tensorboard/TAG
+++ b/tensorflow/tensorboard/TAG
@@ -1 +1 @@
-51
+52
diff --git a/tensorflow/tensorboard/backend/BUILD b/tensorflow/tensorboard/backend/BUILD
index b99e6c56559..4e1db853744 100644
--- a/tensorflow/tensorboard/backend/BUILD
+++ b/tensorflow/tensorboard/backend/BUILD
@@ -90,6 +90,7 @@ py_test(
         "//tensorflow/python:training",
         "//tensorflow/tensorboard",
         "//tensorflow/tensorboard/backend/event_processing:event_multiplexer",
+        "//tensorflow/tensorboard/plugins:base_plugin",
         "@org_pocoo_werkzeug//:werkzeug",
     ],
 )
diff --git a/tensorflow/tensorboard/backend/application.py b/tensorflow/tensorboard/backend/application.py
index e812880bbda..005d1830390 100644
--- a/tensorflow/tensorboard/backend/application.py
+++ b/tensorflow/tensorboard/backend/application.py
@@ -102,11 +102,11 @@ def standard_tensorboard_wsgi(logdir, purge_orphaned_data, reload_interval):
       size_guidance=DEFAULT_SIZE_GUIDANCE,
       purge_orphaned_data=purge_orphaned_data)
 
-  plugins = {
-      debugger_plugin.PLUGIN_PREFIX_ROUTE: debugger_plugin.DebuggerPlugin(),
-      projector_plugin.PLUGIN_PREFIX_ROUTE: projector_plugin.ProjectorPlugin(),
-      text_plugin.PLUGIN_PREFIX_ROUTE: text_plugin.TextPlugin(),
-  }
+  plugins = [
+      debugger_plugin.DebuggerPlugin(),
+      projector_plugin.ProjectorPlugin(),
+      text_plugin.TextPlugin(),
+  ]
 
   return TensorBoardWSGIApp(logdir, plugins, multiplexer, reload_interval)
 
@@ -128,12 +128,16 @@ class TensorBoardWSGIApp(object):
       logdir: the logdir spec that describes where data will be loaded.
        may be a directory, a comma-separated list of directories, or colons
        can be used to provide named directories
-      plugins: Map from plugin name to plugin application
+      plugins: List of plugins that extend tensorboard.plugins.BasePlugin
       multiplexer: The EventMultiplexer with TensorBoard data to serve
       reload_interval: How often (in seconds) to reload the Multiplexer
 
     Returns:
       A WSGI application that implements the TensorBoard backend.
+
+    Raises:
+      ValueError: If some plugin has no plugin_name.
+      ValueError: If two plugins have the same plugin_name.
     """
     self._logdir = logdir
     self._plugins = plugins
@@ -177,15 +181,22 @@ class TensorBoardWSGIApp(object):
     # Serve the routes from the registered plugins using their name as the route
     # prefix. For example if plugin z has two routes /a and /b, they will be
     # served as /data/plugin/z/a and /data/plugin/z/b.
-    for name in self._plugins:
+    plugin_names_encountered = set()
+    for plugin in self._plugins:
+      if plugin.plugin_name is None:
+        raise ValueError('Plugin %s has no plugin_name' % plugin)
+      if plugin.plugin_name in plugin_names_encountered:
+        raise ValueError('Duplicate plugins for name %s' % plugin.plugin_name)
+      plugin_names_encountered.add(plugin.plugin_name)
+
       try:
-        plugin = self._plugins[name]
         plugin_apps = plugin.get_plugin_apps(self._multiplexer, self._logdir)
       except Exception as e:  # pylint: disable=broad-except
-        logging.warning('Plugin %s failed. Exception: %s', name, str(e))
+        logging.warning('Plugin %s failed. Exception: %s', plugin.plugin_name,
+                        str(e))
         continue
       for route, app in plugin_apps.items():
-        path = DATA_PREFIX + PLUGIN_PREFIX + '/' + name + route
+        path = DATA_PREFIX + PLUGIN_PREFIX + '/' + plugin.plugin_name + route
         self.data_applications[path] = app
 
   # We use underscore_names for consistency with inherited methods.
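
A quick illustration of the new route layout (a sketch, not part of the patch): with plugins passed as a list, each handler is mounted under the plugin's own plugin_name rather than a dict key. Assuming DATA_PREFIX is '/data' and PLUGIN_PREFIX is '/plugin', as the comment above implies, a plugin named 'text' that exposes a '/runs' route ends up at /data/plugin/text/runs:

    # Sketch only: mirrors the path construction in TensorBoardWSGIApp above.
    DATA_PREFIX = '/data'      # assumed value of application.DATA_PREFIX
    PLUGIN_PREFIX = '/plugin'  # assumed value of application.PLUGIN_PREFIX

    plugin_name = 'text'       # e.g. TextPlugin.plugin_name
    plugin_apps = {'/runs': lambda environ, start_response: []}  # stand-in app

    for route, app in plugin_apps.items():
      path = DATA_PREFIX + PLUGIN_PREFIX + '/' + plugin_name + route
      print(path)  # -> /data/plugin/text/runs
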
diff --git a/tensorflow/tensorboard/backend/application_test.py b/tensorflow/tensorboard/backend/application_test.py
index 234c541803e..454ba63e752 100644
--- a/tensorflow/tensorboard/backend/application_test.py
+++ b/tensorflow/tensorboard/backend/application_test.py
@@ -48,6 +48,7 @@ from tensorflow.python.summary.writer import writer as writer_lib
 from tensorflow.tensorboard import tensorboard
 from tensorflow.tensorboard.backend import application
 from tensorflow.tensorboard.backend.event_processing import event_multiplexer
+from tensorflow.tensorboard.plugins import base_plugin
 
 
 class TensorboardServerTest(test.TestCase):
@@ -61,7 +62,7 @@ class TensorboardServerTest(test.TestCase):
     multiplexer = event_multiplexer.EventMultiplexer(
         size_guidance=application.DEFAULT_SIZE_GUIDANCE,
         purge_orphaned_data=True)
-    plugins = {}
+    plugins = []
     app = application.TensorBoardWSGIApp(
         self.temp_dir, plugins, multiplexer, reload_interval=0)
     try:
@@ -480,5 +481,34 @@ class TensorboardSimpleServerConstructionTest(test.TestCase):
     self.assertTrue(one_passed)  # We expect either IPv4 or IPv6 to be supported
 
 
+class TensorBoardApplicationConstructionTest(test.TestCase):
+
+  def testExceptions(self):
+
+    class UnnamedPlugin(base_plugin.TBPlugin):
+
+      def get_plugin_apps(self):
+        pass
+
+    class MockPlugin(UnnamedPlugin):
+      plugin_name = 'mock'
+
+    class OtherMockPlugin(UnnamedPlugin):
+      plugin_name = 'mock'
+
+    logdir = '/fake/foo'
+    multiplexer = event_multiplexer.EventMultiplexer()
+
+    # Fails if there is an unnamed plugin
+    with self.assertRaises(ValueError):
+      plugins = [UnnamedPlugin()]
+      application.TensorBoardWSGIApp(logdir, plugins, multiplexer, 0)
+
+    # Fails if there are two plugins with the same name
+    with self.assertRaises(ValueError):
+      plugins = [MockPlugin(), OtherMockPlugin()]
+      application.TensorBoardWSGIApp(logdir, plugins, multiplexer, 0)
+
+
 if __name__ == '__main__':
   test.main()
diff --git a/tensorflow/tensorboard/components/tf_dashboard_common/tf-multi-checkbox.html b/tensorflow/tensorboard/components/tf_dashboard_common/tf-multi-checkbox.html
index bc15312fc3a..ac407844f0b 100644
--- a/tensorflow/tensorboard/components/tf_dashboard_common/tf-multi-checkbox.html
+++ b/tensorflow/tensorboard/components/tf_dashboard_common/tf-multi-checkbox.html
@@ -194,10 +194,11 @@ handle these situations gracefully.
         type: Object,
         observer: "synchronizeColors",
       }, // map from run name to css class
-      numRunsEnabledByDefault: {
-        // When TB first loads, first k runs are enabled, rest are disabled.
+      maxRunsToEnableByDefault: {
+        // When TB first loads, if there are this many or fewer runs, they are
+        // all enabled by default. If there are more, they are all disabled.
         type: Number,
-        value: 10,
+        value: 40,
       },
       _debouncedRegexChange: {
         type: Function,
@@ -230,8 +231,24 @@ handle these situations gracefully.
     },
     observers: [
       "_setIsolatorIcon(runSelectionState, names)",
-      "_storeRunToIsCheckedMapping(runSelectionState)",
+      "_storeRunToIsCheckedMappingWithDefault(runSelectionState, namesMatchingRegex)",
     ],
+    _storeRunToIsCheckedMappingWithDefault: function() {
+      var runSelectionStateIsDefault = Object.keys(this.runSelectionState).length == 0;
+      if (runSelectionStateIsDefault || this.namesMatchingRegex == null) {
+        return;
+      }
+      var _this = this;
+      var allToggledOn = this.namesMatchingRegex
+              .every(function(n) {return _this.runSelectionState[n]});
+      var allToggledOff = this.namesMatchingRegex
+              .every(function(n) {return !_this.runSelectionState[n]});
+      var defaultOff = this.namesMatchingRegex.length > this.maxRunsToEnableByDefault;
+      if ((defaultOff && allToggledOff) || (!defaultOff && allToggledOn)) {
+        this.runSelectionState = {};
+      }
+      this._storeRunToIsCheckedMapping(this.runSelectionState);
+    },
     _storeRunToIsCheckedMapping: TF.URIStorage.getObjectObserver('runSelectionState', {}),
     _makeRegex: function(regex) {
       try {
@@ -261,9 +278,10 @@ handle these situations gracefully.
     },
     computeOutSelected: function(__, ___) {
       var runSelectionState = this.runSelectionState;
-      var num = this.numRunsEnabledByDefault;
+      var num = this.maxRunsToEnableByDefault;
+      var allEnabled = this.namesMatchingRegex.length <= num;
       return this.namesMatchingRegex.filter(function(n, i) {
-        return runSelectionState[n] == null ? i<num : runSelectionState[n];
+        return runSelectionState[n] == null ? allEnabled : runSelectionState[n];
       });
     },
     synchronizeColors: function(e) {
@@ -313,18 +331,24 @@ handle these situations gracefully.
     _regexInputObserver: TF.URIStorage.getStringObserver("regexInput", ""),
     toggleAll: function() {
       var _this = this;
-      var allToggledOn = this.namesMatchingRegex
-                    .every(function(n) {return _this.runSelectionState[n]});
+      var anyToggledOn = this.namesMatchingRegex
+                    .some(function(n) {return _this.runSelectionState[n]});
+
 
       var runSelectionStateIsDefault = Object.keys(this.runSelectionState).length == 0;
 
-      var numRuns = this.namesMatchingRegex.length;
+      var defaultOff = this.namesMatchingRegex.length > this.maxRunsToEnableByDefault;
+      // Runs count as toggled on either if some were explicitly toggled on,
+      // or if we are in the default state and there are few enough runs that
+      // they all default to being toggled on.
+      anyToggledOn = anyToggledOn || runSelectionStateIsDefault && !defaultOff;
 
-      var shouldDisable = allToggledOn || runSelectionStateIsDefault;
+      // If any runs are toggled on, turn everything off; if none are toggled
+      // on, turn everything on.
 
       var newRunsDisabled = {};
       this.names.forEach(function(n) {
-        newRunsDisabled[n] = !shouldDisable;
+        newRunsDisabled[n] = !anyToggledOn;
       })
       this.runSelectionState = newRunsDisabled;
     },
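
The selection behavior above can be summarized outside the Polymer code. The sketch below is Python and purely illustrative (the helper names are hypothetical); it captures the two rules: all runs start enabled only when there are 40 or fewer of them, and toggle-all turns everything off whenever anything is on, or defaults to on, and otherwise turns everything on:

    MAX_RUNS_TO_ENABLE_BY_DEFAULT = 40  # matches the new property value above

    def default_enabled(num_runs):
      # All runs are enabled by default only at or below the threshold.
      return num_runs <= MAX_RUNS_TO_ENABLE_BY_DEFAULT

    def toggle_all(run_selection_state, names):
      # Mirrors toggleAll: an empty selection state means "default".
      is_default = not run_selection_state
      any_on = any(run_selection_state.get(n) for n in names)
      any_on = any_on or (is_default and default_enabled(len(names)))
      return {n: not any_on for n in names}
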
diff --git a/tensorflow/tensorboard/components/vz_line_chart/vz-line-chart.ts b/tensorflow/tensorboard/components/vz_line_chart/vz-line-chart.ts
index 59da03d455f..4bc6a8a837a 100644
--- a/tensorflow/tensorboard/components/vz_line_chart/vz-line-chart.ts
+++ b/tensorflow/tensorboard/components/vz_line_chart/vz-line-chart.ts
@@ -453,8 +453,8 @@ module VZ {
     }
 
     private resmoothDataset(dataset: Plottable.Dataset) {
-      var data = dataset.data();
-      var smoothingWeight = this.smoothingWeight;
+      let data = dataset.data();
+      const smoothingWeight = this.smoothingWeight;
       let last = data.length > 0 ? data[0].scalar : NaN;
       data.forEach((d) => {
         if (!_.isFinite(last)) {
@@ -462,8 +462,8 @@ module VZ {
         } else {
           // 1st-order IIR low-pass filter to attenuate the higher-
           // frequency components of the time-series.
-          d.smoothed = last * smoothingWeight +
-                       (1 - smoothingWeight) * d.scalar;
+          d.smoothed =
+              last * smoothingWeight + (1 - smoothingWeight) * d.scalar;
         }
         last = d.smoothed;
       });
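
The smoothing applied in resmoothDataset is a first-order IIR (exponential moving average) filter. A small Python sketch of the same recurrence, for illustration only, with one worked example in the comment:

    # With smoothing_weight = 0.6 and scalars [0.0, 1.0, 1.0], this returns
    # [0.0, 0.4, 0.64]: each output is last * w + (1 - w) * scalar.
    import math

    def smooth(scalars, smoothing_weight):
      smoothed = []
      last = scalars[0] if scalars else float('nan')
      for x in scalars:
        if math.isnan(last):
          s = x
        else:
          s = last * smoothing_weight + (1 - smoothing_weight) * x
        smoothed.append(s)
        last = s
      return smoothed
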
diff --git a/tensorflow/tensorboard/plugins/base_plugin.py b/tensorflow/tensorboard/plugins/base_plugin.py
index 86cfeb6cc24..8b1560cf8a1 100644
--- a/tensorflow/tensorboard/plugins/base_plugin.py
+++ b/tensorflow/tensorboard/plugins/base_plugin.py
@@ -30,6 +30,12 @@ class TBPlugin(object):
   """TensorBoard plugin interface. Every plugin must extend from this class."""
   __metaclass__ = ABCMeta
 
+  # The plugin_name is also used as the prefix for the HTTP routes served by
+  # the plugin, e.g. `data/plugin/$PLUGIN_NAME/$HANDLER`.
+  # The plugin name must be unique for each registered plugin, or a
+  # ValueError will be raised when the application is constructed.
+  plugin_name = None
+
   @abstractmethod
   def get_plugin_apps(self, multiplexer, logdir):
     """Returns a set of WSGI applications that the plugin implements.
diff --git a/tensorflow/tensorboard/plugins/debugger/debugger_plugin.py b/tensorflow/tensorboard/plugins/debugger/debugger_plugin.py
index 43902efe24e..cfa8f681872 100644
--- a/tensorflow/tensorboard/plugins/debugger/debugger_plugin.py
+++ b/tensorflow/tensorboard/plugins/debugger/debugger_plugin.py
@@ -34,7 +34,7 @@ from tensorflow.tensorboard.backend.event_processing import event_file_loader
 from tensorflow.tensorboard.plugins import base_plugin
 
 # The prefix of routes provided by this plugin.
-PLUGIN_PREFIX_ROUTE = 'debugger'
+_PLUGIN_PREFIX_ROUTE = 'debugger'
 
 # HTTP routes.
 _HEALTH_PILLS_ROUTE = '/health_pills'
@@ -63,6 +63,8 @@ class DebuggerPlugin(base_plugin.TBPlugin):
   values.
   """
 
+  plugin_name = _PLUGIN_PREFIX_ROUTE
+
   def get_plugin_apps(self, multiplexer, logdir):
     """Obtains a mapping between routes and handlers. Stores the logdir.
 
diff --git a/tensorflow/tensorboard/plugins/debugger/debugger_plugin_test.py b/tensorflow/tensorboard/plugins/debugger/debugger_plugin_test.py
index 9e71e2713d1..2c9135fd277 100644
--- a/tensorflow/tensorboard/plugins/debugger/debugger_plugin_test.py
+++ b/tensorflow/tensorboard/plugins/debugger/debugger_plugin_test.py
@@ -96,7 +96,7 @@ class DebuggerPluginTest(test.TestCase):
     })
     self.plugin = debugger_plugin.DebuggerPlugin()
     wsgi_app = application.TensorBoardWSGIApp(
-        self.log_dir, {'debugger': self.plugin},
+        self.log_dir, [self.plugin],
         self.multiplexer,
         reload_interval=0)
     self.server = werkzeug_test.Client(wsgi_app, wrappers.BaseResponse)
diff --git a/tensorflow/tensorboard/plugins/projector/projector_plugin.py b/tensorflow/tensorboard/plugins/projector/projector_plugin.py
index 439e9198987..32ebb78e42d 100644
--- a/tensorflow/tensorboard/plugins/projector/projector_plugin.py
+++ b/tensorflow/tensorboard/plugins/projector/projector_plugin.py
@@ -42,7 +42,7 @@ from tensorflow.tensorboard.plugins.base_plugin import TBPlugin
 from tensorflow.tensorboard.plugins.projector import projector_config_pb2
 
 # The prefix of routes provided by this plugin.
-PLUGIN_PREFIX_ROUTE = 'projector'
+_PLUGIN_PREFIX_ROUTE = 'projector'
 
 PROJECTOR_FILENAME = 'projector_config.pbtxt'
 
@@ -305,6 +305,8 @@ def _parse_positive_int_param(request, param_name):
 class ProjectorPlugin(TBPlugin):
   """Embedding projector."""
 
+  plugin_name = _PLUGIN_PREFIX_ROUTE
+
   def __init__(self):
     self._handlers = None
     self.readers = {}
diff --git a/tensorflow/tensorboard/plugins/projector/projector_plugin_test.py b/tensorflow/tensorboard/plugins/projector/projector_plugin_test.py
index 750cb1472a0..5679eff4a35 100644
--- a/tensorflow/tensorboard/plugins/projector/projector_plugin_test.py
+++ b/tensorflow/tensorboard/plugins/projector/projector_plugin_test.py
@@ -104,10 +104,8 @@ class ProjectorAppTest(test.TestCase):
         size_guidance=application.DEFAULT_SIZE_GUIDANCE,
         purge_orphaned_data=True)
     plugin = projector_plugin.ProjectorPlugin()
-    plugin.get_plugin_apps(multiplexer, self.log_dir)
-    plugins = {'projector': plugin}
     wsgi_app = application.TensorBoardWSGIApp(
-        self.log_dir, plugins, multiplexer, reload_interval=0)
+        self.log_dir, [plugin], multiplexer, reload_interval=0)
     self.server = werkzeug_test.Client(wsgi_app, wrappers.BaseResponse)
 
   def _Get(self, path):
diff --git a/tensorflow/tensorboard/plugins/text/text_plugin.py b/tensorflow/tensorboard/plugins/text/text_plugin.py
index 2c0620f5649..b337ce2ad03 100644
--- a/tensorflow/tensorboard/plugins/text/text_plugin.py
+++ b/tensorflow/tensorboard/plugins/text/text_plugin.py
@@ -39,7 +39,7 @@ from tensorflow.tensorboard.backend import http_util
 from tensorflow.tensorboard.plugins import base_plugin
 
 # The prefix of routes provided by this plugin.
-PLUGIN_PREFIX_ROUTE = 'text'
+_PLUGIN_PREFIX_ROUTE = 'text'
 
 # HTTP routes
 RUNS_ROUTE = '/runs'
@@ -251,6 +251,8 @@ def process_string_tensor_event(event):
 class TextPlugin(base_plugin.TBPlugin):
   """Text Plugin for TensorBoard."""
 
+  plugin_name = _PLUGIN_PREFIX_ROUTE
+
   def index_impl(self):
     run_to_series = {}
     name = text_summary.TextSummaryPluginAsset.plugin_name
diff --git a/tensorflow/tensorflow.bzl b/tensorflow/tensorflow.bzl
index c8254f0062b..156f7b13bd0 100644
--- a/tensorflow/tensorflow.bzl
+++ b/tensorflow/tensorflow.bzl
@@ -1,42 +1,45 @@
 # -*- Python -*-
 
+
 # Given a source file, generate a test name.
 # i.e. "common_runtime/direct_session_test.cc" becomes
 #      "common_runtime_direct_session_test"
 def src_to_test_name(src):
   return src.replace("/", "_").split(".")[0]
 
+
 # Return the options to use for a C++ library or binary build.
 # Uses the ":optmode" config_setting to pick the options.
 load(
     "//tensorflow/core:platform/default/build_config_root.bzl",
     "tf_cuda_tests_tags",
     "tf_sycl_tests_tags",
-    "tf_additional_xla_deps_py",
-)
-load(
-    "@local_config_cuda//cuda:build_defs.bzl",
-    "if_cuda",
-    "cuda_default_copts"
-)
+    "tf_additional_xla_deps_py",)
+load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda", "cuda_default_copts")
 
 load(
     "//third_party/mkl:build_defs.bzl",
-    "if_mkl",
-)
+    "if_mkl",)
+
 
 # List of proto files for android builds
 def tf_android_core_proto_sources(core_proto_sources_relative):
-  return [str(Label("//tensorflow/core:" + p))
-          for p in core_proto_sources_relative]
+  return [
+      "//tensorflow/core:" + p for p in core_proto_sources_relative
+  ]
+
 
 # Returns the list of pb.h and proto.h headers that are generated for
 # tf_android_core_proto_sources().
 def tf_android_core_proto_headers(core_proto_sources_relative):
-  return ([str(Label("//tensorflow/core/" + p.replace(".proto", ".pb.h")))
-          for p in core_proto_sources_relative] +
-         [str(Label("//tensorflow/core/" + p.replace(".proto", ".proto.h")))
-          for p in core_proto_sources_relative])
+  return ([
+      "//tensorflow/core/" + p.replace(".proto", ".pb.h")
+      for p in core_proto_sources_relative
+  ] + [
+      "//tensorflow/core/" + p.replace(".proto", ".proto.h")
+      for p in core_proto_sources_relative
+  ])
+
 
 def if_android_x86(a):
   return select({
@@ -52,30 +55,35 @@ def if_android_arm(a):
       "//conditions:default": [],
   })
 
+
 def if_android_arm64(a):
   return select({
       str(Label("//tensorflow:android_arm64")): a,
       "//conditions:default": [],
   })
 
+
 def if_not_android(a):
   return select({
       str(Label("//tensorflow:android")): [],
       "//conditions:default": a,
   })
 
+
 def if_android(a):
   return select({
       str(Label("//tensorflow:android")): a,
       "//conditions:default": [],
   })
 
+
 def if_ios(a):
   return select({
       str(Label("//tensorflow:ios")): a,
       "//conditions:default": [],
   })
 
+
 def if_mobile(a):
   return select({
       str(Label("//tensorflow:android")): a,
@@ -83,6 +91,7 @@ def if_mobile(a):
       "//conditions:default": [],
   })
 
+
 def if_not_mobile(a):
   return select({
       str(Label("//tensorflow:android")): [],
@@ -90,12 +99,14 @@ def if_not_mobile(a):
       "//conditions:default": a,
   })
 
+
 def if_not_windows(a):
   return select({
       str(Label("//tensorflow:windows")): [],
       "//conditions:default": a,
   })
 
+
 def if_x86(a):
   return select({
       str(Label("//tensorflow:linux_x86_64")): a,
@@ -103,33 +114,34 @@ def if_x86(a):
       "//conditions:default": [],
   })
 
+
 # LINT.IfChange
 def tf_copts():
-  return (["-DEIGEN_AVOID_STL_ARRAY",
-           "-Iexternal/gemmlowp",
-           "-Wno-sign-compare",
-           "-fno-exceptions",] +
-          if_cuda(["-DGOOGLE_CUDA=1"]) +
-          if_mkl(["-DINTEL_MKL=1"]) +
-          if_android_arm(["-mfpu=neon"]) +
-          if_x86(["-msse3"]) +
-          select({
-              str(Label("//tensorflow:android")): [
-                  "-std=c++11",
-                  "-DTF_LEAN_BINARY",
-                  "-O2",
-              ],
-              str(Label("//tensorflow:darwin")): [],
-              str(Label("//tensorflow:windows")): [
-                "/DLANG_CXX11",
-                "/D__VERSION__=\\\"MSVC\\\"",
-                "/DPLATFORM_WINDOWS",
-                "/DTF_COMPILE_LIBRARY",
-                "/DEIGEN_HAS_C99_MATH",
-                "/DTENSORFLOW_USE_EIGEN_THREADPOOL",
-              ],
-              str(Label("//tensorflow:ios")): ["-std=c++11"],
-              "//conditions:default": ["-pthread"]}))
+  return ([
+      "-DEIGEN_AVOID_STL_ARRAY",
+      "-Iexternal/gemmlowp",
+      "-Wno-sign-compare",
+      "-fno-exceptions",
+  ] + if_cuda(["-DGOOGLE_CUDA=1"]) + if_mkl(["-DINTEL_MKL=1"]) + if_android_arm(
+      ["-mfpu=neon"]) + if_x86(["-msse3"]) + select({
+          "//tensorflow:android": [
+              "-std=c++11",
+              "-DTF_LEAN_BINARY",
+              "-O2",
+          ],
+          "//tensorflow:darwin": [],
+          "//tensorflow:windows": [
+              "/DLANG_CXX11",
+              "/D__VERSION__=\\\"MSVC\\\"",
+              "/DPLATFORM_WINDOWS",
+              "/DTF_COMPILE_LIBRARY",
+              "/DEIGEN_HAS_C99_MATH",
+              "/DTENSORFLOW_USE_EIGEN_THREADPOOL",
+          ],
+          "//tensorflow:ios": ["-std=c++11"],
+          "//conditions:default": ["-pthread"]
+      }))
+
 
 def tf_opts_nortti_if_android():
   return if_android([
@@ -137,8 +149,11 @@ def tf_opts_nortti_if_android():
       "-DGOOGLE_PROTOBUF_NO_RTTI",
       "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
   ]) + if_android_x86(["-msse4.1"])
+
+
 # LINT.ThenChange(//tensorflow/contrib/android/cmake/CMakeLists.txt)
 
+
 # Given a list of "op_lib_names" (a list of files in the ops directory
 # without their .cc extensions), generate a library for that file.
 def tf_gen_op_libs(op_lib_names, deps=None):
@@ -147,16 +162,20 @@ def tf_gen_op_libs(op_lib_names, deps=None):
   if not deps:
     deps = []
   for n in op_lib_names:
-    native.cc_library(name=n + "_op_lib",
-                      copts=tf_copts(),
-                      srcs=["ops/" + n + ".cc"],
-                      deps=deps + [str(Label("//tensorflow/core:framework"))],
-                      visibility=["//visibility:public"],
-                      alwayslink=1,
-                      linkstatic=1,)
+    native.cc_library(
+        name=n + "_op_lib",
+        copts=tf_copts(),
+        srcs=["ops/" + n + ".cc"],
+        deps=deps + ["//tensorflow/core:framework"],
+        visibility=["//visibility:public"],
+        alwayslink=1,
+        linkstatic=1,)
 
-def tf_gen_op_wrapper_cc(name, out_ops_file, pkg="",
-                         op_gen=str(Label("//tensorflow/cc:cc_op_gen_main")),
+
+def tf_gen_op_wrapper_cc(name,
+                         out_ops_file,
+                         pkg="",
+                         op_gen="//tensorflow/cc:cc_op_gen_main",
                          deps=None,
                          override_file=None,
                          include_internal_ops=0):
@@ -165,12 +184,11 @@ def tf_gen_op_wrapper_cc(name, out_ops_file, pkg="",
   if deps == None:
     deps = [pkg + ":" + name + "_op_lib"]
   native.cc_binary(
-      name = tool,
-      copts = tf_copts(),
-      linkopts = ["-lm"],
-      linkstatic = 1,   # Faster to link this one-time-use binary dynamically
-      deps = [op_gen] + deps
-  )
+      name=tool,
+      copts=tf_copts(),
+      linkopts=["-lm"],
+      linkstatic=1,  # Faster to link this one-time-use binary dynamically
+      deps=[op_gen] + deps)
 
   if override_file == None:
     srcs = []
@@ -180,14 +198,17 @@ def tf_gen_op_wrapper_cc(name, out_ops_file, pkg="",
     override_arg = "$(location " + override_file + ")"
   native.genrule(
       name=name + "_genrule",
-      outs=[out_ops_file + ".h", out_ops_file + ".cc",
-            out_ops_file + "_internal.h", out_ops_file + "_internal.cc"],
+      outs=[
+          out_ops_file + ".h", out_ops_file + ".cc",
+          out_ops_file + "_internal.h", out_ops_file + "_internal.cc"
+      ],
       srcs=srcs,
       tools=[":" + tool],
       cmd=("$(location :" + tool + ") $(location :" + out_ops_file + ".h) " +
            "$(location :" + out_ops_file + ".cc) " + override_arg + " " +
            str(include_internal_ops)))
 
+
 # Given a list of "op_lib_names" (a list of files in the ops directory
 # without their .cc extensions), generate individual C++ .cc and .h
 # files for each of the ops files mentioned, and then generate a
@@ -235,59 +256,72 @@ def tf_gen_op_wrappers_cc(name,
   internalhdrs = []
   for n in op_lib_names:
     tf_gen_op_wrapper_cc(
-        n, "ops/" + n, pkg=pkg, op_gen=op_gen, override_file=override_file,
+        n,
+        "ops/" + n,
+        pkg=pkg,
+        op_gen=op_gen,
+        override_file=override_file,
         include_internal_ops=include_internal_ops)
     subsrcs += ["ops/" + n + ".cc"]
     subhdrs += ["ops/" + n + ".h"]
     internalsrcs += ["ops/" + n + "_internal.cc"]
     internalhdrs += ["ops/" + n + "_internal.h"]
 
-  native.cc_library(name=name,
-                    srcs=subsrcs,
-                    hdrs=subhdrs,
-                    deps=deps + if_not_android([
-                        str(Label("//tensorflow/core:core_cpu")),
-                        str(Label("//tensorflow/core:framework")),
-                        str(Label("//tensorflow/core:lib")),
-                        str(Label("//tensorflow/core:protos_all_cc")),
-                    ]) + if_android([
-                        str(Label("//tensorflow/core:android_tensorflow_lib")),
-                    ]),
-                    copts=tf_copts(),
-                    alwayslink=1,
-                    visibility=visibility)
-  native.cc_library(name=name + "_internal",
-                    srcs=internalsrcs,
-                    hdrs=internalhdrs,
-                    deps=deps + if_not_android([
-                        str(Label("//tensorflow/core:core_cpu")),
-                        str(Label("//tensorflow/core:framework")),
-                        str(Label("//tensorflow/core:lib")),
-                        str(Label("//tensorflow/core:protos_all_cc")),
-                    ]) + if_android([
-                        str(Label("//tensorflow/core:android_tensorflow_lib")),
-                    ]),
-                    copts=tf_copts(),
-                    alwayslink=1,
-                    visibility=[str(Label("//tensorflow:internal"))])
+  native.cc_library(
+      name=name,
+      srcs=subsrcs,
+      hdrs=subhdrs,
+      deps=deps + if_not_android([
+          "//tensorflow/core:core_cpu",
+          "//tensorflow/core:framework",
+          "//tensorflow/core:lib",
+          "//tensorflow/core:protos_all_cc",
+      ]) + if_android([
+          "//tensorflow/core:android_tensorflow_lib",
+      ]),
+      copts=tf_copts(),
+      alwayslink=1,
+      visibility=visibility)
+  native.cc_library(
+      name=name + "_internal",
+      srcs=internalsrcs,
+      hdrs=internalhdrs,
+      deps=deps + if_not_android([
+          "//tensorflow/core:core_cpu",
+          "//tensorflow/core:framework",
+          "//tensorflow/core:lib",
+          "//tensorflow/core:protos_all_cc",
+      ]) + if_android([
+          "//tensorflow/core:android_tensorflow_lib",
+      ]),
+      copts=tf_copts(),
+      alwayslink=1,
+      visibility=["//tensorflow:internal"])
+
 
 # Invoke this rule in .../tensorflow/python to build the wrapper library.
-def tf_gen_op_wrapper_py(name, out=None, hidden=None, visibility=None, deps=[],
-                         require_shape_functions=False, hidden_file=None,
+def tf_gen_op_wrapper_py(name,
+                         out=None,
+                         hidden=None,
+                         visibility=None,
+                         deps=[],
+                         require_shape_functions=False,
+                         hidden_file=None,
                          generated_target_name=None):
   # Construct a cc_binary containing the specified ops.
   tool_name = "gen_" + name + "_py_wrappers_cc"
   if not deps:
     deps = [str(Label("//tensorflow/core:" + name + "_op_lib"))]
   native.cc_binary(
-      name = tool_name,
-      linkopts = ["-lm"],
-      copts = tf_copts(),
-      linkstatic = 1,   # Faster to link this one-time-use binary dynamically
-      deps = ([str(Label("//tensorflow/core:framework")),
-               str(Label("//tensorflow/python:python_op_gen_main"))] + deps),
-      visibility = [str(Label("//tensorflow:internal"))],
-  )
+      name=tool_name,
+      linkopts=["-lm"],
+      copts=tf_copts(),
+      linkstatic=1,  # Faster to link this one-time-use binary dynamically
+      deps=([
+          "//tensorflow/core:framework",
+          "//tensorflow/python:python_op_gen_main"
+      ] + deps),
+      visibility=["//tensorflow:internal"],)
 
   # Invoke the previous cc_binary to generate a python file.
   if not out:
@@ -299,8 +333,8 @@ def tf_gen_op_wrapper_py(name, out=None, hidden=None, visibility=None, deps=[],
         name=name + "_pygenrule",
         outs=[out],
         tools=[tool_name],
-        cmd=("$(location " + tool_name + ") " + ",".join(hidden)
-             + " " + ("1" if require_shape_functions else "0") + " > $@"))
+        cmd=("$(location " + tool_name + ") " + ",".join(hidden) + " " +
+             ("1" if require_shape_functions else "0") + " > $@"))
   elif hidden_file:
     # `hidden_file` is file containing a list of op names to be hidden in the
     # generated module.
@@ -309,77 +343,120 @@ def tf_gen_op_wrapper_py(name, out=None, hidden=None, visibility=None, deps=[],
         outs=[out],
         srcs=[hidden_file],
         tools=[tool_name],
-        cmd=("$(location " + tool_name + ") @$(location "
-             + hidden_file + ") " + ("1" if require_shape_functions else "0")
-             + " > $@"))
+        cmd=("$(location " + tool_name + ") @$(location " + hidden_file + ") " +
+             ("1" if require_shape_functions else "0") + " > $@"))
   else:
     # No ops should be hidden in the generated module.
     native.genrule(
         name=name + "_pygenrule",
         outs=[out],
         tools=[tool_name],
-        cmd=("$(location " + tool_name + ") "
-             + ("1" if require_shape_functions else "0") + " > $@"))
+        cmd=("$(location " + tool_name + ") " +
+             ("1" if require_shape_functions else "0") + " > $@"))
 
   # Make a py_library out of the generated python file.
   if not generated_target_name:
     generated_target_name = name
-  native.py_library(name=generated_target_name,
-                    srcs=[out],
-                    srcs_version="PY2AND3",
-                    visibility=visibility,
-                    deps=[
-                        str(Label("//tensorflow/python:framework_for_generated_wrappers_v2")),
-                    ],)
+  native.py_library(
+      name=generated_target_name,
+      srcs=[out],
+      srcs_version="PY2AND3",
+      visibility=visibility,
+      deps=[
+          "//tensorflow/python:framework_for_generated_wrappers_v2",
+      ],)
+
 
 # Define a bazel macro that creates cc_test for tensorflow.
 # TODO(opensource): we need to enable this to work around the hidden symbol
 # __cudaRegisterFatBinary error. Need more investigations.
-def tf_cc_test(name, srcs, deps, linkstatic=0, tags=[], data=[], size="medium",
-               suffix="", args=None, linkopts=[]):
-  native.cc_test(name="%s%s" % (name, suffix),
-                 srcs=srcs,
-                 size=size,
-                 args=args,
-                 copts=tf_copts(),
-                 data=data,
-                 deps=deps,
-                 linkopts=["-lpthread", "-lm"] + linkopts,
-                 linkstatic=linkstatic,
-                 tags=tags)
+def tf_cc_test(name,
+               srcs,
+               deps,
+               linkstatic=0,
+               tags=[],
+               data=[],
+               size="medium",
+               suffix="",
+               args=None,
+               linkopts=[]):
+  native.cc_test(
+      name="%s%s" % (name, suffix),
+      srcs=srcs,
+      size=size,
+      args=args,
+      copts=tf_copts(),
+      data=data,
+      deps=deps,
+      linkopts=["-lpthread", "-lm"] + linkopts,
+      linkstatic=linkstatic,
+      tags=tags)
+
 
 # Part of the testing workflow requires a distinguishable name for the build
 # rules that involve a GPU, even if otherwise identical to the base rule.
-def tf_cc_test_gpu(name, srcs, deps, linkstatic=0, tags=[], data=[],
-                   size="medium", suffix="", args=None):
-  tf_cc_test(name, srcs, deps, linkstatic=linkstatic, tags=tags, data=data,
-             size=size, suffix=suffix, args=args)
+def tf_cc_test_gpu(name,
+                   srcs,
+                   deps,
+                   linkstatic=0,
+                   tags=[],
+                   data=[],
+                   size="medium",
+                   suffix="",
+                   args=None):
+  tf_cc_test(
+      name,
+      srcs,
+      deps,
+      linkstatic=linkstatic,
+      tags=tags,
+      data=data,
+      size=size,
+      suffix=suffix,
+      args=args)
+
+
+def tf_cuda_cc_test(name,
+                    srcs=[],
+                    deps=[],
+                    tags=[],
+                    data=[],
+                    size="medium",
+                    linkstatic=0,
+                    args=[],
+                    linkopts=[]):
+  tf_cc_test(
+      name=name,
+      srcs=srcs,
+      deps=deps,
+      tags=tags + ["manual"],
+      data=data,
+      size=size,
+      linkstatic=linkstatic,
+      linkopts=linkopts,
+      args=args)
+  tf_cc_test(
+      name=name,
+      srcs=srcs,
+      suffix="_gpu",
+      deps=deps + if_cuda(["//tensorflow/core:gpu_runtime"]),
+      linkstatic=if_cuda(1, 0),
+      tags=tags + tf_cuda_tests_tags(),
+      data=data,
+      size=size,
+      linkopts=linkopts,
+      args=args)
 
-def tf_cuda_cc_test(name, srcs=[], deps=[], tags=[], data=[], size="medium",
-                    linkstatic=0, args=[], linkopts=[]):
-  tf_cc_test(name=name,
-             srcs=srcs,
-             deps=deps,
-             tags=tags + ["manual"],
-             data=data,
-             size=size,
-             linkstatic=linkstatic,
-             linkopts=linkopts,
-             args=args)
-  tf_cc_test(name=name,
-             srcs=srcs,
-             suffix="_gpu",
-             deps=deps + if_cuda([str(Label("//tensorflow/core:gpu_runtime"))]),
-             linkstatic=if_cuda(1, 0),
-             tags=tags + tf_cuda_tests_tags(),
-             data=data,
-             size=size,
-             linkopts=linkopts,
-             args=args)
 
 # Create a cc_test for each of the tensorflow tests listed in "tests"
-def tf_cc_tests(srcs, deps, name='', linkstatic=0, tags=[], size="medium",
-                args=None, linkopts=[]):
+def tf_cc_tests(srcs,
+                deps,
+                name="",
+                linkstatic=0,
+                tags=[],
+                size="medium",
+                args=None,
+                linkopts=[]):
   for src in srcs:
     tf_cc_test(
         name=src_to_test_name(src),
@@ -391,17 +468,35 @@ def tf_cc_tests(srcs, deps, name='', linkstatic=0, tags=[], size="medium",
         args=args,
         linkopts=linkopts)
 
-def tf_cc_test_mkl(srcs, deps, name='', linkstatic=0, tags=[], size="medium",
-                    args=None):
+
+def tf_cc_test_mkl(srcs,
+                   deps,
+                   name="",
+                   linkstatic=0,
+                   tags=[],
+                   size="medium",
+                   args=None):
   if_mkl(tf_cc_tests(srcs, deps, linkstatic=linkstatic, tags=tags,
                      size=size, args=args))
 
-def tf_cc_tests_gpu(srcs, deps, name='', linkstatic=0, tags=[], size="medium",
+
+def tf_cc_tests_gpu(srcs,
+                    deps,
+                    name="",
+                    linkstatic=0,
+                    tags=[],
+                    size="medium",
                     args=None):
   tf_cc_tests(srcs, deps, linkstatic=linkstatic, tags=tags, size=size,
               args=args)
 
 
-def tf_cuda_cc_tests(srcs, deps, name='', tags=[], size="medium", linkstatic=0,
-                     args=None, linkopts=[]):
+def tf_cuda_cc_tests(srcs,
+                     deps,
+                     name="",
+                     tags=[],
+                     size="medium",
+                     linkstatic=0,
+                     args=None,
+                     linkopts=[]):
   for src in srcs:
     tf_cuda_cc_test(
         name=src_to_test_name(src),
@@ -413,48 +508,52 @@ def tf_cuda_cc_tests(srcs, deps, name='', tags=[], size="medium", linkstatic=0,
         args=args,
         linkopts=linkopts)
 
+
 def _cuda_copts():
-    """Gets the appropriate set of copts for (maybe) CUDA compilation.
+  """Gets the appropriate set of copts for (maybe) CUDA compilation.
 
     If we're doing CUDA compilation, returns copts for our particular CUDA
     compiler.  If we're not doing CUDA compilation, returns an empty list.
 
     """
-    return cuda_default_copts() + select({
-        "//conditions:default": [],
-        "@local_config_cuda//cuda:using_nvcc": (
-            [
-                "-nvcc_options=relaxed-constexpr",
-                "-nvcc_options=ftz=true",
-            ]
-        ),
-        "@local_config_cuda//cuda:using_clang": (
-            [
-                "-fcuda-flush-denormals-to-zero",
-            ]
-        ),
-    })
+  return cuda_default_copts() + select({
+      "//conditions:default": [],
+      "@local_config_cuda//cuda:using_nvcc": ([
+          "-nvcc_options=relaxed-constexpr",
+          "-nvcc_options=ftz=true",
+      ]),
+      "@local_config_cuda//cuda:using_clang": ([
+          "-fcuda-flush-denormals-to-zero",
+      ]),
+  })
+
 
 # Build defs for TensorFlow kernels
 
+
 # When this target is built using --config=cuda, a cc_library is built
 # that passes -DGOOGLE_CUDA=1 and '-x cuda', linking in additional
 # libraries needed by GPU kernels.
-def tf_gpu_kernel_library(srcs, copts=[], cuda_copts=[], deps=[], hdrs=[],
+def tf_gpu_kernel_library(srcs,
+                          copts=[],
+                          cuda_copts=[],
+                          deps=[],
+                          hdrs=[],
                           **kwargs):
   copts = copts + _cuda_copts() + if_cuda(cuda_copts) + tf_copts()
 
   native.cc_library(
-      srcs = srcs,
-      hdrs = hdrs,
-      copts = copts,
-      deps = deps + if_cuda([
-          str(Label("//tensorflow/core:cuda")),
-          str(Label("//tensorflow/core:gpu_lib")),
+      srcs=srcs,
+      hdrs=hdrs,
+      copts=copts,
+      deps=deps + if_cuda([
+          "//tensorflow/core:cuda",
+          "//tensorflow/core:gpu_lib",
       ]),
       alwayslink=1,
       **kwargs)
 
+
 def tf_cuda_library(deps=None, cuda_deps=None, copts=None, **kwargs):
   """Generate a cc_library with a conditional set of CUDA dependencies.
 
@@ -479,15 +578,23 @@ def tf_cuda_library(deps=None, cuda_deps=None, copts=None, **kwargs):
     copts = []
 
   native.cc_library(
-      deps = deps + if_cuda(cuda_deps + [
-          str(Label("//tensorflow/core:cuda")),
+      deps=deps + if_cuda(cuda_deps + [
+          "//tensorflow/core:cuda",
           "@local_config_cuda//cuda:cuda_headers"
       ]),
-      copts = copts + if_cuda(["-DGOOGLE_CUDA=1"]) + if_mkl(["-DINTEL_MKL=1"]),
+      copts=copts + if_cuda(["-DGOOGLE_CUDA=1"]) + if_mkl(["-DINTEL_MKL=1"]),
       **kwargs)
 
-def tf_kernel_library(name, prefix=None, srcs=None, gpu_srcs=None, hdrs=None,
-                      deps=None, alwayslink=1, copts=tf_copts(), **kwargs):
+
+def tf_kernel_library(name,
+                      prefix=None,
+                      srcs=None,
+                      gpu_srcs=None,
+                      hdrs=None,
+                      deps=None,
+                      alwayslink=1,
+                      copts=tf_copts(),
+                      **kwargs):
   """A rule to build a TensorFlow OpKernel.
 
   May either specify srcs/hdrs or prefix.  Similar to tf_cuda_library,
@@ -517,43 +624,58 @@ def tf_kernel_library(name, prefix=None, srcs=None, gpu_srcs=None, hdrs=None,
     deps = []
 
   if prefix:
-    if native.glob([prefix + "*.cu.cc"], exclude = ["*test*"]):
+    if native.glob([prefix + "*.cu.cc"], exclude=["*test*"]):
       if not gpu_srcs:
         gpu_srcs = []
-      gpu_srcs = gpu_srcs + native.glob([prefix + "*.cu.cc", prefix + "*.h"],
-                                        exclude = [prefix + "*test*"])
-    srcs = srcs + native.glob([prefix + "*.cc"],
-                              exclude = [prefix + "*test*", prefix + "*.cu.cc"])
-    hdrs = hdrs + native.glob([prefix + "*.h"], exclude = [prefix + "*test*",
-                                                           prefix + "*.cu.h"])
+      gpu_srcs = gpu_srcs + native.glob(
+          [prefix + "*.cu.cc", prefix + "*.h"], exclude=[prefix + "*test*"])
+    srcs = srcs + native.glob(
+        [prefix + "*.cc"], exclude=[prefix + "*test*", prefix + "*.cu.cc"])
+    hdrs = hdrs + native.glob(
+        [prefix + "*.h"], exclude=[prefix + "*test*", prefix + "*.cu.h"])
 
   cuda_deps = [str(Label("//tensorflow/core:gpu_lib"))]
   if gpu_srcs:
     for gpu_src in gpu_srcs:
       if gpu_src.endswith(".cc") and not gpu_src.endswith(".cu.cc"):
-        fail("{} not allowed in gpu_srcs. .cc sources must end with .cu.cc".format(gpu_src))
+        fail("{} not allowed in gpu_srcs. .cc sources must end with .cu.cc".
+             format(gpu_src))
     tf_gpu_kernel_library(
-        name = name + "_gpu",
-        srcs = gpu_srcs,
-        deps = deps,
-        **kwargs)
+        name=name + "_gpu", srcs=gpu_srcs, deps=deps, **kwargs)
     cuda_deps.extend([":" + name + "_gpu"])
   tf_cuda_library(
-      name = name,
-      srcs = srcs,
-      hdrs = hdrs,
-      copts = copts,
-      cuda_deps = cuda_deps,
-      linkstatic = 1,   # Needed since alwayslink is broken in bazel b/27630669
-      alwayslink = alwayslink,
-      deps = deps,
+      name=name,
+      srcs=srcs,
+      hdrs=hdrs,
+      copts=copts,
+      cuda_deps=cuda_deps,
+      linkstatic=1,  # Needed since alwayslink is broken in bazel b/27630669
+      alwayslink=alwayslink,
+      deps=deps,
       **kwargs)
 
-def tf_mkl_kernel_library(name, prefix=None, srcs=None, gpu_srcs=None, hdrs=None,
-                      deps=None, alwayslink=1, copts=tf_copts(), **kwargs):
-  if_mkl(tf_kernel_library(name, prefix=prefix, srcs=srcs, gpu_srcs=gpu_srcs, 
-                                 hdrs=hdrs, deps=deps, alwayslink=alwayslink, 
-                                 copts=copts, **kwargs))
+
+def tf_mkl_kernel_library(name,
+                          prefix=None,
+                          srcs=None,
+                          gpu_srcs=None,
+                          hdrs=None,
+                          deps=None,
+                          alwayslink=1,
+                          copts=tf_copts(),
+                          **kwargs):
+  if_mkl(
+      tf_kernel_library(
+          name,
+          prefix=prefix,
+          srcs=srcs,
+          gpu_srcs=gpu_srcs,
+          hdrs=hdrs,
+          deps=deps,
+          alwayslink=alwayslink,
+          copts=copts,
+          **kwargs))
+
 
 # Bazel rules for building swig files.
 def _py_wrap_cc_impl(ctx):
@@ -570,59 +692,61 @@ def _py_wrap_cc_impl(ctx):
   inputs += ctx.files.toolchain_deps
   swig_include_dirs = set(_get_repository_roots(ctx, inputs))
   swig_include_dirs += sorted([f.dirname for f in ctx.files._swiglib])
-  args = ["-c++",
-          "-python",
-          "-module", module_name,
-          "-o", ctx.outputs.cc_out.path,
-          "-outdir", ctx.outputs.py_out.dirname]
+  args = [
+      "-c++", "-python", "-module", module_name, "-o", ctx.outputs.cc_out.path,
+      "-outdir", ctx.outputs.py_out.dirname
+  ]
   args += ["-l" + f.path for f in ctx.files.swig_includes]
   args += ["-I" + i for i in swig_include_dirs]
   args += [src.path]
-  outputs = [ctx.outputs.cc_out,
-             ctx.outputs.py_out]
-  ctx.action(executable=ctx.executable._swig,
-             arguments=args,
-             inputs=list(inputs),
-             outputs=outputs,
-             mnemonic="PythonSwig",
-             progress_message="SWIGing " + src.path)
+  outputs = [ctx.outputs.cc_out, ctx.outputs.py_out]
+  ctx.action(
+      executable=ctx.executable._swig,
+      arguments=args,
+      inputs=list(inputs),
+      outputs=outputs,
+      mnemonic="PythonSwig",
+      progress_message="SWIGing " + src.path)
   return struct(files=set(outputs))
 
+
 _py_wrap_cc = rule(
-    attrs = {
-        "srcs": attr.label_list(
-            mandatory = True,
-            allow_files = True,
-        ),
-        "swig_includes": attr.label_list(
-            cfg = "data",
-            allow_files = True,
-        ),
-        "deps": attr.label_list(
-            allow_files = True,
-            providers = ["cc"],
-        ),
-        "toolchain_deps": attr.label_list(
-            allow_files = True,
-        ),
-        "module_name": attr.string(mandatory = True),
-        "py_module_name": attr.string(mandatory = True),
-        "_swig": attr.label(
-            default = Label("@swig//:swig"),
-            executable = True,
-            cfg = "host",
-        ),
-        "_swiglib": attr.label(
-            default = Label("@swig//:templates"),
-            allow_files = True,
-        ),
+    attrs={
+        "srcs":
+            attr.label_list(
+                mandatory=True,
+                allow_files=True,),
+        "swig_includes":
+            attr.label_list(
+                cfg="data",
+                allow_files=True,),
+        "deps":
+            attr.label_list(
+                allow_files=True,
+                providers=["cc"],),
+        "toolchain_deps":
+            attr.label_list(
+                allow_files=True,),
+        "module_name":
+            attr.string(mandatory=True),
+        "py_module_name":
+            attr.string(mandatory=True),
+        "_swig":
+            attr.label(
+                default=Label("@swig//:swig"),
+                executable=True,
+                cfg="host",),
+        "_swiglib":
+            attr.label(
+                default=Label("@swig//:templates"),
+                allow_files=True,),
     },
-    outputs = {
+    outputs={
         "cc_out": "%{module_name}.cc",
         "py_out": "%{py_module_name}.py",
     },
-    implementation = _py_wrap_cc_impl,
-)
+    implementation=_py_wrap_cc_impl,)
+
 
 def _get_repository_roots(ctx, files):
   """Returns abnormal root directories under which files reside.
@@ -653,6 +777,7 @@ def _get_repository_roots(ctx, files):
       result[root] -= 1
   return [k for v, k in sorted([(v, k) for k, v in result.items()])]
 
+
 # Bazel rule for collecting the header files that a target depends on.
 def _transitive_hdrs_impl(ctx):
   outputs = set()
@@ -660,30 +785,27 @@ def _transitive_hdrs_impl(ctx):
     outputs += dep.cc.transitive_headers
   return struct(files=outputs)
 
+
 _transitive_hdrs = rule(
-    attrs = {
+    attrs={
         "deps": attr.label_list(
-            allow_files = True,
-            providers = ["cc"],
-        ),
+            allow_files=True,
+            providers=["cc"],),
     },
-    implementation = _transitive_hdrs_impl,
-)
+    implementation=_transitive_hdrs_impl,)
+
 
 def transitive_hdrs(name, deps=[], **kwargs):
-  _transitive_hdrs(name=name + "_gather",
-                   deps=deps)
-  native.filegroup(name=name,
-                   srcs=[":" + name + "_gather"])
+  _transitive_hdrs(name=name + "_gather", deps=deps)
+  native.filegroup(name=name, srcs=[":" + name + "_gather"])
+
 
 # Create a header only library that includes all the headers exported by
 # the libraries in deps.
 def cc_header_only_library(name, deps=[], **kwargs):
-  _transitive_hdrs(name=name + "_gather",
-                   deps=deps)
-  native.cc_library(name=name,
-                    hdrs=[":" + name + "_gather"],
-                    **kwargs)
+  _transitive_hdrs(name=name + "_gather", deps=deps)
+  native.cc_library(name=name, hdrs=[":" + name + "_gather"], **kwargs)
+
 
 def tf_custom_op_library_additional_deps():
   return [
@@ -692,6 +814,7 @@ def tf_custom_op_library_additional_deps():
       str(Label("//tensorflow/core:framework_headers_lib")),
   ]
 
+
 # Traverse the dependency graph along the "deps" attribute of the
 # target and return a struct with one field called 'tf_collected_deps'.
 # tf_collected_deps will be the union of the deps of the current target
@@ -705,14 +828,16 @@ def _collect_deps_aspect_impl(target, ctx):
         alldeps = alldeps | dep.tf_collected_deps
   return struct(tf_collected_deps=alldeps)
 
+
 collect_deps_aspect = aspect(
-    implementation=_collect_deps_aspect_impl,
-    attr_aspects=["deps"])
+    implementation=_collect_deps_aspect_impl, attr_aspects=["deps"])
+
 
 def _dep_label(dep):
   label = dep.label
   return label.package + ":" + label.name
 
+
 # This rule checks that the transitive dependencies of targets listed
 # in the 'deps' attribute don't depend on the targets listed in
 # the 'disallowed_deps' attribute.
@@ -724,23 +849,23 @@ def _check_deps_impl(ctx):
     for dep in input_dep.tf_collected_deps:
       for disallowed_dep in disallowed_deps:
         if dep == disallowed_dep.label:
-          fail(_dep_label(input_dep) + " cannot depend on " +
-               _dep_label(disallowed_dep))
+          fail(
+              _dep_label(input_dep) + " cannot depend on " + _dep_label(
+                  disallowed_dep))
   return struct()
 
+
 check_deps = rule(
     _check_deps_impl,
-    attrs = {
-        "deps": attr.label_list(
-            aspects=[collect_deps_aspect],
-            mandatory = True,
-            allow_files = True
-        ),
-        "disallowed_deps": attr.label_list(
-            mandatory = True,
-            allow_files = True
-        )},
-)
+    attrs={
+        "deps":
+            attr.label_list(
+                aspects=[collect_deps_aspect], mandatory=True,
+                allow_files=True),
+        "disallowed_deps":
+            attr.label_list(mandatory=True, allow_files=True)
+    },)
+
 
 # Helper to build a dynamic library (.so) from the sources containing
 # implementations of custom ops and kernels.
@@ -753,33 +878,42 @@ def tf_custom_op_library(name, srcs=[], gpu_srcs=[], deps=[]):
   if gpu_srcs:
     basename = name.split(".")[0]
     native.cc_library(
-        name = basename + "_gpu",
-        srcs = gpu_srcs,
-        copts = _cuda_copts(),
-        deps = deps + if_cuda(cuda_deps))
+        name=basename + "_gpu",
+        srcs=gpu_srcs,
+        copts=_cuda_copts(),
+        deps=deps + if_cuda(cuda_deps))
     cuda_deps.extend([":" + basename + "_gpu"])
 
-  check_deps(name=name+"_check_deps",
-             deps=deps + if_cuda(cuda_deps),
-             disallowed_deps=[str(Label("//tensorflow/core:framework")),
-                              str(Label("//tensorflow/core:lib"))])
+  check_deps(
+      name=name + "_check_deps",
+      deps=deps + if_cuda(cuda_deps),
+      disallowed_deps=[
+          "//tensorflow/core:framework",
+          "//tensorflow/core:lib"
+      ])
 
-  native.cc_binary(name=name,
-                   srcs=srcs,
-                   deps=deps + if_cuda(cuda_deps),
-                   data=[name + "_check_deps"],
-                   copts=tf_copts(),
-                   linkshared=1,
-                   linkopts = select({
-                       "//conditions:default": [
-                           "-lm",
-                       ],
-                       str(Label("//tensorflow:darwin")): [],
-                   }),
-  )
+  native.cc_binary(
+      name=name,
+      srcs=srcs,
+      deps=deps + if_cuda(cuda_deps),
+      data=[name + "_check_deps"],
+      copts=tf_copts(),
+      linkshared=1,
+      linkopts=select({
+          "//conditions:default": [
+              "-lm",
+          ],
+          "//tensorflow:darwin": [],
+      }),)
 
-def tf_custom_op_py_library(name, srcs=[], dso=[], kernels=[],
-                            srcs_version="PY2AND3", visibility=None, deps=[]):
+
+def tf_custom_op_py_library(name,
+                            srcs=[],
+                            dso=[],
+                            kernels=[],
+                            srcs_version="PY2AND3",
+                            visibility=None,
+                            deps=[]):
   kernels = kernels  # unused argument
   native.py_library(
       name=name,
@@ -787,86 +921,103 @@ def tf_custom_op_py_library(name, srcs=[], dso=[], kernels=[],
       srcs=srcs,
       srcs_version=srcs_version,
       visibility=visibility,
-      deps=deps,
-  )
+      deps=deps,)
+
 
 def tf_extension_linkopts():
   return []  # No extension link opts
 
+
 def tf_extension_copts():
   return []  # No extension c opts
 
-def tf_py_wrap_cc(name, srcs, swig_includes=[], deps=[], copts=[], **kwargs):
+
+def tf_py_wrap_cc(name,
+                             srcs,
+                             swig_includes=[],
+                             deps=[],
+                             copts=[],
+                             **kwargs):
   module_name = name.split("/")[-1]
   # Convert a rule name such as foo/bar/baz to foo/bar/_baz.so
   # and use that as the name for the rule producing the .so file.
   cc_library_name = "/".join(name.split("/")[:-1] + ["_" + module_name + ".so"])
-  cc_library_pyd_name = "/".join(name.split("/")[:-1] + ["_" + module_name + ".pyd"])
+  cc_library_pyd_name = "/".join(
+      name.split("/")[:-1] + ["_" + module_name + ".pyd"])
   extra_deps = []
-  _py_wrap_cc(name=name + "_py_wrap",
-              srcs=srcs,
-              swig_includes=swig_includes,
-              deps=deps + extra_deps,
-              toolchain_deps=["//tools/defaults:crosstool"],
-              module_name=module_name,
-              py_module_name=name)
+  _py_wrap_cc(
+      name=name + "_py_wrap",
+      srcs=srcs,
+      swig_includes=swig_includes,
+      deps=deps + extra_deps,
+      toolchain_deps=["//tools/defaults:crosstool"],
+      module_name=module_name,
+      py_module_name=name)
   extra_linkopts = select({
       "@local_config_cuda//cuda:darwin": [
           "-Wl,-exported_symbols_list",
           str(Label("//tensorflow:tf_exported_symbols.lds"))
       ],
-      str(Label("//tensorflow:windows")): [
-      ],
+      str(Label("//tensorflow:windows")): [],
       "//conditions:default": [
           "-Wl,--version-script",
-          str(Label("//tensorflow:tf_version_script.lds"))
-      ]})
+          "//tensorflow:tf_version_script.lds"
+      ]
+  })
   extra_deps += select({
       "@local_config_cuda//cuda:darwin": [
-        str(Label("//tensorflow:tf_exported_symbols.lds"))
-      ],
-      str(Label("//tensorflow:windows")): [
+          "//tensorflow:tf_exported_symbols.lds"
       ],
+      "//tensorflow:windows": [],
       "//conditions:default": [
-        str(Label("//tensorflow:tf_version_script.lds"))
+          "//tensorflow:tf_version_script.lds"
       ]
   })
 
   native.cc_binary(
       name=cc_library_name,
       srcs=[module_name + ".cc"],
-      copts=(copts + ["-Wno-self-assign",
-                      "-Wno-sign-compare",
-                      "-Wno-write-strings"]
-             + tf_extension_copts()),
+      copts=(copts + [
+          "-Wno-self-assign", "-Wno-sign-compare", "-Wno-write-strings"
+      ] + tf_extension_copts()),
       linkopts=tf_extension_linkopts() + extra_linkopts,
       linkstatic=1,
       linkshared=1,
       deps=deps + extra_deps)
   native.genrule(
-      name = "gen_" + cc_library_pyd_name,
-      srcs = [":" + cc_library_name],
-      outs = [cc_library_pyd_name],
-      cmd = "cp $< $@",
-  )
-  native.py_library(name=name,
-                    srcs=[":" + name + ".py"],
-                    srcs_version="PY2AND3",
-                    data=select({
-                      str(Label("//tensorflow:windows")): [":" + cc_library_pyd_name],
-                      "//conditions:default": [":" + cc_library_name],
-                    }))
+      name="gen_" + cc_library_pyd_name,
+      srcs=[":" + cc_library_name],
+      outs=[cc_library_pyd_name],
+      cmd="cp $< $@",)
+  native.py_library(
+      name=name,
+      srcs=[":" + name + ".py"],
+      srcs_version="PY2AND3",
+      data=select({
+          "//tensorflow:windows": [":" + cc_library_pyd_name],
+          "//conditions:default": [":" + cc_library_name],
+      }))
+
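+# Illustrative sketch only (not part of this change): a hypothetical SWIG
+# wrapper target built with the macro above; the file and target names below
+# are assumptions made for the example.
+#
+# tf_py_wrap_cc(
+#     name="pywrap_example",
+#     srcs=["example.i"],
+#     swig_includes=["util.i"],
+#     deps=["//tensorflow/core:lib"],
+# )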
 
 def py_test(deps=[], **kwargs):
   native.py_test(
       deps=select({
-          "//conditions:default" : deps,
-          str(Label("//tensorflow:no_tensorflow_py_deps")) : []
+          "//conditions:default": deps,
+          "//tensorflow:no_tensorflow_py_deps": []
       }),
       **kwargs)
 
-def tf_py_test(name, srcs, size="medium", data=[], main=None, args=[],
-               tags=[], shard_count=1, additional_deps=[], flaky=0,
+
+def tf_py_test(name,
+               srcs,
+               size="medium",
+               data=[],
+               main=None,
+               args=[],
+               tags=[],
+               shard_count=1,
+               additional_deps=[],
+               flaky=0,
                xla_enabled=False):
   if xla_enabled:
     additional_deps += tf_additional_xla_deps_py()
@@ -881,46 +1032,67 @@ def tf_py_test(name, srcs, size="medium", data=[], main=None, args=[],
       shard_count=shard_count,
       data=data,
       deps=select({
-          "//conditions:default" : [
-            str(Label("//tensorflow/python:extra_py_tests_deps")),
-            str(Label("//tensorflow/python:gradient_checker")),
+          "//conditions:default": [
+              "//tensorflow/python:extra_py_tests_deps",
+              "//tensorflow/python:gradient_checker",
           ] + additional_deps,
-          str(Label("//tensorflow:no_tensorflow_py_deps")) : []
+          "//tensorflow:no_tensorflow_py_deps": []
       }),
       flaky=flaky,
       srcs_version="PY2AND3")
 
-def cuda_py_test(name, srcs, size="medium", data=[], main=None, args=[],
-                 shard_count=1, additional_deps=[], tags=[], flaky=0,
+
+def cuda_py_test(name,
+                 srcs,
+                 size="medium",
+                 data=[],
+                 main=None,
+                 args=[],
+                 shard_count=1,
+                 additional_deps=[],
+                 tags=[],
+                 flaky=0,
                  xla_enabled=False):
   test_tags = tags + tf_cuda_tests_tags()
-  tf_py_test(name=name,
-             size=size,
-             srcs=srcs,
-             data=data,
-             main=main,
-             args=args,
-             tags=test_tags,
-             shard_count=shard_count,
-             additional_deps=additional_deps,
-             flaky=flaky,
-             xla_enabled=xla_enabled)
+  tf_py_test(
+      name=name,
+      size=size,
+      srcs=srcs,
+      data=data,
+      main=main,
+      args=args,
+      tags=test_tags,
+      shard_count=shard_count,
+      additional_deps=additional_deps,
+      flaky=flaky,
+      xla_enabled=xla_enabled)
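+
+# Illustrative sketch only (not part of this change): a hypothetical GPU test
+# declared with the macro above; the test name, source file and deps are
+# assumptions made for the example.
+#
+# cuda_py_test(
+#     name="my_op_gpu_test",
+#     size="small",
+#     srcs=["my_op_gpu_test.py"],
+#     additional_deps=["//tensorflow/python:client_testlib"],
+#     shard_count=2,
+# )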
 
-def sycl_py_test(name, srcs, size="medium", data=[], main=None, args=[],
-                 shard_count=1, additional_deps=[], tags=[], flaky=0,
+
+def sycl_py_test(name,
+                 srcs,
+                 size="medium",
+                 data=[],
+                 main=None,
+                 args=[],
+                 shard_count=1,
+                 additional_deps=[],
+                 tags=[],
+                 flaky=0,
                  xla_enabled=False):
- test_tags = tags + tf_sycl_tests_tags()
- tf_py_test(name=name,
-            size=size,
-            srcs=srcs,
-            data=data,
-            main=main,
-            args=args,
-            tags=test_tags,
-            shard_count=shard_count,
-            additional_deps=additional_deps,
-            flaky=flaky,
-            xla_enabled=xla_enabled)
+  test_tags = tags + tf_sycl_tests_tags()
+  tf_py_test(
+      name=name,
+      size=size,
+      srcs=srcs,
+      data=data,
+      main=main,
+      args=args,
+      tags=test_tags,
+      shard_count=shard_count,
+      additional_deps=additional_deps,
+      flaky=flaky,
+      xla_enabled=xla_enabled)
+
 
 def py_tests(name,
              srcs,
@@ -935,22 +1107,39 @@ def py_tests(name,
     test_name = src.split("/")[-1].split(".")[0]
     if prefix:
       test_name = "%s_%s" % (prefix, test_name)
-    tf_py_test(name=test_name,
-               size=size,
-               srcs=[src],
-               main=src,
-               tags=tags,
-               shard_count=shard_count,
-               data=data,
-               additional_deps=additional_deps,
-               xla_enabled=xla_enabled)
+    tf_py_test(
+        name=test_name,
+        size=size,
+        srcs=[src],
+        main=src,
+        tags=tags,
+        shard_count=shard_count,
+        data=data,
+        additional_deps=additional_deps,
+        xla_enabled=xla_enabled)
 
-def cuda_py_tests(name, srcs, size="medium", additional_deps=[], data=[],
-                  shard_count=1, tags=[], prefix="", xla_enabled=False):
+
+def cuda_py_tests(name,
+                  srcs,
+                  size="medium",
+                  additional_deps=[],
+                  data=[],
+                  shard_count=1,
+                  tags=[],
+                  prefix="",
+                  xla_enabled=False):
   test_tags = tags + tf_cuda_tests_tags()
-  py_tests(name=name, size=size, srcs=srcs, additional_deps=additional_deps,
-           data=data, tags=test_tags, shard_count=shard_count,prefix=prefix,
-           xla_enabled=xla_enabled)
+  py_tests(
+      name=name,
+      size=size,
+      srcs=srcs,
+      additional_deps=additional_deps,
+      data=data,
+      tags=test_tags,
+      shard_count=shard_count,
+      prefix=prefix,
+      xla_enabled=xla_enabled)
+
 
 # Creates a genrule named <name> for running tools/proto_text's generator to
 # make the proto_text functions, for the protos passed in <srcs>.
@@ -958,40 +1147,46 @@ def cuda_py_tests(name, srcs, size="medium", additional_deps=[], data=[],
 # Return a struct with fields (hdrs, srcs) containing the names of the
 # generated files.
 def tf_generate_proto_text_sources(name, srcs_relative_dir, srcs):
-  out_hdrs = ([p.replace(".proto", ".pb_text.h") for p in srcs] +
-              [p.replace(".proto", ".pb_text-impl.h") for p in srcs])
+  out_hdrs = (
+      [p.replace(".proto", ".pb_text.h")
+       for p in srcs] + [p.replace(".proto", ".pb_text-impl.h") for p in srcs])
   out_srcs = [p.replace(".proto", ".pb_text.cc") for p in srcs]
   native.genrule(
-        name = name,
-        srcs = srcs + [str(Label("//tensorflow/tools/proto_text:placeholder.txt"))],
-        outs = out_hdrs + out_srcs,
-        cmd = "$(location //tensorflow/tools/proto_text:gen_proto_text_functions) " +
-              "$(@D) " + srcs_relative_dir + " $(SRCS)",
-        tools = [str(Label("//tensorflow/tools/proto_text:gen_proto_text_functions"))],
-    )
+      name=name,
+      srcs=srcs + ["//tensorflow/tools/proto_text:placeholder.txt"],
+      outs=out_hdrs + out_srcs,
+      cmd=
+      "$(location //tensorflow/tools/proto_text:gen_proto_text_functions) "
+      + "$(@D) " + srcs_relative_dir + " $(SRCS)",
+      tools=[
+          "//tensorflow/tools/proto_text:gen_proto_text_functions"
+      ],)
   return struct(hdrs=out_hdrs, srcs=out_srcs)
 
+
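+# Illustrative sketch only (not part of this change): consuming the returned
+# (hdrs, srcs) struct from inside a build macro; the target and proto names
+# below are assumptions made for the example.
+#
+# _pb_text = tf_generate_proto_text_sources(
+#     name="example_proto_text",
+#     srcs_relative_dir="tensorflow/core/",
+#     srcs=["example/example.proto"],
+# )
+# native.cc_library(
+#     name="example_proto_text_lib",
+#     srcs=_pb_text.srcs,
+#     hdrs=_pb_text.hdrs,
+# )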
 def tf_genrule_cmd_append_to_srcs(to_append):
-    return ("cat $(SRCS) > $(@) && " +
-            "echo >> $(@) && " +
-            "echo " + to_append + " >> $(@)")
+  return ("cat $(SRCS) > $(@) && " + "echo >> $(@) && " + "echo " + to_append +
+          " >> $(@)")
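+# For illustration (not part of this change): tf_genrule_cmd_append_to_srcs("foo")
+# expands to "cat $(SRCS) > $(@) && echo >> $(@) && echo foo >> $(@)", i.e. the
+# genrule's srcs are concatenated into the output and the given text appended.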
 
 
 def tf_version_info_genrule():
   native.genrule(
-      name = "version_info_gen",
-      srcs = [
-          str(Label("//tensorflow/tools/git:gen/spec.json")),
-          str(Label("//tensorflow/tools/git:gen/head")),
-          str(Label("//tensorflow/tools/git:gen/branch_ref")),
+      name="version_info_gen",
+      srcs=[
+          "//tensorflow/tools/git:gen/spec.json",
+          "//tensorflow/tools/git:gen/head",
+          "//tensorflow/tools/git:gen/branch_ref",
       ],
-      outs = ["util/version_info.cc"],
-      cmd = "$(location //tensorflow/tools/git:gen_git_source.py) --generate $(SRCS) \"$@\"",
-      local = 1,
-      tools = [str(Label("//tensorflow/tools/git:gen_git_source.py"))],
-  )
+      outs=["util/version_info.cc"],
+      cmd=
+      "$(location //tensorflow/tools/git:gen_git_source.py) --generate $(SRCS) \"$@\"",
+      local=1,
+      tools=["//tensorflow/tools/git:gen_git_source.py"],)
 
-def cc_library_with_android_deps(deps, android_deps=[],
-                                common_deps=[], **kwargs):
+
+def cc_library_with_android_deps(deps,
+                                 android_deps=[],
+                                 common_deps=[],
+                                 **kwargs):
   deps = if_not_android(deps) + if_android(android_deps) + common_deps
   native.cc_library(deps=deps, **kwargs)
diff --git a/tensorflow/tools/ci_build/linux/cpu/run_py3_contrib.sh b/tensorflow/tools/ci_build/linux/cpu/run_py3_contrib.sh
index 45747ba2d94..a03cab0cca5 100755
--- a/tensorflow/tools/ci_build/linux/cpu/run_py3_contrib.sh
+++ b/tensorflow/tools/ci_build/linux/cpu/run_py3_contrib.sh
@@ -33,6 +33,6 @@ yes "" | ./configure
 
 # Run bazel test command. Double test timeouts to avoid flakes.
 bazel test --test_tag_filters=-gpu,-benchmark-test -k \
-    --jobs=${N_JOBS} --test_timeout 300,450,1200,3600 --build_tests_only \
+    --jobs=${N_JOBS} --test_timeout 300,450,1200,3600 \
     --test_output=errors -- \
     //tensorflow/contrib/...
diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py
index 594dde3d7bf..f591e50ac9d 100644
--- a/tensorflow/tools/pip_package/setup.py
+++ b/tensorflow/tools/pip_package/setup.py
@@ -167,15 +167,12 @@ headers = (list(find_files('*.h', 'tensorflow/core')) +
            list(find_files('*', 'third_party/eigen3')) +
            list(find_files('*', 'external/eigen_archive')))
 
-tf_long_description = (
-    'Note: TensorFlow manylinux1 wheels do not conform to the '
-    'specification in PEP531.')
 
 setup(
     name=project_name,
     version=_VERSION.replace('-', ''),
     description='TensorFlow helps the tensors flow',
-    long_description=tf_long_description,
+    long_description='',
     url='http://tensorflow.org/',
     author='Google Inc.',
     author_email='opensource@google.com',
diff --git a/tensorflow/tools/test/check_futures_test.py b/tensorflow/tools/test/check_futures_test.py
index 32d65adb1f2..36a61c0ecc2 100644
--- a/tensorflow/tools/test/check_futures_test.py
+++ b/tensorflow/tools/test/check_futures_test.py
@@ -40,6 +40,7 @@ FUTURES_PATTERN_2 = re.compile(
 REQUIRED_FUTURES = frozenset(['absolute_import', 'division', 'print_function'])
 
 WHITELIST = [
+    'python/platform/control_imports.py',
     'tools/docker/jupyter_notebook_config.py',
 ]
 
diff --git a/tensorflow/workspace.bzl b/tensorflow/workspace.bzl
index 4a39723cdc9..0a79bdf6e4b 100644
--- a/tensorflow/workspace.bzl
+++ b/tensorflow/workspace.bzl
@@ -1,9 +1,11 @@
 # TensorFlow external dependencies that can be loaded in WORKSPACE files.
 
-load("@io_bazel_rules_closure//closure/private:java_import_external.bzl", "java_import_external")
+load("@io_bazel_rules_closure//closure/private:java_import_external.bzl",
+     "java_import_external")
 load("@io_bazel_rules_closure//closure:defs.bzl", "filegroup_external")
 load("@io_bazel_rules_closure//closure:defs.bzl", "webfiles_external")
 load("//third_party/gpus:cuda_configure.bzl", "cuda_configure")
+
 load("//third_party/sycl:sycl_configure.bzl", "sycl_configure")
 
 
@@ -14,20 +16,23 @@ def _parse_bazel_version(bazel_version):
 
   # Split into (release, date) parts and only return the release
   # as a tuple of integers.
-  parts = version.split('-', 1)
+  parts = version.split("-", 1)
 
   # Turn "release" into a tuple of strings
   version_tuple = ()
-  for number in parts[0].split('.'):
+  for number in parts[0].split("."):
     version_tuple += (str(number),)
   return version_tuple
 
+
 # Check that a specific bazel version is being used.
 def check_version(bazel_version):
   if "bazel_version" not in dir(native):
-    fail("\nCurrent Bazel version is lower than 0.2.1, expected at least %s\n" % bazel_version)
+    fail("\nCurrent Bazel version is lower than 0.2.1, expected at least %s\n" %
+         bazel_version)
   elif not native.bazel_version:
-    print("\nCurrent Bazel is not a release version, cannot check for compatibility.")
+    print("\nCurrent Bazel is not a release version, cannot check for " +
+          "compatibility.")
     print("Make sure that you are running at least Bazel %s.\n" % bazel_version)
   else:
     current_bazel_version = _parse_bazel_version(native.bazel_version)
@@ -37,286 +42,279 @@ def check_version(bazel_version):
           native.bazel_version, bazel_version))
   pass
 
+
 def _repos_are_siblings():
   return Label("@foo//bar").workspace_root.startswith("../")
 
+
 # Temporary workaround to support including TensorFlow as a submodule until this
 # use-case is supported in the next Bazel release.
 def _temp_workaround_http_archive_impl(repo_ctx):
-   repo_ctx.template("BUILD", repo_ctx.attr.build_file,
-                     {
-                         "%prefix%" : ".." if _repos_are_siblings() else "external",
-                         "%ws%": repo_ctx.attr.repository
-                     }, False)
-   repo_ctx.download_and_extract(repo_ctx.attr.urls, "", repo_ctx.attr.sha256,
-                                 "", repo_ctx.attr.strip_prefix)
-   if repo_ctx.attr.patch_file != None:
-     _apply_patch(repo_ctx, repo_ctx.attr.patch_file)
+  repo_ctx.template("BUILD", repo_ctx.attr.build_file, {
+      "%prefix%": ".." if _repos_are_siblings() else "external",
+      "%ws%": repo_ctx.attr.repository
+  }, False)
+  repo_ctx.download_and_extract(repo_ctx.attr.urls, "", repo_ctx.attr.sha256,
+                                "", repo_ctx.attr.strip_prefix)
+  if repo_ctx.attr.patch_file != None:
+    _apply_patch(repo_ctx, repo_ctx.attr.patch_file)
+
 
 temp_workaround_http_archive = repository_rule(
-   implementation=_temp_workaround_http_archive_impl,
-   attrs = {
-      "build_file": attr.label(),
-      "repository": attr.string(),
-      "patch_file": attr.label(default = None),
-      "urls": attr.string_list(default = []),
-      "sha256": attr.string(default = ""),
-      "strip_prefix": attr.string(default = ""),
-   })
+    implementation=_temp_workaround_http_archive_impl,
+    attrs={
+        "build_file": attr.label(),
+        "repository": attr.string(),
+        "patch_file": attr.label(default=None),
+        "urls": attr.string_list(default=[]),
+        "sha256": attr.string(default=""),
+        "strip_prefix": attr.string(default=""),
+    })
+
 
 # Executes specified command with arguments and calls 'fail' if it exited with non-zero code
 def _execute_and_check_ret_code(repo_ctx, cmd_and_args):
   result = repo_ctx.execute(cmd_and_args)
   if result.return_code != 0:
-    fail(("Non-zero return code({1}) when executing '{0}':\n" +
-          "Stdout: {2}\n" +
-          "Stderr: {3}").format(" ".join(cmd_and_args),
-                                result.return_code, result.stdout, result.stderr))
+    fail(("Non-zero return code({1}) when executing '{0}':\n" + "Stdout: {2}\n"
+          + "Stderr: {3}").format(" ".join(cmd_and_args), result.return_code,
+                                  result.stdout, result.stderr))
+
 
 # Apply a patch_file to the repository root directory
 # Runs 'patch -p1'
 def _apply_patch(repo_ctx, patch_file):
-  _execute_and_check_ret_code(repo_ctx, ["patch", "-p1",
-                                         "-d", repo_ctx.path("."),
-                                         "-i", repo_ctx.path(patch_file)])
+  _execute_and_check_ret_code(repo_ctx, [
+      "patch", "-p1", "-d", repo_ctx.path("."), "-i", repo_ctx.path(patch_file)
+  ])
+
 
 # Download the repository and apply a patch to its root
 def _patched_http_archive_impl(repo_ctx):
-  repo_ctx.download_and_extract(repo_ctx.attr.urls,
-                                sha256 = repo_ctx.attr.sha256,
-                                stripPrefix = repo_ctx.attr.strip_prefix)
+  repo_ctx.download_and_extract(
+      repo_ctx.attr.urls,
+      sha256=repo_ctx.attr.sha256,
+      stripPrefix=repo_ctx.attr.strip_prefix)
   _apply_patch(repo_ctx, repo_ctx.attr.patch_file)
 
+
 patched_http_archive = repository_rule(
-    implementation = _patched_http_archive_impl,
-    attrs = {
-      "patch_file": attr.label(),
-      "build_file": attr.label(),
-      "repository": attr.string(),
-      "urls": attr.string_list(default = []),
-      "sha256": attr.string(default = ""),
-      "strip_prefix": attr.string(default = ""),
+    implementation=_patched_http_archive_impl,
+    attrs={
+        "patch_file": attr.label(),
+        "build_file": attr.label(),
+        "repository": attr.string(),
+        "urls": attr.string_list(default=[]),
+        "sha256": attr.string(default=""),
+        "strip_prefix": attr.string(default=""),
     })
 
+
 # If TensorFlow is linked as a submodule.
 # path_prefix and tf_repo_name are no longer used.
-def tf_workspace(path_prefix = "", tf_repo_name = ""):
+def tf_workspace(path_prefix="", tf_repo_name=""):
   # We must check the bazel version before trying to parse any other BUILD
   # files, in case the parsing of those build files depends on the bazel
   # version we require here.
   check_version("0.4.5")
-  cuda_configure(name = "local_config_cuda")
-  sycl_configure(name = "local_config_sycl")
+  cuda_configure(name="local_config_cuda")
+  sycl_configure(name="local_config_sycl")
   if path_prefix:
-    print("path_prefix was specified to tf_workspace but is no longer used and will be removed in the future.")
+    print(
+        "path_prefix was specified to tf_workspace but is no longer used and " +
+        "will be removed in the future."
+    )
   if tf_repo_name:
-    print("tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future.")
+    print(
+        "tf_repo_name was specified to tf_workspace but is no longer used " +
+        "and will be removed in the future."
+    )
 
   native.new_http_archive(
-      name = "eigen_archive",
-      urls = [
+      name="eigen_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/bitbucket.org/eigen/eigen/get/deff8b280204.tar.gz",
           "https://bitbucket.org/eigen/eigen/get/deff8b280204.tar.gz",
       ],
-      sha256 = "a39834683eb5bdb9a7434f0ab3621d2cbc3b07e8002db6de101e45ec536723eb",
-      strip_prefix = "eigen-eigen-deff8b280204",
-      build_file = str(Label("//third_party:eigen.BUILD")),
-  )
+      sha256="a39834683eb5bdb9a7434f0ab3621d2cbc3b07e8002db6de101e45ec536723eb",
+      strip_prefix="eigen-eigen-deff8b280204",
+      build_file=str(Label("//third_party:eigen.BUILD")),)
 
   native.new_http_archive(
-      name = "libxsmm_archive",
-      urls = [
+      name="libxsmm_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/hfp/libxsmm/archive/1.8.tar.gz",
           "https://github.com/hfp/libxsmm/archive/1.8.tar.gz",
       ],
-      sha256 = "0330201afb5525d0950ec861fec9dd75eb40a03845ebe03d2c635cf8bfc14fea",
-      strip_prefix = "libxsmm-1.8",
-      build_file = str(Label("//third_party:libxsmm.BUILD")),
-  )
+      sha256="0330201afb5525d0950ec861fec9dd75eb40a03845ebe03d2c635cf8bfc14fea",
+      strip_prefix="libxsmm-1.8",
+      build_file=str(Label("//third_party:libxsmm.BUILD")),)
 
   native.bind(
-      name = "xsmm_avx",
-      actual = "@libxsmm_archive//third_party:xsmm_avx",
-  )
+      name="xsmm_avx",
+      actual="@libxsmm_archive//third_party:xsmm_avx",)
 
   native.new_http_archive(
-      name = "ortools_archive",
-      urls = [
+      name="ortools_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/or-tools/archive/253f7955c6a1fd805408fba2e42ac6d45b312d15.tar.gz",
           "https://github.com/google/or-tools/archive/253f7955c6a1fd805408fba2e42ac6d45b312d15.tar.gz",
       ],
-      sha256 = "932075525642b04ac6f1b50589f1df5cd72ec2f448b721fd32234cf183f0e755",
-      strip_prefix = "or-tools-253f7955c6a1fd805408fba2e42ac6d45b312d15/src",
-      build_file = str(Label("//third_party:ortools.BUILD")),
-  )
+      sha256="932075525642b04ac6f1b50589f1df5cd72ec2f448b721fd32234cf183f0e755",
+      strip_prefix="or-tools-253f7955c6a1fd805408fba2e42ac6d45b312d15/src",
+      build_file=str(Label("//third_party:ortools.BUILD")),)
 
   native.http_archive(
-      name = "com_googlesource_code_re2",
-      urls = [
+      name="com_googlesource_code_re2",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/re2/archive/b94b7cd42e9f02673cd748c1ac1d16db4052514c.tar.gz",
           "https://github.com/google/re2/archive/b94b7cd42e9f02673cd748c1ac1d16db4052514c.tar.gz",
       ],
-      sha256 = "bd63550101e056427c9e7ff12a408c1c8b74e9803f393ca916b2926fc2c4906f",
-      strip_prefix = "re2-b94b7cd42e9f02673cd748c1ac1d16db4052514c",
-  )
+      sha256="bd63550101e056427c9e7ff12a408c1c8b74e9803f393ca916b2926fc2c4906f",
+      strip_prefix="re2-b94b7cd42e9f02673cd748c1ac1d16db4052514c",)
 
   native.http_archive(
-      name = "gemmlowp",
-      urls = [
+      name="gemmlowp",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/gemmlowp/archive/a6f29d8ac48d63293f845f2253eccbf86bc28321.tar.gz",
           "https://github.com/google/gemmlowp/archive/a6f29d8ac48d63293f845f2253eccbf86bc28321.tar.gz",
       ],
-      sha256 = "75d40ea8e68b0d1644f052fffe8f14a410b2a73d40ccb859a95c0578d194ec26",
-      strip_prefix = "gemmlowp-a6f29d8ac48d63293f845f2253eccbf86bc28321",
-  )
+      sha256="75d40ea8e68b0d1644f052fffe8f14a410b2a73d40ccb859a95c0578d194ec26",
+      strip_prefix="gemmlowp-a6f29d8ac48d63293f845f2253eccbf86bc28321",)
 
   native.new_http_archive(
-      name = "farmhash_archive",
-      urls = [
+      name="farmhash_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/farmhash/archive/92e897b282426729f4724d91a637596c7e2fe28f.zip",
           "https://github.com/google/farmhash/archive/92e897b282426729f4724d91a637596c7e2fe28f.zip",
       ],
-      sha256 = "4c626d1f306bda2c6804ab955892f803f5245f4dcaecb4979dc08b091256da54",
-      strip_prefix = "farmhash-92e897b282426729f4724d91a637596c7e2fe28f",
-      build_file = str(Label("//third_party:farmhash.BUILD")),
-  )
+      sha256="4c626d1f306bda2c6804ab955892f803f5245f4dcaecb4979dc08b091256da54",
+      strip_prefix="farmhash-92e897b282426729f4724d91a637596c7e2fe28f",
+      build_file=str(Label("//third_party:farmhash.BUILD")),)
 
   native.bind(
-      name = "farmhash",
-      actual = "@farmhash//:farmhash",
-  )
+      name="farmhash",
+      actual="@farmhash//:farmhash",)
 
   native.new_http_archive(
-      name = "highwayhash",
-      urls = [
+      name="highwayhash",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/highwayhash/archive/dfcb97ca4fe9277bf9dc1802dd979b071896453b.tar.gz",
           "https://github.com/google/highwayhash/archive/dfcb97ca4fe9277bf9dc1802dd979b071896453b.tar.gz",
       ],
-      sha256 = "0f30a15b1566d93f146c8d149878a06e91d9bb7ec2cfd76906df62a82be4aac9",
-      strip_prefix = "highwayhash-dfcb97ca4fe9277bf9dc1802dd979b071896453b",
-      build_file = str(Label("//third_party:highwayhash.BUILD")),
-  )
+      sha256="0f30a15b1566d93f146c8d149878a06e91d9bb7ec2cfd76906df62a82be4aac9",
+      strip_prefix="highwayhash-dfcb97ca4fe9277bf9dc1802dd979b071896453b",
+      build_file=str(Label("//third_party:highwayhash.BUILD")),)
 
   native.new_http_archive(
-      name = "nasm",
-      urls = [
+      name="nasm",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
           "http://pkgs.fedoraproject.org/repo/pkgs/nasm/nasm-2.12.02.tar.bz2/d15843c3fb7db39af80571ee27ec6fad/nasm-2.12.02.tar.bz2",
       ],
-      sha256 = "00b0891c678c065446ca59bcee64719d0096d54d6886e6e472aeee2e170ae324",
-      strip_prefix = "nasm-2.12.02",
-      build_file = str(Label("//third_party:nasm.BUILD")),
-  )
+      sha256="00b0891c678c065446ca59bcee64719d0096d54d6886e6e472aeee2e170ae324",
+      strip_prefix="nasm-2.12.02",
+      build_file=str(Label("//third_party:nasm.BUILD")),)
 
   temp_workaround_http_archive(
-      name = "jpeg",
-      urls = [
+      name="jpeg",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/libjpeg-turbo/libjpeg-turbo/archive/1.5.1.tar.gz",
           "https://github.com/libjpeg-turbo/libjpeg-turbo/archive/1.5.1.tar.gz",
       ],
-      sha256 = "c15a9607892113946379ccea3ca8b85018301b200754f209453ab21674268e77",
-      strip_prefix = "libjpeg-turbo-1.5.1",
-      build_file = str(Label("//third_party/jpeg:jpeg.BUILD")),
-      repository = tf_repo_name,
-  )
+      sha256="c15a9607892113946379ccea3ca8b85018301b200754f209453ab21674268e77",
+      strip_prefix="libjpeg-turbo-1.5.1",
+      build_file=str(Label("//third_party/jpeg:jpeg.BUILD")),
+      repository=tf_repo_name,)
 
   native.new_http_archive(
-      name = "png_archive",
-      urls = [
+      name="png_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/glennrp/libpng/archive/v1.2.53.zip",
           "https://github.com/glennrp/libpng/archive/v1.2.53.zip",
       ],
-      sha256 = "c35bcc6387495ee6e757507a68ba036d38ad05b415c2553b3debe2a57647a692",
-      strip_prefix = "libpng-1.2.53",
-      build_file = str(Label("//third_party:png.BUILD")),
-  )
+      sha256="c35bcc6387495ee6e757507a68ba036d38ad05b415c2553b3debe2a57647a692",
+      strip_prefix="libpng-1.2.53",
+      build_file=str(Label("//third_party:png.BUILD")),)
 
   native.new_http_archive(
-      name = "gif_archive",
-      urls = [
+      name="gif_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/ufpr.dl.sourceforge.net/project/giflib/giflib-5.1.4.tar.gz",
           "http://ufpr.dl.sourceforge.net/project/giflib/giflib-5.1.4.tar.gz",
           "http://pilotfiber.dl.sourceforge.net/project/giflib/giflib-5.1.4.tar.gz",
       ],
-      sha256 = "34a7377ba834397db019e8eb122e551a49c98f49df75ec3fcc92b9a794a4f6d1",
-      strip_prefix = "giflib-5.1.4",
-      build_file = str(Label("//third_party:gif.BUILD")),
-  )
+      sha256="34a7377ba834397db019e8eb122e551a49c98f49df75ec3fcc92b9a794a4f6d1",
+      strip_prefix="giflib-5.1.4",
+      build_file=str(Label("//third_party:gif.BUILD")),)
 
   native.new_http_archive(
-      name = "six_archive",
-      urls = [
+      name="six_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/pypi.python.org/packages/source/s/six/six-1.10.0.tar.gz",
           "http://pypi.python.org/packages/source/s/six/six-1.10.0.tar.gz",
       ],
-      sha256 = "105f8d68616f8248e24bf0e9372ef04d3cc10104f1980f54d57b2ce73a5ad56a",
-      strip_prefix = "six-1.10.0",
-      build_file = str(Label("//third_party:six.BUILD")),
-  )
+      sha256="105f8d68616f8248e24bf0e9372ef04d3cc10104f1980f54d57b2ce73a5ad56a",
+      strip_prefix="six-1.10.0",
+      build_file=str(Label("//third_party:six.BUILD")),)
 
   native.new_http_archive(
-      name = "org_pythonhosted_markdown",
-      urls = [
+      name="org_pythonhosted_markdown",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/pypi.python.org/packages/1d/25/3f6d2cb31ec42ca5bd3bfbea99b63892b735d76e26f20dd2dcc34ffe4f0d/Markdown-2.6.8.tar.gz",
           "https://pypi.python.org/packages/1d/25/3f6d2cb31ec42ca5bd3bfbea99b63892b735d76e26f20dd2dcc34ffe4f0d/Markdown-2.6.8.tar.gz",
       ],
-      strip_prefix = "Markdown-2.6.8",
-      sha256 = "0ac8a81e658167da95d063a9279c9c1b2699f37c7c4153256a458b3a43860e33",
-      build_file = str(Label("//third_party:markdown.BUILD")),
-  )
+      strip_prefix="Markdown-2.6.8",
+      sha256="0ac8a81e658167da95d063a9279c9c1b2699f37c7c4153256a458b3a43860e33",
+      build_file=str(Label("//third_party:markdown.BUILD")),)
 
   native.new_http_archive(
-      name = "org_html5lib",
-      urls = [
+      name="org_html5lib",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/html5lib/html5lib-python/archive/1.0b8.tar.gz",
           "https://github.com/html5lib/html5lib-python/archive/1.0b8.tar.gz",
       ],
-      sha256 = "adb36c879264e8880b92589c4c4fe0814cd9d157b73328b14d728f48a6bab0a4",
-      strip_prefix = "html5lib-python-1.0b8",
-      build_file = str(Label("//third_party:html5lib.BUILD")),
-  )
+      sha256="adb36c879264e8880b92589c4c4fe0814cd9d157b73328b14d728f48a6bab0a4",
+      strip_prefix="html5lib-python-1.0b8",
+      build_file=str(Label("//third_party:html5lib.BUILD")),)
 
   native.new_http_archive(
-      name = "org_mozilla_bleach",
-      urls = [
+      name="org_mozilla_bleach",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/mozilla/bleach/archive/v1.5.tar.gz",
           "https://github.com/mozilla/bleach/archive/v1.5.tar.gz",
       ],
-      strip_prefix = "bleach-1.5",
-      sha256 = "0d68713d02ba4148c417ab1637dd819333d96929a34401d0233947bec0881ad8",
-      build_file = str(Label("//third_party:bleach.BUILD")),
-  )
+      strip_prefix="bleach-1.5",
+      sha256="0d68713d02ba4148c417ab1637dd819333d96929a34401d0233947bec0881ad8",
+      build_file=str(Label("//third_party:bleach.BUILD")),)
 
   native.new_http_archive(
-      name = "org_pocoo_werkzeug",
-      urls = [
+      name="org_pocoo_werkzeug",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/pypi.python.org/packages/b7/7f/44d3cfe5a12ba002b253f6985a4477edfa66da53787a2a838a40f6415263/Werkzeug-0.11.10.tar.gz",
           "https://pypi.python.org/packages/b7/7f/44d3cfe5a12ba002b253f6985a4477edfa66da53787a2a838a40f6415263/Werkzeug-0.11.10.tar.gz",
       ],
-      strip_prefix = "Werkzeug-0.11.10",
-      sha256 = "cc64dafbacc716cdd42503cf6c44cb5a35576443d82f29f6829e5c49264aeeee",
-      build_file = str(Label("//third_party:werkzeug.BUILD")),
-  )
+      strip_prefix="Werkzeug-0.11.10",
+      sha256="cc64dafbacc716cdd42503cf6c44cb5a35576443d82f29f6829e5c49264aeeee",
+      build_file=str(Label("//third_party:werkzeug.BUILD")),)
 
   native.bind(
-      name = "six",
-      actual = "@six_archive//:six",
-  )
+      name="six",
+      actual="@six_archive//:six",)
 
   patched_http_archive(
-      name = "protobuf",
-      urls = [
+      name="protobuf",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/protobuf/archive/2b7430d96aeff2bb624c8d52182ff5e4b9f7f18a.tar.gz",
           "https://github.com/google/protobuf/archive/2b7430d96aeff2bb624c8d52182ff5e4b9f7f18a.tar.gz",
       ],
-      sha256 = "e5d3d4e227a0f7afb8745df049bbd4d55474b158ca5aaa2a0e31099af24be1d0",
-      strip_prefix = "protobuf-2b7430d96aeff2bb624c8d52182ff5e4b9f7f18a",
+      sha256="e5d3d4e227a0f7afb8745df049bbd4d55474b158ca5aaa2a0e31099af24be1d0",
+      strip_prefix="protobuf-2b7430d96aeff2bb624c8d52182ff5e4b9f7f18a",
       # TODO: remove patching when tensorflow stops linking same protos into
       #       multiple shared libraries loaded in runtime by python.
       #       This patch fixes a runtime crash when tensorflow is compiled
       #       with clang -O2 on Linux (see https://github.com/tensorflow/tensorflow/issues/8394)
-      patch_file = str(Label("//third_party/protobuf:add_noinlines.patch")),
-  )
+      patch_file=str(Label("//third_party/protobuf:add_noinlines.patch")),)
 
   # We need to import the protobuf library under the names com_google_protobuf
   # and com_google_protobuf_cc to enable proto_library support in bazel.
@@ -342,25 +340,22 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
   )
 
   native.new_http_archive(
-      name = "gmock_archive",
-      urls = [
+      name="gmock_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/googletest/archive/release-1.8.0.zip",
           "https://github.com/google/googletest/archive/release-1.8.0.zip",
       ],
-      sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
-      strip_prefix = "googletest-release-1.8.0",
-      build_file = str(Label("//third_party:gmock.BUILD")),
-  )
+      sha256="f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
+      strip_prefix="googletest-release-1.8.0",
+      build_file=str(Label("//third_party:gmock.BUILD")),)
 
   native.bind(
-      name = "gtest",
-      actual = "@gmock_archive//:gtest",
-  )
+      name="gtest",
+      actual="@gmock_archive//:gtest",)
 
   native.bind(
-      name = "gtest_main",
-      actual = "@gmock_archive//:gtest_main",
-  )
+      name="gtest_main",
+      actual="@gmock_archive//:gtest_main",)
 
   native.git_repository(
     name   = "com_github_gflags_gflags",
@@ -369,231 +364,210 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
   )
 
   native.bind(
-      name = "python_headers",
-      actual = str(Label("//util/python:python_headers")),
-  )
+      name="python_headers",
+      actual=str(Label("//util/python:python_headers")),)
 
   native.new_http_archive(
-      name = "pcre",
-      sha256 = "ccdf7e788769838f8285b3ee672ed573358202305ee361cfec7a4a4fb005bbc7",
-      urls = [
+      name="pcre",
+      sha256="ccdf7e788769838f8285b3ee672ed573358202305ee361cfec7a4a4fb005bbc7",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/ftp.exim.org/pub/pcre/pcre-8.39.tar.gz",
           "http://ftp.exim.org/pub/pcre/pcre-8.39.tar.gz",
       ],
-      strip_prefix = "pcre-8.39",
-      build_file = str(Label("//third_party:pcre.BUILD")),
-  )
+      strip_prefix="pcre-8.39",
+      build_file=str(Label("//third_party:pcre.BUILD")),)
 
   native.new_http_archive(
-      name = "swig",
-      sha256 = "58a475dbbd4a4d7075e5fe86d4e54c9edde39847cdb96a3053d87cb64a23a453",
-      urls = [
+      name="swig",
+      sha256="58a475dbbd4a4d7075e5fe86d4e54c9edde39847cdb96a3053d87cb64a23a453",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/ufpr.dl.sourceforge.net/project/swig/swig/swig-3.0.8/swig-3.0.8.tar.gz",
           "http://ufpr.dl.sourceforge.net/project/swig/swig/swig-3.0.8/swig-3.0.8.tar.gz",
           "http://pilotfiber.dl.sourceforge.net/project/swig/swig/swig-3.0.8/swig-3.0.8.tar.gz",
       ],
-      strip_prefix = "swig-3.0.8",
-      build_file = str(Label("//third_party:swig.BUILD")),
-  )
+      strip_prefix="swig-3.0.8",
+      build_file=str(Label("//third_party:swig.BUILD")),)
 
   temp_workaround_http_archive(
-      name = "curl",
-      sha256 = "ff3e80c1ca6a068428726cd7dd19037a47cc538ce58ef61c59587191039b2ca6",
-      urls = [
+      name="curl",
+      sha256="ff3e80c1ca6a068428726cd7dd19037a47cc538ce58ef61c59587191039b2ca6",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/curl.haxx.se/download/curl-7.49.1.tar.gz",
           "https://curl.haxx.se/download/curl-7.49.1.tar.gz",
       ],
-      strip_prefix = "curl-7.49.1",
-      build_file = str(Label("//third_party:curl.BUILD")),
-      repository = tf_repo_name
-  )
+      strip_prefix="curl-7.49.1",
+      build_file=str(Label("//third_party:curl.BUILD")),
+      repository=tf_repo_name)
 
   # grpc expects //external:protobuf_clib and //external:protobuf_compiler
   # to point to the protobuf's compiler library.
   native.bind(
-      name = "protobuf_clib",
-      actual = "@protobuf//:protoc_lib",
-  )
+      name="protobuf_clib",
+      actual="@protobuf//:protoc_lib",)
 
   native.bind(
-      name = "protobuf_compiler",
-      actual = "@protobuf//:protoc_lib",
-  )
+      name="protobuf_compiler",
+      actual="@protobuf//:protoc_lib",)
 
   native.new_http_archive(
-      name = "grpc",
-      urls = [
+      name="grpc",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/grpc/grpc/archive/d7ff4ff40071d2b486a052183e3e9f9382afb745.tar.gz",
           "https://github.com/grpc/grpc/archive/d7ff4ff40071d2b486a052183e3e9f9382afb745.tar.gz",
       ],
-      sha256 = "a15f352436ab92c521b1ac11e729e155ace38d0856380cf25048c5d1d9ba8e31",
-      strip_prefix = "grpc-d7ff4ff40071d2b486a052183e3e9f9382afb745",
-      build_file = str(Label("//third_party:grpc.BUILD")),
-  )
+      sha256="a15f352436ab92c521b1ac11e729e155ace38d0856380cf25048c5d1d9ba8e31",
+      strip_prefix="grpc-d7ff4ff40071d2b486a052183e3e9f9382afb745",
+      build_file=str(Label("//third_party:grpc.BUILD")),)
 
   # protobuf expects //external:grpc_cpp_plugin to point to grpc's
   # C++ plugin code generator.
   native.bind(
-      name = "grpc_cpp_plugin",
-      actual = "@grpc//:grpc_cpp_plugin",
-  )
+      name="grpc_cpp_plugin",
+      actual="@grpc//:grpc_cpp_plugin",)
 
   native.bind(
-      name = "grpc_lib",
-      actual = "@grpc//:grpc++_unsecure",
-  )
+      name="grpc_lib",
+      actual="@grpc//:grpc++_unsecure",)
 
   native.new_http_archive(
-      name = "linenoise",
-      sha256 = "7f51f45887a3d31b4ce4fa5965210a5e64637ceac12720cfce7954d6a2e812f7",
-      urls = [
+      name="linenoise",
+      sha256="7f51f45887a3d31b4ce4fa5965210a5e64637ceac12720cfce7954d6a2e812f7",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/antirez/linenoise/archive/c894b9e59f02203dbe4e2be657572cf88c4230c3.tar.gz",
           "https://github.com/antirez/linenoise/archive/c894b9e59f02203dbe4e2be657572cf88c4230c3.tar.gz",
       ],
-      strip_prefix = "linenoise-c894b9e59f02203dbe4e2be657572cf88c4230c3",
-      build_file = str(Label("//third_party:linenoise.BUILD")),
-  )
+      strip_prefix="linenoise-c894b9e59f02203dbe4e2be657572cf88c4230c3",
+      build_file=str(Label("//third_party:linenoise.BUILD")),)
 
   # TODO(phawkins): currently, this rule uses an unofficial LLVM mirror.
   # Switch to an official source of snapshots if/when possible.
   temp_workaround_http_archive(
-      name = "llvm",
-      urls = [
+      name="llvm",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/llvm-mirror/llvm/archive/5d2b26453d4bca5a13b69b0130e4369d1fcd393d.tar.gz",
           "https://github.com/llvm-mirror/llvm/archive/5d2b26453d4bca5a13b69b0130e4369d1fcd393d.tar.gz",
       ],
-      sha256 = "3cecf39bf4b3854629d610bb321bb57e0e46bda9110bd51c3bae5a4171c82bab",
-      strip_prefix = "llvm-5d2b26453d4bca5a13b69b0130e4369d1fcd393d",
-      build_file = str(Label("//third_party/llvm:llvm.BUILD")),
-      repository = tf_repo_name,
-  )
+      sha256="3cecf39bf4b3854629d610bb321bb57e0e46bda9110bd51c3bae5a4171c82bab",
+      strip_prefix="llvm-5d2b26453d4bca5a13b69b0130e4369d1fcd393d",
+      build_file=str(Label("//third_party/llvm:llvm.BUILD")),
+      repository=tf_repo_name,)
 
   native.new_http_archive(
-      name = "jsoncpp_git",
-      urls = [
+      name="jsoncpp_git",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/open-source-parsers/jsoncpp/archive/11086dd6a7eba04289944367ca82cea71299ed70.tar.gz",
           "https://github.com/open-source-parsers/jsoncpp/archive/11086dd6a7eba04289944367ca82cea71299ed70.tar.gz",
       ],
-      sha256 = "07d34db40593d257324ec5fb9debc4dc33f29f8fb44e33a2eeb35503e61d0fe2",
-      strip_prefix = "jsoncpp-11086dd6a7eba04289944367ca82cea71299ed70",
-      build_file = str(Label("//third_party:jsoncpp.BUILD")),
-  )
+      sha256="07d34db40593d257324ec5fb9debc4dc33f29f8fb44e33a2eeb35503e61d0fe2",
+      strip_prefix="jsoncpp-11086dd6a7eba04289944367ca82cea71299ed70",
+      build_file=str(Label("//third_party:jsoncpp.BUILD")),)
 
   native.bind(
-      name = "jsoncpp",
-      actual = "@jsoncpp_git//:jsoncpp",
-  )
+      name="jsoncpp",
+      actual="@jsoncpp_git//:jsoncpp",)
 
   native.http_archive(
-      name = "boringssl",
-      urls = [
+      name="boringssl",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/boringssl/archive/bbcaa15b0647816b9a1a9b9e0d209cd6712f0105.tar.gz",
           "https://github.com/google/boringssl/archive/bbcaa15b0647816b9a1a9b9e0d209cd6712f0105.tar.gz",  # 2016-07-11
       ],
-      sha256 = "025264d6e9a7ad371f2f66d17a28b6627de0c9592dc2eb54afd062f68f1f9aa3",
-      strip_prefix = "boringssl-bbcaa15b0647816b9a1a9b9e0d209cd6712f0105",
-  )
+      sha256="025264d6e9a7ad371f2f66d17a28b6627de0c9592dc2eb54afd062f68f1f9aa3",
+      strip_prefix="boringssl-bbcaa15b0647816b9a1a9b9e0d209cd6712f0105",)
 
   native.new_http_archive(
-      name = "nanopb_git",
-      urls = [
+      name="nanopb_git",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/nanopb/nanopb/archive/1251fa1065afc0d62f635e0f63fec8276e14e13c.tar.gz",
           "https://github.com/nanopb/nanopb/archive/1251fa1065afc0d62f635e0f63fec8276e14e13c.tar.gz",
       ],
-      sha256 = "ab1455c8edff855f4f55b68480991559e51c11e7dab060bbab7cffb12dd3af33",
-      strip_prefix = "nanopb-1251fa1065afc0d62f635e0f63fec8276e14e13c",
-      build_file = str(Label("//third_party:nanopb.BUILD")),
-  )
+      sha256="ab1455c8edff855f4f55b68480991559e51c11e7dab060bbab7cffb12dd3af33",
+      strip_prefix="nanopb-1251fa1065afc0d62f635e0f63fec8276e14e13c",
+      build_file=str(Label("//third_party:nanopb.BUILD")),)
 
   native.bind(
-      name = "nanopb",
-      actual = "@nanopb_git//:nanopb",
-  )
+      name="nanopb",
+      actual="@nanopb_git//:nanopb",)
 
   native.new_http_archive(
-      name = "zlib_archive",
-      urls = [
+      name="zlib_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/zlib.net/zlib-1.2.8.tar.gz",
           "http://zlib.net/fossils/zlib-1.2.8.tar.gz",
       ],
-      sha256 = "36658cb768a54c1d4dec43c3116c27ed893e88b02ecfcb44f2166f9c0b7f2a0d",
-      strip_prefix = "zlib-1.2.8",
-      build_file = str(Label("//third_party:zlib.BUILD")),
-  )
+      sha256="36658cb768a54c1d4dec43c3116c27ed893e88b02ecfcb44f2166f9c0b7f2a0d",
+      strip_prefix="zlib-1.2.8",
+      build_file=str(Label("//third_party:zlib.BUILD")),)
 
   native.bind(
-      name = "zlib",
-      actual = "@zlib_archive//:zlib",
-  )
+      name="zlib",
+      actual="@zlib_archive//:zlib",)
 
   temp_workaround_http_archive(
-      name = "snappy",
-      urls = [
+      name="snappy",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/google/snappy/archive/1.1.4.zip",
           "https://github.com/google/snappy/archive/1.1.4.zip",
       ],
-      sha256 = "6c74d2b663170d68184da353cdd71b5b7d57bc8888ef1e99b4929b5d680dba54",
-      strip_prefix = "snappy-1.1.4",
-      build_file = str(Label("//third_party:snappy.BUILD")),
-      repository = tf_repo_name,
-  )
+      sha256="6c74d2b663170d68184da353cdd71b5b7d57bc8888ef1e99b4929b5d680dba54",
+      strip_prefix="snappy-1.1.4",
+      build_file=str(Label("//third_party:snappy.BUILD")),
+      repository=tf_repo_name,)
 
   temp_workaround_http_archive(
-      name = "nccl_archive",
-      urls = [
+      name="nccl_archive",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/nvidia/nccl/archive/024d1e267845f2ed06f3e2e42476d50f04a00ee6.tar.gz",
           "https://github.com/nvidia/nccl/archive/024d1e267845f2ed06f3e2e42476d50f04a00ee6.tar.gz",
       ],
-      sha256 = "6787f0eed88d52ee8e32956fa4947d92c139da469f1d8e311c307f27d641118e",
-      strip_prefix = "nccl-024d1e267845f2ed06f3e2e42476d50f04a00ee6",
-      build_file = str(Label("//third_party/nccl:nccl.BUILD")),
+      sha256="6787f0eed88d52ee8e32956fa4947d92c139da469f1d8e311c307f27d641118e",
+      strip_prefix="nccl-024d1e267845f2ed06f3e2e42476d50f04a00ee6",
+      build_file=str(Label("//third_party/nccl:nccl.BUILD")),
       # TODO: Remove patching after the fix is merged into nccl(see https://github.com/NVIDIA/nccl/pull/78)
-      patch_file = str(Label("//third_party/nccl:fix_clang_compilation.patch")),
-      repository = tf_repo_name,
-  )
+      patch_file=str(Label("//third_party/nccl:fix_clang_compilation.patch")),
+      repository=tf_repo_name,)
 
   java_import_external(
-      name = "junit",
-      jar_sha256 = "59721f0805e223d84b90677887d9ff567dc534d7c502ca903c0c2b17f05c116a",
-      jar_urls = [
+      name="junit",
+      jar_sha256=
+      "59721f0805e223d84b90677887d9ff567dc534d7c502ca903c0c2b17f05c116a",
+      jar_urls=[
           "http://bazel-mirror.storage.googleapis.com/repo1.maven.org/maven2/junit/junit/4.12/junit-4.12.jar",
           "http://repo1.maven.org/maven2/junit/junit/4.12/junit-4.12.jar",
           "http://maven.ibiblio.org/maven2/junit/junit/4.12/junit-4.12.jar",
       ],
-      licenses = ["reciprocal"],  # Common Public License Version 1.0
-      testonly_ = True,
-      deps = ["@org_hamcrest_core"],
-  )
+      licenses=["reciprocal"],  # Common Public License Version 1.0
+      testonly_=True,
+      deps=["@org_hamcrest_core"],)
 
   java_import_external(
-      name = "org_hamcrest_core",
-      jar_sha256 = "66fdef91e9739348df7a096aa384a5685f4e875584cce89386a7a47251c4d8e9",
-      jar_urls = [
+      name="org_hamcrest_core",
+      jar_sha256=
+      "66fdef91e9739348df7a096aa384a5685f4e875584cce89386a7a47251c4d8e9",
+      jar_urls=[
           "http://bazel-mirror.storage.googleapis.com/repo1.maven.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar",
           "http://repo1.maven.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar",
           "http://maven.ibiblio.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar",
       ],
-      licenses = ["notice"],  # New BSD License
-      testonly_ = True,
-  )
+      licenses=["notice"],  # New BSD License
+      testonly_=True,)
 
   temp_workaround_http_archive(
-      name = "jemalloc",
-      urls = [
+      name="jemalloc",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/jemalloc/jemalloc/archive/4.4.0.tar.gz",
           "https://github.com/jemalloc/jemalloc/archive/4.4.0.tar.gz",
       ],
-      sha256 = "3c8f25c02e806c3ce0ab5fb7da1817f89fc9732709024e2a81b6b82f7cc792a8",
-      strip_prefix = "jemalloc-4.4.0",
-      build_file = str(Label("//third_party:jemalloc.BUILD")),
-      repository = tf_repo_name,
-  )
+      sha256="3c8f25c02e806c3ce0ab5fb7da1817f89fc9732709024e2a81b6b82f7cc792a8",
+      strip_prefix="jemalloc-4.4.0",
+      build_file=str(Label("//third_party:jemalloc.BUILD")),
+      repository=tf_repo_name,)
 
   ##############################################################################
   # TensorBoard Build Tools
 
   filegroup_external(
-      name = "org_nodejs",
+      name="org_nodejs",
       # MIT with portions licensed:
       # - MIT
       # - Old MIT
@@ -603,14 +577,14 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
       # - Unicode
       # - zlib
       # - Artistic 2.0
-      licenses = ["notice"],
-      sha256_urls_extract_macos = {
+      licenses=["notice"],
+      sha256_urls_extract_macos={
           "47109a00cac344d80296c195451bb5eee7c21727fcef1594384ddfe1f852957a": [
               "http://bazel-mirror.storage.googleapis.com/nodejs.org/dist/v4.3.2/node-v4.3.2-darwin-x64.tar.xz",
               "http://nodejs.org/dist/v4.3.2/node-v4.3.2-darwin-x64.tar.xz",
           ],
       },
-      sha256_urls_windows = {
+      sha256_urls_windows={
           "606c44c42d17866c017c50c0afadad411d9492ac4281d2431b937f881911614e": [
               "http://bazel-mirror.storage.googleapis.com/nodejs.org/dist/v4.3.2/win-x64/node.exe",
               "http://nodejs.org/dist/v4.3.2/win-x64/node.exe",
@@ -620,26 +594,25 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
               "http://nodejs.org/dist/v4.3.2/win-x64/node.lib",
           ],
       },
-      sha256_urls_extract = {
+      sha256_urls_extract={
           "4350d0431b49697517c6cca5d66adf5f74eb9101c52f52ae959fa94225822d44": [
               "http://bazel-mirror.storage.googleapis.com/nodejs.org/dist/v4.3.2/node-v4.3.2-linux-x64.tar.xz",
               "http://nodejs.org/dist/v4.3.2/node-v4.3.2-linux-x64.tar.xz",
           ],
       },
-      strip_prefix = {
+      strip_prefix={
           "node-v4.3.2-darwin-x64.tar.xz": "node-v4.3.2-darwin-x64",
           "node-v4.3.2-linux-x64.tar.xz": "node-v4.3.2-linux-x64",
       },
-      executable = [
+      executable=[
           "node",
           "node.exe",
-      ],
-  )
+      ],)
 
   filegroup_external(
-      name = "com_microsoft_typescript",
-      licenses = ["notice"],  # Apache 2.0
-      sha256_urls = {
+      name="com_microsoft_typescript",
+      licenses=["notice"],  # Apache 2.0
+      sha256_urls={
           "e3d9e320a2cae99be4aaa37953961a48323cdf16ba9aa2557a44d69571cd9b8d": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/Microsoft/TypeScript/v2.1.6/lib/tsc.js",
               "https://raw.githubusercontent.com/Microsoft/TypeScript/v2.1.6/lib/tsc.js",
@@ -649,7 +622,7 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
               "https://raw.githubusercontent.com/Microsoft/TypeScript/v2.1.6/lib/lib.es6.d.ts",
           ],
       },
-      extra_build_file_content = "\n".join([
+      extra_build_file_content="\n".join([
           "sh_binary(",
           "    name = \"tsc\",",
           "    srcs = [\"tsc.sh\"],",
@@ -672,40 +645,37 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "          \"EOF\",",
           "    executable = True,",
           ")",
-      ]),
-  )
+      ]),)
 
   ##############################################################################
   # TensorBoard JavaScript Production Dependencies
 
   filegroup_external(
-      name = "com_lodash",
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      name="com_lodash",
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "7c7b391810bc08cf815683431857c51b5ee190062ae4f557e1e4689d6dd910ea": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/lodash/lodash/3.8.0/lodash.js",
               "https://raw.githubusercontent.com/lodash/lodash/3.8.0/lodash.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "com_numericjs",
+      name="com_numericjs",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "dfaca3b8485bee735788cc6eebca82ea25719adc1fb8911c7799c6bd5a95df3b": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/sloisel/numeric/v1.2.6/src/numeric.js",
               "https://raw.githubusercontent.com/sloisel/numeric/v1.2.6/src/numeric.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "com_palantir_plottable",
+      name="com_palantir_plottable",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "77510d7538dbd3b59f1c8a06f68131b38562e3be546364747618d5112723e818": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/palantir/plottable/v1.16.1/plottable.css",
               "https://raw.githubusercontent.com/palantir/plottable/v1.16.1/plottable.css",
@@ -718,61 +688,56 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/palantir/plottable/v1.16.1/plottable.js",
               "https://raw.githubusercontent.com/palantir/plottable/v1.16.1/plottable.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "io_github_cpettitt_dagre",
+      name="io_github_cpettitt_dagre",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "7323829ddd77924a69e2b1235ded3eac30acd990da0f037e0fbd3c8e9035b50d": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/cpettitt/dagre/v0.7.4/dist/dagre.core.js",
               "https://raw.githubusercontent.com/cpettitt/dagre/v0.7.4/dist/dagre.core.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "io_github_cpettitt_graphlib",
+      name="io_github_cpettitt_graphlib",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "772045d412b1513b549be991c2e1846c38019429d43974efcae943fbe83489bf": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/cpettitt/graphlib/v1.0.7/dist/graphlib.core.js",
               "https://raw.githubusercontent.com/cpettitt/graphlib/v1.0.7/dist/graphlib.core.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "io_github_waylonflinn_weblas",
+      name="io_github_waylonflinn_weblas",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "f138fce57f673ca8a633f4aee5ae5b6fcb6ad0de59069a42a74e996fd04d8fcc": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/waylonflinn/weblas/v0.9.0/dist/weblas.js",
               "https://raw.githubusercontent.com/waylonflinn/weblas/v0.9.0/dist/weblas.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "org_d3js",
+      name="org_d3js",
       # no @license header
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256_urls = {
+      licenses=["notice"],  # BSD-3-Clause
+      sha256_urls={
           "bc1e38838f5c5c8e040132d41efee6bfddbef728210bd566479dc1694af1d3f5": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/d3/d3/v3.5.15/d3.js",
               "https://raw.githubusercontent.com/d3/d3/v3.5.15/d3.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "org_definitelytyped",
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      name="org_definitelytyped",
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "b7da645f6e5555feb7aeede73775da0023ce2257df9c8e76c9159266035a9c0d": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/DefinitelyTyped/DefinitelyTyped/ebc69904eb78f94030d5d517b42db20867f679c0/chai/chai.d.ts",
               "https://raw.githubusercontent.com/DefinitelyTyped/DefinitelyTyped/ebc69904eb78f94030d5d517b42db20867f679c0/chai/chai.d.ts",
@@ -789,14 +754,13 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/DefinitelyTyped/DefinitelyTyped/ebc69904eb78f94030d5d517b42db20867f679c0/mocha/mocha.d.ts",
               "https://raw.githubusercontent.com/DefinitelyTyped/DefinitelyTyped/ebc69904eb78f94030d5d517b42db20867f679c0/mocha/mocha.d.ts",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "org_threejs",
+      name="org_threejs",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "7aff264bd84c90bed3c72a4dc31db8c19151853c6df6980f52b01d3e9872c82d": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/mrdoob/three.js/ad419d40bdaab80abbb34b8f359b4ee840033a02/build/three.js",
               "https://raw.githubusercontent.com/mrdoob/three.js/ad419d40bdaab80abbb34b8f359b4ee840033a02/build/three.js",
@@ -805,190 +769,179 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/mrdoob/three.js/ad419d40bdaab80abbb34b8f359b4ee840033a02/examples/js/controls/OrbitControls.js",
               "https://raw.githubusercontent.com/mrdoob/three.js/ad419d40bdaab80abbb34b8f359b4ee840033a02/examples/js/controls/OrbitControls.js",
           ],
-      },
-  )
+      },)
 
   ##############################################################################
   # TensorBoard JavaScript Testing Dependencies
 
   filegroup_external(
-      name = "com_chaijs",
+      name="com_chaijs",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "b926b325ad9843bf0b7a6d580ef78bb560e47c484b98680098d4fd9b31b77cd9": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/chaijs/chai/2.3.0/chai.js",
               "https://raw.githubusercontent.com/chaijs/chai/2.3.0/chai.js",
           ],
-      },
-  )
+      },)
 
   filegroup_external(
-      name = "org_mochajs",
+      name="org_mochajs",
       # no @license header
-      licenses = ["notice"],  # MIT
-      sha256_urls = {
+      licenses=["notice"],  # MIT
+      sha256_urls={
           "e36d865a17ffdf5868e55e736526ae30f3d4bc667c85a2a28cd5c850a82361e2": [
               "http://bazel-mirror.storage.googleapis.com/raw.githubusercontent.com/mochajs/mocha/2.3.4/mocha.js",
               "https://raw.githubusercontent.com/mochajs/mocha/2.3.4/mocha.js",
           ],
-      },
-  )
+      },)
 
   ##############################################################################
   # TensorBoard Polymer Dependencies
 
   webfiles_external(
-      name = "org_polymer_font_roboto",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "fae51429b56a4a4c15f1f0c23b733c7095940cc9c04c275fa7adb3bf055b23b3",
-      urls = [
+      name="org_polymer_font_roboto",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="fae51429b56a4a4c15f1f0c23b733c7095940cc9c04c275fa7adb3bf055b23b3",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/font-roboto/archive/v1.0.1.tar.gz",
           "https://github.com/PolymerElements/font-roboto/archive/v1.0.1.tar.gz",
       ],
-      strip_prefix = "font-roboto-1.0.1",
-      path = "/font-roboto",
-      srcs = ["roboto.html"],
-  )
+      strip_prefix="font-roboto-1.0.1",
+      path="/font-roboto",
+      srcs=["roboto.html"],)
 
   webfiles_external(
-      name = "org_polymer_iron_a11y_announcer",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6bce143db7a374a68535ec8b861a5f30e81f2f1e4ee36a55bda2a891f6fd2818",
-      urls = [
+      name="org_polymer_iron_a11y_announcer",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6bce143db7a374a68535ec8b861a5f30e81f2f1e4ee36a55bda2a891f6fd2818",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-a11y-announcer/archive/v1.0.5.tar.gz",
           "https://github.com/PolymerElements/iron-a11y-announcer/archive/v1.0.5.tar.gz",
       ],
-      strip_prefix = "iron-a11y-announcer-1.0.5",
-      path = "/iron-a11y-announcer",
-      srcs = ["iron-a11y-announcer.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-a11y-announcer-1.0.5",
+      path="/iron-a11y-announcer",
+      srcs=["iron-a11y-announcer.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_a11y_keys_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6823efc47a83208fd51d39c5a1d3eb0c0bebc705df1ce01310509da22a13ebd2",
-      urls = [
+      name="org_polymer_iron_a11y_keys_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6823efc47a83208fd51d39c5a1d3eb0c0bebc705df1ce01310509da22a13ebd2",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-a11y-keys-behavior/archive/v1.1.8.tar.gz",
           "https://github.com/PolymerElements/iron-a11y-keys-behavior/archive/v1.1.8.tar.gz",
       ],
-      strip_prefix = "iron-a11y-keys-behavior-1.1.8",
-      path = "/iron-a11y-keys-behavior",
-      srcs = ["iron-a11y-keys-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-a11y-keys-behavior-1.1.8",
+      path="/iron-a11y-keys-behavior",
+      srcs=["iron-a11y-keys-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_ajax",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "9162d8af4611e911ac3ebbfc08bb7038ac04f6e79a9287b1476fe36ad6770bc5",
-      urls = [
+      name="org_polymer_iron_ajax",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="9162d8af4611e911ac3ebbfc08bb7038ac04f6e79a9287b1476fe36ad6770bc5",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-ajax/archive/v1.2.0.tar.gz",
           "https://github.com/PolymerElements/iron-ajax/archive/v1.2.0.tar.gz",
       ],
-      strip_prefix = "iron-ajax-1.2.0",
-      path = "/iron-ajax",
-      srcs = [
+      strip_prefix="iron-ajax-1.2.0",
+      path="/iron-ajax",
+      srcs=[
           "iron-ajax.html",
           "iron-request.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_promise_polyfill",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_autogrow_textarea",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "50bbb901d2c8f87462e3552e3d671a552faa12c37c485e548d7a234ebffbc427",
-      urls = [
+      name="org_polymer_iron_autogrow_textarea",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="50bbb901d2c8f87462e3552e3d671a552faa12c37c485e548d7a234ebffbc427",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-autogrow-textarea/archive/v1.0.12.tar.gz",
           "https://github.com/PolymerElements/iron-autogrow-textarea/archive/v1.0.12.tar.gz",
       ],
-      strip_prefix = "iron-autogrow-textarea-1.0.12",
-      path = "/iron-autogrow-textarea",
-      srcs = ["iron-autogrow-textarea.html"],
-      deps = [
+      strip_prefix="iron-autogrow-textarea-1.0.12",
+      path="/iron-autogrow-textarea",
+      srcs=["iron-autogrow-textarea.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_form_element_behavior",
           "@org_polymer_iron_validatable_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_behaviors",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "a1e8d4b7a13f3d36beba9c2a6b186ed33a53e6af2e79f98c1fcc7e85e7b53f89",
-      urls = [
+      name="org_polymer_iron_behaviors",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="a1e8d4b7a13f3d36beba9c2a6b186ed33a53e6af2e79f98c1fcc7e85e7b53f89",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-behaviors/archive/v1.0.17.tar.gz",
           "https://github.com/PolymerElements/iron-behaviors/archive/v1.0.17.tar.gz",
       ],
-      strip_prefix = "iron-behaviors-1.0.17",
-      path = "/iron-behaviors",
-      srcs = [
+      strip_prefix="iron-behaviors-1.0.17",
+      path="/iron-behaviors",
+      srcs=[
           "iron-button-state.html",
           "iron-control-state.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_checked_element_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "539a0e1c4df0bc702d3bd342388e4e56c77ec4c2066cce69e41426a69f92e8bd",
-      urls = [
+      name="org_polymer_iron_checked_element_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="539a0e1c4df0bc702d3bd342388e4e56c77ec4c2066cce69e41426a69f92e8bd",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-checked-element-behavior/archive/v1.0.4.tar.gz",
           "https://github.com/PolymerElements/iron-checked-element-behavior/archive/v1.0.4.tar.gz",
       ],
-      strip_prefix = "iron-checked-element-behavior-1.0.4",
-      path = "/iron-checked-element-behavior",
-      srcs = ["iron-checked-element-behavior.html"],
-      deps = [
+      strip_prefix="iron-checked-element-behavior-1.0.4",
+      path="/iron-checked-element-behavior",
+      srcs=["iron-checked-element-behavior.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_form_element_behavior",
           "@org_polymer_iron_validatable_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_collapse",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "275808994a609a2f9923e2dd2db1957945ab141ba840eadc33f19e1f406d600e",
-      urls = [
+      name="org_polymer_iron_collapse",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="275808994a609a2f9923e2dd2db1957945ab141ba840eadc33f19e1f406d600e",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-collapse/archive/v1.0.8.tar.gz",
           "https://github.com/PolymerElements/iron-collapse/archive/v1.0.8.tar.gz",
       ],
-      strip_prefix = "iron-collapse-1.0.8",
-      path = "/iron-collapse",
-      srcs = ["iron-collapse.html"],
-      deps = [
+      strip_prefix="iron-collapse-1.0.8",
+      path="/iron-collapse",
+      srcs=["iron-collapse.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_resizable_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_demo_helpers",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "aa7458492a6ac3d1f6344640a4c2ab07bce64e7ad0422b83b5d665707598cce6",
-      urls = [
+      name="org_polymer_iron_demo_helpers",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="aa7458492a6ac3d1f6344640a4c2ab07bce64e7ad0422b83b5d665707598cce6",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-demo-helpers/archive/v1.1.0.tar.gz",
           "https://github.com/PolymerElements/iron-demo-helpers/archive/v1.1.0.tar.gz",
       ],
-      strip_prefix = "iron-demo-helpers-1.1.0",
-      path = "/iron-demo-helpers",
-      srcs = [
+      strip_prefix="iron-demo-helpers-1.1.0",
+      path="/iron-demo-helpers",
+      srcs=[
           "demo-pages-shared-styles.html",
           "demo-snippet.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_icons",
@@ -996,109 +949,103 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "@org_polymer_paper_icon_button",
           "@org_polymer_paper_styles",
           "@org_polymer_prism_element",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_dropdown",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "f7e4a31d096d10d8af1920397695cb17f3eb1cbe5e5ff91a861dabfcc085f376",
-      urls = [
+      name="org_polymer_iron_dropdown",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="f7e4a31d096d10d8af1920397695cb17f3eb1cbe5e5ff91a861dabfcc085f376",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-dropdown/archive/v1.4.0.tar.gz",
           "https://github.com/PolymerElements/iron-dropdown/archive/v1.4.0.tar.gz",
       ],
-      strip_prefix = "iron-dropdown-1.4.0",
-      path = "/iron-dropdown",
-      srcs = [
+      strip_prefix="iron-dropdown-1.4.0",
+      path="/iron-dropdown",
+      srcs=[
           "iron-dropdown.html",
           "iron-dropdown-scroll-manager.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_overlay_behavior",
           "@org_polymer_iron_resizable_behavior",
           "@org_polymer_neon_animation",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_fit_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "10132a2ea309a37c4c07b8fead71f64abc588ee6107931e34680f5f36dd8291e",
-      urls = [
+      name="org_polymer_iron_fit_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="10132a2ea309a37c4c07b8fead71f64abc588ee6107931e34680f5f36dd8291e",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-fit-behavior/archive/v1.2.5.tar.gz",
           "https://github.com/PolymerElements/iron-fit-behavior/archive/v1.2.5.tar.gz",
       ],
-      strip_prefix = "iron-fit-behavior-1.2.5",
-      path = "/iron-fit-behavior",
-      srcs = ["iron-fit-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-fit-behavior-1.2.5",
+      path="/iron-fit-behavior",
+      srcs=["iron-fit-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_flex_layout",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "79287f6ca1c2d4e003f68b88fe19d03a1b6a0011e2b4cae579fe4d1474163a2e",
-      urls = [
+      name="org_polymer_iron_flex_layout",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="79287f6ca1c2d4e003f68b88fe19d03a1b6a0011e2b4cae579fe4d1474163a2e",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-flex-layout/archive/v1.3.0.tar.gz",
           "https://github.com/PolymerElements/iron-flex-layout/archive/v1.3.0.tar.gz",
       ],
-      strip_prefix = "iron-flex-layout-1.3.0",
-      path = "/iron-flex-layout",
-      srcs = [
+      strip_prefix="iron-flex-layout-1.3.0",
+      path="/iron-flex-layout",
+      srcs=[
           "classes/iron-flex-layout.html",
           "classes/iron-shadow-flex-layout.html",
           "iron-flex-layout.html",
           "iron-flex-layout-classes.html",
       ],
-      deps = ["@org_polymer"],
-  )
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_form_element_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "1dd9371c638e5bc2ecba8a64074aa680dfb8712198e9612f9ed24d387efc8f26",
-      urls = [
+      name="org_polymer_iron_form_element_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="1dd9371c638e5bc2ecba8a64074aa680dfb8712198e9612f9ed24d387efc8f26",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-form-element-behavior/archive/v1.0.6.tar.gz",
           "https://github.com/PolymerElements/iron-form-element-behavior/archive/v1.0.6.tar.gz",
       ],
-      strip_prefix = "iron-form-element-behavior-1.0.6",
-      path = "/iron-form-element-behavior",
-      srcs = ["iron-form-element-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-form-element-behavior-1.0.6",
+      path="/iron-form-element-behavior",
+      srcs=["iron-form-element-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_icon",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "9ed58a69159a02c07a6050d242e6d4e585a29f3245b8c8c390cfd52ddb786dc4",
-      urls = [
+      name="org_polymer_iron_icon",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="9ed58a69159a02c07a6050d242e6d4e585a29f3245b8c8c390cfd52ddb786dc4",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-icon/archive/v1.0.11.tar.gz",
           "https://github.com/PolymerElements/iron-icon/archive/v1.0.11.tar.gz",
       ],
-      strip_prefix = "iron-icon-1.0.11",
-      path = "/iron-icon",
-      srcs = ["iron-icon.html"],
-      deps = [
+      strip_prefix="iron-icon-1.0.11",
+      path="/iron-icon",
+      srcs=["iron-icon.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_meta",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_icons",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "3b18542c147c7923dc3a36b1a51984a73255d610f297d43c9aaccc52859bd0d0",
-      urls = [
+      name="org_polymer_iron_icons",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="3b18542c147c7923dc3a36b1a51984a73255d610f297d43c9aaccc52859bd0d0",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-icons/archive/v1.1.3.tar.gz",
           "https://github.com/PolymerElements/iron-icons/archive/v1.1.3.tar.gz",
       ],
-      strip_prefix = "iron-icons-1.1.3",
-      path = "/iron-icons",
-      srcs = [
+      strip_prefix="iron-icons-1.1.3",
+      path="/iron-icons",
+      srcs=[
           "av-icons.html",
           "communication-icons.html",
           "device-icons.html",
@@ -1111,247 +1058,233 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "places-icons.html",
           "social-icons.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer_iron_icon",
           "@org_polymer_iron_iconset_svg",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_iconset_svg",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "7e3925b7e63a7d22524c4b43ce16ab80d06a576649644783643c11a003284368",
-      urls = [
+      name="org_polymer_iron_iconset_svg",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="7e3925b7e63a7d22524c4b43ce16ab80d06a576649644783643c11a003284368",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-iconset-svg/archive/v1.1.0.tar.gz",
           "https://github.com/PolymerElements/iron-iconset-svg/archive/v1.1.0.tar.gz",
       ],
-      strip_prefix = "iron-iconset-svg-1.1.0",
-      path = "/iron-iconset-svg",
-      srcs = ["iron-iconset-svg.html"],
-      deps = [
+      strip_prefix="iron-iconset-svg-1.1.0",
+      path="/iron-iconset-svg",
+      srcs=["iron-iconset-svg.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_meta",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_input",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "c505101ead08ab25526b1f49baecc8c28b4221b92a65e7334c783bdc81553c36",
-      urls = [
+      name="org_polymer_iron_input",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="c505101ead08ab25526b1f49baecc8c28b4221b92a65e7334c783bdc81553c36",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-input/archive/1.0.10.tar.gz",
           "https://github.com/PolymerElements/iron-input/archive/1.0.10.tar.gz",
       ],
-      strip_prefix = "iron-input-1.0.10",
-      path = "/iron-input",
-      srcs = ["iron-input.html"],
-      deps = [
+      strip_prefix="iron-input-1.0.10",
+      path="/iron-input",
+      srcs=["iron-input.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_announcer",
           "@org_polymer_iron_validatable_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_list",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "72a6530b9f0ad5557f5d287845792a0ada74d8b159198e27f940e226313dc116",
-      urls = [
+      name="org_polymer_iron_list",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="72a6530b9f0ad5557f5d287845792a0ada74d8b159198e27f940e226313dc116",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-list/archive/v1.3.9.tar.gz",
           "https://github.com/PolymerElements/iron-list/archive/v1.3.9.tar.gz",
       ],
-      strip_prefix = "iron-list-1.3.9",
-      path = "/iron-list",
-      srcs = ["iron-list.html"],
-      deps = [
+      strip_prefix="iron-list-1.3.9",
+      path="/iron-list",
+      srcs=["iron-list.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_resizable_behavior",
           "@org_polymer_iron_scroll_target_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_menu_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "ad27889343bc9a709258b073f69abc028bb1ffd3fdb975cd2d3939f7f5d7bb6c",
-      urls = [
+      name="org_polymer_iron_menu_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="ad27889343bc9a709258b073f69abc028bb1ffd3fdb975cd2d3939f7f5d7bb6c",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-menu-behavior/archive/v1.1.10.tar.gz",
           "https://github.com/PolymerElements/iron-menu-behavior/archive/v1.1.10.tar.gz",
       ],
-      strip_prefix = "iron-menu-behavior-1.1.10",
-      path = "/iron-menu-behavior",
-      srcs = [
+      strip_prefix="iron-menu-behavior-1.1.10",
+      path="/iron-menu-behavior",
+      srcs=[
           "iron-menu-behavior.html",
           "iron-menubar-behavior.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_selector",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_meta",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "fb05e6031bae6b4effe5f15d44b3f548d5807f9e3b3aa2442ba17cf4b8b84361",
-      urls = [
+      name="org_polymer_iron_meta",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="fb05e6031bae6b4effe5f15d44b3f548d5807f9e3b3aa2442ba17cf4b8b84361",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-meta/archive/v1.1.1.tar.gz",
           "https://github.com/PolymerElements/iron-meta/archive/v1.1.1.tar.gz",
       ],
-      strip_prefix = "iron-meta-1.1.1",
-      path = "/iron-meta",
-      srcs = ["iron-meta.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-meta-1.1.1",
+      path="/iron-meta",
+      srcs=["iron-meta.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_overlay_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "3df5b54ff2e0510c87a2aff8c9d730d3fe83d3d11277cc1a49fa29b549acb46c",
-      urls = [
+      name="org_polymer_iron_overlay_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="3df5b54ff2e0510c87a2aff8c9d730d3fe83d3d11277cc1a49fa29b549acb46c",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-overlay-behavior/archive/v1.10.1.tar.gz",
           "https://github.com/PolymerElements/iron-overlay-behavior/archive/v1.10.1.tar.gz",
       ],
-      strip_prefix = "iron-overlay-behavior-1.10.1",
-      path = "/iron-overlay-behavior",
-      srcs = [
+      strip_prefix="iron-overlay-behavior-1.10.1",
+      path="/iron-overlay-behavior",
+      srcs=[
           "iron-focusables-helper.html",
           "iron-overlay-backdrop.html",
           "iron-overlay-behavior.html",
           "iron-overlay-manager.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_fit_behavior",
           "@org_polymer_iron_resizable_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_iron_range_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "b2f2b6d52284542330bd30b586e217926eb0adec5e13934a3cef557717c22dc2",
-      urls = [
+      name="org_polymer_iron_range_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="b2f2b6d52284542330bd30b586e217926eb0adec5e13934a3cef557717c22dc2",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-range-behavior/archive/v1.0.4.tar.gz",
           "https://github.com/PolymerElements/iron-range-behavior/archive/v1.0.4.tar.gz",
       ],
-      strip_prefix = "iron-range-behavior-1.0.4",
-      path = "/iron-range-behavior",
-      srcs = ["iron-range-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-range-behavior-1.0.4",
+      path="/iron-range-behavior",
+      srcs=["iron-range-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_resizable_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "a87a78ee9223c2f6afae7fc94a3ff91cbce6f7e2a7ed3f2979af7945c9281616",
-      urls = [
+      name="org_polymer_iron_resizable_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="a87a78ee9223c2f6afae7fc94a3ff91cbce6f7e2a7ed3f2979af7945c9281616",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-resizable-behavior/archive/v1.0.3.tar.gz",
           "https://github.com/PolymerElements/iron-resizable-behavior/archive/v1.0.3.tar.gz",
       ],
-      strip_prefix = "iron-resizable-behavior-1.0.3",
-      path = "/iron-resizable-behavior",
-      srcs = ["iron-resizable-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-resizable-behavior-1.0.3",
+      path="/iron-resizable-behavior",
+      srcs=["iron-resizable-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_scroll_target_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "d0de0c804b1ec91d814754144afd9da1cdb082690de88bd5e47fd5f41990746f",
-      urls = [
+      name="org_polymer_iron_scroll_target_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="d0de0c804b1ec91d814754144afd9da1cdb082690de88bd5e47fd5f41990746f",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-scroll-target-behavior/archive/v1.0.3.tar.gz",
           "https://github.com/PolymerElements/iron-scroll-target-behavior/archive/v1.0.3.tar.gz",
       ],
-      strip_prefix = "iron-scroll-target-behavior-1.0.3",
-      path = "/iron-scroll-target-behavior",
-      srcs = ["iron-scroll-target-behavior.html"],
-      deps = ["@org_polymer"],
-  )
+      strip_prefix="iron-scroll-target-behavior-1.0.3",
+      path="/iron-scroll-target-behavior",
+      srcs=["iron-scroll-target-behavior.html"],
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_selector",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "ba28a47443bad3b744611c9d7a79fb21dbdf2e35edc5ef8f812e2dcd72b16747",
-      urls = [
+      name="org_polymer_iron_selector",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="ba28a47443bad3b744611c9d7a79fb21dbdf2e35edc5ef8f812e2dcd72b16747",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-selector/archive/v1.5.2.tar.gz",
           "https://github.com/PolymerElements/iron-selector/archive/v1.5.2.tar.gz",
       ],
-      strip_prefix = "iron-selector-1.5.2",
-      path = "/iron-selector",
-      srcs = [
+      strip_prefix="iron-selector-1.5.2",
+      path="/iron-selector",
+      srcs=[
           "iron-multi-selectable.html",
           "iron-selectable.html",
           "iron-selection.html",
           "iron-selector.html",
       ],
-      deps = ["@org_polymer"],
-  )
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_iron_validatable_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "aef4901e68043824f36104799269573dd345ffaac494186e466fdc79c06fdb63",
-      urls = [
+      name="org_polymer_iron_validatable_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="aef4901e68043824f36104799269573dd345ffaac494186e466fdc79c06fdb63",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/iron-validatable-behavior/archive/v1.1.1.tar.gz",
           "https://github.com/PolymerElements/iron-validatable-behavior/archive/v1.1.1.tar.gz",
       ],
-      strip_prefix = "iron-validatable-behavior-1.1.1",
-      path = "/iron-validatable-behavior",
-      srcs = ["iron-validatable-behavior.html"],
-      deps = [
+      strip_prefix="iron-validatable-behavior-1.1.1",
+      path="/iron-validatable-behavior",
+      srcs=["iron-validatable-behavior.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_meta",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_marked",
-      licenses = ["notice"],  # MIT
-      sha256 = "93d30bd593736ca440938d77808b7ef5972da0f3fcfe4ae63ae7b4ce117da2cb",
-      urls = [
+      name="org_polymer_marked",
+      licenses=["notice"],  # MIT
+      sha256="93d30bd593736ca440938d77808b7ef5972da0f3fcfe4ae63ae7b4ce117da2cb",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/chjj/marked/archive/v0.3.2.zip",
           "https://github.com/chjj/marked/archive/v0.3.2.zip",
       ],
-      strip_prefix = "marked-0.3.2",
-      path = "/marked",
-      srcs = ["lib/marked.js"],
-  )
+      strip_prefix="marked-0.3.2",
+      path="/marked",
+      srcs=["lib/marked.js"],)
 
   webfiles_external(
-      name = "org_polymer_marked_element",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "7547616df95f8b903757e6afbabfcdba5322c2bcec3f17c726b8bba5adf4bc5f",
-      urls = [
+      name="org_polymer_marked_element",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="7547616df95f8b903757e6afbabfcdba5322c2bcec3f17c726b8bba5adf4bc5f",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/marked-element/archive/v1.1.3.tar.gz",
           "https://github.com/PolymerElements/marked-element/archive/v1.1.3.tar.gz",
       ],
-      strip_prefix = "marked-element-1.1.3",
-      path = "/marked-element",
-      srcs = [
+      strip_prefix="marked-element-1.1.3",
+      path="/marked-element",
+      srcs=[
           "marked-element.html",
           "marked-import.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_marked",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_neon_animation",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "8800c314a76b2da190a2b203259c1091f6d38e0057ed37c2a3d0b734980fa9a5",
-      urls = [
+      name="org_polymer_neon_animation",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="8800c314a76b2da190a2b203259c1091f6d38e0057ed37c2a3d0b734980fa9a5",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/neon-animation/archive/v1.2.2.tar.gz",
           "https://github.com/PolymerElements/neon-animation/archive/v1.2.2.tar.gz",
       ],
-      strip_prefix = "neon-animation-1.2.2",
-      path = "/neon-animation",
-      srcs = [
+      strip_prefix="neon-animation-1.2.2",
+      path="/neon-animation",
+      srcs=[
           "animations/cascaded-animation.html",
           "animations/fade-in-animation.html",
           "animations/fade-out-animation.html",
@@ -1381,155 +1314,148 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "neon-shared-element-animation-behavior.html",
           "web-animations.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_meta",
           "@org_polymer_iron_resizable_behavior",
           "@org_polymer_iron_selector",
           "@org_polymer_web_animations_js",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_behaviors",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "7cfcb9082ef9909da262df6b5c120bc62dbeaff278cb563e8fc60465ddd387e5",
-      urls = [
+      name="org_polymer_paper_behaviors",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="7cfcb9082ef9909da262df6b5c120bc62dbeaff278cb563e8fc60465ddd387e5",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-behaviors/archive/v1.0.12.tar.gz",
           "https://github.com/PolymerElements/paper-behaviors/archive/v1.0.12.tar.gz",
       ],
-      strip_prefix = "paper-behaviors-1.0.12",
-      path = "/paper-behaviors",
-      srcs = [
+      strip_prefix="paper-behaviors-1.0.12",
+      path="/paper-behaviors",
+      srcs=[
           "paper-button-behavior.html",
           "paper-checked-element-behavior.html",
           "paper-inky-focus-behavior.html",
           "paper-ripple-behavior.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_checked_element_behavior",
           "@org_polymer_paper_ripple",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_button",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "896c0a7e34bfcce63fc23c63e105ed9c4d62fa3a6385b7161e1e5cd4058820a6",
-      urls = [
+      name="org_polymer_paper_button",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="896c0a7e34bfcce63fc23c63e105ed9c4d62fa3a6385b7161e1e5cd4058820a6",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-button/archive/v1.0.11.tar.gz",
           "https://github.com/PolymerElements/paper-button/archive/v1.0.11.tar.gz",
       ],
-      strip_prefix = "paper-button-1.0.11",
-      path = "/paper-button",
-      srcs = ["paper-button.html"],
-      deps = [
+      strip_prefix="paper-button-1.0.11",
+      path="/paper-button",
+      srcs=["paper-button.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_material",
           "@org_polymer_paper_ripple",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_checkbox",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6828a6954a048b1230fbd2606faffbae950ba1d042175b96ec50ae355786a166",
-      urls = [
+      name="org_polymer_paper_checkbox",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6828a6954a048b1230fbd2606faffbae950ba1d042175b96ec50ae355786a166",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-checkbox/archive/v1.4.0.tar.gz",
           "https://github.com/PolymerElements/paper-checkbox/archive/v1.4.0.tar.gz",
       ],
-      strip_prefix = "paper-checkbox-1.4.0",
-      path = "/paper-checkbox",
-      srcs = ["paper-checkbox.html"],
-      deps = [
+      strip_prefix="paper-checkbox-1.4.0",
+      path="/paper-checkbox",
+      srcs=["paper-checkbox.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_dialog",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "c6a9709e7f528d03dcd574503c18b72d4751ca30017346d16e6a791d37ed9259",
-      urls = [
+      name="org_polymer_paper_dialog",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="c6a9709e7f528d03dcd574503c18b72d4751ca30017346d16e6a791d37ed9259",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-dialog/archive/v1.0.4.tar.gz",
           "https://github.com/PolymerElements/paper-dialog/archive/v1.0.4.tar.gz",
       ],
-      strip_prefix = "paper-dialog-1.0.4",
-      path = "/paper-dialog",
-      srcs = ["paper-dialog.html"],
-      deps = [
+      strip_prefix="paper-dialog-1.0.4",
+      path="/paper-dialog",
+      srcs=["paper-dialog.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_neon_animation",
           "@org_polymer_paper_dialog_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_dialog_behavior",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "a7e0e27ce63554bc14f384cf94bcfa24da8dc5f5120dfd565f45e166261aee40",
-      urls = [
+      name="org_polymer_paper_dialog_behavior",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="a7e0e27ce63554bc14f384cf94bcfa24da8dc5f5120dfd565f45e166261aee40",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-dialog-behavior/archive/v1.2.5.tar.gz",
           "https://github.com/PolymerElements/paper-dialog-behavior/archive/v1.2.5.tar.gz",
       ],
-      strip_prefix = "paper-dialog-behavior-1.2.5",
-      path = "/paper-dialog-behavior",
-      srcs = [
+      strip_prefix="paper-dialog-behavior-1.2.5",
+      path="/paper-dialog-behavior",
+      srcs=[
           "paper-dialog-behavior.html",
           "paper-dialog-common.css",
           "paper-dialog-shared-styles.html",
       ],
-      suppress = ["cssSyntax"],
-      deps = [
+      suppress=["cssSyntax"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_overlay_behavior",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_dialog_scrollable",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "a2e69283e7674f782c44d811387a0f8da2d01fac0172743d1add65e253e6b5ff",
-      urls = [
+      name="org_polymer_paper_dialog_scrollable",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="a2e69283e7674f782c44d811387a0f8da2d01fac0172743d1add65e253e6b5ff",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-dialog-scrollable/archive/1.1.5.tar.gz",
           "https://github.com/PolymerElements/paper-dialog-scrollable/archive/1.1.5.tar.gz",
       ],
-      strip_prefix = "paper-dialog-scrollable-1.1.5",
-      path = "/paper-dialog-scrollable",
-      srcs = ["paper-dialog-scrollable.html"],
-      deps = [
+      strip_prefix="paper-dialog-scrollable-1.1.5",
+      path="/paper-dialog-scrollable",
+      srcs=["paper-dialog-scrollable.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_dialog_behavior",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_dropdown_menu",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "9d88f654ec03ee9be211df9e69bede9e8a22b51bf1dbcc63b79762e4256d81ad",
-      urls = [
+      name="org_polymer_paper_dropdown_menu",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="9d88f654ec03ee9be211df9e69bede9e8a22b51bf1dbcc63b79762e4256d81ad",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-dropdown-menu/archive/v1.4.0.tar.gz",
           "https://github.com/PolymerElements/paper-dropdown-menu/archive/v1.4.0.tar.gz",
       ],
-      strip_prefix = "paper-dropdown-menu-1.4.0",
-      path = "/paper-dropdown-menu",
-      srcs = [
+      strip_prefix="paper-dropdown-menu-1.4.0",
+      path="/paper-dropdown-menu",
+      srcs=[
           "paper-dropdown-menu.html",
           "paper-dropdown-menu-icons.html",
           "paper-dropdown-menu-light.html",
           "paper-dropdown-menu-shared-styles.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_behaviors",
@@ -1542,59 +1468,56 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "@org_polymer_paper_menu_button",
           "@org_polymer_paper_ripple",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_header_panel",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "0db4bd8a4bf6f20dcd0dffb4f907b31c93a8647c9c021344239cf30b40b87075",
-      urls = [
+      name="org_polymer_paper_header_panel",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="0db4bd8a4bf6f20dcd0dffb4f907b31c93a8647c9c021344239cf30b40b87075",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-header-panel/archive/v1.1.4.tar.gz",
           "https://github.com/PolymerElements/paper-header-panel/archive/v1.1.4.tar.gz",
       ],
-      strip_prefix = "paper-header-panel-1.1.4",
-      path = "/paper-header-panel",
-      srcs = ["paper-header-panel.html"],
-      deps = [
+      strip_prefix="paper-header-panel-1.1.4",
+      path="/paper-header-panel",
+      srcs=["paper-header-panel.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_icon_button",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "9cba5bcfd6aeb4c41581c1392c678cf2278d360e9d122f4d9db54a9ebb404496",
-      urls = [
+      name="org_polymer_paper_icon_button",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="9cba5bcfd6aeb4c41581c1392c678cf2278d360e9d122f4d9db54a9ebb404496",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-icon-button/archive/v1.1.3.tar.gz",
           "https://github.com/PolymerElements/paper-icon-button/archive/v1.1.3.tar.gz",
       ],
-      strip_prefix = "paper-icon-button-1.1.3",
-      path = "/paper-icon-button",
-      srcs = [
+      strip_prefix="paper-icon-button-1.1.3",
+      path="/paper-icon-button",
+      srcs=[
           "paper-icon-button.html",
           "paper-icon-button-light.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_icon",
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_input",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "17c3dea9bb1c2026cc61324696c6c774214a0dc37686b91ca214a6af550994db",
-      urls = [
+      name="org_polymer_paper_input",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="17c3dea9bb1c2026cc61324696c6c774214a0dc37686b91ca214a6af550994db",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-input/archive/v1.1.18.tar.gz",
           "https://github.com/PolymerElements/paper-input/archive/v1.1.18.tar.gz",
       ],
-      strip_prefix = "paper-input-1.1.18",
-      path = "/paper-input",
-      srcs = [
+      strip_prefix="paper-input-1.1.18",
+      path="/paper-input",
+      srcs=[
           "paper-input.html",
           "paper-input-addon-behavior.html",
           "paper-input-behavior.html",
@@ -1603,7 +1526,7 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "paper-input-error.html",
           "paper-textarea.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_autogrow_textarea",
@@ -1612,206 +1535,196 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "@org_polymer_iron_form_element_behavior",
           "@org_polymer_iron_input",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_item",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "12ee0dcb61b0d5721c5988571f6974d7b2211e97724f4195893fbcc9058cdac8",
-      urls = [
+      name="org_polymer_paper_item",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="12ee0dcb61b0d5721c5988571f6974d7b2211e97724f4195893fbcc9058cdac8",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-item/archive/v1.1.4.tar.gz",
           "https://github.com/PolymerElements/paper-item/archive/v1.1.4.tar.gz",
       ],
-      strip_prefix = "paper-item-1.1.4",
-      path = "/paper-item",
-      srcs = [
+      strip_prefix="paper-item-1.1.4",
+      path="/paper-item",
+      srcs=[
           "paper-icon-item.html",
           "paper-item.html",
           "paper-item-behavior.html",
           "paper-item-body.html",
           "paper-item-shared-styles.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_listbox",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "3cb35f4fe9a3f15185a9e91711dba8f27e9291c8cd371ebf1be21b8f1d5f65fb",
-      urls = [
+      name="org_polymer_paper_listbox",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="3cb35f4fe9a3f15185a9e91711dba8f27e9291c8cd371ebf1be21b8f1d5f65fb",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-listbox/archive/v1.1.2.tar.gz",
           "https://github.com/PolymerElements/paper-listbox/archive/v1.1.2.tar.gz",
       ],
-      strip_prefix = "paper-listbox-1.1.2",
-      path = "/paper-listbox",
-      srcs = ["paper-listbox.html"],
-      deps = [
+      strip_prefix="paper-listbox-1.1.2",
+      path="/paper-listbox",
+      srcs=["paper-listbox.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_menu_behavior",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_material",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "09f6c8bd6ddbea2be541dc86306efe41cdfb31bec0b69d35a5dc29772bbc8506",
-      urls = [
+      name="org_polymer_paper_material",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="09f6c8bd6ddbea2be541dc86306efe41cdfb31bec0b69d35a5dc29772bbc8506",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-material/archive/v1.0.6.tar.gz",
           "https://github.com/PolymerElements/paper-material/archive/v1.0.6.tar.gz",
       ],
-      strip_prefix = "paper-material-1.0.6",
-      path = "/paper-material",
-      srcs = [
+      strip_prefix="paper-material-1.0.6",
+      path="/paper-material",
+      srcs=[
           "paper-material.html",
           "paper-material-shared-styles.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_menu",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "a3cee220926e315f7412236b3628288774694447c0da4428345f36d0f127ba3b",
-      urls = [
+      name="org_polymer_paper_menu",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="a3cee220926e315f7412236b3628288774694447c0da4428345f36d0f127ba3b",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-menu/archive/v1.2.2.tar.gz",
           "https://github.com/PolymerElements/paper-menu/archive/v1.2.2.tar.gz",
       ],
-      strip_prefix = "paper-menu-1.2.2",
-      path = "/paper-menu",
-      srcs = [
+      strip_prefix="paper-menu-1.2.2",
+      path="/paper-menu",
+      srcs=[
           "paper-menu.html",
           "paper-menu-shared-styles.html",
           "paper-submenu.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_collapse",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_menu_behavior",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_menu_button",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "be3290c288a2bd4f9887213db22c75add99cc29ff4d088100c0bc4eb0e57997b",
-      urls = [
+      name="org_polymer_paper_menu_button",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="be3290c288a2bd4f9887213db22c75add99cc29ff4d088100c0bc4eb0e57997b",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-menu-button/archive/v1.5.1.tar.gz",
           "https://github.com/PolymerElements/paper-menu-button/archive/v1.5.1.tar.gz",
       ],
-      strip_prefix = "paper-menu-button-1.5.1",
-      path = "/paper-menu-button",
-      srcs = [
+      strip_prefix="paper-menu-button-1.5.1",
+      path="/paper-menu-button",
+      srcs=[
           "paper-menu-button.html",
           "paper-menu-button-animations.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_dropdown",
           "@org_polymer_neon_animation",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_progress",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "2b6776b2f023c1f344feea17ba29b58d879e46f8ed43b7256495054b5183fff6",
-      urls = [
+      name="org_polymer_paper_progress",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="2b6776b2f023c1f344feea17ba29b58d879e46f8ed43b7256495054b5183fff6",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-progress/archive/v1.0.9.tar.gz",
           "https://github.com/PolymerElements/paper-progress/archive/v1.0.9.tar.gz",
       ],
-      strip_prefix = "paper-progress-1.0.9",
-      path = "/paper-progress",
-      srcs = ["paper-progress.html"],
-      deps = [
+      strip_prefix="paper-progress-1.0.9",
+      path="/paper-progress",
+      srcs=["paper-progress.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_iron_range_behavior",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_radio_button",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6e911d0c308aa388136b3af79d1bdcbe5a1f4159cbc79d71efb4ff3b6c0b4e91",
-      urls = [
+      name="org_polymer_paper_radio_button",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6e911d0c308aa388136b3af79d1bdcbe5a1f4159cbc79d71efb4ff3b6c0b4e91",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-radio-button/archive/v1.1.2.tar.gz",
           "https://github.com/PolymerElements/paper-radio-button/archive/v1.1.2.tar.gz",
       ],
-      strip_prefix = "paper-radio-button-1.1.2",
-      path = "/paper-radio-button",
-      srcs = ["paper-radio-button.html"],
-      deps = [
+      strip_prefix="paper-radio-button-1.1.2",
+      path="/paper-radio-button",
+      srcs=["paper-radio-button.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_radio_group",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "7885ad1f81e9dcc03dcea4139b54a201ff55c18543770cd44f94530046c9e163",
-      urls = [
+      name="org_polymer_paper_radio_group",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="7885ad1f81e9dcc03dcea4139b54a201ff55c18543770cd44f94530046c9e163",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-radio-group/archive/v1.0.9.tar.gz",
           "https://github.com/PolymerElements/paper-radio-group/archive/v1.0.9.tar.gz",
       ],
-      strip_prefix = "paper-radio-group-1.0.9",
-      path = "/paper-radio-group",
-      srcs = ["paper-radio-group.html"],
-      deps = [
+      strip_prefix="paper-radio-group-1.0.9",
+      path="/paper-radio-group",
+      srcs=["paper-radio-group.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_selector",
           "@org_polymer_paper_radio_button",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_ripple",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "ba76bfb1c737260a8a103d3ca97faa1f7c3288c7db9b2519f401b7a782147c09",
-      urls = [
+      name="org_polymer_paper_ripple",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="ba76bfb1c737260a8a103d3ca97faa1f7c3288c7db9b2519f401b7a782147c09",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-ripple/archive/v1.0.5.tar.gz",
           "https://github.com/PolymerElements/paper-ripple/archive/v1.0.5.tar.gz",
       ],
-      strip_prefix = "paper-ripple-1.0.5",
-      path = "/paper-ripple",
-      srcs = ["paper-ripple.html"],
-      deps = [
+      strip_prefix="paper-ripple-1.0.5",
+      path="/paper-ripple",
+      srcs=["paper-ripple.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_slider",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "08e7c541dbf5d2e959208810bfc03188e82ced87e4d30d325172967f67962c3c",
-      urls = [
+      name="org_polymer_paper_slider",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="08e7c541dbf5d2e959208810bfc03188e82ced87e4d30d325172967f67962c3c",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-slider/archive/v1.0.10.tar.gz",
           "https://github.com/PolymerElements/paper-slider/archive/v1.0.10.tar.gz",
       ],
-      strip_prefix = "paper-slider-1.0.10",
-      path = "/paper-slider",
-      srcs = ["paper-slider.html"],
-      deps = [
+      strip_prefix="paper-slider-1.0.10",
+      path="/paper-slider",
+      srcs=["paper-slider.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_keys_behavior",
           "@org_polymer_iron_flex_layout",
@@ -1821,43 +1734,39 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "@org_polymer_paper_input",
           "@org_polymer_paper_progress",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_spinner",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6a752907fab7899cbeed15b478e7b9299047c15fbf9d1561d6eb4d204bdbd178",
-      urls = [
+      name="org_polymer_paper_spinner",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6a752907fab7899cbeed15b478e7b9299047c15fbf9d1561d6eb4d204bdbd178",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-spinner/archive/v1.1.1.tar.gz",
           "https://github.com/PolymerElements/paper-spinner/archive/v1.1.1.tar.gz",
       ],
-      strip_prefix = "paper-spinner-1.1.1",
-      path = "/paper-spinner",
-      srcs = [
-          "paper-spinner.html",
-          "paper-spinner-behavior.html",
-          "paper-spinner-lite.html",
-          "paper-spinner-styles.html"
+      strip_prefix="paper-spinner-1.1.1",
+      path="/paper-spinner",
+      srcs=[
+          "paper-spinner.html", "paper-spinner-behavior.html",
+          "paper-spinner-lite.html", "paper-spinner-styles.html"
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_styles",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "6d26b0a4c286402098853dc7388f6b22f30dfb7a74e47b34992ac03380144bb2",
-      urls = [
+      name="org_polymer_paper_styles",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="6d26b0a4c286402098853dc7388f6b22f30dfb7a74e47b34992ac03380144bb2",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-styles/archive/v1.1.4.tar.gz",
           "https://github.com/PolymerElements/paper-styles/archive/v1.1.4.tar.gz",
       ],
-      strip_prefix = "paper-styles-1.1.4",
-      path = "/paper-styles",
-      srcs = [
+      strip_prefix="paper-styles-1.1.4",
+      path="/paper-styles",
+      srcs=[
           "classes/global.html",
           "classes/shadow.html",
           "classes/shadow-layout.html",
@@ -1871,29 +1780,28 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "shadow.html",
           "typography.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_font_roboto",
           "@org_polymer_iron_flex_layout",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_tabs",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "c23b6a5221db35e5b1ed3eb8e8696b952572563e285adaec96aba1e3134db825",
-      urls = [
+      name="org_polymer_paper_tabs",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="c23b6a5221db35e5b1ed3eb8e8696b952572563e285adaec96aba1e3134db825",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-tabs/archive/v1.7.0.tar.gz",
           "https://github.com/PolymerElements/paper-tabs/archive/v1.7.0.tar.gz",
       ],
-      strip_prefix = "paper-tabs-1.7.0",
-      path = "/paper-tabs",
-      srcs = [
+      strip_prefix="paper-tabs-1.7.0",
+      path="/paper-tabs",
+      srcs=[
           "paper-tab.html",
           "paper-tabs.html",
           "paper-tabs-icons.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_behaviors",
           "@org_polymer_iron_flex_layout",
@@ -1904,177 +1812,165 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_icon_button",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_toast",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "55f623712ed1f2bae6d6fadc522a2458e083ccd44cc0a907672547e7b10758a9",
-      urls = [
+      name="org_polymer_paper_toast",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="55f623712ed1f2bae6d6fadc522a2458e083ccd44cc0a907672547e7b10758a9",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-toast/archive/v1.3.0.tar.gz",
           "https://github.com/PolymerElements/paper-toast/archive/v1.3.0.tar.gz",
       ],
-      strip_prefix = "paper-toast-1.3.0",
-      path = "/paper-toast",
-      srcs = ["paper-toast.html"],
-      deps = [
+      strip_prefix="paper-toast-1.3.0",
+      path="/paper-toast",
+      srcs=["paper-toast.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_a11y_announcer",
           "@org_polymer_iron_overlay_behavior",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_toggle_button",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "4aa7cf0396fa2994a8bc2ac6e8428f48b07b945bb7c41bd52041ef5827b45de3",
-      urls = [
+      name="org_polymer_paper_toggle_button",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="4aa7cf0396fa2994a8bc2ac6e8428f48b07b945bb7c41bd52041ef5827b45de3",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-toggle-button/archive/v1.2.0.tar.gz",
           "https://github.com/PolymerElements/paper-toggle-button/archive/v1.2.0.tar.gz",
       ],
-      strip_prefix = "paper-toggle-button-1.2.0",
-      path = "/paper-toggle-button",
-      srcs = ["paper-toggle-button.html"],
-      deps = [
+      strip_prefix="paper-toggle-button-1.2.0",
+      path="/paper-toggle-button",
+      srcs=["paper-toggle-button.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_behaviors",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_toolbar",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "dbddffc0654d9fb5fb48843087eebe16bf7a134902495a664c96c11bf8a2c63d",
-      urls = [
+      name="org_polymer_paper_toolbar",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="dbddffc0654d9fb5fb48843087eebe16bf7a134902495a664c96c11bf8a2c63d",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-toolbar/archive/v1.1.4.tar.gz",
           "https://github.com/PolymerElements/paper-toolbar/archive/v1.1.4.tar.gz",
       ],
-      strip_prefix = "paper-toolbar-1.1.4",
-      path = "/paper-toolbar",
-      srcs = ["paper-toolbar.html"],
-      deps = [
+      strip_prefix="paper-toolbar-1.1.4",
+      path="/paper-toolbar",
+      srcs=["paper-toolbar.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_iron_flex_layout",
           "@org_polymer_paper_styles",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_paper_tooltip",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "4c6667acf01f73da14c3cbc0aa574bf14280304567987ee0314534328377d2ad",
-      urls = [
+      name="org_polymer_paper_tooltip",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="4c6667acf01f73da14c3cbc0aa574bf14280304567987ee0314534328377d2ad",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/paper-tooltip/archive/v1.1.2.tar.gz",
           "https://github.com/PolymerElements/paper-tooltip/archive/v1.1.2.tar.gz",
       ],
-      strip_prefix = "paper-tooltip-1.1.2",
-      path = "/paper-tooltip",
-      srcs = ["paper-tooltip.html"],
-      deps = [
+      strip_prefix="paper-tooltip-1.1.2",
+      path="/paper-tooltip",
+      srcs=["paper-tooltip.html"],
+      deps=[
           "@org_polymer",
           "@org_polymer_neon_animation",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "07a9e62ffb52193da3af09adda2fbac5cc690439978520e2d03e783863f65f91",
-      strip_prefix = "polymer-1.7.0",
-      urls = [
+      name="org_polymer",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="07a9e62ffb52193da3af09adda2fbac5cc690439978520e2d03e783863f65f91",
+      strip_prefix="polymer-1.7.0",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/polymer/polymer/archive/v1.7.0.tar.gz",
           "https://github.com/polymer/polymer/archive/v1.7.0.tar.gz",
       ],
-      path = "/polymer",
-      srcs = [
+      path="/polymer",
+      srcs=[
           "polymer.html",
           "polymer-micro.html",
           "polymer-mini.html",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_prism",
-      licenses = ["notice"],  # MIT
-      sha256 = "e06eb54f2a80e6b3cd0bd4d59f900423bcaee53fc03998a056df63740c684683",
-      urls = [
+      name="org_polymer_prism",
+      licenses=["notice"],  # MIT
+      sha256="e06eb54f2a80e6b3cd0bd4d59f900423bcaee53fc03998a056df63740c684683",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PrismJS/prism/archive/abee2b7587f1925e57777044270e2a1860810994.tar.gz",
           "https://github.com/PrismJS/prism/archive/abee2b7587f1925e57777044270e2a1860810994.tar.gz",
       ],
-      strip_prefix = "prism-abee2b7587f1925e57777044270e2a1860810994",
-      path = "/prism",
-      srcs = [
+      strip_prefix="prism-abee2b7587f1925e57777044270e2a1860810994",
+      path="/prism",
+      srcs=[
           "prism.js",
           "themes/prism.css",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_prism_element",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "ad70bf9cd5bbdf525d465e1b0658867ab4022193eb9c74087a839044b46312b4",
-      urls = [
+      name="org_polymer_prism_element",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="ad70bf9cd5bbdf525d465e1b0658867ab4022193eb9c74087a839044b46312b4",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerElements/prism-element/archive/1.0.4.tar.gz",
           "https://github.com/PolymerElements/prism-element/archive/1.0.4.tar.gz",
       ],
-      strip_prefix = "prism-element-1.0.4",
-      path = "/prism-element",
-      srcs = [
+      strip_prefix="prism-element-1.0.4",
+      path="/prism-element",
+      srcs=[
           "prism-highlighter.html",
           "prism-import.html",
       ],
-      deps = [
+      deps=[
           "@org_polymer",
           "@org_polymer_prism",
-      ],
-  )
+      ],)
 
   webfiles_external(
-      name = "org_polymer_promise_polyfill",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "4495450e5d884c3e16b537b43afead7f84d17c7dc061bcfcbf440eac083e4ef5",
-      strip_prefix = "promise-polyfill-1.0.0",
-      urls = [
+      name="org_polymer_promise_polyfill",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="4495450e5d884c3e16b537b43afead7f84d17c7dc061bcfcbf440eac083e4ef5",
+      strip_prefix="promise-polyfill-1.0.0",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/PolymerLabs/promise-polyfill/archive/v1.0.0.tar.gz",
           "https://github.com/PolymerLabs/promise-polyfill/archive/v1.0.0.tar.gz",
       ],
-      path = "/promise-polyfill",
-      srcs = [
-          "Promise.js",
-          "Promise-Statics.js",
-          "promise-polyfill.html",
+      path="/promise-polyfill",
+      srcs=[
+          "Promise.js", "Promise-Statics.js", "promise-polyfill.html",
           "promise-polyfill-lite.html"
       ],
-      deps = ["@org_polymer"],
-  )
+      deps=["@org_polymer"],)
 
   webfiles_external(
-      name = "org_polymer_web_animations_js",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "f8bd760cbdeba131f6790bd5abe170bcbf7b1755ff58ed16d0b82fa8a7f34a7f",
-      urls = [
+      name="org_polymer_web_animations_js",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="f8bd760cbdeba131f6790bd5abe170bcbf7b1755ff58ed16d0b82fa8a7f34a7f",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/web-animations/web-animations-js/archive/2.2.1.tar.gz",
           "https://github.com/web-animations/web-animations-js/archive/2.2.1.tar.gz",
       ],
-      strip_prefix = "web-animations-js-2.2.1",
-      path = "/web-animations-js",
-      srcs = ["web-animations-next-lite.min.js"],
-  )
+      strip_prefix="web-animations-js-2.2.1",
+      path="/web-animations-js",
+      srcs=["web-animations-next-lite.min.js"],)
 
   webfiles_external(
-      name = "org_polymer_webcomponentsjs",
-      licenses = ["notice"],  # BSD-3-Clause
-      sha256 = "138c43306ee0a6d699ddca9b3c6b0f4982974ea8b7bdad291ea7276c72301df9",
-      urls = [
+      name="org_polymer_webcomponentsjs",
+      licenses=["notice"],  # BSD-3-Clause
+      sha256="138c43306ee0a6d699ddca9b3c6b0f4982974ea8b7bdad291ea7276c72301df9",
+      urls=[
           "http://bazel-mirror.storage.googleapis.com/github.com/webcomponents/webcomponentsjs/archive/v0.7.22.tar.gz",
           "https://github.com/webcomponents/webcomponentsjs/archive/v0.7.22.tar.gz",
       ],
-      strip_prefix = "webcomponentsjs-0.7.22",
-      path = "/webcomponentsjs",
-      srcs = [
+      strip_prefix="webcomponentsjs-0.7.22",
+      path="/webcomponentsjs",
+      srcs=[
           "CustomElements.js",
           "CustomElements.min.js",
           "HTMLImports.js",
@@ -2087,5 +1983,4 @@ def tf_workspace(path_prefix = "", tf_repo_name = ""):
           "webcomponents.min.js",
           "webcomponents-lite.js",
           "webcomponents-lite.min.js",
-      ],
-  )
+      ],)
diff --git a/third_party/libxsmm.BUILD b/third_party/libxsmm.BUILD
index d61b7d02772..53a814b4b84 100644
--- a/third_party/libxsmm.BUILD
+++ b/third_party/libxsmm.BUILD
@@ -96,7 +96,8 @@ cc_library(
         "include/libxsmm.h",
         "include/libxsmm_config.h",
         "include/libxsmm_dispatch.h",
-    ] + glob([ # trigger rebuild if template changed
+    ] + glob([
+        # trigger rebuild if template changed
         "src/template/*.c",
     ]),
     copts = [