From 08beb079e778ef0f35ce0a41f6d602bd4218962f Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 31 Aug 2016 14:20:06 -0800 Subject: [PATCH 01/89] Update generated Python Op docs. Change: 131879711 --- .../contrib.bayesflow.stochastic_tensor.md | 90 +++ .../api_docs/python/contrib.distributions.md | 642 ++++++++++++++++++ ...yesflow.stochastic_tensor.MixtureTensor.md | 85 +++ .../tf.contrib.distributions.Mixture.md | 634 +++++++++++++++++ tensorflow/g3doc/api_docs/python/index.md | 2 + 5 files changed, 1453 insertions(+) create mode 100644 tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.md create mode 100644 tensorflow/g3doc/api_docs/python/functions_and_classes/shard8/tf.contrib.distributions.Mixture.md diff --git a/tensorflow/g3doc/api_docs/python/contrib.bayesflow.stochastic_tensor.md b/tensorflow/g3doc/api_docs/python/contrib.bayesflow.stochastic_tensor.md index b7a9a803d4b..db0d4d00179 100644 --- a/tensorflow/g3doc/api_docs/python/contrib.bayesflow.stochastic_tensor.md +++ b/tensorflow/g3doc/api_docs/python/contrib.bayesflow.stochastic_tensor.md @@ -1451,6 +1451,96 @@ in a `stop_gradients` call to disable any possible backpropagation. +- - - + +### `class tf.contrib.bayesflow.stochastic_tensor.MixtureTensor` {#MixtureTensor} + +`MixtureTensor` is a `StochasticTensor` backed by the distribution `Mixture`. 
+- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.__init__(name=None, dist_value_type=None, loss_fn=score_function, **dist_args)` {#MixtureTensor.__init__} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.clone(name=None, **dist_args)` {#MixtureTensor.clone} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.distribution` {#MixtureTensor.distribution} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.dtype` {#MixtureTensor.dtype} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.entropy(name='entropy')` {#MixtureTensor.entropy} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.graph` {#MixtureTensor.graph} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.input_dict` {#MixtureTensor.input_dict} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.loss(final_loss, name='Loss')` {#MixtureTensor.loss} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.mean(name='mean')` {#MixtureTensor.mean} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.name` {#MixtureTensor.name} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.value(name='value')` {#MixtureTensor.value} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.value_type` {#MixtureTensor.value_type} + + + + + - - - ### `class tf.contrib.bayesflow.stochastic_tensor.MultinomialTensor` {#MultinomialTensor} diff --git a/tensorflow/g3doc/api_docs/python/contrib.distributions.md b/tensorflow/g3doc/api_docs/python/contrib.distributions.md index 75478dc925d..069ea3a08af 100644 --- a/tensorflow/g3doc/api_docs/python/contrib.distributions.md +++ b/tensorflow/g3doc/api_docs/python/contrib.distributions.md @@ -13333,6 +13333,648 @@ Variance. 
+### Mixture Models + +- - - + +### `class tf.contrib.distributions.Mixture` {#Mixture} + +Mixture distribution. + +The `Mixture` object implements batched mixture distributions. +The mixture model is defined by a `Categorical` distribution (the mixture) +and a python list of `Distribution` objects. + +Methods supported include `log_prob`, `prob`, `mean`, `sample`, and +`entropy_lower_bound`. +- - - + +#### `tf.contrib.distributions.Mixture.__init__(cat, components, validate_args=True, allow_nan_stats=False, name='Mixture')` {#Mixture.__init__} + +Initialize a Mixture distribution. + +A `Mixture` is defined by a `Categorical` (`cat`, representing the +mixture probabilities) and a list of `Distribution` objects +all having matching dtype, batch shape, event shape, and continuity +properties (the components). + +The user does not pass the list of distributions directly, but rather a +list of `(constructor, batch_tensor_params_dict)` pairs, +called `components`. The list of distributions is created via: + +```python +distributions = [ + c(**params_dict) for (c, params_dict) in components +] +``` + +This form allows for certain types of batch-shape optimizations within +this class. + +An example of `components`: + +```python +components = [ + (tf.contrib.distributions.Normal, {"mu": 3.0, "sigma": 1.0}), + (functools.partial(tf.contrib.distributions.Normal, validate_args=False), + {"mu": 3.0, "sigma": 2.0}), + (tf.contrib.distributions.Normal.from_params, + {"mu": 1.0, "sigma": -1.0}) +] +``` + +The `num_classes` of `cat` must be possible to infer at graph construction +time and match `len(distributions)`. + +##### Args: + + +* `cat`: A `Categorical` distribution instance, representing the probabilities + of `distributions`. +* `components`: A list or tuple of `(constructor, batch_tensor_params)` + tuples. The `constructor` must be a callable, and `batch_tensor_params` + must be a dict mapping constructor kwargs to batchwise parameters. 
+ Each `Distribution` instance created by calling + `constructor(**batch_tensor_params)` must have the same type, be defined + on the same domain, and have matching `event_shape` and `batch_shape`. +* `validate_args`: Boolean, default `True`. If `True`, raise a runtime error + if batch or event ranks are inconsistent between cat and any of the + distributions. This is only checked if the ranks cannot be determined + statically at graph construction time. +* `allow_nan_stats`: Boolean, default `False`. If `False`, raise an + exception if a statistic (e.g. mean/mode/etc...) is undefined for any + batch member. If `True`, batch members with valid parameters leading to + undefined statistics will return NaN for this statistic. +* `name`: A name for this distribution (optional). + +##### Raises: + + +* `TypeError`: If cat is not a `Categorical`, or `components` is not + a list or tuple, or the elements of `components` are not + tuples of the form `(callable, dict)`, or the objects resulting + from calling `callable(**dict)` are not instances of `Distribution`, or + the resulting instances of `Distribution` do not have matching + continuity properties, or do not have matching `dtype`. +* `ValueError`: If `components` is an empty list or tuple, or the + distributions created from `components` do not have a statically known event + rank. If `cat.num_classes` cannot be inferred at graph creation time, + or the constant value of `cat.num_classes` is not equal to + `len(distributions)`, or all `distributions` and `cat` do not have + matching static batch shapes, or all components' distributions do not + have matching static event shapes. + + +- - - + +#### `tf.contrib.distributions.Mixture.allow_nan_stats` {#Mixture.allow_nan_stats} + +Python boolean describing behavior when a stat is undefined. + +Stats return +/- infinity when it makes sense. E.g., the variance +of a Cauchy distribution is infinity. 
However, sometimes the +statistic is undefined, e.g., if a distribution's pdf does not achieve a +maximum within the support of the distribution, the mode is undefined. +If the mean is undefined, then by definition the variance is undefined. +E.g. the mean for Student's T for df = 1 is undefined (no clear way to say +it is either + or - infinity), so the variance = E[(X - mean)^2] is also +undefined. + +##### Returns: + + +* `allow_nan_stats`: Python boolean. + + +- - - + +#### `tf.contrib.distributions.Mixture.batch_shape(name='batch_shape')` {#Mixture.batch_shape} + +Shape of a single sample from a single event index as a 1-D `Tensor`. + +The product of the dimensions of the `batch_shape` is the number of +independent distributions of this kind the instance represents. + +##### Args: + + +* `name`: name to give to the op + +##### Returns: + + +* `batch_shape`: `Tensor`. + + +- - - + +#### `tf.contrib.distributions.Mixture.cat` {#Mixture.cat} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.cdf(value, name='cdf')` {#Mixture.cdf} + +Cumulative distribution function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `cdf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.distributions` {#Mixture.distributions} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.dtype` {#Mixture.dtype} + +The `DType` of `Tensor`s handled by this `Distribution`. + + +- - - + +#### `tf.contrib.distributions.Mixture.entropy(name='entropy')` {#Mixture.entropy} + +Shannon entropy in nats. + + +- - - + +#### `tf.contrib.distributions.Mixture.entropy_lower_bound(name='entropy_lower_bound')` {#Mixture.entropy_lower_bound} + +A lower bound on the entropy of this mixture model. + +The bound below is not always very tight, and its usefulness depends +on the mixture probabilities and the distributions in use. 
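How tight the bound is can be checked numerically. The sketch below is not TensorFlow code: it uses plain Python, a made-up two-component mixture of univariate normals (the weights and parameters are assumptions chosen for illustration), and compares the quantity \\( \sum_i c_i H[q_i] \\) documented for `entropy_lower_bound` against a numerically integrated mixture entropy.

```python
import math


def normal_pdf(z, mu, sigma):
    """Density of a univariate normal at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (
        sigma * math.sqrt(2.0 * math.pi))


def normal_entropy(sigma):
    """Closed-form differential entropy of a univariate normal, in nats."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


# Illustrative (assumed) mixture: weights c_i and (mu, sigma) per component.
weights = [0.3, 0.7]
params = [(-2.0, 1.0), (3.0, 0.5)]

# G[q] = sum_i c_i H[q_i], the lower bound described in this section.
lower_bound = sum(c * normal_entropy(s) for c, (_, s) in zip(weights, params))


def mixture_pdf(z):
    return sum(c * normal_pdf(z, m, s) for c, (m, s) in zip(weights, params))


def mixture_entropy(lo=-15.0, hi=15.0, steps=30000):
    """Midpoint Riemann sum approximating -integral of q(z) log q(z) dz."""
    dz = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        q = mixture_pdf(lo + (k + 0.5) * dz)
        if q > 0.0:  # guard against log(0) if the density underflows
            total -= q * math.log(q) * dz
    return total


entropy = mixture_entropy()
print("lower bound G[q]:", lower_bound)
print("numerical H[q]:  ", entropy)
```

For well-separated components like these, the gap between the two printed values approaches the entropy of the mixing weights, which is why the bound is described as not always tight.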
+ +A lower bound is useful for ELBO when the `Mixture` is the variational +distribution: + +\\( +\log p(x) \geq ELBO = \int q(z) \log p(x, z) dz + H[q] +\\) + +where \\( p \\) is the prior distribution, \\( q \\) is the variational distribution, +and \\( H[q] \\) is the entropy of \\( q \\). If there is a lower bound +\\( G[q] \\) such that \\( H[q] \geq G[q] \\) then it can be used in +place of \\( H[q] \\). + +For a mixture of distributions \\( q(Z) = \sum_i c_i q_i(Z) \\) with +\\( \sum_i c_i = 1 \\), by the concavity of \\( f(x) = -x \log x \\), a +simple lower bound is: + +\\( +\begin{align} +H[q] & = - \int q(z) \log q(z) dz \\\ + & = - \int (\sum_i c_i q_i(z)) \log(\sum_i c_i q_i(z)) dz \\\ + & \geq - \sum_i c_i \int q_i(z) \log q_i(z) dz \\\ + & = \sum_i c_i H[q_i] +\end{align} +\\) + +This is the term we calculate below for \\( G[q] \\). + +##### Args: + + +* `name`: A name for this operation (optional). + +##### Returns: + + A lower bound on the Mixture's entropy. + + +- - - + +#### `tf.contrib.distributions.Mixture.event_shape(name='event_shape')` {#Mixture.event_shape} + +Shape of a single sample from a single batch as a 1-D int32 `Tensor`. + +##### Args: + + +* `name`: name to give to the op + +##### Returns: + + +* `event_shape`: `Tensor`. + + +- - - + +#### `tf.contrib.distributions.Mixture.from_params(cls, make_safe=True, **kwargs)` {#Mixture.from_params} + +Given (unconstrained) parameters, return an instantiated distribution. + +Subclasses should implement a static method `_safe_transforms` that returns +a dict of parameter transforms, which will be used if `make_safe = True`. + +Example usage: + +``` +# Let's say we want a sample of size (batch_size, 10) +shapes = MultiVariateNormalDiag.param_shapes([batch_size, 10]) + +# shapes has a Tensor shape for mu and sigma +# shapes == { +# "mu": tf.constant([batch_size, 10]), +# "sigma": tf.constant([batch_size, 10]), +# } + +# Here we parameterize mu and sigma with the output of a linear +# layer. 
Note that sigma is unconstrained. +params = {} +for name, shape in shapes.items(): + params[name] = linear(x, shape[1]) + +# Note that you can forward other kwargs to the `Distribution`, like +# `allow_nan_stats` or `name`. +mvn = MultiVariateNormalDiag.from_params(**params, allow_nan_stats=True) +``` + +Distribution parameters may have constraints (e.g. `sigma` must be positive +for a `Normal` distribution) and the `from_params` method will apply default +parameter transforms. If a user wants to use their own transform, they can +apply it externally and set `make_safe=False`. + +##### Args: + + +* `make_safe`: Whether the `params` should be constrained. If True, + `from_params` will apply default parameter transforms. If False, no + parameter transforms will be applied. +* `**kwargs`: dict of parameters for the distribution. + +##### Returns: + + A distribution parameterized by possibly transformed parameters in + `kwargs`. + +##### Raises: + + +* `TypeError`: if `make_safe` is `True` but `_safe_transforms` is not + implemented directly for `cls`. + + +- - - + +#### `tf.contrib.distributions.Mixture.get_batch_shape()` {#Mixture.get_batch_shape} + +Shape of a single sample from a single event index as a `TensorShape`. + +Same meaning as `batch_shape`. May be only partially defined. + +##### Returns: + + +* `batch_shape`: `TensorShape`, possibly unknown. + + +- - - + +#### `tf.contrib.distributions.Mixture.get_event_shape()` {#Mixture.get_event_shape} + +Shape of a single sample from a single batch as a `TensorShape`. + +Same meaning as `event_shape`. May be only partially defined. + +##### Returns: + + +* `event_shape`: `TensorShape`, possibly unknown. 
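The shape-related entries above (`batch_shape`, `event_shape`, and the `sample_shape(x) + self.batch_shape` result shapes) compose by simple concatenation. The following is a small illustration in plain Python tuples rather than `Tensor` or `TensorShape` objects. The helper names are hypothetical, and the rule that samples have shape `sample_shape + batch_shape + event_shape` is stated here as an assumption about the usual distributions convention rather than something this patch spells out.

```python
import operator
from functools import reduce


def sample_output_shape(sample_shape, batch_shape, event_shape):
    """Assumed shape of a drawn sample: sample dims, then batch, then event."""
    return tuple(sample_shape) + tuple(batch_shape) + tuple(event_shape)


def log_prob_output_shape(sample_shape, batch_shape):
    """Shape documented for prob/log_prob results: sample dims plus batch dims."""
    return tuple(sample_shape) + tuple(batch_shape)


def num_independent_distributions(batch_shape):
    """Product of the batch dimensions (1 for a scalar batch)."""
    return reduce(operator.mul, batch_shape, 1)


# A batch of 2 x 3 distributions over length-4 vectors, sampled 5 times.
print(sample_output_shape((5,), (2, 3), (4,)))
print(log_prob_output_shape((5,), (2, 3)))
print(num_independent_distributions((2, 3)))
```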
+ + +- - - + +#### `tf.contrib.distributions.Mixture.is_continuous` {#Mixture.is_continuous} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.is_reparameterized` {#Mixture.is_reparameterized} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.log_cdf(value, name='log_cdf')` {#Mixture.log_cdf} + +Log cumulative distribution function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `logcdf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_pdf(value, name='log_pdf')` {#Mixture.log_pdf} + +Log probability density function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if not `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_pmf(value, name='log_pmf')` {#Mixture.log_pmf} + +Log probability mass function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_pmf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_prob(value, name='log_prob')` {#Mixture.log_prob} + +Log probability density/mass function (depending on `is_continuous`). + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.mean(name='mean')` {#Mixture.mean} + +Mean. 
+ + +- - - + +#### `tf.contrib.distributions.Mixture.mode(name='mode')` {#Mixture.mode} + +Mode. + + +- - - + +#### `tf.contrib.distributions.Mixture.name` {#Mixture.name} + +Name prepended to all ops created by this `Distribution`. + + +- - - + +#### `tf.contrib.distributions.Mixture.num_components` {#Mixture.num_components} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.param_shapes(cls, sample_shape, name='DistributionParamShapes')` {#Mixture.param_shapes} + +Shapes of parameters given the desired shape of a call to `sample()`. + +Subclasses should override static method `_param_shapes`. + +##### Args: + + +* `sample_shape`: `Tensor` or python list/tuple. Desired shape of a call to + `sample()`. +* `name`: name to prepend ops with. + +##### Returns: + + `dict` of parameter name to `Tensor` shapes. + + +- - - + +#### `tf.contrib.distributions.Mixture.param_static_shapes(cls, sample_shape)` {#Mixture.param_static_shapes} + +param_shapes with static (i.e. TensorShape) shapes. + +##### Args: + + +* `sample_shape`: `TensorShape` or python list/tuple. Desired shape of a call + to `sample()`. + +##### Returns: + + `dict` of parameter name to `TensorShape`. + +##### Raises: + + +* `ValueError`: if `sample_shape` is a `TensorShape` and is not fully defined. + + +- - - + +#### `tf.contrib.distributions.Mixture.parameters` {#Mixture.parameters} + +Dictionary of parameters used by this `Distribution`. + + +- - - + +#### `tf.contrib.distributions.Mixture.pdf(value, name='pdf')` {#Mixture.pdf} + +Probability density function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if not `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.pmf(value, name='pmf')` {#Mixture.pmf} + +Probability mass function. 
+ +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `pmf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.prob(value, name='prob')` {#Mixture.prob} + +Probability density/mass function (depending on `is_continuous`). + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.sample(sample_shape=(), seed=None, name='sample')` {#Mixture.sample} + +Generate samples of the specified shape. + +Note that a call to `sample()` without arguments will generate a single +sample. + +##### Args: + + +* `sample_shape`: 0D or 1D `int32` `Tensor`. Shape of the generated samples. +* `seed`: Python integer seed for RNG +* `name`: name to give to the op. + +##### Returns: + + +* `samples`: a `Tensor` with prepended dimensions `sample_shape`. + + +- - - + +#### `tf.contrib.distributions.Mixture.sample_n(n, seed=None, name='sample_n')` {#Mixture.sample_n} + +Generate `n` samples. + +##### Args: + + +* `n`: `Scalar` `Tensor` of type `int32` or `int64`, the number of + observations to sample. +* `seed`: Python integer seed for RNG +* `name`: name to give to the op. + +##### Returns: + + +* `samples`: a `Tensor` with a prepended dimension (n,). + +##### Raises: + + +* `TypeError`: if `n` is not an integer type. + + +- - - + +#### `tf.contrib.distributions.Mixture.std(name='std')` {#Mixture.std} + +Standard deviation. + + +- - - + +#### `tf.contrib.distributions.Mixture.validate_args` {#Mixture.validate_args} + +Python boolean indicating whether possibly expensive checks are enabled. 
+ + +- - - + +#### `tf.contrib.distributions.Mixture.variance(name='variance')` {#Mixture.variance} + +Variance. + + + + ## Posterior inference with conjugate priors. Functions that transform conjugate prior/likelihood pairs to distributions diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.md new file mode 100644 index 00000000000..3280f5a9448 --- /dev/null +++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.md @@ -0,0 +1,85 @@ +`MixtureTensor` is a `StochasticTensor` backed by the distribution `Mixture`. +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.__init__(name=None, dist_value_type=None, loss_fn=score_function, **dist_args)` {#MixtureTensor.__init__} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.clone(name=None, **dist_args)` {#MixtureTensor.clone} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.distribution` {#MixtureTensor.distribution} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.dtype` {#MixtureTensor.dtype} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.entropy(name='entropy')` {#MixtureTensor.entropy} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.graph` {#MixtureTensor.graph} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.input_dict` {#MixtureTensor.input_dict} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.loss(final_loss, name='Loss')` {#MixtureTensor.loss} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.mean(name='mean')` {#MixtureTensor.mean} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.name` 
{#MixtureTensor.name} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.value(name='value')` {#MixtureTensor.value} + + + + +- - - + +#### `tf.contrib.bayesflow.stochastic_tensor.MixtureTensor.value_type` {#MixtureTensor.value_type} + + + + diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard8/tf.contrib.distributions.Mixture.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard8/tf.contrib.distributions.Mixture.md new file mode 100644 index 00000000000..6776075b869 --- /dev/null +++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard8/tf.contrib.distributions.Mixture.md @@ -0,0 +1,634 @@ +Mixture distribution. + +The `Mixture` object implements batched mixture distributions. +The mixture model is defined by a `Categorical` distribution (the mixture) +and a python list of `Distribution` objects. + +Methods supported include `log_prob`, `prob`, `mean`, `sample`, and +`entropy_lower_bound`. +- - - + +#### `tf.contrib.distributions.Mixture.__init__(cat, components, validate_args=True, allow_nan_stats=False, name='Mixture')` {#Mixture.__init__} + +Initialize a Mixture distribution. + +A `Mixture` is defined by a `Categorical` (`cat`, representing the +mixture probabilities) and a list of `Distribution` objects +all having matching dtype, batch shape, event shape, and continuity +properties (the components). + +The user does not pass the list of distributions directly, but rather a +list of `(constructor, batch_tensor_params_dict)` pairs, +called `components`. The list of distributions is created via: + +```python +distributions = [ + c(**params_dict) for (c, params_dict) in components +] +``` + +This form allows for certain types of batch-shape optimizations within +this class. 
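To make the `(constructor, params_dict)` convention concrete, here is a minimal, TensorFlow-free sketch. `StubNormal` is a hypothetical stand-in class, not `tf.contrib.distributions.Normal`, and the sketch assumes each element of `components` is one pair, so the list is iterated directly.

```python
import functools


class StubNormal(object):
    """Hypothetical stand-in for a distribution class; not a TensorFlow API."""

    def __init__(self, mu, sigma, validate_args=True):
        self.mu = mu
        self.sigma = sigma
        self.validate_args = validate_args


# Each entry pairs a callable with the keyword arguments to call it with.
components = [
    (StubNormal, {"mu": 3.0, "sigma": 1.0}),
    # functools.partial pre-binds extra constructor arguments.
    (functools.partial(StubNormal, validate_args=False),
     {"mu": 3.0, "sigma": 2.0}),
]

# One distribution per (constructor, params_dict) pair.
distributions = [c(**params_dict) for (c, params_dict) in components]

for d in distributions:
    print(d.mu, d.sigma, d.validate_args)
```

Deferring construction this way leaves the caller of the constructors free to inspect or adjust the parameter dicts before any distribution object exists, which is presumably what makes the batch-shape optimizations mentioned above possible.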
+ +An example of `components`: + +```python +components = [ + (tf.contrib.distributions.Normal, {"mu": 3.0, "sigma": 1.0}), + (functools.partial(tf.contrib.distributions.Normal, validate_args=False), + {"mu": 3.0, "sigma": 2.0}), + (tf.contrib.distributions.Normal.from_params, + {"mu": 1.0, "sigma": -1.0}) +] +``` + +The `num_classes` of `cat` must be possible to infer at graph construction +time and match `len(distributions)`. + +##### Args: + + +* `cat`: A `Categorical` distribution instance, representing the probabilities + of `distributions`. +* `components`: A list or tuple of `(constructor, batch_tensor_params)` + tuples. The `constructor` must be a callable, and `batch_tensor_params` + must be a dict mapping constructor kwargs to batchwise parameters. + Each `Distribution` instance created by calling + `constructor(**batch_tensor_params)` must have the same type, be defined + on the same domain, and have matching `event_shape` and `batch_shape`. +* `validate_args`: Boolean, default `True`. If `True`, raise a runtime error + if batch or event ranks are inconsistent between cat and any of the + distributions. This is only checked if the ranks cannot be determined + statically at graph construction time. +* `allow_nan_stats`: Boolean, default `False`. If `False`, raise an + exception if a statistic (e.g. mean/mode/etc...) is undefined for any + batch member. If `True`, batch members with valid parameters leading to + undefined statistics will return NaN for this statistic. +* `name`: A name for this distribution (optional). + +##### Raises: + + +* `TypeError`: If cat is not a `Categorical`, or `components` is not + a list or tuple, or the elements of `components` are not + tuples of the form `(callable, dict)`, or the objects resulting + from calling `callable(**dict)` are not instances of `Distribution`, or + the resulting instances of `Distribution` do not have matching + continuity properties, or do not have matching `dtype`. 
+* `ValueError`: If `components` is an empty list or tuple, or the + distributions created from `components` do not have a statically known event + rank. If `cat.num_classes` cannot be inferred at graph creation time, + or the constant value of `cat.num_classes` is not equal to + `len(distributions)`, or all `distributions` and `cat` do not have + matching static batch shapes, or all components' distributions do not + have matching static event shapes. + + +- - - + +#### `tf.contrib.distributions.Mixture.allow_nan_stats` {#Mixture.allow_nan_stats} + +Python boolean describing behavior when a stat is undefined. + +Stats return +/- infinity when it makes sense. E.g., the variance +of a Cauchy distribution is infinity. However, sometimes the +statistic is undefined, e.g., if a distribution's pdf does not achieve a +maximum within the support of the distribution, the mode is undefined. +If the mean is undefined, then by definition the variance is undefined. +E.g. the mean for Student's T for df = 1 is undefined (no clear way to say +it is either + or - infinity), so the variance = E[(X - mean)^2] is also +undefined. + +##### Returns: + + +* `allow_nan_stats`: Python boolean. + + +- - - + +#### `tf.contrib.distributions.Mixture.batch_shape(name='batch_shape')` {#Mixture.batch_shape} + +Shape of a single sample from a single event index as a 1-D `Tensor`. + +The product of the dimensions of the `batch_shape` is the number of +independent distributions of this kind the instance represents. + +##### Args: + + +* `name`: name to give to the op + +##### Returns: + + +* `batch_shape`: `Tensor`. + + +- - - + +#### `tf.contrib.distributions.Mixture.cat` {#Mixture.cat} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.cdf(value, name='cdf')` {#Mixture.cdf} + +Cumulative distribution function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. 
+ +##### Returns: + + +* `cdf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.distributions` {#Mixture.distributions} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.dtype` {#Mixture.dtype} + +The `DType` of `Tensor`s handled by this `Distribution`. + + +- - - + +#### `tf.contrib.distributions.Mixture.entropy(name='entropy')` {#Mixture.entropy} + +Shannon entropy in nats. + + +- - - + +#### `tf.contrib.distributions.Mixture.entropy_lower_bound(name='entropy_lower_bound')` {#Mixture.entropy_lower_bound} + +A lower bound on the entropy of this mixture model. + +The bound below is not always very tight, and its usefulness depends +on the mixture probabilities and the distributions in use. + +A lower bound is useful for ELBO when the `Mixture` is the variational +distribution: + +\\( +\log p(x) \geq ELBO = \int q(z) \log p(x, z) dz + H[q] +\\) + +where \\( p \\) is the prior distribution, \\( q \\) is the variational distribution, +and \\( H[q] \\) is the entropy of \\( q \\). If there is a lower bound +\\( G[q] \\) such that \\( H[q] \geq G[q] \\) then it can be used in +place of \\( H[q] \\). + +For a mixture of distributions \\( q(Z) = \sum_i c_i q_i(Z) \\) with +\\( \sum_i c_i = 1 \\), by the concavity of \\( f(x) = -x \log x \\), a +simple lower bound is: + +\\( +\begin{align} +H[q] & = - \int q(z) \log q(z) dz \\\ + & = - \int (\sum_i c_i q_i(z)) \log(\sum_i c_i q_i(z)) dz \\\ + & \geq - \sum_i c_i \int q_i(z) \log q_i(z) dz \\\ + & = \sum_i c_i H[q_i] +\end{align} +\\) + +This is the term we calculate below for \\( G[q] \\). + +##### Args: + + +* `name`: A name for this operation (optional). + +##### Returns: + + A lower bound on the Mixture's entropy. + + +- - - + +#### `tf.contrib.distributions.Mixture.event_shape(name='event_shape')` {#Mixture.event_shape} + +Shape of a single sample from a single batch as a 1-D int32 `Tensor`. 
+ +##### Args: + + +* `name`: name to give to the op + +##### Returns: + + +* `event_shape`: `Tensor`. + + +- - - + +#### `tf.contrib.distributions.Mixture.from_params(cls, make_safe=True, **kwargs)` {#Mixture.from_params} + +Given (unconstrained) parameters, return an instantiated distribution. + +Subclasses should implement a static method `_safe_transforms` that returns +a dict of parameter transforms, which will be used if `make_safe = True`. + +Example usage: + +``` +# Let's say we want a sample of size (batch_size, 10) +shapes = MultiVariateNormalDiag.param_shapes([batch_size, 10]) + +# shapes has a Tensor shape for mu and sigma +# shapes == { +# "mu": tf.constant([batch_size, 10]), +# "sigma": tf.constant([batch_size, 10]), +# } + +# Here we parameterize mu and sigma with the output of a linear +# layer. Note that sigma is unconstrained. +params = {} +for name, shape in shapes.items(): + params[name] = linear(x, shape[1]) + +# Note that you can forward other kwargs to the `Distribution`, like +# `allow_nan_stats` or `name`. +mvn = MultiVariateNormalDiag.from_params(**params, allow_nan_stats=True) +``` + +Distribution parameters may have constraints (e.g. `sigma` must be positive +for a `Normal` distribution) and the `from_params` method will apply default +parameter transforms. If a user wants to use their own transform, they can +apply it externally and set `make_safe=False`. + +##### Args: + + +* `make_safe`: Whether the `params` should be constrained. If True, + `from_params` will apply default parameter transforms. If False, no + parameter transforms will be applied. +* `**kwargs`: dict of parameters for the distribution. + +##### Returns: + + A distribution parameterized by possibly transformed parameters in + `kwargs`. + +##### Raises: + + +* `TypeError`: if `make_safe` is `True` but `_safe_transforms` is not + implemented directly for `cls`. 
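The `make_safe` behavior described for `from_params` can be pictured with a small, self-contained sketch. Everything here is hypothetical: `ToyNormal` is not the TensorFlow `Normal`, and the choice of softplus as the transform that keeps `sigma` positive is an assumption made for illustration, since this patch does not say which transforms the library applies.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); positive up to float underflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


class ToyNormal(object):
    """Hypothetical distribution used only to illustrate `make_safe`."""

    def __init__(self, mu, sigma):
        if sigma <= 0:
            raise ValueError("sigma must be positive")
        self.mu = mu
        self.sigma = sigma

    @staticmethod
    def _safe_transforms():
        # Assumed transforms: leave mu unchanged, map sigma into (0, inf).
        return {"mu": lambda x: x, "sigma": softplus}

    @classmethod
    def from_params(cls, make_safe=True, **kwargs):
        if make_safe:
            transforms = cls._safe_transforms()
            # Parameters without a registered transform pass through as-is.
            kwargs = {name: transforms.get(name, lambda x: x)(value)
                      for name, value in kwargs.items()}
        return cls(**kwargs)


# An unconstrained (here negative) sigma is accepted because it is transformed.
dist = ToyNormal.from_params(mu=1.0, sigma=-1.0)
print(dist.mu, dist.sigma)
```

With `make_safe=False` the same call would hand `sigma=-1.0` straight to the constructor, which in this toy class raises `ValueError`. That is the situation in which a caller would be expected to apply their own transform first.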
+ + +- - - + +#### `tf.contrib.distributions.Mixture.get_batch_shape()` {#Mixture.get_batch_shape} + +Shape of a single sample from a single event index as a `TensorShape`. + +Same meaning as `batch_shape`. May be only partially defined. + +##### Returns: + + +* `batch_shape`: `TensorShape`, possibly unknown. + + +- - - + +#### `tf.contrib.distributions.Mixture.get_event_shape()` {#Mixture.get_event_shape} + +Shape of a single sample from a single batch as a `TensorShape`. + +Same meaning as `event_shape`. May be only partially defined. + +##### Returns: + + +* `event_shape`: `TensorShape`, possibly unknown. + + +- - - + +#### `tf.contrib.distributions.Mixture.is_continuous` {#Mixture.is_continuous} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.is_reparameterized` {#Mixture.is_reparameterized} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.log_cdf(value, name='log_cdf')` {#Mixture.log_cdf} + +Log cumulative distribution function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `logcdf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_pdf(value, name='log_pdf')` {#Mixture.log_pdf} + +Log probability density function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if not `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_pmf(value, name='log_pmf')` {#Mixture.log_pmf} + +Log probability mass function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_pmf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. 
+ +##### Raises: + + +* `AttributeError`: if `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.log_prob(value, name='log_prob')` {#Mixture.log_prob} + +Log probability density/mass function (depending on `is_continuous`). + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `log_prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.mean(name='mean')` {#Mixture.mean} + +Mean. + + +- - - + +#### `tf.contrib.distributions.Mixture.mode(name='mode')` {#Mixture.mode} + +Mode. + + +- - - + +#### `tf.contrib.distributions.Mixture.name` {#Mixture.name} + +Name prepended to all ops created by this `Distribution`. + + +- - - + +#### `tf.contrib.distributions.Mixture.num_components` {#Mixture.num_components} + + + + +- - - + +#### `tf.contrib.distributions.Mixture.param_shapes(cls, sample_shape, name='DistributionParamShapes')` {#Mixture.param_shapes} + +Shapes of parameters given the desired shape of a call to `sample()`. + +Subclasses should override static method `_param_shapes`. + +##### Args: + + +* `sample_shape`: `Tensor` or python list/tuple. Desired shape of a call to + `sample()`. +* `name`: name to prepend ops with. + +##### Returns: + + `dict` of parameter name to `Tensor` shapes. + + +- - - + +#### `tf.contrib.distributions.Mixture.param_static_shapes(cls, sample_shape)` {#Mixture.param_static_shapes} + +param_shapes with static (i.e. TensorShape) shapes. + +##### Args: + + +* `sample_shape`: `TensorShape` or python list/tuple. Desired shape of a call + to `sample()`. + +##### Returns: + + `dict` of parameter name to `TensorShape`. + +##### Raises: + + +* `ValueError`: if `sample_shape` is a `TensorShape` and is not fully defined. + + +- - - + +#### `tf.contrib.distributions.Mixture.parameters` {#Mixture.parameters} + +Dictionary of parameters used by this `Distribution`. 
+ + +- - - + +#### `tf.contrib.distributions.Mixture.pdf(value, name='pdf')` {#Mixture.pdf} + +Probability density function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if not `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.pmf(value, name='pmf')` {#Mixture.pmf} + +Probability mass function. + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `pmf`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + +##### Raises: + + +* `AttributeError`: if `is_continuous`. + + +- - - + +#### `tf.contrib.distributions.Mixture.prob(value, name='prob')` {#Mixture.prob} + +Probability density/mass function (depending on `is_continuous`). + +##### Args: + + +* `value`: `float` or `double` `Tensor`. +* `name`: The name to give this op. + +##### Returns: + + +* `prob`: a `Tensor` of shape `sample_shape(x) + self.batch_shape` with + values of type `self.dtype`. + + +- - - + +#### `tf.contrib.distributions.Mixture.sample(sample_shape=(), seed=None, name='sample')` {#Mixture.sample} + +Generate samples of the specified shape. + +Note that a call to `sample()` without arguments will generate a single +sample. + +##### Args: + + +* `sample_shape`: 0D or 1D `int32` `Tensor`. Shape of the generated samples. +* `seed`: Python integer seed for RNG +* `name`: name to give to the op. + +##### Returns: + + +* `samples`: a `Tensor` with prepended dimensions `sample_shape`. + + +- - - + +#### `tf.contrib.distributions.Mixture.sample_n(n, seed=None, name='sample_n')` {#Mixture.sample_n} + +Generate `n` samples. + +##### Args: + + +* `n`: `Scalar` `Tensor` of type `int32` or `int64`, the number of + observations to sample. 
+* `seed`: Python integer seed for RNG +* `name`: name to give to the op. + +##### Returns: + + +* `samples`: a `Tensor` with a prepended dimension (n,). + +##### Raises: + + +* `TypeError`: if `n` is not an integer type. + + +- - - + +#### `tf.contrib.distributions.Mixture.std(name='std')` {#Mixture.std} + +Standard deviation. + + +- - - + +#### `tf.contrib.distributions.Mixture.validate_args` {#Mixture.validate_args} + +Python boolean indicating whether possibly expensive checks are enabled. + + +- - - + +#### `tf.contrib.distributions.Mixture.variance(name='variance')` {#Mixture.variance} + +Variance. + + diff --git a/tensorflow/g3doc/api_docs/python/index.md b/tensorflow/g3doc/api_docs/python/index.md index 3bd88eedacb..ebe49cc5d7e 100644 --- a/tensorflow/g3doc/api_docs/python/index.md +++ b/tensorflow/g3doc/api_docs/python/index.md @@ -632,6 +632,7 @@ * [`InverseGammaTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#InverseGammaTensor) * [`LaplaceTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#LaplaceTensor) * [`MeanValue`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#MeanValue) + * [`MixtureTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#MixtureTensor) * [`MultinomialTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#MultinomialTensor) * [`MultivariateNormalCholeskyTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#MultivariateNormalCholeskyTensor) * [`MultivariateNormalDiagPlusVDVTTensor`](../../api_docs/python/contrib.bayesflow.stochastic_tensor.md#MultivariateNormalDiagPlusVDVTTensor) @@ -671,6 +672,7 @@ * [`InverseGamma`](../../api_docs/python/contrib.distributions.md#InverseGamma) * [`kl`](../../api_docs/python/contrib.distributions.md#kl) * [`Laplace`](../../api_docs/python/contrib.distributions.md#Laplace) + * [`Mixture`](../../api_docs/python/contrib.distributions.md#Mixture) *
[`Multinomial`](../../api_docs/python/contrib.distributions.md#Multinomial) * [`MultivariateNormalCholesky`](../../api_docs/python/contrib.distributions.md#MultivariateNormalCholesky) * [`MultivariateNormalDiag`](../../api_docs/python/contrib.distributions.md#MultivariateNormalDiag) From 5aee7330167a11c0cf8c0c5bd1fb1162a8237e0a Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 31 Aug 2016 14:49:07 -0800 Subject: [PATCH 02/89] I'm requesting a documentation revision for `range`. The documentation for the `range` function was inconsistent in its documentation of optional parameters. Other documentation changes were reverted. Change: 131883258 --- tensorflow/python/ops/math_ops.py | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/tensorflow/python/ops/math_ops.py b/tensorflow/python/ops/math_ops.py index fa6af4740e0..93c78b6254a 100644 --- a/tensorflow/python/ops/math_ops.py +++ b/tensorflow/python/ops/math_ops.py @@ -974,13 +974,15 @@ def range(start, limit=None, delta=1, name="range"): ``` Args: - start: A 0-D (scalar) of type `int32`. First entry in sequence. - Defaults to 0. + start: A 0-D (scalar) of type `int32`. Acts as first entry in the range if + `limit` is not None; otherwise, acts as range limit and first entry + defaults to 0. limit: A 0-D (scalar) of type `int32`. Upper limit of sequence, - exclusive. - delta: A 0-D `Tensor` (scalar) of type `int32`. Optional. Default is 1. - Number that increments `start`. - name: A name for the operation (optional). + exclusive. If None, defaults to the value of `start` while the first + entry of the range defaults to 0. + delta: A 0-D `Tensor` (scalar) of type `int32`. Number that increments + `start`. Defaults to 1. + name: A name for the operation. Defaults to "range". Returns: An 1-D `int32` `Tensor`. From ba3e3ee5f25adb1e4514918938c702477c54f020 Mon Sep 17 00:00:00 2001 From: "A. 
Unique TensorFlower" Date: Wed, 31 Aug 2016 14:53:01 -0800 Subject: [PATCH 03/89] Rename loss ops more clearly. Change: 131883707 --- .../layers/python/layers/target_column.py | 21 +++++++++---------- .../learn/estimators/dnn_linear_combined.py | 8 ++++--- .../estimators/dnn_linear_combined_test.py | 19 +++++++++++------ 3 files changed, 28 insertions(+), 20 deletions(-) diff --git a/tensorflow/contrib/layers/python/layers/target_column.py b/tensorflow/contrib/layers/python/layers/target_column.py index 8dc0f6548b9..ee054a88897 100644 --- a/tensorflow/contrib/layers/python/layers/target_column.py +++ b/tensorflow/contrib/layers/python/layers/target_column.py @@ -181,7 +181,7 @@ class _TargetColumn(object): weight_tensor, shape=(-1,))) return weighted_loss - def training_loss(self, logits, target, features): + def training_loss(self, logits, target, features, name="training_loss"): """Returns training loss tensor for this head. Training loss is different from the loss reported on the tensorboard as we @@ -197,6 +197,7 @@ class _TargetColumn(object): target: either a tensor for labels or in multihead case, a dict of string to target tensor. features: features dict. + name: Op name. Returns: Loss tensor. @@ -206,10 +207,9 @@ class _TargetColumn(object): weight_tensor = self.get_weight_tensor(features) if weight_tensor is None: - return math_ops.reduce_mean(loss_unweighted, name="loss") - else: - loss_weighted = self._weighted_loss(loss_unweighted, weight_tensor) - return math_ops.reduce_mean(loss_weighted, name="loss") + return math_ops.reduce_mean(loss_unweighted, name=name) + loss_weighted = self._weighted_loss(loss_unweighted, weight_tensor) + return math_ops.reduce_mean(loss_weighted, name=name) def loss(self, logits, target, features): """Returns loss tensor for this head. 
@@ -233,12 +233,11 @@ class _TargetColumn(object): weight_tensor = self.get_weight_tensor(features) if weight_tensor is None: return math_ops.reduce_mean(loss_unweighted, name="loss") - else: - loss_weighted = self._weighted_loss(loss_unweighted, weight_tensor) - return math_ops.div( - math_ops.reduce_sum(loss_weighted), - math_ops.to_float(math_ops.reduce_sum(weight_tensor)), - name="loss") + loss_weighted = self._weighted_loss(loss_unweighted, weight_tensor) + return math_ops.div( + math_ops.reduce_sum(loss_weighted), + math_ops.to_float(math_ops.reduce_sum(weight_tensor)), + name="loss") class _RegressionTargetColumn(_TargetColumn): diff --git a/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py b/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py index 8e99595c472..d81d99f672c 100644 --- a/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py +++ b/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py @@ -253,9 +253,11 @@ class _DNNLinearCombinedBaseEstimator(estimator.BaseEstimator): logits = array_ops.reshape( array_ops.tile(centered_bias[0], [batch_size]), [batch_size, self._target_column.num_label_columns]) - training_loss = self._target_column.training_loss(logits, targets, features) - # Learn central bias by an optimizer. 0.1 is a convervative lr for a single - # variable. + with ops.name_scope(None, "centered_bias", (targets, features)): + training_loss = self._target_column.training_loss( + logits, targets, features) + # Learn central bias by an optimizer. 0.1 is a conservative lr for a + # single variable.
return training.AdagradOptimizer(0.1).minimize( training_loss, var_list=centered_bias) diff --git a/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined_test.py b/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined_test.py index 9e3a6dfd7fb..768e8045e6b 100644 --- a/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined_test.py +++ b/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined_test.py @@ -223,10 +223,13 @@ class DNNLinearCombinedClassifierTest(tf.test.TestCase): linear_feature_columns=[tf.contrib.layers.real_valued_column('x')], dnn_feature_columns=[tf.contrib.layers.real_valued_column('x')], dnn_hidden_units=[3, 3]) - - classifier.fit(input_fn=_input_fn_train, steps=100) - scores = classifier.evaluate(input_fn=_input_fn_eval, - steps=100) + classifier.fit(input_fn=_input_fn_train, steps=100, monitors=( + tf.contrib.learn.monitors.CaptureVariable(var_name='loss'), + tf.contrib.learn.monitors.CaptureVariable( + var_name='centered_bias/training_loss'), + tf.contrib.learn.monitors.CaptureVariable(var_name='training_loss'), + )) + scores = classifier.evaluate(input_fn=_input_fn_eval, steps=100) # If there is no weight column, model should learn y=Not(x). All examples in # eval data set are y=x. So if weight column is ignored, then accuracy # should be zero. 
@@ -251,8 +254,12 @@ class DNNLinearCombinedClassifierTest(tf.test.TestCase): linear_feature_columns=[tf.contrib.layers.real_valued_column('x')], dnn_feature_columns=[tf.contrib.layers.real_valued_column('x')], dnn_hidden_units=[3, 3]) - - classifier.fit(input_fn=_input_fn_train, steps=100) + classifier.fit(input_fn=_input_fn_train, steps=100, monitors=( + tf.contrib.learn.monitors.CaptureVariable(var_name='loss'), + tf.contrib.learn.monitors.CaptureVariable( + var_name='centered_bias/training_loss'), + tf.contrib.learn.monitors.CaptureVariable(var_name='training_loss'), + )) scores = classifier.evaluate(input_fn=_input_fn_train, steps=100) # If weight column is ignored, then accuracy should be 0.25. If it's not # ignored, then it should be greater than 0.6. From 76885d41e22448ea9825c7f632bb1855ce0ce1ec Mon Sep 17 00:00:00 2001 From: Olivia Nordquist Date: Wed, 31 Aug 2016 15:00:58 -0800 Subject: [PATCH 04/89] contrib layers test is flaky, turning off for now Change: 131884536 --- tensorflow/contrib/layers/BUILD | 1 + 1 file changed, 1 insertion(+) diff --git a/tensorflow/contrib/layers/BUILD b/tensorflow/contrib/layers/BUILD index a41c5efb6e9..a4619c5b163 100644 --- a/tensorflow/contrib/layers/BUILD +++ b/tensorflow/contrib/layers/BUILD @@ -127,6 +127,7 @@ py_test( name = "optimizers_test", srcs = ["python/layers/optimizers_test.py"], srcs_version = "PY2AND3", + tags = ["manual"], # http://b/31223979 deps = [ ":layers_py", "//tensorflow:tensorflow_py", From 614cfd0e3babfc3b9a5c241ea69090b7326eaf40 Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 31 Aug 2016 15:05:41 -0800 Subject: [PATCH 05/89] Update generated Python Op docs. 
Change: 131885129 --- tensorflow/g3doc/api_docs/python/constant_op.md | 14 ++++++++------ .../functions_and_classes/shard5/tf.range.md | 14 ++++++++------ 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/tensorflow/g3doc/api_docs/python/constant_op.md b/tensorflow/g3doc/api_docs/python/constant_op.md index d8e2747edc8..7c999d72d57 100644 --- a/tensorflow/g3doc/api_docs/python/constant_op.md +++ b/tensorflow/g3doc/api_docs/python/constant_op.md @@ -278,13 +278,15 @@ tf.range(limit) ==> [0, 1, 2, 3, 4] ##### Args: -* `start`: A 0-D (scalar) of type `int32`. First entry in sequence. - Defaults to 0. +* `start`: A 0-D (scalar) of type `int32`. Acts as first entry in the range if + `limit` is not None; otherwise, acts as range limit and first entry + defaults to 0. * `limit`: A 0-D (scalar) of type `int32`. Upper limit of sequence, - exclusive. -* `delta`: A 0-D `Tensor` (scalar) of type `int32`. Optional. Default is 1. - Number that increments `start`. -* `name`: A name for the operation (optional). + exclusive. If None, defaults to the value of `start` while the first + entry of the range defaults to 0. +* `delta`: A 0-D `Tensor` (scalar) of type `int32`. Number that increments + `start`. Defaults to 1. +* `name`: A name for the operation. Defaults to "range". ##### Returns: diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.range.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.range.md index c33825d3be2..fcd865e5fc1 100644 --- a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.range.md +++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.range.md @@ -23,13 +23,15 @@ tf.range(limit) ==> [0, 1, 2, 3, 4] ##### Args: -* `start`: A 0-D (scalar) of type `int32`. First entry in sequence. - Defaults to 0. +* `start`: A 0-D (scalar) of type `int32`. Acts as first entry in the range if + `limit` is not None; otherwise, acts as range limit and first entry + defaults to 0. 
* `limit`: A 0-D (scalar) of type `int32`. Upper limit of sequence, - exclusive. -* `delta`: A 0-D `Tensor` (scalar) of type `int32`. Optional. Default is 1. - Number that increments `start`. -* `name`: A name for the operation (optional). + exclusive. If None, defaults to the value of `start` while the first + entry of the range defaults to 0. +* `delta`: A 0-D `Tensor` (scalar) of type `int32`. Number that increments + `start`. Defaults to 1. +* `name`: A name for the operation. Defaults to "range". ##### Returns: From e11b99749d2898df0bce0269c77df316b059e8ea Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 31 Aug 2016 15:23:15 -0800 Subject: [PATCH 06/89] Automated rollback of change 131739513 Change: 131887027 --- .../python/framework/tensor_util_test.py | 3 +- tensorflow/core/ops/math_ops.cc | 4 +- tensorflow/core/ops/math_ops_test.cc | 7 +- tensorflow/core/ops/sparse_ops.cc | 2 +- tensorflow/core/ops/sparse_ops_test.cc | 16 - tensorflow/python/framework/common_shapes.py | 22 +- .../python/kernel_tests/array_ops_test.py | 2 +- .../python/kernel_tests/check_ops_test.py | 10 +- .../python/kernel_tests/cwise_ops_test.py | 6 +- .../python/kernel_tests/diag_op_test.py | 10 +- .../kernel_tests/reverse_sequence_op_test.py | 4 +- tensorflow/python/ops/array_ops.py | 335 ++++++++++++++++-- tensorflow/python/ops/math_ops.py | 116 ++++-- 13 files changed, 407 insertions(+), 130 deletions(-) diff --git a/tensorflow/contrib/framework/python/framework/tensor_util_test.py b/tensorflow/contrib/framework/python/framework/tensor_util_test.py index 06d46d3aca2..e08b42d0e50 100644 --- a/tensorflow/contrib/framework/python/framework/tensor_util_test.py +++ b/tensorflow/contrib/framework/python/framework/tensor_util_test.py @@ -244,8 +244,7 @@ class WithShapeTest(tf.test.TestCase): incompatible_shape, tensor_partial_shape) for incompatible_shape in [[1, 2, 1]]: self.assertRaisesRegexp( - ValueError, "Dimensions must be equal", - tf.contrib.framework.with_shape, + 
ValueError, "Incompatible shapes", tf.contrib.framework.with_shape, incompatible_shape, tensor_partial_shape) for incompatible_shape in [[2, 1]]: self.assertRaisesRegexp( diff --git a/tensorflow/core/ops/math_ops.cc b/tensorflow/core/ops/math_ops.cc index 7c5a6cf11b9..386156b0db7 100644 --- a/tensorflow/core/ops/math_ops.cc +++ b/tensorflow/core/ops/math_ops.cc @@ -128,8 +128,8 @@ REGISTER_OP("BatchMatMul") .SetShapeFn([](InferenceContext* c) { ShapeHandle a_shape; ShapeHandle b_shape; - TF_RETURN_IF_ERROR(c->WithRankAtLeast(c->input(0), 2, &a_shape)); - TF_RETURN_IF_ERROR(c->WithRankAtLeast(c->input(1), 2, &b_shape)); + TF_RETURN_IF_ERROR(c->WithRankAtLeast(c->input(0), 3, &a_shape)); + TF_RETURN_IF_ERROR(c->WithRankAtLeast(c->input(1), 3, &b_shape)); // Determine output rows and cols. bool adj_x; diff --git a/tensorflow/core/ops/math_ops_test.cc b/tensorflow/core/ops/math_ops_test.cc index cb7b50262b0..97f3567c303 100644 --- a/tensorflow/core/ops/math_ops_test.cc +++ b/tensorflow/core/ops/math_ops_test.cc @@ -362,14 +362,11 @@ TEST(MathOpsTest, BatchMatMul_ShapeFn) { set_adj(false, false); // Rank checks. - INFER_ERROR("at least rank 2", op, "[1];?"); - INFER_ERROR("at least rank 2", op, "?;[2]"); + INFER_ERROR("at least rank 3", op, "[1,2];?"); + INFER_ERROR("at least rank 3", op, "?;[1,2]"); INFER_OK(op, "?;?", "?"); - // 0 batch dims. - INFER_OK(op, "[?,?];[?,?]", "[d0_0,d1_1]"); - // 2 batch dims. INFER_OK(op, "[?,?,?,?];?", "[d0_0,d0_1,d0_2,?]"); diff --git a/tensorflow/core/ops/sparse_ops.cc b/tensorflow/core/ops/sparse_ops.cc index 320f9b77829..f4544062b77 100644 --- a/tensorflow/core/ops/sparse_ops.cc +++ b/tensorflow/core/ops/sparse_ops.cc @@ -671,7 +671,7 @@ output: `R-K`-D. The reduced Tensor. 
.Attr("T: numbertype") \ .SetShapeFn([](InferenceContext* c) { \ ShapeHandle input; \ - TF_RETURN_IF_ERROR(c->WithRank(c->input(0), 2, &input)); \ + TF_RETURN_IF_ERROR(c->WithRank(c->input(1), 2, &input)); \ c->set_output(0, c->Vector(c->Dim(input, 0))); \ return Status::OK(); \ }) diff --git a/tensorflow/core/ops/sparse_ops_test.cc b/tensorflow/core/ops/sparse_ops_test.cc index 3b738c26044..7e27ff0a127 100644 --- a/tensorflow/core/ops/sparse_ops_test.cc +++ b/tensorflow/core/ops/sparse_ops_test.cc @@ -276,20 +276,4 @@ TEST(SparseOpsTest, SparseConcat_ShapeFn) { INFER_ERROR("but are 4 and 5", op, "?;?;?;?;[4];[5]"); } -TEST(SparseOpsTest, SparseDenseCwise_ShapeFn) { - for (const char* op_name : - {"SparseDenseCwiseMul", "SparseDenseCwiseDiv", "SparseDenseCwiseAdd"}) { - ShapeInferenceTestOp op(op_name); - - // output is always a vector. - INFER_OK(op, "?;?;?;?", "[?]"); - - // input(0).dim(0) determines output[0]. - INFER_OK(op, "[?,?];?;?;?", "[d0_0]"); - - // Rank checks. - INFER_ERROR("must be rank 2", op, "[1];?;?;?"); - } -} - } // end namespace tensorflow diff --git a/tensorflow/python/framework/common_shapes.py b/tensorflow/python/framework/common_shapes.py index 2a2e24c3237..843317d3919 100644 --- a/tensorflow/python/framework/common_shapes.py +++ b/tensorflow/python/framework/common_shapes.py @@ -589,16 +589,11 @@ def broadcast_shape(shape_x, shape_y): return tensor_shape.TensorShape(return_dims) -def call_cpp_shape_fn(op, debug_python_shape_fn=None): +def call_cpp_shape_fn(op): """A shape function that delegates to the registered C++ shape function. Args: op: the node in the graph for which to compute output shapes. - debug_python_shape_fn: For testing only during migration to using - call_cpp_shape_fn. Do not submit calls that set this, - as the comparison is slow. If non-None, the python shape function; - this function will be called and its output compared to that of - the C++ shape function. 
Returns: A TensorShape list of the output shapes of the op, as computed using the @@ -621,20 +616,7 @@ def call_cpp_shape_fn(op, debug_python_shape_fn=None): raise ValueError(err.message) # Convert TensorShapeProto values in output_shapes. - result = [ + return [ tensor_shape.TensorShape(tensor_shape_pb2.TensorShapeProto.FromString(s)) for s in output_shapes ] - - if debug_python_shape_fn: - python_result = [tensor_shape.as_shape(s) - for s in debug_python_shape_fn(op)] - if str(result) != str(python_result): - raise ValueError( - ("Python vs CPP shape mismatch. " - "actual: %s vs expected: %s on node %s " - "with input shapes %s") % ( - str(result), str(python_result), str(op.node_def), - ",".join([str(i.get_shape()) for i in op.inputs]))) - - return result diff --git a/tensorflow/python/kernel_tests/array_ops_test.py b/tensorflow/python/kernel_tests/array_ops_test.py index 542af1a87e6..913e04bb95b 100644 --- a/tensorflow/python/kernel_tests/array_ops_test.py +++ b/tensorflow/python/kernel_tests/array_ops_test.py @@ -227,7 +227,7 @@ class ReverseTest(test_util.TensorFlowTestCase): self.assertEqual(2, reverse_2d_t.get_shape().ndims) dims_3d_t = tf.placeholder(tf.bool, shape=[3]) - with self.assertRaisesRegexp(ValueError, "must be rank 3"): + with self.assertRaisesRegexp(ValueError, "must have rank 3"): tf.reverse(data_2d_t, dims_3d_t) diff --git a/tensorflow/python/kernel_tests/check_ops_test.py b/tensorflow/python/kernel_tests/check_ops_test.py index 49bca0938ff..a3ddc0c8a96 100644 --- a/tensorflow/python/kernel_tests/check_ops_test.py +++ b/tensorflow/python/kernel_tests/check_ops_test.py @@ -97,7 +97,7 @@ class AssertEqualTest(tf.test.TestCase): with self.test_session(): small = tf.constant([1, 1, 1], name="small") small_2 = tf.constant([1, 1], name="small_2") - with self.assertRaisesRegexp(ValueError, "must be"): + with self.assertRaisesRegexp(ValueError, "broadcast"): with tf.control_dependencies([tf.assert_equal(small, small_2)]): out = tf.identity(small) 
out.eval() @@ -151,7 +151,7 @@ class AssertLessTest(tf.test.TestCase): with self.test_session(): small = tf.constant([1, 1, 1], name="small") big = tf.constant([3, 2], name="big") - with self.assertRaisesRegexp(ValueError, "must be"): + with self.assertRaisesRegexp(ValueError, "broadcast"): with tf.control_dependencies([tf.assert_less(small, big)]): out = tf.identity(small) out.eval() @@ -204,7 +204,7 @@ class AssertLessEqualTest(tf.test.TestCase): with self.test_session(): small = tf.constant([1, 1, 1], name="small") big = tf.constant([3, 1], name="big") - with self.assertRaisesRegexp(ValueError, "must be"): + with self.assertRaisesRegexp(ValueError, "broadcast"): with tf.control_dependencies([tf.assert_less_equal(small, big)]): out = tf.identity(small) out.eval() @@ -258,7 +258,7 @@ class AssertGreaterTest(tf.test.TestCase): with self.test_session(): small = tf.constant([1, 1, 1], name="small") big = tf.constant([3, 2], name="big") - with self.assertRaisesRegexp(ValueError, "must be"): + with self.assertRaisesRegexp(ValueError, "broadcast"): with tf.control_dependencies([tf.assert_greater(big, small)]): out = tf.identity(small) out.eval() @@ -311,7 +311,7 @@ class AssertGreaterEqualTest(tf.test.TestCase): with self.test_session(): small = tf.constant([1, 1, 1], name="big") big = tf.constant([3, 1], name="small") - with self.assertRaisesRegexp(ValueError, "Dimensions must be equal"): + with self.assertRaisesRegexp(ValueError, "broadcast"): with tf.control_dependencies([tf.assert_greater_equal(big, small)]): out = tf.identity(small) out.eval() diff --git a/tensorflow/python/kernel_tests/cwise_ops_test.py b/tensorflow/python/kernel_tests/cwise_ops_test.py index 8b966055e66..0890ea4a2af 100644 --- a/tensorflow/python/kernel_tests/cwise_ops_test.py +++ b/tensorflow/python/kernel_tests/cwise_ops_test.py @@ -986,7 +986,7 @@ class BinaryOpTest(tf.test.TestCase): for func in [tf.add, tf.sub, tf.mul, tf.div, _ADD, _SUB, _MUL, _TRUEDIV, _FLOORDIV]: with 
self.assertRaisesWithPredicateMatch( - ValueError, lambda e: "Dimensions must" in str(e)): + ValueError, lambda e: "Incompatible shapes" in str(e)): func(tf.convert_to_tensor([10.0, 20.0, 30.0]), tf.convert_to_tensor([[40.0, 50.0], [60.0, 70.0]])) @@ -1146,7 +1146,7 @@ class ComparisonOpTest(tf.test.TestCase): for t in dtypes: for f in funcs: with self.assertRaisesWithPredicateMatch( - ValueError, lambda e: "Dimensions must" in str(e)): + ValueError, lambda e: "Incompatible shapes" in str(e)): f(x.astype(t), y.astype(t)) @@ -1222,7 +1222,7 @@ class LogicalOpTest(tf.test.TestCase): y = np.random.randint(0, 2, 6).astype(np.bool).reshape(3, 2, 1) for f in [tf.logical_and, tf.logical_or, tf.logical_xor]: with self.assertRaisesWithPredicateMatch( - ValueError, lambda e: "Dimensions must" in str(e)): + ValueError, lambda e: "Incompatible shapes" in str(e)): f(x, y) def testUsingAsPythonValueFails(self): diff --git a/tensorflow/python/kernel_tests/diag_op_test.py b/tensorflow/python/kernel_tests/diag_op_test.py index b3d8fe373c6..bdc83ea6328 100644 --- a/tensorflow/python/kernel_tests/diag_op_test.py +++ b/tensorflow/python/kernel_tests/diag_op_test.py @@ -48,7 +48,7 @@ class BatchMatrixDiagTest(tf.test.TestCase): self.assertAllEqual(v_batch_diag.eval(), mat_batch) def testInvalidShape(self): - with self.assertRaisesRegexp(ValueError, "must be at least rank 1"): + with self.assertRaisesRegexp(ValueError, "must have rank at least 1"): tf.batch_matrix_diag(0) def testInvalidShapeAtEval(self): @@ -112,9 +112,9 @@ class BatchMatrixSetDiagTest(tf.test.TestCase): self.assertAllEqual(mat_set_diag_batch, output.eval()) def testInvalidShape(self): - with self.assertRaisesRegexp(ValueError, "must be at least rank 2"): + with self.assertRaisesRegexp(ValueError, "must have rank at least 2"): tf.batch_matrix_set_diag(0, [0]) - with self.assertRaisesRegexp(ValueError, "must be at least rank 1"): + with self.assertRaisesRegexp(ValueError, "must have rank at least 1"): 
tf.batch_matrix_set_diag([[0]], 0) def testInvalidShapeAtEval(self): @@ -189,9 +189,9 @@ class BatchMatrixDiagPartTest(tf.test.TestCase): self.assertAllEqual(mat_batch_diag.eval(), v_batch) def testInvalidShape(self): - with self.assertRaisesRegexp(ValueError, "must be at least rank 2"): + with self.assertRaisesRegexp(ValueError, "must have rank at least 2"): tf.batch_matrix_diag_part(0) - with self.assertRaisesRegexp(ValueError, r"Dimensions must be equal"): + with self.assertRaisesRegexp(ValueError, r"Dimensions .* not compatible"): tf.batch_matrix_diag_part([[0, 1], [1, 0], [0, 0]]) def testInvalidShapeAtEval(self): diff --git a/tensorflow/python/kernel_tests/reverse_sequence_op_test.py b/tensorflow/python/kernel_tests/reverse_sequence_op_test.py index 4c5ab904648..cea9711d322 100644 --- a/tensorflow/python/kernel_tests/reverse_sequence_op_test.py +++ b/tensorflow/python/kernel_tests/reverse_sequence_op_test.py @@ -134,7 +134,7 @@ class ReverseSequenceTest(tf.test.TestCase): seq_dim=3) # seq_dim out of bounds. - with self.assertRaisesRegexp(ValueError, "seq_dim must be < input rank"): + with self.assertRaisesRegexp(ValueError, "seq_dim must be < input.dims()"): tf.reverse_sequence( tf.placeholder(tf.float32, shape=(32, 2, 3)), seq_lengths=tf.placeholder(tf.int64, shape=(32,)), @@ -142,7 +142,7 @@ class ReverseSequenceTest(tf.test.TestCase): # batch_dim out of bounds. 
with self.assertRaisesRegexp( - ValueError, "batch_dim must be < input rank"): + ValueError, "batch_dim must be < input.dims()"): tf.reverse_sequence( tf.placeholder(tf.float32, shape=(32, 2, 3)), seq_lengths=tf.placeholder(tf.int64, shape=(32,)), diff --git a/tensorflow/python/ops/array_ops.py b/tensorflow/python/ops/array_ops.py index 85f1a91d353..3b431f258f3 100644 --- a/tensorflow/python/ops/array_ops.py +++ b/tensorflow/python/ops/array_ops.py @@ -800,7 +800,16 @@ def _PackShape(op): return [tensor_shape.TensorShape(input_shape)] -ops.RegisterShape("Unpack")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("Unpack") +def _UnpackShape(op): + input_shape = op.inputs[0].get_shape() + if input_shape.ndims is None: + return [tensor_shape.unknown_shape()] * op.get_attr("num") + + input_shape = input_shape.as_list() + del input_shape[op.get_attr("axis")] + return [tensor_shape.TensorShape(input_shape)] * op.get_attr("num") + @ops.RegisterShape("Concat") def _ConcatShape(op): @@ -839,7 +848,9 @@ def _ConcatShape(op): return [output_shape] -ops.RegisterShape("ConcatOffset")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("ConcatOffset") +def _ConcatOffsetShape(op): + return [x.get_shape() for x in op.inputs[1:]] def boolean_mask(tensor, mask, name="boolean_mask"): @@ -996,7 +1007,16 @@ def split(split_dim, num_split, value, name="split"): name=name) -ops.RegisterShape("Reverse")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("Reverse") +def _ReverseShape(op): + input_shape = op.inputs[0].get_shape() + dims_shape = op.inputs[1].get_shape().with_rank(1) + if dims_shape[0].value is not None: + input_shape = input_shape.with_rank(dims_shape[0]) + if input_shape.ndims is not None and input_shape.ndims > 8: + raise ValueError( + "tf.reverse() does not work on tensors with more than 8 dimensions") + return [input_shape] def transpose(a, perm=None, name="transpose"): @@ -1504,14 +1524,20 @@ def _PlaceholderShape(op): return [tensor_shape.unknown_shape()] 
-ops.RegisterShape("CheckNumerics")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Identity")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("RefIdentity")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("StopGradient")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("BatchMatrixBandPart")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("QuantizeAndDequantize")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Rank")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Size")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("CheckNumerics") +@ops.RegisterShape("Identity") +@ops.RegisterShape("RefIdentity") +@ops.RegisterShape("StopGradient") +@ops.RegisterShape("BatchMatrixBandPart") +@ops.RegisterShape("QuantizeAndDequantize") +def _UnchangedShape(op): + return [op.inputs[0].get_shape()] + + +@ops.RegisterShape("Rank") +@ops.RegisterShape("Size") +def _ScalarShape(unused_op): + return [tensor_shape.scalar()] @ops.RegisterShape("Slice") @@ -1699,15 +1725,121 @@ def _StridedSliceShape(op): return [tensor_shape.TensorShape(final_shape)] -ops.RegisterShape("Gather")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("GatherNd")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Unique")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("UniqueWithCounts")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("BatchMatrixDiag")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("BatchMatrixSetDiag")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("BatchMatrixDiagPart")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Diag")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("DiagPart")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("Gather") +def _GatherShape(op): + """Shape function for array_ops.gather.""" + params_shape = op.inputs[0].get_shape() + indices_shape = op.inputs[1].get_shape() + return [indices_shape.concatenate(params_shape[1:])] + + +@ops.RegisterShape("GatherNd") +def _GatherNdShape(op): + """Shape 
function for array_ops.gather_nd."""
+  params_shape = op.inputs[0].get_shape()
+  indices_shape = op.inputs[1].get_shape().with_rank_at_least(1)
+  indices_rank = indices_shape.ndims
+  indices_lookup_rank = (
+      None if indices_rank is None else indices_shape[-1].value)
+  if params_shape.ndims is None or indices_lookup_rank is None:
+    return [tensor_shape.unknown_shape()]
+  else:
+    if indices_lookup_rank > params_shape.ndims:
+      raise ValueError(
+          "indices.shape[-1] must be <= params.rank, but saw indices shape: %s "
+          " and params shape: %s" % (indices_shape, params_shape))
+    indices_lookup_shape = indices_shape[:-1]
+    params_slices_shape = params_shape[indices_lookup_rank:]
+    return [indices_lookup_shape.concatenate(params_slices_shape)]
+
+
+@ops.RegisterShape("Unique")
+def _UniqueShape(op):
+  """Shape function for array_ops.Unique."""
+  # The output is a vector with data-dependent length.
+  input_shape = op.inputs[0].get_shape()
+  input_shape.assert_has_rank(1)
+  return [tensor_shape.vector(None), input_shape]
+
+
+@ops.RegisterShape("UniqueWithCounts")
+def _UniqueWithCountsShape(op):
+  """Shape function for array_ops.UniqueWithCounts."""
+  # The output is a vector with data-dependent length.
+ input_shape = op.inputs[0].get_shape() + input_shape.assert_has_rank(1) + return [tensor_shape.vector(None), input_shape, tensor_shape.vector(None)] + + +@ops.RegisterShape("BatchMatrixDiag") +def _BatchMatrixDiagShape(op): + """Shape function for array_ops.batch_matrix_diag.""" + diag_shape = op.inputs[0].get_shape().with_rank_at_least(1) + return [diag_shape.concatenate(diag_shape[-1])] + + +@ops.RegisterShape("BatchMatrixSetDiag") +def _BatchMatrixSetDiagShape(op): + """Shape function for array_ops.batch_matrix_set_diag.""" + input_shape = op.inputs[0].get_shape().with_rank_at_least(2) + diag_shape = op.inputs[1].get_shape().with_rank_at_least(1) + output_shape = diag_shape.concatenate(diag_shape[-1]) + output_shape = output_shape.merge_with(input_shape) + return [output_shape] + + +@ops.RegisterShape("BatchMatrixDiagPart") +def _BatchMatrixDiagPartShape(op): + """Shape function for array_ops.batch_matrix_diag_part.""" + input_shape = op.inputs[0].get_shape().with_rank_at_least(2) + # Last two dims must match + input_shape[-1].assert_is_compatible_with(input_shape[-2]) + return [input_shape[:-1]] + + +@ops.RegisterShape("Diag") +def _DiagShape(op): + """Shape function for array_ops.diag. + + This op has one input (of rank k <= 3), and one output (of rank 2k), + where the shape of the output is the concatenation of the input + shape with itself. + + Args: + op: A Diag Operation. + + Returns: + A single-element list containing the shape of the output. + """ + input_shape = op.inputs[0].get_shape().with_rank_at_most(3) + return [input_shape.concatenate(input_shape)] + +@ops.RegisterShape("DiagPart") +def _DiagPartShape(op): + """Shape function for array_ops.diag_part. + + This op has one input (of rank k = 2, 4, or 6), and one output (of rank k/2), + where the shape of the output is the diagonal of the input shape. + + Args: + op: A DiagPart Operation. + + Returns: + A single-element list containing the shape of the output. 
+ + Raises: + ValueError: If input has odd rank or greater than 6, or the first and + second halves of the shape are incompatible. + + """ + input_shape = op.inputs[0].get_shape().with_rank_at_most(6) + rank = input_shape.ndims + if rank is None: + return [tensor_shape.unknown_shape()] + if rank % 2: + raise ValueError("Input must be even rank, got rank = " + str(rank) + ".") + mid = rank // 2 + return [input_shape[:mid].merge_with(input_shape[mid:])] @ops.RegisterShape("ExpandDims") def _ExpandDimsShape(op): @@ -1738,8 +1870,83 @@ def _ExpandDimsShape(op): return [tensor_shape.TensorShape(result_shape)] -ops.RegisterShape("Squeeze")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Bitcast")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("Squeeze") +def _SqueezeShape(op): + """Determine shape for squeeze op's output tensor. + + Args: + op: Operation for which to determine shape. + Returns: + Shape of op's output tensor. + Raises: + ValueError: if squeeze_dims includes a dimension outside of [-rank, rank), + where rank is the number of dimensions in the input tensor. Or, if + squeeze_dims includes a dimension for which input shape has a value + not equal to 1. + """ + input_shape = op.inputs[0].get_shape() + if input_shape.dims is None: + return [tensor_shape.unknown_shape()] + + squeeze_dims = op.get_attr("squeeze_dims") or [] + wrapped_squeeze_dims = [] + input_ndims = input_shape.ndims + for i, squeeze_dim in enumerate(squeeze_dims): + if squeeze_dim < -input_ndims or squeeze_dim >= input_ndims: + raise ValueError( + "squeeze_dims[%d]=%d not in [%d, %d)." % ( + i, squeeze_dim, -input_ndims, input_ndims)) + if squeeze_dim < 0: + squeeze_dim += input_ndims + wrapped_squeeze_dims.append(squeeze_dim) + + result_shape = [] + for i, dim in enumerate([d.value for d in input_shape.dims]): + is_explicit_match = i in wrapped_squeeze_dims + if dim is None: + if is_explicit_match: + # Assume that the squeezed dimension will be 1 at runtime. 
+        continue
+      if not wrapped_squeeze_dims:
+        # If squeezing all 1 dimensions and we see a None, give up.
+        return [tensor_shape.unknown_shape()]
+    elif dim == 1:
+      if is_explicit_match or not wrapped_squeeze_dims:
+        continue
+    elif is_explicit_match:
+      raise ValueError(
+          "Can not squeeze dim[%d], expected a dimension of 1, got %d." % (
+              i, dim))
+    result_shape.append(dim)
+  return [tensor_shape.TensorShape(result_shape)]
+
+
+@ops.RegisterShape("Bitcast")
+def _BitcastShape(op):
+  """Shape function for Bitcast op."""
+  input_shape = op.inputs[0].get_shape()
+  if input_shape == tensor_shape.unknown_shape():
+    return [tensor_shape.unknown_shape()]
+  input_type = op.inputs[0].dtype
+  size_of_input = input_type.size
+  output = dtypes.as_dtype(op.get_attr("type"))
+  size_of_output = output.size
+  if size_of_input == size_of_output:
+    return [input_shape]
+  else:
+    if size_of_output > size_of_input:
+      new_shape = input_shape.with_rank_at_least(1).as_list()
+      last_val = new_shape[-1]
+      if last_val is None or last_val == (size_of_output // size_of_input):
+        new_shape = new_shape[:-1]
+      else:
+        raise ValueError(
+            "Cannot bitcast due to shape. %d is not evenly divisible by %d." %
+            (new_shape[-1], size_of_output // size_of_input))
+    else:
+      new_shape = input_shape
+      new_shape = new_shape.concatenate([size_of_input // size_of_output])
+    return [tensor_shape.TensorShape(new_shape)]
 
 
 @ops.RegisterShape("Reshape")
@@ -1786,7 +1993,13 @@ def _ReshapeShape(op):
   return [new_shape]
 
 
-ops.RegisterShape("BroadcastGradientArgs")(common_shapes.call_cpp_shape_fn)
+@ops.RegisterShape("BroadcastGradientArgs")
+def _BroadcastGradientArgsShape(op):
+  """Shape function for the BroadcastGradientArgs op."""
+  # TODO(mrry): Implement constant_value for BroadcastGradientArgs?
+ op.inputs[0].get_shape().assert_has_rank(1) + op.inputs[1].get_shape().assert_has_rank(1) + return [tensor_shape.vector(None), tensor_shape.vector(None)] @ops.RegisterShape("Fill") @@ -1813,8 +2026,19 @@ def _FillShape(op): return [tensor_util.constant_value_as_shape(op.inputs[0])] -ops.RegisterShape("InvertPermutation")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("ListDiff")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("InvertPermutation") +def _InvertPermutationShape(op): + """Shape function for the InvertPermutation op.""" + return [op.inputs[0].get_shape().with_rank(1)] + + +@ops.RegisterShape("ListDiff") +def _ListDiffShape(op): + """Shape function for the ListDiff op.""" + op.inputs[0].get_shape().assert_has_rank(1) + op.inputs[1].get_shape().assert_has_rank(1) + # TODO(mrry): Indicate that the length falls within an interval? + return [tensor_shape.vector(None)] * 2 @ops.RegisterShape("Pad") @@ -1879,9 +2103,51 @@ def _MirrorPadGradShape(op): return [tensor_shape.TensorShape(output_dims)] -ops.RegisterShape("ReverseSequence")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Shape")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("ShapeN")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("ReverseSequence") +def _ReverseSequenceShape(op): + """Shape function for the ReverseSequence op. + + This op has two inputs: + + * input: A rank-N tensor with size B in the 0th dimension. + * seq_lens: A vector of length B. + + It has one output, with the same size as input. + + Args: + op: A ReverseSequence Operation. + + Returns: + A single-element list containing the shape of the output. + + Raises: + ValueError: If the input shapes are incompatible or seq_dim == batch_dim. 
+ """ + input_shape = op.inputs[0].get_shape() + seq_lens_shape = op.inputs[1].get_shape().with_rank(1) + if input_shape.ndims is None: + return [None] + seq_dim = op.get_attr("seq_dim") + batch_dim = op.get_attr("batch_dim") + if input_shape.ndims is not None: + if batch_dim >= input_shape.ndims: + raise ValueError("batch_dim must be < input.dims() (%d vs %d)" % + (batch_dim, input_shape.ndims)) + if seq_dim >= input_shape.ndims: + raise ValueError("seq_dim must be < input.dims() (%d vs %d)" % + (seq_dim, input_shape.ndims)) + batch_size = input_shape[batch_dim].merge_with(seq_lens_shape[0]) + input_shape = tensor_shape.TensorShape([ + value if ix != batch_dim else batch_size + for ix, value in enumerate(input_shape)]) + return [input_shape] + + +@ops.RegisterShape("Shape") +@ops.RegisterShape("ShapeN") +def _ShapeNShape(op): + """Shape function for the Shape/ShapeN op.""" + return [tensor_shape.vector(x.get_shape().ndims) for x in op.inputs] @ops.RegisterShape("Transpose") @@ -1994,8 +2260,17 @@ def _TileGradShape(op): return [tensor_shape.TensorShape(output_dims)] -ops.RegisterShape("Where")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("ZerosLike")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("Where") +def _WhereShape(op): + """Shape function for the Where op.""" + input_shape = op.inputs[0].get_shape() + return [tensor_shape.matrix(None, input_shape.ndims)] + + +@ops.RegisterShape("ZerosLike") +def _ZerosLikeShape(op): + """Shape function for the ZerosLike op.""" + return [op.inputs[0].get_shape()] def edit_distance(hypothesis, truth, normalize=True, name="edit_distance"): diff --git a/tensorflow/python/ops/math_ops.py b/tensorflow/python/ops/math_ops.py index 93c78b6254a..24a2331ca36 100644 --- a/tensorflow/python/ops/math_ops.py +++ b/tensorflow/python/ops/math_ops.py @@ -1594,7 +1594,22 @@ def accumulate_n(inputs, shape=None, tensor_dtype=None, name=None): ref, var_name=var.op.name, name=name) 
-ops.RegisterShape("BatchMatMul")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("BatchMatMul") +def _BatchMatMulShape(op): + """Shape function for BatchMatMul op.""" + a_shape = op.inputs[0].get_shape() + adj_a = op.get_attr("adj_x") + b_shape = op.inputs[1].get_shape() + adj_b = op.get_attr("adj_y") + if a_shape.dims is None and b_shape.dims is None: + return [tensor_shape.unknown_shape()] + batch_dims = a_shape[:-2].merge_with(b_shape[:-2]) + output_rows = a_shape[-1] if adj_a else a_shape[-2] + output_cols = b_shape[-2] if adj_b else b_shape[-1] + inner_a = a_shape[-2] if adj_a else a_shape[-1] + inner_b = b_shape[-1] if adj_b else b_shape[-2] + inner_a.assert_is_compatible_with(inner_b) + return [batch_dims.concatenate([output_rows, output_cols])] def sigmoid(x, name=None): @@ -1780,31 +1795,28 @@ ops.RegisterShape("Cumsum")(common_shapes.unchanged_shape) ops.RegisterShape("Cumprod")(common_shapes.unchanged_shape) -ops.RegisterShape("Add")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Complex")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Div")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Equal")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Greater")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("GreaterEqual")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Igamma")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Igammac")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Zeta")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Polygamma")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Less")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("LessEqual")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("LogicalAnd")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("LogicalOr")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Maximum")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Minimum")(common_shapes.call_cpp_shape_fn) 
-ops.RegisterShape("Mod")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Mul")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("NotEqual")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Pow")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("Sub")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SquaredDifference")(common_shapes.call_cpp_shape_fn) - - -# TODO(cwhipkey): inline body into callers. +@ops.RegisterShape("Add") +@ops.RegisterShape("Complex") +@ops.RegisterShape("Div") +@ops.RegisterShape("Equal") +@ops.RegisterShape("Greater") +@ops.RegisterShape("GreaterEqual") +@ops.RegisterShape("Igamma") +@ops.RegisterShape("Igammac") +@ops.RegisterShape("Zeta") +@ops.RegisterShape("Polygamma") +@ops.RegisterShape("Less") +@ops.RegisterShape("LessEqual") +@ops.RegisterShape("LogicalAnd") +@ops.RegisterShape("LogicalOr") +@ops.RegisterShape("Maximum") +@ops.RegisterShape("Minimum") +@ops.RegisterShape("Mod") +@ops.RegisterShape("Mul") +@ops.RegisterShape("NotEqual") +@ops.RegisterShape("Pow") +@ops.RegisterShape("Sub") +@ops.RegisterShape("SquaredDifference") def _BroadcastShape(op): """Common shape function for binary operators that broadcast their inputs.""" return [common_shapes.broadcast_shape( @@ -1827,10 +1839,21 @@ def _BetaincOpShape(op): # pylint: disable=invalid-name return [merged_shape if merged_shape.ndims is not None else a_shape] -ops.RegisterShape("SparseDenseCwiseMul")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SparseDenseCwiseDiv")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SparseDenseCwiseAdd")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("AddN")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("SparseDenseCwiseMul") +@ops.RegisterShape("SparseDenseCwiseDiv") +@ops.RegisterShape("SparseDenseCwiseAdd") +def _SparseDenseBinaryOpShape(op): # pylint: disable=invalid-name + """Common shape for 'sparse dense -> sparse' operators.""" + nnz = op.inputs[1].get_shape()[0] + return 
[tensor_shape.TensorShape(nnz)] + + +@ops.RegisterShape("AddN") +def _AddNShape(op): + merged_shape = tensor_shape.unknown_shape() + for input_ in op.inputs: + merged_shape = merged_shape.merge_with(input_.get_shape()) + return [merged_shape] @ops.RegisterShape("Select") @@ -1931,14 +1954,31 @@ def _ReductionShape(op): return [tensor_shape.TensorShape(returned_dims)] -ops.RegisterShape("SegmentMax")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SegmentMean")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SegmentMin")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SegmentProd")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SegmentSum")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SparseSegmentMean")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SparseSegmentSqrtN")(common_shapes.call_cpp_shape_fn) -ops.RegisterShape("SparseSegmentSum")(common_shapes.call_cpp_shape_fn) +@ops.RegisterShape("SegmentMax") +@ops.RegisterShape("SegmentMean") +@ops.RegisterShape("SegmentMin") +@ops.RegisterShape("SegmentProd") +@ops.RegisterShape("SegmentSum") +def _SegmentReductionShape(op): + """Common shape function for segment reduction ops.""" + data_shape = op.inputs[0].get_shape() + segment_ids_shape = op.inputs[1].get_shape() + segment_ids_shape.assert_has_rank(1) + return [tensor_shape.TensorShape([None]).concatenate(data_shape[1:])] + + +@ops.RegisterShape("SparseSegmentMean") +@ops.RegisterShape("SparseSegmentSqrtN") +@ops.RegisterShape("SparseSegmentSum") +def _SparseSegmentReductionShape(op): + """Common shape function for sparse segment reduction ops.""" + data_shape = op.inputs[0].get_shape() + indices_shape = op.inputs[1].get_shape() + indices_shape.assert_has_rank(1) + segment_ids_shape = op.inputs[2].get_shape() + segment_ids_shape.assert_has_rank(1) + indices_shape.assert_is_compatible_with(segment_ids_shape) + return [tensor_shape.TensorShape([None]).concatenate(data_shape[1:])] @ops.RegisterShape("SparseSegmentMeanGrad") From 
62c159ffe847eeb788550a32b8be572e41055022 Mon Sep 17 00:00:00 2001 From: Suharsh Sivakumar Date: Wed, 31 Aug 2016 15:43:07 -0800 Subject: [PATCH 07/89] Add Reset implementation for DirectSession. - Reset clears and closes the specified containers for ALL DirectSession objects. - Add closed bit to DirectSession to ensure that operations that occur after Close is called fail. Change: 131889161 --- .../core/common_runtime/direct_session.cc | 127 +++++++++++++----- .../core/common_runtime/direct_session.h | 25 +++- .../common_runtime/direct_session_test.cc | 124 +++++++++++++++++ 3 files changed, 240 insertions(+), 36 deletions(-) diff --git a/tensorflow/core/common_runtime/direct_session.cc b/tensorflow/core/common_runtime/direct_session.cc index 6aedcf4e7e8..4c90226231c 100644 --- a/tensorflow/core/common_runtime/direct_session.cc +++ b/tensorflow/core/common_runtime/direct_session.cc @@ -26,7 +26,6 @@ limitations under the License. #include "tensorflow/core/common_runtime/gpu/gpu_tracer.h" #include "tensorflow/core/common_runtime/graph_optimizer.h" #include "tensorflow/core/common_runtime/memory_types.h" -#include "tensorflow/core/common_runtime/session_factory.h" #include "tensorflow/core/common_runtime/simple_placer.h" #include "tensorflow/core/common_runtime/step_stats_collector.h" #include "tensorflow/core/framework/function.h" @@ -113,6 +112,77 @@ string GetRendezvousKey(const string& tensor_name, } // namespace +class DirectSessionFactory : public SessionFactory { + public: + DirectSessionFactory() {} + + bool AcceptsOptions(const SessionOptions& options) override { + return options.target.empty(); + } + + Session* NewSession(const SessionOptions& options) override { + // Must do this before the CPU allocator is created. 
+ if (options.config.graph_options().build_cost_model() > 0) { + EnableCPUAllocatorFullStats(true); + } + std::vector devices; + Status s = DeviceFactory::AddDevices( + options, "/job:localhost/replica:0/task:0", &devices); + if (!s.ok()) { + LOG(ERROR) << s; + return nullptr; + } + + DirectSession* session = + new DirectSession(options, new DeviceMgr(devices), this); + { + mutex_lock l(sessions_lock_); + sessions_.push_back(session); + } + return session; + } + + Status Reset(const SessionOptions& options, + const std::vector& containers) override { + std::vector sessions_to_reset; + { + mutex_lock l(sessions_lock_); + // We create a copy to ensure that we don't have a deadlock when + // session->Close calls the DirectSessionFactory.Deregister, which + // acquires sessions_lock_. + std::swap(sessions_to_reset, sessions_); + } + Status s; + for (auto session : sessions_to_reset) { + s.Update(session->Reset(containers)); + } + // TODO(suharshs): Change the Reset behavior of all SessionFactories so that + // it doesn't close the sessions? 
+ for (auto session : sessions_to_reset) { + s.Update(session->Close()); + } + return s; + } + + void Deregister(const DirectSession* session) { + mutex_lock l(sessions_lock_); + sessions_.erase(std::remove(sessions_.begin(), sessions_.end(), session), + sessions_.end()); + } + + private: + mutex sessions_lock_; + std::vector sessions_ GUARDED_BY(sessions_lock_); +}; + +class DirectSessionRegistrar { + public: + DirectSessionRegistrar() { + SessionFactory::Register("DIRECT_SESSION", new DirectSessionFactory()); + } +}; +static DirectSessionRegistrar registrar; + std::atomic_int_fast64_t DirectSession::step_id_counter_(1); // NOTE: On Android with a single device, there is never @@ -146,10 +216,13 @@ void DirectSession::SchedClosure(thread::ThreadPool* pool, } DirectSession::DirectSession(const SessionOptions& options, - const DeviceMgr* device_mgr) + const DeviceMgr* device_mgr, + DirectSessionFactory* const factory) : options_(options), device_mgr_(device_mgr), + factory_(factory), cancellation_manager_(new CancellationManager()), + closed_(false), operation_timeout_in_ms_(options_.config.operation_timeout_in_ms()) { if (options_.config.session_inter_op_thread_pool_size() > 0) { for (int i = 0; i < options_.config.session_inter_op_thread_pool_size(); @@ -194,6 +267,7 @@ DirectSession::DirectSession(const SessionOptions& options, } DirectSession::~DirectSession() { + if (!closed_) Close(); for (auto& it : partial_runs_) { it.second.reset(nullptr); } @@ -237,6 +311,7 @@ Status DirectSession::Create(const GraphDef& graph) { } Status DirectSession::Extend(const GraphDef& graph) { + TF_RETURN_IF_ERROR(CheckNotClosed()); mutex_lock l(graph_def_lock_); return ExtendLocked(graph); } @@ -267,6 +342,7 @@ Status DirectSession::Run(const RunOptions& run_options, const std::vector& target_nodes, std::vector* outputs, RunMetadata* run_metadata) { + TF_RETURN_IF_ERROR(CheckNotClosed()); direct_session_runs->GetCell()->IncrementBy(1); { mutex_lock l(graph_def_lock_); @@ -412,6 
+488,7 @@ Status DirectSession::PRunSetup(const std::vector& input_names, const std::vector& output_names, const std::vector& target_nodes, string* handle) { + TF_RETURN_IF_ERROR(CheckNotClosed()); { mutex_lock l(graph_def_lock_); if (!graph_created_) { @@ -487,6 +564,7 @@ Status DirectSession::PRunSetup(const std::vector& input_names, Status DirectSession::PRun(const string& handle, const NamedTensorList& inputs, const std::vector& output_names, std::vector* outputs) { + TF_RETURN_IF_ERROR(CheckNotClosed()); std::vector parts = str_util::Split(handle, ';'); const string& key = parts[0]; // Get the executors for this partial run. @@ -1002,8 +1080,20 @@ Status DirectSession::CreateGraphs( return s; } +::tensorflow::Status DirectSession::Reset( + const std::vector& containers) { + device_mgr_->ClearContainers(containers); + return ::tensorflow::Status::OK(); +} + ::tensorflow::Status DirectSession::Close() { cancellation_manager_->StartCancel(); + { + mutex_lock l(mu_); + if (closed_) return ::tensorflow::Status::OK(); + closed_ = true; + } + if (factory_ != nullptr) factory_->Deregister(this); return ::tensorflow::Status::OK(); } @@ -1051,37 +1141,4 @@ void DirectSession::WaitForNotification(RunState* run_state, } } -class DirectSessionFactory : public SessionFactory { - public: - DirectSessionFactory() {} - - bool AcceptsOptions(const SessionOptions& options) override { - return options.target.empty(); - } - - Session* NewSession(const SessionOptions& options) override { - // Must do this before the CPU allocator is created. 
-    if (options.config.graph_options().build_cost_model() > 0) {
-      EnableCPUAllocatorFullStats(true);
-    }
-    std::vector devices;
-    Status s = DeviceFactory::AddDevices(
-        options, "/job:localhost/replica:0/task:0", &devices);
-    if (!s.ok()) {
-      LOG(ERROR) << s;
-      return nullptr;
-    }
-
-    return new DirectSession(options, new DeviceMgr(devices));
-  }
-};
-
-class DirectSessionRegistrar {
- public:
-  DirectSessionRegistrar() {
-    SessionFactory::Register("DIRECT_SESSION", new DirectSessionFactory());
-  }
-};
-static DirectSessionRegistrar registrar;
-
 }  // namespace tensorflow
diff --git a/tensorflow/core/common_runtime/direct_session.h b/tensorflow/core/common_runtime/direct_session.h
index dcb2c584c82..8681d8fb7c4 100644
--- a/tensorflow/core/common_runtime/direct_session.h
+++ b/tensorflow/core/common_runtime/direct_session.h
@@ -28,6 +28,7 @@ limitations under the License.
 #include "tensorflow/core/common_runtime/device_set.h"
 #include "tensorflow/core/common_runtime/executor.h"
 #include "tensorflow/core/common_runtime/rendezvous_mgr.h"
+#include "tensorflow/core/common_runtime/session_factory.h"
 #include "tensorflow/core/common_runtime/simple_graph_execution_state.h"
 #include "tensorflow/core/debug/debug_graph_utils.h"
 #include "tensorflow/core/framework/cancellation.h"
@@ -47,11 +48,18 @@ namespace tensorflow {
 class CostModel;
 class DebugGateway;
 class Device;
+class DirectSessionFactory;
 
 class DirectSession : public Session {
  public:
+  typedef std::function CloseCallback;
+
   // Takes ownership of 'device_mgr'.
-  DirectSession(const SessionOptions& options, const DeviceMgr* device_mgr);
+  // 'factory' is used to unregister the DirectSession with 'factory' when it's
+  // closed. This ensures that Reset requests from the 'factory' don't get sent
+  // to sessions that are already closed.
+ DirectSession(const SessionOptions& options, const DeviceMgr* device_mgr, + DirectSessionFactory* factory); ~DirectSession() override; typedef std::vector> NamedTensorList; @@ -83,6 +91,10 @@ class DirectSession : public Session { const std::vector& output_names, std::vector* outputs) override; + // Reset clears 'containers' from the device_mgr of the DirectSession. + // If 'containers' is empty, then Reset clears the default container. + ::tensorflow::Status Reset(const std::vector& containers); + ::tensorflow::Status Close() override; void ExportCostModels(CostModelManager::CostModelMap* cost_models) { @@ -198,6 +210,12 @@ class DirectSession : public Session { // operation_timeout_in_ms is greater than 0. void WaitForNotification(RunState* run_state, int64 timeout_in_ms); + ::tensorflow::Status CheckNotClosed() { + mutex_lock l(mu_); + if (closed_) return errors::Cancelled("Session has been closed."); + return ::tensorflow::Status::OK(); + } + const SessionOptions options_; // Device structures. @@ -232,10 +250,12 @@ class DirectSession : public Session { // This holds all the tensors that are currently alive in the session. SessionState session_state_; + DirectSessionFactory* const factory_; // not owned CancellationManager* cancellation_manager_; // Saves and restores device placements for stateful nodes. mutex mu_; + // Map of placed stateful nodes, i.e. nodes for which is_stateful() // is true, such as "params" and "queue" nodes. Once placed these // nodes can not be moved to a different device. Maps node names to @@ -251,6 +271,9 @@ class DirectSession : public Session { // library; it copies and modifies the function library. std::unique_ptr flib_def_; + // true if the Session has been Closed. + bool closed_ GUARDED_BY(mu_); + // For generating unique names. 
int64 name_counter_ GUARDED_BY(mu_) = 0; diff --git a/tensorflow/core/common_runtime/direct_session_test.cc b/tensorflow/core/common_runtime/direct_session_test.cc index 380f2ca8fd6..51f70e551b5 100644 --- a/tensorflow/core/common_runtime/direct_session_test.cc +++ b/tensorflow/core/common_runtime/direct_session_test.cc @@ -970,5 +970,129 @@ TEST(DirectSessionTest, TestSessionInterOpThreadsInvalidOptions) { } } +TEST(DirectSessionTest, TestDirectSessionRunClose) { + // Construct a graph with a variable and a single assign. + Graph g(OpRegistry::Global()); + Tensor t(DT_FLOAT, TensorShape({})); + t.scalar()() = {1.2}; + Node* var_val = test::graph::Constant(&g, t); + Node* var = test::graph::Var(&g, DT_FLOAT, {}); + Node* var_assign = test::graph::Assign(&g, var, var_val); + GraphDef def; + test::graph::ToGraphDef(&g, &def); + + SessionOptions options; + (*options.config.mutable_device_count())["CPU"] = 2; + std::unique_ptr session(NewSession(options)); + ASSERT_TRUE(session != nullptr); + TF_ASSERT_OK(session->Create(def)); + + // Assign a value to the var. + TF_ASSERT_OK(session->Run({} /* inputs */, {}, + {var_assign->name()} /* target_nodes */, nullptr)); + + // Run a read on the variable to ensure that it works. + std::vector outputs; + TF_ASSERT_OK(session->Run( + {} /* inputs */, {var->name() + ":0"} /* output_names */, {}, &outputs)); + EXPECT_EQ(t.scalar()(), outputs[0].scalar()()); + outputs.clear(); + + // Close the session. + session->Close(); + + // Run the read on the variable to get an error. 
+ Status s = session->Run({} /* inputs */, {}, + {var_assign->name()} /* target_nodes */, nullptr); + EXPECT_EQ("Cancelled: Session has been closed.", s.ToString()); +} + +TEST(DirectSessionTest, TestDirectSessionPRunClose) { + GraphDef def; + Graph g(OpRegistry::Global()); + + Tensor first_value(DT_FLOAT, TensorShape({})); + first_value.scalar()() = 1.0; + Node* first_const = test::graph::Constant(&g, first_value); + Node* first_identity = test::graph::Identity(&g, first_const); + + Tensor second_value(DT_FLOAT, TensorShape({})); + second_value.scalar()() = 2.0; + Node* second_const = test::graph::Constant(&g, second_value); + Node* second_identity = test::graph::Identity(&g, second_const); + + Node* third = test::graph::Add(&g, first_identity, second_identity); + Node* third_identity = test::graph::Identity(&g, third); + + test::graph::ToGraphDef(&g, &def); + + std::unique_ptr session(CreateSession()); + ASSERT_TRUE(session != nullptr); + TF_ASSERT_OK(session->Create(def)); + + std::vector outputs; + + string handle; + Status s = session->PRunSetup( + {first_const->name(), second_const->name()}, + {first_identity->name() + ":0", second_identity->name() + ":0", + third_identity->name() + ":0"}, + {}, &handle); + TF_ASSERT_OK(s); + + Tensor value_11(DT_FLOAT, TensorShape({})); + value_11.scalar()() = 11.0; + Tensor value_22(DT_FLOAT, TensorShape({})); + value_22.scalar()() = 22.0; + + // Close the session. + session->Close(); + + // Feed first_const, fetch first_identity + s = session->PRun(handle, {{first_const->name(), value_11}}, + {first_identity->name() + ":0"}, &outputs); + EXPECT_EQ("Cancelled: Session has been closed.", s.ToString()); +} + +TEST(DirectSessionTest, TestDirectSessionReset) { + // Construct a graph with a variable and a single assign. 
+ Graph g(OpRegistry::Global()); + Tensor t(DT_FLOAT, TensorShape({})); + t.scalar()() = {1.2}; + Node* var_val = test::graph::Constant(&g, t); + Node* var = test::graph::Var(&g, DT_FLOAT, {}); + Node* var_assign = test::graph::Assign(&g, var, var_val); + GraphDef def; + test::graph::ToGraphDef(&g, &def); + + SessionOptions options; + (*options.config.mutable_device_count())["CPU"] = 2; + std::unique_ptr session(NewSession(options)); + ASSERT_TRUE(session != nullptr); + TF_ASSERT_OK(session->Create(def)); + + // Assign a value to the var. + TF_ASSERT_OK(session->Run({} /* inputs */, {}, + {var_assign->name()} /* target_nodes */, nullptr)); + + // Run a read on the variable to ensure that it works. + std::vector outputs; + TF_ASSERT_OK(session->Run( + {} /* inputs */, {var->name() + ":0"} /* output_names */, {}, &outputs)); + EXPECT_EQ(t.scalar()(), outputs[0].scalar()()); + outputs.clear(); + + // Reset the containers. + Reset(options, {}); + + // Run the read on the variable to get an error. + // TODO(suharshs): This test only works because we close the Session in Reset. + // If we change the behavior of Reset to not close the Session, this test will + // fail, since the Variable buffer is cached by var. + Status s = session->Run({} /* inputs */, {}, + {var_assign->name()} /* target_nodes */, nullptr); + EXPECT_EQ("Cancelled: Session has been closed.", s.ToString()); +} + } // namespace } // namespace tensorflow From 7a1210bdbdade7210d48db287065ecac950338aa Mon Sep 17 00:00:00 2001 From: Vincent Vanhoucke Date: Wed, 31 Aug 2016 16:01:52 -0800 Subject: [PATCH 08/89] Fix ~63 ClangTidy - Performance findings in TensorFlow. 
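
Most of the findings fixed here are unnecessary copies, for example a range-based for loop that copies each element instead of binding a const reference. The sketch below illustrates that pattern under stated assumptions: `Tracked`, `MakeItems`, `TotalSizeByValue`, and `TotalSizeByRef` are hypothetical names invented for this illustration and do not appear in TensorFlow.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
  static int copies;
  std::string payload;

  explicit Tracked(std::string p) : payload(std::move(p)) {}
  Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
  Tracked(Tracked&&) = default;
};
int Tracked::copies = 0;

// Builds three elements in place, so no copy constructor runs here.
std::vector<Tracked> MakeItems() {
  std::vector<Tracked> items;
  items.reserve(3);
  items.emplace_back("a");
  items.emplace_back("bb");
  items.emplace_back("ccc");
  return items;
}

// Before: `auto item` copy-constructs one Tracked per iteration.
std::size_t TotalSizeByValue(const std::vector<Tracked>& items) {
  std::size_t total = 0;
  for (auto item : items) total += item.payload.size();
  return total;
}

// After: `const auto& item` binds to the existing element without copying.
std::size_t TotalSizeByRef(const std::vector<Tracked>& items) {
  std::size_t total = 0;
  for (const auto& item : items) total += item.payload.size();
  return total;
}
```

Both functions return the same total; only the number of copy constructions differs, which is why this kind of rewrite is expected to be behavior-preserving.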
Change: 131891101 --- tensorflow/c/c_api_test.cc | 2 +- tensorflow/core/common_runtime/bfc_allocator.cc | 2 +- tensorflow/core/common_runtime/constant_folding.cc | 2 +- tensorflow/core/common_runtime/copy_tensor.cc | 5 ++++- tensorflow/core/common_runtime/device_set.cc | 4 ++-- tensorflow/core/common_runtime/function_test.cc | 2 +- tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc | 2 +- .../core/common_runtime/gpu/gpu_stream_util_test.cc | 2 +- tensorflow/core/common_runtime/gpu/pool_allocator.cc | 11 ++++++----- tensorflow/core/common_runtime/simple_placer.cc | 8 ++++---- tensorflow/core/framework/function.cc | 4 ++-- tensorflow/core/framework/function_testlib.cc | 6 +++--- tensorflow/core/framework/op_def_util.cc | 2 +- tensorflow/core/framework/op_kernel_test.cc | 4 ++-- tensorflow/core/graph/optimizer_cse_test.cc | 2 +- tensorflow/core/graph/quantize_training.cc | 2 +- tensorflow/core/kernels/argmax_op.cc | 2 +- tensorflow/core/kernels/attention_ops.cc | 2 +- tensorflow/core/kernels/candidate_sampler_ops.cc | 2 +- tensorflow/core/kernels/gather_nd_op.cc | 4 ++-- tensorflow/core/kernels/maxpooling_op.cc | 2 +- tensorflow/core/kernels/scan_ops.cc | 2 +- tensorflow/core/lib/jpeg/jpeg_mem.cc | 3 ++- tensorflow/core/platform/cloud/gcs_file_system.cc | 6 +++--- tensorflow/core/util/tensor_slice_reader.cc | 3 ++- tensorflow/core/util/tensor_slice_set.cc | 10 +++++----- tensorflow/core/util/tensor_slice_writer.cc | 4 +++- .../tools/proto_text/gen_proto_text_functions_lib.cc | 6 +++--- 28 files changed, 57 insertions(+), 49 deletions(-) diff --git a/tensorflow/c/c_api_test.cc b/tensorflow/c/c_api_test.cc index 589f001b142..8dcd6a118bf 100644 --- a/tensorflow/c/c_api_test.cc +++ b/tensorflow/c/c_api_test.cc @@ -87,7 +87,7 @@ TEST(CApi, AllocateTensor) { static void TestEncodeDecode(int line, const std::vector& data) { const tensorflow::int64 n = data.size(); - for (std::vector dims : + for (const std::vector& dims : std::vector>{ {n}, {1, n}, {n, 1}, {n / 2, 2}}) { // 
Create C++ Tensor diff --git a/tensorflow/core/common_runtime/bfc_allocator.cc b/tensorflow/core/common_runtime/bfc_allocator.cc index 70b01d6485a..f525d1d9812 100644 --- a/tensorflow/core/common_runtime/bfc_allocator.cc +++ b/tensorflow/core/common_runtime/bfc_allocator.cc @@ -157,7 +157,7 @@ bool BFCAllocator::Extend(size_t rounded_bytes) { InsertFreeChunkIntoBin(h); // Invoke visitors on newly allocated region. - for (auto visitor : region_visitors_) { + for (const auto& visitor : region_visitors_) { visitor(mem_addr, bytes); } return true; diff --git a/tensorflow/core/common_runtime/constant_folding.cc b/tensorflow/core/common_runtime/constant_folding.cc index 9bd162b72fd..6a49c940b3e 100644 --- a/tensorflow/core/common_runtime/constant_folding.cc +++ b/tensorflow/core/common_runtime/constant_folding.cc @@ -279,7 +279,7 @@ bool ReplaceTensorWithConstant(Graph* graph, Device* partition_device, edges_to_remove.push_back(out_edge); } } - string node_name = n->name(); + const string& node_name = n->name(); Node* constant_node; auto builder = NodeDefBuilder(strings::StrCat(graph->NewName(node_name), "__cf__", UniqueConstantId()), diff --git a/tensorflow/core/common_runtime/copy_tensor.cc b/tensorflow/core/common_runtime/copy_tensor.cc index 5dc8c33b2a7..e55ef7d5ba9 100644 --- a/tensorflow/core/common_runtime/copy_tensor.cc +++ b/tensorflow/core/common_runtime/copy_tensor.cc @@ -16,6 +16,7 @@ limitations under the License. 
#include "tensorflow/core/common_runtime/copy_tensor.h" #include +#include #include #include "tensorflow/core/lib/core/errors.h" #include "tensorflow/core/platform/logging.h" @@ -26,7 +27,9 @@ namespace { struct RegistrationInfo { RegistrationInfo(DeviceType s, DeviceType r, CopyTensor::CopyFunction cf) - : sender_device_type(s), receiver_device_type(r), copy_function(cf) {} + : sender_device_type(std::move(s)), + receiver_device_type(r), + copy_function(cf) {} DeviceType sender_device_type; DeviceType receiver_device_type; CopyTensor::CopyFunction copy_function; diff --git a/tensorflow/core/common_runtime/device_set.cc b/tensorflow/core/common_runtime/device_set.cc index 98c6c3843ce..8ff93760d49 100644 --- a/tensorflow/core/common_runtime/device_set.cc +++ b/tensorflow/core/common_runtime/device_set.cc @@ -71,9 +71,9 @@ std::vector DeviceSet::PrioritizedDeviceTypeList() const { std::vector result; std::set seen; for (Device* d : devices_) { - auto t = d->device_type(); + const auto& t = d->device_type(); if (seen.insert(t).second) { - result.emplace_back(DeviceType(t)); + result.emplace_back(t); } } std::sort(result.begin(), result.end(), DeviceTypeComparator); diff --git a/tensorflow/core/common_runtime/function_test.cc b/tensorflow/core/common_runtime/function_test.cc index 2f5507a0c55..e263e62bd84 100644 --- a/tensorflow/core/common_runtime/function_test.cc +++ b/tensorflow/core/common_runtime/function_test.cc @@ -144,7 +144,7 @@ class FunctionLibraryRuntimeTest : public ::testing::Test { void Init(const std::vector& flib) { FunctionDefLibrary proto; - for (auto fdef : flib) *(proto.add_function()) = fdef; + for (const auto& fdef : flib) *(proto.add_function()) = fdef; delete lib_def_; lib_def_ = new FunctionLibraryDefinition(OpRegistry::Global(), proto); delete lib_; diff --git a/tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc b/tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc index 7506e35ff34..f18ee5efd85 100644 --- 
a/tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc +++ b/tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc @@ -95,7 +95,7 @@ void EventMgr::ThenDeleteTensors(perftools::gputools::Stream* stream, FlushAccumulatedTensors(); } accumulated_stream_ = stream; - for (auto t : tensors) { + for (const auto& t : tensors) { // accumulated_tensors_ takes over ownership of the reference to "t" accumulated_tensors_->push_back(t); accumulated_tensor_bytes_ += t.TotalBytes(); diff --git a/tensorflow/core/common_runtime/gpu/gpu_stream_util_test.cc b/tensorflow/core/common_runtime/gpu/gpu_stream_util_test.cc index 5b4812bb34a..3aaaf87e79c 100644 --- a/tensorflow/core/common_runtime/gpu/gpu_stream_util_test.cc +++ b/tensorflow/core/common_runtime/gpu/gpu_stream_util_test.cc @@ -129,7 +129,7 @@ TEST_F(GpuStreamUtilTest, StreamOverrides) { // Nodes should be assigned to streams by op type. for (const auto& it : node_to_stream_id) { Node* n = g.FindNodeId(it.first); - const string op = n->type_string(); + const string& op = n->type_string(); const int stream = it.second; if (op == "Const") { EXPECT_EQ(stream, 90); diff --git a/tensorflow/core/common_runtime/gpu/pool_allocator.cc b/tensorflow/core/common_runtime/gpu/pool_allocator.cc index b44108d1ace..e0362b38e6b 100644 --- a/tensorflow/core/common_runtime/gpu/pool_allocator.cc +++ b/tensorflow/core/common_runtime/gpu/pool_allocator.cc @@ -20,6 +20,7 @@ limitations under the License. 
#include // for munmap #include +#include #include "tensorflow/core/lib/strings/numbers.h" #include "tensorflow/core/platform/logging.h" @@ -31,7 +32,7 @@ namespace tensorflow { PoolAllocator::PoolAllocator(size_t pool_size_limit, bool auto_resize, SubAllocator* allocator, RoundUpInterface* size_rounder, string name) - : name_(name), + : name_(std::move(name)), has_size_limit_(pool_size_limit > 0), auto_resize_(auto_resize), pool_size_limit_(pool_size_limit), @@ -125,7 +126,7 @@ void* PoolAllocator::AllocateRaw(size_t alignment, size_t num_bytes) { return PrepareChunk(r, alignment, num_bytes); } else { void* ptr = allocator_->Alloc(kPoolAlignment, num_bytes); - for (auto v : alloc_visitors_) { + for (const auto& v : alloc_visitors_) { v(ptr, num_bytes); } return PrepareChunk(ptr, alignment, num_bytes); @@ -137,7 +138,7 @@ void PoolAllocator::DeallocateRaw(void* ptr) { ChunkPrefix* cp = FindPrefix(ptr); CHECK_LE((void*)cp, (void*)ptr); if (!has_size_limit_ && !auto_resize_) { - for (auto v : free_visitors_) { + for (const auto& v : free_visitors_) { v(cp, cp->num_bytes); } allocator_->Free(cp, cp->num_bytes); @@ -160,7 +161,7 @@ void PoolAllocator::Clear() { mutex_lock lock(mutex_); for (auto iter : pool_) { PtrRecord* pr = iter.second; - for (auto v : free_visitors_) { + for (const auto& v : free_visitors_) { v(pr->ptr, pr->num_bytes); } allocator_->Free(pr->ptr, pr->num_bytes); @@ -217,7 +218,7 @@ void PoolAllocator::EvictOne() { DCHECK(iter != pool_.end()); } pool_.erase(iter); - for (auto v : free_visitors_) { + for (const auto& v : free_visitors_) { v(prec->ptr, prec->num_bytes); } allocator_->Free(prec->ptr, prec->num_bytes); diff --git a/tensorflow/core/common_runtime/simple_placer.cc b/tensorflow/core/common_runtime/simple_placer.cc index 6e177da57fc..2a9e0fa1963 100644 --- a/tensorflow/core/common_runtime/simple_placer.cc +++ b/tensorflow/core/common_runtime/simple_placer.cc @@ -42,7 +42,7 @@ std::vector FilterSupportedDevices( const std::vector& devices, 
const DeviceTypeVector& supported_device_types) { std::vector filtered_devices; - for (DeviceType d : supported_device_types) { + for (const DeviceType& d : supported_device_types) { for (Device* device : devices) { if (DeviceType(device->attributes().device_type()) == d) { filtered_devices.emplace_back(device); @@ -495,7 +495,7 @@ class ColocationGraph { "' does not match any device"); } - for (DeviceType d : member->supported_device_types) { + for (const DeviceType& d : member->supported_device_types) { if (DeviceType(assigned_device->attributes().device_type()) == d) { return Status::OK(); } @@ -545,9 +545,9 @@ class ColocationGraph { target->clear(); // Iterate in priority order. - for (DeviceType device_type : temp) { + for (const DeviceType& device_type : temp) { bool found = false; - for (DeviceType other_device_type : other) { + for (const DeviceType& other_device_type : other) { if (device_type == other_device_type) { found = true; break; diff --git a/tensorflow/core/framework/function.cc b/tensorflow/core/framework/function.cc index 83676a90c51..bedc85ab4e7 100644 --- a/tensorflow/core/framework/function.cc +++ b/tensorflow/core/framework/function.cc @@ -861,11 +861,11 @@ string DebugString(const GraphDef& instantiated_func_def) { string DebugStringWhole(const GraphDef& gdef) { string ret; - for (auto fdef : gdef.library().function()) { + for (const auto& fdef : gdef.library().function()) { strings::StrAppend(&ret, Print(fdef)); } strings::StrAppend(&ret, "\n"); - for (auto ndef : gdef.node()) { + for (const auto& ndef : gdef.node()) { strings::StrAppend(&ret, Print(ndef), "\n"); } return ret; diff --git a/tensorflow/core/framework/function_testlib.cc b/tensorflow/core/framework/function_testlib.cc index 900ceed1a59..47db0f03391 100644 --- a/tensorflow/core/framework/function_testlib.cc +++ b/tensorflow/core/framework/function_testlib.cc @@ -31,11 +31,11 @@ GraphDef GDef(gtl::ArraySlice nodes, VersionDef* versions = g.mutable_versions(); 
versions->set_producer(TF_GRAPH_DEF_VERSION); versions->set_min_consumer(TF_GRAPH_DEF_VERSION_MIN_CONSUMER); - for (auto n : nodes) { + for (const auto& n : nodes) { *(g.add_node()) = n; } auto lib = g.mutable_library(); - for (auto f : funcs) { + for (const auto& f : funcs) { *(lib->add_function()) = f; } return g; @@ -49,7 +49,7 @@ NodeDef NDef(const string& name, const string& op, NodeDef n; n.set_name(name); n.set_op(op); - for (auto in : inputs) n.add_input(in); + for (const auto& in : inputs) n.add_input(in); n.set_device(device); for (auto na : attrs) n.mutable_attr()->insert({na.first, na.second.proto}); return n; diff --git a/tensorflow/core/framework/op_def_util.cc b/tensorflow/core/framework/op_def_util.cc index 5717488b1cb..c36e6dd653b 100644 --- a/tensorflow/core/framework/op_def_util.cc +++ b/tensorflow/core/framework/op_def_util.cc @@ -60,7 +60,7 @@ Status AllowedTypeValue(DataType dt, const OpDef::AttrDef& attr) { Status AllowedStringValue(const string& str, const OpDef::AttrDef& attr) { const AttrValue& allowed_values(attr.allowed_values()); - for (auto allowed : allowed_values.list().s()) { + for (const auto& allowed : allowed_values.list().s()) { if (str == allowed) { return Status::OK(); } diff --git a/tensorflow/core/framework/op_kernel_test.cc b/tensorflow/core/framework/op_kernel_test.cc index db4b6037ef0..b4556c9272d 100644 --- a/tensorflow/core/framework/op_kernel_test.cc +++ b/tensorflow/core/framework/op_kernel_test.cc @@ -381,7 +381,7 @@ class OpKernelBuilderTest : public ::testing::Test { DeviceTypeVector devices; TF_EXPECT_OK(SupportedDeviceTypesForNode(DeviceTypes(), def, &devices)); bool found = false; - for (DeviceType dt : devices) { + for (const DeviceType& dt : devices) { if (dt == device_type) { found = true; } @@ -414,7 +414,7 @@ class OpKernelBuilderTest : public ::testing::Test { DeviceTypeVector devices; if (errors::IsNotFound(status)) { TF_EXPECT_OK(SupportedDeviceTypesForNode(DeviceTypes(), def, &devices)); - for 
(DeviceType dt : devices) { + for (const DeviceType& dt : devices) { EXPECT_NE(dt, device_type); } } else { diff --git a/tensorflow/core/graph/optimizer_cse_test.cc b/tensorflow/core/graph/optimizer_cse_test.cc index 0841bac93cd..1091af4e451 100644 --- a/tensorflow/core/graph/optimizer_cse_test.cc +++ b/tensorflow/core/graph/optimizer_cse_test.cc @@ -326,7 +326,7 @@ TEST_F(OptimizerCSETest, Constant_Dedup) { // A graph contains a bunch of constants. Graph g(OpRegistry::Global()); - for (auto val : {a, b, c, d, d, c, b, a}) { + for (const auto& val : {a, b, c, d, d, c, b, a}) { test::graph::Constant(&g, val); // Node name is n/_0, n/_1, ... } GraphDef gdef; diff --git a/tensorflow/core/graph/quantize_training.cc b/tensorflow/core/graph/quantize_training.cc index 8521dff6fa2..930d7bd15f6 100644 --- a/tensorflow/core/graph/quantize_training.cc +++ b/tensorflow/core/graph/quantize_training.cc @@ -74,7 +74,7 @@ inline bool IsGradientNode(const Graph* graph, const Node* node) { // Returns true if the root tensor op type is known, false otherwise. bool FindType(const Graph* graph, const Node* node, bool* signed_input, bool* range_given, float* input_min, float* input_max) { - const string src_op = node->type_string(); + const string& src_op = node->type_string(); if (src_op == "Const" || src_op == "Variable") { *signed_input = true; *range_given = false; diff --git a/tensorflow/core/kernels/argmax_op.cc b/tensorflow/core/kernels/argmax_op.cc index 595bd7bd5e4..2f92a2da9f8 100644 --- a/tensorflow/core/kernels/argmax_op.cc +++ b/tensorflow/core/kernels/argmax_op.cc @@ -67,7 +67,7 @@ class ArgOp : public OpKernel { input.shape().DebugString())); TensorShape output_shape; - TensorShape input_shape = input.shape(); + const TensorShape& input_shape = input.shape(); for (int d = 0; d < input_dims - 1; ++d) { output_shape.AddDim(input_shape.dim_size((d < dim) ? 
d : d + 1)); } diff --git a/tensorflow/core/kernels/attention_ops.cc b/tensorflow/core/kernels/attention_ops.cc index 695068d3150..cc8f122cab3 100644 --- a/tensorflow/core/kernels/attention_ops.cc +++ b/tensorflow/core/kernels/attention_ops.cc @@ -41,7 +41,7 @@ class ExtractGlimpseOp : public OpKernel { // depth). void Compute(OpKernelContext* context) override { const Tensor& input = context->input(0); - const TensorShape input_shape = input.shape(); + const TensorShape& input_shape = input.shape(); const int32 num_dims = input_shape.dims(); OP_REQUIRES( context, num_dims == 4, diff --git a/tensorflow/core/kernels/candidate_sampler_ops.cc b/tensorflow/core/kernels/candidate_sampler_ops.cc index d64dca3d0b5..6aa9059dc70 100644 --- a/tensorflow/core/kernels/candidate_sampler_ops.cc +++ b/tensorflow/core/kernels/candidate_sampler_ops.cc @@ -190,7 +190,7 @@ class ComputeAccidentalHitsOp : public OpKernel { void Compute(OpKernelContext* context) override { const Tensor& in_true_candidates = context->input(0); - TensorShape in_true_candidates_shape = in_true_candidates.shape(); + const TensorShape& in_true_candidates_shape = in_true_candidates.shape(); OP_REQUIRES(context, TensorShapeUtils::IsMatrix(in_true_candidates_shape) && in_true_candidates_shape.dim_size(1) == num_true_, errors::InvalidArgument( diff --git a/tensorflow/core/kernels/gather_nd_op.cc b/tensorflow/core/kernels/gather_nd_op.cc index c2a5192efb1..73f30cdae37 100644 --- a/tensorflow/core/kernels/gather_nd_op.cc +++ b/tensorflow/core/kernels/gather_nd_op.cc @@ -53,7 +53,7 @@ class GatherNdOp : public OpKernel { "index innermost dimension length must be <= params rank; saw: ", indices.dim_size(indices.dims() - 1), " vs. 
", params.dims())); - TensorShape indices_shape(indices.shape()); + const TensorShape& indices_shape(indices.shape()); const int64 indices_nd = indices_shape.dim_size(indices_shape.dims() - 1); // Check that we have enough index space @@ -79,7 +79,7 @@ class GatherNdOp : public OpKernel { N_result *= indices_shape.dim_size(i); } - TensorShape params_shape(params.shape()); + const TensorShape& params_shape(params.shape()); Index total_nd = params_shape.dims(); TensorShape result_shape(indices_shape); diff --git a/tensorflow/core/kernels/maxpooling_op.cc b/tensorflow/core/kernels/maxpooling_op.cc index 97e2bfcad54..27888d3a313 100644 --- a/tensorflow/core/kernels/maxpooling_op.cc +++ b/tensorflow/core/kernels/maxpooling_op.cc @@ -272,7 +272,7 @@ class MaxPoolingGradOp : public OpKernel { OP_REQUIRES(context, out_backprop.dims() == 4, errors::InvalidArgument("out_backprop must be 4-dimensional")); - TensorShape output_shape = tensor_in.shape(); + const TensorShape& output_shape = tensor_in.shape(); Tensor tensor_out_dup; OP_REQUIRES_OK(context, diff --git a/tensorflow/core/kernels/scan_ops.cc b/tensorflow/core/kernels/scan_ops.cc index 604e712b0fd..2604b738448 100644 --- a/tensorflow/core/kernels/scan_ops.cc +++ b/tensorflow/core/kernels/scan_ops.cc @@ -58,7 +58,7 @@ public: errors::InvalidArgument("ScanOp: Expected scan axis in the range [", 0, ", ", input.dims(), "), but got ", axis)); - TensorShape output_shape = input.shape(); + const TensorShape& output_shape = input.shape(); Tensor* output = nullptr; OP_REQUIRES_OK(ctx, ctx->allocate_output(0, output_shape, &output)); diff --git a/tensorflow/core/lib/jpeg/jpeg_mem.cc b/tensorflow/core/lib/jpeg/jpeg_mem.cc index 9a317f1fd2c..ac12798322b 100644 --- a/tensorflow/core/lib/jpeg/jpeg_mem.cc +++ b/tensorflow/core/lib/jpeg/jpeg_mem.cc @@ -23,6 +23,7 @@ limitations under the License. 
#include #include #include +#include #include "tensorflow/core/lib/jpeg/jpeg_handle.h" #include "tensorflow/core/platform/logging.h" @@ -52,7 +53,7 @@ class FewerArgsForCompiler { : datasize_(datasize), flags_(flags), pnwarn_(nwarn), - allocate_output_(allocate_output), + allocate_output_(std::move(allocate_output)), height_read_(0), height_(0), stride_(0) { diff --git a/tensorflow/core/platform/cloud/gcs_file_system.cc b/tensorflow/core/platform/cloud/gcs_file_system.cc index 7426006cec6..fc35c293d27 100644 --- a/tensorflow/core/platform/cloud/gcs_file_system.cc +++ b/tensorflow/core/platform/cloud/gcs_file_system.cc @@ -92,7 +92,7 @@ class GcsRandomAccessFile : public RandomAccessFile { : bucket_(bucket), object_(object), auth_provider_(auth_provider), - http_request_factory_(std::move(http_request_factory)), + http_request_factory_(http_request_factory), read_ahead_bytes_(read_ahead_bytes) {} /// The implementation of reads with a read-ahead buffer. @@ -189,7 +189,7 @@ class GcsWritableFile : public WritableFile { : bucket_(bucket), object_(object), auth_provider_(auth_provider), - http_request_factory_(std::move(http_request_factory)) { + http_request_factory_(http_request_factory) { if (GetTmpFilename(&tmp_content_filename_).ok()) { outfile_.open(tmp_content_filename_, std::ofstream::binary | std::ofstream::app); @@ -208,7 +208,7 @@ class GcsWritableFile : public WritableFile { : bucket_(bucket), object_(object), auth_provider_(auth_provider), - http_request_factory_(std::move(http_request_factory)) { + http_request_factory_(http_request_factory) { tmp_content_filename_ = tmp_content_filename; outfile_.open(tmp_content_filename_, std::ofstream::binary | std::ofstream::app); diff --git a/tensorflow/core/util/tensor_slice_reader.cc b/tensorflow/core/util/tensor_slice_reader.cc index 9ab81af43b1..b40f5e77369 100644 --- a/tensorflow/core/util/tensor_slice_reader.cc +++ b/tensorflow/core/util/tensor_slice_reader.cc @@ -15,6 +15,7 @@ limitations under the License. 
#include "tensorflow/core/util/tensor_slice_reader.h" +#include #include #include "tensorflow/core/framework/types.pb_text.h" #include "tensorflow/core/framework/versions.h" @@ -107,7 +108,7 @@ TensorSliceReader::TensorSliceReader(const string& filepattern, TensorSliceReader::TensorSliceReader(const string& filepattern, OpenTableFunction open_function, int preferred_shard) - : filepattern_(filepattern), open_function_(open_function) { + : filepattern_(filepattern), open_function_(std::move(open_function)) { VLOG(1) << "TensorSliceReader for " << filepattern; Status s = io::GetMatchingFiles(Env::Default(), filepattern, &fnames_); if (!s.ok()) { diff --git a/tensorflow/core/util/tensor_slice_set.cc b/tensorflow/core/util/tensor_slice_set.cc index d4b9a4087cc..4217df90ca1 100644 --- a/tensorflow/core/util/tensor_slice_set.cc +++ b/tensorflow/core/util/tensor_slice_set.cc @@ -42,7 +42,7 @@ Status TensorSliceSet::Register(const TensorSlice& slice, const string& tag, // We check if there is any intersection between this slice and any of the // registered slices. if (slices_hull_.Overlaps(slice)) { - for (const auto x : slices_) { + for (const auto& x : slices_) { if (slice.Overlaps(x.second.slice)) { return errors::Internal("Overlapping slices: existing slice = ", x.first, ", new slice = ", str); @@ -89,7 +89,7 @@ bool TensorSliceSet::Query(const TensorSlice& slice, float* data) const { int64 overlap_size = 0; TensorSlice intersection; TensorShape inter_shape; - for (const auto x : slices_) { + for (const auto& x : slices_) { if (slice.Intersect(x.second.slice, &intersection)) { s = intersection.SliceTensorShape(shape_, &inter_shape); if (!s.ok()) { @@ -103,7 +103,7 @@ bool TensorSliceSet::Query(const TensorSlice& slice, float* data) const { // We have it! 
// Now we need to copy the data to "data" if (data) { - for (const auto x : slices_) { + for (const auto& x : slices_) { CopyDataFromTensorSliceToTensorSlice(shape_, x.second.slice, slice, x.second.data, data); } @@ -146,7 +146,7 @@ bool TensorSliceSet::QueryMeta( int64 overlap_size = 0; TensorSlice intersection; TensorShape inter_shape; - for (const auto x : slices_) { + for (const auto& x : slices_) { if (slice.Intersect(x.second.slice, &intersection)) { s = intersection.SliceTensorShape(shape_, &inter_shape); if (!s.ok()) { @@ -180,7 +180,7 @@ Status RegisterTensorSlice( tensor_slices->insert(std::make_pair(name, tss)); } else { // Check if the shapes match - TensorShape tss_shape(tss->shape()); + const TensorShape& tss_shape(tss->shape()); if (!shape.IsSameSize(tss_shape)) { return errors::Internal("Incompatible tensor shapes detected for tensor ", name, ": existing = ", tss_shape.DebugString(), diff --git a/tensorflow/core/util/tensor_slice_writer.cc b/tensorflow/core/util/tensor_slice_writer.cc index 8907aa65227..928d6fe72c7 100644 --- a/tensorflow/core/util/tensor_slice_writer.cc +++ b/tensorflow/core/util/tensor_slice_writer.cc @@ -15,6 +15,8 @@ limitations under the License. 
#include "tensorflow/core/util/tensor_slice_writer.h" +#include + #include "tensorflow/core/lib/core/errors.h" #include "tensorflow/core/lib/io/table_builder.h" #include "tensorflow/core/lib/random/random.h" @@ -81,7 +83,7 @@ Status CreateTableTensorSliceBuilder(const string& name, TensorSliceWriter::TensorSliceWriter(const string& filename, CreateBuilderFunction create_builder) : filename_(filename), - create_builder_(create_builder), + create_builder_(std::move(create_builder)), tmpname_(strings::StrCat(filename, ".tempstate", random::New64())), slices_(0) { VersionDef* versions = sts_.mutable_meta()->mutable_versions(); diff --git a/tensorflow/tools/proto_text/gen_proto_text_functions_lib.cc b/tensorflow/tools/proto_text/gen_proto_text_functions_lib.cc index 77484a4d528..a5b0f03a25a 100644 --- a/tensorflow/tools/proto_text/gen_proto_text_functions_lib.cc +++ b/tensorflow/tools/proto_text/gen_proto_text_functions_lib.cc @@ -294,7 +294,7 @@ void Generator::AppendFieldValueAppend(const FieldDescriptor& field, } void Generator::AppendFieldAppend(const FieldDescriptor& field) { - const string name = field.name(); + const string& name = field.name(); if (field.is_map()) { Print("{").Nest(); @@ -445,7 +445,7 @@ void Generator::AppendParseMessageFunction(const Descriptor& md) { Unnest().Print("}"); for (int i = 0; i < md.field_count(); ++i) { const FieldDescriptor* field = md.field(i); - const string field_name = field->name(); + const string& field_name = field->name(); string mutable_value_expr; string set_value_prefix; if (map_append) { @@ -530,7 +530,7 @@ void Generator::AppendParseMessageFunction(const Descriptor& md) { for (int enum_i = 0; enum_i < enum_d->value_count(); ++enum_i) { const auto* value_d = enum_d->value(enum_i); - const string value_name = value_d->name(); + const string& value_name = value_d->name(); string condition = StrCat("value == \"", value_name, "\" || value == \"", value_d->number(), "\""); if (value_d->number() == 0) { From 
bc5df827de21d349e4b1c8111d177cc30c9ab787 Mon Sep 17 00:00:00 2001 From: Eugene Brevdo Date: Wed, 31 Aug 2016 16:05:49 -0800 Subject: [PATCH 09/89] Add bucketing input helpers to tf.contrib.training. Change: 131891671 --- tensorflow/contrib/training/BUILD | 13 + tensorflow/contrib/training/__init__.py | 12 + .../training/python/training/bucket_ops.py | 374 ++++++++++++++++++ .../python/training/bucket_ops_test.py | 356 +++++++++++++++++ 4 files changed, 755 insertions(+) create mode 100644 tensorflow/contrib/training/python/training/bucket_ops.py create mode 100644 tensorflow/contrib/training/python/training/bucket_ops_test.py diff --git a/tensorflow/contrib/training/BUILD b/tensorflow/contrib/training/BUILD index a44143ba406..79901b6ee56 100644 --- a/tensorflow/contrib/training/BUILD +++ b/tensorflow/contrib/training/BUILD @@ -11,6 +11,7 @@ py_library( name = "training_py", srcs = [ "__init__.py", + "python/training/bucket_ops.py", "python/training/sampling_ops.py", "python/training/sequence_queueing_state_saver.py", ], @@ -67,6 +68,18 @@ py_test( ], ) +py_test( + name = "bucket_ops_test", + size = "medium", + srcs = ["python/training/bucket_ops_test.py"], + srcs_version = "PY2AND3", + deps = [ + ":training_py", + "//tensorflow:tensorflow_py", + "//tensorflow/python:framework_test_lib", + ], +) + filegroup( name = "all_files", srcs = glob( diff --git a/tensorflow/contrib/training/__init__.py b/tensorflow/contrib/training/__init__.py index d8cd9058008..3c0ff0f8cfa 100644 --- a/tensorflow/contrib/training/__init__.py +++ b/tensorflow/contrib/training/__init__.py @@ -38,6 +38,17 @@ balanced. @@stratified_sample @@stratified_sample_unknown_dist + +## Bucketing + +Use ['bucket'](#bucket) or +['bucket_by_sequence_length'](#bucket_by_sequence_length) to stratify +minibatches into groups ("buckets"). Use `bucket_by_sequence_length` +with the argument `dynamic_pad=True` to receive minibatches of similarly +sized sequences for efficient training via `dynamic_rnn`. 
+ +@@bucket +@@bucket_by_sequence_length """ from __future__ import absolute_import @@ -45,6 +56,7 @@ from __future__ import division from __future__ import print_function # pylint: disable=unused-import,wildcard-import +from tensorflow.contrib.training.python.training.bucket_ops import * from tensorflow.contrib.training.python.training.sampling_ops import * from tensorflow.contrib.training.python.training.sequence_queueing_state_saver import * from tensorflow.python.util.all_util import make_all diff --git a/tensorflow/contrib/training/python/training/bucket_ops.py b/tensorflow/contrib/training/python/training/bucket_ops.py new file mode 100644 index 00000000000..3a28c9141fa --- /dev/null +++ b/tensorflow/contrib/training/python/training/bucket_ops.py @@ -0,0 +1,374 @@ +# Copyright 2016 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +"""Operations for bucketing data into groups. + +The classes and functions in this module are used to queue up data into +buckets conditional on side information (e.g. sequence length). 
+""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import functools + +import numpy as np + +from tensorflow.python.framework import constant_op +from tensorflow.python.framework import dtypes +from tensorflow.python.framework import errors +from tensorflow.python.framework import ops +from tensorflow.python.framework import tensor_shape +from tensorflow.python.framework import tensor_util +from tensorflow.python.ops import array_ops +from tensorflow.python.ops import control_flow_ops +from tensorflow.python.ops import data_flow_ops +from tensorflow.python.ops import logging_ops +from tensorflow.python.ops import math_ops +from tensorflow.python.training import input as input_py +from tensorflow.python.training import queue_runner + + +# pylint: disable=protected-access +_as_original_type = input_py._as_original_type +_as_tensor_list = input_py._as_tensor_list +_deserialize_sparse_tensors = input_py._deserialize_sparse_tensors +_dtypes = input_py._dtypes +_serialize_sparse_tensors = input_py._serialize_sparse_tensors +_shapes = input_py._shapes +_which_queue = input_py._which_queue +# pylint: enable=protected-access + + +def _validate_bucket(tensor_list): + tensor_list = ops.convert_n_to_tensor_or_indexed_slices(tensor_list) + if not tensor_list: + raise ValueError("Expected at least one tensor in bucket().") + return tensor_list + + +def bucket(tensors, + which_bucket, + batch_size, + num_buckets, + num_threads=1, + capacity=32, + shapes=None, + dynamic_pad=False, + allow_smaller_final_batch=False, + keep_input=None, + shared_name=None, + name=None): + """Lazy bucketing of input tensors according to `which_bucket`. + + The argument `tensors` can be a list or a dictionary of tensors. + The value returned by the function will be of the same type + as `tensors`. + + The tensors entering this function are put into the bucket given by + `which_bucket`. Each bucket has its own queue. 
When a bucket contains + `batch_size` elements, this minibatch is pushed onto a top queue. The + tensors returned from this function are the result of dequeueing the + next minibatch from this top queue. + + This function is implemented using several queues. A `QueueRunner` for the + queues is added to the current `Graph`'s `QUEUE_RUNNER` collection. + + As the returned tensors are the result of a dequeue operation, evaluating + them will throw a `tf.errors.OutOfRangeError` when the input queue is + exhausted. If these tensors are feeding another input queue, its queue runner + will catch this exception; however, if they are used in your main thread + you are responsible for catching this yourself. + + *N.B.:* If `dynamic_pad` is `False`, you must ensure that either + (i) the `shapes` argument is passed, or (ii) all of the tensors in + `tensors` have fully-defined shapes. `ValueError` will be + raised if neither of these conditions holds. + + If `dynamic_pad` is `True`, it is sufficient that the *rank* of the + tensors is known, but individual dimensions may have shape `None`. + In this case, for each enqueue the dimensions with value `None` + may have a variable length; upon dequeue, the output tensors will be padded + on the right to the maximum shape of the tensors in the current minibatch. + For numbers, this padding takes value 0. For strings, this padding is + the empty string. See `PaddingFIFOQueue` for more info. + + If `allow_smaller_final_batch` is `True`, a smaller batch value than + `batch_size` is returned when the queues are closed and there are not enough + elements to fill the batch; otherwise the pending elements are discarded. + In addition, all output tensors' static shapes, as accessed via the + `get_shape()` method, will have a 0th `Dimension` value of `None`, and + operations that depend on fixed batch_size would fail. + + Args: + tensors: The list or dictionary of tensors, representing a single element, + to bucket.
Nested lists are not supported. + which_bucket: An `int32` scalar Tensor taking a value in `[0, num_buckets)`. + batch_size: The new batch size pulled from the queue + (python int or int32 scalar). + num_buckets: A python integer, the number of buckets. + num_threads: An integer. The number of threads enqueuing `tensors`. + capacity: An integer. The maximum number of minibatches in the top queue, + and also the maximum number of elements within each bucket. + shapes: (Optional) The shapes for each example. Defaults to the + inferred shapes for `tensors`. + dynamic_pad: Boolean. Allow variable dimensions in input shapes. + The given dimensions are padded upon dequeue so that tensors within a + batch have the same shapes. + allow_smaller_final_batch: (Optional) Boolean. If `True`, allow the final + batches to be smaller if there are insufficient items left in the queues. + keep_input: (Optional). A `bool` scalar Tensor. If provided, this tensor + controls whether the input is added to the queue or not. If it evaluates + to `True`, then `tensors` are added to the bucket; otherwise they are + dropped. This tensor essentially acts as a filtering mechanism. + The default behavior is to assume `keep_input=True`. + shared_name: (Optional). If set, the queues will be shared under the given + name across multiple sessions. + name: (Optional) A name for the operations. + + Returns: + A tuple `(bucket, outputs)` where `bucket` is + an `int32` scalar tensor and `outputs` is a list or + dictionary of batched outputs corresponding to elements of `tensors`. + Every step will receive a new bucket of outputs. + + Raises: + ValueError: If the `shapes` are not specified, and cannot be + inferred from the elements of `tensors`. 
+ """ + tensor_list = _as_tensor_list(tensors) + with ops.name_scope(name, "bucket", tensor_list) as name: + tensor_list = _validate_bucket(tensor_list) + (tensor_list, sparse_info) = _serialize_sparse_tensors( + tensor_list, enqueue_many=False) + + # Round-trip batch_size to a tensor, and possibly back + batch_size = ops.convert_to_tensor( + batch_size, dtype=dtypes.int32, name="batch_size") + static_batch_size = tensor_util.constant_value(batch_size) + batch_size = ( + static_batch_size if static_batch_size is not None else batch_size) + + types = _dtypes([tensor_list]) + shapes = _shapes([tensor_list], shapes, enqueue_many=False) + + which_bucket = ops.convert_to_tensor( + which_bucket, dtype=dtypes.int32, name="which_bucket") + + queue_creator = _which_queue(dynamic_pad) + bucket_queues = [] + for i in range(num_buckets): + shared_name_i = ( + "%s_%d" % (shared_name, i) if shared_name is not None else None) + bucket_queues.append( + queue_creator(capacity=capacity, + dtypes=types, + shapes=shapes, + shared_name=shared_name_i, name="bucket_queue_%d" % i)) + + maybe_static_batch_size = ( + None if allow_smaller_final_batch else static_batch_size) + + bucket_shapes = [tensor_shape.vector(maybe_static_batch_size).concatenate(s) + for s in bucket_queues[0].shapes] + # top_queue is a PaddingFIFOQueue even if the bucket queues are regular FIFO + # queues because if we use allow_smaller_final_batch, shapes will + # contain Nones in their first entry; as a result, a regular + # FIFOQueue would die when being passed shapes that are not fully defined. 
+ top_queue = data_flow_ops.PaddingFIFOQueue( + capacity=capacity, + dtypes=[dtypes.int32] + types, + shapes=[tensor_shape.scalar()] + bucket_shapes, + shared_name=shared_name, name="top_queue") + + def enqueue_which(): + def enqueue_single(i): + return bucket_queues[i].enqueue(tensor_list) + enqueues = [ + control_flow_ops.cond( + math_ops.equal(which_bucket, i), + functools.partial(enqueue_single, i), + control_flow_ops.no_op) + for i in range(num_buckets)] + return control_flow_ops.group(*enqueues, name="group_enqueues") + + if keep_input is not None: + # TODO(ebrevdo): Expand keep_input param to core training + # methods, and pipe through to _serialize_sparse_tensors; so + # that expensive serialization is guarded by keep_input. + maybe_enqueue = control_flow_ops.cond( + keep_input, + enqueue_which, + control_flow_ops.no_op) + else: + maybe_enqueue = enqueue_which() + + bucket_enqueue_ops = [maybe_enqueue] * num_threads + + if allow_smaller_final_batch: + which_dequeue = lambda q: q.dequeue_up_to + else: + which_dequeue = lambda q: q.dequeue_many + + enqueues_to_top = [ + top_queue.enqueue( + [constant_op.constant(i)] + + which_dequeue(q)(batch_size, name="read_bucket_%d" % i), + name="enqueue_from_bucket_%d" % i) + for i, q in enumerate(bucket_queues)] + + for i, q in enumerate(bucket_queues): + queue_runner.add_queue_runner(queue_runner.QueueRunner( + q, [enqueues_to_top[i]], + queue_closed_exception_types=( + errors.OutOfRangeError, errors.CancelledError))) + queue_runner.add_queue_runner(queue_runner.QueueRunner( + top_queue, bucket_enqueue_ops, + queue_closed_exception_types=( + errors.OutOfRangeError, errors.CancelledError))) + + for q in bucket_queues: + logging_ops.scalar_summary( + "bucket/%s/size" % q.name, + math_ops.cast(top_queue.size(), dtypes.float32)) + logging_ops.scalar_summary( + "bucket/%s/fraction_of_%d_full" % (top_queue.name, capacity), + math_ops.cast(top_queue.size(), dtypes.float32) * (1. 
/ capacity)) + + dequeued = top_queue.dequeue(name="dequeue_top") + which_bucket_dequeued = dequeued[0] + dequeued = dequeued[1:] + dequeued = _deserialize_sparse_tensors(dequeued, sparse_info) + return (which_bucket_dequeued, _as_original_type(tensors, dequeued)) + + +def bucket_by_sequence_length(input_length, + tensors, + batch_size, + bucket_boundaries, + num_threads=1, + capacity=32, + shapes=None, + dynamic_pad=False, + allow_smaller_final_batch=False, + keep_input=None, + shared_name=None, + name=None): + """Lazy bucketing of inputs according to their length. + + This method calls `tf.contrib.training.bucket` under the hood, after first + subdividing the bucket boundaries into separate buckets and identifying which + bucket the given `input_length` belongs to. See the documentation for + `bucket` for details of the other arguments. + + Args: + input_length: `int32` scalar `Tensor`, the sequence length of tensors. + tensors: The list or dictionary of tensors, representing a single element, + to bucket. Nested lists are not supported. + batch_size: The new batch size pulled from the queue + (python int or int32 scalar). + bucket_boundaries: int list, increasing non-negative numbers. + The edges of the buckets to use when bucketing tensors. Two extra buckets + are created, one for `input_length < bucket_boundaries[0]` and + one for `input_length >= bucket_boundaries[-1]`. + num_threads: An integer. The number of threads enqueuing `tensors`. + capacity: An integer. The maximum number of minibatches in the top queue, + and also the maximum number of elements within each bucket. + shapes: (Optional) The shapes for each example. Defaults to the + inferred shapes for `tensors`. + dynamic_pad: Boolean. Allow variable dimensions in input shapes. + The given dimensions are padded upon dequeue so that tensors within a + batch have the same shapes. + allow_smaller_final_batch: (Optional) Boolean.
If `True`, allow the final + batches to be smaller if there are insufficient items left in the queues. + keep_input: (Optional). A `bool` scalar Tensor. If provided, this tensor + controls whether the input is added to the queue or not. If it evaluates + `True`, then `tensors` are added to the bucket; otherwise they are + dropped. This tensor essentially acts as a filtering mechanism. + The default behavior is to assume `keep_input=True`. + shared_name: (Optional). If set, the queues will be shared under the given + name across multiple sessions. + name: (Optional) A name for the operations. + + Returns: + A tuple `(sequence_length, outputs)` where `sequence_length` is + a 1-D `Tensor` of size `batch_size` and `outputs` is a list or dictionary + of batched, bucketed, outputs corresponding to elements of `tensors`. + + Raises: + TypeError: if `bucket_boundaries` is not a list of python integers. + ValueError: if `bucket_boundaries` is empty or contains non-increasing + values. + """ + tensor_list = _as_tensor_list(tensors) + if not isinstance(bucket_boundaries, (list, tuple)): + raise TypeError( + "bucket_boundaries must be a list or tuple, but received: %s" + % bucket_boundaries) + if not bucket_boundaries: + raise ValueError("bucket_boundaries must not be empty") + for (s, e) in zip(bucket_boundaries[:-1], bucket_boundaries[1:]): + if not isinstance(s, int) or not isinstance(e, int): + raise TypeError( + "bucket boundaries must be integers, but saw: %s and %s" % (s, e)) + if s >= e: + raise ValueError( + "Buckets must contain sequential increasing lengths, but saw: " + "%d before %d" % (s, e)) + + with ops.name_scope(name, "bucket_by_sequence_length", + [input_length] + tensor_list) as name: + input_length = ops.convert_to_tensor( + input_length, dtype=dtypes.int32, name="input_length") + # Bucketing conditions are: + # l < b[0] + # b[0] <= l < b[1] + # b[1] <= l < b[2] + # ... 
+ # b[N-2] <= l < b[N-1] + # b[N-1] <= l + # Equivalent to: + # [-inf, b[0], b[1], ..., b[N-1]] <= l < [b[0], b[1], ..., b[N-1], inf] + buckets_min = [np.iinfo(np.int32).min] + list(bucket_boundaries) + buckets_max = list(bucket_boundaries) + [np.iinfo(np.int32).max] + conditions_c = math_ops.logical_and( + math_ops.less_equal(buckets_min, input_length), + math_ops.less(input_length, buckets_max)) + which_bucket = math_ops.reduce_min(array_ops.where(conditions_c)) + which_bucket = math_ops.to_int32(which_bucket) + + if shapes is not None: + shapes = [tensor_shape.scalar()] + shapes + + _, dequeued = bucket( + tensors=[input_length] + tensor_list, + which_bucket=which_bucket, + batch_size=batch_size, + num_buckets=len(bucket_boundaries) + 1, + num_threads=num_threads, + capacity=capacity, + shapes=shapes, + dynamic_pad=dynamic_pad, + allow_smaller_final_batch=allow_smaller_final_batch, + keep_input=keep_input, + shared_name=shared_name) + + return (dequeued[0], _as_original_type(tensors, dequeued[1:])) + + +__all__ = [ + "bucket", + "bucket_by_sequence_length" +] diff --git a/tensorflow/contrib/training/python/training/bucket_ops_test.py b/tensorflow/contrib/training/python/training/bucket_ops_test.py new file mode 100644 index 00000000000..587cf9411ce --- /dev/null +++ b/tensorflow/contrib/training/python/training/bucket_ops_test.py @@ -0,0 +1,356 @@ +# Copyright 2016 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== + +"""Tests for tf.contrib.training.bucket.""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import random + +import numpy as np +import tensorflow as tf + + +def _which_bucket(bucket_edges, v): + """Identify which bucket v falls into. + + Args: + bucket_edges: int array, bucket edges + v: int scalar, index + Returns: + int scalar, the bucket. + If v < bucket_edges[0], return 0. + If bucket_edges[0] <= v < bucket_edges[1], return 1. + ... + If bucket_edges[-2] <= v < bucket_edges[-1], return len(bucket_edges). + If v >= bucket_edges[-1], return len(bucket_edges) + 1 + """ + v = np.asarray(v) + full = [0] + bucket_edges + found = np.where(np.logical_and(v >= full[:-1], v < full[1:]))[0] + if not found.size: + return len(full) + return found[0] + + +class BucketTest(tf.test.TestCase): + + def setUp(self): + tf.reset_default_graph() + + self.scalar_int_feed = tf.placeholder(tf.int32, ()) + self.unk_int64_feed = tf.placeholder(tf.int64, (None,)) + self.vec3_str_feed = tf.placeholder(tf.string, (3,)) + + self._coord = tf.train.Coordinator() + # Make capacity very large so we can feed all the inputs in the + # main thread without blocking + input_queue = tf.PaddingFIFOQueue( + 5000, + dtypes=[tf.int32, tf.int64, tf.string], + shapes=[(), (None,), (3,)]) + + self._input_enqueue_op = input_queue.enqueue( + (self.scalar_int_feed, self.unk_int64_feed, self.vec3_str_feed)) + self.scalar_int, self.unk_int64, self.vec3_str = input_queue.dequeue() + self._threads = None + self._close_op = input_queue.close() + self._sess = None + + def enqueue_inputs(self, sess, feed_dict): + sess.run(self._input_enqueue_op, feed_dict=feed_dict) + + def start_queue_runners(self, sess): + # Store session to be able to close inputs later + if self._sess is None: + self._sess = sess + self._threads = tf.train.start_queue_runners(coord=self._coord) + 
+ def tearDown(self): + if self._sess is not None: + self._sess.run(self._close_op) + self._coord.request_stop() + self._coord.join(self._threads) + + def testSingleBucket(self): + bucketed_dynamic = tf.contrib.training.bucket( + tensors=[self.scalar_int, self.unk_int64, self.vec3_str], + which_bucket=tf.constant(0), + num_buckets=2, + batch_size=32, + num_threads=10, + dynamic_pad=True) + # Check shape inference on bucketing outputs + self.assertAllEqual( + [[32], [32, None], [32, 3]], + [out.get_shape().as_list() for out in bucketed_dynamic[1]]) + with self.test_session() as sess: + for v in range(32): + self.enqueue_inputs( + sess, + {self.scalar_int_feed: v, + self.unk_int64_feed: v * [v], + self.vec3_str_feed: 3 * [str(v)]}) + self.start_queue_runners(sess) + + # Get a single minibatch + bucketed_values = sess.run(bucketed_dynamic) + + # (which_bucket, bucket_tensors). + self.assertEqual(2, len(bucketed_values)) + + # Count number of bucket_tensors. + self.assertEqual(3, len(bucketed_values[1])) + + # Ensure bucket 0 was used for all minibatch entries. + self.assertAllEqual(0, bucketed_values[0]) + + expected_scalar_int = np.arange(32) + expected_unk_int64 = np.zeros((32, 31)).astype(np.int64) + for i in range(32): + expected_unk_int64[i, :i] = i + expected_vec3_str = np.vstack(3 * [np.arange(32).astype(bytes)]).T + + # Must resort the output because num_threads > 1 leads to + # sometimes-inconsistent insertion order. 
+ resort = np.argsort(bucketed_values[1][0]) + self.assertAllEqual(expected_scalar_int, bucketed_values[1][0][resort]) + self.assertAllEqual(expected_unk_int64, bucketed_values[1][1][resort]) + self.assertAllEqual(expected_vec3_str, bucketed_values[1][2][resort]) + + def testEvenOddBuckets(self): + which_bucket = (self.scalar_int % 2) + bucketed_dynamic = tf.contrib.training.bucket( + tensors=[self.scalar_int, self.unk_int64, self.vec3_str], + which_bucket=which_bucket, + num_buckets=2, + batch_size=32, + num_threads=10, + dynamic_pad=True) + # Check shape inference on bucketing outputs + self.assertAllEqual( + [[32], [32, None], [32, 3]], + [out.get_shape().as_list() for out in bucketed_dynamic[1]]) + with self.test_session() as sess: + for v in range(64): + self.enqueue_inputs( + sess, + {self.scalar_int_feed: v, + self.unk_int64_feed: v * [v], + self.vec3_str_feed: 3 * [str(v)]}) + self.start_queue_runners(sess) + + # Get two minibatches (one containing even values, one containing odds) + bucketed_values_0 = sess.run(bucketed_dynamic) + bucketed_values_1 = sess.run(bucketed_dynamic) + + # (which_bucket, bucket_tensors). + self.assertEqual(2, len(bucketed_values_0)) + self.assertEqual(2, len(bucketed_values_1)) + + # Count number of bucket_tensors. + self.assertEqual(3, len(bucketed_values_0[1])) + self.assertEqual(3, len(bucketed_values_1[1])) + + # Figure out which output has the even values (there's + # randomness due to the multithreaded nature of bucketing) + if bucketed_values_0[0] % 2 == 1: + bucketed_values_even, bucketed_values_odd = ( + bucketed_values_1, bucketed_values_0) + else: + bucketed_values_even, bucketed_values_odd = ( + bucketed_values_0, bucketed_values_1) + + # Ensure bucket 0 was used for all minibatch entries. 
+ self.assertAllEqual(0, bucketed_values_even[0]) + self.assertAllEqual(1, bucketed_values_odd[0]) + + # Test the first bucket outputted, the evens starting at 0 + expected_scalar_int = np.arange(0, 32 * 2, 2) + expected_unk_int64 = np.zeros((32, 31 * 2)).astype(np.int64) + for i in range(0, 32): + expected_unk_int64[i, :2*i] = 2*i + expected_vec3_str = np.vstack( + 3 * [np.arange(0, 32 * 2, 2).astype(bytes)]).T + + # Must resort the output because num_threads > 1 leads to + # sometimes-inconsistent insertion order. + resort = np.argsort(bucketed_values_even[1][0]) + self.assertAllEqual(expected_scalar_int, + bucketed_values_even[1][0][resort]) + self.assertAllEqual(expected_unk_int64, + bucketed_values_even[1][1][resort]) + self.assertAllEqual(expected_vec3_str, + bucketed_values_even[1][2][resort]) + + # Test the second bucket outputted, the odds starting at 1 + expected_scalar_int = np.arange(1, 32 * 2 + 1, 2) + expected_unk_int64 = np.zeros((32, 31 * 2 + 1)).astype(np.int64) + for i in range(0, 32): + expected_unk_int64[i, :2*i + 1] = 2*i + 1 + expected_vec3_str = np.vstack( + 3 * [np.arange(1, 32 * 2 + 1, 2).astype(bytes)]).T + + # Must resort the output because num_threads > 1 leads to + # sometimes-inconsistent insertion order.
+ resort = np.argsort(bucketed_values_odd[1][0]) + self.assertAllEqual(expected_scalar_int, + bucketed_values_odd[1][0][resort]) + self.assertAllEqual(expected_unk_int64, + bucketed_values_odd[1][1][resort]) + self.assertAllEqual(expected_vec3_str, + bucketed_values_odd[1][2][resort]) + + def testEvenOddBucketsFilterOutAllOdd(self): + which_bucket = (self.scalar_int % 2) + keep_input = tf.equal(which_bucket, 0) + bucketed_dynamic = tf.contrib.training.bucket( + tensors=[self.scalar_int, self.unk_int64, self.vec3_str], + which_bucket=which_bucket, + num_buckets=2, + batch_size=32, + num_threads=10, + keep_input=keep_input, + dynamic_pad=True) + # Check shape inference on bucketing outputs + self.assertAllEqual( + [[32], [32, None], [32, 3]], + [out.get_shape().as_list() for out in bucketed_dynamic[1]]) + with self.test_session() as sess: + for v in range(128): + self.enqueue_inputs( + sess, + {self.scalar_int_feed: v, + self.unk_int64_feed: v * [v], + self.vec3_str_feed: 3 * [str(v)]}) + self.start_queue_runners(sess) + + # Get two minibatches ([0, 2, ...] and [64, 66, ...]) + bucketed_values_even0 = sess.run(bucketed_dynamic) + bucketed_values_even1 = sess.run(bucketed_dynamic) + + # Ensure that bucket 1 was completely filtered out + self.assertAllEqual(0, bucketed_values_even0[0]) + self.assertAllEqual(0, bucketed_values_even1[0]) + + # Merge their output for sorting and comparison + bucketed_values_all_elem0 = np.concatenate( + (bucketed_values_even0[1][0], + bucketed_values_even1[1][0])) + + self.assertAllEqual( + np.arange(0, 128, 2), sorted(bucketed_values_all_elem0)) + + +class BucketBySequenceLengthTest(tf.test.TestCase): + + def _testBucketBySequenceLength(self, allow_small_batch): + tf.reset_default_graph() + + # All inputs must be identical lengths across tuple index. + # The input reader will get input_length from the first tuple + # entry. 
+ data_len = 4 + target_len = 3 + input_pairs = [ + (length, + ([np.int64(length)] * data_len, + [str(length).encode("ascii")] * target_len)) + for length in (1, 3, 4, 5, 6, 10)] + + lengths = tf.placeholder(tf.int32, ()) + data = tf.placeholder(tf.int64, (data_len,)) + targets = tf.placeholder(tf.string, (target_len,)) + + batch_size = 8 + bucket_boundaries = [3, 4, 5, 10] + + # Make capacity very large so we can feed all the inputs in the + # main thread without blocking + input_queue = tf.FIFOQueue( + 5000, (tf.int32, tf.int64, tf.string), + ((), (data_len,), (target_len,))) + input_enqueue_op = input_queue.enqueue((lengths, data, targets)) + lengths_t, data_t, targets_t = input_queue.dequeue() + close_input_op = input_queue.close() + + (out_lengths_t, data_and_targets_t) = ( + tf.contrib.training.bucket_by_sequence_length( + input_length=lengths_t, + tensors=[data_t, targets_t], + batch_size=batch_size, + bucket_boundaries=bucket_boundaries, + allow_smaller_final_batch=allow_small_batch, + num_threads=10)) + + expected_batch_size = None if allow_small_batch else batch_size + self.assertEqual(out_lengths_t.get_shape().as_list(), + [expected_batch_size]) + self.assertEqual(data_and_targets_t[0].get_shape().as_list(), + [expected_batch_size, data_len]) + self.assertEqual(data_and_targets_t[1].get_shape().as_list(), + [expected_batch_size, target_len]) + + def _read_test(sess): + for _ in range(50): + (out_lengths, (data, targets)) = sess.run( + (out_lengths_t, data_and_targets_t)) + if allow_small_batch: + self.assertEqual(data_len, data.shape[1]) + self.assertEqual(target_len, targets.shape[1]) + self.assertGreaterEqual(batch_size, out_lengths.shape[0]) + self.assertGreaterEqual(batch_size, data.shape[0]) + self.assertGreaterEqual(batch_size, targets.shape[0]) + else: + self.assertEqual((batch_size, data_len), data.shape) + self.assertEqual((batch_size, target_len), targets.shape) + self.assertEqual((batch_size,), out_lengths.shape) + for (lr, dr, tr) in 
zip(out_lengths, data, targets): + # Make sure length matches data (here it's the same value) + self.assertEqual(dr[0], lr) + # Make sure data & targets match + self.assertEqual(dr[0], int(tr[0].decode("ascii"))) + # Make sure for each row, data came from the same bucket. + self.assertEqual(_which_bucket(bucket_boundaries, dr[0]), + _which_bucket(bucket_boundaries, dr[1])) + + with self.test_session() as sess: + coord = tf.train.Coordinator() + + # Feed the inputs, then close the input thread. + for _ in range(50 * batch_size + 100): + which = random.randint(0, len(input_pairs) - 1) + length, pair = input_pairs[which] + sess.run(input_enqueue_op, feed_dict={ + lengths: length, data: pair[0], targets: pair[1]}) + sess.run(close_input_op) + + # Start the queue runners + threads = tf.train.start_queue_runners(coord=coord) + # Read off the top of the bucket and ensure correctness of output + _read_test(sess) + coord.request_stop() + coord.join(threads) + + def testBucketBySequenceLength(self): + self._testBucketBySequenceLength(allow_small_batch=False) + + def testBucketBySequenceLengthAllow(self): + self._testBucketBySequenceLength(allow_small_batch=True) + + +if __name__ == "__main__": + tf.test.main() From 1f4e610111e88b185bf50006fa0b96d407a6f62b Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 31 Aug 2016 16:09:58 -0800 Subject: [PATCH 10/89] Fix array_ops operations that support SparseTensor also to support SparseTensorValue. 
Change: 131892078 --- .../python/kernel_tests/array_ops_test.py | 32 ++++++++++- .../kernel_tests/edit_distance_op_test.py | 57 +++++++++++++------ tensorflow/python/ops/array_ops.py | 22 ++++--- 3 files changed, 86 insertions(+), 25 deletions(-) diff --git a/tensorflow/python/kernel_tests/array_ops_test.py b/tensorflow/python/kernel_tests/array_ops_test.py index 913e04bb95b..ef30f97efc4 100644 --- a/tensorflow/python/kernel_tests/array_ops_test.py +++ b/tensorflow/python/kernel_tests/array_ops_test.py @@ -18,7 +18,6 @@ from __future__ import absolute_import from __future__ import division from __future__ import print_function -import math import time import numpy as np @@ -717,5 +716,36 @@ class SliceAssignTest(test_util.TensorFlowTestCase): v = tf.Variable([1, 2]) sess.run(v[:].assign([1, 2])) + +class ShapeSizeRankTest(test_util.TensorFlowTestCase): + + def testDenseShape(self): + with self.test_session(): + t_value = [[0, 42], [24, 0]] + self.assertAllEqual((2, 2), tf.shape(t_value).eval()) + self.assertEqual(4, tf.size(t_value).eval()) + self.assertEqual(2, tf.rank(t_value).eval()) + + t = tf.constant(t_value) + self.assertAllEqual((2, 2), tf.shape(t).eval()) + self.assertEqual(4, tf.size(t).eval()) + self.assertEqual(2, tf.rank(t).eval()) + + def testSparseShape(self): + with self.test_session(): + sp_value = tf.SparseTensorValue( + indices=((0, 1), (1, 0)), + values=(42, 24), + shape=(2, 2)) + self.assertAllEqual((2, 2), tf.shape(sp_value).eval()) + self.assertEqual(4, tf.size(sp_value).eval()) + self.assertEqual(2, tf.rank(sp_value).eval()) + + sp = tf.SparseTensor.from_value(sp_value) + self.assertAllEqual((2, 2), tf.shape(sp).eval()) + self.assertEqual(4, tf.size(sp).eval()) + self.assertEqual(2, tf.rank(sp).eval()) + + if __name__ == "__main__": tf.test.main() diff --git a/tensorflow/python/kernel_tests/edit_distance_op_test.py b/tensorflow/python/kernel_tests/edit_distance_op_test.py index 6d5cf73fc55..4662b956cfe 100644 --- 
a/tensorflow/python/kernel_tests/edit_distance_op_test.py +++ b/tensorflow/python/kernel_tests/edit_distance_op_test.py @@ -31,26 +31,49 @@ def ConstantOf(x): class EditDistanceTest(tf.test.TestCase): - def _testEditDistance(self, hypothesis, truth, normalize, - expected_output, expected_err_re=None): - # hypothesis and truth are (index, value, shape) tuples - hypothesis_st = tf.SparseTensor(*[ConstantOf(x) for x in hypothesis]) - truth_st = tf.SparseTensor(*[ConstantOf(x) for x in truth]) + def _testEditDistanceST( + self, hypothesis_st, truth_st, normalize, expected_output, + expected_shape, expected_err_re=None): edit_distance = tf.edit_distance( hypothesis=hypothesis_st, truth=truth_st, normalize=normalize) - with self.test_session(): - if expected_err_re is None: - # Shape inference figures out the shape from the shape variables - # Explicit tuple() needed since zip returns an iterator in Python 3. - expected_shape = [ - max(h, t) for h, t in tuple(zip(hypothesis[2], truth[2]))[:-1]] - self.assertEqual(edit_distance.get_shape(), expected_shape) - output = edit_distance.eval() - self.assertAllClose(output, expected_output) - else: - with self.assertRaisesOpError(expected_err_re): - edit_distance.eval() + if expected_err_re is None: + self.assertEqual(edit_distance.get_shape(), expected_shape) + output = edit_distance.eval() + self.assertAllClose(output, expected_output) + else: + with self.assertRaisesOpError(expected_err_re): + edit_distance.eval() + + def _testEditDistance(self, hypothesis, truth, normalize, + expected_output, expected_err_re=None): + # Shape inference figures out the shape from the shape variables + # Explicit tuple() needed since zip returns an iterator in Python 3. + expected_shape = [ + max(h, t) for h, t in tuple(zip(hypothesis[2], truth[2]))[:-1]] + + # SparseTensorValue inputs. 
+ with tf.Graph().as_default() as g, self.test_session(g): + # hypothesis and truth are (index, value, shape) tuples + self._testEditDistanceST( + hypothesis_st=tf.SparseTensorValue( + *[ConstantOf(x) for x in hypothesis]), + truth_st=tf.SparseTensorValue(*[ConstantOf(x) for x in truth]), + normalize=normalize, + expected_output=expected_output, + expected_shape=expected_shape, + expected_err_re=expected_err_re) + + # SparseTensor inputs. + with tf.Graph().as_default() as g, self.test_session(g): + # hypothesis and truth are (index, value, shape) tuples + self._testEditDistanceST( + hypothesis_st=tf.SparseTensor(*[ConstantOf(x) for x in hypothesis]), + truth_st=tf.SparseTensor(*[ConstantOf(x) for x in truth]), + normalize=normalize, + expected_output=expected_output, + expected_shape=expected_shape, + expected_err_re=expected_err_re) def testEditDistanceNormalized(self): hypothesis_indices = [[0, 0], [0, 1], diff --git a/tensorflow/python/ops/array_ops.py b/tensorflow/python/ops/array_ops.py index 3b431f258f3..b4421ba640e 100644 --- a/tensorflow/python/ops/array_ops.py +++ b/tensorflow/python/ops/array_ops.py @@ -104,7 +104,9 @@ _baseslice = slice # Aliases for some automatically-generated names. listdiff = gen_array_ops.list_diff + def shape(input, name=None): + # pylint: disable=redefined-builtin """Returns the shape of a tensor. This operation returns a 1-D integer tensor representing the shape of `input`. @@ -127,6 +129,7 @@ def shape(input, name=None): def shape_internal(input, name=None, optimize=True): + # pylint: disable=redefined-builtin """Returns the shape of a tensor. Args: @@ -138,7 +141,7 @@ def shape_internal(input, name=None, optimize=True): A `Tensor` of type `int32`. 
""" with ops.name_scope(name, "Shape", [input]) as name: - if isinstance(input, ops.SparseTensor): + if isinstance(input, (ops.SparseTensor, ops.SparseTensorValue)): return gen_math_ops.cast(input.shape, dtypes.int32) else: input_tensor = ops.convert_to_tensor(input) @@ -149,6 +152,7 @@ def shape_internal(input, name=None, optimize=True): def size(input, name=None): + # pylint: disable=redefined-builtin """Returns the size of a tensor. This operation returns an integer representing the number of elements in @@ -172,6 +176,7 @@ def size(input, name=None): def size_internal(input, name=None, optimize=True): + # pylint: disable=redefined-builtin,protected-access """Returns the size of a tensor. Args: @@ -183,7 +188,7 @@ def size_internal(input, name=None, optimize=True): A `Tensor` of type `int32`. """ with ops.name_scope(name, "Size", [input]) as name: - if isinstance(input, ops.SparseTensor): + if isinstance(input, (ops.SparseTensor, ops.SparseTensorValue)): return gen_math_ops._prod(gen_math_ops.cast(input.shape, dtypes.int32), 0, name=name) else: @@ -195,6 +200,7 @@ def size_internal(input, name=None, optimize=True): def rank(input, name=None): + # pylint: disable=redefined-builtin """Returns the rank of a tensor. This operation returns an integer representing the rank of `input`. @@ -222,6 +228,7 @@ def rank(input, name=None): def rank_internal(input, name=None, optimize=True): + # pylint: disable=redefined-builtin """Returns the rank of a tensor. Args: @@ -233,7 +240,7 @@ def rank_internal(input, name=None, optimize=True): A `Tensor` of type `int32`. 
""" with ops.name_scope(name, "Rank", [input]) as name: - if isinstance(input, ops.SparseTensor): + if isinstance(input, (ops.SparseTensor, ops.SparseTensorValue)): return gen_array_ops.size(input.shape, name=name) else: input_tensor = ops.convert_to_tensor(input) @@ -341,6 +348,7 @@ def _SliceHelper(tensor, slice_spec, var=None): # pylint: disable=undefined-variable,protected-access def slice(input_, begin, size, name=None): + # pylint: disable=redefined-builtin """Extracts a slice from a tensor. This operation extracts a slice of size `size` from a tensor `input` starting @@ -2332,10 +2340,10 @@ def edit_distance(hypothesis, truth, normalize=True, name="edit_distance"): Raises: TypeError: If either `hypothesis` or `truth` are not a `SparseTensor`. """ - if not isinstance(hypothesis, ops.SparseTensor): - raise TypeError("Hypothesis must be a SparseTensor") - if not isinstance(truth, ops.SparseTensor): - raise TypeError("Truth must be a SparseTensor") + if not isinstance(hypothesis, (ops.SparseTensor, ops.SparseTensorValue)): + raise TypeError("Hypothesis must be a SparseTensor.") + if not isinstance(truth, (ops.SparseTensor, ops.SparseTensorValue)): + raise TypeError("Truth must be a SparseTensor.") return gen_array_ops._edit_distance(hypothesis.indices, hypothesis.values, From c35e69c56941d79163dc9f054f57c199b1a4cc44 Mon Sep 17 00:00:00 2001 From: Derek Murray Date: Wed, 31 Aug 2016 16:22:04 -0800 Subject: [PATCH 11/89] Raise an error when a client attempts to feed two values for the same tensor. This change makes TensorFlow more robust to incorrect uses of the C and C++ APIs (but the Python API already prevents duplicates). Fixes #4084. 
Change: 131893321 --- tensorflow/core/common_runtime/direct_session_test.cc | 8 ++++++++ tensorflow/core/graph/subgraph.cc | 10 +++++++++- 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/tensorflow/core/common_runtime/direct_session_test.cc b/tensorflow/core/common_runtime/direct_session_test.cc index 51f70e551b5..4eb48c7bcf3 100644 --- a/tensorflow/core/common_runtime/direct_session_test.cc +++ b/tensorflow/core/common_runtime/direct_session_test.cc @@ -397,6 +397,14 @@ TEST(DirectSessionTest, MultipleFeedTest) { ASSERT_EQ(2, outputs.size()); ASSERT_EQ(11.0, outputs[0].flat()(0)); ASSERT_EQ(22.0, outputs[1].flat()(0)); + + // Feed [first_const, first_const] + s = session->Run( + {{first_const->name(), value_11}, {first_const->name(), value_22}}, + {first_identity->name() + ":0", second_identity->name() + ":0"}, {}, + &outputs); + EXPECT_TRUE(errors::IsInvalidArgument(s)); + EXPECT_TRUE(StringPiece(s.error_message()).contains("fed more than once")); } REGISTER_OP("Darth") diff --git a/tensorflow/core/graph/subgraph.cc b/tensorflow/core/graph/subgraph.cc index b6a32b9ea7f..c2978bbcf4a 100644 --- a/tensorflow/core/graph/subgraph.cc +++ b/tensorflow/core/graph/subgraph.cc @@ -235,7 +235,15 @@ Status RewriteGraphForExecution( "Must specify at least one target to fetch or execute."); } - std::unordered_set endpoints(fed_outputs.begin(), fed_outputs.end()); + std::unordered_set endpoints; + for (const string& endpoint_name : fed_outputs) { + auto result = endpoints.insert(endpoint_name); + if (!result.second) { + return errors::InvalidArgument("Endpoint \"", endpoint_name, + "\" fed more than once."); + } + } + for (const auto& fetch : fetch_outputs) { if (endpoints.count(fetch) > 0) { return errors::InvalidArgument(fetch, " is both fed and fetched."); From 281d02fbeb3d14214bfe8d677aa72d005d680fa3 Mon Sep 17 00:00:00 2001 From: Renato Utsch Date: Wed, 31 Aug 2016 16:28:59 -0800 Subject: [PATCH 12/89] Refactor the sidebar to a single codebase With this, the 
next step is to integrate it to the tf-pane-helper automatically, so that creating new dashboards becomes quick and easy. Change: 131893958 --- .../tf-dashboard-common/dashboard-style.html | 35 +--- .../tf-sidebar-helper.html | 159 ++++++++++++++++++ .../tf-distribution-dashboard.html | 57 +++---- .../tf-event-dashboard.html | 146 ++++++++-------- .../tf-histogram-dashboard.html | 134 +++++---------- .../components/vz-line-chart/vz-line-chart.ts | 3 +- 6 files changed, 297 insertions(+), 237 deletions(-) create mode 100644 tensorflow/tensorboard/components/tf-dashboard-common/tf-sidebar-helper.html diff --git a/tensorflow/tensorboard/components/tf-dashboard-common/dashboard-style.html b/tensorflow/tensorboard/components/tf-dashboard-common/dashboard-style.html index 2126015b135..b6225ba5b23 100644 --- a/tensorflow/tensorboard/components/tf-dashboard-common/dashboard-style.html +++ b/tensorflow/tensorboard/components/tf-dashboard-common/dashboard-style.html @@ -21,10 +21,6 @@ limitations under the License. diff --git a/tensorflow/tensorboard/components/tf-dashboard-common/tf-sidebar-helper.html b/tensorflow/tensorboard/components/tf-dashboard-common/tf-sidebar-helper.html new file mode 100644 index 00000000000..e2e43fb084d --- /dev/null +++ b/tensorflow/tensorboard/components/tf-dashboard-common/tf-sidebar-helper.html @@ -0,0 +1,159 @@ + + + + + + + + + + + diff --git a/tensorflow/tensorboard/components/tf-distribution-dashboard/tf-distribution-dashboard.html b/tensorflow/tensorboard/components/tf-distribution-dashboard/tf-distribution-dashboard.html index bd3447ede86..e92586626e9 100644 --- a/tensorflow/tensorboard/components/tf-distribution-dashboard/tf-distribution-dashboard.html +++ b/tensorflow/tensorboard/components/tf-distribution-dashboard/tf-distribution-dashboard.html @@ -18,12 +18,10 @@ limitations under the License. - - - + @@ -33,17 +31,16 @@ limitations under the License. 
tf-distribution-dashboard is a complete frontend that loads runs from a backend, and creates chart panes that display data for those runs. -It provides a categorizer, run selector, and x type selector, by which the user -can customize how data is organized and displayed. +It provides an x type selector and the normal tf-sidebar-helper options, by +which the user can customize how data is organized and displayed. -Each chart has a button that can toggle whether it is "selected"; selectedRuns +Each chart has a button that can toggle whether it is "expanded"; expanded charts are larger. Organizationally, the #plumbing div contains components that have no concrete -manifestation and just effect data bindings or data loading. The #sidebar contains -shared controls like the tf-categorizer, tf-run-selector, and tf-x-type-selector. -The #center div contains tf-distribution-charts embedded inside -tf-collapsable-panes. +manifestation and just effect data bindings or data loading. The .sidebar div +contains shared controls provided by tf-sidebar-helper. The .center div +contains tf-distribution-charts embedded inside tf-panes-helper elements. -->