Edits to "Threading and Queues" section of programmers guide.

PiperOrigin-RevId: 161203568
Derek Murray 2017-07-07 08:20:48 -07:00, committed by TensorFlower Gardener
parent 81e81b796d
commit a86d439070

@@ -1,28 +1,44 @@

# Threading and Queues

Note: In versions of TensorFlow before 1.2, we recommended using multi-threaded,
queue-based input pipelines for performance. Beginning with TensorFlow 1.2,
however, we recommend using the `tf.contrib.data` module instead. (See
[Datasets](datasets) for details.) The `tf.contrib.data` module offers an
easier-to-use interface for constructing efficient input pipelines. Furthermore,
we've stopped developing the old multi-threaded, queue-based input pipelines.
We've retained the documentation in this file to help developers who are still
maintaining older code.

Multithreaded queues are a powerful and widely used mechanism supporting
asynchronous computation.

Following the [dataflow programming model](graphs.md), TensorFlow's queues are
implemented using nodes in the computation graph. A queue is a stateful node,
like a variable: other nodes can modify its content. In particular, nodes can
enqueue new items into the queue, or dequeue existing items from the queue.
TensorFlow's queues provide a way to coordinate multiple steps of a
computation: a queue will **block** any step that attempts to dequeue from it
when it is empty, or enqueue to it when it is full. When that condition no
longer holds, the queue will unblock the step and allow execution to proceed.

TensorFlow implements several classes of queue. The principal difference between
these classes is the order in which items are removed from the queue. To get a
feel for queues, let's consider a simple example. We will create a "first in,
first out" queue (@{tf.FIFOQueue}) and fill it with zeros. Then we'll construct
a graph that takes an item off the queue, adds one to that item, and puts it
back on the end of the queue. Slowly, the numbers on the queue increase.

<div style="width:70%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/IncremeterFifoQueue.gif">
</div>
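A minimal sketch of this incrementer, written against the TF 1.x queue API this
guide describes (the capacity of 3 and the number of steps are arbitrary
choices for illustration):

```python
import tensorflow as tf

# A FIFO queue with capacity 3, initially filled with zeros.
q = tf.FIFOQueue(3, tf.float32)
init = q.enqueue_many(([0., 0., 0.],))

# Take an item off the queue, add one, and put it back on the end.
x = q.dequeue()
y = x + 1
q_inc = q.enqueue([y])

with tf.Session() as sess:
    sess.run(init)
    for _ in range(10):
        sess.run(q_inc)  # each run increments one value on the queue
```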
`Enqueue`, `EnqueueMany`, and `Dequeue` are special nodes. They take a pointer
to the queue instead of a normal value, allowing them to mutate its state. We
recommend that you think of these operations as being like methods of the queue
in an object-oriented sense. In fact, in the Python API, these operations are
created by calling methods on a queue object (e.g. `q.enqueue(...)`).
Note: Queue methods (such as `q.enqueue(...)`) *must* run on the same device
as the queue. Incompatible device placement directives will be ignored when
creating these operations.
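For instance, in the following sketch the second device directive is
incompatible with the queue's placement and is ignored; the device strings are
just for illustration:

```python
with tf.device("/cpu:0"):
    q = tf.FIFOQueue(10, tf.float32)

with tf.device("/gpu:0"):
    # Despite this directive, the enqueue op is co-located with its
    # queue on /cpu:0; the incompatible placement is ignored.
    enqueue_op = q.enqueue([1.0])
```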
@@ -32,13 +48,13 @@ Now that you have a bit of a feel for queues, let's dive into the details...

Queues, such as @{tf.FIFOQueue}
and @{tf.RandomShuffleQueue},
are important TensorFlow objects that aid in computing tensors asynchronously
in a graph.

For example, a typical input pipeline uses a `RandomShuffleQueue` to
prepare inputs for training a model, as sketched after this list:

* Multiple threads prepare training examples and enqueue them in the queue.
* A training thread executes a training op that dequeues mini-batches from the
  queue.
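Here is a rough sketch of that architecture using plain Python threads;
`make_example()` is a hypothetical stand-in for real example preparation, and
the capacities and shapes are arbitrary:

```python
import threading
import tensorflow as tf

# A shuffling queue of 28x28 examples; min_after_dequeue controls how
# well the dequeued elements are mixed.
queue = tf.RandomShuffleQueue(capacity=1000, min_after_dequeue=100,
                              dtypes=[tf.float32], shapes=[[28, 28]])
example = tf.placeholder(tf.float32, shape=[28, 28])
enqueue_op = queue.enqueue([example])
batch = queue.dequeue_many(32)  # a mini-batch for the training op

def producer(sess):
    while True:
        data = make_example()  # hypothetical example-preparation function
        sess.run(enqueue_op, feed_dict={example: data})

sess = tf.Session()
for _ in range(4):  # multiple producer threads feed the queue
    t = threading.Thread(target=producer, args=(sess,))
    t.daemon = True  # don't block interpreter exit on these threads
    t.start()
# The training thread would now repeatedly run an op that consumes `batch`.
```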
@@ -46,7 +62,8 @@ This architecture has many benefits, as highlighted in the

@{$reading_data$Reading data how to}, which also gives an overview of
functions that simplify the construction of input pipelines.

The TensorFlow `Session` object is multithreaded and thread-safe, so multiple
threads can easily use the same session and run ops in parallel. However, it is
not always easy to implement a Python program that drives threads as described
above. All threads must be able to stop together, exceptions must be caught and
@@ -62,11 +79,12 @@ enqueue tensors in the same queue.

## Coordinator

The @{tf.train.Coordinator} class manages background threads in a TensorFlow
program and helps multiple threads stop together.

Its key methods are (see the sketch after this list):

* @{tf.train.Coordinator.should_stop}: returns `True` if the threads should stop.
* @{tf.train.Coordinator.request_stop}: requests that threads should stop.
* @{tf.train.Coordinator.join}: waits until the specified threads have stopped.
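A minimal sketch of the coordination pattern, where the loop body and stopping
condition are placeholders for real work:

```python
import threading
import tensorflow as tf

def my_loop(coord):
    while not coord.should_stop():
        do_some_work()      # placeholder for the thread's real work
        if work_is_done():  # placeholder for a real stopping condition
            coord.request_stop()

coord = tf.train.Coordinator()
threads = [threading.Thread(target=my_loop, args=(coord,)) for _ in range(4)]
for t in threads:
    t.start()
coord.join(threads)  # blocks until every thread has stopped
```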
@@ -105,10 +123,10 @@ also has support to capture and report exceptions. See the @{tf.train.Coordinat

## QueueRunner

The @{tf.train.QueueRunner} class creates a number of threads that repeatedly
run an enqueue op. These threads can use a coordinator to stop together. In
addition, a queue runner will run a *closer operation* that closes the queue if
an exception is reported to the coordinator.

You can use a queue runner to implement the architecture described above, as in
the sketch below.
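A self-contained sketch of that pattern; `tf.random_normal` stands in for real
example preparation, and the dequeue op stands in for a real training op:

```python
import tensorflow as tf

queue = tf.RandomShuffleQueue(capacity=1000, min_after_dequeue=100,
                              dtypes=[tf.float32], shapes=[[]])
example_op = tf.random_normal([])        # stand-in for example preparation
enqueue_op = queue.enqueue([example_op])
train_op = queue.dequeue_many(32)        # stand-in for a real training op

# Four threads will repeatedly run enqueue_op until told to stop.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
    try:
        for step in range(100):
            if coord.should_stop():
                break
            sess.run(train_op)
    finally:
        coord.request_stop()           # ask the enqueue threads to stop
        coord.join(enqueue_threads)    # and wait for them to do so
```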