STT-tensorflow/tensorflow/python/debug
Shanqing Cai 1a3b7af373 [tfdbg2] Support reading multiple DebugEvent file sets from the same tfdbg run
TensorFlow jobs that involve multiple hosts (e.g., parameter-server setups and
TPU coordinator-worker setups) can generate >1 DebugEvent file sets when
instrumented with tfdbg2's `tf.debugging.experimental.enable_dump_debug_info()`.

This CL adds to DebugEventsReader and DebugDataReader the capability to load
multiple file sets that belong to the same tfdbg_run_id.

PiperOrigin-RevId: 317765159
Change-Id: Ifcf593bd8b404e3e1c3a6f3f3be70bd6b8b73555
2020-06-22 17:56:53 -07:00
cli Merge pull request #34985 from kiszk:spelling_tweaks_python 2020-02-12 13:41:49 -08:00
examples [tfdbg2] A few fixes and improvements to example debug_mnist_v2 2020-06-15 08:50:39 -07:00
lib [tfdbg2] Support reading multiple DebugEvent file sets from the same tfdbg run 2020-06-22 17:56:53 -07:00
wrappers [tfdbg2] Fork local_cli_wrapper_test for keras related tests. 2020-06-17 13:20:04 -07:00
BUILD [tfdbg2] Fork local_cli_wrapper_test for keras related tests. 2020-06-17 13:20:04 -07:00
README.md
__init__.py

README.md

TensorFlow Debugger (TFDBG)

[TOC]

TensorFlow Debugger (TFDBG) is a specialized debugger for TensorFlow's computation graphs. It provides access to internal graph structures and tensor values at TensorFlow runtime.

Why TFDBG?

In TensorFlow's current computation-graph framework, almost all actual computation after graph construction happens in a single Python function, namely tf.Session.run. Basic Python debugging tools such as pdb cannot be used to debug Session.run, because TensorFlow's graph execution happens in the underlying C++ layer. C++ debugging tools such as gdb are not ideal either, because they cannot recognize and organize stack frames and variables in a way that is relevant to TensorFlow's operations, tensors, and other graph constructs.

TFDBG addresses these limitations. The following TFDBG features are designed to facilitate runtime debugging of TensorFlow models:

  • Easy access through session wrappers
  • Easy integration with common high-level APIs, such as TensorFlow Estimators and Keras
  • Inspection of runtime tensor values and node connections
  • Conditional breaking after runs that generate tensors satisfying given predicates, which makes common debugging tasks such as tracing the origin of infinities and NaNs easier
  • Association of nodes and tensors in graphs with Python source lines
  • Profiling of models at the level of graph nodes and Python source lines.
  • A gRPC-based remote debugging protocol, which allows us to build a browser-based graphical user interface (GUI) for TFDBG: the TensorBoard Debugger Plugin.
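
The conditional-breaking feature listed above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch of the idea only, not TFDBG's API: the names `contains_inf_or_nan`, `run_until_filter`, and the toy pipeline are hypothetical, and plain Python lists stand in for tensors. It runs a sequence of named steps and stops at the first step whose output satisfies a predicate, which is how a filter helps locate the origin of an infinity or NaN.

```python
import math


def contains_inf_or_nan(values):
    """Hypothetical predicate: True if any element is infinite or NaN."""
    return any(math.isinf(v) or math.isnan(v) for v in values)


def run_until_filter(steps, inputs, value_filter):
    """Run (name, fn) steps in order on a list of floats.

    Returns the name of the first step whose output passes value_filter,
    or None if no output passes it.
    """
    values = inputs
    for name, fn in steps:
        values = fn(values)
        if value_filter(values):
            return name
    return None


def toy_log(x):
    """Log that yields -inf at 0 and NaN below 0 instead of raising."""
    if x > 0:
        return math.log(x)
    return -math.inf if x == 0 else math.nan


# A toy three-step "graph"; each step maps a list of floats to a new list.
steps = [
    ("shift", lambda xs: [x - 1.0 for x in xs]),
    ("log", lambda xs: [toy_log(x) for x in xs]),
    ("scale", lambda xs: [2.0 * x for x in xs]),
]

# The input 1.0 becomes 0.0 after "shift", so "log" produces -inf.
first_bad_step = run_until_filter(steps, [1.0, 2.0, 3.0], contains_inf_or_nan)
print(first_bad_step)
```

In this sketch the search stops at the step that first produces a bad value, rather than at the end of the pipeline where the bad value has already propagated.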

How to use TFDBG?