STT-tensorflow/tensorflow/python/autograph/g3doc/pyct_tutorial.ipynb

{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "fWfkYsCgPvqR"
      },
      "source": [
        "# Short intro to the SCT library of AutoGraph\n",
        "\n",
        "**Work in progress, use with care and expect changes.**\n",
        "\n",
        "The `pyct` module packages the source code transformation APIs used by AutoGraph.\n",
        "\n",
        "This tutorial is just a preview - there is no PIP package yet, and the API has not been finalized, although most of those shown here are quite stable.\n",
        "\n",
        "[Run in Colab](https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/pyct_tutorial.ipynb)\n",
        "\n",
        "Requires `tf-nightly`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "wq1DRamRlqoB"
      },
      "outputs": [],
      "source": [
        "!pip install tf-nightly"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "r7Q78WIKe2cu"
      },
      "source": [
        "### Writing a custom code translator\n",
        "\n",
        "[transformer.CodeGenerator](https://github.com/tensorflow/tensorflow/blob/40802bcdb5c8a4379da2145441f51051402bd29b/tensorflow/python/autograph/pyct/transformer.py#L480) is an AST visitor that outputs a string. This makes it useful in the final stage of translating Python to another language."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "HHaCMFOpuoVx"
      },
      "source": [
        "Here's a toy C++ code generator written using a `transformer.CodeGenerator`, which is just a fancy subclass of [ast.NodeVisitor](https://docs.python.org/3/library/ast.html#ast.NodeVisitor):"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "PJlTIbJlurpm"
      },
      "outputs": [],
      "source": [
        "import gast\n",
        "from tensorflow.python.autograph.pyct import transformer\n",
        "\n",
        "class BasicCppCodegen(transformer.CodeGenerator):\n",
        "\n",
        "  def visit_Name(self, node):\n",
        "    self.emit(node.id)\n",
        "\n",
        "  def visit_arguments(self, node):\n",
        "    self.visit(node.args[0])\n",
        "    for arg in node.args[1:]:\n",
        "      self.emit(', ')\n",
        "      self.visit(arg)\n",
        "\n",
        "  def visit_FunctionDef(self, node):\n",
        "    self.emit('void {}'.format(node.name))\n",
        "    self.emit('(')\n",
        "    self.visit(node.args)\n",
        "    self.emit(') {\\n')\n",
        "    self.visit_block(node.body)\n",
        "    self.emit('\\n}')\n",
        "\n",
        "  def visit_Call(self, node):\n",
        "    self.emit(node.func.id)\n",
        "    self.emit('(')\n",
        "    self.visit(node.args[0])\n",
        "    for arg in node.args[1:]:\n",
        "      self.emit(', ')\n",
        "      self.visit(arg)\n",
        "    self.emit(');')\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "chCX1A3rA9Pn"
      },
      "source": [
        "Another helpful API is [transpiler.GenericTranspiler](https://github.com/tensorflow/tensorflow/blob/ee7172a929cb0c3d94a094fafc60bbaa175c085d/tensorflow/python/autograph/pyct/transpiler.py#L227) which takes care of parsing:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "LmwWewU1Bw0B"
      },
      "outputs": [],
      "source": [
        "import gast\n",
        "from tensorflow.python.autograph.pyct import transpiler\n",
        "\n",
        "class PyToBasicCpp(transpiler.GenericTranspiler):\n",
        "\n",
        "  def transform_ast(self, node, ctx):\n",
        "    codegen = BasicCppCodegen(ctx)\n",
        "    codegen.visit(node)\n",
        "    return codegen.code_buffer"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "nUhlScyOjlYM"
      },
      "source": [
        "Try it on a simple function:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "ty9q853QvUqo"
      },
      "outputs": [],
      "source": [
        "def f(x, y):\n",
        "  print(x, y)\n",
        "\n",
        "code, _ = PyToBasicCpp().transform(f, None)\n",
        "print(code)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "rmRI9dG_ydE_"
      },
      "source": [
        "### Helpful static analysis passes\n",
        "\n",
        "The `static_analysis` module contains various helper passes for dataflow analyis.\n",
        "\n",
        "All these passes annotate the AST. These annotations can be extracted using [anno.getanno](https://github.com/tensorflow/tensorflow/blob/40802bcdb5c8a4379da2145441f51051402bd29b/tensorflow/python/autograph/pyct/anno.py#L111). Most of them rely on the `qual_names` annotations, which just simplify the way more complex identifiers like `a.b.c` are accessed.\n",
        "\n",
        "The most useful is the activity analysis which just inventories symbols read, modified, etc.:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "GEJ30Wea4Xfy"
      },
      "outputs": [],
      "source": [
        "def get_node_and_ctx(f):\n",
        "  node, source = parser.parse_entity(f, ())\n",
        "  f_info = transformer.EntityInfo(\n",
        "    name='f',\n",
        "    source_code=source,\n",
        "    source_file=None,\n",
        "    future_features=(),\n",
        "    namespace=None)\n",
        "  ctx = transformer.Context(f_info, None, None)\n",
        "  return node, ctx"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "BiwPJrDd0aAX"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.autograph.pyct import anno\n",
        "from tensorflow.python.autograph.pyct import parser\n",
        "from tensorflow.python.autograph.pyct import qual_names\n",
        "from tensorflow.python.autograph.pyct.static_analysis import annos\n",
        "from tensorflow.python.autograph.pyct.static_analysis import activity\n",
        "\n",
        "\n",
        "def f(a):\n",
        "  b = a + 1\n",
        "  return b\n",
        "\n",
        "\n",
        "node, ctx = get_node_and_ctx(f)\n",
        "\n",
        "node = qual_names.resolve(node)\n",
        "node = activity.resolve(node, ctx)\n",
        "\n",
        "fn_scope = anno.getanno(node, annos.NodeAnno.BODY_SCOPE)  # Note: tag will be changed soon.\n",
        "\n",
        "\n",
        "print('read:', fn_scope.read)\n",
        "print('modified:', fn_scope.modified)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "w8dBRlKkFNIP"
      },
      "source": [
        "Another useful utility is the control flow graph builder.\n",
        "\n",
        "Of course, a CFG that fully accounts for all effects is impractical to build in a late-bound language like Python without creating an almost fully-connected graph. However, one can be reasonably built if we ignore the potential for functions to raise arbitrary exceptions."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "KvLe9lWnFg7N"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.autograph.pyct import cfg\n",
        "\n",
        "\n",
        "def f(a):\n",
        "  if a \u003e 0:\n",
        "    return a\n",
        "  b = -a\n",
        "\n",
        "node, ctx = get_node_and_ctx(f)\n",
        "\n",
        "node = qual_names.resolve(node)\n",
        "cfgs = cfg.build(node)\n",
        "cfgs[node]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Cro-jfPA2oxR"
      },
      "source": [
        "Other useful analyses include liveness analysis. Note that these make simplifying assumptions, because in general the CFG of a Python program is a graph that's almost complete. The only robust assumption is that execution can't jump backwards."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "73dARy4_2oAI"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.autograph.pyct import anno\n",
        "from tensorflow.python.autograph.pyct import cfg\n",
        "from tensorflow.python.autograph.pyct import qual_names\n",
        "from tensorflow.python.autograph.pyct.static_analysis import annos\n",
        "from tensorflow.python.autograph.pyct.static_analysis import reaching_definitions\n",
        "from tensorflow.python.autograph.pyct.static_analysis import reaching_fndefs\n",
        "from tensorflow.python.autograph.pyct.static_analysis import liveness\n",
        "\n",
        "\n",
        "def f(a):\n",
        "  b = a + 1\n",
        "  return b\n",
        "\n",
        "\n",
        "node, ctx = get_node_and_ctx(f)\n",
        "\n",
        "node = qual_names.resolve(node)\n",
        "cfgs = cfg.build(node)\n",
        "node = activity.resolve(node, ctx)\n",
        "node = reaching_definitions.resolve(node, ctx, cfgs)\n",
        "node = reaching_fndefs.resolve(node, ctx, cfgs)\n",
        "node = liveness.resolve(node, ctx, cfgs)\n",
        "\n",
        "print('live into `b = a + 1`:', anno.getanno(node.body[0], anno.Static.LIVE_VARS_IN))\n",
        "print('live into `return b`:', anno.getanno(node.body[1], anno.Static.LIVE_VARS_IN))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "GKSaqLbKQI_v"
      },
      "source": [
        "### Writing a custom Python-to-Python transpiler\n",
        "\n",
        "`transpiler.Py2Py` is a generic class for a Python [source-to-source compiler](https://en.wikipedia.org/wiki/Source-to-source_compiler). It operates on Python ASTs. Subclasses override its [transform_ast](https://github.com/tensorflow/tensorflow/blob/95ea3404528afcb1a74dd5f0946ea8d17beda28b/tensorflow/python/autograph/pyct/transpiler.py#L261) method.\n",
        "\n",
        "Unlike the `transformer` module, which have an AST as input/output, the `transpiler` APIs accept and return actual Python objects, handling the tasks associated with parsing, unparsing and loading of code."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "eicHoYlzRhnc"
      },
      "source": [
        "Here's a transpiler that does nothing:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "edaG6dWEPvUI"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.autograph.pyct import transpiler\n",
        "\n",
        "\n",
        "class NoopTranspiler(transpiler.PyToPy):\n",
        "\n",
        "  def get_caching_key(self, ctx):\n",
        "    # You may return different caching keys if the transformation may generate\n",
        "    # code versions.\n",
        "    return 0\n",
        "\n",
        "  def get_extra_locals(self):\n",
        "    # No locals needed for now; see below.\n",
        "    return {}\n",
        "\n",
        "  def transform_ast(self, ast, transformer_context):\n",
        "    return ast\n",
        "\n",
        "tr = NoopTranspiler()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "hKxmlWeQSQyN"
      },
      "source": [
        "The main entry point is the [transform](https://github.com/tensorflow/tensorflow/blob/95ea3404528afcb1a74dd5f0946ea8d17beda28b/tensorflow/python/autograph/pyct/transpiler.py#L384) method returns the transformed version of the input."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "HXTIYsunSVr1"
      },
      "outputs": [],
      "source": [
        "def f(x, y):\n",
        "  return x + y\n",
        "\n",
        "\n",
        "new_f, module, source_map = tr.transform(f, None)\n",
        "\n",
        "new_f(1, 1)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "aKO42LBXw3SD"
      },
      "source": [
        "### Adding new variables to the transformed code\n",
        "\n",
        "The transformed function has the same global and local variables as the original function. You can of course generate local imports to add any new references into the generated code, but an easier method is to use the `get_extra_locals` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "_Wl0n5I_1NJZ"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.autograph.pyct import parser\n",
        "\n",
        "\n",
        "class HelloTranspiler(transpiler.PyToPy):\n",
        "\n",
        "  def get_caching_key(self, ctx):\n",
        "    return 0\n",
        "\n",
        "  def get_extra_locals(self):\n",
        "    return {'name': 'you'}\n",
        "\n",
        "  def transform_ast(self, ast, transformer_context):\n",
        "    print_code = parser.parse('print(\"Hello\", name)')\n",
        "    ast.body = [print_code] + ast.body\n",
        "    return ast\n",
        "\n",
        "\n",
        "def f(x, y):\n",
        "  pass\n",
        "\n",
        "new_f, _, _ = HelloTranspiler().transform(f, None)\n",
        "\n",
        "_ = new_f(1, 1)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "JcMSHJXK6pO2"
      },
      "outputs": [],
      "source": [
        "import inspect\n",
        "\n",
        "print(inspect.getsource(new_f))"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "last_runtime": {
        "build_target": "",
        "kind": "local"
      },
      "name": "pyctr_tutorial.ipynb",
      "provenance": [
        {
          "file_id": "1dT93XRkt7vUpVp7GZech8LB0u1OytKff",
          "timestamp": 1586205976756
        }
      ]
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}