Docfix: Document usage of --config=opt for SIMD instruction sets. (#7211)
* Docfix: Document usage of --config=opt for SIMD instruction sets.
parent 85235cbcb4
commit b47dc70e54
3
configure
vendored
@@ -76,7 +76,8 @@ done
 ## Set up architecture-dependent optimization flags.
 if [ -z "$CC_OPT_FLAGS" ]; then
 default_cc_opt_flags="-march=native"
-read -p "Please specify optimization flags to use during compilation [Default is $default_cc_opt_flags]: " CC_OPT_FLAGS
+read -p "Please specify optimization flags to use during compilation when bazel option "\
+"\"--config=opt\" is specified [Default is $default_cc_opt_flags]: " CC_OPT_FLAGS
 if [ -z "$CC_OPT_FLAGS" ]; then
 CC_OPT_FLAGS=$default_cc_opt_flags
 fi
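To see what the reworded prompt implies for `CC_OPT_FLAGS`, the fallback logic in the `configure` hunk above can be sketched as a small function. This is only an illustration: `resolve_cc_opt_flags` is a hypothetical helper that does not exist in `configure`, and it takes the answer as an argument instead of calling `read -p`, so it can run non-interactively.

```shell
#!/bin/sh
# Hypothetical helper (not part of configure): prints the value that the
# prompt logic would leave in CC_OPT_FLAGS for a given user answer.
resolve_cc_opt_flags() {
  answer="$1"
  default_cc_opt_flags="-march=native"   # same default as in the hunk above
  if [ -z "$answer" ]; then
    # Empty answer (the user just pressed Enter): fall back to the default.
    printf '%s\n' "$default_cc_opt_flags"
  else
    # Any non-empty answer is used as given.
    printf '%s\n' "$answer"
  fi
}

resolve_cc_opt_flags ""        # prints: -march=native
resolve_cc_opt_flags "-mavx"   # prints: -mavx
```

According to the new prompt text, whatever ends up in `CC_OPT_FLAGS` is what builds run with `--config=opt` will use. How `configure` hands that value to bazel is outside this hunk.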
@@ -7,9 +7,9 @@
 # append "_gpu" to the test name to invoke the GPU benchmarks. Example:
 #
 # # for CPU tests
-# $ bazel test -c opt --copt=-mavx //third_party/tensorflow/core/kernels:my_op_test
+# $ bazel test --config opt //third_party/tensorflow/core/kernels:my_op_test
 # # for GPU benchmarks
-# $ bazel run -c opt --copt=-mavx --config=cuda //third_party/tensorflow/core/kernels:my_op_test_gpu -- --benchmarks=..
+# $ bazel run --config opt --config=cuda //third_party/tensorflow/core/kernels:my_op_test_gpu -- --benchmarks=..
 #
 package(default_visibility = ["//visibility:public"])
 
@@ -17,7 +17,7 @@
 
 Run using bazel:
 
-bazel run -c opt \
+bazel run --config opt \
 <...>/tensorflow/examples/how_tos/reading_data:fully_connected_preloaded
 
 or, if installed via pip:
@@ -17,7 +17,7 @@
 
 Run using bazel:
 
-bazel run -c opt \
+bazel run --config opt \
 <...>/tensorflow/examples/how_tos/reading_data:fully_connected_preloaded_var
 
 or, if installed via pip:
@@ -32,7 +32,7 @@ normalized from 0-1 in left top right bottom order.
 To build it, run this command:
 
 ```bash
-$ bazel build -c opt tensorflow/examples/multibox_detector/...
+$ bazel build --config opt tensorflow/examples/multibox_detector/...
 ```
 
 That should build a binary executable that you can then run like this:
@@ -856,13 +856,13 @@ default and if you want to limit RAM usage you can add `--local_resources
 2048,.5,1.0` while invoking bazel.
 
 ```bash
-$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
+$ bazel build --config opt //tensorflow/tools/pip_package:build_pip_package
 
 # To build with support for CUDA:
-$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
+$ bazel build --config opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
 
-# Alternatively, to build with support for OpenCL:
-$ bazel build -c opt --config=sycl //tensorflow/tools/pip_package:build_pip_package
+# Alternatively, to build with support for OpenCL (Experimental):
+$ bazel build --config opt --config=sycl //tensorflow/tools/pip_package:build_pip_package
 
 $ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
 
@@ -873,20 +873,21 @@ $ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.12.1-py2-none-any.whl
 ## Optimizing CPU performance
 
 To be compatible with as wide a range of machines as possible, TensorFlow
-defaults to only using SSE4.1 SIMD instructions on x86 machines. Most modern PCs
-and Macs support more advanced instructions, so if you're building a binary
-that you'll only be running on your own machine, you can enable these by using
-`--copt=-march=native` in your bazel build command. For example:
+defaults to only using SSE4 SIMD instructions. Most modern computers support
+more advanced instructions. So if you're building a binary that you'll only
+be running on your own machine, you can enable these by using `-march=native`
+for optimization options when running `configure`. Then you can build your
+optimized binaries with the following command:
 
 ``` bash
-$ bazel build --copt=-march=native -c opt //tensorflow/tools/pip_package:build_pip_package
+$ bazel build --config opt //tensorflow/tools/pip_package:build_pip_package
 ```
 
 If you are distributing a binary but know the capabilities of the machines
 you'll be running on, you can manually choose the right instructions with
-something like `--copt=-march=avx`. You may also want to enable multiple
+something like `-march=avx`. You may also want to enable multiple
 features using several arguments, for example
-`--copt=-mavx2 --copt=-mfma`.
+`-mavx2,-mfma`.
 
 If you run a binary built using SIMD instructions on a machine that doesn't
 support them, you'll see an illegal instruction error when that code is
@@ -902,10 +903,10 @@ system directories, run the following commands inside the TensorFlow root
 directory:
 
 ```bash
-bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
+bazel build --config opt //tensorflow/tools/pip_package:build_pip_package
 
 # To build with GPU support:
-bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
+bazel build --config opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
 
 mkdir _python_build
 cd _python_build
@@ -177,7 +177,7 @@ tf_custom_op_library(
 Run the following command to build `zero_out.so`.
 
 ```bash
-$ bazel build -c opt //tensorflow/core/user_ops:zero_out.so
+$ bazel build --config opt //tensorflow/core/user_ops:zero_out.so
 ```
 
 > Note:
@@ -42,10 +42,10 @@ bazel build tensorflow/examples/image_retraining:retrain
 
 If you have a machine which supports [the AVX instruction set](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions)
 (common in x86 CPUs produced in the last few years) you can improve the running
-speed of the retraining by building for that architecture, like this:
+speed of the retraining by building for that architecture, like this (after choosing appropriate options in `configure`):
 
 ```sh
-bazel build -c opt --copt=-mavx tensorflow/examples/image_retraining:retrain
+bazel build --config opt tensorflow/examples/image_retraining:retrain
 ```
 
 The retrainer can then be run like this:
@@ -40,7 +40,7 @@ Construct and execute TensorFlow graphs in Go.
 ```sh
 cd ${GOPATH}/src/github.com/tensorflow/tensorflow
 ./configure
-bazel build -c opt //tensorflow:libtensorflow.so
+bazel build --config opt //tensorflow:libtensorflow.so
 ```
 
 This can take a while (tens of minutes, more if also building for GPU).
@@ -30,7 +30,7 @@ then
 then
 echo "Protocol buffer compiler protoc not found in PATH or in ${PROTOC}"
 echo "Perhaps build it using:"
-echo "bazel build -c opt @protobuf//:protoc"
+echo "bazel build --config opt @protobuf//:protoc"
 exit 1
 fi
 PROTOC=$PATH_PROTOC
@@ -40,7 +40,7 @@ Configure and build the Java Archive (JAR) and native library:
 ./configure
 
 # Build the JAR and native library
-bazel build -c opt \
+bazel build --config opt \
 //tensorflow/java:tensorflow \
 //tensorflow/java:libtensorflow_jni
 ```
@@ -39,7 +39,7 @@ $adb shell "/data/local/tmp/benchmark_model \
 ### On desktop:
 (1) build the binary
 ```bash
-$bazel build -c opt tensorflow/tools/benchmark:benchmark_model
+$bazel build --config opt tensorflow/tools/benchmark:benchmark_model
 ```
 
 (2) Run on your compute graph, similar to the Android case but without the need of adb shell.
@@ -13,7 +13,7 @@ and [Rust](https://github.com/tensorflow/rust).
 The command:
 
 ```sh
-bazel build -c opt //tensorflow/tools/lib_package:libtensorflow
+bazel build --config opt //tensorflow/tools/lib_package:libtensorflow
 ```
 
 produces `bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz`, which
@@ -98,7 +98,7 @@ TODO(xpan): Provide graph.pbtxt, model.ckpt, tfprof_log and run_meta download.
 
 ```shell
 # Build the tool.
-bazel build -c opt tensorflow/tools/tfprof/...
+bazel build --config opt tensorflow/tools/tfprof/...
 
 # Help information, including detail 'option' instructions.
 bazel-bin/tensorflow/tools/tfprof/tfprof help