Add -O2 to the adding an op HOWTO (#4195)
Based on a [request from StackOverflow](http://stackoverflow.com/questions/39280669/best-way-to-modify-a-built-in-tensorflow-kernel/39301780?noredirect=1#comment65975519_39301780), where a user observed that code compiled by following these instructions runs 10x slower than the same code built as part of the binary installation.
parent 68a74ea985
commit 2ab7e63262
@@ -139,7 +139,7 @@ to compile your Op into a dynamic library.
 ```bash
 TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
 
-g++ -std=c++11 -shared zero_out.cc -o zero_out.so -fPIC -I $TF_INC
+g++ -std=c++11 -shared zero_out.cc -o zero_out.so -fPIC -I $TF_INC -O2
 ```
 
 On Mac OS X, the additional flag "-undefined dynamic_lookup" is required when
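The flags touched by this change can be collected in one place. The following is a minimal sketch, not part of the HOWTO itself: the helper name `build_flags` is hypothetical, and it assumes that `uname -s` reports `Darwin` on Mac OS X. It returns the flags from the updated command, including `-O2`, and appends `-undefined dynamic_lookup` on Mac OS X as the context line above mentions.

```shell
# Hypothetical helper: print the g++ flags for building the Op library.
# $1 is the platform name as reported by `uname -s` (assumed "Darwin" on Mac OS X).
build_flags() {
  flags="-std=c++11 -shared -fPIC -O2"
  if [ "$1" = "Darwin" ]; then
    # Extra flag the HOWTO requires on Mac OS X.
    flags="$flags -undefined dynamic_lookup"
  fi
  printf '%s\n' "$flags"
}

# Usage (needs TensorFlow installed so that TF_INC resolves):
#   TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
#   g++ $(build_flags "$(uname -s)") zero_out.cc -o zero_out.so -I "$TF_INC"
```

Keeping `-O2` in the shared flag string means the optimization level cannot be dropped by accident on one platform, which is the omission this commit corrects.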