Imported from GitHub PR #25011
New PR to continue the efforts started by @deven-amd in #20709 / #22669 / #24156.
This PR aims to refactor the StreamExecutor GPU interfaces so they can be shared between CUDA and ROCm. It is the first in a series of PRs.
Based on @timshen91's input, I've refactored the logic in #24156 so that this PR:
- only contains changes in stream_executor/....
- does not remove any stream_executor/cuda/*.h, so that things outside of stream_executor don't break. All the types and functions in the namespace cuda now alias to namespace gpu counterparts. For example, namespace cuda { using CUDADriver = gpu::GpuDriver; }.
- all stream_executor/gpu/BUILD targets should be visible only to //third_party/tensorflow/stream_executor:__subpackages__.
- target stream_executor/gpu:X should be used only by stream_executor/cuda:cuda_X or stream_executor/rocm:rocm_X, not by cuda_Y. For example, cuda:cuda_platform should depend on cuda:cuda_driver, not gpu:gpu_driver.
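The aliasing scheme in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual StreamExecutor headers: the `GpuDriver` body and the `DeviceCount` and `LegacyDeviceCount` helpers are placeholders invented for the example, and only the alias line mirrors the PR description.

```cpp
#include <type_traits>

// Stand-in for the shared implementation (stream_executor/gpu/...).
// The class body is a placeholder, not the real GpuDriver.
namespace gpu {
class GpuDriver {
 public:
  // Hypothetical helper, present only so the alias can be exercised.
  static int DeviceCount() { return 1; }
};
}  // namespace gpu

// Stand-in for the legacy header (stream_executor/cuda/...): the old name
// is kept, but it now refers to the shared type.
namespace cuda {
using CUDADriver = gpu::GpuDriver;
}  // namespace cuda

// Code written against the old name keeps compiling unchanged.
inline int LegacyDeviceCount() { return cuda::CUDADriver::DeviceCount(); }

// Both names denote the same type, so no conversion is involved.
static_assert(std::is_same<cuda::CUDADriver, gpu::GpuDriver>::value,
              "the alias must refer to the shared type");
```

Because the alias is a type alias rather than a wrapper class, callers outside stream_executor should not need source changes while the migration is in progress.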
Copybara import of the project:
- 267affbb73df9164baf4e62142fe7201e6a305ee [ROCm][CUDA] StreamExecutor logic for ROCm / CUDA platform by Wen-Heng (Jack) Chung <whchung@gmail.com>
- 04fac5bf358059bdb2cd4a3e092e52dc982ea7b0 Merge 267affbb73df9164baf4e62142fe7201e6a305ee into 5f8ea... by Wen-Heng (Jack) Chung <whchung@gmail.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/25011 from ROCmSoftwarePlatform:google-upstream-pr-stream-executor-alt 267affbb73df9164baf4e62142fe7201e6a305ee
PiperOrigin-RevId: 231250990
Step 1 of re-namespace'ing StreamExecutor into ::stream_executor.
This moves everything inside of stream_executor/..., and leaves a
namespace alias into ::perftools::gputools. The next steps will clean
up users to use the new namespace.
This is mostly a mechanical change, but it also includes a bunch of
non-mechanical changes that ideally would be split out into separate
patches. Unfortunately they all sort of need to be shoved in here for
various reasons:
- forward declarations need to be in the same namespace as the actual
types, so we need to change all forward declarations of
StreamExecutor types in this one patch.
- Uses of these forward declarations need to be changed to the new
namespace (or otherwise we need to add a namespace alias to the
relevant header, but this is pretty ugly).
- Various initialization code needs to live in StreamExecutor's "real"
namespace, so all this needs to be changed.
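The forward-declaration constraint above can be illustrated with a small sketch. It assumes the compatibility alias is a plain C++ namespace alias, which this message does not spell out; `Stream`'s body and the `HasStream` helper are placeholders made up for the example.

```cpp
#include <type_traits>

// Forward declaration in the type's real namespace, as a client header
// has to write it after the move.
namespace stream_executor {
class Stream;
}  // namespace stream_executor

// Compatibility alias so that the old spelling still resolves.
// A namespace alias cannot be reopened to declare names, which is why
// forward declarations cannot stay under the old spelling.
namespace perftools {
namespace gputools = ::stream_executor;
}  // namespace perftools

// A client that still uses the old spelling; a pointer to the
// forward-declared type is enough here.
inline bool HasStream(const perftools::gputools::Stream* stream) {
  return stream != nullptr;
}

// The definition lives in the same namespace as the forward declaration.
namespace stream_executor {
class Stream {
 public:
  int id() const { return 7; }  // placeholder member for the example
};
}  // namespace stream_executor

// Old and new spellings name one and the same type.
static_assert(std::is_same<perftools::gputools::Stream,
                           stream_executor::Stream>::value,
              "old and new spellings must agree");
```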
PiperOrigin-RevId: 193256128
Adds initialization methods to Platform. Some platforms require initialization.
Platforms that do not require it get trivial implementations of these methods.
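As a rough sketch of the pattern, with trivial defaults on the base class and overrides only where setup is needed. The class names, method names, and signatures below are simplified assumptions for illustration, not the actual Platform API.

```cpp
#include <string>

// Simplified stand-in for the Platform interface (illustrative only).
class Platform {
 public:
  virtual ~Platform() = default;
  virtual std::string Name() const = 0;
  // Trivial defaults: a platform with nothing to set up is always ready.
  virtual bool Initialized() const { return true; }
  virtual bool Initialize() { return true; }
};

// Needs no setup, so it inherits the trivial defaults.
class HostLikePlatform : public Platform {
 public:
  std::string Name() const override { return "host-like"; }
};

// Requires explicit setup before use, so it overrides both methods.
class AcceleratorLikePlatform : public Platform {
 public:
  std::string Name() const override { return "accelerator-like"; }
  bool Initialized() const override { return initialized_; }
  bool Initialize() override {
    initialized_ = true;  // real code would bring up the runtime here
    return initialized_;
  }

 private:
  bool initialized_ = false;
};

// Callers can treat every platform uniformly.
inline bool EnsureInitialized(Platform& platform) {
  return platform.Initialized() || platform.Initialize();
}
```

With this shape, callers do not need to know which platforms require initialization; they can call the same helper for all of them.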
PiperOrigin-RevId: 188363315
Change 109695551
Update FAQ
Change 109694725
Add a gradient for resize_bilinear op.
Change 109694505
Don't mention variables module in docs
variables.Variable should be tf.Variable.
Change 109658848
Adding an option to create a new thread-pool for each session.
Change 109640570
Take the snapshot of stream-executor.
+ Expose an interface for scratch space allocation.
Change 109638559
Let image_summary accept uint8 input
This allows users to do their own normalization / scaling if the default
(very weird) behavior of image_summary is undesired.
This required a slight tweak to fake_input.cc to make polymorphically typed
fake inputs infer if their type attr is not set but has a default.
Unfortunately, adding a second valid type to image_summary *disables* automatic
implicit conversion from np.float64 to tf.float32, so this change is slightly
backwards incompatible.
Change 109636969
Add serialization operations for SparseTensor.
Change 109636644
Update generated Op docs.
Change 109634899
TensorFlow: add a markdown file for producing release notes for our
releases. Seed with 0.5.0 with a boring but accurate description.
Change 109634502
Let histogram_summary take any realnumbertype
It used to take only floats; now it understands ints.
Change 109634434
TensorFlow: update the places where we mention Python 3 support to
reflect the current state.
Change 109632108
Move HSV <> RGB conversions, grayscale conversions, and adjust_* ops back to tensorflow
- make GPU-capable versions of RGBToHSV and HSVToRGB, which allow only float input/output
- change docs to reflect new size constraints
- change HSV format to be [0,1] for all components
- add automatic dtype conversion for all adjust_* and grayscale conversion ops
- fix up docs
Change 109631077
Improve optimizer exceptions
1. grads_and_vars is now a tuple, so must be wrapped when passed to format.
2. Use '%r' instead of '%s' for dtype formatting
Base CL: 109697989
Changes:
* error message that refers to removed `DefaultSession` method.
* -Wnull-conversion warnings
* the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
* typo in tutorial data download progress message.
* a typo ("however their installing"=>"however installing").
* typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
* a typo ("subtact"=>"subtract").
* protobuf examples in comments in tensorflow::Example.proto.
* formula formatting in MNIST beginner tutorial
* negative fraction-of-queue-full stats
* protobuf inclusion path so that Android demo will build under Blaze.
* small typo ("moderatly"=>"moderately")
* Session.run() to check that tensor arguments come from the session's graph.
* another six import
* seq2seq typo in bazel command
Base CL: 108349164