Commit Graph

16 Commits

Author SHA1 Message Date
A. Unique TensorFlower
155ce6c067 Qualify uses of std::string
PiperOrigin-RevId: 297212802
Change-Id: Ic65150e7ab418be034f48d45ce25ef5d19105836
2020-02-25 15:07:45 -08:00
Gaurav Jain
a97e848221 Remove blocks_per_core_limit in DeviceDescription
Querying blocks_per_core_limit requires an active CUDA context. We'd
like to avoid this requirement for DeviceDescription and keep it as
stateless as possible, thereby allowing the device to be queried
without allocating any memory.

Currently, blocks_per_core_limit is not used anywhere. If we'd like to
add it back later, we can do so with a dedicated function.

PiperOrigin-RevId: 244747460
2019-04-22 15:49:43 -07:00
TensorFlower Gardener
e9678fd3b8 Merge pull request from MattConley:GPU_Occupancy_Fix
PiperOrigin-RevId: 235267574
2019-02-22 15:19:02 -08:00
Tim Shen
aba83497f5 PR : [GPU][ROCm][CUDA] StreamExecutor logic for ROCm / CUDA platform (PR 20709 / 22669 / 24156 continued)
Please approve this CL. It will be submitted automatically, and its GitHub pull request will be marked as merged.

Imported from GitHub PR 

New PR to continue the efforts started by @deven-amd in  /  / .

This PR aims to refactor the StreamExecutor GPU interfaces so they can be shared between CUDA and ROCm. It is the first in a series of PRs.

Based on @timshen91's inputs, I've refactored the logic in  so that:

- only contains changes in stream_executor/....
- does not remove any stream_executor/cuda/*.h, so that things outside of stream_executor don't break. All the types and functions in the namespace cuda now alias to namespace gpu counterparts. For example, namespace cuda { using CUDADriver = gpu::GpuDriver; }.
- all stream_executor/gpu/BUILD targets should only be visible to //third_party/tensorflow/stream_executor:__subpackages__.
- target stream_executor/gpu:X should only be used by stream_executor/cuda:cuda_X or stream_executor/rocm:rocm_X, not cuda_Y. For example, cuda:cuda_platform should depend on cuda:cuda_driver, not gpu:gpu_driver.

Copybara import of the project:

  - 267affbb73df9164baf4e62142fe7201e6a305ee [ROCm][CUDA] StreamExecutor logic for ROCm / CUDA platform by Wen-Heng (Jack) Chung <whchung@gmail.com>
  - 04fac5bf358059bdb2cd4a3e092e52dc982ea7b0 Merge 267affbb73df9164baf4e62142fe7201e6a305ee into 5f8ea... by Wen-Heng (Jack) Chung <whchung@gmail.com>

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/25011 from ROCmSoftwarePlatform:google-upstream-pr-stream-executor-alt 267affbb73df9164baf4e62142fe7201e6a305ee
PiperOrigin-RevId: 231250990
2019-01-28 11:30:18 -08:00
Matt Conley
e31b0d045a Address review comments to improve the code. 2019-01-16 16:05:47 -08:00
Matt Conley
b7db6c6545 Fix to restore the desired blocks-per-core behavior for XLA. 2019-01-15 16:39:43 -08:00
TensorFlower Gardener
6161d8cc4d Merge pull request from MattConley:CudaOccupancy
PiperOrigin-RevId: 215331087
2018-10-01 21:18:17 -07:00
A. Unique TensorFlower
8878a5c476 Added ABSL_DEPRECATED annotations to various deprecated TensorFlow functions.
PiperOrigin-RevId: 213693027
2018-09-19 14:19:54 -07:00
Matt Conley
fa20b59b92 Move CUDA-specific occupancy calculation into proper file
- Maintain functionality; just move the CalculateOccupancy() and CompareOccupancy() methods from device_description to cuda_gpu_executor
- Remove the CUDA requirement from the general device_description class
2018-09-04 14:20:40 -07:00
Matt Conley
e93a9f9ccf Update GPU occupancy checking to utilize CUDA's occupancy calculator functions
- Replace references to the UnqueryableDeviceParams struct with calls to CUDA's built-in occupancy calculation functions
- Update calls to the occupancy-checking functions accordingly
- These changes should provide more long-term reliability and remove the need to manually update hardcoded data values for new GPU architectures
2018-08-28 18:55:51 -07:00
Justin Lebar
4764bf2986 [StreamExecutor] Rename ::perftools::gputools -> ::stream_executor, part 1.
Step 1 of re-namespace'ing StreamExecutor into ::stream_executor.

This moves everything inside of stream_executor/..., and leaves a
namespace alias into ::perftools::gputools.  The next steps will clean
up users to use the new namespace.

This is mostly a mechanical change, but it also includes a bunch of
non-mechanical changes that ideally would be split out into separate
patches.  Unfortunately they all sort of need to be shoved in here for
various reasons:

 - forward declarations need to be in the same namespace as the actual
   types, so we need to change all forward declarations of
   StreamExecutor types in this one patch.

 - Uses of these forward declarations need to be changed to the new
   namespace (or otherwise we need to add a namespace alias to the
   relevant header, but this is pretty ugly).

 - Various initialization code needs to live in StreamExecutor's "real"
   namespace, so all this needs to be changed.

PiperOrigin-RevId: 193256128
2018-04-17 14:28:51 -07:00
Justin Lebar
b08c542710 [SE] [XLA:GPU] Inform --xla_hlo_profile of the GPU's memory bandwidth.
Add a memory_bandwidth() property to StreamExecutor's DeviceDescription,
and use this in the GPU's --xla_hlo_profile.

PiperOrigin-RevId: 189157407
2018-03-15 02:25:56 -07:00
Martin Wicke
d57572e996 Merge changes from github.
PiperOrigin-RevId: 167401527
2017-09-02 19:25:56 -07:00
A. Unique TensorFlower
122cdce33e Update copyright for 3p/tf.
Change: 123901292
2016-06-02 13:41:12 -07:00
Manjunath Kudlur
9c3043ff3b TensorFlow: Improve performance of Alexnet
Changes:

* error message that refers to removed `DefaultSession` method.
* -Wnull-conversion warnings
* the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
* typo in tutorial data download progress message.
* a typo ("however their installing"=>"however installing").
* typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
* a typo ("subtact"=>"subtract").
* protobuf examples in comments in tensorflow::Example.proto.
* formula formatting in MNIST beginner tutorial
* negative fraction-of-queue-full stats
* protobuf inclusion path so that Android demo will build under Blaze.
* small typo (moderatly > moderately)
* Session.run() to check that tensor arguments come from the session's graph.
* another six import
* seq2seq typo in bazel command

Base CL: 108349164
2015-11-20 10:30:41 -08:00
Manjunath Kudlur
f41959ccb2 TensorFlow: Initial commit of TensorFlow library.
TensorFlow is an open source software library for numerical computation
using data flow graphs.

Base CL: 107276108
2015-11-06 16:27:58 -08:00