..
data
Replace instances of "blacklist" with "denylist" where possible. See Google Developer guidelines at https://developers.google.com/style/word-list#blacklist for more information.
2020-07-20 16:05:25 -07:00
ir
Integrate LLVM at llvm/llvm-project@a1a1d338e9
2021-02-05 14:26:21 -08:00
llvm_gpu_backend
Fix a few ClangTidy warnings
2021-01-12 06:02:22 -08:00
tests
[XLA:GPU] Fix all-reduce when it's degenerate and has arbitrary reduction computation.
2021-02-24 14:34:15 -08:00
alias_passthrough_params_test.cc
[XLA:GPU][NFC] Prefer using * to access absl::optional<> values.
2021-02-04 12:29:09 -08:00
alias_passthrough_params.cc
[XLA] Unify aliasing types
2020-07-07 16:07:00 -07:00
alias_passthrough_params.h
amdgpu_compiler_registration.cc
amdgpu_compiler.cc
Changing "GpuVersion" datatype to include hipDeviceProp_t::gcnArchName
2021-01-11 22:57:46 +00:00
amdgpu_compiler.h
[XLA/GPU] Make HloModule optional in CompileTargetBinary.
2020-12-30 14:32:07 -08:00
backend_configs.proto
buffer_allocations.cc
[XLA/GPU] Remove uses of BufferAssignment in GpuExecutable.
2020-12-21 15:52:53 -08:00
buffer_allocations.h
[XLA/GPU] Remove uses of BufferAssignment in GpuExecutable.
2020-12-21 15:52:53 -08:00
buffer_comparator_test.cc
buffer_comparator.cc
Return a failed status (instead of crashing) when block count is too high
2021-02-19 01:49:52 -08:00
buffer_comparator.h
BUILD
[XLA:GPU] Enforce collectives ordering to be their appearance in the module
2021-03-04 21:47:17 -08:00
cholesky_thunk.cc
[XLA/GPU] Reduce the amount of cuSolver contexts created.
2021-02-22 17:25:34 -08:00
cholesky_thunk.h
[XLA/GPU] Reduce the amount of cuSolver contexts created.
2021-02-22 17:25:34 -08:00
collective_permute_thunk.cc
[XLA:GPU] Migrate CollectivePermute thunk generation to MLIR
2021-02-04 10:46:05 -08:00
collective_permute_thunk.h
[XLA:GPU] Migrate CollectivePermute thunk generation to MLIR
2021-02-04 10:46:05 -08:00
conditional_thunk.cc
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
conditional_thunk.h
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
convolution_thunk.cc
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
convolution_thunk.h
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
copy_thunk.cc
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
copy_thunk.h
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
cublas_gemm_pad_for_tensor_cores_test.cc
cublas_gemm_pad_for_tensor_cores.cc
cublas_gemm_pad_for_tensor_cores.h
cudnn_batchnorm_rewriter.cc
cudnn_batchnorm_rewriter.h
cudnn_batchnorm_runner.cc
[NFC] Eliminate references to HLO insts from CudnnBatchNorm Thunks.
2020-10-06 09:24:36 -07:00
cudnn_batchnorm_runner.h
[XLA/GPU] Simplify reduction implementation.
2020-11-24 13:02:44 -08:00
cudnn_batchnorm_thunk.cc
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
cudnn_batchnorm_thunk.h
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
cudnn_fused_conv_rewriter_test.cc
[XLA] Add creation_pass_id and dummy op_names to OpMetadata.
2020-12-11 17:48:13 -08:00
cudnn_fused_conv_rewriter.cc
cudnn_fused_conv_rewriter.h
cudnn_pad_for_convolutions_test.cc
cudnn_pad_for_convolutions.cc
cudnn_pad_for_convolutions.h
cusolver_context.cc
cusolver_context.h
cusolver_rewriter.cc
cusolver_rewriter.h
custom_call_test.cc
[XLA:GPU] Fix Custom call implementation to correctly handle token inputs.
2021-02-22 09:12:56 -08:00
custom_call_thunk.cc
[XLA:GPU] Fix Custom call implementation to correctly handle token inputs.
2021-02-22 09:12:56 -08:00
custom_call_thunk.h
[XLA:GPU] Fix Custom call implementation to correctly handle token inputs.
2021-02-22 09:12:56 -08:00
elemental_ir_emitter.cc
Merge pull request #47164 from nouiz:upstream-llvm_var_name
2021-02-17 06:12:19 -08:00
elemental_ir_emitter.h
Merge pull request #47164 from nouiz:upstream-llvm_var_name
2021-02-17 06:12:19 -08:00
fft_thunk.cc
[XLA:GPU] Don't share FFT plans across devices.
2020-11-05 17:10:09 -08:00
fft_thunk.h
Prefix thread annotations with ABSL_.
2020-11-16 08:26:41 -08:00
for_thunk.cc
[NFC] Eliminate references to HLO Inst from ForThunk
2020-10-06 12:56:52 -07:00
for_thunk.h
[NFC] Mark Thunk subclass members const when possible
2020-10-07 10:13:20 -07:00
fusion_merger_test.cc
Avoid fusing expensive ops with reusing ops.
2021-01-21 06:59:55 -08:00
fusion_merger.cc
[XLA] Implement the fusion progress visualizer, which dumps out the HTML+JS page visualizing the fusion decisions for XLA:GPU
2021-01-25 17:35:11 -08:00
fusion_merger.h
[XLA:GPU] Update and improve documentation of the FusionMerger HLO pass.
2020-12-09 07:44:04 -08:00
gemm_algorithm_picker.cc
[XLA:GPU][NFC] Prefer using * to access absl::optional<> values.
2021-02-04 12:29:09 -08:00
gemm_algorithm_picker.h
gemm_rewriter.cc
gemm_rewriter.h
gemm_thunk.cc
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
gemm_thunk.h
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
gpu_autotuning.proto
Replace instances of "blacklist" with "denylist" where possible. See Google Developer guidelines at https://developers.google.com/style/word-list#blacklist for more information.
2020-07-20 16:05:25 -07:00
gpu_compiler.cc
[XLA:GPU] Enforce collectives ordering to be their appearance in the module
2021-03-04 21:47:17 -08:00
gpu_compiler.h
[XLA/GPU] Add an LMHLO -> Execution result test.
2021-01-08 15:39:11 -08:00
gpu_constants.cc
gpu_constants.h
gpu_conv_algorithm_picker.cc
[XLA:GPU] Do not autotune convolutions on ROCm if requiring determinism
2021-02-26 11:40:02 -08:00
gpu_conv_algorithm_picker.h
gpu_conv_padding_legalization_test.cc
Fix spatial dimension reference in conv padding legalization.
2020-09-18 03:46:36 -07:00
gpu_conv_padding_legalization.cc
Fix spatial dimension reference in conv padding legalization.
2020-09-18 03:46:36 -07:00
gpu_conv_padding_legalization.h
gpu_conv_rewriter_test.cc
[XLA/GPU] Fix convolution rewriter heuristics for better pattern-matching backward filter convs.
2021-02-04 12:24:45 -08:00
gpu_conv_rewriter.cc
[XLA] Split permutation utilities from xla/util.* into a new xla/permutation_util.*
2021-02-09 17:01:49 -08:00
gpu_conv_rewriter.h
gpu_conv_runner.cc
[XLA:GPU] Migrate convolution thunk emission to MLIR
2020-12-17 11:15:21 -08:00
gpu_conv_runner.h
[XLA:GPU] Migrate convolution thunk emission to MLIR
2020-12-17 11:15:21 -08:00
gpu_copy_insertion.cc
gpu_copy_insertion.h
gpu_device_info.h
Return a failed status (instead of crashing) when block count is too high
2021-02-19 01:49:52 -08:00
gpu_executable_run_options.cc
Split out common NCCL utils.
2020-12-03 05:25:58 -08:00
gpu_executable_run_options.h
[NFC] Moved GlobalDeviceId into its own file as it is also used by the CPU runtime.
2020-11-24 04:21:38 -08:00
gpu_executable.cc
move singleton xla::gpu::GPUDebugInfoManager to xla::XlaDebugInfoManager. in hope that it can be reused by XLA/CPU.
2021-02-09 10:28:57 -08:00
gpu_executable.h
Merge pull request #45022 from nouiz:upstream_maybeowning
2021-01-11 01:30:30 -08:00
gpu_fusible_test.cc
gpu_fusible.cc
Properly determine the shared memory requirements of a fusion node.
2020-11-03 00:15:54 -08:00
gpu_fusible.h
[XLA/GPU] Address review comments.
2020-10-12 18:00:16 -07:00
gpu_hlo_schedule_test.cc
Roll-forward with fixes
2021-02-22 16:27:43 -08:00
gpu_hlo_schedule.cc
Roll-forward with fixes
2021-02-22 16:27:43 -08:00
gpu_hlo_schedule.h
Roll-forward with fixes
2021-02-22 16:27:43 -08:00
gpu_layout_assignment_test.cc
gpu_layout_assignment.cc
Add an option to override XLA GPU conv layouts to NHWC
2020-11-16 16:06:46 -08:00
gpu_layout_assignment.h
[XLA:GPU] Allow using ChannelLayoutConstraints for XLA:GPU
2021-02-23 18:14:58 -08:00
gpu_sanitize_constant_names_test.cc
gpu_sanitize_constant_names.cc
gpu_sanitize_constant_names.h
gpu_scatter_expander.cc
[XLA] Expand simple scatter operations into dynamic-update-slice.
2020-08-18 13:21:28 -07:00
gpu_scatter_expander.h
[XLA] Expand simple scatter operations into dynamic-update-slice.
2020-08-18 13:21:28 -07:00
gpu_transfer_manager.cc
[XLA] Drop useless shape argument from TransferManager::TransferLiteralFromOutfeed.
2021-01-25 17:29:36 -08:00
gpu_transfer_manager.h
[XLA] Drop useless shape argument from TransferManager::TransferLiteralFromOutfeed.
2021-01-25 17:29:36 -08:00
gpu_types.h
Fix a few ClangTidy warnings
2021-01-12 06:02:22 -08:00
hlo_algorithm_denylist_test.cc
Rename hlo_algorithm_blacklist to hlo_algorithm_denylist
2020-07-23 11:13:39 -07:00
hlo_algorithm_denylist.cc
Rename hlo_algorithm_blacklist to hlo_algorithm_denylist
2020-07-23 11:13:39 -07:00
hlo_algorithm_denylist.h
Rename hlo_algorithm_blacklist to hlo_algorithm_denylist
2020-07-23 11:13:39 -07:00
hlo_execution_profiler.cc
[XLA:GPU][NFC] Prefer using * to access absl::optional<> values.
2021-02-04 12:29:09 -08:00
hlo_execution_profiler.h
[XLA/GPU] Make module parameter to GpuExecutable optional.
2020-12-29 14:33:16 -08:00
hlo_to_ir_bindings.cc
[XLA/GPU] Emit constants computation by computation. This helps with incrementally transitioning to MLIR.
2020-09-14 19:16:24 -07:00
hlo_to_ir_bindings.h
Roll forward XLA GPU LHLO sort op migration
2020-08-13 12:00:17 -07:00
horizontal_input_fusion_test.cc
[XLA:GPU] Re-enable horizontal input fusion. Use a minimum of 64 threads per block for multi-output fusions.
2020-11-17 03:28:50 -08:00
horizontal_input_fusion.cc
Merge pull request #43964 from trentlo:horizontal_input_fusion_again
2020-10-15 01:40:44 -07:00
horizontal_input_fusion.h
Implement horizontal input fusion.
2020-10-12 17:59:55 -07:00
horizontal_loop_fusion_test.cc
[XLA/GPU] Re-enable h-loop-fusion to share operands with users.
2021-02-24 15:34:18 -08:00
horizontal_loop_fusion.cc
[XLA/GPU] Re-enable h-loop-fusion to share operands with users.
2021-02-24 15:34:18 -08:00
horizontal_loop_fusion.h
Rename horizontal_fusion to horizontal_loop_fusion.
2020-09-16 17:58:22 -07:00
infeed_manager.cc
infeed_manager.h
infeed_thunk.cc
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
infeed_thunk.h
[XLA:GPU][NFC] Avoid using rvalue references per style guide.
2021-01-13 09:18:47 -08:00
instruction_fusion_test.cc
instruction_fusion.cc
Evaluate the maximum code duplication in fusion nodes instead of average.
2020-09-02 07:36:23 -07:00
instruction_fusion.h
ir_emission_utils_test.cc
Integrate LLVM at llvm/llvm-project@418c218efa
2021-02-11 10:53:15 -08:00
ir_emission_utils.cc
[XLA/GPU] Remove unintended debug line.
2021-02-04 14:52:01 -08:00
ir_emission_utils.h
Roll-forward with fix:
2021-02-03 15:25:54 -08:00
ir_emitter_context.h
[XLA/GPU] Add optional BufferAllocation fields to IrEmitterContext, and
2020-12-30 14:56:31 -08:00
ir_emitter_nested.cc
[XLA/GPU] Emit constants computation by computation. This helps with incrementally transitioning to MLIR.
2020-09-14 19:16:24 -07:00
ir_emitter_nested.h
[XLA/GPU] Emit constants computation by computation. This helps with incrementally transitioning to MLIR.
2020-09-14 19:16:24 -07:00
ir_emitter_unnested.cc
[XLA/GPU] Cleanup all rest of the emitters for supporting LMHLO.
2021-02-23 14:17:11 -08:00
ir_emitter_unnested.h
[XLA/GPU] Cleanup all rest of the emitters for supporting LMHLO.
2021-02-23 14:17:11 -08:00
ir_emitter.cc
Integrate LLVM at llvm/llvm-project@bf6380c096
2021-02-26 04:08:05 -08:00
ir_emitter.h
[XLA/GPU] Plumb through Bitcast op for LMHLO.
2021-02-10 19:49:41 -08:00
kernel_mapping_scheme.h
Merge pull request #39327 from nouiz:upstream_copy_bug
2020-05-13 12:09:48 -07:00
kernel_thunk.cc
XLA Parallel reduce.
2020-08-27 15:21:33 -07:00
kernel_thunk.h
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
launch_dimensions.cc
Return a failed status (instead of crashing) when block count is too high
2021-02-19 01:49:52 -08:00
launch_dimensions.h
Return a failed status (instead of crashing) when block count is too high
2021-02-19 01:49:52 -08:00
memset_thunk.cc
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
memset_thunk.h
[NFC] Mark Thunk subclass members const when possible
2020-10-07 10:13:20 -07:00
multi_output_fusion_test.cc
Properly determine the shared memory requirements of a fusion node.
2020-11-03 00:15:54 -08:00
multi_output_fusion.cc
[XLA] Implement the fusion progress visualizer, which dumps out the HTML+JS page visualizing the fusion decisions for XLA:GPU
2021-01-25 17:35:11 -08:00
multi_output_fusion.h
[XLA] Implement the fusion progress visualizer, which dumps out the HTML+JS page visualizing the fusion decisions for XLA:GPU
2021-01-25 17:35:11 -08:00
nccl_all_gather_thunk.cc
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_all_gather_thunk.h
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_all_reduce_thunk.cc
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_all_reduce_thunk.h
[XLA:GPU] Fix all-reduce when it's degenerate and has arbitrary reduction computation.
2021-02-24 14:34:15 -08:00
nccl_all_to_all_thunk.cc
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_all_to_all_thunk.h
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_collective_thunk.cc
[XLA][NFC] Add a struct to describe the LogicalID of a device.
2021-02-22 19:40:16 -08:00
nccl_collective_thunk.h
[XLA:GPU] Fix collective communication ops to work correctly in SPMD mode.
2021-02-22 12:42:21 -08:00
nccl_test_utils_dummy.cc
[XLA-GPU] NFC: Rename dummy NCCL files, so they appear next to the real ones in the file list.
2020-12-08 04:36:45 -08:00
nccl_test_utils.cc
[XLA:GPU] Fix build failure in nccl_test_utils.cc
2021-02-01 09:37:38 -08:00
nccl_test_utils.h
Split out common NCCL utils.
2020-12-03 05:25:58 -08:00
nccl_utils.cc
Store NcclCliques in new NcclCliqueMap.
2021-01-29 04:16:02 -08:00
nccl_utils.h
[XLA:GPU][NFC] Avoid using fully qualified namespace.
2021-02-01 18:30:57 -08:00
nvptx_compiler_registration.cc
Add a skeleton for the MLIR GPU backend.
2019-07-16 03:18:23 -07:00
nvptx_compiler.cc
Merge pull request #47121 from nouiz:upstream-llvm_file
2021-02-26 02:44:59 -08:00
nvptx_compiler.h
[XLA/GPU] Make HloModule optional in CompileTargetBinary.
2020-12-30 14:32:07 -08:00
outfeed_manager.cc
outfeed_manager.h
outfeed_thunk.cc
[XLA:GPU] Migrate outfeed thunk emission to MLIR
2021-01-12 17:08:21 -08:00
outfeed_thunk.h
[XLA:GPU] Migrate outfeed thunk emission to MLIR
2021-01-12 17:08:21 -08:00
parallel_loop_emitter.cc
Merge pull request #42683 from nouiz:upstream_master_grid_size
2020-10-06 04:04:24 -07:00
parallel_loop_emitter.h
Merge pull request #42683 from nouiz:upstream_master_grid_size
2020-10-06 04:04:24 -07:00
reduction_degenerate_dim_remover.cc
Merge pull request #46452 from wangsiyu:master
2021-01-21 12:24:31 -08:00
reduction_degenerate_dim_remover.h
reduction_dimension_grouper.cc
reduction_dimension_grouper.h
reduction_layout_normalizer.cc
Add op metadata to bitcasts inserted by ReductionLayoutNormalizer.
2020-06-16 10:36:57 -07:00
reduction_layout_normalizer.h
reduction_splitter_test.cc
[XLA:GPU] Split reduce ops with large but non-consecutive reduction dimensions.
2020-06-19 10:35:29 -07:00
reduction_splitter.cc
[XLA:GPU] Split reduce ops with large but non-consecutive reduction dimensions.
2020-06-19 10:35:29 -07:00
reduction_splitter.h
[XLA:GPU] Split reduce ops with large but non-consecutive reduction dimensions.
2020-06-19 10:35:29 -07:00
replica_id_thunk.cc
[XLA][NFC] Add a struct to describe the LogicalID of a device.
2021-02-22 19:40:16 -08:00
replica_id_thunk.h
[XLA:GPU] Add support for PartitionId
2021-01-29 13:42:19 -08:00
sequential_thunk.cc
[XLA/GPU] Remove Thunk::hlo_instruction().
2020-07-13 15:32:19 -07:00
sequential_thunk.h
[XLA/GPU] Remove Thunk::hlo_instruction().
2020-07-13 15:32:19 -07:00
stream_assignment_test.cc
stream_assignment.cc
stream_assignment.h
stream_executor_util.cc
[XLA:GPU] Migrate convolution thunk emission to MLIR
2020-12-14 20:55:56 -08:00
stream_executor_util.h
[XLA:GPU] Migrate convolution thunk emission to MLIR
2020-12-14 20:55:56 -08:00
target_constants.h
target_util.cc
[XLA] More readable emitted LLVM code.
2021-02-15 06:17:34 -08:00
target_util.h
[XLA] More readable emitted LLVM code.
2021-02-15 06:17:34 -08:00
thunk_emitter.cc
[XLA:GPU] Migrate TriangularSolve thunk emission to use MLIR
2021-02-05 11:30:21 -08:00
thunk_emitter.h
[XLA:GPU] Migrate TriangularSolve thunk emission to use MLIR
2021-02-05 11:30:21 -08:00
thunk_schedule.cc
[XLA/GPU] Add an LMHLO -> Execution result test.
2021-01-08 15:39:11 -08:00
thunk_schedule.h
[XLA/GPU] Add an LMHLO -> Execution result test.
2021-01-08 15:39:11 -08:00
thunk.cc
[XLA:GPU] Add support for PartitionId
2021-01-29 13:42:19 -08:00
thunk.h
[XLA:GPU] Add support for PartitionId
2021-01-29 13:42:19 -08:00
tree_reduction_rewriter.cc
tree_reduction_rewriter.h
triangular_solve_thunk.cc
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
triangular_solve_thunk.h
[XLA/GPU] Remove uses of Thunk::hlo_instruction() for profiling.
2020-07-10 15:31:01 -07:00
variadic_op_splitter_test.cc
variadic_op_splitter.cc
variadic_op_splitter.h
while_thunk.cc
[NFC] Eliminate references to HLO Inst from WhileThunk
2020-10-06 13:55:38 -07:00
while_thunk.h
[NFC] Mark Thunk subclass members const when possible
2020-10-07 10:13:20 -07:00
while_transformer_test.cc
xfeed_queue.h