Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROCm] Improve softmax performance. #1740

Open
wants to merge 1 commit into
base: release/2.4
Choose a base branch
from

Conversation

doru1004
Copy link

This patch improves the performance of softmax for 2D tensors by:

  • using a softmax calculation which eliminates the increase of shared memory usage with the size of the tensor and relies on global memory accesses for the tensor data accesses while still using shared memory for the actual reduction step (the shared memory used for the reduction is constant and does not increase with tensor size).
  • for the final computation replacing the division by the sum with the multiplication of 1/sum. The 1/sum is computed as the last step of the warp reduction.
  • replace the use of the exp function with the __expf function.

The impact on numerical accuracy is within a 1e-5 for half precision and 1e-7 for full precision.

The impact on performance for MI300X is between 22% and 50% percentage improvement over current runtimes.

@jerrymannil
Copy link

@doru1004
Plz run pytorch UTs as well.

UTs that reference softmax: https://github.com/search?q=repo%3Apytorch%2Fpytorch+path%3A%2F%5Etest%5C%2F%2F+softmax&type=code

After that you can also try to run the full pytorch UT test suite
CONTINUE_THROUGH_ERROR=1 .ci/pytorch/test.sh

@jerrymannil jerrymannil changed the title Improve softmax performance. [ROCm] Improve softmax performance. Nov 21, 2024
@okakarpa
Copy link
Collaborator

Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

Warning: Unused direct dependencies:
	/var/lib/jenkins/pytorch/build/lib/libshm.so
	/opt/rocm/lib/libhsa-runtime64.so.1
	/lib/x86_64-linux-gnu/libm.so.6
[7981/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/./torch_hip_generated_flash_api.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/flash_attn/flash_api.hip:57:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@pruthvistony
Copy link
Collaborator

@jerrymannil ,
You can trigger a CI build for the branch - improve-softmax-performance-2.4 and verify the UTs.

Copy link
Collaborator

@pruthvistony pruthvistony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Once it is confirmed all UTs are running fine. we can merge it.

@okakarpa
Copy link
Collaborator

Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7940/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_WeightNorm.hip.o
[7941/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_scaled_modified_bessel_k0.hip.o
[7942/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_scaled_modified_bessel_k1.hip.o
[7943/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_modified_bessel_k1.hip.o
[7944/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/./torch_hip_generated_flash_api.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/flash_attn/torch_hip_generated_flash_api.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/flash_attn/flash_api.hip:57:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@okakarpa
Copy link
Collaborator

Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7943/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adam_impl.hip.o
[7944/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_modified_bessel_k0.hip.o
[7945/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adam_amsgrad_impl.hip.o
[7946/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adamw_amsgrad_impl.hip.o
[7947/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@okakarpa
Copy link
Collaborator

Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

	/opt/rocm/lib/libhsa-runtime64.so.1
	/lib/x86_64-linux-gnu/libm.so.6
[7981/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_BinaryMiscOpsKernels.hip.o
[7982/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_UnaryGeometricAcoshKernel.hip.o
[7983/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention_backward.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention_backward.hip:49:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@okakarpa
Copy link
Collaborator

Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

	/var/lib/jenkins/pytorch/build/lib/libshm.so
	/opt/rocm/lib/libhsa-runtime64.so.1
	/lib/x86_64-linux-gnu/libm.so.6
[7981/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_shifted_chebyshev_polynomial_w.hip.o
[7982/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@doru1004 doru1004 force-pushed the improve-softmax-performance-2.4 branch from 211a913 to 19324cb Compare November 27, 2024 15:51
@okakarpa
Copy link
Collaborator

Jenkins build for 19324cb093099ea7d73b28f8a2155ae6b2d0280e commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7866/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ActivationMishKernel.hip.o
[7867/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_RecordStream.hip.o
[7868/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_MultinomialKernel.hip.o
[7869/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_MultiLabelMarginCriterion.hip.o
[7870/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_SoftMax.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o.cmake
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/SoftMax.hip:1206:3: error: no matching function for call to 'host_softmax'
 1206 |   host_softmax<LogSoftMaxForwardEpilogue,true>(input, dim, half_to_float, output);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/SoftMax.hip:940:8: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'EpilogueWithMul'

@okakarpa
Copy link
Collaborator

Jenkins build for 19324cb093099ea7d73b28f8a2155ae6b2d0280e commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

22 warnings generated when compiling for host.
[7866/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_Reduce.hip.o
[7867/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_RecordStream.hip.o
[7868/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_MultinomialKernel.hip.o
[7869/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_SoftMax.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o.cmake
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/SoftMax.hip:1206:3: error: no matching function for call to 'host_softmax'
 1206 |   host_softmax<LogSoftMaxForwardEpilogue,true>(input, dim, half_to_float, output);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/SoftMax.hip:940:8: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'EpilogueWithMul'

@doru1004 doru1004 force-pushed the improve-softmax-performance-2.4 branch from 19324cb to 5515a5a Compare December 2, 2024 17:36
@okakarpa
Copy link
Collaborator

okakarpa commented Dec 2, 2024

Jenkins build for 5515a5aabcd2138110071c46419f695b16631db5 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7940/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ForeachBinaryOpScalarList.hip.o
[7941/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_WeightNorm.hip.o
[7942/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_scaled_modified_bessel_k0.hip.o
[7943/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adam_amsgrad_impl.hip.o
[7944/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention_backward.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention_backward.hip:49:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@doru1004 doru1004 force-pushed the improve-softmax-performance-2.4 branch 2 times, most recently from b4358bf to da48662 Compare December 2, 2024 22:07
@okakarpa
Copy link
Collaborator

okakarpa commented Dec 2, 2024

Jenkins build for da48662953ffe5890a4baec41fc0454a0e26f477 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

/opt/rocm-6.2.3/lib/llvm/bin/../../../include/hip/hip_runtime_api.h:580:41: note: expanded from macro 'DEPRECATED'
  580 | #define DEPRECATED(msg) __attribute__ ((deprecated(msg)))
      |                                         ^
1 warning generated when compiling for gfx908.
[7837/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_SoftMax.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_SoftMax.hip.o.cmake
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/SoftMax.hip:976:15: error: use of undeclared identifier 'ekse'; did you mean 'else'?
  976 |             } ekse {
      |               ^~~~
      |               else

@okakarpa
Copy link
Collaborator

okakarpa commented Dec 2, 2024

Jenkins build for da48662953ffe5890a4baec41fc0454a0e26f477 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/DistributionTemplates.h:182:17: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/var/lib/jenkins/pytorch/aten/src/ATen/native/hip/DistributionTemplates.h:182:17: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
12 warnings generated when compiling for gfx942.
[7982/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_UnaryGeometricAcoshKernel.hip.o
[7983/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@doru1004 doru1004 force-pushed the improve-softmax-performance-2.4 branch from da48662 to 5a4ae3d Compare December 3, 2024 00:06
@okakarpa
Copy link
Collaborator

okakarpa commented Dec 3, 2024

Jenkins build for 5a4ae3de47cf3499a6d55025e3733e3bd49a8a8b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

	/var/lib/jenkins/pytorch/build/lib/libshm.so
	/opt/rocm/lib/libhsa-runtime64.so.1
	/lib/x86_64-linux-gnu/libm.so.6
[7981/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_spherical_bessel_j0.hip.o
[7982/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@rocm-mici
Copy link

Jenkins build for 5a4ae3de47cf3499a6d55025e3733e3bd49a8a8b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

	/lib/x86_64-linux-gnu/libm.so.6
[7979/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_BinaryMiscOpsKernels.hip.o
[7980/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_spherical_bessel_j0.hip.o
[7981/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_shifted_chebyshev_polynomial_w.hip.o
[7982/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention_backward.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention_backward.hip:49:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@rocm-mici
Copy link

Jenkins build for 5a4ae3de47cf3499a6d55025e3733e3bd49a8a8b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7936/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_modified_bessel_k1.hip.o
[7937/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_modified_bessel_i0.hip.o
[7938/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_modified_bessel_k0.hip.o
[7939/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_legendre_polynomial_p.hip.o
[7940/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@doru1004 doru1004 force-pushed the improve-softmax-performance-2.4 branch from 5a4ae3d to 70a25e2 Compare December 4, 2024 16:55
@rocm-repo-management-api
Copy link

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7941/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adam_amsgrad_impl.hip.o
[7942/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adamw_amsgrad_impl.hip.o
[7943/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ActivationGeluKernel.hip.o
[7944/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_fused_adam_impl.hip.o
[7945/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention_backward.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention_backward.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention_backward.hip:49:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@rocm-repo-management-api
Copy link

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@rocm-repo-management-api
Copy link

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@rocm-repo-management-api
Copy link

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as ABORTED
Links: Blue Ocean view / Build artifacts

@rocm-repo-management-api
Copy link

rocm-repo-management-api bot commented Dec 6, 2024

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as SUCCESS
Links: Blue Ocean view / Build artifacts

@rocm-repo-management-api
Copy link

rocm-repo-management-api bot commented Dec 9, 2024

Jenkins build for 70a25e21a46ec6f8685e0b86d1a975f4aa248409 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants