CI: refactor and extend atomic tests #1772
base: develop
For an unknown reason, the OpenACC test shows an invalid memory access, and HIP requires over 60 minutes to compile the atomic test.
using Op = alpaka::AtomicAdd;

template<typename... TArgs>
static ALPAKA_FN_ACC auto atomic(TArgs&&... args)
@jkelling found out that this variadic template usage is the root of the nvhpc error (invalid memory access).
The fix for atomics is already merged with #1773.
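To make the pattern under discussion concrete, here is a minimal host-only sketch that contrasts a variadic forwarding wrapper with one that spells out its parameters. It does not use alpaka: `hostAtomicAdd`, `VariadicWrapper` and `ExplicitWrapper` are hypothetical names built on `std::atomic`. Whether the fix in #1773 takes this form is not stated in this thread, so treat the explicit variant as one possible alternative only.

```cpp
#include <atomic>
#include <utility>

// Hypothetical host-side stand-in for an atomic add (not the alpaka API).
inline int hostAtomicAdd(std::atomic<int>& target, int value)
{
    // fetch_add returns the value stored before the addition.
    return target.fetch_add(value);
}

// Variadic forwarding wrapper, the same shape as the snippet quoted in the review.
struct VariadicWrapper
{
    template<typename... TArgs>
    static auto atomic(TArgs&&... args)
    {
        return hostAtomicAdd(std::forward<TArgs>(args)...);
    }
};

// Assumed alternative: explicit parameter list, no parameter pack.
struct ExplicitWrapper
{
    static auto atomic(std::atomic<int>& target, int value)
    {
        return hostAtomicAdd(target, value);
    }
};
```

Both wrappers return the previous value, so a test can call either one through the same expression.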
fix alpaka-group#1769
- test all hierarchies which can be used for atomics
- extend test cases to test for equal, smaller, larger and zero as left side operand
- test calling the atomic interface `atomic*()` and `atomicOp<*>()`
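The operand coverage listed in the commit message can be sketched on the host as follows. This is an illustration under assumptions, not the test code from this PR: `hostAtomicMin` and `testMinOperands` are hypothetical helpers built on `std::atomic`, and only a minimum operation is shown.

```cpp
#include <algorithm>
#include <array>
#include <atomic>

// Hypothetical host-side stand-in for an atomic min (not the alpaka API).
inline int hostAtomicMin(std::atomic<int>& target, int value)
{
    int old = target.load();
    // On failure compare_exchange_weak reloads `old`, so the condition is re-checked.
    while(value < old && !target.compare_exchange_weak(old, value))
    {
    }
    // `old` holds the value that was stored before the operation took effect.
    return old;
}

// Runs the operation with a left operand that is equal to, smaller than and
// larger than the right operand, and with zero as the left operand.
inline bool testMinOperands(int right)
{
    std::array<int, 4> const lefts{right, right - 1, right + 1, 0};
    for(int left : lefts)
    {
        std::atomic<int> target{left};
        int const old = hostAtomicMin(target, right);
        // The previous value must be returned and the minimum must be stored.
        if(old != left || target.load() != std::min(left, right))
            return false;
    }
    return true;
}
```

The same four left operands could be reused for every other operation, with only the expected result changing.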
Clang for HIP has problems if all atomic tests are within one kernel and throws the compiler error: `error: stack size limit exceeded (155632) in _ZN6alpaka16uniform_cuda_hip6de`. Therefore we split the tests into multiple kernels to work around the issue.
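The idea of the workaround, one small kernel per operation instead of a single kernel containing every test, might look roughly like the host-only sketch below. All names here (`OpAdd`, `OpSub`, `OpExch`, `SingleOpTestKernel`, `runSplitTests`) are hypothetical and the functors run as plain function calls; in the real tests each instantiation would be launched as a separate device kernel.

```cpp
#include <atomic>

// Hypothetical operation tags (not alpaka types); each returns the previous value.
struct OpAdd
{
    static int apply(std::atomic<int>& t, int v) { return t.fetch_add(v); }
};
struct OpSub
{
    static int apply(std::atomic<int>& t, int v) { return t.fetch_sub(v); }
};
struct OpExch
{
    static int apply(std::atomic<int>& t, int v) { return t.exchange(v); }
};

// One small test functor per operation, so no single function body has to
// hold the local state of every test at once.
template<typename TOp>
struct SingleOpTestKernel
{
    void operator()(bool* success, int operand) const
    {
        std::atomic<int> target{operand};
        int const old = TOp::apply(target, operand);
        // Only the returned previous value is checked in this sketch.
        if(old != operand)
            *success = false;
    }
};

// Runs the split kernels one after another and collects the result.
inline bool runSplitTests(int operand)
{
    bool success = true;
    SingleOpTestKernel<OpAdd>{}(&success, operand);
    SingleOpTestKernel<OpSub>{}(&success, operand);
    SingleOpTestKernel<OpExch>{}(&success, operand);
    return success;
}
```

Each instantiation is compiled as its own function, which is the property the workaround relies on to stay below the stack size limit.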
Clang CUDA is exposing `atomic*_block()` function signatures even if these cannot be used by the selected architecture. This leads to compile issues if clang is used as CUDA compiler.
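Our test cases have not checked all combinations of atomics, therefore the second commit refactors the atomic tests and fixes #1769:
- test all hierarchies which can be used for atomics
- extend test cases to test for equal, smaller, larger and zero as left side operand
- test calling the atomic interface `atomic*()` and `atomicOp<*>()`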
error: stack size limit exceeded (155632) in _ZN6alpaka16uniform_cuda_hip6detail20uniformCudaHipKernelINS_8ApiHipRtENS_22AccGpuUniformCudaHipRtIS3_St17integral_constantImLm1EEmEES6_m16AtomicTestKernelIS7_xvEJPbxEEEvNS_3VecIT1_T2_EET3_DpT4_