xpu: test_eager_matches_sdpa_inference tests fail with pytorch XPU backend #34888
Comments
dvrogozh added a commit to dvrogozh/transformers that referenced this issue on Nov 22, 2024
Currently torch.nn.attention.sdpa_kernel falls back to CPU when torch works with the XPU backend, so CPU thresholds should be used in the associated tests. Fixes: huggingface#34888 Signed-off-by: Dmitry Rogozhkin <[email protected]>
Please help review the PR with the fix.
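The commit above adjusts which comparison thresholds the test picks up. A rough sketch of that idea (the table, the values, and the helper are hypothetical placeholders, not the actual test code):

```python
# Hypothetical sketch of per-device comparison tolerances; the numbers are
# placeholders, not the values used in the transformers test suite.
SDPA_INFERENCE_THRESHOLDS = {
    "cpu": {"atol": 1e-5, "rtol": 1e-4},
    "cuda": {"atol": 1e-4, "rtol": 1e-3},
}
# If SDPA falls back to a CPU(-equivalent) implementation on XPU, the CPU
# thresholds are the appropriate ones to compare against.
SDPA_INFERENCE_THRESHOLDS["xpu"] = SDPA_INFERENCE_THRESHOLDS["cpu"]

def tolerances_for(device_type: str) -> dict:
    """Return the atol/rtol used when comparing eager and SDPA outputs."""
    return SDPA_INFERENCE_THRESHOLDS[device_type]
```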
dvrogozh added a commit to dvrogozh/transformers that referenced this issue on Nov 25, 2024
As of PyTorch 2.5, the XPU backend supports only torch.nn.attention.SDPBackend.MATH, which is implemented at the PyTorch level using aten operators and is device agnostic with respect to the implementation of each aten operator. Thus, we can reuse CUDA (or CPU) MATH weights for XPU. Currently torch.nn.attention.sdpa_kernel falls back to CPU when torch works with the XPU backend, so CPU thresholds should be used in the associated tests. Fixes: huggingface#34888 Signed-off-by: Dmitry Rogozhkin <[email protected]>
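To make the claim concrete, the MATH backend can be selected explicitly through torch.nn.attention.sdpa_kernel. A minimal, self-contained sketch (the tensor shapes and the device-selection fallback are illustrative, not taken from the test suite):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Pick XPU when the build exposes it, otherwise fall back to CPU
# (this fallback is for illustration only).
device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"

q = torch.randn(1, 8, 16, 64, device=device)
k = torch.randn(1, 8, 16, 64, device=device)
v = torch.randn(1, 8, 16, 64, device=device)

# Restrict SDPA to the MATH backend, which per the commit message above is
# the only backend the XPU device supports as of PyTorch 2.5.
with sdpa_kernel(SDPBackend.MATH):
    out = F.scaled_dot_product_attention(q, k, v)

print(out.shape, out.device)
```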
dvrogozh added a commit to dvrogozh/transformers that referenced this issue on Nov 26, 2024
As of PyTorch 2.5, the XPU backend supports only torch.nn.attention.SDPBackend.MATH, which is implemented at the PyTorch level using aten operators and is device agnostic with respect to the implementation of each aten operator. Thus, we can reuse CUDA (or CPU) MATH weights for XPU. Fixes: huggingface#34888 Signed-off-by: Dmitry Rogozhkin <[email protected]>
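One consequence for a test parametrised over SDPA backends is that the non-MATH variants cannot run on XPU. A hedged sketch of such a guard (the helper and the skip condition are illustrative, not part of the actual fix):

```python
# Illustrative guard: skip SDPA backends that the XPU device cannot run.
import pytest
from torch.nn.attention import SDPBackend

def maybe_skip_backend(backend: SDPBackend, device_type: str) -> None:
    # As of PyTorch 2.5 the XPU backend only implements the MATH path.
    if device_type == "xpu" and backend != SDPBackend.MATH:
        pytest.skip(f"{backend} is not supported on XPU")
```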
ydshieh pushed a commit that referenced this issue on Dec 2, 2024
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel
  Signed-off-by: Dmitry Rogozhkin <[email protected]>
* Fix test_eager_matches_sdpa_inference for XPU backend
  As of PyTorch 2.5, the XPU backend supports only torch.nn.attention.SDPBackend.MATH, which is implemented at the PyTorch level using aten operators and is device agnostic with respect to the implementation of each aten operator. Thus, we can reuse CUDA (or CPU) MATH weights for XPU.
  Fixes: #34888
  Signed-off-by: Dmitry Rogozhkin <[email protected]>
* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron
  Signed-off-by: Dmitry Rogozhkin <[email protected]>
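A before/after sketch of the two deprecated-API migrations mentioned in the merged commit (the tensor shapes and the autocast region are illustrative):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 4, 8, 32)

# Before (deprecated, CUDA-specific context managers):
#   with torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=True,
#                                        enable_mem_efficient=False):
#       out = F.scaled_dot_product_attention(q, k, v)
#   with torch.cuda.amp.autocast(dtype=torch.float16):
#       ...

# After (device-agnostic replacements named in the PR):
with sdpa_kernel(SDPBackend.MATH):
    out = F.scaled_dot_product_attention(q, k, v)

with torch.amp.autocast("cuda", dtype=torch.float16):
    # mixed-precision region; replace "cuda" with the relevant device type
    pass
```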
With:
CC: @amyeroberts @ydshieh