Fix FindBLAS.cmake compatibility with MKL 2024.0 #1794
Hello @vpirogov, I am actually in the course of building PyTorch, but with all the third-party dependencies built separately (as if they were system libraries). My group is using the Intel compiler and MKL, so that is the BLAS that we've set for PyTorch. It appeared that this was also needed for oneDNN.

But then I noticed this comment about MKL not being necessary, or even beneficial performance-wise with the current oneDNN codebase. I was able to build oneDNN successfully without MKL, and am currently using that for PyTorch. But I left this PR up on the basis that if MKL remains a supported build option for the project, then this trivial issue of the MKL library location should be addressed.

An alternative would be to update the CMake code to visibly deprecate MKL usage, or even remove it altogether. The legacy nature of the option is not obvious to an outsider, not least because this project is part of the Intel ecosystem, and thus one would expect to see Intel building blocks as hard dependencies. (That the MKL has been sidelined here speaks quite well of the project's independence, I might add.)

Is there a need to retain support for building against the MKL? It seems like a good bit of confusion and complexity could be eliminated with it.
@iskunk, thanks for the clarifications. Your PR does make sense, but it got me thinking about why we still provide the option to use Intel MKL. I agree that it is currently confusing and that we should drop it or at least mark it as deprecated.
@vpirogov, well, technically, we document that we support only the ARMPL and ACCELERATE BLAS vendors, so using any other BLAS vendor is undefined behavior. One thing we can do here is add a check to the build system to make sure that the BLAS vendor specified by the user is one of the supported ones.
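A build-system check of the kind suggested here could look roughly like the sketch below. It is an illustration only, not oneDNN's actual build code: the variable name `DNNL_BLAS_VENDOR`, the `NONE` default, and the helper names `SUPPORTED_BLAS_VENDORS`, `blas_vendor_upper`, and `supported_str` are assumptions made for the example. Only ARMPL and ACCELERATE come from the comment above.

```cmake
cmake_minimum_required(VERSION 3.13)  # new enough for the IN_LIST operator

# Assumption: the vendor is passed as -DDNNL_BLAS_VENDOR=<name>.
# ARMPL and ACCELERATE are the documented vendors; NONE (no external BLAS)
# is assumed here to be the default.
set(SUPPORTED_BLAS_VENDORS NONE ARMPL ACCELERATE)

if(NOT DEFINED DNNL_BLAS_VENDOR)
    set(DNNL_BLAS_VENDOR "NONE")
endif()

# Accept lower-case spellings such as "armpl".
string(TOUPPER "${DNNL_BLAS_VENDOR}" blas_vendor_upper)

if(NOT blas_vendor_upper IN_LIST SUPPORTED_BLAS_VENDORS)
    # Turn the CMake list into a readable, comma-separated string.
    string(REPLACE ";" ", " supported_str "${SUPPORTED_BLAS_VENDORS}")
    message(WARNING
        "BLAS vendor '${DNNL_BLAS_VENDOR}' is not a supported configuration. "
        "Supported values: ${supported_str}.")
endif()
```

Saved as a standalone script (say `check_vendor.cmake`, a hypothetical name), this can be tried with `cmake -DDNNL_BLAS_VENDOR=MKL -P check_vendor.cmake`. Whether an unsupported vendor should produce a `WARNING` or a `FATAL_ERROR` is a policy choice; a warning keeps the option usable for performance debugging.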
@iskunk, could you try running
This approach works in Gentoo, including with MKL 2024.0 (there is no ebuild for MKL 2024.0, but I modified an existing one and it worked). Also, oneDNN still works with generic BLAS libraries. I still find it useful to have as an option, because:
Collapsed block: my old test, which was incorrect
Here is how performance looks for M=N=K for 3 cases on Ryzen 7950x3d:
Here DNNL was built without MKL/BLAS (i.e. with native gemm kernels). Note the strange sawtooth effect in dnnl_sgemm, by the way. I don't know whether that is expected behavior.
UPD: I take back my words about "OneDNN sgemm is slower"! I did not follow https://oneapi-src.github.io/oneDNN/dev_guide_performance_settings.html and tested with
It is again the same or faster for bf16bf16f32 gemm (alpha=1, beta=0). I just want to highlight that linking to MKL is indeed a way to shoot oneself in the foot, especially on AMD CPUs.
Thanks for the update. I was not using the
But that's likely a moot point, given those benchmark results 😃
Are there any valid non-performance-based reasons for wanting to use the MKL, e.g. numerical stability? (That's not my use case, but it could be someone else's.)
@iskunk, there are no reasons to build oneDNN with MKL as the BLAS vendor. The only reason this option still exists after oneDNN v1.0 is for performance debugging purposes (like @AngryLoki's analysis above).
37b2b5e to ae704f4
ae704f4 to 2b629b2
Thank you for your contribution, @iskunk! To minimize confusion about |
Thank you @vpirogov. I would suggest making the warning for MKL a bit more prominent than others. Given this project's Intel lineage, the expectation for users going in would be for the MKL to be either required, optimal, or at least first among equals. So an unobtrusive warning on that point is more likely to be overlooked, especially if the user is in a hurry. |
Description
I am building oneDNN using the Intel oneAPI compiler and MKL version 2024.0.0. I get this error at configuration time:
I see in `cmake/FindBLAS.cmake` that oneDNN expects to find the MKL libraries under an `mkl/lib/` directory, among other places. However, my oneAPI install has the MKL under a versioned directory:
That noted, I also get the same configuration error with any of these settings:
If I look in the `mkl/` directory, I see that there is a handy `latest` symlink to avoid hard-coding the version:
If I add `mkl/latest/lib` and variations to `BLAS_mkl_LIB_PATH_SUFFIXES`, then configuration succeeds, and I am able to build and test oneDNN without issue.
The version of the Intel compiler/libraries I am using is fairly new, and I suspect the oneDNN build may not have been tested against it before now.
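For illustration, the kind of change described above might be sketched as follows. This is not the actual patch. The second suffix is an invented example of a "variation", and the `find_library` call (the result variable `MKL_RT_LIBRARY`, the `mkl_rt` library name, and the `ONEAPI_ROOT` environment variable as a search prefix) is an assumption about how such a suffix list could be consumed.

```cmake
# Assumption: BLAS_mkl_LIB_PATH_SUFFIXES is a list of directory suffixes
# that the find module tries under each search prefix.
list(APPEND BLAS_mkl_LIB_PATH_SUFFIXES
    "mkl/latest/lib"            # suffix named in the description
    "mkl/latest/lib/intel64")   # assumed variation, for illustration only

# Hypothetical lookup: the oneAPI install root as the search prefix and
# mkl_rt as the library name are assumptions, not taken from the PR.
find_library(MKL_RT_LIBRARY
    NAMES mkl_rt
    HINTS ENV ONEAPI_ROOT
    PATH_SUFFIXES ${BLAS_mkl_LIB_PATH_SUFFIXES})

if(MKL_RT_LIBRARY)
    message(STATUS "Found MKL: ${MKL_RT_LIBRARY}")
else()
    message(STATUS "MKL not found under the configured suffixes")
endif()
```

Going through the `latest` symlink means the suffix list does not need updating for each MKL release, which is the point of the change. The fragment is meant to live inside a project's CMake code, where the platform's library prefixes and suffixes are already configured.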