adaptivecpp: init at 24.06.0 #360893
base: master
Conversation
Interesting that you apparently had no issues with the ROCm build, while I had quite some trouble just trying to make it compile. I do see that the clhpp package in nixpkgs was updated a few weeks ago, and in general some time has passed, so that might have helped. As for the open questions you posed, I'd say updating to ROCm 6 is preferable if possible, as well as creating an alias.
False alarm on ROCm 6 building. However, it should be possible: building the Arch Linux package in distrobox works just fine, and their build script doesn't do anything special. I think we might just need a slightly newer ROCm for some fixes? They use ROCm 6.2.4 and LLVM 18.1; we're on ROCm 6.0.2.
As it turns out, the issue was caused by Nix's default hardening options (specifically `zerocallusedregs`).
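For reference, nixpkgs lets a derivation opt out of individual hardening flags. A minimal sketch (the derivation body is illustrative, not the actual expression from this PR):

```nix
# Hypothetical excerpt: disable only the zerocallusedregs hardening flag
# for this derivation, keeping the rest of the default hardening set.
stdenv.mkDerivation {
  pname = "adaptivecpp";
  version = "24.06.0";
  # ...
  hardeningDisable = [ "zerocallusedregs" ];
}
```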
Force-pushed from b4658da to f10b721.
Adds adaptivecpp, based on the opensycl package. Updated to the newest version and to use LLVM 17 and ROCm 6.
Force-pushed from f10b721 to 1f7d58e.
I've added the test suite, but I'm not sure it actually runs on GPU. It compiles and runs without complaints, but it only seems to use the CPU (nothing shows up in nvtop while btop spikes). Looking at their CI, they seem to specify a specific target architecture. It's not the most elegant, but I've passed it through like this, though it doesn't actually seem to make any difference in what runs: `nix-build -A adaptivecpp.tests --arg config '{ rocmSupport = true; }' --arg targetsBuild '"omp;hip:gfx1030"' --arg targetsRun '"omp;hip"'`. Could you kindly take a look at this, @yboettcher?
I might be wrong, but given that nix builds are usually rather "sealed off", I would not expect a nix-build call to be able to use (or even detect) a GPU. I cloned your branch, built the adaptivecppWithRocm package (with nix-build) and then used that to try and build the adaptivecpp tests manually (in a clone of the adaptivecpp repo).
to the postFixup, in addition to what's already there.
messages. Setting the mentioned variables, however, only added some initial output about a few AMD_COMGR_ACTION_... calls that all ended with AMD_COMGR_STATUS_SUCCESS, so I guess that's not helpful. And I think this is also how far I've gotten in trying to make this work: explicit compilation for a specific target works (when adding that extra flag). That said, I'd rather have a dysfunctional generic target with an all-around updated compiler where you can at least specify a specific target, than no updated compiler (which does not even have a generic target). Although I would prefer if we could somehow make this work, I honestly don't know why it fails. It might even be that the way ROCm is installed on NixOS is the problem. Or not.
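For anyone reproducing this: AdaptiveCpp's runtime verbosity is controlled by the `ACPP_DEBUG_LEVEL` environment variable (0 = errors only, up to 3 for full diagnostics including JIT activity). A runnable sketch, with a stub standing in for an acpp-compiled binary:

```shell
# A stub stands in for a real acpp-compiled program so the invocation
# pattern is runnable anywhere; the real runtime would print its
# diagnostics to stderr at the requested level.
printf '#!/bin/sh\necho "ACPP_DEBUG_LEVEL=${ACPP_DEBUG_LEVEL:-unset}"\n' > ./demo-binary
chmod +x ./demo-binary
ACPP_DEBUG_LEVEL=3 ./demo-binary
```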
I second this; having an up-to-date version would be much appreciated, even if the generic backend does not yet work.
Hi - this discussion has made it into the AdaptiveCpp Discord ;) Without having seen the JIT logs, it's hard to say why this fails. If ROCm thinks that it has generated the device binary successfully, then it is possible that there is an internal ROCm issue. IIRC there were also some ROCm versions where retrieval of the ROCm log was prevented by a bug (in case of an error, ROCm would return before filling the error log). This might play a role here. I don't want to get involved in your internal prioritization, but perhaps it is helpful for your decision process to outline how we see these compilation flows from the perspective of the AdaptiveCpp project.
Looking at the output more closely, I'm not convinced the executables should be wrapped. I've built it without any wrapping, and I got different results for the tests (both with and without wrapping):
I just tested the generic backend on Nvidia hardware; it seems to be working correctly.
@illuhad, thank you very much for your insights!
I added some more details (including the dumped IR) here. @blenderfreaky, I noticed something when running a binary that was compiled for a specific target. Log excerpt when executing:
@yboettcher thanks for the details. I assumed that there was an issue retrieving the logs. Anyway, from your IR output, we can see that it tries to call undefined functions for each of the relevant ROCm bitcode libraries. The way the AdaptiveCpp amdgpu backend discovers these bitcode libraries is at the moment a bit hacky (for various reasons): it tries to invoke hipcc to locate them.
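For illustration: under a conventional ROCm install, the device bitcode libraries (ocml.bc, ockl.bc, the oclc_* ABI files) live together in one directory, while on Nix they sit in a store path. A sketch for locating them (the default path is an assumption about the layout, not taken from this PR):

```shell
# Look for ROCm's device bitcode libraries under a conventional layout;
# on NixOS, point ROCM_PATH at the rocm-device-libs store path instead.
ROCM_PATH="${ROCM_PATH:-/opt/rocm}"
find "$ROCM_PATH" \( -name 'ocml.bc' -o -name 'ockl.bc' -o -name 'oclc_*.bc' \) 2>/dev/null || true
```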
I am guessing here, but given that it works on OMP and Nvidia, I would assume that the problem is not with this adaptivecpp derivation, but maybe somewhere in how ROCm works on NixOS. But that's just a guess. In any case, unless we can find some "simple fix" by just supplying adaptivecpp with some parameter or including another buildInput or something like that, I would say that fixing this issue is out of scope for this PR, and that we can merge. CPU and Nvidia users gain an updated compiler.
It turns out I linked to the wrong path in the nix store; I now also get the same result. Looking into the code @illuhad linked, acpp should try to run hipcc to find the device libraries.
I tried re-adding the wrapper, however that didn't help at all. @yboettcher, how exactly did you pass the arguments for it to see ROCm? Nonetheless, peppering in some debug statements, it doesn't seem like it actually ever receives those arguments from the JSON files, so maybe this is a dead-end path anyways?
If you want to try the patch stuff I did, I've put it in a separate branch for now to keep this one somewhat mergeable. See blenderfreaky:adaptivecpp-amd.
Yep, I think too that one of them is likely incorrect.
The fallback paths are not well tested. I think the main assumption is that we require a working ROCm installation, and a working ROCm installation will have a functioning hipcc. Is it possible to fix hipcc?
I don't think so. Afaict, all that's needed is to somehow pass the right paths through.
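One hedged option: upstream clang's HIP driver accepts explicit overrides for exactly this, so if acpp forwards flags to the underlying clang, something like the following could bypass the discovery logic entirely (`DEVICE_LIB_DIR` is a placeholder for the rocm-device-libs bitcode directory, and whether acpp forwards these flags untouched is an assumption):

```shell
# DEVICE_LIB_DIR is a placeholder; point it at the directory containing
# ocml.bc, ockl.bc, and the oclc_* files.
clang++ -x hip --offload-arch=gfx1030 \
  --rocm-device-lib-path="$DEVICE_LIB_DIR" \
  -o example example.cpp
```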
What is the workflow for users? Can they just invoke acpp directly? As I said, the fallback path is not well-tested. It would be more predictable to get it to work with hipcc. The fallback path seems to assume a standard ROCm directory layout where the bitcode libraries live in $ROCM_PATH/amdgcn/bitcode.
Scanning through some other nix builds, it looks like they pass the paths explicitly. I now create a merged derivation with a wrapped binary.
Otherwise, we could just brute-force it by patching the path into the code, though that just seems dirty.
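The merged-derivation approach could look roughly like this. This is a sketch only: the acpp flag name and the `rocmPackages` attribute are assumptions, not taken from this PR's actual expression:

```nix
# Hypothetical: join adaptivecpp with a wrapper that hands acpp an explicit
# ROCm path, instead of relying on its fallback discovery logic.
{ symlinkJoin, makeWrapper, adaptivecpp, rocmPackages }:
symlinkJoin {
  name = "adaptivecpp-with-rocm";
  paths = [ adaptivecpp ];
  nativeBuildInputs = [ makeWrapper ];
  postBuild = ''
    wrapProgram $out/bin/acpp \
      --add-flags "--acpp-rocm-path=${rocmPackages.clr}"
  '';
}
```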
Okay, bitcode linking looks better now!
Oddly enough, this doesn't seem to be particularly consistent. Afaict though, there's no obvious error message. Here are the simple example's logs for reference:
Strange. But I see
A mismatch of oclc ABI bitcode libraries should not happen and can lead to incorrect execution of kernels, but IMO it should not lead to a JIT failure. I'm not sure what's going on.
Yeah, it works with or without the postFixup for the generic backend on Nvidia hardware, since no flags affect the behavior on Nvidia.
Removed the wrapper. I say we merge this for now and move the AMD stuff to a new issue/PR. There seemed to be some issues with the builds on ofBorg; unsure if those are relevant though.
@illuhad Somehow, the example builds and runs fine if I use the provided Makefile, but I haven't been able to even build it using CMake. I think that's a problem with my usage of CMake though. I don't have much time this week for debugging, but I've uploaded the comgr logs & bitcode in case you want to take a look.
Same for me. I tried to also make a wrapped hipcc with the device libs and tried to compile the ROCm example @illuhad provided, but I just couldn't get any of it to work with CMake (with or without the wrapped hipcc). I initially only tried the CMake version, where CMake failed early on.
I don't know how this is set up, but if nixpkgs is set up in a way that ofBorg failures prevent a merge, then I guess they're relevant. I once had issues where adaptivecpp tried to compile the OpenCL backend (because ROCm also provides OpenCL), but that backend requires network access, which failed, making it necessary to manually disable the OpenCL backend. But I haven't encountered this issue in your branch.
Gonna have to wait for ofBorg to run through, I guess. Do you by chance have the logs?
I actually had a
@blenderfreaky Thanks. I don't know why we are seeing a JIT failure, but the bitcode does not actually seem to contain the kernel from your source file (only some AMD builtin kernels). So something definitely seems off, although I cannot say what exactly. EDIT: It seems that
@illuhad Interesting that it says that. Running the program with
Note that
It seems to be looking for
Are you referring to the godbolt link? I don't think this is cause for concern. When a target architecture is not provided, clang/LLVM falls back to the oldest AMD GPU it knows, which presumably is gfx700.
Ah ok, makes sense. I compiled and ran the example in Arch via distrobox and compared the comgr commands; there are some interesting differences. On Arch:
On Nix:
It seems the two are using different methods for linking; no idea why, though.
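For anyone else comparing the two setups: comgr's logging is controlled by environment variables (AMD_COMGR_EMIT_VERBOSE_LOGS, AMD_COMGR_REDIRECT_LOGS, AMD_COMGR_SAVE_TEMPS, per the comgr README). A runnable sketch, again with a stub standing in for the real program:

```shell
# A stub echoes the variables so the pattern is runnable without a GPU;
# against a real acpp-compiled binary, comgr would write its action log
# to stdout and keep intermediate files for inspection.
printf '#!/bin/sh\necho "verbose=${AMD_COMGR_EMIT_VERBOSE_LOGS:-0} logs=${AMD_COMGR_REDIRECT_LOGS:-none}"\n' > ./demo
chmod +x ./demo
AMD_COMGR_EMIT_VERBOSE_LOGS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_SAVE_TEMPS=1 ./demo
```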
@blenderfreaky Are we looking at the same ROCm versions? IIRC there was a change where they changed how builtin bitcode libraries were included in the compilation, but it was back around ROCm 5.7. I'm also a bit surprised to see this. EDIT: Notice the two different comgr actions; this indicates that the difference might be related to the mentioned ROCm change.
@illuhad We're using ROCm 6.0.2 (Nix seems a little out of date here); the Arch one is on 6.2.1 & 6.2.4 for some packages. Both are definitely >= ROCm 6, though. I interpreted the difference in actions as the Arch one doing linking and everything in one step, whereas the Nix one splits it out into several steps, linking later. @yboettcher Can you make any sense of the ofBorg failures? It looks like it's just the tests failing. We could just comment them out to get this merged sooner, I think, seeing as they don't run as they should anyways and it's working for Nvidia already.
I actually have the same failures ofBorg has on my end. It appears that the normal package builds fine, but building the tests fails.
OpenSYCL has been renamed to AdaptiveCpp, "due to external legal pressure" (see their repo). The package is a 1:1 of pkgs/development/compilers/opensycl/default.nix, with the repository and name updated, as well as the version bumped to the newest.

Open questions:

- Should we keep the opensycl package and add an alias or warning?
- ROCm support does seem to work now, but I haven't done sufficient testing yet.

Tagging maintainers: @yboettcher
Things done

- Built successfully on x86_64 with an AMD (ROCm) GPU.
- Tested with sandboxing enabled in nix.conf (see the Nix manual): sandbox = relaxed / sandbox = true
- Ran nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed; also see nixpkgs-review usage.
- Tested basic functionality of all binary files (usually in ./result/bin/).