Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix non-strict HIP device lib order #2217

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Snektron
Copy link

@Snektron Snektron commented Jul 5, 2024

In #2045 initial support for HIP was added. While trying it out, I noticed that different runs across different machines did not yield an expected cache hit. After some investigation, it turns out that the list of bitcode device libraries is not sorted after discovering them from the file system, and this resulted in a different order across those machines. I've added a fix that simply sorts those libraries after discovering them, that seems to be similar to how its handled elsewhere.

@sylvestre
Copy link
Collaborator

could you please add a test for this ? thanks

@codecov-commenter
Copy link

codecov-commenter commented Jul 5, 2024

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 40.82%. Comparing base (0cc0c62) to head (48d6ed7).
Report is 99 commits behind head on main.

Files with missing lines Patch % Lines
src/compiler/c.rs 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2217      +/-   ##
==========================================
+ Coverage   30.91%   40.82%   +9.91%     
==========================================
  Files          53       55       +2     
  Lines       20112    20675     +563     
  Branches     9755     9801      +46     
==========================================
+ Hits         6217     8440    +2223     
- Misses       7922     8100     +178     
+ Partials     5973     4135    -1838     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Snektron
Copy link
Author

Snektron commented Jul 8, 2024

Im not 100% sure what the best way is to add the tests, considering that fully testing that issue requires deleting and then re-adding a file from the HIP bitcode directory (/opt/rocm/amdgcn/bitcode usually). Regardless, I'm away for a few weeks and will finish this when I get back.

@Snektron Snektron force-pushed the fix-device-libs-order branch 3 times, most recently from e0a58c1 to 48d6ed7 Compare July 25, 2024 08:45
@Snektron
Copy link
Author

bump

@sylvestre
Copy link
Collaborator

still need tests, sorry :)

@Snektron Snektron force-pushed the fix-device-libs-order branch 3 times, most recently from 4bd15cc to 3d28979 Compare November 22, 2024 17:21
@Snektron
Copy link
Author

Seems that fuse doesn't work in github actions. I guess the only other approach swapping out calls to fs::read_dir with something that can be mocked.

@Snektron
Copy link
Author

Hmm, it might be possible to LD_PRELOAD a library which returns random results from readdir...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants