Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topic/fix same instance #12709

Merged
merged 2 commits into from
Jul 25, 2024
Merged

Conversation

bosilca
Copy link
Member

@bosilca bosilca commented Jul 25, 2024

Multiple fixes again main:

  1. the collective embiggening missed one spot in alltoallv where the array of counts was directly accessed.
  2. the patch to match check if the requests were from the same instance was incorrect, it checked if they were issued on the same communicator instead.

The prior version identified requests from the same communicator instead
of instance.

Signed-off-by: George Bosilca <[email protected]>
Somehow this has been missed during the MPI_Count upgrade of the
collective API.

Signed-off-by: George Bosilca <[email protected]>
@bosilca bosilca added the bug label Jul 25, 2024
@bosilca bosilca added this to the v5.0.6 milestone Jul 25, 2024
@bosilca bosilca self-assigned this Jul 25, 2024
@edgargabriel
Copy link
Member

@bosilca Does the alltoallv commit belong into this pr, or did it accidentally slip in?
And just to confirm, the fix for the same instance issue here, that is more of an optimization than an actual fix, isn't it?

@bosilca
Copy link
Member Author

bosilca commented Jul 25, 2024

@edgargabriel you are right, they are not related. They were all discovered on the same session on a machine with limited access outside, it was easier to bundle everything in a single PR.

@edgargabriel
Copy link
Member

@bosilca may I suggest that you at least spell this out in the pr description, that two different issues are tackled here? (In addition, I would also still be interested in the answer to my other question, whether this is more of an optimization vs. an actual bug fix)

@bosilca
Copy link
Member Author

bosilca commented Jul 25, 2024

@edgargabriel sorry I missed your second question. It is a bug fix, the current code checks if the PML requests are issued on the same communicator instead of the same instance.

@edgargabriel
Copy link
Member

@bosilca ok, sounds good, thank you. In that case I need to pull this pr also in into our internal repo :-)

@bosilca bosilca merged commit 20b900e into open-mpi:main Jul 25, 2024
20 checks passed
@bosilca bosilca deleted the topic/fix_same_instance branch July 25, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants