-
Notifications
You must be signed in to change notification settings - Fork 864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fcoll/vulcan: add two_phase read_all #12894
Merged
edgargabriel
merged 3 commits into
open-mpi:main
from
edgargabriel:topic/vulcan-two-phase-read-all
Oct 30, 2024
Merged
fcoll/vulcan: add two_phase read_all #12894
edgargabriel
merged 3 commits into
open-mpi:main
from
edgargabriel:topic/vulcan-two-phase-read-all
Oct 30, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
edgargabriel
changed the title
Topic/vulcan two phase read all
topic/vulcan: add two_phase read_all
Oct 29, 2024
edgargabriel
force-pushed
the
topic/vulcan-two-phase-read-all
branch
from
October 29, 2024 21:15
dbed343
to
69801e8
Compare
edgargabriel
changed the title
topic/vulcan: add two_phase read_all
fcoll/vulcan: add two_phase read_all
Oct 29, 2024
bosilca
approved these changes
Oct 30, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have not updated/added your copyright blurb. Other than that everything looks good, but I'm not well versed into the MPIO code.
@bosilca thank you, good point, I will do that before merging! |
This is a cleanup of the write_all operation, in preparation for adding read_all implementation to the vulcan component. Specifically: - remove the multiple group option: this was envisioned for the vulcan component, but never fully implemented. Therefore, having a few stubs at some locations (and an mca parameter) doesn't make sense, and are being removed to simplify the code. - remote the write_chunksize option: this was an artifact of the code based having evolved from dynamic_gen2 (where the option makes sense). However, in vulcan it doesn't really make sense and was actually not correctly implemented eitherway, so if somebody would have used that option, it would prbably have failed. The changed have been validated with the ompio testsuite as well as the hdf5 testphdf5, t_shapesame, and t_filters_parallel tests. Signed-off-by: Edgar Gabriel <[email protected]>
extract some code into stand alone routines in preparation for adding a vulcan read_all implementation. Specifically, this pr adds new routines for : - mca_fcoll_vulcan_calc_blocklen_disps - mca_fcoll_vulcan_calc_file_offsets - mca_fcoll_vulcan_calc_io_array which will be reused in the read_all as well. Also, some white-space cleanup of the code. Signed-off-by: Edgar Gabriel <[email protected]>
add an implementation of the read_all operation that uses the two-phase I/O algorithm using even partitioning, i.e. the same base idea that is used by the write_all operation of this component. In addition to using the 'correct' data partitioning approach for the component, the vulcan read_all implementation also adds some other features that were there for the write_all operations, but not for the (generic) read_all algorithm used by all components so far. Specifically, it can overlap the execution of the I/O phase and the communication phase. The algorithm can also use GPU buffers for aggregation. Signed-off-by: Edgar Gabriel <[email protected]>
edgargabriel
force-pushed
the
topic/vulcan-two-phase-read-all
branch
from
October 30, 2024 15:00
69801e8
to
030ead1
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
add an implementation of the read_all operation that uses the two-phase I/O algorithm using even partitioning, i.e. the same base idea that is used by the write_all operation of the vulcan component. Until now, all components have been using the same 'generic' read_all code, which was based on the fcoll/dynamic module idea.
In addition to using the 'correct' data partitioning approach for the component, the vulcan read_all implementation also adds some other features that were there for the write_all operations, but not for the (generic) read_all algorithm used by all components so far. Specifically, it can overlap the execution of the I/O phase and the communication phase. The algorithm can also use GPU buffers for aggregation.
The code has been tested with:
The PR looks complicated, but its actually not that bad. The first two commits perform some code cleanup and reorganization of the fcoll_vulcan_file_write_all function, with the specific goal of simplifying the code and allowing to reuse big chunks of the code for the read_all operation. The third commit adds than the actual new algorithm code for read_all.
As a side note, I noticed one more issue in the code base regarding the registration/deregistration of the ompio progress function, but I will fix that after this PR is merged.