Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document ORTE -> PRRTE MCA parameter changes #11890

Merged
merged 1 commit into from
Sep 11, 2023

Conversation

qkoziol
Copy link
Contributor

@qkoziol qkoziol commented Sep 2, 2023

Document MCA parameter changes from move from ORTE to PRRTE.

Addresses Github issue #7668

Copy link
Contributor

@lrbison lrbison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed for typos and wording. Also confirmed none of the deprecated options would apply to 5.0.0 version series. I cannot say if this is an exhaustive listing.

@jsquyres
Copy link
Member

jsquyres commented Sep 6, 2023

I rebased this PR on current main/HEAD, and added a 2nd suggested commit. This commit can be squashed or deleted.

Copy link
Contributor

@janjust janjust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would squash the commit, but otherwise looking good to me.

@lrbison
Copy link
Contributor

lrbison commented Sep 7, 2023

When replacing hwloc_base_use_hwthreads_as_cpus
What does this mean?

Value: [<Value>]:hwtcpus

I expected it to be perhaps just

rmaps_default_mapping_policy:hwtcpus

@rhc54
Copy link
Contributor

rhc54 commented Sep 7, 2023

When replacing hwloc_base_use_hwthreads_as_cpus What does this mean?

Value: [<Value>]:hwtcpus

I expected it to be perhaps just

rmaps_default_mapping_policy:hwtcpus

The cmd line option looks like --map-by :hwtcpus if you aren't specifying any mapping policy. However, note that this does NOT affect the default PRRTE behavior - it is a per-job directive. For mpirun, it doesn't matter as you are running a one-shot - unless your app does a comm_spawn, in which case it might (depends on whether or not you have set inherit).

If you want to affect the defaults, then you have to use the MCA parameter: PRTE_MCA_hwloc_base_use_hwthreads_as_cpus

@lrbison
Copy link
Contributor

lrbison commented Sep 7, 2023

@rhc54 ah, thank you for explaining.

Jeff shared a rendered version of the table with me: https://ompi--11890.org.readthedocs.build/en/11890/mca.html#label-mca-backward-compat

And while most of these make sense to me to read, the rmaps_default_mapping_policy entries was hard for me to grok, as I expect it would be for users. I wonder if it would be worth an example explaining the concept of these parameters which get captured as extra colon-separated tags on top of other parameters.

However, note that this does NOT affect the default PRRTE behavior - it is a per-job directive

Hmm, am I hearing you correctly then: This table isn't correctly informing users how to take the previous MCA parameter hwloc_base_use_hwthreads_as_cpus and get the same effect using rmaps_default_mapping_policy because updating the default won't change the one-shot launch behavior. What is the correct parameter to direct the user towards?

@rhc54
Copy link
Contributor

rhc54 commented Sep 7, 2023

Hmm, am I hearing you correctly then: This table isn't correctly informing users how to take the previous MCA parameter hwloc_base_use_hwthreads_as_cpus and get the same effect using rmaps_default_mapping_policy because updating the default won't change the one-shot launch behavior.

Sorry - no, it is the other way around. Updating the default via MCA param will change the one-shot behavior. My point was that the cmd line options (e.g., --map-by foo) do not change the default behavior - they only impact the one-shot job request. So if you pass --map-by :hwt you will treat hwt's as cpus for that specific job. However, if that job does a comm_spawn, you may not treat hwt's as cpus for the child job (it would depend on whether or not inherit was set).

If you always want to treat hwt's as cpus, then you would set the MCA param so the default is set (probably in your default MCA param file, which is what I do).

@lrbison
Copy link
Contributor

lrbison commented Sep 7, 2023

OK, thank you for clarifying

@jsquyres
Copy link
Member

jsquyres commented Sep 7, 2023

@rhc54 Thanks for all the clarifications.

@qkoziol @lrbison Can you guys fix up the content? Feel free to squash it all down to a single commit.

@qkoziol
Copy link
Contributor Author

qkoziol commented Sep 9, 2023

Working on adding some examples

@qkoziol
Copy link
Contributor Author

qkoziol commented Sep 9, 2023

Updated with examples

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qkoziol These are great examples!

One point, however -- all your examples show extra spaces around the = in the assignment of shell variables. That actually causes a syntax error; the extra spaces need to be removed:

# This is an error
$ export FOO = bar
zsh: bad assignment
$ FOO = bar
zsh: command not found: FOO

# It's also an error in bash
$ export FOO = bar
bash: export: `=': not a valid identifier
$ FOO = bar
bash: FOO: command not found
# Either of these are correct
$ export FOO=bar
$ FOO=bar

Addresses Github issue open-mpi#7668

Co-authored-by: [email protected]

Signed-off-by: Quincey Koziol <[email protected]>
@jsquyres jsquyres merged commit eda04d6 into open-mpi:main Sep 11, 2023
8 checks passed
@janjust
Copy link
Contributor

janjust commented Sep 11, 2023

@qkoziol please open up v5.0 PR

@qkoziol
Copy link
Contributor Author

qkoziol commented Sep 11, 2023

@qkoziol please open up v5.0 PR

FYI: v5.0 PR: #11926

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants