Rebound Params issue #17

Open
brryan opened this issue Nov 25, 2024 · 5 comments

Comments

brryan (Collaborator) commented Nov 25, 2024

Running disk_nbody_cyl.in on 8 ranks on an interactive skylake-gold session gives me this error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  ### PARTHENON ERROR
  Condition:   pin_hash == pin_hash_root
  Message:     Parameter input object must be the same on every rank, otherwise I/O may be
                unable to write it safely. If you reached this error message, look to make sure
                that your calls to functions that look like pin->GetOrAdd are all called
                exactly the same way on every MPI rank.
  File:        /vast/home/brryan/github/artemis/external/parthenon/src/outputs/output_utils.cpp
  Line number: 340

Weirdly, this error goes away if I run with only 4 ranks. If I fill out the reb_simulation struct on all ranks rather than just the first (line ~153 of nbody.cpp) I can run for at least a few cycles. We may just have to keep this struct synced across ranks. Not sure why this hasn't come up before.
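For readers unfamiliar with this kind of check, here is a minimal sketch of the idea behind the `pin_hash == pin_hash_root` condition. It is not Parthenon's implementation: the ranks are simulated as entries of a vector so the example runs without MPI, and `HashInput` and `MismatchedRanks` are hypothetical names. In a real run each rank would hash its own serialized input and compare it against a hash broadcast from the root.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: hash one rank's serialized parameter input.
std::size_t HashInput(const std::string &serialized) {
  return std::hash<std::string>{}(serialized);
}

// Returns the ranks whose hash differs from rank 0 (the "root").
// Each vector entry stands in for the input held by one MPI rank.
std::vector<int> MismatchedRanks(const std::vector<std::string> &per_rank_input) {
  assert(!per_rank_input.empty());  // need at least a root rank
  const std::size_t root_hash = HashInput(per_rank_input[0]);
  std::vector<int> bad;
  for (std::size_t r = 1; r < per_rank_input.size(); ++r) {
    if (HashInput(per_rank_input[r]) != root_hash) {
      bad.push_back(static_cast<int>(r));
    }
  }
  return bad;
}
```

If state is added on rank 0 only, every other rank ends up in the mismatch list. Adding the same state on all ranks makes the list empty, which would be consistent with the workaround described above.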

brryan added the bug label Nov 25, 2024
brryan self-assigned this Nov 25, 2024
adamdempsey90 (Collaborator) commented

What is this actually testing? That the parameter file is the same on all ranks?

adamdempsey90 (Collaborator) commented

It's the restart string of bytes we stuff into the params (in InitializeFromRestart)?

adamdempsey90 (Collaborator) commented Nov 25, 2024

If it's complaining about the reb_sim we add at line 198, then that's never used when outputting/reading, so the fatal is kind of annoying. I don't think there's anything wrong with what we're doing, i.e., we read/write the rebound sim correctly.

@adamdempsey90
Copy link
Collaborator

Can we just add some flag to parameters that says don't write/read these from the outputs?
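A rough sketch of what such a flag could look like, assuming a simple key/value store. `ParamStore`, `Add`, `SerializeForOutput`, and `OutputHash` are hypothetical names for illustration, not Parthenon's Params API, and values are plain strings to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical parameter store; not Parthenon's Params class.
class ParamStore {
 public:
  // in_output = false marks an entry as rank-local: it stays in memory
  // but is skipped when serializing for outputs or consistency checks.
  void Add(const std::string &key, const std::string &value, bool in_output = true) {
    assert(!key.empty());  // keys must be non-empty
    entries_[key] = Entry{value, in_output};
  }

  // Serializes only the entries flagged for output, in key order.
  std::string SerializeForOutput() const {
    std::string out;
    for (const auto &kv : entries_) {
      if (kv.second.in_output) out += kv.first + "=" + kv.second.value + "\n";
    }
    return out;
  }

  // Hash used for a cross-rank comparison; ignores rank-local entries.
  std::size_t OutputHash() const {
    return std::hash<std::string>{}(SerializeForOutput());
  }

 private:
  struct Entry {
    std::string value;
    bool in_output;
  };
  std::map<std::string, Entry> entries_;
};
```

With this shape, a rank that also holds a rank-local entry (for example a live simulation object) would still produce the same output hash as ranks that do not, as long as the entry is added with `in_output = false`.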

brryan (Collaborator, Author) commented Nov 26, 2024

Yeah, I think it's just reb_sim, not the string we create from it to store for restarts, which should be common to all procs. I think you're right that Parthenon is too aggressive in testing for consistency across parameters: in general we don't save custom types, so differences in them across ranks shouldn't matter. I'll take a look to see if this would be straightforward to fix.
