
Use MPI_Bcast instead of multiple p2p messages to update nest from parent #743

Merged
merged 20 commits into NOAA-EMC:develop on Jan 22, 2024

Conversation

@dkokron (Contributor) commented on Dec 20, 2023

Performance profiling of a HAFS case on NOAA systems revealed that a significant amount of time was spent in fill_nested_grid_cpl(), a routine in FV3/atmos_cubed_sphere/driver/fvGFS/atmosphere.F90. This routine gathers a global SST field (6,336,000 bytes) onto rank 0 and then broadcasts that field to all ranks in the nest. The current code uses point-to-point (p2p) messages (Isend/Recv) from rank 0 to the nest ranks. This communication pattern saturates the SlingShot-10 link on the first node, resulting in a 0.15 s hit every fifth time step.
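For reference, the update pattern described above looks roughly like the sketch below. This is illustrative only, not the actual fill_nested_grid_cpl() code; the subroutine name and the global_sst, nest_pelist, and tag arguments are hypothetical stand-ins.

```fortran
! Sketch of the current point-to-point update (hypothetical names):
! rank 0 holds the gathered global SST field and posts one Isend per
! nest rank, so every copy of the field leaves through rank 0's link.
subroutine update_nest_p2p(global_sst, nest_pelist, tag)
  use mpi
  implicit none
  real,    intent(inout) :: global_sst(:)   ! gathered global SST field
  integer, intent(in)    :: nest_pelist(:)  ! MPI ranks of the nest
  integer, intent(in)    :: tag
  integer :: my_rank, n, ierr
  integer :: requests(size(nest_pelist))

  call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierr)
  if (my_rank == 0) then
     ! Rank 0 posts one Isend per nest rank, then waits for completion.
     do n = 1, size(nest_pelist)
        call MPI_Isend(global_sst, size(global_sst), MPI_REAL, nest_pelist(n), &
                       tag, MPI_COMM_WORLD, requests(n), ierr)
     end do
     call MPI_Waitall(size(nest_pelist), requests, MPI_STATUSES_IGNORE, ierr)
  else if (any(nest_pelist == my_rank)) then
     ! Each nest rank receives its own full copy of the field from rank 0.
     call MPI_Recv(global_sst, size(global_sst), MPI_REAL, 0, tag, &
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
  end if
end subroutine update_nest_p2p
```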

The proposed fix modifies the relevant FV3 code to use a single MPI_Bcast (via mpp_broadcast()) instead of multiple point-to-point messages. The use of mpp_broadcast depends on a fix to FMS that was merged on 16 June and is available in version 2023.02 of that package:
NOAA-GFDL/FMS#1246
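A minimal sketch of the intended replacement is below, assuming the mpp_broadcast(data, length, from_pe, pelist) interface from FMS mpp_mod; the subroutine name and the bcast_pelist argument are hypothetical.

```fortran
! Sketch of the collective replacement (hypothetical names): a single
! broadcast lets MPI fan the data out along a tree instead of funnelling
! every copy of the field through the root rank's network link.
subroutine update_nest_bcast(global_sst, bcast_pelist)
  use mpp_mod, only: mpp_broadcast, mpp_root_pe
  implicit none
  real,    intent(inout) :: global_sst(:)    ! gathered global SST field
  integer, intent(in)    :: bcast_pelist(:)  ! root PE plus all nest PEs

  ! All PEs in bcast_pelist call this; the field on the root PE is
  ! broadcast to the others with one collective.
  call mpp_broadcast(global_sst, size(global_sst), mpp_root_pe(), &
                     pelist=bcast_pelist)
end subroutine update_nest_bcast
```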

This PR depends on the merging of PR 272 into GFDL_atmos_cubed_sphere:
NOAA-GFDL/GFDL_atmos_cubed_sphere#272

I ran the UFS regression suite on acorn and cactus. Both runs resulted in "REGRESSION TEST WAS SUCCESSFUL".

This change is zero-diff, so there is no need to update the baselines.

@BrianCurtis-NOAA (Collaborator)

Waiting on ACS approval, the hash update, and the .gitmodules revert.

@jkbk2004 merged commit 6c2b775 into NOAA-EMC:develop on Jan 22, 2024 (2 checks passed).