WeeklyTelcon_20180410
Geoffrey Paulsen edited this page Jan 15, 2019
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Jeff Squyres
- Brian
- Edgar Gabriel
- Geoffroy Vallee
- Howard
- Josh Hursey
- Nathan Hjelm
- Thomas Naughton
- Todd Kordenbrock
- Xin Zhao
Review All Open Blockers
Review v2.x Milestones v2.1.3
- v2.1.4 - Targeting Oct 15th.
- Merged in a bunch of stuff.
- One-sided multithreaded bugs that came up.
- Doesn't feel like it's worth fixing in v2.1.x, so instead pulled configury changes from v2.0 to v2.1.x.
- No new news on v2.1.x
Review v3.0.x Milestones v3.0.2
- v3.0.1 went out the door.
- Oops: did not get the PMIx compatibility pieces into the embedded PMIx.
- v3.0.2 open for bugfixes. Quick turnaround on this.
- Shooting for May 1st.
- Will pre-emptively fix PMIx compatibility pieces to pick up PMIx v1.2.5 clients.
- This will bring in PMIx compatibility with OMPI client (mpirun/orted/libmpi) from OMPI v2.1.3
- memkind disable needs to get into v3.0.2; either already taken care of or waiting to be taken care of.
- PR (fix ppc64-big-Endian) can't merge until 4563 is merged.
- Thought Nathan was going to fix the hang, and then merge.
- Given this is the same issue as ARM, where we don't have a block, thought we'd just remove
- We now understand the problem: it is not silent data corruption, just a hang.
Review v3.1.x Milestones v3.1.0
- Schedule - ASAP - but blockers keep getting filed.
- No one seems particularly eager to get it out.
- Two blockers
- One is a high level of failures in Cisco MTT. Pretty sure it's not unique to v3.1.x; it's also happening on v3.0.x.
- Issue 4857: in some situations, v3.1.x produces mpicc wrappers that can't link correctly.
- Decided to close as can't replicate.
Review Master Master Pull Requests
- Nothing new.
- Implications for Open MPI
- PMIx client v1.2.3 with server v1.2.3 works (all testing of a version against itself works).
- This graph is coming from a PMIx client / server standpoint, and describes
- Wasn't there some blanket cross-version support statements?
- v1.2.5, v2.0.3, v2.1.1, v3.0.0
- How is PMIx dstore represented in this graph? An ORTE MCA parameter is needed for client/server mismatch.
- There is a 3rd chart to describe what testing should be done.
- This chart does not describe configuring with external PMIx, and compatibility.
- Containers and externals are different, to be discussed later.
- Need to figure out how to discuss this with users.
- Perhaps discussing compatibilities between users' tools (ORTE / slurm / mpirun / debuggers / etc.)
- One of the good things about PMI v1 or v2 is that their interfaces stayed the same for years.
- Well, also, with PMIx supporting multiple "levels", the message is no longer "use PMI v1/v2 everywhere"; there are various levels of support / compatibility everywhere.
- IBM CI is back up
- Cisco and IBM MTT didn't trigger last night.
Review Master MTT testing
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon
- Cisco, ORNL, UTK, NVIDIA