Meeting 2015 01
This is a standalone meeting; it is not being held in conjunction with an MPI Forum meeting.
Doodle for choosing the date: https://doodle.com/zzaupgxge9y6medu
- Date: 9am Tuesday, January 27 through 3pm Thursday, January 29, 2015
- Location: Cisco Richardson facility (outside Dallas), building 4:
Cisco Building 4
2200 East President George Bush Highway
Richardson, Texas 75082-3550
Google maps link: https://goo.gl/maps/SNrbu
Local attendees:
- (*) Jeff Squyres - Cisco
- (*) Howard Pritchard - Los Alamos
- (*) Ralph Castain - Intel
- (*) George Bosilca - U. Tennessee, Knoxville
- (*) Dave Goodell - Cisco
- (*) Edgar Gabriel - U. Houston
- (*) Vish Venkatesan (not Tuesday) - Intel
- (*) Geoff Paulsen - IBM
- (*) Joshua Ladd - Mellanox Technologies
- (*) Rayaz Jagani - IBM
- (*) Dave Solt - IBM
- (*) Perry Schmidt - IBM
- (*) Naoyuki Shida - Fujitsu
- (*) Shinji Sumimoto - Fujitsu
- (*) Stan Graves - IBM
- (*) Mark Allen - IBM
- ...please add your name if you plan to attend...
(*) = Registered (by Jeff)
Remote attendees
- Nathan Hjelm - Los Alamos
- Ryan Grant - Sandia (planning to attend for the MTL and 1.9 branch discussions)
Wed afternoon (in priority order)
Thurs morning
- Ralph: ORCM update
- Roadmap
- Instant On launch planning
- Jeff: Progress on thread-multiple support
- Ralph: Collective switching points & MPI tuning params - what is required to change them. Mellanox raised this in an earlier discussion that we never finished.
- Intel/LANL: MTL selection issue (PSM vs. OFI)
- Nathan: Enhance MTL interface to include one-sided and atomics
Deferred
- Ralph: RTE-MPI sharing of BTLs
Since this will be a full meeting in itself, we'll have a good amount of time for discussion, design, and for hacking!
-
Jeff/Howard: Branch for v1.9
- See Releasev19 wiki page
- We need to make a list of features for v1.9.0 to see if we're ready to branch yet
-
Jeff: libtool 2.4.4 bug / libltdl may no longer be embeddable. Should we embed manually, or should we just tell people to have libltdl-devel installed?
- Resolved: let's stop embedding; we'll always link against external libltdl.
- However: this means people need to have the libltdl headers installed (e.g., libltdl-devel RPM). We don't care about telling developers to do this, but we are a little worried about telling users to do this, because it raises the bar for building Open MPI: libltdl-devel is almost certainly not installed on most user machines.
- The question becomes: what is configure's default behavior when it can't find ltdl.h?
- Abort
- Just fall back to --disable-dlopen behavior (i.e., slurp in plugins)
- Let's bring up the "default behavior" issue as an RFC / beer discussion.
-
Jeff/Howard: Jenkins integration with Github:
- how do we do multiple Jenkins servers? (e.g., running at different organizations)
- much discussion in the room. Seems like a good idea to have multiple Jenkins polling github and running their own smoke tests. Need to figure out how to have them report results. Mike Dubman/Eugene V/Dave G will go investigate how to do this.
-
Howard/George: fate of coll ML
-
see http://www.open-mpi.org/community/lists/devel/2015/01/16820.php
-
who owns it?
-
should we try to fix it or disable by default?
-
Point was raised that coll/ml is very expensive during communicator creation -- including MPI_COMM_WORLD. Should we delete coll/ml? George asked Pasha; Pasha is checking.
-
Pasha: disable it for now, ORNL will fix and re-enable
-
DONE: George opal_ignore'd the coll/ml component
-
Ralph: Scalable startup, including:
- Current state of opal_pmix integration
- Async modex, static endpoint support
- Re-define the role of PML/BTL add_procs: need to move to a lazier setup of peers
- Memory footprint reduction
- Resolved:
- Revive sparse groups
- Edgar checked: passes smoke test today
- first phase: replace ompi_proc_t array with pointer array to ompi_proc_t's
- investigate further reduction in footprint
- very simple, 1-way static setup of group hash, currently optimized for MCW
- remove add_procs from MPI_Init unless preconnect called
- PML calls add_procs with 1 proc on first send to peer
- need centralized method to check if we need to make a proc (must be thread safe)
- may need to poll BTLs...etc. Expensive! Async? Must also be done thread safe
- still a blocking call
- Nathan: if one-sided calls BTLs directly, then need to check/call add_procs
- call add_procs with all procs for preconnect-all and in connect/accept, or if PML component indicates it needs to add_procs with all procs
- need to check with MTL owners on impact to them
- will only add_procs a peer proc at most once before it is del_proc'd
- del_procs needs to release memory and NULL the proc entry to ensure that you get NULL when you next look for the proc
- differentiate between "I need a proc for..."
- communication
- non-communication
- need to check BTL/MTLs to see how they handle messages from peers that we don't have an ompi_proc_t for
- need way for BTL/MTL to upcall the PML with the message so the PML can create a new ompi_proc_t, call add_proc, handle message
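The lazy add_procs idea resolved above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Open MPI implementation: `sketch_proc_t` stands in for `ompi_proc_t`, `sketch_add_procs_one` stands in for the PML/BTL add_procs call with a single peer, and every `sketch_*` name is hypothetical. It also assumes that a "non-communication" lookup simply returns whatever is in the pointer array without creating anything.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NPEERS 8

/* Hypothetical stand-in for ompi_proc_t; the real structure has many more fields. */
typedef struct sketch_proc {
    int rank;
    int add_procs_calls;   /* how often the transport was told about this peer */
} sketch_proc_t;

/* Pointer array instead of an array of structs: NULL means "no proc yet". */
static sketch_proc_t *sketch_procs[SKETCH_NPEERS];
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for add_procs called with exactly one peer. */
static void sketch_add_procs_one(sketch_proc_t *proc)
{
    proc->add_procs_calls++;
}

/* "I need a proc for communication": create it and add_procs it on first use.
 * The lock is held across add_procs, so this is still a blocking call. */
sketch_proc_t *sketch_proc_for_comm(int rank)
{
    sketch_proc_t *proc;

    assert(rank >= 0 && rank < SKETCH_NPEERS);
    pthread_mutex_lock(&sketch_lock);
    proc = sketch_procs[rank];
    if (NULL == proc) {
        proc = calloc(1, sizeof(*proc));
        if (NULL != proc) {
            proc->rank = rank;
            sketch_add_procs_one(proc);   /* at most once until del_proc */
            sketch_procs[rank] = proc;
        }
    }
    pthread_mutex_unlock(&sketch_lock);
    return proc;
}

/* "I need a proc for non-communication" (assumed semantics): never creates
 * a proc and may return NULL. */
sketch_proc_t *sketch_proc_peek(int rank)
{
    sketch_proc_t *proc;

    assert(rank >= 0 && rank < SKETCH_NPEERS);
    pthread_mutex_lock(&sketch_lock);
    proc = sketch_procs[rank];
    pthread_mutex_unlock(&sketch_lock);
    return proc;
}

/* del_procs: release the memory and NULL the entry so the next lookup sees NULL. */
void sketch_del_proc(int rank)
{
    assert(rank >= 0 && rank < SKETCH_NPEERS);
    pthread_mutex_lock(&sketch_lock);
    free(sketch_procs[rank]);
    sketch_procs[rank] = NULL;
    pthread_mutex_unlock(&sketch_lock);
}
```

A single global mutex is the simplest way to make the check thread safe; whether that is acceptable on the send path, or whether polling the BTLs needs to happen asynchronously, is exactly the open question noted above.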
-
COMM_SPLIT_TYPE PR: https://github.com/open-mpi/ompi/pull/326 -- what about IP issues?
-
Jeff added request to PR that the author mark it as released as BSD so we can properly ingest it
-
George to contact offlist to discuss enhancements
-
Edgar: extracting libnbc core from the collective component into a standalone directory such that it can be used from OMPIO and other locations
- move the libnbc core portions into a subdirectory in ompi
- modifications to libnbc will include new read/write primitives as well as new send/recv primitives with an additional indirection level for buffer pointers.
-
Ralph: Review: v1.8 series / RM experience with Github and Jenkins and the release process
- Ralph's feedback: lots more PRs than we used to have CMRs
- Ralph's feedback: people seem to be relying on Jenkins for correctness, when Jenkins is really just a smoke test
- Github fans will look at creating some helpful scripts to support MTT testing of PRs
-
Ralph: PMIx update
- Given orally at meeting
-
Ralph: Data passing down to OPAL
- Revising process naming scheme
- MPI_Info
- OPAL_info (renamed) object and typedef it at the OMPI layer
- Dave Solt from IBM volunteered
- Error response propagation (e.g., BTL error propagation up from OPAL into ORTE and OMPI, particularly in the presence of async progress).
- Create opal_errhandler registration, call that function with errcode and remote process involved (if applicable) when encountering error that cannot be propagated upward (e.g., async progress thread)
- Ralph will move the orte_event_base + progress thread down to OPAL
- Ralph will provide opal_errhandler registration and callback mechanism
- Ralph will integrate the pmix progress thread to the OPAL one
- opal_event_base priority reservations:
- error handler (top)
- next 4 levels for BTLs
- lowest 3 levels for ORTE/RTE
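A minimal sketch of what an error-handler registration and callback mechanism along these lines might look like. None of these names are the actual OPAL API: `sketch_errhandler_register`, `sketch_errhandler_invoke`, the callback signature, and the example handler are all hypothetical. The priority constants assume 8 event priority levels with 0 as the highest (the libevent convention), which matches the 1 + 4 + 3 split listed above but was not stated explicitly in the notes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for "no single remote process applies". */
#define SKETCH_NO_PEER (-1)

/* Assumed layout of the reserved priorities; 0 is the highest priority. */
enum {
    SKETCH_PRI_ERROR     = 0,  /* error handler (top) */
    SKETCH_PRI_BTL_FIRST = 1,  /* next 4 levels for BTLs: 1..4 */
    SKETCH_PRI_BTL_LAST  = 4,
    SKETCH_PRI_RTE_FIRST = 5,  /* lowest 3 levels for ORTE/RTE: 5..7 */
    SKETCH_PRI_RTE_LAST  = 7,
    SKETCH_PRI_COUNT     = 8
};

/* Hypothetical callback type: error code plus the remote process involved. */
typedef void (*sketch_errhandler_fn_t)(int errcode, int peer, void *cbdata);

static sketch_errhandler_fn_t sketch_handler = NULL;
static void *sketch_cbdata = NULL;

/* Upper layers (e.g., the RTE or MPI layer) register one handler.
 * Synchronization between register and invoke is omitted in this sketch. */
void sketch_errhandler_register(sketch_errhandler_fn_t fn, void *cbdata)
{
    sketch_handler = fn;
    sketch_cbdata = cbdata;
}

/* Called by code that cannot return an error upward, such as an async
 * progress thread. Returns 1 if a handler ran, 0 if none is registered. */
int sketch_errhandler_invoke(int errcode, int peer)
{
    if (NULL == sketch_handler) {
        return 0;
    }
    sketch_handler(errcode, peer, sketch_cbdata);
    return 1;
}

/* Example handler: records the most recent error in caller-provided storage. */
typedef struct sketch_last_error {
    int errcode;
    int peer;
    int count;
} sketch_last_error_t;

void sketch_record_error(int errcode, int peer, void *cbdata)
{
    sketch_last_error_t *last = cbdata;

    assert(NULL != last);
    last->errcode = errcode;
    last->peer = peer;
    last->count++;
}
```

In a real implementation the invoke step would more likely activate an event at the top priority on the shared event base, so that the handler runs in the progress thread rather than in the caller's context; the direct call here only keeps the sketch short.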
-
Howard: Progress on async progress
- What happened to this proposal: http://www.open-mpi.org/community/lists/devel/2014/02/14170.php
- Ralph will implement a global opal_event_base as part of the error response, as per above
-
Nathan: --disable-smp-locks: remove this option?
- See RFC email http://www.open-mpi.org/community/lists/devel/2015/01/16736.php
- See, in particular, George's replies
- In short: atomics are only used when multi-threading is enabled. But sm and vader need the smp locks.
- However, people are discovering and using --disable-smp-locks, which breaks sm/vader.
- OMPI atomic functions:
- CAPS versions: only enabled when opal_using_threads() is true
- true after opal_set_using_threads(true), i.e., when running with MPI_THREAD_MULTIPLE
- lower_case versions: only enabled with --enable-smp-locks
- George misunderstood: now he gets it and agrees with Nathan: remove the --enable-smp-locks option.
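The difference between the two flavours of atomic functions can be illustrated with a small sketch. The names below are hypothetical stand-ins, not the OPAL macros themselves: `SKETCH_ENABLE_SMP_LOCKS` models the configure option, `sketch_atomic_add_32` models a lower_case atomic, and `SKETCH_THREAD_ADD32` models a CAPS version. The `__atomic_add_fetch` builtin assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models the configure-time choice; 1 corresponds to --enable-smp-locks. */
#ifndef SKETCH_ENABLE_SMP_LOCKS
#define SKETCH_ENABLE_SMP_LOCKS 1
#endif

static bool sketch_using_threads_flag = false;

static inline bool sketch_using_threads(void)
{
    return sketch_using_threads_flag;
}

static inline void sketch_set_using_threads(bool value)
{
    sketch_using_threads_flag = value;
}

/* lower_case flavour: a real atomic only when SMP locks are compiled in.
 * Shared-memory transports call this directly, regardless of threading. */
static inline int32_t sketch_atomic_add_32(int32_t *addr, int32_t delta)
{
    assert(NULL != addr);
#if SKETCH_ENABLE_SMP_LOCKS
    return __atomic_add_fetch(addr, delta, __ATOMIC_SEQ_CST);
#else
    *addr += delta;   /* not safe across cores or processes */
    return *addr;
#endif
}

/* CAPS flavour: pays for the atomic only when threads are in use. */
static inline int32_t SKETCH_THREAD_ADD32(int32_t *addr, int32_t delta)
{
    assert(NULL != addr);
    if (sketch_using_threads()) {
        return sketch_atomic_add_32(addr, delta);
    }
    *addr += delta;
    return *addr;
}
```

Seen this way, the problem with the configure option is visible in the `#else` branch: two processes sharing a memory segment need the atomic path even in a single-threaded build, so compiling it out breaks them.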
-
Nathan: Performance of freelists and other common OPAL classes with OPAL_ENABLE_MULTI_THREADS==1 (as discussed in [GitHub]). Part of this is done already -- LIFO is a bit faster now (with threads), etc.
- This is pretty much already resolved (after this item was added to the agenda) -- a fix went in on master for this, and a different fix went in for v1.8.
- So the issue is now moot.
-
Vish: Memkind integration: see http://www.open-mpi.org/community/lists/devel/2014/11/16320.php
- Vish has slides that he will post here.
- Notes from the discussion in the room:
- We all generally agree that memkind introduces some new, desirable functionality
- With some discussion in the room, it seems "easy" to add this functionality to MPI_ALLOC_MEM/MPI_FREE_MEM.
- We decided that it's quite hard to know how to use this internally in the rest of the OMPI code base right now. We assume we will want to use it; we just don't know how yet (there are many variables). So let's get some experience with memkind in MPI_ALLOC_MEM first and revisit how to use this internally in the rest of the code base.
- Here are the 4 steps we think we need to do:
- remove "allocator" framework use from ob1, replace it with malloc (because the use of allocator there seems to be pretty useless)
- create new allocator modules for things like:
- posix_memalign
- mmap
- malloc
- ...?
- change the mpool framework/modules to use allocator modules to get memory
- update MPI_Alloc_mem to:
- lazily create allocator modules from memkind when each memkind type requested
- make an mpool with that allocator
- allocate memory from the mpool associated with that memkind allocator type
- (somehow) register the memory with all other mpools (e.g., mpools in use by the BTLs)
- MPI_FREE_MEM needs to unregister with all mpools (probably already done?)
- MPI_FREE_MEM needs to return the memory to the right mpool
- Nathan and Vish will coordinate to move forward on this.
- George and Nathan are digging in to ensure that allocator is not already being used in a way that will be problematic. ob1 usage seems to be understood / ok to change. sm mpool needs to be investigated -- it uses allocator, too.
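The MPI_Alloc_mem steps above could look roughly like the sketch below. It is a simplification under several assumptions: the allocator and mpool layers are collapsed into one hypothetical `sketch_allocator_t`, plain `malloc` stands in for the memkind-backed allocation, the two kinds in `sketch_kind_t` are invented for illustration, and registration with other mpools appears only as comments. The point is the lazy creation of one allocator per requested kind and the return of memory to the allocator that owns it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical memory kinds; real kinds would come from memkind. */
typedef enum {
    SKETCH_KIND_DEFAULT = 0,
    SKETCH_KIND_HBW,
    SKETCH_KIND_COUNT
} sketch_kind_t;

/* Stand-in for "allocator module + mpool" for one kind. */
typedef struct sketch_allocator {
    sketch_kind_t kind;
    size_t live;   /* number of outstanding allocations */
} sketch_allocator_t;

static sketch_allocator_t *sketch_allocators[SKETCH_KIND_COUNT];

/* Header placed before user memory so free can find the owning allocator.
 * The max_align_t member keeps the user memory maximally aligned. */
typedef union sketch_header {
    sketch_allocator_t *owner;
    max_align_t align;
} sketch_header_t;

/* Lazily create the allocator the first time a kind is requested. */
static sketch_allocator_t *sketch_get_allocator(sketch_kind_t kind)
{
    if (NULL == sketch_allocators[kind]) {
        sketch_allocator_t *a = calloc(1, sizeof(*a));
        if (NULL == a) {
            return NULL;
        }
        a->kind = kind;
        sketch_allocators[kind] = a;
    }
    return sketch_allocators[kind];
}

/* Rough analogue of MPI_Alloc_mem with a requested kind. */
void *sketch_alloc_mem(size_t size, sketch_kind_t kind)
{
    sketch_allocator_t *a;
    sketch_header_t *h;

    assert((int) kind >= 0 && (int) kind < (int) SKETCH_KIND_COUNT);
    a = sketch_get_allocator(kind);
    if (NULL == a) {
        return NULL;
    }
    h = malloc(sizeof(*h) + size);   /* a memkind-backed call would go here */
    if (NULL == h) {
        return NULL;
    }
    h->owner = a;
    a->live++;
    /* A real implementation would register the memory with the other mpools here. */
    return h + 1;
}

/* Rough analogue of MPI_Free_mem: return memory to the allocator that owns it. */
void sketch_free_mem(void *ptr)
{
    sketch_header_t *h;

    if (NULL == ptr) {
        return;
    }
    h = (sketch_header_t *) ptr - 1;
    /* A real implementation would unregister from all mpools here. */
    h->owner->live--;
    free(h);
}

/* Outstanding allocations for a kind; 0 if its allocator was never created. */
size_t sketch_live(sketch_kind_t kind)
{
    return (NULL != sketch_allocators[kind]) ? sketch_allocators[kind]->live : 0;
}
```

Storing the owner in a header is just one way to find "the right mpool" on free; a lookup keyed on the address range would avoid the per-allocation overhead at the cost of a search.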
-
Fujitsu: future plans for Open MPI development
- Shinji will post slides here.
-
Ralph/Nathan: MTL overhead reduction
- ...more...
-
Jeff: MPI extensions: MPIX_ prefix, or OMPI_ prefix?
- Just a discussion between Jeff and George.
- [Jenkins and CI Testing](https://github.com/open-mpi/ompi/wiki/jenkins_ci_testing_etc.pdf)