Meeting 2016 08
Logistics:
- Start: 9am, Tue Aug 16, 2016
- Finish: 1pm, Thu Aug 18, 2016
- Location: IBM facility, Dallas, TX
- Attendance fee: $50/person, see registration link below
Please both register at EventBrite ($50/person) and add your name to the wiki list below if you are coming to the meeting:
- Jeff Squyres, Cisco
- Howard Pritchard, LANL
- Geoffrey Paulsen, IBM
- Ralph Castain, Intel
- George Bosilca, UTK (17 and 18)
- Josh Hursey, IBM
- Edgar Gabriel, UHouston
- Takahiro Kawashima, Fujitsu
- Shinji Sumimoto, Fujitsu
- Brian Barrett, Amazon Web Services
- Nathan Hjelm, LANL
- Sameh Sharkawi, IBM (17 and 18)
- Mark Allen, IBM
- ...please fill in your name here if you're going to attend...
- Plans for v2.1.0 release
- Need community to contribute what they want in v2.1.0
- Want to release by end of 2016 at the latest
- After v2.1.0 release, should we merge from master to the v2.x branch?
- Only if there are no backwards compatibility issues (!)
- This would allow us to close the divergence/gap from master to v2.x, but keep life in the v2.x series (which is attractive to some organizations)
- Alternatively, we might want to fork and create a new 3.x branch.
- Present information about IBM Spectrum MPI, processes, etc.
- May have PRs ready to discuss requested changes, but the schedule is tight in July / August for us.
- MTT updates / future direction
- Migration to new cloud services update for website, database, etc.
- Spend time migrating Jenkins: IU -> Ralph's server
- Spend time migrating MTT: IU -> Ralph's server
- What do we want to do after Ralph's server for Jenkins and MTT?
- MTT: new server / cherrypy
- Jenkins: Java
- Revamp / consolidate: ompi master:contrib/ -- there are currently 3 subdirs that should be disambiguated and their overlap removed. Perhaps name subdirs by the DNS name where they reside / operate?
- infrastructure
- build server
- nightly
- Spend time documenting where everything is / how it is setup
- Fix OMPI timeline page: https://www.open-mpi.org/software/ompi/versions/timeline.php
- Possible umbrella non-profit organization
- How to help alleviate "drowning in CI data" syndrome?
- One example: https://github.com/open-mpi/ompi/pull/1801
- One suggestion: should we actively market for testers in the community to help wrangle this stuff?
- If Jenkins detects an error, can we get Jenkins to re-run the tests without the PR changes and then compare the results, to see whether the PR itself introduces a new error?
- How do we stabilize Jenkins to alleviate all these false positives?
- PMIx roadmap discussions
- Thread-safety design
- Need some good multi-threaded performance tests (per Nathan and Artem discussion); see the sketch below
- Do we need to write them ourselves?
- Review/define the path forward
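For the multi-threaded performance-test item above, here is a minimal sketch of what such a test could look like, assuming `MPI_THREAD_MULTIPLE` and exactly two ranks; the thread count, message size, and iteration count are arbitrary placeholders, not an agreed benchmark design:

```c
/* Minimal multi-threaded message-rate sketch (assumes exactly 2 ranks).
 * NTHREADS, NITERS, and MSG_SIZE are placeholders. */
#include <mpi.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 4
#define NITERS   10000
#define MSG_SIZE 8

static void *pingpong(void *arg)
{
    int tid = (int)(intptr_t)arg;
    int rank;
    char buf[MSG_SIZE] = {0};

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each thread runs an independent ping-pong stream; the thread id is
     * used as the tag so the streams do not interfere with each other. */
    for (int i = 0; i < NITERS; i++) {
        if (0 == rank) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, tid, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, tid, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, tid, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, tid, MPI_COMM_WORLD);
        }
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank;
    pthread_t threads[NTHREADS];

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double start = MPI_Wtime();
    for (int t = 0; t < NTHREADS; t++) {
        pthread_create(&threads[t], NULL, pingpong, (void *)(intptr_t)t);
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(threads[t], NULL);
    }
    double elapsed = MPI_Wtime() - start;

    if (0 == rank) {
        double msgs = 2.0 * NTHREADS * NITERS;  /* total messages exchanged */
        printf("%.0f messages/sec aggregate\n", msgs / elapsed);
    }

    MPI_Finalize();
    return 0;
}
```

Something like `mpirun -np 2 ./mt_msgrate`, run at several thread counts, would give a first-cut view of how message rate scales with threads.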
- Fujitsu status
- Memory consumption evaluation
- MTT status
- PMIx status
- MCA support as a separate package?
- Now that we have multiple projects (PMIx, Warewulf) and others using MCA plugins, does it make sense to create a separate repo/package for MCA itself? Integrating MCA into these projects was modestly painful (e.g., identifying what other infrastructure, such as argv.h/c, needs to be included); perhaps a more packaged solution would make it simpler (see the sketch below).
- Need to "tag" the component libraries with their project name as library confusion is becoming more prevalent as OMPI begins to utilize MCA-based packages such as PMIx
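For context on the duplication described above, here is a rough, hypothetical illustration of the kind of dlopen-based component loading each embedding project currently assembles on its own; the `toy_component_t` descriptor and the `toy_component` symbol name are invented for illustration and are not the real MCA API:

```c
/* Hypothetical plugin loader: dlopen() a shared object and look up a
 * well-known registration symbol.  Link with -ldl. */
#include <dlfcn.h>
#include <stdio.h>

/* Invented component descriptor; the real MCA component struct differs. */
typedef struct {
    const char *name;
    int (*open)(void);
    int (*close)(void);
} toy_component_t;

static toy_component_t *load_component(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (NULL == handle) {
        fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
        return NULL;
    }
    /* Each component is expected to export a descriptor under a
     * conventional symbol name. */
    return (toy_component_t *)dlsym(handle, "toy_component");
}

int main(void)
{
    toy_component_t *comp = load_component("./toy_plugin.so");
    if (NULL != comp && 0 == comp->open()) {
        printf("loaded component: %s\n", comp->name);
        comp->close();
    }
    return 0;
}
```

A standalone MCA package would presumably replace this kind of hand-rolled loader, plus the supporting utility code (argv.h/c and friends), with one shared implementation.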
- Plans for folding `ompi-release` Github repo back into `ompi` Github repo
- (Possibly) Remove atomics from `OBJ_RETAIN`/`OBJ_RELEASE` in the `THREAD_SINGLE` case.
- @nysal said he would look at this.
- See https://github.com/open-mpi/ompi/issues/1902.
- NTH: Already done. We did this in 1.8.x/1.10.x but never committed to updating master.
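A minimal sketch of the OBJ_RETAIN/OBJ_RELEASE idea above, using invented `toy_*` names rather than the actual OPAL object macros: take the atomic path only when thread support is in use, and fall back to plain load/store updates of the reference count in the single-threaded case.

```c
/* Sketch of conditional atomics for object reference counting; the names
 * below are placeholders, not the actual OPAL implementation. */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    _Atomic int ref_count;              /* object reference count */
} toy_object_t;

/* In a real library this would be set at init time from the requested
 * thread support level (e.g. false for THREAD_SINGLE). */
static bool toy_using_threads = false;

static inline void toy_obj_retain(toy_object_t *obj)
{
    if (toy_using_threads) {
        atomic_fetch_add_explicit(&obj->ref_count, 1, memory_order_relaxed);
    } else {
        /* THREAD_SINGLE: no other thread can touch the count. */
        int v = atomic_load_explicit(&obj->ref_count, memory_order_relaxed);
        atomic_store_explicit(&obj->ref_count, v + 1, memory_order_relaxed);
    }
}

static inline int toy_obj_release(toy_object_t *obj)
{
    if (toy_using_threads) {
        return atomic_fetch_sub_explicit(&obj->ref_count, 1,
                                         memory_order_acq_rel) - 1;
    }
    int v = atomic_load_explicit(&obj->ref_count, memory_order_relaxed) - 1;
    atomic_store_explicit(&obj->ref_count, v, memory_order_relaxed);
    return v;                           /* caller frees when this hits zero */
}

int main(void)
{
    toy_object_t obj = { 1 };           /* starts with one reference */
    toy_obj_retain(&obj);
    return toy_obj_release(&obj);       /* returns 1: still referenced */
}
```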
- Continue `--net` `mpirun` CLI option discussion from Feb 2016 meeting
- Originally an IBM proposal.
- Tied to issues of "I just want to use network X" user intent, without needing to educate users on the complexities of PML, MTL, BTL, COLL, ...etc.
- We didn't come to any firm conclusions in August.
- Open MPI non-profit?
- MPI_Reduce_local - move into coll framework.
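For reference, `MPI_Reduce_local` is a purely local operation (no communication is involved); a minimal usage example:

```c
/* MPI_Reduce_local combines inbuf into inoutbuf element-wise with the
 * given operation, entirely within the calling process. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int inbuf[4]    = {1, 2, 3, 4};
    int inoutbuf[4] = {10, 20, 30, 40};

    MPI_Init(&argc, &argv);

    /* inoutbuf[i] = inbuf[i] + inoutbuf[i] */
    MPI_Reduce_local(inbuf, inoutbuf, 4, MPI_INT, MPI_SUM);

    printf("%d %d %d %d\n",
           inoutbuf[0], inoutbuf[1], inoutbuf[2], inoutbuf[3]);

    MPI_Finalize();
    return 0;
}
```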
- Revive btl/openib memalign hooks?
- Discuss appropriate default settings for openib BTL
- Email thread on performance conflicts between RMA/openib and SM/Vader
- Ralph offers to give presentation on "Flash Provisioning of Clusters", if folks are interested
- Cleanup of exposed internal symbols (see https://github.com/open-mpi/ompi/pull/1955)
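One common approach to this kind of symbol cleanup, shown here as a generic GCC/clang illustration rather than a description of how the OMPI tree handles it, is to build shared libraries with `-fvisibility=hidden` and explicitly mark only the intended public API as exported:

```c
/* Build as a shared library with hidden-by-default visibility, e.g.:
 *   cc -shared -fPIC -fvisibility=hidden -o libexample.so example.c
 * Only symbols marked "default" below end up in the dynamic symbol table. */
#define PUBLIC __attribute__((visibility("default")))

/* Exported: part of the intended public API. */
PUBLIC int example_public_call(int x)
{
    return x + 1;
}

/* Not exported when built with -fvisibility=hidden: an internal helper
 * that applications linking the library should never see. */
int example_internal_helper(int x)
{
    return x * 2;
}
```

Checking the result with `nm -D --defined-only libexample.so` then lists only the symbols that are actually exported.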
- Performance Regression tracking
- What do we want to track, and how are we going to track it?
- https://github.com/open-mpi/ompi/issues/1831#issuecomment-229520276
- https://github.com/open-mpi/mtt/issues/445