This repository has been archived by the owner on Apr 30, 2019. It is now read-only.

ARMCI Semantics

Jeff Hammond edited this page Jun 17, 2016 · 6 revisions

Ordering

Unlike blocking operations, nonblocking operations are NOT ordered.

[Reference 4]
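A toy model may help show why the missing ordering guarantee matters. The sketch below does not use ARMCI at all: `pending_put`, `complete_in_order`, and `final_value` are hypothetical names invented for illustration, and the "runtime" is just a loop that completes pending writes in whatever order it is handed. Under those assumptions, two nonblocking puts to the same location leave a final value that depends on completion order rather than issue order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one pending nonblocking put: write `value` to *dst. */
typedef struct {
    int *dst;
    int value;
} pending_put;

/* Complete the pending puts in the order given by `order` (indices into
 * `ops`). This stands in for a runtime that is free to finish nonblocking
 * operations in any order. */
void complete_in_order(const pending_put *ops, const size_t *order, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const pending_put *op = &ops[order[i]];
        *op->dst = op->value;
    }
}

/* Issue two puts (value 1, then value 2) to the same target and return the
 * value left behind for a given completion order. */
int final_value(const size_t order[2])
{
    int target = 0;
    pending_put ops[2] = { { &target, 1 }, { &target, 2 } };
    complete_in_order(ops, order, 2);
    return target;
}
```

Calling `final_value` with the completion order `{0, 1}` leaves 2 in the target, while `{1, 0}` leaves 1. A program that needs the second value to win therefore has to enforce that order itself rather than rely on issue order.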

Atomicity

Acc is atomic at the element level, where the element is defined by the operation. Memory locks are an implementation detail needed to handle races (often between the master process and the data server thread).

You can think of an Acc as an unordered collection (a set, say) of accumulate operations, one per element, issued in some arbitrary order. The order between the updates is not guaranteed, nor is it guaranteed that all of the updates are observed at the same time or in any particular order. The only atomicity is at the element level, and even that holds only with respect to other Acc operations, not Put or Get. Assuming the operation is associative and commutative, the result is determinate.
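The "unordered set of element updates" picture can be sketched in plain C. This is a model, not ARMCI code: `elem_update`, `apply_updates`, `expand`, and `order_gives_sum` are hypothetical names, the element count `N` is arbitrary, and each loop iteration stands in for one indivisible element-level update. The sketch uses integer addition, which is associative and commutative, so every application order should give the same result.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* elements per accumulate call (illustrative) */

/* One element-level update: dst[index] += contribution. */
typedef struct {
    size_t index;
    long contribution;
} elem_update;

/* Apply the element updates in the order given by `order`. Each iteration
 * models one atomic, element-level accumulate. */
void apply_updates(long *dst, const elem_update *u, const size_t *order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[u[order[i]].index] += u[order[i]].contribution;
}

/* Expand two accumulate calls (a and b, N elements each) into 2*N
 * independent element updates. */
void expand(const long *a, const long *b, elem_update out[2 * N])
{
    for (size_t i = 0; i < N; i++) {
        out[i]     = (elem_update){ i, a[i] };
        out[N + i] = (elem_update){ i, b[i] };
    }
}

/* Returns 1 if applying the updates in `order` yields the element-wise
 * sum a + b, and 0 otherwise. */
int order_gives_sum(const size_t order[2 * N])
{
    const long a[N] = { 1, 2, 3, 4 };
    const long b[N] = { 10, 20, 30, 40 };
    long dst[N] = { 0 };
    elem_update u[2 * N];

    expand(a, b, u);
    apply_updates(dst, u, order, 2 * N);
    for (size_t i = 0; i < N; i++)
        if (dst[i] != a[i] + b[i])
            return 0;
    return 1;
}
```

Any permutation of the eight update indices passed to `order_gives_sum` returns 1. With floating-point data the same reasoning holds only approximately, because floating-point addition is not strictly associative and different orders can round differently.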

For comparison, Put can be specified, degenerately, as atomic at the byte level. Because Put is also not commutative, performing concurrent operations on the same byte causes nondeterminacy and is considered erroneous.

This raises another issue: if the lower-level communication runtime is not byte-atomic for Puts, then ARMCI's byte-level communication calls, whose correctness is guaranteed in the presence of concurrent disjoint Puts, will need more effort and cost to implement.
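To make the cost concrete, consider a purely hypothetical runtime that can only transfer whole words and therefore implements a one-byte put as read, modify, write. The sketch below models that in plain C; `WORD`, `rmw_put`, `rmw_read`, `rmw_write`, and the two driver functions are assumptions made for illustration and do not describe any real transport. It interleaves two puts to disjoint bytes of the same word by hand and shows that one update is lost unless the puts are serialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD 4  /* hypothetical transfer granularity, in bytes */

/* Private state of one in-flight byte put on a word-granularity runtime. */
typedef struct {
    uint8_t snapshot[WORD];
} rmw_put;

/* Step 1: read the whole word and patch one byte in the private copy. */
void rmw_read(rmw_put *p, const uint8_t *word, size_t offset, uint8_t value)
{
    memcpy(p->snapshot, word, WORD);
    p->snapshot[offset] = value;
}

/* Step 2: write the whole (possibly stale) word back. */
void rmw_write(const rmw_put *p, uint8_t *word)
{
    memcpy(word, p->snapshot, WORD);
}

/* Two puts to disjoint bytes 0 and 1, with both reads happening before
 * either write. Returns byte 0 afterwards. */
uint8_t interleaved_byte0(void)
{
    uint8_t word[WORD] = { 0 };
    rmw_put p1, p2;

    rmw_read(&p1, word, 0, 0xAA);
    rmw_read(&p2, word, 1, 0xBB);
    rmw_write(&p1, word);
    rmw_write(&p2, word);  /* writes back the stale byte 0 from p2's snapshot */
    return word[0];
}

/* The same two puts, serialized as a lock around each put would force. */
uint8_t serialized_byte0(void)
{
    uint8_t word[WORD] = { 0 };
    rmw_put p1, p2;

    rmw_read(&p1, word, 0, 0xAA);
    rmw_write(&p1, word);
    rmw_read(&p2, word, 1, 0xBB);
    rmw_write(&p2, word);
    return word[0];
}
```

In this model `interleaved_byte0()` returns 0x00 (the first put is lost) while `serialized_byte0()` returns 0xAA. The serialization that restores correctness is the kind of extra effort the paragraph above refers to.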

[Reference 5]

Summary:

  1. Accumulate operations are atomic only with respect to other accumulate operations, not Put or Get.

  2. Put (Get?) is byte-atomic.

References

  1. Jarek Nieplocha and Jialin Ju. ARMCI: A Portable Aggregate Remote Memory Copy Interface. Version 1.1, October 30, 2000.

  2. Jarek Nieplocha and Bryan Carpenter. ARMCI: A Portable Remote Memory Copy Library for Distributed Array Libraries and Compiler Run-time Systems.

  3. J. Nieplocha, V. Tipparaju, A. Saify and D. Panda. Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters.

  4. https://svn.pnl.gov/svn/hpctools/trunk/ga/armci/examples/features/non-blocking/README (https://github.com/jeffhammond/ga/blob/master/armci/examples/features/non-blocking/README).

  5. https://groups.google.com/forum/m/#!topic/hpctools/rXaOSP0Ml-s
