SHMEM_LOCKS: MCS implementation of SHMEM LOCKS #11796
👍
have some minor comments
oshmem/shmem/c/shmem_clear_lock.c
Outdated
@@ -27,5 +28,10 @@

void shmem_clear_lock(volatile long *lock)
{
    _shmem_clear_lock((void *)lock, sizeof(long));
    if (oshmem_shmem_mcs_locks) {
general comment: pls use 4 space tabs as per OMPI Coding Style
I have checked and fixed all files updated in this PR to stick to 4 space tabs.
can you please double check? E.g. looks like 8-width tabs are used in this file (and possibly others)
oshmem/shmem/c/shmem_clear_lock.c
Outdated
@@ -27,5 +28,10 @@

void shmem_clear_lock(volatile long *lock)
{
    _shmem_clear_lock((void *)lock, sizeof(long));
    if (oshmem_shmem_mcs_locks) {
        SHMEM_API_VERBOSE(10, "shmem_clear_lock using MCS Lock");
maybe move this print to the very beginning of the func and also print the locking scheme being cleared?
I have moved the print, and as of the latest commit it also prints the locking scheme.
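For readers following the thread, here is a minimal sketch of the shape being discussed: one log line at function entry that names the active scheme, followed by the dispatch. It is not the code from this PR. `use_mcs_locks`, `mcs_clear`, `ticket_clear`, and `clear_lock` are hypothetical stand-ins, and `fprintf` stands in for the verbose macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the runtime flag that selects the lock scheme. */
static int use_mcs_locks = 1;

/* Call counters so the dispatch can be observed from outside. */
static int mcs_calls;
static int ticket_calls;

/* Hypothetical back ends: each just resets the lock word and counts the call. */
static void mcs_clear(long *lock)    { *lock = 0; mcs_calls++; }
static void ticket_clear(long *lock) { *lock = 0; ticket_calls++; }

static void clear_lock(long *lock)
{
    assert(lock != NULL);

    /* One print at entry, naming the scheme, covers both branches. */
    fprintf(stderr, "clear_lock using %s lock\n",
            use_mcs_locks ? "MCS" : "ticket");

    if (use_mcs_locks) {
        mcs_clear(lock);
    } else {
        ticket_clear(lock);
    }
}
```

Keeping the print ahead of the branch means a new lock scheme only has to extend the string selection, not add another print.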
oshmem/shmem/c/shmem_fetch.c
Outdated
type prefix##_ctx##type_name##_atomic_fetch(shmem_ctx_t ctx, \
    const type *target, int pe) \
unrelated change, pls remove
Done
oshmem/shmem/c/shmem_lock.c
Outdated
MCA_ATOMIC_CALL(cswap(oshmem_ctx_default, target, (void*)&prev_value,
    cond, value, target_size, pe));
pls consider pushing changes in this file in a separate PR, as they are not relevant for the new locking algo
Removed unwanted changes from this PR.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
/*
 * Copyright (c) 2013 Mellanox Technologies, Inc.
 * All rights reserved.
 * Copyright (c) 2014 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2015 Research Organization for Information Science
 * and Technology (RIST). All rights reserved.
 * $COPYRIGHT$
 *
as this is newly introduced file, i'd guess only relevant Nvidia copyright notice should be present
Fixed the License
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
* shmem_atomic_add(next, my_pe - NEXT_MASK,
    prev_tailpe);
why need this?
No we can remove this. also
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
/**
 * Wait for predecessor release lock to this PE
 * signal to false.
 * int curr = shmem_atomic_fetch(next, my_pe);
why needed?
Same here, artifact from previous implementation removed.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
* prev_value = shmem_atomic_compare_swap(tail, swap_cond, 0,
*     mcs_tail_owner);
isn't it almost the same what is used below?
same comment is relevant for other similar comments in this file
I have removed this comment.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
    shmem_mcs_internal_set_lock(lock);
}

void _shmem_mcs_clear_lock(long *lock) {
    shmem_mcs_internal_clear_lock(lock);
}

int _shmem_mcs_test_lock(long *lock) {
    return shmem_mcs_internal_test_lock(lock);
why do we need these wrappers?
They were intended to mirror the existing implementation, but they serve no purpose in this one. Removing them.
oshmem/shmem/c/shmem_set_lock.c
Outdated
@@ -27,5 +28,10 @@

void shmem_set_lock(volatile long *lock)
{
    _shmem_set_lock((void *)lock, sizeof(long));
    if (oshmem_shmem_mcs_locks) {
        SHMEM_API_VERBOSE(10, "shmem_set_lock using MCS Lock");
maybe such comments should be common for all types of locks? (i.e. move them to the very beginning of the func)
Agreed and moved in the latest commit.
Thank you for the review. I have fixed all the issues pointed out in the comments and uploaded a new patch. Please review and let me know if you have any other comments.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
/*
 * Copyright (c) 2023 NVIDIA Corporation.
 * All rights reserved.
 * and Technology (RIST). All rights reserved.
leftover?
Yes it was, removed now.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
RUNTIME_CHECK_RC(retv);

/**
 * This value to be changed eventually by predecessor
alignment
I have redone the alignment for the whole file now, hopefully this patch fixes the alignment issues.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
int next_value = 0;
int swap_cond = 0;
int prev_value = 0;
no need for extra spaces before =
Removed.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
int
_shmem_mcs_test_lock(long *lockp)
{
    lock_t *lock = (lock_t *) lockp;
pls align by =
for consistency with other vars in this func
Done.
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
if (0 == prev_tail)
    return 0; /** lock acquired successfully */
can you please use {} for every statement, like
- if (0 == prev_tail)
-     return 0; /** lock acquired successfully */
+ if (0 == prev_tail) {
+     return 0; /** lock acquired successfully */
+ }
(we try to use this style for oshmem and ucx pml code)
also relevant for other one line statements in this PR
Fixed.
oshmem/shmem/c/shmem_set_lock.c
Outdated
@@ -27,5 +29,12 @@

void shmem_set_lock(volatile long *lock)
{
    _shmem_set_lock((void *)lock, sizeof(long));
    SHMEM_API_VERBOSE(10, "shmem_set_lock");
maybe remove this one now?
Done.
oshmem/shmem/c/shmem_test_lock.c
Outdated
@@ -28,5 +31,12 @@

int shmem_test_lock(volatile long *lock)
{
    return _shmem_test_lock((void *)lock, sizeof(long));
    SHMEM_API_VERBOSE(10, "shmem_test_lock");
remove?
Done
Hello! The Git Commit Checker CI bot found a few problems with this PR:
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Hello! The Git Commit Checker CI bot found a few problems with this PR:
4b2a05a: Update shmem_mcs_lock.c - aligning a commit
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Hello! The Git Commit Checker CI bot found a few problems with this PR:
923785e: Update shmem_test_lock.c - Remove unwanted prints
26b3a73: Update shmem_clear_lock.c remove unwanted print.
4b2a05a: Update shmem_mcs_lock.c - aligning a commit
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Hello! The Git Commit Checker CI bot found a few problems with this PR:
3786663: Update shmem_mcs_lock.c - Some more minor alignmen...
923785e: Update shmem_test_lock.c - Remove unwanted prints
26b3a73: Update shmem_clear_lock.c remove unwanted print.
4b2a05a: Update shmem_mcs_lock.c - aligning a commit
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Hello! The Git Commit Checker CI bot found a few problems with this PR:
5b47412: Update shmem_mcs_lock.c - removed unwanted lines/s...
3786663: Update shmem_mcs_lock.c - Some more minor alignmen...
923785e: Update shmem_test_lock.c - Remove unwanted prints
26b3a73: Update shmem_clear_lock.c remove unwanted print.
4b2a05a: Update shmem_mcs_lock.c - aligning a commit
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Hello! The Git Commit Checker CI bot found a few problems with this PR:
6936a0e: Commits from github online without signed-off opti...
5b47412: Update shmem_mcs_lock.c - removed unwanted lines/s...
3786663: Update shmem_mcs_lock.c - Some more minor alignmen...
923785e: Update shmem_test_lock.c - Remove unwanted prints
26b3a73: Update shmem_clear_lock.c remove unwanted print.
4b2a05a: Update shmem_mcs_lock.c - aligning a commit
94dae7c: Update shmem_mcs_lock.c - minor alignment for "="
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
d0b7901 to 4203927
@yosefe, pls review
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
/**
 * @file
 */
Seems that the comment is not needed here.
removed
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
/** has meaning only on MCSQ_TAIL OWNER */
int tail;
/** It has meaning on all PEs */
/** The next pointer is a combination of the PE ID and wait signal */
- /** The next pointer is a combination of the PE ID and wait signal */
+ /** The next pointer is a combination of the PE ID and wait signal */
Fixed
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
#define SIGNAL_MASK 0x80000000U //Wait signal mask
#define NEXT(A) (A & NEXT_MASK)
#define GET_PE(P) (P & NEXT_MASK) // Improve readability
Please use C-style comments.
Fixed.
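As a side note for readers, the "PE ID plus wait signal" encoding mentioned above can be sketched as below. This is only an illustration under assumptions: the value of `NEXT_MASK` does not appear in this thread, so it is taken to be the complement of `SIGNAL_MASK`, and `IS_WAITING` and `pack_next` are hypothetical helpers that are not part of the PR.

```c
#include <assert.h>
#include <stdint.h>

#define SIGNAL_MASK 0x80000000U /* wait signal: top bit of the word */
#define NEXT_MASK   0x7FFFFFFFU /* assumed: low 31 bits hold the PE id */

/* Macro arguments are parenthesized so expressions can be passed safely. */
#define GET_PE(P)     ((P) & NEXT_MASK)
#define IS_WAITING(P) (((P) & SIGNAL_MASK) != 0U) /* hypothetical helper */

/* Hypothetical helper: pack a PE id and a wait flag into one 32-bit word. */
static uint32_t pack_next(uint32_t pe, int waiting)
{
    assert(pe <= NEXT_MASK); /* the PE id must fit below the signal bit */
    return (pe & NEXT_MASK) | (waiting ? SIGNAL_MASK : 0U);
}
```

For example, `pack_next(5, 1)` yields a word whose `GET_PE` is 5 and whose wait bit is set, so a single atomic update can change both fields at once.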
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
 * releasing it.
 */
/**
 * Can make this to be shmem_atomic_set to be safe in non-cc architecutres
- * Can make this to be shmem_atomic_set to be safe in non-cc architecutres
+ * Can make this to be shmem_atomic_set to be safe in non-cc architectures
Fixed
oshmem/shmem/c/shmem_mcs_lock.c
Outdated
int retv = 0;

/**
 * Can make atomic fetch to be safe in non-cc architecutres
- * Can make atomic fetch to be safe in non-cc architecutres
+ * Can make atomic fetch to be safe in non-cc architectures
Fixed
oshmem/shmem/c/shmem_set_lock.c
Outdated
@@ -9,7 +11,6 @@
 *
 * $HEADER$
 */

Please restore the empty line.
Done
It is better to replace sizeof(int) by sizeof(variable_name) in swap, add, etc.
Actually, the swap and add calls are internal MCA calls of shmem_int__ calls. sizeof(int) is more readable there than the variable name.
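To make the trade-off in this exchange concrete, here is a small sketch of what sizeof(variable_name) buys when a call takes an explicit operand size. It does not settle the readability argument and is not code from the PR. `copy_operand` and `demo_sizeof` are hypothetical; the former only stands in for an internal call with a size argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an internal call that takes an operand size. */
static size_t copy_operand(void *dst, const void *src, size_t nbytes)
{
    memcpy(dst, src, nbytes);
    return nbytes;
}

static size_t demo_sizeof(void)
{
    /* Imagine these were declared int before a later refactor to long. */
    long prev_value = 0;
    long new_value = 42;

    /* sizeof(prev_value) follows the declaration automatically, whereas a
     * hard-coded sizeof(int) would copy too few bytes wherever long is
     * wider than int. */
    size_t n = copy_operand(&prev_value, &new_value, sizeof(prev_value));

    assert(prev_value == 42);
    return n;
}
```

When the operand type is fixed by the API, as argued in the reply above, both spellings give the same size and the choice is purely stylistic.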
@vvenkates27, can you please squash the commits?
Adding MCS algorithm-based implementation for shmem_locks to improve performance for large scale SHMEM applications using locks. MCS lock is now the default algorithm, use the following MCA parameter to disable.
--mca oshmem_enable_mcs_lock 0 to disable mcs locks and revert to default ticket locking.
--mca oshmem_api_verbose 10 for debug information on shmem_locks.
Signed-off-by: Vishwanath Venkatesan <[email protected]>
4a4e214 to 1396585
@vvenkates27 can you please port it to 4.1.x and 5.0 branches?
SHMEM_LOCKS: MCS implementation of SHMEM LOCKS
Adding MCS algorithm-based implementation for shmem_locks to improve performance for large scale SHMEM applications using locks. MCS lock is now the default algorithm, use the following MCA parameter to disable.
--mca oshmem_enable_mcs_lock 0 to disable mcs locks and revert to default ticket locking.
--mca oshmem_api_verbose 10 for debug information on shmem_locks.
Signed-off-by: Vishwanath Venkatesan <[email protected]>
Submitted pull requests for both branches.
Adding MCS algorithm-based implementation for shmem_locks to improve performance for large scale SHMEM applications using locks. MCS lock is now the default algorithm, use the following MCA parameter to disable.
-- Use MCA parameter --mca oshmem_enable_mcs_lock 0 to disable mcs locks and revert to default ticket locking.
-- Use oshmem_api_verbose 10 for debug information on shmem_locks.
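For readers unfamiliar with the algorithm named in the description, the following is a minimal sketch of a textbook MCS queue lock. It is not the implementation from this PR: it uses C11 atomics and POSIX threads inside one process, while the PR works with PEs and SHMEM atomics. Every name in it (`mcs_node_t`, `mcs_lock_t`, `mcs_acquire`, `mcs_release`, `run_demo`) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next; /* successor in the wait queue */
    atomic_bool locked;              /* true while this waiter must spin */
} mcs_node_t;

typedef struct {
    _Atomic(mcs_node_t *) tail;      /* last waiter, NULL when the lock is free */
} mcs_lock_t;

static void mcs_acquire(mcs_lock_t *lock, mcs_node_t *me)
{
    atomic_store(&me->next, (mcs_node_t *)NULL);
    atomic_store(&me->locked, true);
    /* Swap ourselves in as the new tail; the old tail is our predecessor. */
    mcs_node_t *prev = atomic_exchange(&lock->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked)) {
            sched_yield();           /* spin only on our own node */
        }
    }
}

static void mcs_release(mcs_lock_t *lock, mcs_node_t *me)
{
    mcs_node_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to reset the tail to "free". */
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong(&lock->tail, &expected,
                                           (mcs_node_t *)NULL)) {
            return;
        }
        /* A successor is enqueueing; wait until it links itself in. */
        while ((succ = atomic_load(&me->next)) == NULL) {
            sched_yield();
        }
    }
    atomic_store(&succ->locked, false); /* hand the lock to the successor */
}

enum { NUM_THREADS = 4, ITERS = 1000 };

static mcs_lock_t demo_lock;  /* zero-initialized: tail is NULL, lock is free */
static long demo_counter;     /* deliberately non-atomic, guarded by the lock */

static void *worker(void *arg)
{
    mcs_node_t node;
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        mcs_acquire(&demo_lock, &node);
        demo_counter++;
        mcs_release(&demo_lock, &node);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
static long run_demo(void)
{
    pthread_t threads[NUM_THREADS];
    demo_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return demo_counter;
}
```

Each waiter spins on its own node instead of on one shared word, which is the property usually credited for better scaling than ticket locks under contention. `run_demo()` is expected to return `NUM_THREADS * ITERS` if mutual exclusion holds.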
@manjugv @brminich @rakhmets @yosefe