opal_progress guidelines
Per the discussion at the 2018 OMPI dev meeting, we agreed to move forward with making opal_progress() multithreaded. This guideline explains how to bring your component into compliance with this change.
OMPI will no longer serialize calls to opal_progress(). This means your component's progress function might be invoked simultaneously from multiple threads. This change allows a communication component to be more efficient in multithreaded scenarios if it chooses to (i.e., by parallelizing the component with multiple working lanes).
At this stage, we still serialize opal_progress() to give components a grace period to adjust. We will announce on the mailing list the deadline after which this change takes effect.
Not all OMPI MCA components will be affected by this change. Essentially, all communication components are affected, as well as any component that has registered a progress function. If your component works with timed triggers, you should also be careful: the completion event might now be called from another thread, which opens opportunities for race conditions.
This depends on how your component works. Let's break it down.
- If your component is thread-safe:
  - If you want to enhance your multithreaded performance, YES. Take a look at btl/uct or btl/ofi to see how to really take advantage of this change.
  - If you just want to get by:
    - If calling progress from multiple threads will not affect performance, NO.
    - If calling progress from multiple threads will affect performance, YES, go to HOW (the minimum requirement below).
- If your component is not thread-safe:
  - Maybe it is time to make it so?
  - If you want your component to be compliant, YES, go to HOW (the minimum requirement below).
  - If not, your component will have a hard time running in multithreaded mode. You might not care, but someone will. Your component might create problems for others.
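For components that want more than the minimum, the "working lanes" idea can be pictured roughly as follows. This is only a sketch under stated assumptions: it uses POSIX mutexes instead of opal_mutex_t so it compiles outside OMPI, every name in it (demo_lane_t, demo_lanes_progress, and so on) is hypothetical, and it does not reflect how btl/uct or btl/ofi are actually implemented. Each lane has its own lock, so threads that enter progress at the same time can work on different lanes instead of contending for one.

```c
#include <assert.h>
#include <pthread.h>

#define LANE_COUNT 4

/* Hypothetical per-lane state; pthread_mutex_t stands in for opal_mutex_t. */
typedef struct demo_lane_t {
    pthread_mutex_t lock;
    long polled;            /* only modified while holding lock */
} demo_lane_t;

static demo_lane_t demo_lanes[LANE_COUNT];

/* Initialize every lane's mutex and counter. */
static void demo_lanes_init(void)
{
    for (int i = 0; i < LANE_COUNT; i++) {
        int rc = pthread_mutex_init(&demo_lanes[i].lock, NULL);
        assert(0 == rc);
        (void) rc;
        demo_lanes[i].polled = 0;
    }
}

/* Destroy every lane's mutex. */
static void demo_lanes_fini(void)
{
    for (int i = 0; i < LANE_COUNT; i++) {
        pthread_mutex_destroy(&demo_lanes[i].lock);
    }
}

/*
 * Progress the first lane that is not busy, starting the search at `hint`
 * (for example, a per-thread index). Returns the index of the lane that was
 * progressed, or -1 if every lane was held by another caller.
 */
static int demo_lanes_progress(unsigned hint)
{
    for (unsigned i = 0; i < LANE_COUNT; i++) {
        unsigned idx = (hint + i) % LANE_COUNT;
        demo_lane_t *lane = &demo_lanes[idx];

        if (0 == pthread_mutex_trylock(&lane->lock)) {
            lane->polled++;     /* placeholder for the lane's real progress work */
            pthread_mutex_unlock(&lane->lock);
            return (int) idx;
        }
    }
    return -1;
}
```

Call demo_lanes_init() once before the first demo_lanes_progress() call and demo_lanes_fini() once after the last. Starting each thread's search at a different hint spreads callers across lanes, and a busy lane is skipped rather than waited on.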
Again, this is the minimum requirement for a component:
- Create a component mutex.
- Do a trylock in your progress function.
typedef struct yourcomponent_t {
    ...
    ...
    ...
    opal_mutex_t component_lock; /* Add a new mutex_t here and initialize it at init. */
} yourcomponent_t;

void yourcomponent_component_progress(void)
{
    /* Trylock returns 0 on success; if another thread holds the lock, skip this round. */
    if (!OPAL_THREAD_TRYLOCK(&component->component_lock)) {
        /* YOUR ORIGINAL PROGRESS ROUTINES HERE */
        /* YOUR ORIGINAL PROGRESS ROUTINES HERE */
        /* YOUR ORIGINAL PROGRESS ROUTINES HERE */
        OPAL_THREAD_UNLOCK(&component->component_lock);
    }
}
Don't forget to initialize/cleanup the mutex!
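
As a minimal standalone sketch of the trylock pattern above, the following uses a POSIX pthread_mutex_t as a stand-in for opal_mutex_t so that it can be compiled and run outside OMPI. All names (demo_component_t, demo_component_progress, demo_run) are hypothetical and are not part of OPAL. Several threads call the progress function concurrently: a caller that fails the trylock returns immediately instead of blocking, and the mutex is initialized before use and destroyed afterwards.

```c
#include <assert.h>
#include <pthread.h>

#define DEMO_THREADS 4
#define DEMO_CALLS   10000

/* Hypothetical component state; pthread_mutex_t stands in for opal_mutex_t. */
typedef struct demo_component_t {
    pthread_mutex_t component_lock;
    long events_handled;    /* only modified while holding component_lock */
} demo_component_t;

static demo_component_t demo_component;

/* Returns 1 if this call ran the progress body, 0 if it backed off. */
static int demo_component_progress(void)
{
    if (0 != pthread_mutex_trylock(&demo_component.component_lock)) {
        return 0;           /* another thread is already progressing */
    }
    demo_component.events_handled++;   /* placeholder for the real progress work */
    pthread_mutex_unlock(&demo_component.component_lock);
    return 1;
}

/* Each worker counts how many of its calls actually entered the progress body. */
static void *demo_worker(void *arg)
{
    long *entered = arg;
    for (int i = 0; i < DEMO_CALLS; i++) {
        *entered += demo_component_progress();
    }
    return NULL;
}

/*
 * Runs the workers and returns 0 if the number of successful entries matches
 * the protected counter, i.e. no update was lost to a race; -1 otherwise.
 */
static int demo_run(void)
{
    pthread_t threads[DEMO_THREADS];
    long entered[DEMO_THREADS] = {0};
    long total = 0;

    int rc = pthread_mutex_init(&demo_component.component_lock, NULL);
    assert(0 == rc);
    (void) rc;
    demo_component.events_handled = 0;

    for (int i = 0; i < DEMO_THREADS; i++) {
        rc = pthread_create(&threads[i], NULL, demo_worker, &entered[i]);
        assert(0 == rc);
    }
    for (int i = 0; i < DEMO_THREADS; i++) {
        pthread_join(threads[i], NULL);
        total += entered[i];
    }

    int ok = (total == demo_component.events_handled);
    pthread_mutex_destroy(&demo_component.component_lock);
    return ok ? 0 : -1;
}
```

Calling demo_run() from a main() and checking for a return value of 0 exercises the pattern. Using trylock rather than a blocking lock means a thread that loses the race goes back to its caller right away, which matches the intent that progress calls stay cheap when another thread is already doing the work.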