Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the first sister message as a Heartbeat, too #93

Open
wants to merge 1 commit into
base: kinetic-devel
Choose a base branch
from

Conversation

peci1
Copy link

@peci1 peci1 commented Apr 11, 2023

I've noticed that when the sister is killed quickly after being started, there is (very much repeatable) chance to deadlock the other Bond. Consider this sequence:

  1. Sister sends her first status message. My bond executes AwaitingSister->SisterAlive() which calls Connect(). Connect() stops the connection timer.
  2. Sister dies before sending her second message.
  3. My bond is now in Alive state. However, Heartbeat() has not yet been called (that would be done by the second message which did not come), so heartbeat_timer_ is still off and does not trigger the timeout event.

The fix I sent fixes it on the C++ side. I'm not sure about the Python side, and not sure about the SM source code. The Heartbeat() needs to be called already in the Alive state (it is undefined in AwaitingSister). Quickly skimming through the SM syntax, it doesn't seem to me it would support calling some functions before the transition and some after it.

If that would be the case, I'd suggest calling Heartbeat in the AwaitingSister state (right after calling Connected()) and defining it for AwaitingSister state to do the same as for the Alive state. That would require no additional changes to the SM definition, so it might be preferred.

I'll let the maintainers decide which approach would be better.

@peci1
Copy link
Author

peci1 commented Apr 11, 2023

A practical example where this can happen is when a nodelet is unloaded very shortly after being loaded.

@peci1
Copy link
Author

peci1 commented Jun 15, 2024

Could this please get a review?

@mjcarroll mjcarroll self-assigned this Jun 17, 2024
@mjcarroll mjcarroll self-requested a review June 17, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants