Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nullify pointer to ClientListener in CustomClientInfo in destructor #205

Closed
wants to merge 1 commit into from

Conversation

sloretz
Copy link
Contributor

@sloretz sloretz commented Jun 8, 2018

This mitigates (does not fix) a race condition when one thread calls rmw_wait() and another calls rmw_destroy_client(). When rmw_wait() wakes up it calls CustomClientInfo->listener_, but both the instance of CustomClientInfo and the object pointed to by listener_ have been destroyed. This nulls the field listener_ and has rmw_wait() check this field before using it to reduce the likelyhood of a crash.

I observed this race condition while debugging a CI test failure in ros2/rclcpp#488. The symptom seen is a SIGABRT on OSX when trying to lock ClientListener.internalMutex_.

I'm not sure what a real fix would look like. rmw_wait() needs a way to tell the client it is holding onto has been destroyed. The other entities appear to be using the same pattern, so I suspect they suffer from similar race conditions too. (ticket to follow)

@sloretz sloretz added the bug Something isn't working label Jun 8, 2018
@sloretz sloretz self-assigned this Jun 8, 2018
@dirk-thomas
Copy link
Member

I would suggest to wait with merging this until #203 has been merged or it has been decided to delay it for now.

@mikaelarguedas
Copy link
Member

My understanding is that #203 will not be integrated before feature freeze so we can move these race condition mitigation forward for the release.

@sloretz is this PR ready for review ?

@dirk-thomas
Copy link
Member

If #203 doesn't necessarily change the default behavior and we don't have concerns about regressions it is imo still on the list to possibly be merged.

@sloretz
Copy link
Contributor Author

sloretz commented Dec 21, 2018

Closing this since there have been enough changes to cause conflicts and it only mitigated an issue.

@sloretz sloretz closed this Dec 21, 2018
@sloretz sloretz deleted the fini_client_wait_crash branch December 21, 2018 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants