Fast-RTPS not terminating cleanly [5681] #473

Closed
guillaumeautran opened this issue Apr 1, 2019 · 7 comments

Comments

@guillaumeautran
Contributor

The problem shows up with ROS2 and Fast-RTPS. When the last ROS2 node is deleted and the ROS2 system is shut down, a Fast-RTPS ChannelResource thread lingers, blocked in a network receive call, which prevents the stack from terminating cleanly.
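
For readers who want to see the shape of the hang without ROS2, here is a minimal sketch. It assumes plain POSIX UDP sockets in place of Fast-RTPS's asio-based transport, and the helper names (`open_loopback_udp`, `listener_stays_blocked`) are made up for illustration. A listener thread sits in a blocking `recvfrom()` on a socket that receives no traffic, so anything that joins it would wait indefinitely. To keep the sketch itself terminating, it wakes the listener at the end by sending the socket one datagram.

```cpp
// Sketch only: POSIX sockets stand in for the asio transport used by Fast-RTPS.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <chrono>
#include <cstdlib>
#include <future>
#include <stdexcept>
#include <thread>

// Hypothetical helper: open a UDP socket bound to an ephemeral loopback port.
inline int open_loopback_udp(sockaddr_in& bound)
{
    int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) throw std::runtime_error("socket() failed");
    bound = sockaddr_in{};
    bound.sin_family = AF_INET;
    bound.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bound.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(bound);
    if (::bind(fd, reinterpret_cast<const sockaddr*>(&bound), sizeof(bound)) < 0 ||
        ::getsockname(fd, reinterpret_cast<sockaddr*>(&bound), &len) < 0)
    {
        ::close(fd);
        throw std::runtime_error("bind()/getsockname() failed");
    }
    return fd;
}

// Returns true if the listener thread was still inside recvfrom() after `wait`.
inline bool listener_stays_blocked(std::chrono::milliseconds wait)
{
    assert(wait.count() > 0);
    sockaddr_in addr;
    int fd = open_loopback_udp(addr);

    std::promise<void> done;
    std::future<void> finished = done.get_future();
    std::thread listener([fd, &done] {
        char buf[64];
        ::recvfrom(fd, buf, sizeof(buf), 0, nullptr, nullptr);  // blocks: no traffic arrives
        done.set_value();
    });

    // A join() issued at this point (as a destructor might do) waits on the same event.
    bool blocked = finished.wait_for(wait) == std::future_status::timeout;

    // Wake the listener so this sketch can exit: the socket sends one datagram to itself.
    if (::sendto(fd, "x", 1, 0, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) < 0)
    {
        std::abort();  // cannot wake the listener; give up rather than hang in join()
    }
    listener.join();
    ::close(fd);
    return blocked;
}
```

Calling `listener_stays_blocked(std::chrono::milliseconds(100))` is expected to report that the listener was still blocked when the wait expired, which is the same situation the two stack traces below show.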

Stacktrace of the surviving thread:

Thread 5 (Thread 0x7f0d03fff700 (LWP 5605)):
#0  0x00007f0d3018a94d in recvmsg () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f0d0f769f9f in asio::detail::socket_ops::recvfrom (ec=..., addrlen=<synthetic pointer>, addr=0x7f0d03ffe270, flags=0, count=1, bufs=0x7f0d03ffe1d0, s=18)
    at /usr/include/asio/detail/impl/socket_ops.ipp:933
#2  asio::detail::socket_ops::sync_recvfrom (ec=..., addrlen=<synthetic pointer>, addr=0x7f0d03ffe270, flags=0, count=1, bufs=0x7f0d03ffe1d0, state=<optimized out>, s=18)
    at /usr/include/asio/detail/impl/socket_ops.ipp:956
#3  asio::detail::reactive_socket_service<asio::ip::udp>::receive_from<asio::mutable_buffers_1> (flags=0, this=<optimized out>, impl=..., impl=..., ec=..., 
    sender_endpoint=..., buffers=...) at /usr/include/asio/detail/reactive_socket_service.hpp:284
#4  asio::datagram_socket_service<asio::ip::udp>::receive_from<asio::mutable_buffers_1> (this=<optimized out>, ec=..., flags=0, sender_endpoint=..., buffers=..., impl=...)
    at /usr/include/asio/datagram_socket_service.hpp:395
#5  asio::basic_datagram_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> >::receive_from<asio::mutable_buffers_1> (sender_endpoint=..., buffers=..., 
    this=0x1001c) at /usr/include/asio/basic_datagram_socket.hpp:790
#6  eprosima::fastrtps::rtps::UDPTransportInterface::Receive (this=this@entry=0x21b9400, pChannelResource=pChannelResource@entry=0x21b21e0, receiveBuffer=0x2203210 "", 
    receiveBufferCapacity=<optimized out>, receiveBufferSize=@0x21b21f8: 0, remoteLocator=...)
    at ./src/cpp/transport/UDPTransportInterface.cpp:412
#7  0x00007f0d0f76a490 in eprosima::fastrtps::rtps::UDPTransportInterface::performListenOperation (this=0x21b9400, pChannelResource=0x21b21e0, input_locator=...)
    at ./src/cpp/transport/UDPTransportInterface.cpp:387
#8  0x00007f0d0f7723d8 in std::_Mem_fn_base<void (eprosima::fastrtps::rtps::UDPTransportInterface::*)(eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t), true>::operator()<eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t, void>(eprosima::fastrtps::rtps::UDPTransportInterface*, eprosima::fastrtps::rtps::UDPChannelResource*&&, eprosima::fastrtps::rtps::Locator_t&&) const (__object=<optimized out>, this=<optimized out>) at /usr/include/c++/5/functional:600
#9  std::_Bind_simple<std::_Mem_fn<void (eprosima::fastrtps::rtps::UDPTransportInterface::*)(eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)> (eprosima::fastrtps::rtps::UDPTransportInterface*, eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)>::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) (this=<optimized out>) at /usr/include/c++/5/functional:1531
#10 std::_Bind_simple<std::_Mem_fn<void (eprosima::fastrtps::rtps::UDPTransportInterface::*)(eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)> (eprosima::fastrtps::rtps::UDPTransportInterface*, eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)>::operator()() (this=<optimized out>)
    at /usr/include/c++/5/functional:1520
#11 std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (eprosima::fastrtps::rtps::UDPTransportInterface::*)(eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)> (eprosima::fastrtps::rtps::UDPTransportInterface*, eprosima::fastrtps::rtps::UDPChannelResource*, eprosima::fastrtps::rtps::Locator_t)> >::_M_run() (
    this=<optimized out>) at /usr/include/c++/5/thread:115
#12 0x00007f0d2aaa0c80 in std::execute_native_thread_routine (__p=<optimized out>) at ../../../../../src/libstdc++-v3/src/c++11/thread.cc:84
#13 0x00007f0d301816ba in start_thread (arg=0x7f0d03fff700) at pthread_create.c:333
#14 0x00007f0d2a50f41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

And here is the main thread, which is waiting to join it and is therefore blocking termination of the application.

Thread 1 (Thread 0x7f0d30a9af80 (LWP 5590)):
#0  0x00007f0d3018298d in pthread_join (threadid=139693878408960, thread_return=0x0) at pthread_join.c:90
#1  0x00007f0d2aaa0b97 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
at /build/gcc-5-WLoftf/gcc-5-5.4.0/build/x86_64-linux-gnu/libstdc++-v3/include/x86_64-linux-gnu/bits/gthr-default.h:668
#2  std::thread::join (this=0x21b9090) at ../../../../../src/libstdc++-v3/src/c++11/thread.cc:107
#3  0x00007f0d0f749a7c in eprosima::fastrtps::rtps::ChannelResource::Clear (this=this@entry=0x21b21e0)
at ./src/cpp/transport/ChannelResource.cpp:59
#4  0x00007f0d0f749ab7 in eprosima::fastrtps::rtps::ChannelResource::~ChannelResource (this=0x21b21e0, __in_chrg=<optimized out>)
at ./src/cpp/transport/ChannelResource.cpp:51
#5  0x00007f0d0f749ef8 in eprosima::fastrtps::rtps::UDPChannelResource::~UDPChannelResource (this=0x21b21e0, __in_chrg=<optimized out>)
at ./src/cpp/transport/UDPChannelResource.cpp:46
#6  0x00007f0d0f74a2c9 in eprosima::fastrtps::rtps::UDPChannelResource::~UDPChannelResource (this=0x21b21e0, __in_chrg=<optimized out>)
at ./src/cpp/transport/UDPChannelResource.cpp:49
#7  0x00007f0d0f7715f7 in eprosima::fastrtps::rtps::UDPTransportInterface::CloseInputChannel (this=<optimized out>, locator=...)
at ./src/cpp/transport/UDPTransportInterface.cpp:109
#8  0x00007f0d0f7075c0 in std::function<void ()>::operator()() const (this=0x21b2068) at /usr/include/c++/5/functional:2267
#9  eprosima::fastrtps::rtps::ReceiverResource::~ReceiverResource (this=0x21b2060, __in_chrg=<optimized out>)
at ./src/cpp/rtps/network/ReceiverResource.cpp:105
#10 0x00007f0d0f707629 in eprosima::fastrtps::rtps::ReceiverResource::~ReceiverResource (this=0x21b2060, __in_chrg=<optimized out>) at ./src/cpp/rtps/network/ReceiverResource.cpp:107
#11 0x00007f0d0f7088d8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x21b9040) at /usr/include/c++/5/bits/shared_ptr_base.h:150
#12 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x21b1c58, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#13 std::__shared_ptr<eprosima::fastrtps::rtps::ReceiverResource, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x21b1c50, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#14 std::shared_ptr<eprosima::fastrtps::rtps::ReceiverResource>::~shared_ptr (this=0x21b1c50, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr.h:93
#15 eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock::~ReceiverControlBlock (this=0x21b1c50, __in_chrg=<optimized out>) at ./src/cpp/rtps/participant/RTPSParticipantImpl.h:95
#16 std::_List_node<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock>::~_List_node (this=0x21b1c40, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/stl_list.h:106
#17 __gnu_cxx::new_allocator<std::_List_node<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock> >::destroy<std::_List_node<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock> > (this=<optimized out>, __p=0x21b1c40) at /usr/include/c++/5/ext/new_allocator.h:124
#18 std::__cxx11::_List_base<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock, std::allocator<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock> >::_M_clear (this=0x21b7a50) at /usr/include/c++/5/bits/list.tcc:75
#19 std::__cxx11::list<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock, std::allocator<eprosima::fastrtps::rtps::RTPSParticipantImpl::ReceiverControlBlock> >::clear (this=0x21b7a50) at /usr/include/c++/5/bits/stl_list.h:1368
#20 eprosima::fastrtps::rtps::RTPSParticipantImpl::~RTPSParticipantImpl (this=0x21b7810, __in_chrg=<optimized out>) at ./src/cpp/rtps/participant/RTPSParticipantImpl.cpp:275
#21 0x00007f0d0f708e49 in eprosima::fastrtps::rtps::RTPSParticipantImpl::~RTPSParticipantImpl (this=0x21b7810, __in_chrg=<optimized out>) at ./src/cpp/rtps/participant/RTPSParticipantImpl.cpp:283
#22 0x00007f0d0f7128ef in eprosima::fastrtps::rtps::RTPSDomain::removeRTPSParticipant_nts (it=..., it@entry=...) at ./src/cpp/rtps/RTPSDomain.cpp:197
#23 0x00007f0d0f712fb6 in eprosima::fastrtps::rtps::RTPSDomain::removeRTPSParticipant (p=0x21c14d0) at ./src/cpp/rtps/RTPSDomain.cpp:185
#24 0x00007f0d0f727118 in eprosima::fastrtps::ParticipantImpl::~ParticipantImpl (this=0x21b0ec0, __in_chrg=<optimized out>) at ./src/cpp/participant/ParticipantImpl.cpp:79
#25 0x00007f0d0f7273b9 in eprosima::fastrtps::ParticipantImpl::~ParticipantImpl (this=0x21b0ec0, __in_chrg=<optimized out>) at ./src/cpp/participant/ParticipantImpl.cpp:81
#26 0x00007f0d0f716ef2 in eprosima::fastrtps::Domain::removeParticipant (part=part@entry=0x2085c10) at ./src/cpp/Domain.cpp:95
#27 0x00007f0d0fbd94f5 in rmw_fastrtps_shared_cpp::__rmw_destroy_node (identifier=<optimized out>, node=<optimized out>) at ./rmw_fastrtps_shared_cpp/src/rmw_node.cpp:352
#28 0x00007f0d303a8636 in rcl_node_fini (node=0x21b9a80) at ../../ros2-rcl-287eed5b43377179c33aa87cc98ca38e4fc1c18a/rcl/src/rcl/node.c:504
#29 0x00007f0d30662a3a in rclcpp::node_interfaces::NodeBase::NodeBase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<rclcpp::Context>, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool)::{lambda(rcl_node_t*)#2}::operator()(rcl_node_t*) const [clone .isra.19] () at ./rclcpp/src/rclcpp/node_interfaces/node_base.cpp:184
#30 0x00007f0d306630e5 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x2406bf0) at /usr/include/c++/5/bits/shared_ptr_base.h:150
#31 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x21c3090, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#32 std::__shared_ptr<rcl_node_t, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x21c3088, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#33 std::shared_ptr<rcl_node_t>::~shared_ptr (this=0x21c3088, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr.h:93
#34 rclcpp::node_interfaces::NodeBase::~NodeBase (this=0x21c3070, __in_chrg=<optimized out>) at ./rclcpp/src/rclcpp/node_interfaces/node_base.cpp:209
#35 0x00007f0d306631e9 in rclcpp::node_interfaces::NodeBase::~NodeBase (this=0x21c3070, __in_chrg=<optimized out>) at ./rclcpp/src/rclcpp/node_interfaces/node_base.cpp:221
#36 0x00007f0d3065fa22 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x2409120) at /usr/include/c++/5/bits/shared_ptr_base.h:150
#37 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x2086550, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#38 std::__shared_ptr<rclcpp::node_interfaces::NodeBaseInterface, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x2086548, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#39 std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>::~shared_ptr (this=0x2086548, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr.h:93
#40 rclcpp::Node::~Node (this=0x2086530, __in_chrg=<optimized out>) at ./rclcpp/src/rclcpp/node.cpp:101
#41 0x000000000041bb26 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x2086520) at /usr/include/c++/5/bits/shared_ptr_base.h:150
#42 0x0000000000425278 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7ffe9f682648, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#43 std::__shared_ptr<rclcpp::Node, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7ffe9f682640, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#44 std::shared_ptr<rclcpp::Node>::~shared_ptr (this=0x7ffe9f682640, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr.h:93
#46 0x000000000040f8ae in main (argc=2, argv=0x7ffe9f683858) at ./main.cpp:244

@DensoADAS

DensoADAS commented Apr 16, 2019

We encountered the same problem when creating and deleting publishers in ROS2 at high frequency. A workaround that we tested is attached, but it is probably not the best solution.
FixHackFast-RTPS.diff.txt

(The patch only demonstrates that the thread no longer blocks; receiving is not working...)

@guillaumeautran
Contributor Author

@DensoADAS Would you be able to push a PR with a cleaned up version of your fix?

@DensoADAS

DensoADAS commented Apr 18, 2019

The updated patch is attached.
FixHackFast-RTPS3.diff.txt

guillaumeautran pushed a commit to clearpathrobotics/Fast-RTPS that referenced this issue Apr 22, 2019
Convert the transport UDP receiver into a non-blocking receiver to permit proper thread termination.

issue eProsima#473
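
The idea in that commit message can be sketched as follows. This is not the code from the referenced commit or from Fast-RTPS: it assumes POSIX sockets instead of asio, uses a bounded wait (`poll()` with a timeout) combined with a `MSG_DONTWAIT` receive, and the class and function names are hypothetical. The point is only that a receive loop which returns to check a flag at regular intervals lets the destructor's `join()` complete even when no datagram ever arrives.

```cpp
// Sketch only: hypothetical names, POSIX sockets instead of the asio transport.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <stdexcept>
#include <thread>

class PollingUdpListener
{
public:
    explicit PollingUdpListener(int poll_timeout_ms = 50)
        : fd_(::socket(AF_INET, SOCK_DGRAM, 0)), timeout_ms_(poll_timeout_ms)
    {
        assert(poll_timeout_ms > 0);
        if (fd_ < 0) throw std::runtime_error("socket() failed");
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;  // ephemeral port
        if (::bind(fd_, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) < 0)
        {
            ::close(fd_);
            throw std::runtime_error("bind() failed");
        }
        thread_ = std::thread(&PollingUdpListener::run, this);
    }

    ~PollingUdpListener()
    {
        alive_.store(false);  // the loop sees this after at most one poll timeout
        thread_.join();
        ::close(fd_);
    }

    PollingUdpListener(const PollingUdpListener&) = delete;
    PollingUdpListener& operator=(const PollingUdpListener&) = delete;

    std::size_t datagrams_received() const { return received_.load(); }

private:
    void run()
    {
        char buf[1500];
        while (alive_.load())
        {
            pollfd pfd{fd_, POLLIN, 0};
            int ready = ::poll(&pfd, 1, timeout_ms_);
            if (ready <= 0) continue;                        // timeout or EINTR: re-check alive_
            if (pfd.revents & (POLLERR | POLLNVAL)) break;   // socket unusable: stop listening
            if ((pfd.revents & POLLIN) &&
                ::recvfrom(fd_, buf, sizeof(buf), MSG_DONTWAIT, nullptr, nullptr) >= 0)
            {
                ++received_;
            }
        }
    }

    int fd_;
    int timeout_ms_;
    std::atomic<bool> alive_{true};
    std::atomic<std::size_t> received_{0};
    std::thread thread_;
};

// Hypothetical helper: how long does it take to create and destroy an idle listener?
inline std::chrono::milliseconds construct_and_destroy_idle_listener()
{
    auto start = std::chrono::steady_clock::now();
    {
        PollingUdpListener listener(50);  // never receives anything
    }
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
}
```

With this structure the shutdown latency is bounded by roughly one poll timeout, at the cost of periodic wake-ups while idle. Other approaches, such as waking the blocked receive from the closing thread, avoid the wake-ups but depend more on platform behavior.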
@guillaumeautran
Contributor Author

#509

@raquelalvarezbanos changed the title from "Fast-RTPS not terminating cleanly" to "Fast-RTPS not terminating cleanly [5681]" on Jun 18, 2019
@MiguelCompany
Member

I think this should have been solved by #569, which will be part of release 1.8.1 (#574) scheduled for June 28th.

@guillaumeautran @DensoADAS Could you check on your side?

@guillaumeautran
Contributor Author

I am not able to check this issue until later in the fall. We can either leave this report as is for now (and I'll clean it up later) or close it (and I can always reopen it or create a new one later).

@MiguelCompany
Member

Closing this, as it was solved long ago by #569 and I have not been able to reproduce it with the current release.
