Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allocator_common.hpp overallocates 100x memory & invokes UB on dealloc #1254

Open
stevewolter opened this issue Aug 3, 2020 · 8 comments · May be fixed by #1324
Open

allocator_common.hpp overallocates 100x memory & invokes UB on dealloc #1254

stevewolter opened this issue Aug 3, 2020 · 8 comments · May be fixed by #1324
Assignees
Labels
help wanted Extra attention is needed

Comments

@stevewolter
Copy link

rclcpp/include/rclcpp/allocator/allocator_common.hpp has two severe correctness issues in memory allocation:

  1. retyped_allocate takes the "size" parameter (which is size in bytes as defined by rcl_allocator_t) and passes it to it's allocator's allocate function (which takes number of elements as its parameter). So when asked to allocate N bytes, it actually allocates N*sizeof(T) bytes, where T is the type of the allocator. It's called by subscribers and publishers with a typed allocator of type "Message", which is easily ~100 bytes, so it's overallocating memory by a factor of ~100.
  2. retyped_deallocate passes 1 to the allocator's deallocate function, instead of the size of the allocated memory. This happens to work for some allocator implementations that ignore the parameter (libstdc++'s malloc_allocator), but will lead to hard-to-diagnose segfaults on other architectures. I've observed a segfault on Linux + tcmalloc, and I guess this is the reason that allocators needed to be commented out to avoid segfaults on Windows. In any case, it's relying on undefined behaviour and a maintenance hazard.

I realize that both of these issues are not easily fixable. Since allocators are not supported yet anyway (#1061), I'd suggest to disable support for custom allocators entirely and depend on the default RCL allocator. If you concur, I'd be happy to help in implementation.

@clalancette
Copy link
Contributor

Thanks for bringing up this issue. After reading the documentation, I agree with both of your points. I'd be OK with removing this implementation entirely and just using the default rcl allocator for now instead. But pinging @ros2/team for other opinions.

@wjwwood
Copy link
Member

wjwwood commented Aug 14, 2020

Yeah we can just remove it, I think it should be fixable, but I don't think anyone is really using it and we plan to move to the polymorphic allocator (we think) in the future anyways, so we can fix it then.

@clalancette
Copy link
Contributor

Great, thanks for the feedback. @stevewolter if you are interested in working on this, we'll be happy to review. Thanks.

@stevewolter
Copy link
Author

Thanks Chris, I'll give it a shot and send a PR.

@stevewolter
Copy link
Author

Quick update: It will be some time - I'm on vacation for 2 weeks now, but I'll send a PR afterwards.

@clalancette clalancette added the help wanted Extra attention is needed label Sep 3, 2020
@clalancette
Copy link
Contributor

@stevewolter Thanks for the heads up.

stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 22, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and inline calls to rcl_get_default_allocator at
the call sites. Since this would introduce surprising behavior for
custom allocator users (using system malloc when a custom allocator
is requested), add compile-time assertions checking for non-default
allocators. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Technically, we could roll back a lot of templating on allocator and
explicitly use std::allocator<void> in all classes. I have refrained
from doing so because you might have plans with this code, but if you
concur, I'll be more than happy to de-template all these classes to
simplify the code.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 22, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and inline calls to rcl_get_default_allocator at
the call sites. Since this would introduce surprising behavior for
custom allocator users (using system malloc when a custom allocator
is requested), add compile-time assertions checking for non-default
allocators. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Technically, we could roll back a lot of templating on allocator and
explicitly use std::allocator<void> in all classes. I have refrained
from doing so because you might have plans with this code, but if you
concur, I'll be more than happy to de-template all these classes to
simplify the code.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 22, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and inline calls to rcl_get_default_allocator at
the call sites. Since this would introduce surprising behavior for
custom allocator users (using system malloc when a custom allocator
is requested), add compile-time assertions checking for non-default
allocators. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Technically, we could roll back a lot of templating on allocator and
explicitly use std::allocator<void> in all classes. I have refrained
from doing so because you might have plans with this code, but if you
concur, I'll be more than happy to de-template all these classes to
simplify the code.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 22, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and inline calls to rcl_get_default_allocator at
the call sites. Since this would introduce surprising behavior for
custom allocator users (using system malloc when a custom allocator
is requested), add compile-time assertions checking for non-default
allocators. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Technically, we could roll back a lot of templating on allocator and
explicitly use std::allocator<void> in all classes. I have refrained
from doing so because you might have plans with this code, but if you
concur, I'll be more than happy to de-template all these classes to
simplify the code.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 22, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and inline calls to rcl_get_default_allocator at
the call sites. Since this would introduce surprising behavior for
custom allocator users (using system malloc when a custom allocator
is requested), add compile-time assertions checking for non-default
allocators. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Technically, we could roll back a lot of templating on allocator and
explicitly use std::allocator<void> in all classes. I have refrained
from doing so because you might have plans with this code, but if you
concur, I'll be more than happy to de-template all these classes to
simplify the code.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 23, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 23, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 23, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 23, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 23, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.
@stevewolter
Copy link
Author

OK, this will take 4 commits to sort out. I'll first make get_rcl_allocator into a family of freestanding functions instead of a function template with explicit instantiation, so that current users of allocators (tlsf_cpp, demo_node) can add overrides for their custom allocators. Then I'll add these overrides, and finally delete the generic allocator implementation that causes the UB.

stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. Since allocator_common can't
be fixed, delete it and rely on implementers to provide a suitable
specialization. Since there's other blocking bugs for allocator use
(ros2#1061), this shouldn't break existing users.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 23, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix ros2#1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>
@stevewolter
Copy link
Author

First PR #1324 is ready for review. Chris, could you PTAL?

stevewolter added a commit to stevewolter/rclcpp that referenced this issue Sep 24, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix ros2#1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 29, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 29, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Sep 29, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Oct 1, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix ros2#1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/rclcpp that referenced this issue Oct 2, 2020
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix ros2#1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Oct 6, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.

Signed-off-by: Steve Wolter <swolter@google.com>
stevewolter added a commit to stevewolter/realtime_support that referenced this issue Oct 6, 2020
This removes one more user of the broken generic version of
get_rcl_allocator, which is causing
ros2/rclcpp#1254.

Signed-off-by: Steve Wolter <swolter@google.com>
clalancette pushed a commit that referenced this issue Apr 26, 2021
As explained in #1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix #1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>
tylerjw pushed a commit to tylerjw/rclcpp that referenced this issue Nov 12, 2022
As explained in ros2#1254, there's conceptually no way to implement RCL
allocators in terms of C++ allocators. In order to fix this behavior,
we have to delete the generic version of get_rcl_allocator. Since some
ROS code depends on this, we need to allow users to write their own
version of get_rcl_allocator for allocators that support the C-style
interface (most do).

So this CL changes get_rcl_allocator from a template function
into a family of (potentially templated) functions, which allows
users to add their own overloads and rely on the "most specialized"
mechanism for function specialization to select the right one. See
http://www.gotw.ca/publications/mill17.htm for details. This also
allows us to return get_rcl_default_allocator for all specializations
of std::allocator (previously, only for std::allocator), which will
already fix ros2#1254 for pretty much all clients. I'll continue to work
on deleting the generic version, though, to make sure that nobody is
accidentally bitten by it.

I've tried to test this by doing a full ROS compilation following the
Dockerfile of the source Docker image, and all packages compile.

Signed-off-by: Steve Wolter <swolter@google.com>

Addressed code style test failures.

Signed-off-by: Steve Wolter <swolter@google.com>

Update comments.

Incidentally, this also reruns the flaky test suite, which
seems to be using the real DDS middleware and to be prone
to cross-talk.

Signed-off-by: Steve Wolter <swolter@google.com>

Also update recently added tests.

Signed-off-by: Steve Wolter <swolter@google.com>

Added reference to bug number.

Signed-off-by: Steve Wolter <swolter@google.com>

Extend allocator lifetime in options.

Signed-off-by: Steve Wolter <swolter@google.com>

Drop the generic version of get_rcl_allocator.

This allows us to simplify a bunch of code as well.

Signed-off-by: Steve Wolter <swolter@google.com>

Update rclcpp/include/rclcpp/allocator/allocator_common.hpp

Co-authored-by: William Woodall <william+github@osrfoundation.org>
Signed-off-by: Steve Wolter <swolter@google.com>

Revert accidental deletion of allocator creation.

Signed-off-by: Steve Wolter <swolter@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants