Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding timeout for action client initialization #3029

Conversation

vinnnyr
Copy link
Contributor

@vinnnyr vinnnyr commented Jun 21, 2022

Signed-off-by: Vinny Ruia vinny.ruia@fireflyautomatix.com


Basic Info

Info Please fill out this column
Ticket(s) this addresses #3027
Primary OS tested on N/A*
Robotic platform tested on N/A*
  • I am currently having trouble compiling from main on my system. I started this PR so that I can see if CI can do it.

Description of contribution in a few bullet points

Adding timeout to the action server initialization

Description of documentation updates required from your changes

N/A


Future work that may be required in bullet points

N/A

For Maintainers:

  • Check that any new parameters added are updated in navigation.ros.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
@mergify
Copy link
Contributor

mergify bot commented Jun 21, 2022

@vinnnyr, please properly fill in PR template in the future. @SteveMacenski, use this instead.

  • Check that any new parameters added are updated in navigation.ros.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@@ -90,7 +90,12 @@ class BtActionNode : public BT::ActionNodeBase

// Make sure the server is actually there before continuing
RCLCPP_DEBUG(node_->get_logger(), "Waiting for \"%s\" action server", action_name.c_str());
action_client_->wait_for_action_server();
if (!action_client_->wait_for_action_server(server_timeout_)) {
Copy link
Contributor Author

@vinnnyr vinnnyr Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think I have to rethink this a small bit. The default server_timeout is 20ms, which is not a lot of time to wait by default.

When trying to launch the system:

[lifecycle_manager-3] [INFO] [1655777607.945443640] [lifecycle_manager_navigation]: Activating bt_navigator
[bt_navigator-2] [INFO] [1655777607.945732127] [bt_navigator]: Activating
[bt_navigator-2] [ERROR] [1655777607.968145727] [bt_navigatornavigate_through_poses_rclcpp_node]: "compute_path_through_poses" action server not available after waiting for 20 ms
[bt_navigator-2] [ERROR] [1655777607.976364703] []: Caught exception in callback for transition 13
[bt_navigator-2] [ERROR] [1655777607.976390295] []: Original error: Action server not available
[bt_navigator-2] [WARN] [1655777607.976438741] []: Error occurred while doing error handling.
[bt_navigator-2] [FATAL] [1655777607.976449793] [bt_navigator]: Lifecycle node bt_navigator does not have error state implemented
[lifecycle_manager-3] [ERROR] [1655777607.976858429] [lifecycle_manager_navigation]: Failed to change state for node: bt_navigator
[lifecycle_manager-3] [ERROR] [1655777607.976902588] [lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup.

Should I declare a new parameter for this timeout?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to have a new timeout since that might confuse people, even if well named - but maybe there's a good name that would be clear so its moot? wait_for_action_server_timeout server_initialization_timeout?

Otherwise, if we can't come up with a good & specific name, I think just hardcoding it at some value would be fine. I'd say 1s just to be nice and round. We have a few like that (only a few) in the stack where it didn't make much sense to parameterize it for various reasons and since they were just initialization operations, it wasn't super important for them to be configurable for 99.8% of users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say 1s just to be nice and round

I think I agree

@SteveMacenski
Copy link
Member

Where is this exception caught? I don't see in the loading process where there's anything catching exceptions so that this wouldn't bring down the process (unless its being caught at the lifecycle node level?).

If it happens in the lifecycle node process for starting up the node, where would it be caught then in steady state? We can specify BTs for use for every navigation request, so we could change trees anything during execution. In that case, there wouldn't be a lifecycle node transition to catch the exception so it wouldn't crash - when a different BT is specified in a navigation request and thus is being constructed

@vinnnyr
Copy link
Contributor Author

vinnnyr commented Jun 21, 2022

Good point.

At least on startup, the exception is caught by the Parent Lifecycle Node

server_name | [action_server_1] [ERROR] [1655837597.993815093] [action_name_rclcpp_node]: "follow_path" action server not available after waiting for 2000 ms
server_name | [action_server_1] [ERROR] [1655837598.076770031] []: Caught exception in callback for transition 13
server_name | [action_server_1] [ERROR] [1655837598.076800292] []: Original error: Action server not available
server_name | [action_server_1] [WARN] [1655837598.076856566] []: Error occurred while doing error handling.
server_name | [ERROR] [launch_ros.actions.lifecycle_node]: Failed to make transition 'TRANSITION_ACTIVATE' for LifecycleNode '/action'

I can look into how it gets caught in steady state too.

@SteveMacenski
Copy link
Member

Prob just need another try/catch somewhere in the chain

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
@@ -90,7 +90,14 @@ class BtActionNode : public BT::ActionNodeBase

// Make sure the server is actually there before continuing
RCLCPP_DEBUG(node_->get_logger(), "Waiting for \"%s\" action server", action_name.c_str());
action_client_->wait_for_action_server();
static constexpr std::chrono::milliseconds wait_for_server_timeout =
std::chrono::milliseconds(1000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hardcoded to a reasonable 1s now

@vinnnyr
Copy link
Contributor Author

vinnnyr commented Jun 21, 2022

@SteveMacenski Take a look at your convenience. I believe I addressed your concerns, including handling the "switch at runtime" case.

@vinnnyr
Copy link
Contributor Author

vinnnyr commented Jun 21, 2022

As an aside, I am unsure which tests are actually failing here.
Atleast on my browser / machine, the CircleCI tab for tests seems to be hanging indefinitely
https://app.circleci.com/pipelines/github/ros-planning/navigation2/7517/workflows/12f61cb3-ffeb-414c-8722-b47d4e422afa/jobs/26329/tests

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean std::chrono::milliseconds wait_for_server_timeout = std::chrono::milliseconds(1000); works, but we can just do action_client_->wait_for_action_server(1s) if you just include chrono literals.

@SteveMacenski
Copy link
Member

The tests came through, it did have 1 failure that is unusual so retriggering, but I don't think its related to this PR. I'll let it spin though and check back in the morning

@SteveMacenski SteveMacenski merged commit bccf6d6 into ros-navigation:main Jun 22, 2022
redvinaa pushed a commit to EnjoyRobotics/navigation2 that referenced this pull request Jun 30, 2022
* adding timeout for action client initialization

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>

* adding constant 1s timeout, catching exception

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
SteveMacenski pushed a commit that referenced this pull request Aug 24, 2022
* adding timeout for action client initialization

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>

* adding constant 1s timeout, catching exception

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
SteveMacenski added a commit that referenced this pull request Aug 24, 2022
* Fix size_t format specifier (#3003)

* clear up params file (#3004)

Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com>

* Bt test fix (#2999)

* fixed tests

* undo

* linting fix (#3007)

Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com>

* Add nav2_constrained_smoother to metapackage

* adding humble release to table (#3008)

* Fix for costmap nodes lifecycle status (#3009)

Lifecycle status for global and local cost nodes not correct.
ros2 lifecycle/service commands  shows unconfigured for these two.
This is due to directly calling on_configure/on_activate/on_cleanup
calls in parent node.  This PR to replace on_xxxxxx() to
configure()/activate()/cleanup() calls of lifecycle base.

Signed-off-by: Arshad <arshad.mehmood@intel.com>

* Get parameters on configure transition addressing #2985 (#3000)

* Get parameters on configure transition

Signed-off-by: MartiBolet <mboletboixeda@gmail.com>

* Remove past setting of parameters

Signed-off-by: MartiBolet <mboletboixeda@gmail.com>

* Expose transition functions to public for test

Signed-off-by: MartiBolet <mboletboixeda@gmail.com>

* fix: wrong input type in navigate_to_pose_action.hpp and navigate_to_… (#2994)

* fix: wrong input type in navigate_to_pose_action.hpp and navigate_to_pose_action.hpp

* Update navigate_through_poses_action.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Nav2 Velocity Smoother (#2964)

* WIP velocity smoother with ruckig

* a few comments

* vel smoother prototype

* updating defaults

* adding defaults to readme

* removing note from readme

* updates to velocity smoother TODO items

* adding unit tests

* finishing system tests

* adding failure to change parameters tests

* fix last bits

* fixing negative sign bug

* lint

* update tests

* setting defaults

* Adding warning

* Update velocity_smoother.cpp

* Fixing rebase issue

* adding plugin type for static in local + removing unused print (#3021)

* removed extra includes (#3026)

* remove extra sub (#3025)

* Fix missing dependency on nav2_velocity_smoother (#3031)

* adding timeout for action client initialization (#3029)

* adding timeout for action client initialization

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>

* adding constant 1s timeout, catching exception

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>

* cleanup warnings (#3028)

* cleanup warnings

* removed referenc

* installing plugins folder of nav2_behaviors package (#3051)

Co-authored-by: Srijanee Biswas <srijanee.biswas@toyotatmh.com>

* Completed PR 2929 (#3038)

* accepting empty yaml_filename if no initial map is available

* invalid load_map-request does not invalidate existing map, added Testcase

* style

* finish PR 2929

* finish linting

* removing change

* removing change

* Update test_map_server_node.cpp

* Update test_map_server_node.cpp

Co-authored-by: Nikolas Engelhard <nikolas.engelhard@gmail.com>

* zero z values in rpp transformed plan (#3066)

* removing get node options default for nav2 utils (#3073)

* adding looping timeout for lifecycle service clients + adding string name of action for BT action nodes (#3071)

* adding looping timeout for lifecycle service clients + adding string name of action for BT action nodes

* fix linting

* remove inline comment

* adding goal updated controller node to test

* Smac Planner 2D changes (#3083)

* removing 4-connected option from smac; fixing 2D heuristic and traversal function from other planner's changes

* fix name of variable since no longer a neighborhood

* partial test updates

* ported unit tests fully

* revert to no costmap downsampling

* Collision monitor (#2982)

* Add Collision Monitor node

* Meet review items

* Fix next review items

* Code cleanup

* Support dynamic footprint. More optimizations.

* Switch to multiple footprints. Move variables.
Remove odom subscriber. Add 0-velocity optimization

* Update nav2_collision_monitor/include/nav2_collision_monitor/polygon.hpp

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/params/collision_monitor_params.yaml

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/params/collision_monitor_params.yaml

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/params/collision_monitor_params.yaml

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Meet smaller review items

* Add fixes found during unit test development

* Fix uncrustify issues

* Add unit tests

* Fix number of polygons points

* Move tests

* Add kinematics unit test

* Minor tests fixes

* Remove commented line

* Add edge case checking testcase and references

* Update comment

* Add README.md

* Fixed table

* Minor changes in README.md

* Fix README.md for documentation pages

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Update nav2_collision_monitor/README.md

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* Meet review items

* Meet review items (part 2)

* Update polygons picture for README

* Change simulation_time_step to 0.1

* Fix bounding boxes to fit the demo from README.md

* Terminology fixes

Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>

* removing the extra argument in class dec (#3084)

* Fix for #3078, fix image path in YAML (#3082)

* Fix for #3078, fix image path in YAML

* Update map_io.cpp

* Update map_io.cpp

* Update map_io.cpp

* Update map_io.cpp

* do not depend on velocity for approach scaling (#3047)

* do not depend on velocity for approach scaling

* add approach_velocity_scaling_dist to README

* do not pass references to shared ptr

* fixup! do not pass references to shared ptr

* fix approach velocity scaling compile issues

* default approach_velocity_scaling_dist based on costmap size

* change default approach_velocity_scaling_dist to 0.6 to match previous behavior

* update tests for approach velocity scaling

* remove use_approach_linear_velocity_scaling from readme

* smooth approach velocity scaling

* Use correct timeout when checking future (#3087)

* Adds missing commas so default plugin names are not stuck together (#3093)

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

* Fix Costmap Filters system tests (#3120)

* Fix Costmap Filters system tests

* Update map_io.cpp

Co-authored-by: Alexey Merzlyakov <alexey.merzlyakov@samsung.com>

* Finding distance H if obstacle H is <= 0 (#3122)

* adding checks on goal IDs in results for waypoint follower (#3130)

* ComputePathToPose Sets empty path on blackboard if action is aborted or cancelled (#3114)

* set empty path on black on failure

* docs

* switched to correct message type

* set empty path for compute_path_through_poses

* Ignore feedback from old goals in waypoint follower (#3139)

* apply joinchar in pathify (#3141)

Co-authored-by: kevin <kevin@floatic.io>

* Log level (#3110)

* added log level for navigation launch

* localization log level

* slam log level

* revert use_comp

* added log level to components

* linting and uncrusitfy fixes for CI (#3144)

* start is now added to the path (#3140)

* start is now added to the path

* lint fix

* Update README.md (#3147)

Current example doesn't work with the updates.

* fixing linting problem

* Setting map map's yaml path to empty string, since overridden in launch (#3123)

* Update nav2_params.yaml

* Update nav2_params.yaml

* Update nav2_params.yaml

* bumping to 1.1.1 for release

Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com>
Signed-off-by: Arshad <arshad.mehmood@intel.com>
Signed-off-by: MartiBolet <mboletboixeda@gmail.com>
Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
Co-authored-by: M. Mostafa Farzan <m2_farzan@yahoo.com>
Co-authored-by: Zhenpeng Ge <zhenpeng.ge@qq.com>
Co-authored-by: Joshua Wallace <47819219+jwallace42@users.noreply.github.com>
Co-authored-by: Arshad Mehmood <arshad.mehmood@intel.com>
Co-authored-by: MartiBolet <43337758+MartiBolet@users.noreply.github.com>
Co-authored-by: shoufei <907575489@qq.com>
Co-authored-by: hodnajit <jitkahodna67@gmail.com>
Co-authored-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
Co-authored-by: SrijaneeBiswas <30804865+SrijaneeBiswas@users.noreply.github.com>
Co-authored-by: Srijanee Biswas <srijanee.biswas@toyotatmh.com>
Co-authored-by: Nikolas Engelhard <nikolas.engelhard@gmail.com>
Co-authored-by: Adam Aposhian <aposhian.dev@gmail.com>
Co-authored-by: Alexey Merzlyakov <60094858+AlexeyMerzlyakov@users.noreply.github.com>
Co-authored-by: Pradheep Krishna <padhupradheep@gmail.com>
Co-authored-by: nakai-omer <108797279+nakai-omer@users.noreply.github.com>
Co-authored-by: Samuel Lindgren <samuel@dynorobotics.se>
Co-authored-by: Aaron Chong <aaronchongth@gmail.com>
Co-authored-by: Alexey Merzlyakov <alexey.merzlyakov@samsung.com>
Co-authored-by: Pedro Alejandro González <71234974+pepisg@users.noreply.github.com>
Co-authored-by: 정찬희 <60467877+ladianchad@users.noreply.github.com>
Co-authored-by: kevin <kevin@floatic.io>
Co-authored-by: Austin Greisman <92941098+austin-InDro@users.noreply.github.com>
jwallace42 pushed a commit to jwallace42/navigation2 that referenced this pull request Dec 14, 2022
* adding timeout for action client initialization

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>

* adding constant 1s timeout, catching exception

Signed-off-by: Vinny Ruia <vinny.ruia@fireflyautomatix.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants