Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace ready_fn with ReadyToTest action #391

Merged

Conversation

pbaughman
Copy link
Contributor

@pbaughman pbaughman commented Oct 21, 2019

The ready_fn will be deprecated in the future in favor of the ReadyToTest() action in launch_testing. See ros2/launch#346 (comment) for background information

Note I had some trouble getting these tests to build and run locally, so I'll pay extra-close attention to the CI jobs on this one

Signed-off-by: Pete Baughman pete.baughman@apex.ai

@sloretz
Copy link
Contributor

sloretz commented Oct 21, 2019

CI (testing just test_cli_remapping test_communication test_rclcpp test_security)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@pbaughman
Copy link
Contributor Author

I don't understand these failures, but some of them are the same on Linux and macOS. WIll investigate:


53: Test command: /home/jenkins-agent/workspace/ci_linux/venv/bin/python3 "-u" "/home/jenkins-agent/workspace/ci_linux/ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_results/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp.xunit.xml" "--package-name" "test_communication" "--output-file" "/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/launch_test/CHANGEME.txt" "--append-env" "LD_LIBRARY_PATH=/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication" "--command" "/home/jenkins-agent/workspace/ci_linux/venv/bin/python3" "-m" "launch_testing.launch_test" "/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp_.py" "--junit-xml=/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_results/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp.xunit.xml" "--package-name=test_communication"
53: Test timeout computed to be: 180
53: -- run_test.py: extra environment variables to append:
53:  - LD_LIBRARY_PATH+=/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication
53: -- run_test.py: invoking following command in '/home/jenkins-agent/workspace/ci_linux/ws/src/ros2/system_tests/test_communication':
53:  - /home/jenkins-agent/workspace/ci_linux/venv/bin/python3 -m launch_testing.launch_test /home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp_.py --junit-xml=/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_results/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp.xunit.xml --package-name=test_communication
53: 
53: Starting test run test_communication.test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp_.launch_tests[Arrays]
53: [INFO] [launch]: All log files can be found below /home/rosbuild/.ros/log/2019-10-21-23-58-18-759304-2536c550ee6c-15566
53: [INFO] [launch]: Default logging verbosity is set to INFO
53: test_subscriber_terminates_in_a_finite_amount_of_time[Arrays] (test_communication.TestPublisherSubscriber)
53: Test that the subscriber terminates after a finite amount of time. ... [INFO] [test_publisher-1]: process started with pid [15570]
53: [INFO] [test_subscriber-2]: process started with pid [15571]
53: [INFO] [test_subscriber-2]: sending signal 'SIGINT' to process[test_subscriber-2]
53: [INFO] [test_publisher-1]: sending signal 'SIGINT' to process[test_publisher-1]
53: [INFO] [test_publisher-1]: process has finished cleanly [pid 15570]
53: [INFO] [test_subscriber-2]: process has finished cleanly [pid 15571]
53: FAIL
53: 
53: ======================================================================
53: FAIL: test_subscriber_terminates_in_a_finite_amount_of_time[Arrays] (test_communication.TestPublisherSubscriber)
53: Test that the subscriber terminates after a finite amount of time.
53: ----------------------------------------------------------------------
53: Traceback (most recent call last):
53:   File "/home/jenkins-agent/workspace/ci_linux/ws/build/test_communication/test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp_.py", line 66, in test_subscriber_terminates_in_a_finite_amount_of_time
53:     proc_info.assertWaitForShutdown(process=subscriber_process, timeout=10)
53:   File "/home/jenkins-agent/workspace/ci_linux/ws/install/launch_testing/lib/python3.6/site-packages/launch_testing/proc_info_handler.py", line 141, in assertWaitForShutdown
53:     assert success, "Timed out waiting for process '{}' to finish".format(process)
53: AssertionError: Timed out waiting for process '<launch.actions.execute_process.ExecuteProcess object at 0x7fea617c82e8>' to finish
53: 
53: ----------------------------------------------------------------------
53: Ran 1 test in 10.002s
53: 
53: FAILED (failures=1)
53: test_processes_finished_gracefully[Arrays] (test_communication.TestPublisherSubscriberAfterShutdown)
53: Test that both executables finished gracefully. ... 
53: Starting test run test_communication.test_publisher_subscriber__rclpy__rmw_fastrtps_cpp__rmw_connext_cpp_.launch_tests[BasicTypes]
53: [INFO] [launch]: All log files can be found below /home/rosbuild/.ros/log/2019-10-21-23-58-18-759304-2536c550ee6c-15566
53: [INFO] [launch]: Default logging verbosity is set to INFO
53: ok
53: 
53: ----------------------------------------------------------------------
53: Ran 1 test in 0.000s

In this particular test I only swapped out the ReadyToTest action for the ready_fn - maybe there's some unintended difference about the timing that I need to understand better

Signed-off-by: Pete Baughman <pete.baughman@apex.ai>
@pbaughman pbaughman force-pushed the remove_deprecated_launch_testing_features branch from a08c4fe to f2047d2 Compare January 24, 2020 17:14
@pbaughman
Copy link
Contributor Author

@sloretz Sorry this has been sitting here forever - I've just started working on this again and I've got it all rebased. I'm not seeing the same failure locally - do you mind kicking off another CI job? It's possible a change in launch_testing in the last 3 months has fixed this.

@sloretz
Copy link
Contributor

sloretz commented Jan 24, 2020

CI (testing just test_cli_remapping test_communication test_rclcpp test_security)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@pbaughman
Copy link
Contributor Author

Thank you - the error has changed a little, but the root cause appears to be the same. I will continue to investigate

@pbaughman
Copy link
Contributor Author

pbaughman commented Jan 24, 2020

I see now. . . this is a pretty wild test-within-a-test. The process that's hanging is called test_remote_parameters_cpp is actually a gtest. It gets to here:

[ RUN      ] parameters__rmw_implementation.test_remote_parameters_sync
Setting parameters

And then hangs forever, apparently.

I can get the issue to happen on my machine now - unfortunately it also happens on 'master' for me so I'm not sure how I can proceed with this PR.

@jacobperron it looks like this is hanging here

If while it's hung, I do ros2 service list from another terminal, I see:

root@2d088e2a08c7:/system_tests# ros2 service list
/test_parameters_server_allow_undeclared/describe_parameters
/test_parameters_server_allow_undeclared/get_parameter_types
/test_parameters_server_allow_undeclared/get_parameters
/test_parameters_server_allow_undeclared/list_parameters
/test_parameters_server_allow_undeclared/set_parameters
/test_parameters_server_allow_undeclared/set_parameters_atomically
/test_parameters_server_must_declare/describe_parameters
/test_parameters_server_must_declare/get_parameter_types
/test_parameters_server_must_declare/get_parameters
/test_parameters_server_must_declare/list_parameters
/test_parameters_server_must_declare/set_parameters
/test_parameters_server_must_declare/set_parameters_atomically
/test_remote_parameters_sync/describe_parameters
/test_remote_parameters_sync/get_parameter_types
/test_remote_parameters_sync/get_parameters
/test_remote_parameters_sync/list_parameters
/test_remote_parameters_sync/set_parameters
/test_remote_parameters_sync/set_parameters_atomically

So it looks like the service that it's trying to call is there. Do you have any insight into why this might be hanging? I'm getting the hang by running

root@2d088e2a08c7:/system_tests# ./build/test_rclcpp/test_parameters_server_cpp &
[1] 13115

root@2d088e2a08c7:/system_tests# ./build/test_rclcpp/test_remote_parameters_cpp
Running main() from /opt/ros/foxy/src/gtest_vendor/src/gtest_main.cc
[==========] Running 6 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from parameters__rmw_implementation
[ RUN      ] parameters__rmw_implementation.test_remote_parameters_async
Setting parameters
Got set_parameters result
Listing parameters with recursive depth
Listing parameters with depth 1
Listing parameters with depth 2
Getting parameters
Getting nonexistent parameters
Listing parameters with recursive depth
[       OK ] parameters__rmw_implementation.test_remote_parameters_async (228 ms)
[ RUN      ] parameters__rmw_implementation.test_remote_parameters_sync
Setting parameters  

@jacobperron
Copy link
Member

Sounds like a occurrence of ros2/build_farmer#198.
Looking at the referenced ticket, it isn't clear to me what resolved it, or if it was actually resolved.

I don't have any ideas at the moment.

@pbaughman
Copy link
Contributor Author

@jacobperron Wild - I can get it to happen every time on my machine - let me know if there's anything you think of to try and I'm happy to dig deeper.

@pbaughman
Copy link
Contributor Author

Any idea how I should proceed with this? It seems like I'm blocked on some unrelated bug.

@jacobperron
Copy link
Member

jacobperron commented Feb 4, 2020

Looking at the lastest CI run on this thread, I looks like all of the macOS failures are occurring on our nightly CI, but the test_parameter_server_cpp__rmw_fastrtps_cpp hasn't appeared for the last few nights. I'll retrigger CI to see if the failure persists:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status (same failures on nightly)
  • Windows Build Status (same failure on nightly)
  • Windows-container Build Status

@pbaughman
Copy link
Contributor Author

At least on my machine, it appears that the test is failing because the parameter server really is hanging. I can improve the output of the test a bit to make it easier to see what's happening, but as far as I can tell it's not a problem with the test

@jacobperron
Copy link
Member

The failures present in the latest CI I've triggered are also occurring on our nightlies, so I'd say that this change is good to merge.

@jacobperron jacobperron merged commit a5e18f3 into ros2:master Feb 4, 2020
@pbaughman pbaughman deleted the remove_deprecated_launch_testing_features branch February 20, 2020 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants