Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[error] netlink_connector::receive_cbk received error message: 2 seq 3 #492

Open
dprogm opened this issue Jul 17, 2023 · 12 comments
Open

[error] netlink_connector::receive_cbk received error message: 2 seq 3 #492

dprogm opened this issue Jul 17, 2023 · 12 comments

Comments

@dprogm
Copy link

dprogm commented Jul 17, 2023

Hello everybody 😉

We are getting the following error on an ARM device which is running the service part. The following error is thrown on startup:

[info] Network interface "bridge0" state changed: up
[error] netlink_connector::receive_cbk received error message: 2 seq 3

Normally we are expecting logging output regarding the routing/multicast.

This issue is actually a duplicate of #214, but it is slightly different and there is still no answer available.

We are thinking that this is the reason why the client is not able to connect to the service. Is there any requirement regarding IP addresses (range) or netmask? We are using the 10/8 address space.

Thanks in advance

@hugojacob
Copy link

I'm seeing the same when using kernel 6.x.

Is it a known bug?

@dprogm
Copy link
Author

dprogm commented Aug 27, 2023

What version of boost do you use? Maybe a too up to date version might cause troubles.

I will give an update tomorrow regarding the linux kernel versions used in our setup. But we aren't using 6.x at least.

@hugojacob
Copy link

hugojacob commented Aug 30, 2023

The issue seems to be related to vsomeip >= 3.3.0. Downgrading to vsomeip 3.1.37.1 works fine for me.

Here is a bit more of information about my environment

  • Building on AOSP 14
  • Boost 1.76 (also tested with Boost 1.82)
  • Kernel 6.3.11

vsomeip >= 3.3.0 never joins the broadcast group and thus never gets the SD messages:

[INFO] AbstractTester::init(): initializing vsomeip app.
2023-08-30 17:07:26.463308 [info] Using configuration file: "etc/vsomeip_client.json".
2023-08-30 17:07:26.463309 [info] Parsed vsomeip configuration in 0ms
2023-08-30 17:07:26.463309 [info] Configuration module loaded.
2023-08-30 17:07:26.463309 [info] Initializing vsomeip (3.3.8) application "hello_tester".
2023-08-30 17:07:26.463310 [info] Instantiating routing manager [Host].
2023-08-30 17:07:26.463311 [info] create_routing_root: Routing root @ /tmp/vsomeip-0
2023-08-30 17:07:26.463311 [info] Service Discovery enabled. Trying to load module.
2023-08-30 17:07:26.463315 [info] Service Discovery module loaded.
2023-08-30 17:07:26.463315 [info] Application(hello_tester, 3500) is initialized (11, 100).
2023-08-30 17:07:26.463315 [info] Starting vsomeip application "hello_tester" (3500) using 2 threads I/O nice 255
2023-08-30 17:07:26.463316 [info] Client [3500] routes unicast:192.168.98.1, netmask:255.255.255.0
2023-08-30 17:07:26.463316 [info] main dispatch thread id from application: 3500 (hello_tester) is: 7f72e35116c0 TID: 3360556
2023-08-30 17:07:26.463317 [info] shutdown thread id from application: 3500 (hello_tester) is: 7f72e2d106c0 TID: 3360557
2023-08-30 17:07:26.463318 [info] Watchdog is disabled!
[INFO] AbstractTester::onVsomeipStateChange -> 0
[INFO] AbstractTester::onVsomeipStateChange: registering handlers
[INFO] Requesting service 0x3500.0x1000
2023-08-30 17:07:26.463320 [info] REQUEST(3500): [3500.1000:255.4294967295]
2023-08-30 17:07:26.463320 [info] io thread id from application: 3500 (hello_tester) is: 7f72e3d13740 TID: 3360554
2023-08-30 17:07:26.463320 [info] io thread id from application: 3500 (hello_tester) is: 7f72e1d0e6c0 TID: 3360559
2023-08-30 17:07:26.463321 [info] create_local_server: Listening @ /tmp/vsomeip-3500
[INFO] HelloTester::onVsomeipAvailabilityChange: service availability changed 0x3500.0x1000 -> NOT available
2023-08-30 17:07:26.463327 [info] Network interface "cvd-ebr" state changed: up
2023-08-30 17:07:26.463329 [error] netlink_connector::receive_cbk received error message: 2 seq 3
2023-08-30 17:07:26.464323 [warning] BLOCKING CALL AVAILABILITY(3500): [3500.1000]
2023-08-30 17:07:26.464325 [info] dispatch thread id from application: 3500 (hello_tester) is: 7f72e150d6c0 TID: 3360560
[INFO] Trying again...
2023-08-30 17:07:31.513323 [info] REQUEST(3500): [3500.1000:255.4294967295]

But also note that with vsomeip 3.1.37.1 I also get the error message mentioned in this thread:

2023-07-30 17:56:04.643669 [info] Network interface "cvd-ebr" state changed: up
2023-07-30 17:56:04.643672 [error] netlink_connector::receive_cbk received error message: Unknown error -16 type 26 seq 3
2023-07-30 17:56:04.643673 [info] Retrying netlink route request
2023-07-30 17:56:04.643674 [info] Route "224.0.0.0/4 if: cvd-ebr gw: n/a" state changed: up
2023-07-30 17:56:04.643674 [info] udp_server_endpoint_impl: SO_RCVBUF is: 1703936 (1703936) local port:30491
2023-07-30 17:56:04.643675 [debug] Joining to multicast group 224.0.0.0 from 192.168.98.1

@dprogm
Copy link
Author

dprogm commented Sep 1, 2023

Okay, thanks for the hint. I will check the concrete vsomeip version.

@hugojacob
Copy link

Hey @dprogm,

just a friendly ping to check whether have you had the time to look into this.

@dprogm
Copy link
Author

dprogm commented Sep 25, 2023

Hi @hugojacob Sorry for not getting back to you quickly. I had no access to my workstation the last few weeks. I will check the SomeIP version tomorrow and give an update here. But I am pretty sure that we are using the dev state (HEAD of master) of the SomeIP library in conjunction with CommonAPI (see https://github.com/COVESA/capicxx-someip-tools/wiki/CommonAPI-C---SomeIP-in-10-minutes). I will pin the version and check again.

@dprogm
Copy link
Author

dprogm commented Sep 27, 2023

@hugojacob Small update. I am using Version 3.3.8 and there is no error. I will check the concrete Version from my co-worker too.

@goncaloalmeida
Copy link
Contributor

@dprogm the problem is still occurring? from your last comment, it seems not.

cc @fcmonteiro

@elektronenhirn
Copy link

The problem is still there, I see it, too.

My setup:

  • vsomeip: 3.4.10
  • host os: Ubuntu 22.04
  • host kernel: 6.2.0-39-generic
  • building with Android 13's soong

I digged a bit deeper and found out some interesting things:

  1. as @hugojacob already noticed, the error messages differs between different vsomeip versions. that is because the formatting of the message got changed. but the error is actually the same
  2. we both receive an EBUSY return code from the kernel
  3. there was a patch with a retry mechanism submitted by @phiwer here: a268e2d
  4. but the patch got meanwhile removed here: 826ebb8
  5. I re-applied the patch locally and it fixed my issue: the 2nd re-try usually works for me

Log output with some custom trace messages:

//Netlink get route fails 1. time
2024-01-05 16:02:57.778813 [error] netlink_connector::receive_cbk received error message: Unknown error -16 type 26 seq 3

//1. Netlink retry
2024-01-05 16:02:57.778814 [warning] send_rt_request
2024-01-05 16:02:57.778814 [info] Retrying netlink route request
2024-01-05 16:02:57.778814 [warning] receive_cbk
2024-01-05 16:02:57.778814 [info] offer_event: Event [1236.567a.ea68] uses configured cycle time 0ms
2024-01-05 16:02:57.778814 [warning] receive_cbk
2024-01-05 16:02:57.778814 [info] REGISTER EVENT(1212): [1236.567a.ea68:is_provider=true]
2024-01-05 16:02:57.778814 [warning] receive_cbk
2024-01-05 16:02:57.778814 [warning] receive_cbk

//Netlink get route fails 2. time
2024-01-05 16:02:57.778814 [error] netlink_connector::receive_cbk received error message: Unknown error -16 type 26 seq 3

//2. Netlink retry
2024-01-05 16:02:57.778814 [warning] send_rt_request
2024-01-05 16:02:57.778814 [info] OFFER(1212): [1236.567a:1.0] (true)
2024-01-05 16:02:57.778815 [info] Retrying netlink route request
2024-01-05 16:02:57.778815 [warning] receive_cbk
2024-01-05 16:02:57.778815 [warning] receive_cbk

//now it seems to work :-)
2024-01-05 16:02:57.778817 [info] Route "224.224.224.245/32 if: cvd-ebr gw: n/a" state changed: up
2024-01-05 16:02:57.778818 [info] udp_server_endpoint_impl: SO_RCVBUF is: 212992 (1703936) local port:30490
2024-01-05 16:02:57.778818 [debug] Joining to multicast group 224.224.224.245 from 192.168.98.1
2024-01-05 16:02:57.778818 [info] SOME/IP routing ready.
2024-01-05 16:02:57.778818 [info] udp_server_endpoint_impl<multicast>: SO_RCVBUF is: 212992 (1703936) local port:30490
2024-01-05 16:02:57.778821 [warning] receive_cbk

Is it possible to re-open the issue?

Seems like a retry mechanism would actually make sense. At least on environments dealing with Android Trout / virtualized network environments. However I have no idea how this causes the EBUSY. But it seems like the similarity between @hugojacob ' setup and mine.

@hugojacob
Copy link

It seems to be the same as issue #607. There, the two hosts are a Linux-based machine and not AOSP 14 (as @elektronenhirn 's and my setup).

@hugojacob
Copy link

The issue I linked above (#607) is fixed with PR #591.

I'll give it a try and come back to you.

@duartenfonseca
Copy link
Collaborator

@elektronenhirn did you try with the fix from #591 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants