Will Fast DDS work abnormally if CPU affinity is set in a real-time process? #5272

Open
1 task done
ClarkZaitun opened this issue Sep 27, 2024 · 0 comments
Labels
triage Issue pending classification


Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

[Screenshot: 20240927-111224]

Current behavior

The EtherCAT protocol requires hard real-time, deterministic behavior. I use Fast DDS for node communication in the EtherCAT Master.
My EtherCAT Master process has the following characteristics:

  1. The node communication cycle runs at 1000 Hz.
  2. The process locks its memory at startup to prevent it from being swapped to disk, and sets the scheduling policy to FIFO (Linux PREEMPT_RT).
  3. Soon after the process starts, it sets the CPU affinity of its threads. Although I only set the affinity of 4 threads to CPU 0, the screenshot provided shows that all threads of the process are running on CPU 0.

After running for a period of time, the Master stops running completely. The CPU usage of the Master is normally about 15%; when the failure occurs it rises to 97%, of which the dds.asyn.0.0 thread accounts for 92%.
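One possible reason for seeing every thread on CPU 0, not confirmed for this process, is affinity inheritance: on Linux, a thread created with pthread_create (and therefore std::thread) starts with a copy of its creator's CPU affinity mask. If the Fast DDS threads were created from a thread that had already been pinned, they would be pinned as well. The sketch below demonstrates only this general behavior. The helper names are hypothetical and are not part of Fast DDS or of the EtherCAT Master.

```cpp
#include <pthread.h>
#include <sched.h>

#include <thread>

// Hypothetical helper: lowest-numbered CPU the calling thread may run on, or -1.
int first_allowed_cpu()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
    {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    {
        if (CPU_ISSET(cpu, &set))
        {
            return cpu;
        }
    }
    return -1;
}

// Hypothetical helper: pin only the calling thread to `cpu`.
// Returns 0 on success, otherwise an errno value.
int pin_calling_thread(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

// Hypothetical helper: start a thread and report how many CPUs it is
// allowed to run on. The new thread inherits the creator's affinity mask.
int cpus_allowed_in_new_thread()
{
    int count = -1;
    std::thread worker([&count]
            {
                cpu_set_t set;
                CPU_ZERO(&set);
                if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) == 0)
                {
                    count = CPU_COUNT(&set);
                }
            });
    worker.join();
    return count;
}
```

Calling pin_calling_thread(first_allowed_cpu()) and then cpus_allowed_in_new_thread() should report a single allowed CPU for the new thread, even though the new thread never set its own affinity. Whether this applies here depends on which thread creates the DomainParticipant and when the affinity is set relative to that call.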

ps -T -p 3211336

PID    SPID TTY          TIME CMD
3211336 3211336 pts/6 00:00:02 ethercat_ma
3211336 3211337 pts/6 00:00:00 Log
3211336 3211338 pts/6 00:00:00 ClkTask
3211336 3211385 pts/6 00:00:01 JobTask
3211336 3211386 pts/6 00:00:00 dds.shm.wdog
3211336 3211387 pts/6 00:00:00 dds.ev.0
3211336 3211388 pts/6 00:00:00 dds.udp.20400
3211336 3211389 pts/6 00:00:00 dds.udp.20410
3211336 3211390 pts/6 00:00:00 dds.shm.20411
3211336 3211391 pts/6 00:00:00 dds.udp.20411
3211336 3211392 pts/6 00:00:00 dds.asyn.0.0
3211336 3211393 pts/6 00:00:00 dds.dsha.2820
3211336 3211394 pts/6 00:00:00 dds.dsha.3332
3211336 3211395 pts/6 00:00:00 dds.dsha.3844
3211336 3211396 pts/6 00:00:00 dds.dsha.4356
3211336 3211397 pts/6 00:00:00 ethercat_ma
3211336 3211398 pts/6 00:00:00 ethercat_ma
3211336 3211399 pts/6 00:00:00 ethercat_ma
3211336 3211400 pts/6 00:00:00 ethercat_ma
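The listing above shows thread names but not where each thread is allowed to run. As a rough sketch, the allowed CPU list of every thread can be read from the Cpus_allowed_list field of /proc/<pid>/task/<tid>/status. The function name thread_affinities and its return layout are assumptions made for illustration.

```cpp
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <system_error>

// Hypothetical helper: if `line` starts with `key`, return the rest of the
// line without leading blanks or tabs; otherwise return an empty string.
static std::string value_after(const std::string& line, const std::string& key)
{
    if (line.compare(0, key.size(), key) != 0)
    {
        return "";
    }
    const std::size_t pos = line.find_first_not_of(" \t", key.size());
    return pos == std::string::npos ? "" : line.substr(pos);
}

// Hypothetical helper: maps "<tid> (<thread name>)" to the thread's
// Cpus_allowed_list for every thread of `pid` ("self" is also accepted).
std::map<std::string, std::string> thread_affinities(const std::string& pid)
{
    namespace fs = std::filesystem;
    std::map<std::string, std::string> result;
    std::error_code ec;
    for (fs::directory_iterator it("/proc/" + pid + "/task", ec), end;
            !ec && it != end; it.increment(ec))
    {
        std::ifstream status(it->path() / "status");
        std::string line;
        std::string name;
        std::string cpus;
        while (std::getline(status, line))
        {
            const std::string n = value_after(line, "Name:");
            const std::string c = value_after(line, "Cpus_allowed_list:");
            if (!n.empty())
            {
                name = n;
            }
            if (!c.empty())
            {
                cpus = c;
            }
        }
        result[it->path().filename().string() + " (" + name + ")"] = cpus;
    }
    return result;
}
```

For the process above, thread_affinities("3211336") would show whether dds.asyn.0.0 and the other Fast DDS threads are restricted to CPU 0 or are merely scheduled there at the moment of sampling.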

Steps to reproduce

  1. Run the ecat_start script, which finds the configuration of the EtherCAT Master process.
  2. The script uses rosa run to start the EtherCAT Master.
  3. The EtherCAT Master starts and communicates with another node over topics and services. After 30 min to 3 h of running, the fault occurs.

Fast DDS version/commit

2.14.0

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

Shared Memory Transport (SHM)

Additional context

Linux motion 5.15.158-rt76 #3 SMP PREEMPT_RT Tue Jun 11 07:18:25 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 22.04

XML configuration file

No response

Relevant log output

No response

Network traffic capture

No response

@ClarkZaitun ClarkZaitun added the triage Issue pending classification label Sep 27, 2024