Security Observability with eBPF initial content #1

Merged: sharlns merged 1 commit into main from initial_content on Mar 24, 2022

Conversation

@sharlns (Contributor) commented Mar 24, 2022

No description provided.

@sharlns merged commit 4a959ad into main Mar 24, 2022
@sharlns deleted the initial_content branch March 24, 2022 22:53
kkourt added a commit that referenced this pull request Jul 27, 2022
We have been hitting an issue where the test continuously prints RCU stall
warnings. Set an option to panic when that happens.

[ 1109.837053] rcu: 	1-...!: (20984 ticks this GP) idle=a4a/1/0x4000000000000002 softirq=46388/46388 fqs=1
[ 1109.837053] 	(t=21001 jiffies g=140697 q=9)
[ 1109.837053] rcu: rcu_sched kthread starved for 20995 jiffies! g140697 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 1109.837053] rcu: RCU grace-period kthread stack dump:
[ 1109.837053] rcu_sched       R  running task    14904    11      2 0x90004000
[ 1109.837053] Call Trace:
[ 1109.837053]  __schedule+0x237/0x610
[ 1109.837053]  ? __mod_timer+0x19d/0x3c0
[ 1109.837053]  schedule+0x34/0xa0
[ 1109.837053]  schedule_timeout+0x84/0x150
[ 1109.837053]  ? __next_timer_interrupt+0xc0/0xc0
[ 1109.837053]  rcu_gp_kthread+0x4f4/0xd50
[ 1109.837053]  ? kfree_call_rcu+0x10/0x10
[ 1109.837053]  kthread+0x112/0x130
[ 1109.837053]  ? __kthread_bind_mask+0x60/0x60
[ 1109.837053]  ret_from_fork+0x35/0x40
[ 1109.837053] NMI backtrace for cpu 1
[ 1109.837053] CPU: 1 PID: 533 Comm: pkg.sensors.tra Not tainted 5.4.206 #1
[ 1109.837053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1109.837053] Call Trace:
[ 1109.837053]  <IRQ>
[ 1109.837053]  dump_stack+0x50/0x63
[ 1109.837053]  nmi_cpu_backtrace.cold+0x14/0x53
[ 1109.837053]  ? lapic_can_unplug_cpu+0x70/0x70
[ 1109.837053]  nmi_trigger_cpumask_backtrace+0x7c/0x90
[ 1109.837053]  rcu_dump_cpu_stacks+0x7c/0xaa
[ 1109.837053]  rcu_sched_clock_irq.cold+0x1b3/0x39e
[ 1109.837053]  update_process_times+0x56/0x90
[ 1109.837053]  tick_sched_handle+0x2f/0x40
[ 1109.837053]  tick_sched_timer+0x4c/0xb0
[ 1109.837053]  ? can_stop_idle_tick+0x90/0x90
[ 1109.837053]  __hrtimer_run_queues+0x123/0x2a0
[ 1109.837053]  hrtimer_interrupt+0x10b/0x2c0
[ 1109.837053]  smp_apic_timer_interrupt+0x61/0x130
[ 1109.837053]  apic_timer_interrupt+0xf/0x20
[ 1109.837053]  </IRQ>
[ 1109.837053] RIP: 0010:syscall_trace_enter+0x1f1/0x290
[ 1109.837053] Code: 01 00 48 c7 80 88 07 00 00 00 00 00 00 48 8b 10 83 e2 04 74 af f6 80 c9 06 00 00 01 74 a6 48 c7 c0 ff ff ff ff e9 25 ff ff ff <e9> 40 00 00 00 e9 ec fe ff ff 4c 8b 4b 58 48 8b 73 28 49 89 d0 4c
[ 1109.837053] RSP: 0018:ffffbb0f40247ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 1109.837053] RAX: 0000000010000000 RBX: ffffbb0f40247f58 RCX: 0000000000000000
[ 1109.837053] RDX: 0000000000000000 RSI: ffffbb0f40247f58 RDI: 00000000000000e4
[ 1109.837053] RBP: 00000000c000003e R08: 0000000000000000 R09: 0000000000000000
[ 1109.837053] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1109.837053] R13: 00000000000000e4 R14: 0000000010000000 R15: 0000000000000000
[ 1109.837053]  ? _copy_to_user+0x28/0x30
[ 1109.837053]  ? put_timespec64+0x35/0x60
[ 1109.837053]  do_syscall_64+0xc8/0x110
[ 1109.837053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1109.837053] RIP: 0033:0x7ffd2f7bb7ff

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
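
For readers who want to spot this condition in captured console output, here is a minimal sketch in Python. It is an illustration only, not part of the commit: the names `STARVED_RE` and `find_rcu_stalls` are hypothetical, and the pattern is written against the "kthread starved" line quoted in the log above, so other kernel versions may format the report differently. The commit message does not say which option it sets; one kernel knob that behaves as described is the `kernel.panic_on_rcu_stall` sysctl, but whether the commit uses it is an assumption.

```python
import re

# Hypothetical pattern, modeled on the "kthread starved" line in the log above.
STARVED_RE = re.compile(
    r"rcu: (?P<flavor>\w+) kthread starved for (?P<jiffies>\d+) jiffies!"
    r".*->cpu=(?P<cpu>\d+)"
)


def find_rcu_stalls(log_lines):
    """Return one dict per 'kthread starved' line found in log_lines."""
    stalls = []
    for line in log_lines:
        match = STARVED_RE.search(line)
        if match:
            stalls.append(
                {
                    "flavor": match.group("flavor"),
                    "jiffies": int(match.group("jiffies")),
                    "cpu": int(match.group("cpu")),
                }
            )
    return stalls
```

A test harness could call `find_rcu_stalls` on the guest console log and abort the run as soon as the result is non-empty, instead of letting the stall report repeat.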
kkourt added a commit that referenced this pull request Aug 4, 2022
kkourt added a commit that referenced this pull request Aug 10, 2022
We have been hitting an issue, where the test will continuously print
RCU stalls. Set an option to just panic if that happens

[ 1109.837053] rcu: 	1-...!: (20984 ticks this GP) idle=a4a/1/0x4000000000000002 softirq=46388/46388 fqs=1
[ 1109.837053] 	(t=21001 jiffies g=140697 q=9)
[ 1109.837053] rcu: rcu_sched kthread starved for 20995 jiffies! g140697 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 1109.837053] rcu: RCU grace-period kthread stack dump:
[ 1109.837053] rcu_sched       R  running task    14904    11      2 0x90004000
[ 1109.837053] Call Trace:
[ 1109.837053]  __schedule+0x237/0x610
[ 1109.837053]  ? __mod_timer+0x19d/0x3c0
[ 1109.837053]  schedule+0x34/0xa0
[ 1109.837053]  schedule_timeout+0x84/0x150
[ 1109.837053]  ? __next_timer_interrupt+0xc0/0xc0
[ 1109.837053]  rcu_gp_kthread+0x4f4/0xd50
[ 1109.837053]  ? kfree_call_rcu+0x10/0x10
[ 1109.837053]  kthread+0x112/0x130
[ 1109.837053]  ? __kthread_bind_mask+0x60/0x60
[ 1109.837053]  ret_from_fork+0x35/0x40
[ 1109.837053] NMI backtrace for cpu 1
[ 1109.837053] CPU: 1 PID: 533 Comm: pkg.sensors.tra Not tainted 5.4.206 #1
[ 1109.837053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1109.837053] Call Trace:
[ 1109.837053]  <IRQ>
[ 1109.837053]  dump_stack+0x50/0x63
[ 1109.837053]  nmi_cpu_backtrace.cold+0x14/0x53
[ 1109.837053]  ? lapic_can_unplug_cpu+0x70/0x70
[ 1109.837053]  nmi_trigger_cpumask_backtrace+0x7c/0x90
[ 1109.837053]  rcu_dump_cpu_stacks+0x7c/0xaa
[ 1109.837053]  rcu_sched_clock_irq.cold+0x1b3/0x39e
[ 1109.837053]  update_process_times+0x56/0x90
[ 1109.837053]  tick_sched_handle+0x2f/0x40
[ 1109.837053]  tick_sched_timer+0x4c/0xb0
[ 1109.837053]  ? can_stop_idle_tick+0x90/0x90
[ 1109.837053]  __hrtimer_run_queues+0x123/0x2a0
[ 1109.837053]  hrtimer_interrupt+0x10b/0x2c0
[ 1109.837053]  smp_apic_timer_interrupt+0x61/0x130
[ 1109.837053]  apic_timer_interrupt+0xf/0x20
[ 1109.837053]  </IRQ>
[ 1109.837053] RIP: 0010:syscall_trace_enter+0x1f1/0x290
[ 1109.837053] Code: 01 00 48 c7 80 88 07 00 00 00 00 00 00 48 8b 10 83 e2 04 74 af f6 80 c9 06 00 00 01 74 a6 48 c7 c0 ff ff ff ff e9 25 ff ff ff <e9> 40 00 00 00 e9 ec fe ff ff 4c 8b 4b 58 48 8b 73 28 49 89 d0 4c
[ 1109.837053] RSP: 0018:ffffbb0f40247ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 1109.837053] RAX: 0000000010000000 RBX: ffffbb0f40247f58 RCX: 0000000000000000
[ 1109.837053] RDX: 0000000000000000 RSI: ffffbb0f40247f58 RDI: 00000000000000e4
[ 1109.837053] RBP: 00000000c000003e R08: 0000000000000000 R09: 0000000000000000
[ 1109.837053] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1109.837053] R13: 00000000000000e4 R14: 0000000010000000 R15: 0000000000000000
[ 1109.837053]  ? _copy_to_user+0x28/0x30
[ 1109.837053]  ? put_timespec64+0x35/0x60
[ 1109.837053]  do_syscall_64+0xc8/0x110
[ 1109.837053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1109.837053] RIP: 0033:0x7ffd2f7bb7ff

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
kkourt added a commit that referenced this pull request Aug 10, 2022
We have been hitting an issue, where the test will continuously print
RCU stalls. Set an option to just panic if that happens

[ 1109.837053] rcu: 	1-...!: (20984 ticks this GP) idle=a4a/1/0x4000000000000002 softirq=46388/46388 fqs=1
[ 1109.837053] 	(t=21001 jiffies g=140697 q=9)
[ 1109.837053] rcu: rcu_sched kthread starved for 20995 jiffies! g140697 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 1109.837053] rcu: RCU grace-period kthread stack dump:
[ 1109.837053] rcu_sched       R  running task    14904    11      2 0x90004000
[ 1109.837053] Call Trace:
[ 1109.837053]  __schedule+0x237/0x610
[ 1109.837053]  ? __mod_timer+0x19d/0x3c0
[ 1109.837053]  schedule+0x34/0xa0
[ 1109.837053]  schedule_timeout+0x84/0x150
[ 1109.837053]  ? __next_timer_interrupt+0xc0/0xc0
[ 1109.837053]  rcu_gp_kthread+0x4f4/0xd50
[ 1109.837053]  ? kfree_call_rcu+0x10/0x10
[ 1109.837053]  kthread+0x112/0x130
[ 1109.837053]  ? __kthread_bind_mask+0x60/0x60
[ 1109.837053]  ret_from_fork+0x35/0x40
[ 1109.837053] NMI backtrace for cpu 1
[ 1109.837053] CPU: 1 PID: 533 Comm: pkg.sensors.tra Not tainted 5.4.206 #1
[ 1109.837053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1109.837053] Call Trace:
[ 1109.837053]  <IRQ>
[ 1109.837053]  dump_stack+0x50/0x63
[ 1109.837053]  nmi_cpu_backtrace.cold+0x14/0x53
[ 1109.837053]  ? lapic_can_unplug_cpu+0x70/0x70
[ 1109.837053]  nmi_trigger_cpumask_backtrace+0x7c/0x90
[ 1109.837053]  rcu_dump_cpu_stacks+0x7c/0xaa
[ 1109.837053]  rcu_sched_clock_irq.cold+0x1b3/0x39e
[ 1109.837053]  update_process_times+0x56/0x90
[ 1109.837053]  tick_sched_handle+0x2f/0x40
[ 1109.837053]  tick_sched_timer+0x4c/0xb0
[ 1109.837053]  ? can_stop_idle_tick+0x90/0x90
[ 1109.837053]  __hrtimer_run_queues+0x123/0x2a0
[ 1109.837053]  hrtimer_interrupt+0x10b/0x2c0
[ 1109.837053]  smp_apic_timer_interrupt+0x61/0x130
[ 1109.837053]  apic_timer_interrupt+0xf/0x20
[ 1109.837053]  </IRQ>
[ 1109.837053] RIP: 0010:syscall_trace_enter+0x1f1/0x290
[ 1109.837053] Code: 01 00 48 c7 80 88 07 00 00 00 00 00 00 48 8b 10 83 e2 04 74 af f6 80 c9 06 00 00 01 74 a6 48 c7 c0 ff ff ff ff e9 25 ff ff ff <e9> 40 00 00 00 e9 ec fe ff ff 4c 8b 4b 58 48 8b 73 28 49 89 d0 4c
[ 1109.837053] RSP: 0018:ffffbb0f40247ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 1109.837053] RAX: 0000000010000000 RBX: ffffbb0f40247f58 RCX: 0000000000000000
[ 1109.837053] RDX: 0000000000000000 RSI: ffffbb0f40247f58 RDI: 00000000000000e4
[ 1109.837053] RBP: 00000000c000003e R08: 0000000000000000 R09: 0000000000000000
[ 1109.837053] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1109.837053] R13: 00000000000000e4 R14: 0000000010000000 R15: 0000000000000000
[ 1109.837053]  ? _copy_to_user+0x28/0x30
[ 1109.837053]  ? put_timespec64+0x35/0x60
[ 1109.837053]  do_syscall_64+0xc8/0x110
[ 1109.837053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1109.837053] RIP: 0033:0x7ffd2f7bb7ff

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
kkourt added a commit that referenced this pull request Aug 10, 2022
We have been hitting an issue, where the test will continuously print
RCU stalls. Set an option to just panic if that happens

[ 1109.837053] rcu: 	1-...!: (20984 ticks this GP) idle=a4a/1/0x4000000000000002 softirq=46388/46388 fqs=1
[ 1109.837053] 	(t=21001 jiffies g=140697 q=9)
[ 1109.837053] rcu: rcu_sched kthread starved for 20995 jiffies! g140697 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 1109.837053] rcu: RCU grace-period kthread stack dump:
[ 1109.837053] rcu_sched       R  running task    14904    11      2 0x90004000
[ 1109.837053] Call Trace:
[ 1109.837053]  __schedule+0x237/0x610
[ 1109.837053]  ? __mod_timer+0x19d/0x3c0
[ 1109.837053]  schedule+0x34/0xa0
[ 1109.837053]  schedule_timeout+0x84/0x150
[ 1109.837053]  ? __next_timer_interrupt+0xc0/0xc0
[ 1109.837053]  rcu_gp_kthread+0x4f4/0xd50
[ 1109.837053]  ? kfree_call_rcu+0x10/0x10
[ 1109.837053]  kthread+0x112/0x130
[ 1109.837053]  ? __kthread_bind_mask+0x60/0x60
[ 1109.837053]  ret_from_fork+0x35/0x40
[ 1109.837053] NMI backtrace for cpu 1
[ 1109.837053] CPU: 1 PID: 533 Comm: pkg.sensors.tra Not tainted 5.4.206 #1
[ 1109.837053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1109.837053] Call Trace:
[ 1109.837053]  <IRQ>
[ 1109.837053]  dump_stack+0x50/0x63
[ 1109.837053]  nmi_cpu_backtrace.cold+0x14/0x53
[ 1109.837053]  ? lapic_can_unplug_cpu+0x70/0x70
[ 1109.837053]  nmi_trigger_cpumask_backtrace+0x7c/0x90
[ 1109.837053]  rcu_dump_cpu_stacks+0x7c/0xaa
[ 1109.837053]  rcu_sched_clock_irq.cold+0x1b3/0x39e
[ 1109.837053]  update_process_times+0x56/0x90
[ 1109.837053]  tick_sched_handle+0x2f/0x40
[ 1109.837053]  tick_sched_timer+0x4c/0xb0
[ 1109.837053]  ? can_stop_idle_tick+0x90/0x90
[ 1109.837053]  __hrtimer_run_queues+0x123/0x2a0
[ 1109.837053]  hrtimer_interrupt+0x10b/0x2c0
[ 1109.837053]  smp_apic_timer_interrupt+0x61/0x130
[ 1109.837053]  apic_timer_interrupt+0xf/0x20
[ 1109.837053]  </IRQ>
[ 1109.837053] RIP: 0010:syscall_trace_enter+0x1f1/0x290
[ 1109.837053] Code: 01 00 48 c7 80 88 07 00 00 00 00 00 00 48 8b 10 83 e2 04 74 af f6 80 c9 06 00 00 01 74 a6 48 c7 c0 ff ff ff ff e9 25 ff ff ff <e9> 40 00 00 00 e9 ec fe ff ff 4c 8b 4b 58 48 8b 73 28 49 89 d0 4c
[ 1109.837053] RSP: 0018:ffffbb0f40247ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 1109.837053] RAX: 0000000010000000 RBX: ffffbb0f40247f58 RCX: 0000000000000000
[ 1109.837053] RDX: 0000000000000000 RSI: ffffbb0f40247f58 RDI: 00000000000000e4
[ 1109.837053] RBP: 00000000c000003e R08: 0000000000000000 R09: 0000000000000000
[ 1109.837053] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1109.837053] R13: 00000000000000e4 R14: 0000000010000000 R15: 0000000000000000
[ 1109.837053]  ? _copy_to_user+0x28/0x30
[ 1109.837053]  ? put_timespec64+0x35/0x60
[ 1109.837053]  do_syscall_64+0xc8/0x110
[ 1109.837053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1109.837053] RIP: 0033:0x7ffd2f7bb7ff

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
kkourt added a commit that referenced this pull request Aug 10, 2022
kkourt added a commit that referenced this pull request Aug 12, 2022
We've been seeing RCU stalls such as the following when running qemu in GH:

Running test pkg.sensors.test.TestSensorLseekLoad .[  116.892213] rcu: INFO: rcu_sched self-detected stall on CPU
[  116.892213] rcu: 	0-...!: (20987 ticks this GP) idle=d3e/1/0x4000000000000002 softirq=23120/23120 fqs=0
[  116.892213] 	(t=21004 jiffies g=49257 q=8)
[  116.892213] rcu: rcu_sched kthread starved for 21004 jiffies! g49257 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[  116.892213] rcu: RCU grace-period kthread stack dump:
[  116.892213] rcu_sched       R  running task    14920    11      2 0x90004000
[  116.892213] Call Trace:
[  116.892213]  __schedule+0x288/0x600
[  116.892213]  ? __mod_timer+0x1a6/0x3c0
[  116.892213]  schedule+0x34/0xa0
[  116.892213]  schedule_timeout+0x84/0x140
[  116.892213]  ? __next_timer_interrupt+0xc0/0xc0
[  116.892213]  rcu_gp_kthread+0x4f6/0xd40
[  116.892213]  ? kfree_call_rcu+0x10/0x10
[  116.892213]  kthread+0x107/0x120
[  116.892213]  ? __kthread_bind_mask+0x60/0x60
[  116.892213]  ret_from_fork+0x35/0x40
[  116.892213] NMI backtrace for cpu 0
[  116.892213] CPU: 0 PID: 413 Comm: pkg.sensors.tes Not tainted 5.4.209 #1
[  116.892213] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  116.892213] Call Trace:
[  116.892213]  <IRQ>
[  116.892213]  dump_stack+0x50/0x63
[  116.892213]  nmi_cpu_backtrace.cold+0x13/0x50
[  116.892213]  ? lapic_can_unplug_cpu+0x60/0x60
[  116.892213]  nmi_trigger_cpumask_backtrace+0x7c/0x90
[  116.892213]  rcu_dump_cpu_stacks+0x7c/0xaa
[  116.892213]  rcu_sched_clock_irq.cold+0x1b3/0x39e
[  116.892213]  ? can_stop_idle_tick+0x70/0x70
[  116.892213]  update_process_times+0x56/0x90
[  116.892213]  tick_sched_handle+0x2f/0x40
[  116.892213]  tick_sched_timer+0x4b/0xb0
[  116.892213]  __hrtimer_run_queues+0x127/0x2a0
[  116.892213]  hrtimer_interrupt+0xf0/0x280
[  116.892213]  smp_apic_timer_interrupt+0x5d/0x120
[  116.892213]  apic_timer_interrupt+0xf/0x20
[  116.892213]  </IRQ>
... repeated until timeout ...

From reading https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt,
one of my theories is that writes to the console get delayed and the
kernel enters some kind of livelock. This patch buffers qemu output to
avoid hitting RCU stalls such as the one above.

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
kkourt added a commit that referenced this pull request Aug 12, 2022
kkourt added a commit that referenced this pull request Aug 12, 2022
Trung-DV added a commit to Trung-DV/tetragon that referenced this pull request May 26, 2024
Trung-DV added a commit to Trung-DV/tetragon that referenced this pull request May 26, 2024