implement shadow stacks #455

Freax13 · 2024-09-12T16:10:54Z

This PR implements shadow stacks.

Shadow stacks are enabled/disabled at compile time. AFAIK all processors supporting either AMD SEV-SNP or Intel TDX support shadow stacks, so no runtime checks are implemented. This also avoids the pitfall where the hypervisor lies about shadow stack availability to maliciously lower the security of the SVSM.

KVM intercepts accesses to the MSRs used for shadow stacks, so some host-side modifications are required to make this work.

I implemented a #CP exception handler to display diagnostic information when a shadow stack-related issue occurs. We might want to disable this exception handler for release builds and cause a triple fault instead. #CP exceptions are likely a sign of something fishy going on and we might want to terminate the guest just to be sure.
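For readers unfamiliar with #CP, here is a minimal sketch of how a handler might turn the error code into diagnostic text. The type and function names are hypothetical and not taken from this PR; the numeric codes follow the architectural control-protection error code values as I understand them (the reason lives in the low 15 bits).

```rust
/// Reason for a control-protection (#CP) exception, taken from the
/// low 15 bits of the error code. Names here are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CpReason {
    /// Near `ret` found a mismatching return address.
    NearRet,
    /// Far `ret` or `iret` found a mismatching return frame.
    FarRetIret,
    /// Missing `endbr` at an indirect branch target (IBT, not shadow stacks).
    EndBranch,
    /// `rstorssp` found an invalid restore token.
    RstorSsp,
    /// `setssbsy` found an invalid supervisor shadow stack token.
    SetSsBsy,
    /// Any value this sketch does not know about.
    Unknown(u16),
}

/// Decode the reason field of a #CP error code (hypothetical helper).
pub fn decode_cp_error_code(error_code: u64) -> CpReason {
    match (error_code & 0x7fff) as u16 {
        1 => CpReason::NearRet,
        2 => CpReason::FarRetIret,
        3 => CpReason::EndBranch,
        4 => CpReason::RstorSsp,
        5 => CpReason::SetSsBsy,
        other => CpReason::Unknown(other),
    }
}
```

A handler could log the decoded reason together with the saved instruction pointer before deciding whether to terminate the guest.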

Currently, shadow stacks are disabled by default for a few reasons:

- The KVM patches mentioned above are needed to make this work.
- We haven't implemented proper user syscalls (using `syscall` and `sysret`) yet, and I don't want to hinder any efforts in that direction by forcing a proper syscall implementation to support shadow stacks out of the box. We should probably get syscalls working without shadow stacks first; once that's merged, we can add shadow stack support and eventually enable shadow stacks by default.
- I haven't tested `shadow-stacks` together with `enable-gdb` yet, but I suspect this needs some additional work.

joergroedel

This is impressive, thanks for implementing this feature!

There are some open questions, one around the task-switch handling. I left a comment there. The other question is, whether there is a way to enable/disable this feature at boot-time, instead of compile time.

Finally, I found that it uses the old PageRef interface. Since the new one is merged now, this can be re-based on top of latest HEAD.

joergroedel · 2024-09-16T13:02:56Z

kernel/src/task/schedule.rs

+/// 1. Switch to the shadow stack at the fixed address using `rstorssp`.
+/// 2. Transfer the shadow stack restore token from the shadow stack at the
+///    fixed address to the previous shadow stack by executing `saveprevssp`.
+/// 3. Switch the page tables. This doesn't lead to problems with the shadow
+///    stack because it is mapped into both page tables.
+/// 4. Switch to the new shadow stack using `rstorssp`.
+/// 5. Transfer the shadow stack restore token from the new shadow stack back
+///    to the shadow stack at the fixed address by executing `saveprevssp`.
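The token hand-off in the steps above can be traced with a small software model. This is only a sketch under simplifying assumptions: `ShadowCpu` and `switch_shadow_stack` are hypothetical names, memory is a plain map, and the model ignores alignment fix-ups, busy bits, and the exact fault types of the real `rstorssp` and `saveprevssp` instructions.

```rust
use std::collections::HashMap;

/// Bit 0 of a token: the token was created in 64-bit mode.
const MODE_64BIT: u64 = 0b01;
/// Bit 1 of a token: this is a "previous SSP" token, not a restore token.
const PREV_SSP: u64 = 0b10;

/// Toy model of the shadow stack pointer and shadow stack memory.
pub struct ShadowCpu {
    pub ssp: u64,
    pub mem: HashMap<u64, u64>,
}

impl ShadowCpu {
    /// Model of `rstorssp`: validate the restore token at `token_addr`,
    /// replace it with a previous-SSP token, and switch SSP to it.
    pub fn rstorssp(&mut self, token_addr: u64) -> Result<(), &'static str> {
        let token = *self.mem.get(&token_addr).ok_or("no token at address")?;
        if token != ((token_addr + 8) | MODE_64BIT) {
            return Err("invalid restore token");
        }
        self.mem.insert(token_addr, self.ssp | MODE_64BIT | PREV_SSP);
        self.ssp = token_addr;
        Ok(())
    }

    /// Model of `saveprevssp`: pop the previous-SSP token and write a
    /// restore token just below the old SSP on the previous stack.
    pub fn saveprevssp(&mut self) -> Result<(), &'static str> {
        let token = *self.mem.get(&self.ssp).ok_or("nothing to pop")?;
        if (token & 0b11) != (MODE_64BIT | PREV_SSP) {
            return Err("not a previous-SSP token");
        }
        let old_ssp = token & !0b11;
        self.mem.insert(old_ssp - 8, old_ssp | MODE_64BIT);
        self.ssp += 8;
        Ok(())
    }
}

/// The five documented steps; the page table switch is a no-op here.
pub fn switch_shadow_stack(
    cpu: &mut ShadowCpu,
    fixed_token: u64,
    new_token: u64,
) -> Result<(), &'static str> {
    cpu.rstorssp(fixed_token)?; // 1. switch to the fixed shadow stack
    cpu.saveprevssp()?; // 2. leave a restore token on the old stack
    // 3. the page tables would be switched here
    cpu.rstorssp(new_token)?; // 4. switch to the new task's shadow stack
    cpu.saveprevssp() // 5. give the fixed stack its token back
}
```

In this model the old task ends up with a restore token directly below its saved SSP, and the fixed shadow stack holds a restore token again, so both can be switched to later.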


There is another issue around the task-switch code which needs fixing, and that might help for shadow stacks as well.

The problem currently is that the task-switch assembly is not safe against exceptions (like #HV and #NMI) that can occur at any time. The unsafe part comes from the fact that the code cannot switch page tables and stack pointers atomically. The solution for that problem is either using IST stacks for these exceptions or using a per-CPU task-switch stack.

The preferred solution is a per-cpu task switch stack, which would then also need a shadow stack. I think this would solve this problem as well.

Freax13 · 2024-09-18

I didn't consider that IRQs may come in during the context switch. I implemented a CPU-local dedicated stack and shadow stack as you suggested.

Freax13 · 2024-09-18T14:17:52Z

> There are some open questions, one around the task-switch handling. I left a comment there. The other question is, whether there is a way to enable/disable this feature at boot-time, instead of compile time.

I added a patch that tries to detect CET support at runtime by enabling CET in CR4 and catching any faults that might occur if it is not supported. This still doesn't technically detect shadow stack support because AFAICT CPUs could theoretically support only CET IBT (indirect branch tracking) and not CET SHSTK (shadow stacks), but in practice no such CPUs exist (and AFAICT TDX only allows turning all of CET on or off, not individual sub-parts).
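The probing idea can be sketched as follows. This is not the code from the patch: the `Cr4` trait, `Fault`, `try_enable_cet`, and `MockCr4` are hypothetical names, and the mock stands in for a real CR4 write whose #GP is caught by an exception fixup. The only architectural fact assumed is that CET is enabled through bit 23 of CR4.

```rust
/// CR4.CET (bit 23) is the master enable for CET.
pub const CR4_CET: u64 = 1 << 23;

/// Marker for "the register write faulted" (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub struct Fault;

/// Hypothetical abstraction over CR4 access. A real implementation
/// would write the register and recover from a #GP via a fixup.
pub trait Cr4 {
    fn read(&self) -> u64;
    fn try_write(&mut self, value: u64) -> Result<(), Fault>;
}

/// Try to set CR4.CET. Returns `true` if the write succeeded, which
/// is treated as "CET is available" in this sketch.
pub fn try_enable_cet(cr4: &mut impl Cr4) -> bool {
    let wanted = cr4.read() | CR4_CET;
    cr4.try_write(wanted).is_ok()
}

/// Mock CPU used to exercise the logic outside of a kernel.
pub struct MockCr4 {
    pub value: u64,
    pub supports_cet: bool,
}

impl Cr4 for MockCr4 {
    fn read(&self) -> u64 {
        self.value
    }

    fn try_write(&mut self, value: u64) -> Result<(), Fault> {
        // Setting an unsupported bit faults and leaves CR4 unchanged.
        if value & CR4_CET != 0 && !self.supports_cet {
            return Err(Fault);
        }
        self.value = value;
        Ok(())
    }
}
```

The appeal of this approach is that the result comes from the CPU's own behavior rather than from CPUID values the hypervisor could influence.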

> Finally, I found that it uses the old PageRef interface. Since the new one is merged now, this can be re-based on top of latest HEAD.

Done.

msft-jlange · 2024-09-20T21:56:00Z

It doesn't look like this change is compatible with the existing #HV handling code. For example, there is code in asm_entry_hv which specifically checks to see whether a recursive #HV has been delivered while executing an IRET sequence, and if so, it overwrites the old IRET data with the newly pushed IRET data. The shadow stack equivalent of this code must be written, but it is also inherently unsafe, because it requires popping values from the shadow stack and using the WRSS instruction to write them to a previous location on the shadow stack. Regardless of what approach we ultimately decide to take here, it is critical to support Restricted Injection correctly, and we can't allow shadow stack support to break it.

Freax13 force-pushed the feature/shadow-stack branch from 9ce23d9 to f1d358a · September 12, 2024 16:16

joergroedel requested changes Sep 16, 2024

Freax13 added 4 commits September 18, 2024 09:34

mm: implement VMKernelShadowStack

a29d1c9

The initialization and `pt_flags` are a bit special for shadow stack pages, so this warrants a new `VirtualMapping` implementation. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

percpu: allocate an initial shadow stack

56b214e

This shadow stack is used when not using a task's shadow stack. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

percpu: setup ISST

0fbbd41

The interrupt shadow stack table (ISST) is very similar to the interrupt stack table (IST) except that it contains shadow stack addresses instead of normal stack addresses. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
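As an illustration of the layout, here is a minimal sketch of such a table. The `Isst` type and its methods are hypothetical and not the ones added by this commit; the sketch assumes the architectural layout of eight 8-byte entries indexed like the IST (index 0 unused) and the table address being programmed through MSR 0x6A8.

```rust
/// MSR that holds the linear address of the interrupt SSP table.
pub const IA32_INTERRUPT_SSP_TABLE_ADDR: u32 = 0x6a8;

/// Errors reported by this sketch when filling the table.
#[derive(Debug, PartialEq, Eq)]
pub enum IsstError {
    /// IST indices are 1..=7; index 0 means "no IST" in an IDT entry.
    InvalidIndex,
    /// Shadow stack pointers must be 8-byte aligned.
    Unaligned,
}

/// Interrupt shadow stack table: one shadow stack pointer per IST slot.
#[repr(C)]
#[derive(Debug, Default)]
pub struct Isst {
    entries: [u64; 8],
}

impl Isst {
    /// Store the shadow stack pointer used for the given IST index.
    pub fn set(&mut self, ist_index: usize, ssp: u64) -> Result<(), IsstError> {
        if !(1..=7).contains(&ist_index) {
            return Err(IsstError::InvalidIndex);
        }
        if ssp % 8 != 0 {
            return Err(IsstError::Unaligned);
        }
        self.entries[ist_index] = ssp;
        Ok(())
    }

    /// Read back an entry, or `None` if the index is out of range.
    pub fn get(&self, ist_index: usize) -> Option<u64> {
        self.entries.get(ist_index).copied()
    }
}
```

Keeping the index space identical to the IST means an IDT entry's IST field selects both the regular stack and the matching shadow stack.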

task: allocate shadow stacks for each task

1bc3c0c

Each task needs a normal shadow stack and a shadow stack used for exception handling. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

Freax13 force-pushed the feature/shadow-stack branch from f1d358a to 8a88f3b · September 18, 2024 10:52

Freax13 added 8 commits September 18, 2024 13:48

idt: add shadow stack pointer to exception context

8d2db3f

Some exception handlers will need to update the shadow stack, so they need to know the shadow stack pointer at the time of the exception. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

idt: update return address on shadow stack

5a3d5fe

Whenever we update the return address on the stack, we'll also need to update the return address on the shadow stack. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

schedule: switch to special stack during context switches

ab39c92

We need to guard against IRQs coming in after switching to the new page tables and before switching to the new stack. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

schedule: switch shadow stacks in context switch

640249f

Each task has separate shadow stacks, so we need to switch them when switching tasks. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

svsm: enable shadow stack

d4e4487

This enables shadow stacks for the BSP. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

vmsa: enable shadow stacks

f189193

This enables shadow stacks on the secondary APs. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

idt: implement #CP handler

207847f

This exception handler will be executed when the CPU detects a mismatch between the return address on the stack and the return address on the shadow stack. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

shadow_stack: determine support at runtime

cc5c763

Trusted CPUID values are hard to come by, so let's just try to enable CET in CR4 and handle failure gracefully. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>

Freax13 force-pushed the feature/shadow-stack branch from 8a88f3b to cc5c763 · September 18, 2024 13:54
