Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the attestation report feature #69

Merged
merged 9 commits into from
Nov 20, 2023

Conversation

cclaudio
Copy link
Member

@cclaudio cclaudio commented Aug 2, 2023

This PR implements the attestation report feature discussed in the issue #67

make test will test the main functions used to encrypt and decrypt messages.

The branch attestation-report-test can be used to test the get_report() and get_ext_report() functions. It contains one extra patch that calls them at boot time and print some information in the log.

src/protocols/errors.rs Outdated Show resolved Hide resolved
src/protocols/errors.rs Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
src/mm/address_space.rs Outdated Show resolved Hide resolved
src/psp/guest_request_msg.rs Outdated Show resolved Hide resolved
src/sev/secrets_page.rs Show resolved Hide resolved
src/psp/guest_request_cmd.rs Outdated Show resolved Hide resolved
src/psp/msg_report.rs Outdated Show resolved Hide resolved
src/psp/request.rs Outdated Show resolved Hide resolved
src/psp/guest_request_msg.rs Outdated Show resolved Hide resolved
@cclaudio cclaudio force-pushed the attestation-report branch 2 times, most recently from 026947d to 009c8ab Compare August 6, 2023 23:45
@cclaudio
Copy link
Member Author

cclaudio commented Aug 6, 2023

Hi @stefano-garzarella and @00xc, thank you very much for the feedback. I applied the changes below and marked their threads as resolved.

  • WIP: Call get_report() and get_ext_report() for testing <------- This commit exists only in the test branch
    • Moved "use" statements to inside the testing function
  • psp: Add get_report() and get_ext_report()
  • psp: Add SnpGuestRequestCmd
    • fixups as suggested by "cargo clippy"
    • added structure and file comments pointing out the respective table in the spec as suggested by Stefano
  • psp: Add SnpGuestRequestMsg
    • fixups as suggested by "cargo clippy"
    • added structure and file comments pointing out the respective table in the spec as suggested by Stefano
    • set len in the "set_len" method as suggested by Carlos
  • sev/secrets_page: Export the VMPCK size
    • Adding this change as a separate commit as suggested by Stefano
  • sev/ghcb: Add guest_request() and guest_ext_request()
  • protocols/errors: Implement Display for SvsmReqError
    • Dropped this commit as suggested by Carlos
  • mm/address_space: Increase stack size for attestation report
  • Makefile: Build targets in verbose mode
    • It now supports "make V=1" and "make V=2" as suggested by Stefano

Copy link
Contributor

@stefano-garzarella stefano-garzarella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to study the details, so I may come back with new comments, but for now LGTM, I left just a minor comment.

Makefile Outdated Show resolved Hide resolved
stefano-garzarella added a commit to stefano-garzarella/svsm that referenced this pull request Aug 8, 2023
Some new features may require increasing the stack size:
coconut-svsm#69 (comment)

To avoid problems when the gdb stub is enabled, it is best to define
STACK_PAGES as a constant plus extra pages depending on the features
(e.g. STACK_PAGES_GDB).

This way we have a single point to modify for the base of STACK_PAGES.

No functional changes.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
@cclaudio
Copy link
Member Author

attestation-report and attestation-report-test branches updated (version 3):

  • [PATCH] psp: Add SnpGuestRequestMsg
    • Fixed one simple indentation reported by cargo fmt --check
  • Rebased to latest main

@cclaudio
Copy link
Member Author

attestation-report and attestation-report-test branches updated (version 4):

  • Updated the License of the new files to be MIT or Apache-2.0

src/mm/address_space.rs Outdated Show resolved Hide resolved
.cargo/config.toml Outdated Show resolved Hide resolved
src/psp/guest_request_msg.rs Outdated Show resolved Hide resolved
src/psp/guest_request_msg.rs Outdated Show resolved Hide resolved
@cclaudio
Copy link
Member Author

cclaudio commented Sep 3, 2023

Hi everyone,

I updated the branch and it's much better now.

  • Improved stack consumption. Before it was requiring a stack of at least 10 pages to run, now it runs with 5 pages. Basically, now the get_report() provides a buffer that is used for input and output. We read the request and write the response to the buffer without allocating memory in the stack.
  • Added a new patch to fix a VM boot hang that seems to be caused the cargo log version we are using
  • Updated the commit messages.
  • Updated comments/documentation in the source code.
  • I was suggesting to use the sev-host-set-cert-chain tool to load the SEV-SNP certificates, but it is now deprecated. I updated that to suggest the virtee/snphost.
  • I mentioned that I wanted to replace the RustCrypto encrypt/decrypt by the ones that encrypts/decrypts in place, but I decided not to do that because it requires Vec::from_raw_parts , which is highly unsafe at the moment.

@crobinso
Copy link
Contributor

* Added a new patch to fix a VM boot hang that seems to be caused the cargo log version we are using

FYI main branch should have that fixed with 7236a1a

src/greq/services.rs Outdated Show resolved Hide resolved
src/greq/services.rs Outdated Show resolved Hide resolved
src/greq/driver.rs Outdated Show resolved Hide resolved
@joergroedel
Copy link
Member

Except for the missing Drop trait implementations this LGTM.

src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/driver.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
@cclaudio
Copy link
Member Author

cclaudio commented Oct 12, 2023

Thanks @crobinso @daaltobe @roy-hopkins @deeglaze and @joergroedel for the feedback!

attestation-report and attestation-report-test branches updated (v5):

  • Updated commit messages
  • Perform crypto operations in a protected page (staging message)
  • Implemented drop trait for SnpGuestRequestMsg and SnpGuestRequestExtData
  • Introduced a crypto API for the SVSM kernel
  • Disable the VMPCK0 if the hypervisor returned SNP_GUEST_REQ_INVALID_LEN for a regular SNP_GUEST_REQUEST
  • Documented the greq and crypto crates using rustdoc

Copy link
Member

@joergroedel joergroedel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two comments.

src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
src/crypto/rustcrypto.rs Outdated Show resolved Hide resolved
src/greq/services.rs Outdated Show resolved Hide resolved
$ make V=1
or
$ make V=2

These can be used to easily build targets in verbose mode. That can be
helpful for debugging. Currently we support V=1 or V=2 (the biggest is
the most verbose).

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Both functions are used to send SNP_GUEST_REQUEST messages to the PSP, but the
guest_ext_request() includes an extended request to the hypervisor. More
information can be found in the AMD GHCB specification.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Export the VMPCK size to be used in other crates.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Add a generic interface for AES-256 GCM encryption and decryption. They are
both required for requesting an attestation report.

With this interface we should be able to keep the crypto code isolated in
crates and also easily choose which crypto implementation should be
compiled-in.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
@cclaudio
Copy link
Member Author

Thanks @Freax13 and @joergroedel for the feedback. In the last branch update (v6 - from afd7c4b to 30c556e), I addressed all the changes you suggested.

  • Replaced all VirtAddr parameters by a slice in almost all patches
  • Replaced the VirtAddr fields in the driver by Option<&mut ...>. This sounds more rust idiomatic and with that the rust compiler can help us detecting syntax/semantic issues with the fields.
  • Moved driver related code from services.rs and msg.rs to driver.rs, including the static driver instance
  • Refactored the static driver instance to be static GREQ_DRIVER: SpinLock<OnceCell<SnpGuestRequestDriver>> = SpinLock::new(OnceCell::new());
  • Updated the code to only free pages if they are encrypted
  • Updated the documentation

The test branch was also updated accordingly.

src/crypto/rustcrypto.rs Outdated Show resolved Hide resolved
src/greq/driver.rs Outdated Show resolved Hide resolved
// The sequence number is restored only when the guest is rebooted.
let Some(msg_seqno) = self.seqno_last_used().checked_add(1) else {
log::error!("SNP_GUEST_REQUEST: sequence number overflow");
disable_vmpck0();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call disable_vmpck0 for a bunch of error cases. Have you considered additionally calling disable_vmpck0 in the SVSM's panic handler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we call disable_vmpck0 for multiple error cases, they are very specific and they not necessarily cause panic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm not saying that they cause a panic. I'm just saying that if some other code in the SVSM panicked (e.g. because an attacker is trying to exploit the SVSM), the SVSM's state might be messed up, so we might also also consider erasing the keys to prevent exploits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm not saying that they cause a panic. I'm just saying that if some other code in the SVSM panicked (e.g. because an attacker is trying to exploit the SVSM), the SVSM's state might be messed up, so we might also also consider erasing the keys to prevent exploits.

Oh, got it! I will call disable_vmpck0 from the panic handler too. Thanks

src/greq/driver.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/pld_report.rs Outdated Show resolved Hide resolved
src/greq/pld_report.rs Outdated Show resolved Hide resolved
src/greq/pld_report.rs Outdated Show resolved Hide resolved
Add a RustCrypto-based implementation for the SVSM Aes256GCM trait.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
@cclaudio cclaudio force-pushed the attestation-report branch 2 times, most recently from ef7a7ff to 817c317 Compare November 6, 2023 05:16
@cclaudio
Copy link
Member Author

cclaudio commented Nov 6, 2023

Thanks @Freax13 for the feedback. All changes applied.

  • Added packed to the structs in pld_report.rs
  • Refactored the msg.rs and driver.rs to get a slice range using get() and get_mut()
  • Moved functions that handle the C-bit to mm/pagetable, as a separate patch
  • Refactored SnpGuestRequestDriver struct fields to be Box<STRUCT> rather than just &mut STRUCT. To support that, I added the methods .boxed_new(), .set_encrypted() and .set_shared() to the SnpGuestRequestMsg and SnpGuestRequestExtData structs. The boxed_new() allocates the object directly in the heap without going through the stack.

src/greq/msg.rs Outdated Show resolved Hide resolved
@osteffenrh
Copy link
Contributor

Hi @cclaudio,

the current state of this PR seems to end up in a deadlock for me,
trying to acquire the ROOT_MEM lock.
I am using your attestation-report-test branch.
It gets stuck in test_get_regular_report().

GDB session:

[...]
#1  0xffffff80000066ce in svsm::svsm_main () at src/svsm.rs:479
479	    debug_break();   // added by me.
(gdb) s
70	    }
(gdb) s
svsm::svsm_main () at src/svsm.rs:480
480	    test_get_regular_report();
(gdb) s
svsm::greq::services::test_get_regular_report () at src/greq/services.rs:31
31	    let vaddr = allocate_zeroed_page().unwrap();
(gdb) s
svsm::mm::alloc::allocate_zeroed_page () at src/mm/alloc.rs:772
772	    ROOT_MEM.lock().allocate_zeroed_page()
(gdb) s
svsm::locking::spinlock::SpinLock<svsm::mm::alloc::MemoryRegion>::lock<svsm::mm::alloc::MemoryRegion>
    (self=0xffffff800010cd60 <svsm::mm::alloc::ROOT_MEM::h3d8003dc93a1d738>)
    at src/locking/spinlock.rs:58
58	        let ticket = self.current.fetch_add(1, Ordering::Relaxed);
(gdb) s

 < guest and gdb session hang >

@cclaudio
Copy link
Member Author

cclaudio commented Nov 6, 2023

Hi @cclaudio,

the current state of this PR seems to end up in a deadlock for me, trying to acquire the ROOT_MEM lock. I am using your attestation-report-test branch. It gets stuck in test_get_regular_report().

GDB session:

[...]
#1  0xffffff80000066ce in svsm::svsm_main () at src/svsm.rs:479
479	    debug_break();   // added by me.
(gdb) s
70	    }
(gdb) s
svsm::svsm_main () at src/svsm.rs:480
480	    test_get_regular_report();
(gdb) s
svsm::greq::services::test_get_regular_report () at src/greq/services.rs:31
31	    let vaddr = allocate_zeroed_page().unwrap();
(gdb) s
svsm::mm::alloc::allocate_zeroed_page () at src/mm/alloc.rs:772
772	    ROOT_MEM.lock().allocate_zeroed_page()
(gdb) s
svsm::locking::spinlock::SpinLock<svsm::mm::alloc::MemoryRegion>::lock<svsm::mm::alloc::MemoryRegion>
    (self=0xffffff800010cd60 <svsm::mm::alloc::ROOT_MEM::h3d8003dc93a1d738>)
    at src/locking/spinlock.rs:58
58	        let ticket = self.current.fetch_add(1, Ordering::Relaxed);
(gdb) s

 < guest and gdb session hang >

Hi @osteffenrh, thanks for the feedback. The PR seems to be working for me with and without enable-gdb.

  • Does it hang for you on every boot?
  • How are you connecting gdb to the guest? I can't see that [SVSM] * Waiting for connection from GDB * message.
  • Are you running the latest svsm kernel branch in the guest kernel? It includes some patches from Tom Lendacky that could help with that.

@roy-hopkins
Copy link
Collaborator

Hi @osteffenrh, I suspect the hang on the lock is due to the fact that the GDB stub borrows the same functions as the code it is debugging, often resulting in deadlocks if you try to single step over lock code. You can try setting breakpoints instead after the lock has been taken. That normally works.

Having said that, I tried running the latest code from this PR this morning, albeit merged into my own development branch. I found that it was hanging on the call to test_get_regular_report() too. The problem was traced to SnpGuestRequestMsg::set_shared() which clears the C bit from the message buffer but was not informing the hypervisor that the encryption state has changed. I'll add a review comment with a suggested fix.

@Freax13
Copy link
Contributor

Freax13 commented Nov 6, 2023

The problem was traced to SnpGuestRequestMsg::set_shared() which clears the C bit from the message buffer but was not informing the hypervisor that the encryption state has changed. I'll add a review comment with a suggested fix.

Informing the hypervisor about the encryption state change also seems to fix #69 (comment) for me.

src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/driver.rs Show resolved Hide resolved
// The sequence number is restored only when the guest is rebooted.
let Some(msg_seqno) = self.seqno_last_used().checked_add(1) else {
log::error!("SNP_GUEST_REQUEST: sequence number overflow");
disable_vmpck0();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm not saying that they cause a panic. I'm just saying that if some other code in the SVSM panicked (e.g. because an attacker is trying to exploit the SVSM), the SVSM's state might be messed up, so we might also also consider erasing the keys to prevent exploits.

src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
src/greq/msg.rs Outdated Show resolved Hide resolved
These structures are used in the SNP_GUEST_REQUEST communication between the
guest and the PSP; their implementation follow the AMD SEV-SNP specification,
chapter 7.

The SnpGuestRequestMsg is used to carry a SNP_GUEST_REQUEST command or response
in the payload, which is encrypted using AES-256 GCM. This message can't be
tampered with by the hypervisor because only the PSP and the guest have access
to the key to decrypt the payload.

An extended SNP_GUEST_REQUEST command also requests data from the hypervisor;
in this case, the SnpGuestRequestExtData is also provided. The hypervisor will
use it to store the requested data.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Add a driver to send SNP_GUEST_REQUEST commands to the PSP. The command can be
any of the request or response command types defined in the SEV-SNP spec,
regardless if it's a regular or an extended command.

The send_regular_guest_request() and send_extended_guest_request() functions
can be used to send regular and extended commands, respectively.

guest_request_driver_init() is used to initialize the static driver instance.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
The panic handler is called when the SVSM state is not reliable any more.
Disable the VMPCK0 key to prevent it from being exploited.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Add get_regular_report() and get_extended_report(). They both call the
SNP_GUEST_REQUEST driver to request a VMPL0 attestation report, the difference
is that get_extended_report() also requests the SEV-SNP certificates needed to
verify the attestation report.

The get_extended_report() function will return an empty buffer if the SEV-SNP
certificates where not imported yet, but they can be imported from the host
using the github virtee/snphost project:

$ snphost import <PEM-files-directory>

For testing purposes, if you import PEM files that contain some random data,
you should be able to see the same random data when you call
get_extended_report() from the SVSM.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
@cclaudio
Copy link
Member Author

cclaudio commented Nov 8, 2023

Thanks @osteffenrh @roy-hopkins @Freax13 @tlendacky for the feedback.

Changes included in this branch update:

  • We were leaking the shared memory in the drop of SnpGuestRequestMsg and SnpGuestRequestExtData. I moved that code to the drop of the Driver so that we can leak the Box rather than the self of the aforementioned structs.
  • Using the Copy trait for copying the SnpGuestRequestMsg and SnpGuestRequestMsgHdr. Removed the copy_from()
  • Informing the hypervisor when we change the encrypted state of pages.
    • Calling page_state_change() directly in the set_encrypted_4k() and set_shared_4k() functions crashes the guest "error: kvm run failed Invalid Argument". It looks like that these functions are called in multiple places and one or more of them are not happy with the page_state_change(). So, for now, I think it is better to call page_state_change() directly in the driver code.
  • Calling alloc_zeroed() to allocate memory for the boxes. The PSP is no longer returning the error 22 now that we are informing the hypervisor when we change the encrypted state of pages
  • Disable the VMPCK0 in the panic handler. New patch.

@joergroedel joergroedel merged commit d1fb87d into coconut-svsm:main Nov 20, 2023
2 checks passed
@joergroedel
Copy link
Member

Merged manually, thanks. Any pending feedback can be addressed on-top.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.