Fix segfault with custom page sizes on aarch64 #8918

Merged (1 commit) on Jul 8, 2024

alexcrichton
Member

This commit fixes an issue with static memory initialization and custom page sizes interacting together on aarch64 Linux. (is that specific enough?)

When static memory initialization is enabled, the chunks of data used to initialize a linear memory are created in host-page-size increments. This is done to enable page-mapping via copy-on-write if so configured. With the custom page sizes proposal, however, it's possible for the first time for a linear memory to be smaller than this chunk. This means that a virtual memory allocation of a single host page can be made which is smaller than the initialization chunk.
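As a rough illustration of the host-page-size rounding described above, the sketch below widens a data segment's byte range out to host-page boundaries. The helper name `page_aligned_chunk` and its signature are assumptions made for this example and are not taken from Wasmtime's source.

```rust
/// Hypothetical helper (not Wasmtime's API): widen the byte range
/// `[offset, offset + len)` so that it starts and ends on host-page
/// boundaries. Returns the aligned `(start, length)` of the chunk.
fn page_aligned_chunk(offset: u64, len: u64, host_page_size: u64) -> (u64, u64) {
    // Page sizes are powers of two, which lets us align with a mask.
    assert!(host_page_size.is_power_of_two());
    let mask = host_page_size - 1;
    // Round the start down and the end up to the nearest page boundary.
    let start = offset & !mask;
    let end = (offset + len + mask) & !mask;
    (start, end - start)
}
```

With an assumed 64k host page, even a 16-byte segment at offset 100 becomes a full 64k chunk, which is larger than a linear memory that only spans a few bytes.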

This currently only happens on aarch64 Linux, where we coarsely approximate the host page size as 64k even though many hosts run with 4k pages. As a result a 64k initializer is created but the host only allocates 4k for a linear memory, and memory initialization can crash when the 64k initializer is copied into the 4k memory.
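The mismatch can be made concrete with a little arithmetic. The sketch below assumes a linear memory that is only 1 byte long (an illustrative size chosen for this example) and uses hypothetical helpers `round_up` and `overrun`, neither of which comes from Wasmtime.

```rust
/// Round `bytes` up to a multiple of `page`, which must be a power of two.
fn round_up(bytes: usize, page: usize) -> usize {
    assert!(page.is_power_of_two());
    (bytes + page - 1) & !(page - 1)
}

/// Number of bytes an initializer of `init_len` bytes would write past the
/// end of an allocation of `alloc_len` bytes (zero if the copy fits).
fn overrun(init_len: usize, alloc_len: usize) -> usize {
    init_len.saturating_sub(alloc_len)
}
```

For a 1-byte memory, `round_up(1, 64 * 1024)` gives a 65536-byte initializer while `round_up(1, 4096)` gives a 4096-byte allocation, so `overrun` reports 61440 bytes written past the end of the allocation.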

This was not caught via fuzzing because fuzzing only runs on x86_64. It was not caught via CI because guard pages are disabled entirely under QEMU on CI, and we got lucky in that a number of virtual memory allocations were all placed next to each other, meaning that this copy was probably corrupting some other memory rather than faulting. Locally this was found by bjorn3 running the tests on main as-is on AArch64 Linux.

This commit implements a few safeguards and a fix for this issue:

  • On CI with QEMU, modestly-sized guard pages are now enabled to catch this sooner in development should it happen again in the future.
  • An assert! is added during memory initialization to check that the memory copy is indeed valid. This causes the tests to fail as-is on main even on x86_64.
  • The issue itself is fixed by bailing out of static memory initialization should the host page size exceed the wasm page size, which can now happen on aarch64 Linux with smaller page sizes.
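The last two safeguards above can be sketched roughly as follows. The function names `can_use_static_init` and `init_memory` are hypothetical and the code is a simplified model written for this example, not the change made in this PR.

```rust
/// Hypothetical model of the bail-out: only build a page-aligned static
/// initialization image when the host page size does not exceed the wasm
/// page size.
fn can_use_static_init(host_page_size: u64, wasm_page_size: u64) -> bool {
    host_page_size <= wasm_page_size
}

/// Hypothetical model of the new assertion: verify that the initializer
/// fits inside the linear memory before copying it into place.
fn init_memory(memory: &mut [u8], offset: usize, image: &[u8]) {
    let end = offset
        .checked_add(image.len())
        .expect("initializer range overflows usize");
    assert!(end <= memory.len(), "initializer does not fit in linear memory");
    memory[offset..end].copy_from_slice(image);
}
```

In this model a 64k host page paired with a 1-byte wasm page is rejected by `can_use_static_init`, while a 4k host page paired with the default 64k wasm page is accepted. Safe slice indexing would already panic on an out-of-bounds range here; an explicit check matters most when the copy goes through raw pointers, which is assumed to be the case in the real runtime.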

alexcrichton requested a review from a team as a code owner July 8, 2024 20:21
alexcrichton requested review from fitzgen and removed request for a team July 8, 2024 20:21
Member

fitzgen left a comment

Nice! Thank you!

fitzgen added this pull request to the merge queue Jul 8, 2024
Merged via the queue into bytecodealliance:main with commit c2717d1 Jul 8, 2024
37 checks passed
alexcrichton deleted the fix-aarch64 branch July 8, 2024 21:34
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jul 8, 2024
fitzgen pushed a commit that referenced this pull request Jul 8, 2024