Make wasmtime::WasmCoreDump serializable (#7078)
This commit makes it so that the library type for core dumps is serializable
into the standard binary format for core dumps.

Additionally, this commit makes it so that we use the library type for
generating core dumps in the CLI. We previously were using a one-off
implementation of core dump generation that only had backtrace information and
no instances, modules, globals, or memories included. The library type has all
that information, so the core dumps produced by our CLI will both be more
featureful and be generated by shared code paths going forward.

Along the way, implementing all this required some new helper methods sprinkled
throughout `wasmtime` and `wasmtime-runtime`:

* `wasmtime::Instance::module`: get the module that a `wasmtime::Instance` is an
  instance of. This is public, since it seems generally useful. This involved
  adding a new return value from `ModuleRegistry::register_module` that is an
  identifier that can be used to recover a reference to the registered module.

* `wasmtime::Instance::all_{globals,memories}`: get the full global/memory index
  space. I made these `pub(crate)` out of caution. I don't think we want to commit
  to exposing non-exported things in the public API, even if we internally need
  them for debugging-related features like core dumps. These also needed
  corresponding methods inside `wasmtime-runtime`.

* `wasmtime::{Global,Memory}::hash_key`: this was needed to work around the fact
  that each time you call `{Global,Memory}::from_wasmtime`, it creates a new
  entry in the `StoreData` and so you can get duplicates. But we need to key some
  hash maps on globals and memories when constructing core dumps, so we can't
  treat the underlying `Stored<T>` as a hash key because it isn't stable across
  duplicate `StoreData` entries. So we have these new methods. They are only
  `pub(crate)`, are definitely implementation details, and aren't exposed in the
  public API.

* `wasmtime::FrameInfo::module`: Each frame in a backtrace now keeps a handle to
  its associated module instead of just the name. This is publicly exposed
  because it seems generally useful. This means I also deprecated
  `wasmtime::FrameInfo::module_name` since you can now instead do
  `frame.module().name()` to get that exact same info. I updated callers inside
  the repo.
fitzgen authored Sep 25, 2023
1 parent 6c438d4 commit 6a7ef27
Showing 16 changed files with 587 additions and 116 deletions.
7 changes: 7 additions & 0 deletions RELEASES.md
@@ -6,8 +6,15 @@ Unreleased.

### Added

* Added the `wasmtime::FrameInfo::module` method, which returns the
`wasmtime::Module` associated with the stack frame.

### Changed

* The `wasmtime::FrameInfo::module_name` has been removed, however you can now
get identical results by chaining `wasmtime::FrameInfo::module` and
`wasmtime::Module::name`: `my_frame.module().name()`.

--------------------------------------------------------------------------------

## 13.0.0
3 changes: 2 additions & 1 deletion crates/c-api/src/trap.rs
@@ -167,7 +167,8 @@ pub extern "C" fn wasmtime_frame_module_name<'a>(
         .module_name
         .get_or_init(|| {
             frame.trace.frames()[frame.idx]
-                .module_name()
+                .module()
+                .name()
                 .map(|s| wasm_name_t::from(s.to_string().into_bytes()))
         })
         .as_ref()
3 changes: 3 additions & 0 deletions crates/cli-flags/src/lib.rs
@@ -350,6 +350,9 @@ impl CommonOptions {
if let Some(enable) = self.debug.debug_info {
config.debug_info(enable);
}
if self.debug.coredump.is_some() {
config.coredump_on_trap(true);
}
if let Some(level) = self.opts.opt_level {
config.cranelift_opt_level(level);
}
63 changes: 57 additions & 6 deletions crates/runtime/src/instance.rs
@@ -384,6 +384,28 @@ impl Instance {
}
}

/// Get all globals within this instance.
///
/// Returns both import and defined globals.
///
/// Returns both exported and non-exported globals.
///
/// Gives access to the full globals space.
pub fn all_globals<'a>(
&'a mut self,
) -> impl ExactSizeIterator<Item = (GlobalIndex, ExportGlobal)> + 'a {
let module = self.module().clone();
module.globals.keys().map(move |idx| {
(
idx,
ExportGlobal {
definition: self.defined_or_imported_global_ptr(idx),
global: self.module().globals[idx],
},
)
})
}

/// Get the globals defined in this instance (not imported).
pub fn defined_globals<'a>(
&'a mut self,
@@ -1350,14 +1372,43 @@ impl InstanceHandle {
self.instance_mut().get_table_with_lazy_init(index, range)
}

-    /// Return the memories defined in this instance (not imported).
-    pub fn defined_memories<'a>(&'a mut self) -> impl ExactSizeIterator<Item = ExportMemory> + 'a {
-        let idxs = (0..self.module().memory_plans.len())
-            .skip(self.module().num_imported_memories)
+    /// Get all memories within this instance.
+    ///
+    /// Returns both import and defined memories.
+    ///
+    /// Returns both exported and non-exported memories.
+    ///
+    /// Gives access to the full memories space.
+    pub fn all_memories<'a>(
+        &'a mut self,
+    ) -> impl ExactSizeIterator<Item = (MemoryIndex, ExportMemory)> + 'a {
+        let indices = (0..self.module().memory_plans.len())
             .map(|i| MemoryIndex::new(i))
             .collect::<Vec<_>>();
-        idxs.into_iter()
-            .map(|memidx| self.get_exported_memory(memidx))
+        indices
+            .into_iter()
+            .map(|i| (i, self.get_exported_memory(i)))
     }

/// Return the memories defined in this instance (not imported).
pub fn defined_memories<'a>(&'a mut self) -> impl ExactSizeIterator<Item = ExportMemory> + 'a {
let num_imported = self.module().num_imported_memories;
self.all_memories()
.skip(num_imported)
.map(|(_i, memory)| memory)
}

/// Get all globals within this instance.
///
/// Returns both import and defined globals.
///
/// Returns both exported and non-exported globals.
///
/// Gives access to the full globals space.
pub fn all_globals<'a>(
&'a mut self,
) -> impl ExactSizeIterator<Item = (GlobalIndex, ExportGlobal)> + 'a {
self.instance_mut().all_globals()
}

/// Get the globals defined in this instance (not imported).
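The relationship between `all_memories` and `defined_memories` in the hunk above relies on imports occupying the lowest indices of a Wasm index space. The following is a minimal standalone sketch of that idea, not the `wasmtime-runtime` code: `ToyModule`, its fields, and the string "memories" are hypothetical stand-ins chosen for illustration.

```rust
/// Hypothetical stand-in for a typed memory index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct MemoryIndex(u32);

/// Hypothetical module: imported memories come first, defined ones follow.
struct ToyModule {
    memories: Vec<&'static str>,
    num_imported_memories: usize,
}

impl ToyModule {
    /// Walk the full index space, imported and defined alike.
    fn all_memories(&self) -> impl ExactSizeIterator<Item = (MemoryIndex, &'static str)> + '_ {
        self.memories
            .iter()
            .enumerate()
            .map(|(i, name)| (MemoryIndex(i as u32), *name))
    }

    /// Defined memories are whatever remains after skipping the imports.
    fn defined_memories(&self) -> impl ExactSizeIterator<Item = &'static str> + '_ {
        self.all_memories()
            .skip(self.num_imported_memories)
            .map(|(_i, memory)| memory)
    }
}

fn main() {
    let module = ToyModule {
        memories: vec!["imported0", "defined0", "defined1"],
        num_imported_memories: 1,
    };
    assert_eq!(module.all_memories().len(), 3);
    assert_eq!(module.defined_memories().len(), 2);
    let defined: Vec<_> = module.defined_memories().collect();
    assert_eq!(defined, vec!["defined0", "defined1"]);
}
```

Building the "defined" view on top of the "all" view keeps a single place that knows how indices map to entries, which appears to be the motivation for the refactoring in the diff.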
198 changes: 196 additions & 2 deletions crates/wasmtime/src/coredump.rs
@@ -1,6 +1,9 @@
-use std::fmt;
+use std::{collections::HashMap, fmt};

-use crate::{store::StoreOpaque, FrameInfo, Global, Instance, Memory, Module, WasmBacktrace};
+use crate::{
+    store::StoreOpaque, AsContextMut, FrameInfo, Global, Instance, Memory, Module, StoreContextMut,
+    Val, ValType, WasmBacktrace,
+};

/// Representation of a core dump of a WebAssembly module
///
@@ -80,6 +83,197 @@ impl WasmCoreDump {
pub fn memories(&self) -> &[Memory] {
self.memories.as_ref()
}

/// Serialize this core dump into [the standard core dump binary
/// format][spec].
///
/// The `name` parameter may be a file path, URL, or arbitrary name for the
/// "main" Wasm service or executable that was running in this store.
///
/// Once serialized, you can write this core dump to disk, send it over the
/// network, or pass it to other debugging tools that consume Wasm core
/// dumps.
///
/// [spec]: https://github.com/WebAssembly/tool-conventions/blob/main/Coredump.md
pub fn serialize(&self, mut store: impl AsContextMut, name: &str) -> Vec<u8> {
let store = store.as_context_mut();
self._serialize(store, name)
}

fn _serialize<T>(&self, mut store: StoreContextMut<'_, T>, name: &str) -> Vec<u8> {
let mut core_dump = wasm_encoder::Module::new();

core_dump.section(&wasm_encoder::CoreDumpSection::new(name));

// A map from each memory to its index in the core dump's memories
// section.
let mut memory_to_idx = HashMap::new();

let mut data = wasm_encoder::DataSection::new();

{
let mut memories = wasm_encoder::MemorySection::new();
for mem in self.memories() {
let memory_idx = memories.len();
memory_to_idx.insert(mem.hash_key(&store.0), memory_idx);
let ty = mem.ty(&store);
memories.memory(wasm_encoder::MemoryType {
minimum: mem.size(&store),
maximum: ty.maximum(),
memory64: ty.is_64(),
shared: ty.is_shared(),
});

// Attach the memory data, balancing number of data segments and
// binary size. We don't want to attach the whole memory in one
// big segment, since it likely contains a bunch of large runs
// of zeroes. But we can't encode the data without any potential
// runs of zeroes (i.e. including only non-zero data in our
// segments) because we can run up against the implementation
// limits for number of segments in a Wasm module this way. So
// to balance these conflicting desires, we break the memory up
// into reasonably-sized chunks and then trim runs of zeroes
// from the start and end of each chunk.
const CHUNK_SIZE: u32 = 4096;
for (i, chunk) in mem
.data(&store)
.chunks_exact(CHUNK_SIZE as usize)
.enumerate()
{
if let Some(start) = chunk.iter().position(|byte| *byte != 0) {
let end = chunk.iter().rposition(|byte| *byte != 0).unwrap() + 1;
let offset = (i as u32) * CHUNK_SIZE + (start as u32);
let offset = wasm_encoder::ConstExpr::i32_const(offset as i32);
data.active(memory_idx, &offset, chunk[start..end].iter().copied());
}
}
}
core_dump.section(&memories);
}

// A map from each global to its index in the core dump's globals
// section.
let mut global_to_idx = HashMap::new();

{
let mut globals = wasm_encoder::GlobalSection::new();
for g in self.globals() {
global_to_idx.insert(g.hash_key(&store.0), globals.len());
let ty = g.ty(&store);
let mutable = matches!(ty.mutability(), crate::Mutability::Var);
let val_type = match ty.content() {
ValType::I32 => wasm_encoder::ValType::I32,
ValType::I64 => wasm_encoder::ValType::I64,
ValType::F32 => wasm_encoder::ValType::F32,
ValType::F64 => wasm_encoder::ValType::F64,
ValType::V128 => wasm_encoder::ValType::V128,
ValType::FuncRef => wasm_encoder::ValType::FUNCREF,
ValType::ExternRef => wasm_encoder::ValType::EXTERNREF,
};
let init = match g.get(&mut store) {
Val::I32(x) => wasm_encoder::ConstExpr::i32_const(x),
Val::I64(x) => wasm_encoder::ConstExpr::i64_const(x),
Val::F32(x) => {
wasm_encoder::ConstExpr::f32_const(unsafe { std::mem::transmute(x) })
}
Val::F64(x) => {
wasm_encoder::ConstExpr::f64_const(unsafe { std::mem::transmute(x) })
}
Val::V128(x) => wasm_encoder::ConstExpr::v128_const(x.as_u128() as i128),
Val::FuncRef(_) => {
wasm_encoder::ConstExpr::ref_null(wasm_encoder::HeapType::Func)
}
Val::ExternRef(_) => {
wasm_encoder::ConstExpr::ref_null(wasm_encoder::HeapType::Extern)
}
};
globals.global(wasm_encoder::GlobalType { val_type, mutable }, &init);
}
core_dump.section(&globals);
}

core_dump.section(&data);
drop(data);

// A map from module id to its index within the core dump's modules
// section.
let mut module_to_index = HashMap::new();

{
let mut modules = wasm_encoder::CoreDumpModulesSection::new();
for module in self.modules() {
module_to_index.insert(module.id(), modules.len());
match module.name() {
Some(name) => modules.module(name),
None => modules.module(&format!("<anonymous-module-{}>", modules.len())),
};
}
core_dump.section(&modules);
}

// TODO: We can't currently recover instances from stack frames. We can
// recover module via the frame's PC, but if there are multiple
// instances of the same module, we don't know which instance the frame
// is associated with. Therefore, we do a best effort job: remember the
// last instance of each module and always choose that one. We record
// that information here.
let mut module_to_instance = HashMap::new();

{
let mut instances = wasm_encoder::CoreDumpInstancesSection::new();
for instance in self.instances() {
let module = instance.module(&store);
module_to_instance.insert(module.id(), instances.len());

let module_index = module_to_index[&module.id()];

let memories = instance
.all_memories(&mut store.0)
.collect::<Vec<_>>()
.into_iter()
.map(|(_i, memory)| memory_to_idx[&memory.hash_key(&store.0)])
.collect::<Vec<_>>();

let globals = instance
.all_globals(&mut store.0)
.collect::<Vec<_>>()
.into_iter()
.map(|(_i, global)| global_to_idx[&global.hash_key(&store.0)])
.collect::<Vec<_>>();

instances.instance(module_index, memories, globals);
}
core_dump.section(&instances);
}

{
let thread_name = "main";
let mut stack = wasm_encoder::CoreDumpStackSection::new(thread_name);
for frame in self.frames() {
// This isn't necessarily the right instance if there are
// multiple instances of the same module. See comment above
// `module_to_instance` for details.
let instance = module_to_instance[&frame.module().id()];

let func = frame.func_index();

let offset = frame
.func_offset()
.and_then(|o| u32::try_from(o).ok())
.unwrap_or(0);

// We can't currently recover locals and the operand stack. We
// should eventually be able to do that with Winch though.
let locals = [];
let operand_stack = [];

stack.frame(instance, func, offset, locals, operand_stack);
}
core_dump.section(&stack);
}

core_dump.finish()
}
}

impl fmt::Display for WasmCoreDump {
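The chunk-and-trim strategy described in the `_serialize` comment above can be illustrated in isolation. This is a rough sketch under stated assumptions, not the commit's code: `trimmed_segments` is a hypothetical helper, it takes the chunk size as a parameter (assumed non-zero) instead of the fixed 4096, and it uses `chunks` where the diff uses `chunks_exact`, so a short trailing chunk is still examined.

```rust
/// Split `memory` into `chunk_size`-byte chunks. For every chunk that holds
/// at least one non-zero byte, return `(offset, bytes)` where leading and
/// trailing zeroes of that chunk have been trimmed away.
///
/// Assumes `chunk_size > 0` (`chunks` panics on a zero chunk size).
fn trimmed_segments(memory: &[u8], chunk_size: usize) -> Vec<(usize, &[u8])> {
    let mut segments = Vec::new();
    for (i, chunk) in memory.chunks(chunk_size).enumerate() {
        // All-zero chunks produce no segment at all.
        if let Some(start) = chunk.iter().position(|byte| *byte != 0) {
            // A non-zero byte exists, so `rposition` cannot return `None`.
            let end = chunk.iter().rposition(|byte| *byte != 0).unwrap() + 1;
            segments.push((i * chunk_size + start, &chunk[start..end]));
        }
    }
    segments
}

fn main() {
    let memory = [0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 7, 0, 0, 9];
    let segments = trimmed_segments(&memory, 4);

    // Chunks 0 and 2 are all zeroes and are skipped entirely.
    assert_eq!(segments.len(), 2);
    // Chunk 1 is trimmed on both sides.
    assert_eq!(segments[0], (5, &[1u8, 2][..]));
    // Zeroes in the interior of a chunk are kept.
    assert_eq!(segments[1], (12, &[7u8, 0, 0, 9][..]));
}
```

The number of segments is bounded by the number of chunks, while long zero runs that span whole chunks cost nothing, which is the trade-off the original comment describes.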
52 changes: 52 additions & 0 deletions crates/wasmtime/src/externals.rs
@@ -349,6 +349,58 @@ impl Global {
from: store[self.0].definition,
}
}

/// Get a stable hash key for this global.
///
/// Even if the same underlying global definition is added to the
/// `StoreData` multiple times and becomes multiple `wasmtime::Global`s,
/// this hash key will be consistent across all of these globals.
pub(crate) fn hash_key(&self, store: &StoreOpaque) -> impl std::hash::Hash + Eq {
store[self.0].definition as usize
}
}

#[cfg(test)]
mod global_tests {
use super::*;
use crate::{Instance, Module, Store};

#[test]
fn hash_key_is_stable_across_duplicate_store_data_entries() -> Result<()> {
let mut store = Store::<()>::default();
let module = Module::new(
store.engine(),
r#"
(module
(global (export "g") (mut i32) (i32.const 0))
)
"#,
)?;
let instance = Instance::new(&mut store, &module, &[])?;

// Each time we `get_global`, we call `Global::from_wasmtime` which adds
// a new entry to `StoreData`, so `g1` and `g2` will have different
// indices into `StoreData`.
let g1 = instance.get_global(&mut store, "g").unwrap();
let g2 = instance.get_global(&mut store, "g").unwrap();

// That said, they really point to the same global.
assert_eq!(g1.get(&mut store).unwrap_i32(), 0);
assert_eq!(g2.get(&mut store).unwrap_i32(), 0);
g1.set(&mut store, Val::I32(42))?;
assert_eq!(g1.get(&mut store).unwrap_i32(), 42);
assert_eq!(g2.get(&mut store).unwrap_i32(), 42);

// And therefore their hash keys are the same.
assert!(g1.hash_key(&store.as_context().0) == g2.hash_key(&store.as_context().0));

// But the hash keys are different from different globals.
let instance2 = Instance::new(&mut store, &module, &[])?;
let g3 = instance2.get_global(&mut store, "g").unwrap();
assert!(g1.hash_key(&store.as_context().0) != g3.hash_key(&store.as_context().0));

Ok(())
}
}

/// A WebAssembly `table`, or an array of values.
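The reason `hash_key` is needed can be modeled without any Wasmtime types. The sketch below is an illustration only: `ToyStore`, `Handle`, and `wrap` are hypothetical stand-ins for `StoreData`, `Stored<T>`, and `Global::from_wasmtime`, and a plain `i32` plays the role of the underlying global definition.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `StoreData`: every wrap appends a new entry.
#[derive(Default)]
struct ToyStore {
    entries: Vec<*const i32>,
}

/// Hypothetical stand-in for `Stored<T>`: an index into `ToyStore::entries`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Handle(usize);

impl ToyStore {
    /// Wrapping the same definition twice yields two distinct handles.
    fn wrap(&mut self, definition: &i32) -> Handle {
        self.entries.push(definition as *const i32);
        Handle(self.entries.len() - 1)
    }

    /// Key on the address of the underlying definition, not on the handle.
    fn hash_key(&self, handle: Handle) -> usize {
        self.entries[handle.0] as usize
    }
}

fn main() {
    let definitions = [0i32, 0i32];
    let mut store = ToyStore::default();
    let g1 = store.wrap(&definitions[0]);
    let g2 = store.wrap(&definitions[0]);
    let g3 = store.wrap(&definitions[1]);

    // The handles differ even though g1 and g2 wrap the same definition.
    assert_ne!(g1, g2);
    assert_eq!(store.hash_key(g1), store.hash_key(g2));
    assert_ne!(store.hash_key(g1), store.hash_key(g3));

    // Keying a map on `hash_key` collapses the duplicate handles.
    let mut index_of: HashMap<usize, usize> = HashMap::new();
    for handle in [g1, g2, g3] {
        let next = index_of.len();
        index_of.entry(store.hash_key(handle)).or_insert(next);
    }
    assert_eq!(index_of.len(), 2);
}
```

Keying on the handle itself would have produced three map entries here, which corresponds to the duplicate-entry problem the commit message describes for core dump construction.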