forked from bytecodealliance/wasmtime
wasi-nn: add named models (bytecodealliance#6854)
* wasi-nn: add [named models]

  This change adds a way to retrieve preloaded ML models (i.e., "graphs" in wasi-nn terms) from a registry. The wasi-nn specification includes a new function, `load_by_name`, that can be used to access these models more efficiently than before; previously, a user's only option was to read/download/etc. all of the bytes of an ML model and pass them to the `load` function.

  [named models]: WebAssembly/wasi-nn#36

  In Wasmtime's implementation of wasi-nn, we call the registry that holds the models a `GraphRegistry`. We include a simplistic `InMemoryRegistry` for use in the Wasmtime CLI (more on this later) but the idea is that production use will involve some more complex caching and thus a new implementation of a registry--a `Box<dyn GraphRegistry>`--passed into the wasi-nn context. Note that, because we now must be able to `clone` a graph out of the registry and into the "used graphs" table, the OpenVINO `BackendGraph` is updated to be easier to copy around.

  To allow experimentation with this "preload a named model" functionality, this change also adds a new Wasmtime CLI flag: `--graph <encoding>:<host dir>`. Wasmtime CLI users can now preload a model from a directory; the directory `basename` is used as the model name. Loading models from a directory is probably not desired in Wasmtime embeddings so it is cordoned off into a separate `BackendFromDir` extension trait.

* wasi-nn: add "named model" test

  Add a new example crate which loads a model by name and performs image classification. It uses the same MobileNet model as the existing test but a new version of the Rust bindings. The new crate is built and run with the new CLI flag in the `ci/run-wasi-nn-example.sh` script.

  prtest:full

* review: rename `--graph` to `--wasi-nn-graph`
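The registry idea described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchGraph`, `SketchRegistry`, `SketchInMemoryRegistry`, and `preload` are hypothetical names invented here, not Wasmtime's actual `GraphRegistry`, `InMemoryRegistry`, or `BackendFromDir` definitions, whose signatures may differ. It shows a `<encoding>:<host dir>` value being split, the directory basename being used as the model name, and a graph being cloned out of the registry on lookup.

```rust
use std::collections::HashMap;
use std::path::Path;

// Hypothetical stand-in for a backend graph. It derives `Clone` because the
// commit message notes that graphs must be cloned out of the registry.
#[derive(Clone, Debug, PartialEq)]
struct SketchGraph {
    encoding: String,
    dir: String,
}

// Hypothetical lookup trait; the real `GraphRegistry` may look different.
trait SketchRegistry {
    fn get(&self, name: &str) -> Option<&SketchGraph>;
}

// Hypothetical in-memory registry keyed by model name.
#[derive(Default)]
struct SketchInMemoryRegistry(HashMap<String, SketchGraph>);

impl SketchInMemoryRegistry {
    // Parse a `<encoding>:<host dir>` value and register a graph under the
    // directory's basename. Returns the name that was registered.
    fn preload(&mut self, flag_value: &str) -> Result<String, String> {
        let (encoding, dir) = flag_value
            .split_once(':')
            .ok_or_else(|| format!("expected <encoding>:<host dir>, got {flag_value:?}"))?;
        let name = Path::new(dir)
            .file_name()
            .and_then(|n| n.to_str())
            .ok_or_else(|| format!("no usable basename in {dir:?}"))?
            .to_string();
        let graph = SketchGraph {
            encoding: encoding.to_string(),
            dir: dir.to_string(),
        };
        self.0.insert(name.clone(), graph);
        Ok(name)
    }
}

impl SketchRegistry for SketchInMemoryRegistry {
    fn get(&self, name: &str) -> Option<&SketchGraph> {
        self.0.get(name)
    }
}

fn main() {
    let mut registry = SketchInMemoryRegistry::default();
    let name = registry.preload("openvino:/models/mobilenet").unwrap();
    // Clone the graph out of the registry, as a by-name load would need to.
    let graph = registry.get(&name).cloned();
    println!("{name} -> {graph:?}");
}
```

A trait object (for example `Box<dyn SketchRegistry>`) could replace the concrete type where a different caching strategy is wanted, which is the extension point the commit message describes.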
1 parent
cd92093
commit d566f9e
Showing 14 changed files with 388 additions and 44 deletions.
74 changes: 74 additions & 0 deletions
crates/wasi-nn/examples/classification-example-named/Cargo.lock
15 changes: 15 additions & 0 deletions
crates/wasi-nn/examples/classification-example-named/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "wasi-nn-example-named"
version = "0.0.0"
authors = ["The Wasmtime Project Developers"]
readme = "README.md"
edition = "2021"
publish = false

[dependencies]
wasi-nn = "0.5.0"

# This crate is built with the wasm32-wasi target, so it's separate
# from the main Wasmtime build, so use this directive to exclude it
# from the parent directory's workspace.
[workspace]
2 changes: 2 additions & 0 deletions
crates/wasi-nn/examples/classification-example-named/README.md
@@ -0,0 +1,2 @@
This example project demonstrates using the `wasi-nn` API to perform ML inference. It consists of Rust code that is
built using the `wasm32-wasi` target. See `ci/run-wasi-nn-example.sh` for how this is used.
53 changes: 53 additions & 0 deletions
crates/wasi-nn/examples/classification-example-named/src/main.rs
@@ -0,0 +1,53 @@
use std::fs;
use wasi_nn::*;

pub fn main() {
    let graph = GraphBuilder::new(GraphEncoding::Openvino, ExecutionTarget::CPU)
        .build_from_cache("mobilenet")
        .unwrap();
    println!("Loaded a graph: {:?}", graph);

    let mut context = graph.init_execution_context().unwrap();
    println!("Created an execution context: {:?}", context);

    // Load a tensor that precisely matches the graph input tensor (see
    // `fixture/frozen_inference_graph.xml`).
    let tensor_data = fs::read("fixture/tensor.bgr").unwrap();
    println!("Read input tensor, size in bytes: {}", tensor_data.len());
    context
        .set_input(0, TensorType::F32, &[1, 3, 224, 224], &tensor_data)
        .unwrap();

    // Execute the inference.
    context.compute().unwrap();
    println!("Executed graph inference");

    // Retrieve the output.
    let mut output_buffer = vec![0f32; 1001];
    context.get_output(0, &mut output_buffer[..]).unwrap();

    println!(
        "Found results, sorted top 5: {:?}",
        &sort_results(&output_buffer)[..5]
    )
}

// Sort the buffer of probabilities. The graph places the match probability for each class at the
// index for that class (e.g. the probability of class 42 is placed at buffer[42]). Here we convert
// to a wrapping InferenceResult and sort the results. It is unclear why the MobileNet output
// indices are "off by one" but the `.skip(1)` below seems necessary to get results that make sense
// (e.g. 763 = "revolver" vs 762 = "restaurant")
fn sort_results(buffer: &[f32]) -> Vec<InferenceResult> {
    let mut results: Vec<InferenceResult> = buffer
        .iter()
        .skip(1)
        .enumerate()
        .map(|(c, p)| InferenceResult(c, *p))
        .collect();
    results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    results
}

// A wrapper for class ID and match probabilities.
#[derive(Debug, PartialEq)]
struct InferenceResult(usize, f32);
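To make the `.skip(1)` index shift in `sort_results` concrete, here is a small standalone sketch that runs without a wasi-nn runtime. The four-element buffer is made-up illustration data, not model output, and the comparison uses `f32::total_cmp` as a swapped-in alternative to the original `partial_cmp(..).unwrap()`, so that a NaN in the buffer cannot cause a panic.

```rust
// Class ID paired with its match probability (mirrors the example's wrapper).
#[derive(Debug, PartialEq)]
struct InferenceResult(usize, f32);

// Drop the first buffer entry, number the remaining entries from zero, and
// sort them by probability in descending order.
fn sort_results(buffer: &[f32]) -> Vec<InferenceResult> {
    let mut results: Vec<InferenceResult> = buffer
        .iter()
        .skip(1)
        .enumerate()
        .map(|(c, p)| InferenceResult(c, *p))
        .collect();
    // Swapped-in comparison: `total_cmp` defines a total order over f32,
    // so it never panics, unlike `partial_cmp(..).unwrap()` on NaN.
    results.sort_by(|a, b| b.1.total_cmp(&a.1));
    results
}

fn main() {
    // Made-up data: buffer[0] is skipped, so buffer[2] becomes class 1.
    let buffer = [0.9, 0.1, 0.7, 0.2];
    let sorted = sort_results(&buffer);
    println!("{:?}", sorted);
}
```

Because `enumerate` runs after `skip(1)`, the value stored at `buffer[i]` is reported as class `i - 1`, and the first entry of the buffer never appears in the results.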