wasmtime/crates/wasi-nn/src/openvino.rs
Andrew Brown edfa10d607 wasi-threads: an initial implementation (#5484)
This commit includes a set of changes that add initial support for `wasi-threads` to Wasmtime:

* feat: remove mutability from the WasiCtx Table

This patch adds interior mutability to the WasiCtx Table and the Table elements.

Major pain points:
* `File` only needs `RwLock<cap_std::fs::File>` to implement
  `File::set_fdflags()` on Windows, because of [1]
* Because `File` needs a `RwLock` and `RwLock*Guard` cannot
  be held across an `.await`, the `async` from
  `async fn num_ready_bytes(&self)` had to be removed
* Because `File` needs a `RwLock` and `RwLock*Guard` cannot
  be dereferenced in `pollable`, the signature of
  `fn pollable(&self) -> Option<rustix::fd::BorrowedFd>`
  changed to `fn pollable(&self) -> Option<Arc<dyn AsFd + '_>>`

[1] da238e324e/src/fs/fd_flags.rs (L210-L217)
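
The interior-mutability idea described above can be sketched with the standard library alone. The `Table` and `Inner` types below are hypothetical stand-ins, not the actual `wasi-common` definitions; the sketch only assumes that entries are stored behind an `RwLock` so that every method can take `&self`.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical resource table; NOT the real wasi-common `Table`.
pub struct Table {
    inner: RwLock<Inner>,
}

struct Inner {
    map: HashMap<u32, Arc<dyn Any + Send + Sync>>,
    next_key: u32,
}

impl Table {
    pub fn new() -> Self {
        Table {
            inner: RwLock::new(Inner {
                map: HashMap::new(),
                next_key: 0,
            }),
        }
    }

    /// Insert through `&self`: the lock serializes writers, so no
    /// `&mut self` is needed and the table can be shared across threads.
    pub fn push(&self, value: Arc<dyn Any + Send + Sync>) -> u32 {
        let mut inner = self.inner.write().unwrap();
        let key = inner.next_key;
        inner.next_key += 1;
        inner.map.insert(key, value);
        key
    }

    /// Clone the `Arc` out of the map so the read guard is released
    /// before the caller starts using the entry.
    pub fn get<T: Any + Send + Sync>(&self, key: u32) -> Option<Arc<T>> {
        let inner = self.inner.read().unwrap();
        let entry = inner.map.get(&key)?.clone();
        entry.downcast::<T>().ok()
    }
}
```

Returning a cloned `Arc` rather than a reference tied to the guard is one way to avoid holding a `RwLock*Guard` across an `.await`, which is the constraint the list above mentions.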

* wasi-threads: add an initial implementation

This change is a first step toward implementing `wasi-threads` in
Wasmtime. We may find that it has some missing pieces, but the core
functionality is there: when `wasi::thread_spawn` is called by a running
WebAssembly module, a function named `wasi_thread_start` is found in the
module's exports and called in a new instance. The shared memory of the
original instance is reused in the new instance.
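
As a rough illustration of that flow, the sketch below models it with standard-library threads only. `Instance`, `SharedMemory`, and `ThreadSpawner` are hypothetical stand-ins rather than Wasmtime types, the `(thread_id, start_arg)` parameters of `wasi_thread_start` are an assumption here, and returning the `JoinHandle` is only a convenience for the demonstration.

```rust
use std::sync::atomic::{AtomicI32, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a Wasm shared linear memory.
type SharedMemory = Arc<Vec<AtomicU32>>;

/// Hypothetical stand-in for an instantiated module.
struct Instance {
    memory: SharedMemory,
}

impl Instance {
    /// Plays the role of the exported `wasi_thread_start` function.
    /// Here it just records the thread ID at index `start_arg`.
    fn wasi_thread_start(&self, thread_id: i32, start_arg: u32) {
        self.memory[start_arg as usize].store(thread_id as u32, Ordering::SeqCst);
    }
}

/// Hypothetical host-side state backing `thread_spawn`.
struct ThreadSpawner {
    next_id: AtomicI32,
}

impl ThreadSpawner {
    /// Create a new "instance" on a fresh host thread, reusing the
    /// parent's shared memory, and call its start function.
    fn thread_spawn(&self, parent: &Instance, start_arg: u32) -> (i32, JoinHandle<()>) {
        let thread_id = self.next_id.fetch_add(1, Ordering::SeqCst);
        let memory = Arc::clone(&parent.memory);
        let handle = thread::spawn(move || {
            let child = Instance { memory };
            child.wasi_thread_start(thread_id, start_arg);
        });
        (thread_id, handle)
    }
}
```

The point of the sketch is that each spawned thread gets its own instance while the memory is shared by reference, so writes made by the child are visible to the parent.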

This new WASI proposal is in its early stages and details are still
being hashed out in the [spec] and [wasi-libc] repositories. Due to its
experimental state, the `wasi-threads` functionality is hidden behind
both a compile-time and runtime flag: one must build with `--features
wasi-threads` but also run the Wasmtime CLI with `--wasm-features
threads` and `--wasi-modules experimental-wasi-threads`. One can
experiment with `wasi-threads` by running:

```console
$ cargo run --features wasi-threads -- \
    --wasm-features threads --wasi-modules experimental-wasi-threads \
    <a threads-enabled module>
```

Threads-enabled Wasm modules are not yet easy to build. Hopefully this
is resolved soon, but in the meantime see the use of
`THREAD_MODEL=posix` in the [wasi-libc] repository for some clues on
what is necessary. Wiggle complicates things by requiring the Wasm
memory to be exported with a certain name and `wasi-threads` also
expects that memory to be imported; this build-time obstacle can be
overcome with the `--import-memory --export-memory` flags only available
in the latest Clang tree. Due to all of this, the included tests are
written directly in WAT--run these with:

```console
$ cargo test --features wasi-threads -p wasmtime-cli -- cli_tests
```

[spec]: https://github.com/WebAssembly/wasi-threads
[wasi-libc]: https://github.com/WebAssembly/wasi-libc

This change does not protect the WASI implementations themselves from
concurrent access. This is already complete in previous commits or left
for future commits in certain cases (e.g., wasi-nn).

* wasi-threads: factor out process exit logic

As is being discussed [elsewhere], either calling `proc_exit` or
trapping in any thread should halt execution of all threads. The
Wasmtime CLI already has logic for adapting a WebAssembly error code to
a code expected in each OS. This change factors out this logic to a new
function, `maybe_exit_on_error`, for use within the `wasi-threads`
implementation.

This will work reasonably well for CLI users of Wasmtime +
`wasi-threads`, but embedders will want something better in the future:
when a `wasi-threads` thread fails, they may not want their application
to exit. Handling this is tricky, because it will require cancelling the
threads spawned by the `wasi-threads` implementation, something that is
not trivial to do in Rust. With this change, we defer that work until
later in order to provide a working implementation of `wasi-threads` for
experimentation.

[elsewhere]: https://github.com/WebAssembly/wasi-threads/pull/17
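
To make the idea of adapting a WebAssembly result to an OS exit code concrete, here is a minimal sketch. It is not the body of `maybe_exit_on_error`: the `Outcome` enum, the `host_exit_code` function, and the specific numbers are illustrative assumptions based on common OS conventions (128 plus the signal number for an abort on Unix, status 3 for an abort on Windows).

```rust
/// Hypothetical summary of how a Wasm thread finished; not a Wasmtime type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// The thread returned normally.
    Ok,
    /// The guest called `proc_exit` with this status.
    Exit(i32),
    /// The guest trapped.
    Trap,
}

/// Map an outcome to a host process exit code, or `None` if the
/// process should keep running. `windows` selects the convention.
pub fn host_exit_code(outcome: Outcome, windows: bool) -> Option<i32> {
    match outcome {
        Outcome::Ok => None,
        // Assumption: 3 conventionally signals an abort on Windows, so a
        // guest-chosen 3 is remapped to a generic failure to stay unambiguous.
        Outcome::Exit(3) if windows => Some(1),
        Outcome::Exit(status) => Some(status),
        // Assumption: report a trap the way an abort would be reported.
        Outcome::Trap if windows => Some(3),
        Outcome::Trap => Some(128 + 6), // 128 + SIGABRT
    }
}
```

A CLI-style caller could pass any `Some(code)` to `std::process::exit`, which halts every thread at once; an embedder would need a different strategy, as the paragraph above notes.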

* review: work around `fd_fdstat_set_flags`

In order to make progress with wasi-threads, this change temporarily
works around limitations induced by `wasi-common`'s
`fd_fdstat_set_flags` to allow `&mut self` use in the implementation.
Eventual resolution is tracked in
https://github.com/bytecodealliance/wasmtime/issues/5643. This change
makes several related helper functions (e.g., `set_fdflags`) take `&mut
self` as well.

* test: use `wait`/`notify` to improve `threads.wat` test

Previously, the test simply executed in a loop for some hardcoded number
of iterations. This change uses `wait`, `notify`, and atomic
operations to keep track of when the spawned threads are done and join
on the main thread appropriately.
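
The actual test is written in WAT. As a rough host-side analogue only, the same "count completions, then wake the waiter" pattern can be sketched in Rust with a `Mutex` and `Condvar` standing in for the Wasm `wait`/`notify` instructions. The `run_workers` function is hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Spawn `n` workers and block until all of them have reported
/// completion, instead of spinning for a fixed number of iterations.
/// Returns the final completion count.
fn run_workers(n: usize) -> usize {
    let state = Arc::new((Mutex::new(0usize), Condvar::new()));

    for _ in 0..n {
        let state = Arc::clone(&state);
        thread::spawn(move || {
            let (done, cvar) = &*state;
            // Record completion, then wake the waiting main thread.
            *done.lock().unwrap() += 1;
            cvar.notify_one();
        });
    }

    let (done, cvar) = &*state;
    // Sleep until the counter reaches `n`; spurious wakeups are
    // handled by `wait_while` re-checking the condition.
    let guard = cvar
        .wait_while(done.lock().unwrap(), |count| *count < n)
        .unwrap();
    *guard
}
```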

* various fixes and tweaks due to the PR review

---------

Signed-off-by: Harald Hoyer <harald@profian.com>
Co-authored-by: Harald Hoyer <harald@profian.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2023-02-07 13:43:02 -08:00


//! Implements the wasi-nn API.

use crate::api::{Backend, BackendError, BackendExecutionContext, BackendGraph};
use crate::witx::types::{ExecutionTarget, GraphBuilderArray, Tensor, TensorType};
use openvino::{InferenceError, Layout, Precision, SetupError, TensorDesc};
use std::sync::Arc;

#[derive(Default)]
pub(crate) struct OpenvinoBackend(Option<openvino::Core>);
unsafe impl Send for OpenvinoBackend {}
unsafe impl Sync for OpenvinoBackend {}

impl Backend for OpenvinoBackend {
    fn name(&self) -> &str {
        "openvino"
    }

    fn load(
        &mut self,
        builders: &GraphBuilderArray<'_>,
        target: ExecutionTarget,
    ) -> Result<Box<dyn BackendGraph>, BackendError> {
        if builders.len() != 2 {
            return Err(BackendError::InvalidNumberOfBuilders(2, builders.len()).into());
        }

        // Construct the context if none is present; this is done lazily (i.e.
        // upon actually loading a model) because it may fail to find and load
        // the OpenVINO libraries. The laziness limits the extent of the error
        // only to wasi-nn users, not all WASI users.
        if self.0.is_none() {
            self.0.replace(openvino::Core::new(None)?);
        }

        // Read the guest array.
        let builders = builders.as_ptr();
        let xml = builders
            .read()?
            .as_slice()?
            .expect("cannot use with shared memories; see https://github.com/bytecodealliance/wasmtime/issues/5235 (TODO)");
        let weights = builders
            .add(1)?
            .read()?
            .as_slice()?
            .expect("cannot use with shared memories; see https://github.com/bytecodealliance/wasmtime/issues/5235 (TODO)");

        // Construct OpenVINO graph structures: `cnn_network` contains the graph
        // structure, `exec_network` can perform inference.
        let core = self
            .0
            .as_mut()
            .expect("openvino::Core was previously constructed");
        let mut cnn_network = core.read_network_from_buffer(&xml, &weights)?;

        // TODO this is a temporary workaround. We need a more elegant way to specify the layout in the long run.
        // However, without this newer versions of OpenVINO will fail due to parameter mismatch.
        for i in 0..cnn_network.get_inputs_len()? {
            let name = cnn_network.get_input_name(i)?;
            cnn_network.set_input_layout(&name, Layout::NHWC)?;
        }

        let exec_network =
            core.load_network(&cnn_network, map_execution_target_to_string(target))?;

        Ok(Box::new(OpenvinoGraph(Arc::new(cnn_network), exec_network)))
    }
}

struct OpenvinoGraph(Arc<openvino::CNNNetwork>, openvino::ExecutableNetwork);

unsafe impl Send for OpenvinoGraph {}
unsafe impl Sync for OpenvinoGraph {}

impl BackendGraph for OpenvinoGraph {
    fn init_execution_context(&mut self) -> Result<Box<dyn BackendExecutionContext>, BackendError> {
        let infer_request = self.1.create_infer_request()?;
        Ok(Box::new(OpenvinoExecutionContext(
            self.0.clone(),
            infer_request,
        )))
    }
}

struct OpenvinoExecutionContext(Arc<openvino::CNNNetwork>, openvino::InferRequest);

impl BackendExecutionContext for OpenvinoExecutionContext {
    fn set_input(&mut self, index: u32, tensor: &Tensor<'_>) -> Result<(), BackendError> {
        let input_name = self.0.get_input_name(index as usize)?;

        // Construct the blob structure.
        let dimensions = tensor
            .dimensions
            .as_slice()?
            .expect("cannot use with shared memories; see https://github.com/bytecodealliance/wasmtime/issues/5235 (TODO)")
            .iter()
            .map(|d| *d as usize)
            .collect::<Vec<_>>();
        let precision = map_tensor_type_to_precision(tensor.type_);

        // TODO There must be some good way to discover the layout here; this
        // should not have to default to NHWC.
        let desc = TensorDesc::new(Layout::NHWC, &dimensions, precision);
        let data = tensor
            .data
            .as_slice()?
            .expect("cannot use with shared memories; see https://github.com/bytecodealliance/wasmtime/issues/5235 (TODO)");
        let blob = openvino::Blob::new(&desc, &data)?;

        // Actually assign the blob to the request.
        self.1.set_blob(&input_name, &blob)?;
        Ok(())
    }

    fn compute(&mut self) -> Result<(), BackendError> {
        self.1.infer()?;
        Ok(())
    }

    fn get_output(&mut self, index: u32, destination: &mut [u8]) -> Result<u32, BackendError> {
        let output_name = self.0.get_output_name(index as usize)?;
        let mut blob = self.1.get_blob(&output_name)?;
        let blob_size = blob.byte_len()?;
        if blob_size > destination.len() {
            return Err(BackendError::NotEnoughMemory(blob_size));
        }

        // Copy the tensor data into the destination buffer.
        destination[..blob_size].copy_from_slice(blob.buffer()?);
        Ok(blob_size as u32)
    }
}

impl From<InferenceError> for BackendError {
    fn from(e: InferenceError) -> Self {
        BackendError::BackendAccess(anyhow::Error::new(e))
    }
}

impl From<SetupError> for BackendError {
    fn from(e: SetupError) -> Self {
        BackendError::BackendAccess(anyhow::Error::new(e))
    }
}

/// Return the execution target string expected by OpenVINO from the
/// `ExecutionTarget` enum provided by wasi-nn.
fn map_execution_target_to_string(target: ExecutionTarget) -> &'static str {
    match target {
        ExecutionTarget::Cpu => "CPU",
        ExecutionTarget::Gpu => "GPU",
        ExecutionTarget::Tpu => unimplemented!("OpenVINO does not support TPU execution targets"),
    }
}

/// Return OpenVINO's precision type for the `TensorType` enum provided by
/// wasi-nn.
fn map_tensor_type_to_precision(tensor_type: TensorType) -> openvino::Precision {
    match tensor_type {
        TensorType::F16 => Precision::FP16,
        TensorType::F32 => Precision::FP32,
        TensorType::U8 => Precision::U8,
        TensorType::I32 => Precision::I32,
    }
}