Add *_unchecked variants of Func APIs for the C API (#3350)

* Add `*_unchecked` variants of `Func` APIs for the C API

This commit is what is hopefully going to be my last installment within
the saga of optimizing function calls in/out of WebAssembly modules in
the C API. This is yet another alternative approach to #3345 (sorry) but
also contains everything necessary to make the C API fast. As in #3345
the general idea is just moving checks out of the call path in the same
style of `TypedFunc`.

This new strategy takes inspiration from previously learned attempts
effectively "just" exposes how we previously passed `*mut u128` through
trampolines for arguments/results. This storage format is formalized
through a new `ValRaw` union that is exposed from the `wasmtime` crate.
By doing this it made it relatively easy to expose two new APIs:

* `Func::new_unchecked`
* `Func::call_unchecked`

These are the same as their checked equivalents except that they're
`unsafe` and they work with `*mut ValRaw` rather than safe slices of
`Val`. Working with these eschews type checks and such and requires
callers/embedders to do the right thing.

These two new functions are then exposed via the C API with new
functions, enabling C to have a fast-path of calling/defining functions.
This fast path is akin to `Func::wrap` in Rust, although that API can't
be built in C due to C not having generics in the same way that Rust
has.

For some benchmarks, the benchmarks here are:

* `nop` - Call a wasm function from the host that does nothing and
  returns nothing.
* `i64` - Call a wasm function from the host, the wasm function calls a
  host function, and the host function returns an `i64` all the way out to
  the original caller.
* `many` - Call a wasm function from the host, the wasm calls
   host function with 5 `i32` parameters, and then an `i64` result is
   returned back to the original host
* `i64` host - just the overhead of the wasm calling the host, so the
  wasm calls the host function in a loop.
* `many` host - same as `i64` host, but calling the `many` host function.

All numbers in this table are in nanoseconds, and this is just one
measurement as well so there's bound to be some variation in the precise
numbers here.

| Name      | Rust | C (before) | C (after) |
|-----------|------|------------|-----------|
| nop       | 19   | 112        | 25        |
| i64       | 22   | 207        | 32        |
| many      | 27   | 189        | 34        |
| i64 host  | 2    | 38         | 5         |
| many host | 7    | 75         | 8         |

The main conclusion here is that the C API is significantly faster than
before when using the `*_unchecked` variants of APIs. The Rust
implementation is still the ceiling (or floor I guess?) for performance
The main reason that C is slower than Rust is that a little bit more has
to travel through memory where on the Rust side of things we can
monomorphize and inline a bit more to get rid of that. Overall though
the costs are way way down from where they were originally and I don't
plan on doing a whole lot more myself at this time. There's various
things we theoretically could do I've considered but implementation-wise
I think they'll be much more weighty.

* Tweak `wasmtime_externref_t` API comments
This commit is contained in:
Alex Crichton
2021-09-24 14:05:45 -05:00
committed by GitHub
parent 344a219245
commit bfdbd10a13
16 changed files with 659 additions and 217 deletions

View File

@@ -8,7 +8,7 @@ use std::mem::{self, MaybeUninit};
use std::panic::{self, AssertUnwindSafe};
use std::ptr;
use std::str;
use wasmtime::{AsContextMut, Caller, Extern, Func, Trap, Val};
use wasmtime::{AsContextMut, Caller, Extern, Func, Trap, Val, ValRaw};
#[derive(Clone)]
#[repr(transparent)]
@@ -208,6 +208,9 @@ pub type wasmtime_func_callback_t = extern "C" fn(
usize,
) -> Option<Box<wasm_trap_t>>;
pub type wasmtime_func_unchecked_callback_t =
extern "C" fn(*mut c_void, *mut wasmtime_caller_t, *mut ValRaw) -> Option<Box<wasm_trap_t>>;
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_new(
store: CStoreContextMut<'_>,
@@ -271,6 +274,35 @@ pub(crate) unsafe fn c_callback_to_rust_fn(
}
}
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_new_unchecked(
store: CStoreContextMut<'_>,
ty: &wasm_functype_t,
callback: wasmtime_func_unchecked_callback_t,
data: *mut c_void,
finalizer: Option<extern "C" fn(*mut std::ffi::c_void)>,
func: &mut Func,
) {
let ty = ty.ty().ty.clone();
let cb = c_unchecked_callback_to_rust_fn(callback, data, finalizer);
*func = Func::new_unchecked(store, ty, cb);
}
pub(crate) unsafe fn c_unchecked_callback_to_rust_fn(
callback: wasmtime_func_unchecked_callback_t,
data: *mut c_void,
finalizer: Option<extern "C" fn(*mut std::ffi::c_void)>,
) -> impl Fn(Caller<'_, crate::StoreData>, *mut ValRaw) -> Result<(), Trap> {
let foreign = crate::ForeignData { data, finalizer };
move |caller, values| {
let mut caller = wasmtime_caller_t { caller };
match callback(foreign.data, &mut caller, values) {
None => Ok(()),
Some(trap) => Err(trap.trap),
}
}
}
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_call(
mut store: CStoreContextMut<'_>,
@@ -329,6 +361,18 @@ pub unsafe extern "C" fn wasmtime_func_call(
}
}
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_call_unchecked(
store: CStoreContextMut<'_>,
func: &Func,
args_and_results: *mut ValRaw,
) -> *mut wasm_trap_t {
match func.call_unchecked(store, args_and_results) {
Ok(()) => ptr::null_mut(),
Err(trap) => Box::into_raw(Box::new(wasm_trap_t::new(trap))),
}
}
#[no_mangle]
pub extern "C" fn wasmtime_func_type(
store: CStoreContext<'_>,
@@ -362,3 +406,17 @@ pub unsafe extern "C" fn wasmtime_caller_export_get(
crate::initialize(item, which.into());
true
}
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_from_raw(
store: CStoreContextMut<'_>,
raw: usize,
func: &mut Func,
) {
*func = Func::from_raw(store, raw).unwrap();
}
#[no_mangle]
pub unsafe extern "C" fn wasmtime_func_to_raw(store: CStoreContextMut<'_>, func: &Func) -> usize {
func.to_raw(store)
}