We used to allow at most one instantiation per compilation, but there is no
fundamental reason why that should be the case. Allowing multiple instantiations
per compilation allows us to, for example, benchmark repeated instantiation
within Wasmtime's pooling allocator.
This additionally switches to using host functions for WASI and for
`bench_{start,end}` rather than defining them on the linker, this way we can use
a new store for every instantiation and don't need to keep other instances alive
when instantiating new instances.
Finally, we switch all timing to be done through callback functions, rather than
having the bench API caller implicitly start/end timers around bench API
calls. This allows us to more precisely measure phases and exclude things like
file I/O performed when creating a WASI context.