Add support for generating perf maps for simple perf profiling (#6030)

* Add support for generating perf maps for simple perf profiling

* add missing enum entry in C code

* bugfix: use hexa when printing the code region's length too (thanks bjorn3!)

* sanitize file name + use bufwriter

* introduce --profile CLI flag for wasmtime

* Update doc and doc comments for new --profile option

* remove redundant FromStr import

* Apply review feedback: make_line receives a Write impl, report errors

* fix tests?

* better docs
Benjamin Bouvier
2023-03-20 17:17:36 +01:00
committed by GitHub
parent b5a2d536ac
commit 6f4f30c840
14 changed files with 224 additions and 38 deletions

@@ -6,6 +6,59 @@ an extremely powerful profiler with lots of documentation on the web, but for
the rest of this section we'll assume you're running on Linux and already have
`perf` installed.
There are two profiling agents for `perf`:
- a very simple one that will map code regions to symbol names: `perfmap`.
- a more detailed one that can provide additional information and mappings between the source
language statements and generated JIT code: `jitdump`.
## Profiling with `perfmap`
Simple profiling support with `perf` generates a "perf map" file that the `perf` CLI
automatically looks for when it runs into unresolved symbols. This requires runtime support from
Wasmtime itself, so you will need to manually change a few things to enable profiling support in
your application. Enabling runtime support depends on how you're using Wasmtime:
* **Rust API** - you'll want to call the [`Config::profiler`] method with
`ProfilingStrategy::PerfMap` to enable profiling of your wasm modules.
* **C API** - you'll want to call the `wasmtime_config_profiler_set` API with a
`WASMTIME_PROFILING_STRATEGY_PERFMAP` value.
* **Command Line** - you'll want to pass the `--profile=perfmap` flag on the command
line.
Once perfmap support is enabled, you'll use `perf record` as usual to record
your application's performance.
For example, if you're using the CLI, you'll execute:
```sh
$ perf record -k mono wasmtime --profile=perfmap foo.wasm
```
This will create a `perf.data` file as per usual, but it will *also* create a
`/tmp/perf-XXXX.map` file. This extra `.map` file is the perf map file: its format is
specified by `perf`, and Wasmtime generates it at runtime.
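To make the file format concrete, here is a minimal sketch of how one line of a perf map can be written. Each line has the form `START SIZE name`, with the start address and the size both in hexadecimal and without a `0x` prefix. The helper names `sanitize_name` and `write_map_line` are hypothetical and chosen for illustration; this is not Wasmtime's actual implementation.

```rust
use std::io::{self, Write};

/// Hypothetical helper: replace line breaks in a symbol name, since the
/// perf map format expects exactly one symbol per line.
fn sanitize_name(name: &str) -> String {
    name.replace('\n', "_").replace('\r', "_")
}

/// Hypothetical helper: write one perf map entry as `START SIZE name`.
/// Both the start address and the size are printed in hex without `0x`.
fn write_map_line<W: Write>(mut out: W, addr: usize, len: usize, name: &str) -> io::Result<()> {
    writeln!(out, "{:x} {:x} {}", addr, len, sanitize_name(name))
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for the `/tmp/perf-XXXX.map` file.
    let mut buf = Vec::new();
    write_map_line(&mut buf, 0x7f00_1000, 0x40, "fib")?;
    write_map_line(&mut buf, 0x7f00_1040, 26, "main")?;
    // Prints:
    // 7f001000 40 fib
    // 7f001040 1a main
    print!("{}", String::from_utf8_lossy(&buf));
    Ok(())
}
```

Taking any `Write` implementation keeps the formatting logic independent of the destination, so the same function can target a buffered file or, as here, a buffer in memory.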
After that you can explore the `perf.data` profile as you usually would, for example with:
```sh
$ perf report --input perf.data
```
You should be able to see time spent in wasm functions, generate flamegraphs based on that, and so on.
Each wasm function should show up as a single entry, and the name of each
function matches the debug name section in the wasm file.
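As an illustration of how a sampled address can be mapped back to a function name using such a file, the sketch below parses `START SIZE name` lines and looks up the entry whose address range contains the sample. The `MapEntry` type and the `parse_line` and `resolve` functions are hypothetical names for this example; `perf`'s own symbol resolution is more involved.

```rust
/// Hypothetical representation of one perf map entry.
#[derive(Debug, PartialEq)]
struct MapEntry {
    start: u64,
    size: u64,
    name: String,
}

/// Parse a `START SIZE name` line (hex fields); returns None if malformed.
fn parse_line(line: &str) -> Option<MapEntry> {
    let mut parts = line.splitn(3, ' ');
    let start = u64::from_str_radix(parts.next()?, 16).ok()?;
    let size = u64::from_str_radix(parts.next()?, 16).ok()?;
    let name = parts.next()?.trim_end().to_string();
    Some(MapEntry { start, size, name })
}

/// Find the symbol whose range [start, start + size) contains `addr`.
fn resolve(entries: &[MapEntry], addr: u64) -> Option<&str> {
    entries
        .iter()
        .find(|e| addr >= e.start && addr - e.start < e.size)
        .map(|e| e.name.as_str())
}

fn main() {
    // Sample map contents, made up for this example.
    let map = "7f001000 40 fib\n7f001040 1a main\n";
    let entries: Vec<MapEntry> = map.lines().filter_map(parse_line).collect();
    println!("{:?}", resolve(&entries, 0x7f00_1010)); // Some("fib")
    println!("{:?}", resolve(&entries, 0x7f00_2000)); // None
}
```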
Note that support for perfmap is still relatively new in Wasmtime, so if you
have any problems, please don't hesitate to [file an issue]!
[file an issue]: https://github.com/bytecodealliance/wasmtime/issues/new
## Profiling with `jitdump`
Profiling support with `perf` uses the "jitdump" support in the `perf` CLI. This
requires runtime support from Wasmtime itself, so you will need to manually
change a few things to enable profiling support in your application. First
@@ -19,7 +72,7 @@ depends on how you're using Wasmtime:
* **C API** - you'll want to call the `wasmtime_config_profiler_set` API with a
`WASMTIME_PROFILING_STRATEGY_JITDUMP` value.
* **Command Line** - you'll want to pass the `--jitdump` flag on the command
* **Command Line** - you'll want to pass the `--profile=jitdump` flag on the command
line.
Once jitdump support is enabled, you'll use `perf record` like usual to record
@@ -29,7 +82,7 @@ your application's performance. You'll need to also be sure to pass the
For example if you're using the CLI, you'll execute:
```sh
$ perf record -k mono wasmtime --jitdump foo.wasm
$ perf record -k mono wasmtime --profile=jitdump foo.wasm
```
This will create a `perf.data` file as per usual, but it will *also* create a
@@ -110,7 +163,7 @@ To collect perf information for this wasm module we'll execute:
```sh
$ rustc --target wasm32-wasi fib.rs -O
$ perf record -k mono wasmtime --jitdump fib.wasm
$ perf record -k mono wasmtime --profile=jitdump fib.wasm
fib(42) = 267914296
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.147 MB perf.data (3435 samples) ]

@@ -39,7 +39,7 @@ runtime--enable runtime support based on how you use Wasmtime:
* **C API** - call the `wasmtime_config_profiler_set` API with a
`WASMTIME_PROFILING_STRATEGY_VTUNE` value.
* **Command Line** - pass the `--vtune` flag on the command line.
* **Command Line** - pass the `--profile=vtune` flag on the command line.
### Profiling Wasmtime itself
@@ -58,11 +58,11 @@ With VTune [properly installed][download], if you are using the CLI execute:
```sh
$ cargo build
$ vtune -run-pass-thru=--no-altstack -collect hotspots target/debug/wasmtime --vtune foo.wasm
$ vtune -run-pass-thru=--no-altstack -collect hotspots target/debug/wasmtime --profile=vtune foo.wasm
```
This command tells the VTune collector (`vtune`) to collect hot spot
profiling data as Wasmtime is executing `foo.wasm`. The `--vtune` flag enables
profiling data as Wasmtime is executing `foo.wasm`. The `--profile=vtune` flag enables
VTune support in Wasmtime so that the collector is also alerted to JIT events
that take place during runtime. The first time this is run, the result of the
command is a results directory `r000hs/` which contains profiling data for
@@ -96,13 +96,13 @@ $ rustc --target wasm32-wasi fib.rs -C opt-level=z -C lto=yes
```
Then we execute the Wasmtime runtime (built with the `vtune` feature and
executed with the `--vtune` flag to enable reporting) inside the VTune CLI
executed with the `--profile=vtune` flag to enable reporting) inside the VTune CLI
application, `vtune`, which must already be installed and available on the
path. To collect hot spot profiling information, we execute:
```sh
$ rustc --target wasm32-wasi fib.rs -C opt-level=z -C lto=yes
$ vtune -run-pass-thru=--no-altstack -v -collect hotspots target/debug/wasmtime --vtune fib.wasm
$ vtune -run-pass-thru=--no-altstack -v -collect hotspots target/debug/wasmtime --profile=vtune fib.wasm
fib(45) = 1134903170
amplxe: Collection stopped.
amplxe: Using result path /home/jlb6740/wasmtime/r000hs
@@ -141,7 +141,7 @@ like:
- Open VTune Profiler
- "Configure Analysis" with
- "Application" set to `/path/to/wasmtime` (e.g., `target/debug/wasmtime`)
- "Application parameters" set to `--vtune /path/to/module.wasm`
- "Application parameters" set to `--profile=vtune /path/to/module.wasm`
- "Working directory" set as appropriate
- Enable "Hardware Event-Based Sampling," which may require some system
configuration, e.g. `sysctl -w kernel.perf_event_paranoid=0`