Reimplement the pooling instance allocation strategy (#5661)
* Reimplement the pooling instance allocation strategy

This commit is a reimplementation of the strategy by which the pooling instance allocator selects a slot for a module. Previously there was a choice among three different algorithms: "reuse affinity", "next available", and "random". The default was "reuse affinity", but some new data has come to light which shows that this may not always be a good default.

Notably, the pooling allocator will retain some memory per slot, for example instance data or memory data if so configured. This means that a currently unused, but previously used, slot can contribute to the RSS usage of a program using Wasmtime. Consequently the RSS impact here is O(max slots), which can be counter-intuitive for embedders. This particularly affects "reuse affinity" because its algorithm for picking a slot when there are no affine slots is "pick a random slot", which means eventually all slots will get used.

In discussions about possible ways to tackle this, an alternative to "pick a strategy" arose and is now implemented in this commit. Concretely, the new allocation algorithm for a slot is:

* First pick the most recently used affine slot, if one exists.
* Otherwise, if the number of slots affine to other modules is above some threshold N, pick the least recently used affine slot.
* Otherwise pick a slot that's affine to nothing.

The "N" in this algorithm is configurable: setting it to 0 is the same as the old "next available" strategy, while setting it to infinity is the same as the "reuse affinity" algorithm. Setting it to something in the middle provides a knob to allow a modest "cache" of affine slots while not allowing the total set of slots used to grow too much beyond the maximal concurrent set of modules.

The "random" strategy is now no longer possible and was removed to help simplify the allocator.
* Resolve rustdoc warnings in `wasmtime-runtime` crate
* Remove `max_cold` as it duplicates the `slot_state.len()`
* More descriptive names
* Add a comment and debug assertion
* Add some list assertions
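The three-step policy described above can be sketched in standalone Rust. This is a minimal illustration of the selection order, not the actual Wasmtime implementation; the names (`SlotPool`, `pick_slot`, `Slot`) and the single `VecDeque` LRU list are assumptions made for the sketch.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq)]
enum Slot {
    Cold,          // never used before
    Warm(u32),     // previously used; affine to this module id
}

struct SlotPool {
    slots: Vec<Slot>,
    // Indices of currently unused slots; least recently used at the front,
    // most recently used at the back.
    unused: VecDeque<usize>,
    max_unused_warm_slots: usize, // the configurable "N"
}

impl SlotPool {
    fn new(capacity: usize, max_unused_warm_slots: usize) -> Self {
        SlotPool {
            slots: vec![Slot::Cold; capacity],
            unused: (0..capacity).collect(),
            max_unused_warm_slots,
        }
    }

    /// Pick a slot for instantiating `module`, following the three steps
    /// from the commit message.
    fn pick_slot(&mut self, module: u32) -> Option<usize> {
        // 1. The most recently used slot affine to `module`, if one exists.
        if let Some(pos) = self
            .unused
            .iter()
            .rposition(|&i| self.slots[i] == Slot::Warm(module))
        {
            return self.take(pos, module);
        }
        // Count unused warm slots; none are affine to `module` at this point,
        // so all of them are affine to other modules.
        let warm_other = self
            .unused
            .iter()
            .filter(|&&i| matches!(self.slots[i], Slot::Warm(_)))
            .count();
        // 2. Above the threshold N: reclaim the least recently used warm slot.
        if warm_other > self.max_unused_warm_slots {
            let pos = self
                .unused
                .iter()
                .position(|&i| matches!(self.slots[i], Slot::Warm(_)))?;
            return self.take(pos, module);
        }
        // 3. Otherwise prefer a cold slot, keeping warm slots cached.
        if let Some(pos) = self.unused.iter().position(|&i| self.slots[i] == Slot::Cold) {
            return self.take(pos, module);
        }
        // No cold slots left: fall back to the least recently used slot.
        if self.unused.is_empty() {
            None
        } else {
            self.take(0, module)
        }
    }

    fn take(&mut self, pos: usize, module: u32) -> Option<usize> {
        let idx = self.unused.remove(pos)?;
        self.slots[idx] = Slot::Warm(module);
        Some(idx)
    }

    /// Return a slot to the pool; it stays warm and affine to its last module.
    fn free(&mut self, idx: usize) {
        self.unused.push_back(idx);
    }
}

fn main() {
    let mut pool = SlotPool::new(4, 1);
    let a = pool.pick_slot(1).unwrap(); // a cold slot for module 1
    pool.free(a);
    // Step 1: module 1 gets its affine slot back.
    assert_eq!(pool.pick_slot(1), Some(a));
    pool.free(a);
    // Module 2 has no affine slot, and only one warm slot exists (<= N = 1),
    // so step 3 picks a cold slot, leaving module 1's warm slot cached.
    let b = pool.pick_slot(2).unwrap();
    assert_ne!(a, b);
}
```

With `max_unused_warm_slots = 0` this degenerates into the old "next available" behavior (warm slots are always reclaimed first), and with a very large N it approximates "reuse affinity" (cold slots are always preferred), matching the equivalences stated in the commit message.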
@@ -1712,17 +1712,61 @@ pub struct PoolingAllocationConfig {
     config: wasmtime_runtime::PoolingInstanceAllocatorConfig,
 }
 
-#[cfg(feature = "pooling-allocator")]
-pub use wasmtime_runtime::PoolingAllocationStrategy;
-
 #[cfg(feature = "pooling-allocator")]
 impl PoolingAllocationConfig {
-    /// Configures the method by which slots in the pooling allocator are
-    /// allocated to instances
+    /// Configures the maximum number of "unused warm slots" to retain in the
+    /// pooling allocator.
     ///
-    /// This defaults to [`PoolingAllocationStrategy::ReuseAffinity`].
-    pub fn strategy(&mut self, strategy: PoolingAllocationStrategy) -> &mut Self {
-        self.config.strategy = strategy;
+    /// The pooling allocator operates over slots to allocate from, and each
+    /// slot is considered "cold" if it's never been used before or "warm" if
+    /// it's been used by some module in the past. Slots in the pooling
+    /// allocator additionally track an "affinity" flag to a particular core
+    /// wasm module. When a module is instantiated into a slot then the slot is
+    /// considered affine to that module, even after the instance has been
+    /// deallocated.
+    ///
+    /// When a new instance is created then a slot must be chosen, and the
+    /// current algorithm for selecting a slot is:
+    ///
+    /// * If there are slots that are affine to the module being instantiated,
+    ///   then the most recently used slot is selected to be allocated from.
+    ///   This is done to improve reuse of resources such as memory mappings
+    ///   and additionally try to benefit from temporal locality for things
+    ///   like caches.
+    ///
+    /// * Otherwise if there are more than N affine slots to other modules,
+    ///   then one of those affine slots is chosen to be allocated. The slot
+    ///   chosen is picked on a least-recently-used basis.
+    ///
+    /// * Finally, if there are fewer than N affine slots to other modules,
+    ///   then the non-affine slots are allocated from.
+    ///
+    /// This setting, `max_unused_warm_slots`, is the value for N in the above
+    /// algorithm. The purpose of this setting is to have a knob over the RSS
+    /// impact of "unused slots" for a long-running wasm server.
+    ///
+    /// If this setting is set to 0, for example, then affine slots are
+    /// aggressively reused on a least-recently-used basis. A "cold" slot is
+    /// only used if there are no affine slots available to allocate from. This
+    /// means that the set of slots used over the lifetime of a program is the
+    /// same as the maximum concurrent number of wasm instances.
+    ///
+    /// If this setting is set to infinity, however, then cold slots are
+    /// prioritized to be allocated from. This means that the set of slots used
+    /// over the lifetime of a program will approach
+    /// [`PoolingAllocationConfig::instance_count`], or the maximum number of
+    /// slots in the pooling allocator.
+    ///
+    /// Wasmtime does not aggressively decommit all resources associated with a
+    /// slot when the slot is not in use. For example the
+    /// [`PoolingAllocationConfig::linear_memory_keep_resident`] option can be
+    /// used to keep memory associated with a slot, even when it's not in use.
+    /// This means that the total set of used slots in the pooling instance
+    /// allocator can impact the overall RSS usage of a program.
+    ///
+    /// The default value for this option is 100.
+    pub fn max_unused_warm_slots(&mut self, max: u32) -> &mut Self {
+        self.config.max_unused_warm_slots = max;
         self
     }
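From an embedder's perspective, the new knob would be set roughly as follows. This is a hedged sketch assuming a `wasmtime` crate version that includes this change; the threshold value `50` is purely illustrative.

```rust
use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig};

fn main() -> anyhow::Result<()> {
    // Configure the pooling allocator to keep at most 50 warm-but-unused
    // slots cached before reclaiming them on a least-recently-used basis.
    let mut pool = PoolingAllocationConfig::default();
    pool.max_unused_warm_slots(50);

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));
    let _engine = Engine::new(&config)?;
    Ok(())
}
```

Lower values bound RSS more tightly (fewer retained warm slots), at the cost of more cache-miss instantiations for modules whose affine slots were reclaimed.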