This latest refactor adds "extractor macros" in place of the
very-confusing-even-to-the-DSL-author reverse-rules-as-extractors
concept. The old approach was beautifully symmetric but just too
mind-bending to be practical.
It also adds argument polarity to external extractors. This is inspired
by Prolog's similar notion (see e.g. the "+x" vs. "-x" argument notation
in library documentation) where the unification-based semantics allow
for bidirectional flow through arguments. We don't want polymorphism
or dynamism w.r.t. directions/polarities here: the polarities are
static. It is still useful, though, to be able to feed values *into* an
extractor (aside from the one value being extracted). Semantically this
still corresponds to a term-rewriting/value-equivalence world since we
can still translate all of this to a list of equality constraints.
To make that work, this change also adds expressions into patterns,
specifically only for extractor "input" args. This required quite a bit
of internal refactoring but is only a small addition to the language
semantics.
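As a rough illustration of what an "input" argument means on the Rust
side, here is a minimal sketch. The names `fits_in_bits` and
`lower_add_imm` are hypothetical and not part of this change; the
sketch only assumes that an external extractor can be modeled as a
function from the extracted value plus its input args to an `Option`
of its output args.

```rust
/// Hypothetical external extractor with one input arg and one output arg.
///
/// * `value` is the value being extracted (always flows in).
/// * `bits` is an *input* argument: the rule supplies it.
/// * The returned `u64` is the *output* argument, bound on success.
pub fn fits_in_bits(value: u64, bits: u32) -> Option<u64> {
    // Guard `bits >= 64` first so the shift below cannot overflow.
    if bits >= 64 || (value >> bits) == 0 {
        Some(value)
    } else {
        None
    }
}

/// Hand-written stand-in for what a rule using the extractor could
/// compile to. Read as equality constraints, the match is:
///   bits == 12
///   Some(imm12) == fits_in_bits(imm, bits)
pub fn lower_add_imm(imm: u64) -> Option<String> {
    // The literal `12` plays the role of an expression appearing in
    // the pattern, in an extractor input position.
    let imm12 = fits_in_bits(imm, 12)?;
    Some(format!("add_imm12 {}", imm12))
}
```

The polarity is fixed by the signature: nothing here decides at match
time which arguments are inputs and which are outputs.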
I plan to build out the little instruction-selector sketch further but
the one that is here (in `test3.isle`) is starting to get interesting
already with the current DSL semantics.
See the long block comment in `codegen.rs`. In brief, I think we actually want
to compile to a trie with priority-intervals, a sort of hybrid of a
priority tree and a trie representing decisions keyed on match-ops
(PatternInsts).
The reasons are:
1. The lexicographic ordering that underlies the FSM-building in the
Peepmatic view of the problem is inherently limited w.r.t. our notion
of rule priorities. See the example in the block comment.
2. While the FSM is nice for interpreter-based execution, when compiling
to a language with structured control flow, what we really want is a
tree; otherwise, if we want to form DAGs to share substructure, we
need something like a "diamond-recovery" algorithm that finds common
suffixes of *input match-op sequences*, and then we need to
incorporate something like phi-nodes in order to allow captures from
either side of the diamond to be used.
3. One of the main advantages of the automaton/transducer approach,
namely sharing suffixes of the *output* sequence (emitting partial
output at each state transition), is unfortunately not applicable if
we allow the overall function to be partial. In that case, there is
always the possibility that we fail at the last match op, so we
cannot allow any external constructors to be called until we reach
the final state anyway.
4. Pragmatically, I found I was having to significantly edit the
peepmatic_automata implementation to adapt to this use-case
(compilation to Rust), and it seemed more practical to design the
data structure we want than to try to shoehorn the existing thing
into the new problem.
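To make the "trie with priority-intervals" idea concrete, here is a
small self-contained sketch. It is not the data structure in
`codegen.rs`; the type names, the three `MatchOp` variants, and the
merge condition are all assumptions chosen for illustration. Each edge
carries the priority interval of the rules below it, and an edge is
only shared between rules if widening its interval would not straddle
a sibling edge.

```rust
/// Hypothetical match ops, standing in for PatternInsts.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MatchOp {
    IsAdd,
    IsMul,
    RhsIsConst,
}

#[derive(Debug)]
pub enum EdgeKind {
    /// Test `MatchOp`; on success continue in the child node.
    Op(MatchOp, TrieNode),
    /// End of a rule's match-op sequence: the named rule fires.
    Accept(&'static str),
}

/// An edge covering all rules with priority in `lo..=hi`.
#[derive(Debug)]
pub struct Edge {
    pub lo: i32,
    pub hi: i32,
    pub kind: EdgeKind,
}

#[derive(Debug, Default)]
pub struct TrieNode {
    pub edges: Vec<Edge>,
}

impl TrieNode {
    /// Widening edge `idx` to include `prio` is allowed only if the new
    /// interval does not straddle any sibling (touching endpoints is fine).
    fn can_widen(&self, idx: usize, prio: i32) -> bool {
        let lo = self.edges[idx].lo.min(prio);
        let hi = self.edges[idx].hi.max(prio);
        self.edges
            .iter()
            .enumerate()
            .all(|(i, e)| i == idx || e.hi <= lo || e.lo >= hi)
    }

    pub fn insert(&mut self, ops: &[MatchOp], prio: i32, rule: &'static str) {
        match ops.split_first() {
            None => self.edges.push(Edge { lo: prio, hi: prio, kind: EdgeKind::Accept(rule) }),
            Some((&op, rest)) => {
                // Reuse an edge with the same op if its interval can absorb `prio`.
                let found = (0..self.edges.len()).find(|&i| {
                    matches!(&self.edges[i].kind, EdgeKind::Op(o, _) if *o == op)
                        && self.can_widen(i, prio)
                });
                let idx = match found {
                    Some(i) => i,
                    None => {
                        let kind = EdgeKind::Op(op, TrieNode::default());
                        self.edges.push(Edge { lo: prio, hi: prio, kind });
                        self.edges.len() - 1
                    }
                };
                let edge = &mut self.edges[idx];
                edge.lo = edge.lo.min(prio);
                edge.hi = edge.hi.max(prio);
                if let EdgeKind::Op(_, child) = &mut edge.kind {
                    child.insert(rest, prio, rule);
                }
            }
        }
        // Highest-priority intervals are tried first.
        self.edges.sort_by(|a, b| (b.hi, b.lo).cmp(&(a.hi, a.lo)));
    }

    /// `holds` lists the match ops that succeed on the input. Edges are
    /// tried in priority order, backtracking when a subtree fails.
    pub fn lookup(&self, holds: &[MatchOp]) -> Option<&'static str> {
        for edge in &self.edges {
            match &edge.kind {
                EdgeKind::Accept(rule) => return Some(*rule),
                EdgeKind::Op(op, child) if holds.contains(op) => {
                    if let Some(rule) = child.lookup(holds) {
                        return Some(rule);
                    }
                }
                EdgeKind::Op(..) => {}
            }
        }
        None
    }
}
```

With rules `[IsAdd, RhsIsConst]` at priority 10, `[IsMul]` at 5, and
`[IsAdd]` at 0, the root ends up with three edges: `IsAdd` is tested
twice because the priority-5 rule sits between the two `IsAdd` rules.
If the `IsMul` rule were at priority -1 instead, the two `IsAdd` rules
would share a single edge with interval 0..=10. Because the result is a
tree, each subtree maps directly onto nested `if let`/`match` in the
generated Rust.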
WIP, hopefully working soon.
* Add some debug logging for timing in module compiles
This is sometimes helpful when debugging slow compiles from fuzz bugs or
similar.
* Fix total duration calculation to not double-count
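The commits above do not say how the double-count arose. One common
cause is summing nested phases together with the phases that contain
them, and the sketch below shows one way to avoid that. `PhaseTimings`
and its methods are hypothetical names, and `eprintln!` stands in for
whatever logging facility the real code uses.

```rust
use std::time::{Duration, Instant};

/// One recorded phase: name, nesting depth (0 = top level), elapsed time.
#[derive(Debug, Clone)]
pub struct Phase {
    pub name: String,
    pub depth: usize,
    pub elapsed: Duration,
}

#[derive(Debug, Default)]
pub struct PhaseTimings {
    pub phases: Vec<Phase>,
    depth: usize,
}

impl PhaseTimings {
    /// Time `f`, which may itself call `time` for nested phases.
    pub fn time<T>(&mut self, name: &str, f: impl FnOnce(&mut Self) -> T) -> T {
        let depth = self.depth;
        self.depth += 1;
        let start = Instant::now();
        let out = f(self);
        let elapsed = start.elapsed();
        self.depth = depth;
        self.record(name, depth, elapsed);
        out
    }

    /// Log and store a phase with an already-measured duration.
    pub fn record(&mut self, name: &str, depth: usize, elapsed: Duration) {
        // Indent nested phases by two spaces per level.
        eprintln!("{:indent$}{}: {:?}", "", name, elapsed, indent = depth * 2);
        self.phases.push(Phase { name: name.to_string(), depth, elapsed });
    }

    /// Sum only top-level phases: nested phases are already included in
    /// their parent's elapsed time, so adding them again would double-count.
    pub fn total(&self) -> Duration {
        self.phases
            .iter()
            .filter(|p| p.depth == 0)
            .map(|p| p.elapsed)
            .sum()
    }
}
```

A nested phase is recorded before its parent, since the parent's timer
only stops after the closure returns.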