Commit Graph

16 Commits

Author SHA1 Message Date
Chris Fallin
7865191093 Update long block comment describing priority trie in codegen.rs. 2021-11-11 15:56:55 -08:00
Chris Fallin
b8e916a0ab Another example, testing rule priorities a bit 2021-11-11 15:56:55 -08:00
Chris Fallin
4a2cd78827 Working example and README 2021-11-11 15:56:55 -08:00
Chris Fallin
d7efd9f219 Working extractor and constructor generation from rules! 2021-11-11 15:56:55 -08:00
Chris Fallin
be1140e80a WIP. 2021-11-11 15:56:55 -08:00
Chris Fallin
cd55dc9568 WIP. 2021-11-11 15:56:55 -08:00
Chris Fallin
e5d76db97a WIP. 2021-11-11 15:56:55 -08:00
Chris Fallin
8c727b175a more codegen WIP: start to generate functions 2021-11-11 15:56:55 -08:00
Chris Fallin
638c9edd01 Support for file input and output, including multiple input files with proper position tracking. 2021-11-11 15:56:55 -08:00
Chris Fallin
e9a57d854d Generate internal enum types. 2021-11-11 15:56:54 -08:00
Chris Fallin
5aa72bc060 skeleton for codegen 2021-11-11 15:56:54 -08:00
Chris Fallin
02ec77a45b trie insertion 2021-11-11 15:56:54 -08:00
Chris Fallin
77ed861857 Start of significant rework: compile to a trie, not an FSM, and handle rule priorities appropriately.
See long block comment in codegen.rs. In brief, I think we actually want
to compile to a trie with priority-intervals, a sort of hybrid of a
priority tree and a trie representing decisions keyed on match-ops
(PatternInsts).

The reasons are:

1. The lexicographic ordering that is fundamental to the FSM-building in
   the Peepmatic view of the problem is inherently limited w.r.t. our
   notion of rule priorities. See the example in the block comment.

2. While the FSM is nice for interpreter-based execution, when compiling
   to a language with structured control flow, what we really want is a
   tree; otherwise, if we want to form DAGs to share substructure, we
   need something like a "diamond-recovery" algorithm that finds common
   suffixes of *input match-op sequences*, and then we need to
   incorporate something like phi-nodes in order to allow captures from
   either side of the diamond to be used.

3. One of the main advantages of the automaton/transducer approach,
   namely sharing suffixes of the *output* sequence (emitting partial
   output at each state transition), is unfortunately not applicable if
   we allow the overall function to be partial. In that case, there is
   always the possibility that we fail at the last match op, so we
   cannot allow any external constructors to be called until we reach
   the final state anyway.

4. Pragmatically, I found I was having to significantly edit the
   peepmatic_automata implementation to adapt to this use-case
   (compilation to Rust), and it seemed more practical to design the
   data structure we want than to try to shoehorn the existing thing
   into the new problem.

WIP, hopefully working soon.
2021-11-11 15:56:54 -08:00
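
The "trie with priority-intervals" idea from the message above can be sketched roughly as follows. This is a minimal illustration, not the code in codegen.rs: the names `MatchOp`, `Input`, `Node`, `Item`, and `build` are hypothetical, and the sketch assumes rules are inserted in descending priority order, which sidesteps the interval splitting a general insertion would need. Each edge carries the priority range of the rules beneath it, and a rule only shares an edge with the most recently added edge at a node, so sharing a prefix never reorders rules of different priority.

```rust
// Hypothetical match-ops (the commit calls these PatternInsts).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MatchOp { IsAdd, RhsIsZero, RhsIsConst }

// Hypothetical input being matched against.
#[derive(Clone, Copy)]
struct Input { is_add: bool, rhs: Option<i64> }

impl MatchOp {
    fn matches(self, input: Input) -> bool {
        match self {
            MatchOp::IsAdd => input.is_add,
            MatchOp::RhsIsZero => input.rhs == Some(0),
            MatchOp::RhsIsConst => input.rhs.is_some(),
        }
    }
}

// `prio` and `hi` are recorded for illustration but not read by `lookup`.
#[allow(dead_code)]
enum Item {
    // A rule whose match-ops have all succeeded.
    Leaf { prio: i64, rule: usize },
    // A decision on `op`, covering rules with priorities in lo..=hi.
    Edge { op: MatchOp, lo: i64, hi: i64, child: Node },
}

// Items are kept in descending priority order.
#[derive(Default)]
struct Node { items: Vec<Item> }

impl Node {
    // Precondition (assumed by this sketch): calls arrive in descending
    // priority order, so only the last edge is a candidate for sharing.
    fn insert(&mut self, prio: i64, ops: &[MatchOp], rule: usize) {
        let Some((&first, rest)) = ops.split_first() else {
            self.items.push(Item::Leaf { prio, rule });
            return;
        };
        if let Some(Item::Edge { op, lo, child, .. }) = self.items.last_mut() {
            if *op == first {
                *lo = (*lo).min(prio); // widen the interval downward
                child.insert(prio, rest, rule);
                return;
            }
        }
        let mut child = Node::default();
        child.insert(prio, rest, rule);
        self.items.push(Item::Edge { op: first, lo: prio, hi: prio, child });
    }

    // Try items from highest to lowest priority, backtracking when a
    // subtree fails. Nothing is "emitted" until a leaf is reached.
    fn lookup(&self, input: Input) -> Option<usize> {
        for item in &self.items {
            match item {
                Item::Leaf { rule, .. } => return Some(*rule),
                Item::Edge { op, child, .. } => {
                    if op.matches(input) {
                        if let Some(rule) = child.lookup(input) {
                            return Some(rule);
                        }
                    }
                }
            }
        }
        None
    }
}

// Build a trie from (priority, match-ops) pairs; the rule id is the index.
fn build(rules: &[(i64, Vec<MatchOp>)]) -> Node {
    let mut order: Vec<usize> = (0..rules.len()).collect();
    order.sort_by_key(|&i| std::cmp::Reverse(rules[i].0));
    let mut root = Node::default();
    for i in order {
        root.insert(rules[i].0, &rules[i].1, i);
    }
    root
}
```

In this sketch, three rules that all start with `IsAdd` at priorities 10, 5, and 1 end up under a single root edge covering 1..=10. If the priority-5 rule starts with a different op instead, the root gets three edges (`IsAdd` at 10, the other op at 5, `IsAdd` at 1), because merging the two `IsAdd` edges would let the priority-1 rule be tried before the priority-5 one.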
Chris Fallin
f2399c5384 WIP -- more thinking about how to work priorities into FSM 2021-11-11 15:56:54 -08:00
Chris Fallin
6a567924cd WIP 2021-11-11 15:56:54 -08:00
Chris Fallin
e08160845e WIP: rip out a bunch of stuff and rework 2021-11-11 15:56:54 -08:00