diff --git a/cranelift/docs/compare-llvm.rst b/cranelift/docs/compare-llvm.rst
index 6ca562bcd3..e23f9f7890 100644
--- a/cranelift/docs/compare-llvm.rst
+++ b/cranelift/docs/compare-llvm.rst
@@ -21,18 +21,7 @@ highlighting some of the differences and similarities. Both projects:
 - Can target multiple ISAs.
 - Can cross-compile by default without rebuilding the code generator.
 
-Cranelift's scope is much smaller than that of LLVM. The classical three main
-parts of a compiler are:
-
-1. The language-dependent front end parses and type-checks the input program.
-2. Common optimizations that are independent of both the input language and the
-   target ISA.
-3. The code generator which depends strongly on the target ISA.
-
-LLVM provides both common optimizations *and* a code generator. Cranelift only
-provides the last part, the code generator. LLVM additionally provides
-infrastructure for building assemblers and disassemblers. Cranelift does not
-handle assembly at all---it only generates binary machine code.
+However, there are also some major differences, described in the following sections.
 
 Intermediate representations
 ============================
@@ -103,7 +92,24 @@ smaller scope.
   is annotated with an assigned ISA register or stack slot.
 
 The Cranelift intermediate representation is similar to LLVM IR, but at a slightly
-lower level of abstraction.
+lower level of abstraction, to allow it to be used all the way through the
+codegen process.
+
+This design tradeoff does mean that Cranelift IR is less friendly for mid-level
+optimizations. Cranelift doesn't currently perform mid-level optimizations;
+however, if it grows to where this becomes important, the vision is that
+Cranelift would add a separate IR layer, or possibly a separate IR, to support
+this. Instead of frontends producing optimizer IR which is then translated to
+codegen IR, Cranelift would have frontends produce codegen IR, which can be
+translated to optimizer IR and back.
+
+This biases the overall system towards fast compilation when mid-level
+optimization is not needed, such as when emitting unoptimized code, or when
+low-level optimizations are sufficient.
+
+It also removes some constraints in the mid-level optimizer IR design space,
+making it more feasible to consider ideas such as using a
+`VSDG-based IR <https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-705.pdf>`_.
 
 Program structure
 -----------------
@@ -112,10 +118,15 @@ In LLVM IR, the largest representable unit is the *module* which corresponds
 more or less to a C translation unit. It is a collection of functions and
 global variables that may contain references to external symbols too.
 
-In Cranelift IR, the largest representable unit is the *function*. This is so
-that functions can easily be compiled in parallel without worrying about
-references to shared data structures. Cranelift does not have any
-inter-procedural optimizations like inlining.
+In `Cranelift's IR`_,
+used by the `cranelift-codegen`_ crate,
+functions are self-contained, allowing them to be compiled independently. At
+this level, there is no explicit module that contains the functions.
+
+Module functionality in Cranelift is provided as an optional library layer, in
+the `cranelift-module`_ crate. It provides
+facilities for working with modules, which can contain multiple functions as
+well as data objects, and for linking them together.
 
 An LLVM IR function is a graph of *basic blocks*.
 A Cranelift IR function is a graph of *extended basic blocks* that may contain internal branch instructions.
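To illustrate that last point, here is a rough sketch of a Cranelift IR function in its textual form. The function name ``%select_add`` and its body are hypothetical and the syntax is only approximate; the point is that the function is self-contained, and its first extended basic block contains an internal branch (``brz``) followed by further instructions in the same block::

    function %select_add(i32, i32, i32) -> i32 {
    ebb0(v0: i32, v1: i32, v2: i32):
        ; Internal branch: if v0 is zero, control transfers to ebb1;
        ; otherwise execution falls through within this same extended
        ; basic block.
        brz v0, ebb1
        v3 = iadd v1, v2
        return v3

    ebb1:
        return v1
    }

Because the function carries everything it needs, with no enclosing module required at this level, functions like this one can be compiled independently.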