Settings can be defined globally or per-ISA. They are available to code through a generated Settings struct with accessor methods per setting.
389 lines
14 KiB
ReStructuredText
********************************
Cretonne Meta Language Reference
********************************

.. default-domain:: py
.. highlight:: python
.. module:: cretonne

The Cretonne meta language is used to define instructions for Cretonne. It is a
domain-specific language embedded in Python. This document describes the Python
modules that form the embedded DSL.

The meta language descriptions are Python modules under the :file:`meta`
top-level directory. The descriptions are processed in two steps:

1. The Python modules are imported. This has the effect of building static data
   structures in global variables in the modules. These static data structures
   use the classes in the :mod:`cretonne` module to describe instruction sets
   and other properties.

2. The static data structures are processed to produce Rust source code and
   constant tables.

The main driver for this source code generation process is the
:file:`meta/build.py` script which is invoked as part of the build process if
anything in the :file:`meta` directory has changed since the last build.


Settings
========

Settings are used by the environment embedding Cretonne to control the details
of code generation. Each setting is defined in the meta language so a compact
and consistent Rust representation can be generated. Shared settings are defined
in the :mod:`cretonne.settings` module. Some settings are specific to a target
ISA, and are defined in a `settings` module under the appropriate
:file:`meta/isa/*` directory.

Settings can take boolean on/off values, small numbers, or explicitly enumerated
symbolic values. Each type is represented by a sub-class of :class:`Setting`:

.. inheritance-diagram:: Setting BoolSetting NumSetting EnumSetting
    :parts: 1

.. autoclass:: Setting
.. autoclass:: BoolSetting
.. autoclass:: NumSetting
.. autoclass:: EnumSetting

All settings must belong to a *group*, represented by a :class:`SettingGroup`
object.

.. autoclass:: SettingGroup

Normally, a setting group corresponds to all settings defined in a module. Such
a module looks like this::

    group = SettingGroup('example')

    foo = BoolSetting('use the foo')
    bar = BoolSetting('enable bars', True)
    opt = EnumSetting('optimization level', 'Debug', 'Release')

    group.close(globals())

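
To make the open/close pattern concrete, here is a minimal, self-contained sketch of how a setting group could collect settings as they are created and name them from a namespace when it is closed. The classes below are hypothetical stand-ins written for illustration, not the actual :mod:`cretonne` implementation; in particular, treating the first enumerated value as the default is an assumption.

```python
# Hypothetical stand-ins for the cretonne classes; the real ones may differ.
class SettingGroup:
    _current = None  # the group that is currently open, if any

    def __init__(self, name):
        self.name = name
        self.settings = []
        SettingGroup._current = self

    def close(self, namespace):
        """Name each setting after the variable bound to it, then close."""
        for var_name, obj in namespace.items():
            if isinstance(obj, Setting) and obj in self.settings:
                obj.name = var_name
        SettingGroup._current = None


class Setting:
    def __init__(self, doc):
        self.name = None  # filled in by SettingGroup.close()
        self.doc = doc
        assert SettingGroup._current is not None, "no open setting group"
        SettingGroup._current.settings.append(self)


class BoolSetting(Setting):
    def __init__(self, doc, default=False):
        super().__init__(doc)
        self.default = default


class EnumSetting(Setting):
    def __init__(self, doc, *values):
        super().__init__(doc)
        self.values = values
        self.default = values[0]  # assumption: first value is the default


group = SettingGroup('example')
foo = BoolSetting('use the foo')
bar = BoolSetting('enable bars', True)
opt = EnumSetting('optimization level', 'Debug', 'Release')

# A settings module would pass globals() here; an explicit dict keeps the
# sketch independent of where it is executed.
group.close({'foo': foo, 'bar': bar, 'opt': opt})

names = [s.name for s in group.settings]
```

Because the settings register themselves with the open group, the module body never has to repeat a setting's name as a string; the variable name is the single source of truth.
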

Instruction descriptions
========================

New instructions are defined as instances of the :class:`Instruction`
class. As instruction instances are created, they are added to the currently
open :class:`InstructionGroup`.

.. autoclass:: InstructionGroup
    :members:

The basic Cretonne instruction set described in :doc:`langref` is defined by the
Python module :mod:`cretonne.base`. This module has a global variable
:data:`cretonne.base.instructions` which is an :class:`InstructionGroup`
instance containing all the base instructions.

.. autoclass:: Instruction

An instruction is defined with a set of distinct input and output operands which
must be instances of the :class:`Operand` class.

.. autoclass:: Operand

Cretonne uses two separate type systems for operand kinds and SSA values.

Type variables
--------------

Instruction descriptions can be made polymorphic by using :class:`Operand`
instances that refer to a *type variable* instead of a concrete value type.
Polymorphism only works for SSA value operands. Other operands have a fixed
operand kind.

.. autoclass:: TypeVar
    :members:

If multiple operands refer to the same type variable they will be required to
have the same concrete type. For example, this defines an integer addition
instruction::

    Int = TypeVar('Int', 'A scalar or vector integer type', ints=True, simd=True)
    a = Operand('a', Int)
    x = Operand('x', Int)
    y = Operand('y', Int)

    iadd = Instruction('iadd', 'Integer addition', ins=(x, y), outs=a)

The type variable `Int` is allowed to vary over all scalar and vector integer
value types, but in a given instance of the `iadd` instruction, the two
operands must have the same type, and the result will be the same type as the
inputs.

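
The "same type variable, same concrete type" rule can be illustrated with a small stand-alone check. This is only a sketch under simplified assumptions: operands and types are plain strings, and ``bind_type_vars`` is a hypothetical helper, not part of the meta language.

```python
def bind_type_vars(operand_typevars, concrete_types):
    """Bind each type variable to one concrete type.

    operand_typevars maps operand name -> type variable name.
    concrete_types maps operand name -> concrete type name.
    Raises TypeError if two operands sharing a type variable disagree.
    """
    bindings = {}
    for operand, typevar in operand_typevars.items():
        ty = concrete_types[operand]
        # The first operand seen fixes the binding; later ones must agree.
        bound = bindings.setdefault(typevar, ty)
        if bound != ty:
            raise TypeError(
                f"operand {operand!r} has type {ty}, but {typevar} is "
                f"already bound to {bound}")
    return bindings


# All three iadd operands refer to the same type variable.
iadd_operands = {'x': 'Int', 'y': 'Int', 'a': 'Int'}

ok = bind_type_vars(iadd_operands, {'x': 'i32', 'y': 'i32', 'a': 'i32'})

try:
    bind_type_vars(iadd_operands, {'x': 'i32', 'y': 'i64', 'a': 'i32'})
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```

Here the first call succeeds with ``Int`` bound to ``i32``, while the second is rejected because ``x`` and ``y`` disagree.
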

There are some practical restrictions on the use of type variables; see
:ref:`restricted-polymorphism`.

Immediate operands
------------------

Immediate instruction operands don't correspond to SSA values, but have values
that are encoded directly in the instruction. Immediate operands don't
have types from the :class:`cretonne.ValueType` type system; they often have
enumerated values of a specific type. The type of an immediate operand is
indicated with an instance of :class:`ImmediateKind`.

.. autoclass:: ImmediateKind

.. automodule:: cretonne.immediates
    :members:

.. currentmodule:: cretonne

Entity references
-----------------

Instruction operands can also refer to other entities in the same function. These
can be extended basic blocks, or entities declared in the function preamble.

.. autoclass:: EntityRefKind

.. automodule:: cretonne.entities
    :members:

.. currentmodule:: cretonne

Value types
-----------

Concrete value types are represented as instances of :class:`cretonne.ValueType`. There are
subclasses to represent scalar and vector types.

.. autoclass:: ValueType
.. inheritance-diagram:: ValueType ScalarType VectorType IntType FloatType BoolType
    :parts: 1
.. autoclass:: ScalarType
    :members:
.. autoclass:: VectorType
    :members:
.. autoclass:: IntType
    :members:
.. autoclass:: FloatType
    :members:
.. autoclass:: BoolType
    :members:

.. automodule:: cretonne.types
    :members:

.. currentmodule:: cretonne

There are no predefined vector types, but they can be created as needed with
the :func:`ScalarType.by` function.

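
As a rough illustration of creating vector types on demand, the following sketch models a scalar type with a ``by`` method that returns a vector type with the requested number of lanes. The classes are simplified stand-ins, and the ``<scalar>x<lanes>`` naming and the ``bits`` attribute are assumptions made for the example rather than a description of the real :class:`ScalarType`.

```python
class ScalarType:
    """Simplified stand-in for a scalar value type."""

    def __init__(self, name, bits):
        self.name = name
        self.bits = bits

    def by(self, lanes):
        """Return a vector type with `lanes` lanes of this scalar type."""
        return VectorType(self, lanes)


class VectorType:
    """Simplified stand-in for a SIMD vector value type."""

    def __init__(self, base, lanes):
        self.base = base
        self.lanes = lanes
        # Assumed naming convention: scalar name, 'x', lane count.
        self.name = f"{base.name}x{lanes}"
        self.bits = base.bits * lanes


i32 = ScalarType('i32', 32)
i32x4 = i32.by(4)  # a 128-bit vector of four 32-bit integers
```
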

Instruction representation
==========================

The Rust in-memory representation of instructions is derived from the
instruction descriptions. Part of the representation is generated, and part is
written as Rust code in the `cretonne.instructions` module. The instruction
representation depends on the input operand kinds and whether the instruction
can produce multiple results.

.. autoclass:: OperandKind
.. inheritance-diagram:: OperandKind ImmediateKind EntityRefKind

Since all SSA value operands are represented as a `Value` in Rust code, value
types don't affect the representation. Two special operand kinds are used to
represent SSA values:

.. autodata:: value
.. autodata:: variable_args

When an instruction description is created, it is automatically assigned a
predefined instruction format which is an instance of
:class:`InstructionFormat`:

.. autoclass:: InstructionFormat


.. _restricted-polymorphism:

Restricted polymorphism
-----------------------

The instruction format strictly controls the kinds of operands on an
instruction, but it does not constrain value types at all. A given instruction
description typically does constrain the allowed value types for its value
operands. The type variables give a lot of freedom in describing the value type
constraints, in practice more freedom than what is needed for normal instruction
set architectures. In order to simplify the Rust representation of value type
constraints, some restrictions are imposed on the use of type variables.

A polymorphic instruction has a single *controlling type variable*. For a given
opcode, this type variable must be the type of the first result or the type of
the input value operand designated by the `typevar_operand` argument to the
:py:class:`InstructionFormat` constructor. By default, this is the first value
operand, which works most of the time.

The value types of instruction results must be one of the following:

1. A concrete value type.
2. The controlling type variable.
3. A type variable derived from the controlling type variable.

This means that all result types can be computed from the controlling type
variable.

Input values to the instruction are allowed a bit more freedom. Input value
types must be one of:

1. A concrete value type.
2. The controlling type variable.
3. A type variable derived from the controlling type variable.
4. A free type variable that is not used by any other operands.

This means that the type of an input operand can either be computed from the
controlling type variable, or it can vary independently of the other operands.

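
The result and input rules above can be expressed as a small validity check. This is a sketch only: the tuple encoding of operand types and the ``check_polymorphism`` function are hypothetical, and the check does not model how the controlling type variable is chosen from the first result or the ``typevar_operand``.

```python
from collections import Counter


def check_polymorphism(ctrl, ins, outs):
    """Return a list of rule violations (empty if the description is valid).

    `ctrl` is the name of the controlling type variable. Each operand type
    is a tuple:
      ('concrete', name)       a concrete value type
      ('var', name)            a type variable
      ('derived', base_name)   a type variable derived from `base_name`
    """
    # Count how many operands mention each type variable.
    uses = Counter(t[1] for t in ins + outs if t[0] in ('var', 'derived'))
    errors = []

    def from_ctrl(t):
        # Rules 1-3: concrete, the controlling variable, or derived from it.
        return t[0] == 'concrete' or t[1] == ctrl

    for i, t in enumerate(outs):
        if not from_ctrl(t):
            errors.append(f"result {i}: not computable from {ctrl}")
    for i, t in enumerate(ins):
        # Rule 4 (inputs only): a type variable used by no other operand.
        free = t[0] == 'var' and uses[t[1]] == 1
        if not (from_ctrl(t) or free):
            errors.append(f"input {i}: neither derived from {ctrl} nor free")
    return errors


T = ('var', 'T')
U = ('var', 'U')

valid = check_polymorphism('T', ins=[T, T], outs=[T])    # iadd-like
free_input = check_polymorphism('T', ins=[U], outs=[T])  # U is free
bad_result = check_polymorphism('T', ins=[T], outs=[U])  # result uses U
shared_free = check_polymorphism('T', ins=[U, U], outs=[T])  # U used twice
```

The first two descriptions pass, the third is rejected because a result type cannot be computed from ``T``, and the fourth is rejected because ``U`` is shared between two inputs and so is no longer free.
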


Encodings
=========

Encodings describe how Cretonne instructions are mapped to binary machine code
for the target architecture. After the legalization pass, all remaining
instructions are expected to map 1-1 to native instruction encodings. Cretonne
instructions that can't be encoded for the current architecture are called
:term:`illegal instruction`\s.

Some instruction set architectures have different :term:`CPU mode`\s with
incompatible encodings. For example, a modern ARMv8 CPU might support three
different CPU modes: *A64* where instructions are encoded in 32 bits, *A32*
where all instructions are 32 bits, and *T32* which has a mix of 16-bit and
32-bit instruction encodings. These are incompatible encoding spaces, and while
an :cton:inst:`iadd` instruction can be encoded in 32 bits in each of them, it's
not the same 32 bits. It's a judgement call if CPU modes should be modelled as
separate targets, or as sub-modes of the same target. In the ARMv8 case, the
different register banks mean that it makes sense to model A64 as a separate
target architecture, while A32 and T32 are CPU modes of the 32-bit ARM target.

In a given CPU mode, there may be multiple valid encodings of the same
instruction. Both RISC-V and ARMv8's T32 mode have 32-bit encodings of all
instructions with 16-bit encodings available for some opcodes if certain
constraints are satisfied.

.. autoclass:: CPUMode

Encodings are guarded by :term:`sub-target predicate`\s. For example, the RISC-V
"C" extension which specifies the compressed encodings may not be supported, and
a predicate would be used to disable all of the 16-bit encodings in that case.
This can also affect whether an instruction is legal. For example, x86 has a
predicate that controls the SSE 4.1 instruction encodings. When that predicate
is false, the SSE 4.1 instructions are not available.

Encodings also have an :term:`instruction predicate` which depends on the
specific values of the instruction's immediate fields. This is used to ensure
that immediate address offsets are within range, for example. The instructions
in the base Cretonne instruction set can often represent a wider range of
immediates than any specific encoding. The fixed-size RISC-style encodings tend
to have more range limitations than CISC-style variable length encodings like
x86.

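
A typical instruction predicate is a range check on an immediate field. The helper below is a minimal sketch of such a check; ``is_signed_int`` is a name chosen for this example, and the 12-bit width is an arbitrary illustration rather than a property of any particular encoding described here.

```python
def is_signed_int(value, bits):
    """True if `value` fits in a two's complement integer of `bits` bits."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


# A hypothetical encoding with a 12-bit signed offset field accepts
# offsets in the range -2048 to 2047 inclusive.
in_range = is_signed_int(2047, 12)
too_large = is_signed_int(2048, 12)
most_negative = is_signed_int(-2048, 12)
too_small = is_signed_int(-2049, 12)
```

An instruction whose offset fails the check would need a different encoding, or would have to be rewritten before it can be encoded.
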

The diagram below shows the relationship between the classes involved in
specifying instruction encodings:

.. digraph:: encoding

    node [shape=record]
    EncRecipe -> SubtargetPred
    EncRecipe -> InstrFormat
    EncRecipe -> InstrPred
    Encoding [label="{Encoding|Opcode+TypeVars}"]
    Encoding -> EncRecipe [label="+EncBits"]
    Encoding -> CPUMode
    Encoding -> SubtargetPred
    Encoding -> InstrPred
    Encoding -> Opcode
    Opcode -> InstrFormat
    CPUMode -> Target

An :py:class:`Encoding` instance specifies the encoding of a concrete
instruction. The following properties are used to select instructions to be
encoded:

- An opcode, e.g. :cton:inst:`iadd_imm`, that must match the instruction's
  opcode.
- Values for any type variables if the opcode represents a polymorphic
  instruction.
- An :term:`instruction predicate` that must be satisfied by the instruction's
  immediate operands.
- The CPU mode that must be active.
- A :term:`sub-target predicate` that must be satisfied by the currently active
  sub-target.
- :term:`Register constraint`\s that must be satisfied by the instruction's value
  operands and results.

An encoding specifies an *encoding recipe* along with some *encoding bits* that
the recipe can use for native opcode fields etc. The encoding recipe has
additional constraints that must be satisfied:

- An :py:class:`InstructionFormat` that must match the format required by the
  opcodes of any encodings that use this recipe.
- An additional :term:`instruction predicate`.
- An additional :term:`sub-target predicate`.

The additional predicates in the :py:class:`EncRecipe` are merged with the
per-encoding predicates when generating the encoding matcher code. Often
encodings only need the recipe predicates.

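
The way recipe predicates combine with per-encoding predicates can be sketched with a toy matcher. Everything here is hypothetical and heavily simplified: the ``Recipe`` and ``Encoding`` dataclasses, the dictionary representation of an instruction, and the names ``RV32``, ``BinaryImm`` and ``I`` are assumptions made for illustration, and type variables and register constraints are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable


def always(_):
    """Default predicate: no constraint."""
    return True


@dataclass
class Recipe:
    name: str
    format: str                 # instruction format the recipe requires
    instp: Callable = always    # recipe-level instruction predicate
    isap: Callable = always     # recipe-level sub-target predicate


@dataclass
class Encoding:
    cpumode: str
    opcode: str
    recipe: Recipe
    encbits: int
    instp: Callable = always    # per-encoding instruction predicate
    isap: Callable = always     # per-encoding sub-target predicate

    def matches(self, inst, cpumode, isa_flags):
        """True if `inst` can use this encoding in the given context."""
        return (self.cpumode == cpumode
                and self.opcode == inst['opcode']
                and self.recipe.format == inst['format']
                # Recipe and encoding predicates are merged with AND.
                and self.instp(inst) and self.recipe.instp(inst)
                and self.isap(isa_flags) and self.recipe.isap(isa_flags))


def imm_fits_12_bits(inst):
    return -2048 <= inst['imm'] < 2048


# The encoding itself adds no predicates; it relies on the recipe's.
i_recipe = Recipe('I', format='BinaryImm', instp=imm_fits_12_bits)
enc = Encoding('RV32', 'iadd_imm', i_recipe, encbits=0b0010011)

small = {'opcode': 'iadd_imm', 'format': 'BinaryImm', 'imm': 100}
large = {'opcode': 'iadd_imm', 'format': 'BinaryImm', 'imm': 5000}

small_ok = enc.matches(small, 'RV32', {})
large_ok = enc.matches(large, 'RV32', {})
wrong_mode = enc.matches(small, 'RV64', {})
```

In this sketch the small immediate matches, the large one fails the recipe's instruction predicate, and the same instruction is rejected when a different CPU mode is active.
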

.. autoclass:: EncRecipe


Targets
=======

Cretonne can be compiled with support for multiple target instruction set
architectures. Each ISA is represented by a :py:class:`cretonne.TargetISA` instance.

.. autoclass:: TargetISA

The definitions for each supported target live in a package under
:file:`meta/isa`.

.. automodule:: isa
    :members:

.. automodule:: isa.riscv


Glossary
========

.. glossary::

    Illegal instruction
        An instruction is considered illegal if there is no encoding available
        for the current CPU mode. The legality of an instruction depends on the
        value of :term:`sub-target predicate`\s, so it can't always be
        determined ahead of time.

    CPU mode
        Every target defines one or more CPU modes that determine how the CPU
        decodes binary instructions. Some CPUs can switch modes dynamically with
        a branch instruction (like ARM/Thumb), while other modes are
        process-wide (like x86 32/64-bit).

    Sub-target predicate
        A predicate that depends on the current sub-target configuration.
        Examples are "Use SSE 4.1 instructions", "Use RISC-V compressed
        encodings". Sub-target predicates can depend on both detected CPU
        features and configuration settings.

    Instruction predicate
        A predicate that depends on the immediate fields of an instruction. An
        example is "the load address offset must be a 10-bit signed integer".
        Instruction predicates do not depend on the registers selected for value
        operands.

    Register constraint
        Value operands and results correspond to machine registers. Encodings may
        constrain operands to either a fixed register or a register class. There
        may also be register constraints between operands, for example some
        encodings require that the result register is one of the input
        registers.