Function signatures

A key job of the IREE compiler and runtime is capturing function call semantics from the originating system and providing mechanisms so that invocations can be performed in as similar a way as possible in various target languages. In general, this requires additional metadata on top of the raw characteristics of a function. Where possible, this is done by attaching attributes to a function.

  • abi : string indicating the abi/calling convention in use
  • abiv : numeric version of the abi

Each abi can require additional attributes as needed.

Generic Signature Mangling

Where possible, ABI metadata is encoded into a plain-text signature in a way that is easily transported across component boundaries and can be efficiently implemented without additional dependencies (i.e. just string manipulation).

The suggested format is manipulated via the SignatureBuilder and SignatureParser classes of the C++ reference implementation (see iree/base/signature_mangle.h). See the documentation and code for those classes for more details.

ABIs

Raw Function ABI

All exported functions implement the raw function ABI, which defines the metadata and calling convention for marshalling inputs and results to their underlying implementations.

Attributes:

  • fv = 1 (current version of the raw function ABI)
  • f = encoded raw function signature (see below)
  • fbr = result buffer allocation function name (optional)

The reflection metadata documented here augments the underlying type system such that host language bindings can interop as needed. This additional metadata is needed in most dynamic cases because the compiled assets operate on fundamental types with most characteristics type erased away (think: void* level things vs high-level ShapedBuffer level things).

Grammar

The signature is implemented in terms of the SignatureBuilder, using tagged Integer and Spans.

signature ::= 'I' length-prefixed(type-sequence)
              'R' length-prefixed(type-sequence)

type-sequence ::= (arg-result-type)*
arg-result-type ::= buffer-type | ref-object-type | unrecognized-type
buffer-type ::= 'B' length-prefixed(scalar-type? dim*)
scalar-type ::= 't' (
                    '0'  # IEEE float32 (default if not specified)
                  | '1'  # IEEE float16
                  | '2'  # IEEE float64
                  | '3'  # Google bfloat16
                  | '4'  # Signed int8
                  | '5'  # Signed int16
                  | '6'  # Signed int32
                  | '7'  # Signed int64
                  | '8'  # Unsigned int8
                  | '9'  # Unsigned int16
                  | '10' # Unsigned int32
                  | '11' # Unsigned int64
                  )
dim ::= 'd' integer  # -1 indicates a dynamic dim
ref-object-type ::= 'O' length-prefixed()  # Details TBD
unrecognized-type ::= 'U' length-prefixed()

# Lexical primitives
integer ::= -?[0-9]+
length ::= [0-9]+
# The `length` encodes the length in bytes of `production`, plus 1 for the '!'.
length-prefixed(production) ::= length '!' production
any-byte-sequence ::= <any byte sequence>
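
As an illustration of the grammar above, the following is a minimal Python sketch of an encoder and parser for raw function signatures. It is not the SignatureBuilder/SignatureParser implementation: the helper names (`span`, `read_span`, `encode_buffer`, `parse_signature`) and the tuple-based result format are hypothetical, and the sketch assumes ASCII-only signatures so that character counts equal byte counts.

```python
import re

# Scalar type codes from the 'scalar-type' production.
SCALAR_TYPES = {
    0: "float32", 1: "float16", 2: "float64", 3: "bfloat16",
    4: "int8", 5: "int16", 6: "int32", 7: "int64",
    8: "uint8", 9: "uint16", 10: "uint32", 11: "uint64",
}
SPAN_HEADER = re.compile(r"([A-Za-z])([0-9]+)!")


def span(tag, body):
    """Encodes tag + length-prefixed(body); length is len(body) + 1 for the '!'."""
    return f"{tag}{len(body) + 1}!{body}"


def encode_buffer(dims, scalar_code=None):
    """Encodes a buffer-type; omitting scalar_code leaves the float32 default."""
    body = "" if scalar_code is None else f"t{scalar_code}"
    return span("B", body + "".join(f"d{d}" for d in dims))


def encode_signature(inputs, results):
    """Joins already-encoded argument and result types into a signature."""
    return span("I", "".join(inputs)) + span("R", "".join(results))


def read_span(text, pos):
    """Reads one tagged span at pos and returns (tag, body, next_pos)."""
    match = SPAN_HEADER.match(text, pos)
    if match is None:
        raise ValueError(f"expected a tagged span at offset {pos}")
    end = match.end() + int(match.group(2)) - 1
    if end > len(text):
        raise ValueError("span length exceeds the remaining input")
    return match.group(1), text[match.end():end], end


def parse_buffer(body):
    """Returns (element_type, dims) for the body of a buffer-type."""
    element_type, dims = "float32", []
    for tag, value in re.findall(r"([td])(-?[0-9]+)", body):
        if tag == "t":
            element_type = SCALAR_TYPES[int(value)]
        else:
            dims.append(int(value))
    return element_type, dims


def parse_type_sequence(body):
    """Parses a type-sequence into a list of tuples, one per type."""
    types, pos = [], 0
    while pos < len(body):
        tag, inner, pos = read_span(body, pos)
        if tag == "B":
            types.append(("buffer",) + parse_buffer(inner))
        elif tag == "O":
            types.append(("ref_object",))
        else:
            types.append(("unrecognized",))
    return types


def parse_signature(signature):
    """Returns (inputs, results) for a complete raw function signature."""
    in_tag, in_body, pos = read_span(signature, 0)
    out_tag, out_body, pos = read_span(signature, pos)
    if (in_tag, out_tag) != ("I", "R") or pos != len(signature):
        raise ValueError("malformed signature")
    return parse_type_sequence(in_body), parse_type_sequence(out_body)
```

Under these assumptions, a function taking an int32 buffer of shape [1, ?] and returning a float32 buffer of shape [4] would be encoded by `encode_signature([encode_buffer([1, -1], 6)], [encode_buffer([4])])`, and `parse_signature` recovers the element types and dims from that string.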

Interpretation and Rationale

Memory layout

The astute reader will note that the above metadata is insufficient to determine the memory layout of a buffer. Details beyond this (contiguity, strides, alignment, etc.) can only be known once the actual compute devices have been enumerated, and the resulting matrix of conversions is more dynamic than can be expressed in something as static as a function signature. The above formulation is instead an input to an additional runtime oracle, which produces appropriate full buffer descriptions.

While the exact implementation is host-language specific, consider the following more detailed set of declarations that may exist in such a binding layer:

// Inspired heavily by the Py_buffer type.
// See: https://docs.python.org/3/c-api/buffer.html
struct BufferDescription {
  ScalarType element_type;
  // For contiguous arrays, this is the length of the underlying memory.
  // For non-contiguous, this is the size of the buffer if it were copied
  // to a contiguous representation.
  size_t len;
  // Number of dims and strides.
  size_t ndim;
  int* shape;
  int* strides;
};

// Mirrors the 'buffer-type' production in the above grammar.
struct SignatureBufferType;

// Oracle which combines signature metadata with a user-provided, materialized
// BufferDescription to derive a BufferDescription that is compatible for
// invocation. Returns an updated buffer description if the original is
// not compatible or fully specified.
// This can be used in a couple of ways:
//   a) On function invocation to determine whether a provided buffer can be
//      used as-is or needs to be converted (copied).
//   b) To provide a factory function to the host language to create a
//      compatible buffer.
optional<BufferDescription> BufferDescriptionOracle(
    DeviceContext*, SignatureBufferType, BufferDescription)
  throws UnsupportedBufferException;

The above scheme should allow host-language and device coordination with respect to buffer layout. For the moment, the host-language binding is responsible for converting the buffer to a compatible memory layout. However, it is often most efficient to schedule this conversion for execution on a device. In the future, it is anticipated that there will be a built-in pathway for scheduling such a conversion (which would allow pipelining and offload of buffer conversions).
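
To make the oracle's contract concrete, here is a minimal Python sketch of how a binding layer might implement it. Everything in it is an assumption for illustration: the device is taken to require dense row-major buffers (a real oracle would consult the DeviceContext, which is omitted here), strides are in bytes as in Py_buffer, and the `itemsize` field and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


class UnsupportedBufferException(Exception):
    """Raised when a buffer cannot satisfy the signature at all."""


@dataclass(frozen=True)
class BufferDescription:
    element_type: str
    itemsize: int              # bytes per element (hypothetical field)
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]   # in bytes, as in Py_buffer


def row_major_strides(shape, itemsize):
    """Byte strides of a dense row-major (C-order) array."""
    strides, step = [], itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def buffer_description_oracle(
    sig_element_type: str,
    sig_dims: Sequence[int],
    provided: BufferDescription,
) -> Optional[BufferDescription]:
    """Returns None if `provided` is usable as-is, else the layout to convert to.

    ASSUMPTION: the (omitted) device requires dense row-major buffers.
    """
    if provided.element_type != sig_element_type:
        raise UnsupportedBufferException("element type mismatch")
    if len(provided.shape) != len(sig_dims):
        raise UnsupportedBufferException("rank mismatch")
    for expected, actual in zip(sig_dims, provided.shape):
        # -1 in the signature marks a dynamic dim and matches any extent.
        if expected != -1 and expected != actual:
            raise UnsupportedBufferException("static dim mismatch")
    wanted = row_major_strides(provided.shape, provided.itemsize)
    if provided.strides == wanted:
        return None
    return BufferDescription(
        provided.element_type, provided.itemsize, provided.shape, wanted)
```

In this sketch a `None` result corresponds to use (a) in the comment above (the buffer can be passed through unchanged), while a returned description tells the binding which layout to copy into before invocation.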

Deferred result allocation

In general, exported functions accept pre-allocated results that should be mutated. In the simplest cases, such results can be passed as null and retrieved upon completion of the function. This, however, severely limits the ability to pipeline. For fully specified signatures (no dynamic shapes), the BufferDescriptionOracle and the signature are sufficient to pre-allocate appropriate results, which allows chains of result-producing invocations to be pipelined.

If, however, a buffer-type is not fully specified, the compiler may emit a special result allocator function, which will be referenced in the fbr attribute. Such a function would have a signature like this:

tuple<buffer> __allocate_results(tuple<int> dynamic_dims);

Such a function takes a tuple of all dynamic buffer dims in the function input signature and returns a tuple of allocated buffers, one for each dynamic result. Note that it may not be possible to fully allocate results in this fashion (for example, if the result layout is data dependent), in which case a null buffer is returned for that slot (and the host library would need to await the invocation to get the fully populated result).
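
The host-side half of this protocol can be sketched as follows. This is an illustration under stated assumptions rather than the actual mechanism: the ordering of dynamic dims (input order, then dim order) is assumed, and `allocate_results_stub` is a hypothetical stand-in for a compiler-emitted allocator for a function whose single result is a float32 buffer of shape [batch, 4].

```python
def collect_dynamic_dims(signature_dims, actual_shapes):
    """Gathers the runtime extent of every dynamic (-1) input dim.

    ASSUMPTION: dims are ordered by input position, then by dim position.
    """
    dynamic = []
    for sig, shape in zip(signature_dims, actual_shapes):
        if len(sig) != len(shape):
            raise ValueError("rank mismatch")
        for expected, actual in zip(sig, shape):
            if expected == -1:
                dynamic.append(actual)
            elif expected != actual:
                raise ValueError("static dim mismatch")
    return tuple(dynamic)


def allocate_results_stub(dynamic_dims):
    """Hypothetical allocator: one float32 result of shape [batch, 4]."""
    batch = dynamic_dims[0]
    # 4 elements per row, 4 bytes per float32 element.
    return (bytearray(batch * 4 * 4),)
```

A binding would call the function named by the fbr attribute with the collected dims before dispatching the invocation, and fall back to awaiting the invocation for any slot that comes back null.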

A similar mechanism will need to be created at some future point for under-specified results of other (non-buffer) types.

Contiguity hinting

In some kinds of dataflows, the compiler needs to be free to internally toggle buffer contiguity (i.e. C/row-major, Fortran/col-major, etc). In many cases, such toggling does not naturally escape through the exported function boundaries, in which case there is no ABI impact. However, it is anticipated that there is benefit to letting the toggle propagate through the exported ABI boundary, in which case the buffer-type will likely be extended with a contiguity hint indicating the preference. When combined with the buffer description oracle and in-pipeline conversion features described above, this could yield a powerful mechanism for dynamically and efficiently managing such transitions.

Such an enhancement would almost certainly necessitate a major version bump in the ABI and would be logical to implement once the advanced features above are functional.

Structured Index Path ABI

Functions may support the SIP ABI if their input and result tuples logically map onto "structures" (nested sequences/dicts).

Attributes:

  • sipv = 1 (current SIP ABI version)
  • sip = encoded SIP signature (see below)

This ABI maps a raw, linear sequence of inputs and results onto an input and result "structure" -- which in this context refers to a nested assembly of sequences (with integer keys) and dictionaries (with string keys). Such a facility is useful for encoding input/result mappings in a way that is common in dynamic languages (such as Python).

In practice, this ABI supports the calling convention for TensorFlow, which allows functions that accept and produce nestings via the tf.nest facility. In implementing it, however, care has been taken to allow the calling convention to generalize to other similar cases.

Grammar

The signature is implemented in terms of the SignatureBuilder, using tagged Integer and Spans.

# Defines the structured value for the inputs ('I') and results ('R')
# of the function.
signature ::= 'I' length-prefixed(structured-value)
              'R' length-prefixed(structured-value)

structured-value ::= raw-fn-index | sequence | dict
raw-fn-index ::= '_' integer
sequence ::= 'S' length-prefixed( (integer-key structured-value)* )
integer-key ::= 'k' integer
dict ::= 'D' length-prefixed( (string-key structured-value)* )
string-key ::= 'K' length-prefixed( any-byte-sequence )

# Low-level lexical primitives:
integer ::= -?[0-9]+
length ::= [0-9]+
# The `length` encodes the length in bytes of `production`, plus 1 for the '!'.
length-prefixed(production) ::= length '!' production
any-byte-sequence ::= <any byte sequence>

Structured values define a tree of recursive dicts/lists, with raw-fn-index at the leaves. The interpretation is that a raw-fn-index that has been reached by traversing N expansions of the structured-value production is assigned an "index path" which is a list of the N keys that were traversed to reach it. For example, for N=0, the index path is empty. For N=1, and if an integer-key with numerical value 0 was traversed to reach the raw-fn-index, then the index path is [0].
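
The index path concept can be illustrated with a small Python sketch that encodes a nested list/dict structure (with raw function indices as integer leaves) per the grammar above and then recovers the index path of each leaf. It is not the SipSignatureMangler/SipSignatureParser implementation; the function names are hypothetical and the sketch assumes ASCII-only dictionary keys so that character counts equal byte counts.

```python
import re

INTEGER = re.compile(r"-?[0-9]+")
SPAN_HEADER = re.compile(r"([A-Za-z])([0-9]+)!")


def span(tag, body):
    """Encodes tag + length-prefixed(body); length is len(body) + 1 for the '!'."""
    return f"{tag}{len(body) + 1}!{body}"


def read_span(text, pos):
    """Reads one tagged span at pos and returns (tag, body, next_pos)."""
    match = SPAN_HEADER.match(text, pos)
    if match is None:
        raise ValueError(f"expected a tagged span at offset {pos}")
    end = match.end() + int(match.group(2)) - 1
    return match.group(1), text[match.end():end], end


def encode_value(value):
    """Encodes ints as raw-fn-index, lists as sequences, dicts as dicts."""
    if isinstance(value, int):
        return f"_{value}"
    if isinstance(value, (list, tuple)):
        return span("S", "".join(
            f"k{i}{encode_value(v)}" for i, v in enumerate(value)))
    if isinstance(value, dict):
        return span("D", "".join(
            span("K", k) + encode_value(v) for k, v in value.items()))
    raise TypeError(f"unsupported structure element: {value!r}")


def walk(text, pos, path, out):
    """Records {raw index: index path} for the value at pos; returns next pos."""
    if text[pos] == "_":
        match = INTEGER.match(text, pos + 1)
        out[int(match.group())] = list(path)
        return match.end()
    kind, body, end = read_span(text, pos)
    inner = 0
    while inner < len(body):
        if kind == "S" and body[inner] == "k":
            match = INTEGER.match(body, inner + 1)
            key, inner = int(match.group()), match.end()
        elif kind == "D" and body[inner] == "K":
            _, key, inner = read_span(body, inner)
        else:
            raise ValueError("malformed structured-value")
        inner = walk(body, inner, path + [key], out)
    return end


def index_paths(encoded_value):
    """Maps each raw function index to the list of keys leading to it."""
    out = {}
    walk(encoded_value, 0, [], out)
    return out
```

For instance, the Python-esque structure `[0, {"x": 1, "y": [2]}]` places raw index 0 at index path `[0]`, raw index 1 at `[1, "x"]`, and raw index 2 at `[1, "y", 0]`, while a bare leaf such as `_3` has the empty index path. A full signature would wrap the encoded input and result values as `span("I", ...) + span("R", ...)`.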

TODO: Give a few more examples, writing out various nested dicts/lists in Python-esque notation to clarify this concept.

See the SipSignatureParser::ToStringVisitor for a canonical example of how to interpret the signature.

Implementations

  • C++
    • SipSignatureMangler: Produces a function signature given individual input and result assignment of physical indices to nested index paths in the structure tree.
    • SipSignatureParser: Parses signatures and dispatches calls to a visitor.