Bug 1630936: Write documentation for Baldrdash in Spidermonkey; r=rhunt,lth

Differential Revision: https://phabricator.services.mozilla.com/D71313
bnjbvr committed Apr 21, 2020
1 parent 9e2e299 commit 3666942
Showing 2 changed files with 133 additions and 9 deletions.
137 changes: 131 additions & 6 deletions js/src/wasm/cranelift/src/lib.rs
@@ -13,18 +13,137 @@
* limitations under the License.
*/

mod bindings; // High-level bindings for C++ data structures.
mod compile; // Cranelift function compiler.
mod isa; // `TargetISA` configuration.
mod utils; // Helpers for other source files.
mod wasm2clif; // WebAssembly to Cranelift translation callbacks.
//! This code bridges Spidermonkey to Cranelift.
//!
//! This documentation explains the role of each high-level function, each notable submodule, and
//! the Spidermonkey idiosyncrasies that are visible here and leak into Cranelift. This is not a
//! technical presentation of how Cranelift works or what it intends to achieve, a task much more
//! suited to the Wasmtime documentation itself:
//!
//! https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/docs/index.md
//!
//! At the time of writing (April 14th, 2020), this code is only used for WebAssembly (wasm)
//! compilation, so this documentation focuses on the wasm integration. As a matter of fact, this
//! glue crate between Baldrmonkey and Cranelift is called Baldrdash, thanks to the usual punsters.
//!
//! ## Relationships to other files
//!
//! * WasmCraneliftCompile.cpp contains all the C++ code that calls into this crate.
//! * clifapi.h describes the C-style bindings to this crate's public functions, used by the C++
//! code to call into Rust. They're maintained by hand, and thus manual review must ensure the
//! signatures match those of the functions exposed in this lib.rs file.
//! * baldrapi.h describes the C-style functions exposed through `bindgen` so they can be called
//! from Rust. Bindings are automatically generated, such that they're safe to use in general.
//! WasmConstants.h is also exposed through this file, which makes sharing some code easier.
//!
//! ## High-level functions
//!
//! * `cranelift_initialize` performs per-process initialization.
//! * `cranelift_compiler_create` will return a `BatchCompiler`, the high-level data structure
//! controlling the compilation of a group (batch) of wasm functions. The created compiler should
//! later be deallocated with `cranelift_compiler_destroy`, once it's not needed anymore.
//! * `cranelift_compile_function` takes care of translating a single wasm function into Cranelift
//! IR, and compiles it down to machine code. Input data is passed through a const pointer to a
//! `FuncCompileInput` data structure (defined in bindings), and the return values are stored in
//! an in-out parameter named `CompiledFunc` (also defined in bindings).
//!
//! ## Submodules
//!
//! The list of submodules here is deliberately put in a specific order, so as to make it easier to
//! discover and read.
//!
//! * The `isa` module configures Cranelift, applying some target-independent settings, as well as
//! target-specific settings. These settings are used both during translation of wasm to Cranelift
//! IR and compilation to machine code.
//! * The `wasm2clif` module contains the code doing the translation of the wasm code section to
//! Cranelift IR, implementing all the Spidermonkey specific behaviors.
//! * The `compile` module takes care of optimizing the Cranelift IR and compiles it down to
//! machine code, noting down relocations in the process.
//!
//! A few other helper modules are also defined:
//!
//! * The `bindings` module contains C++ bindings automatically generated by `bindgen` in the Cargo
//! build script (`build.rs`), as well as thin wrappers over these data structures to make these
//! more ergonomic to use in Rust.
//! * No code base would be feature complete without a bunch of random helpers and functions that
//! don't really belong anywhere else: the `utils` module contains error handling helpers, to unify
//! all the Cranelift Error types into one that can be passed around in Baldrdash.
//!
//! ## Spidermonkey idiosyncrasies
//!
//! Most of the Spidermonkey-specific behavior is reflected during conversion of the wasm code to
//! Cranelift IR (in the `wasm2clif` module), but there are some other aspects worth mentioning
//! here.
//!
//! ### Code generation, prologues/epilogues, ABI
//!
//! Cranelift may call into and be called from other functions using the Spidermonkey wasm ABI:
//! that is, code generated by the wasm baseline compiler during tiering, any other wasm stub, even
//! Ion (through the JIT entries and exits).
//!
//! As a matter of fact, it must push the same C++ `wasm::Frame` on the stack before a call, and
//! unwind it properly on exit. To keep this detail orthogonal to Cranelift, the function's
//! prologue and epilogue are **not** generated by Cranelift itself; the C++ code generates them
//! for us. Here, Cranelift only generates the code section and appropriate relocations.
//! The C++ code writes the prologue, copies the machine code section, writes the epilogue, and
//! translates the Cranelift relocations into Spidermonkey relocations.
//!
//! * To not generate the prologue and epilogue, Cranelift uses a special calling convention called
//! Baldrdash in its code. This is set upon creation of the `TargetISA`.
//! * Cranelift must know the offset to the stack argument's base, that is, the size of the
//! wasm::Frame. The `baldrdash_prologue_words` setting is used to propagate this information to
//! Cranelift.
//! * Since Cranelift generated functions interact with Ion-ABI functions (Ionmonkey, other wasm
//! functions) and native (host) functions, they have to respect both calling conventions. In
//! particular, across function calls they must preserve callee-saved and caller-saved registers
//! in a way compatible with both ABIs. In practice, this means Cranelift must consider Ion's
//! callee-saved as its callee-saved, and native's caller-saved as its caller-saved (since it
//! deals with both ABIs, it has to union the sets).
//!
//! ### Maintaining HeapReg
//!
//! On some targets, Spidermonkey pins one register to keep the heap-base accessible at all times,
//! making memory accesses cheaper. This register is excluded from Ion's register allocation, and
//! is manually maintained by Spidermonkey before and after calls.
//!
//! Cranelift has two settings to mimic the same behavior:
//! - `enable_pinned_reg` makes it possible to pin a register and gives access to two Cranelift
//! instructions for reading it and writing to it.
//! - `use_pinned_reg_as_heap_base` makes the code generator use the pinned register as the heap
//! base for all Cranelift IR memory accesses.
//!
//! Using both settings makes it possible to reproduce Spidermonkey's behavior. One caveat is that
//! the pinned register used in Cranelift must match the HeapReg register in Spidermonkey for this
//! to work properly.
//!
//! Not using the pinned register as the heap base, when there's a heap register on the platform,
//! means that we have to explicitly maintain it in the prologue and epilogue (because of tiering),
//! which would be another source of slowness.
//!
//! ### Non-streaming validation
//!
//! Ionmonkey is able to iterate over the wasm code section's body, validating and emitting
//! Ionmonkey's internal IR at the same time.
//!
//! Cranelift uses `wasmparser` to parse the wasm binary section, which doesn't provide
//! per-opcode hooks. Instead, Cranelift validates (off the main thread) each function's body
//! before compiling it, one function at a time.
mod bindings;
mod compile;
mod isa;
mod utils;
mod wasm2clif;

use log::{self, error, info};
use std::ptr;

use crate::bindings::{CompiledFunc, FuncCompileInput, ModuleEnvironment, StaticEnvironment};
use crate::compile::BatchCompiler;

/// Initializes all the process-wide Cranelift state. It must be called at least once, before any
/// other use of this crate. Calling it more than once is harmless, though subsequent calls
/// accomplish nothing.
#[no_mangle]
pub extern "C" fn cranelift_initialize() {
// Gecko might set a logger before we do, which is all fine; try to initialize ours, and reset
@@ -41,6 +160,9 @@ pub extern "C" fn cranelift_initialize() {

/// Allocate a compiler for a module environment and return an opaque handle.
///
/// It is the caller's responsibility to deallocate the returned BatchCompiler later, passing back
/// the opaque handle to a call to `cranelift_compiler_destroy`.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_create<'a, 'b>(
@@ -58,7 +180,10 @@ pub unsafe extern "C" fn cranelift_compiler_create<'a, 'b>(
}
}

/// Deallocate compiler.
/// Deallocate a BatchCompiler created by `cranelift_compiler_create`.
///
/// Passing any other kind of pointer to this function is technically undefined behavior, thus
/// making the function unsafe to use.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
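
The create/compile/destroy contract documented in lib.rs above follows the usual opaque-handle pattern for Rust code called from C++. The following is a minimal sketch of that pattern only, not the crate's actual implementation: the `Compiler` type, its field, and the `sketch_*` function names are hypothetical stand-ins, and the export attribute is left out so that the snippet builds on any Rust edition.

```rust
use std::ptr;

/// Hypothetical stand-in for `BatchCompiler`; the real type lives in the `compile` module.
pub struct Compiler {
    functions_compiled: u32,
}

/// Allocates a compiler on the heap and returns an opaque handle to it, or null on failure.
/// The `fail` flag only exists here to exercise the failure path.
pub extern "C" fn sketch_compiler_create(fail: bool) -> *mut Compiler {
    if fail {
        return ptr::null_mut();
    }
    // Ownership is handed to the caller; Rust will not free this allocation by itself.
    Box::into_raw(Box::new(Compiler {
        functions_compiled: 0,
    }))
}

/// Uses the handle to "compile" one function. Returns false when given a null handle.
///
/// Safety: `compiler` must be null or a live handle from `sketch_compiler_create`.
pub unsafe extern "C" fn sketch_compile_function(compiler: *mut Compiler) -> bool {
    match unsafe { compiler.as_mut() } {
        Some(c) => {
            c.functions_compiled += 1;
            true
        }
        None => false,
    }
}

/// Takes ownership back and frees the compiler.
///
/// Safety: `compiler` must be null or a handle from `sketch_compiler_create` that has not been
/// destroyed yet. Any other pointer is undefined behavior, which is why this is `unsafe`.
pub unsafe extern "C" fn sketch_compiler_destroy(compiler: *mut Compiler) {
    if !compiler.is_null() {
        drop(unsafe { Box::from_raw(compiler) });
    }
}
```

The key design point is that `Box::into_raw` and `Box::from_raw` must be paired exactly once per handle, which is the reason the documentation insists that the caller passes the handle back to the destroy function.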
5 changes: 2 additions & 3 deletions js/src/wasm/cranelift/src/wasm2clif.rs
@@ -18,14 +18,13 @@
//! The code here deals with adapting the `cranelift_wasm` module to the specifics of BaldrMonkey's
//! internal data structures.
use crate::bindings::GlobalDesc;
use cranelift_codegen::ir::immediates::Offset32;
use std::collections::HashMap;

use cranelift_codegen::cursor::{Cursor, FuncCursor};
use cranelift_codegen::entity::{EntityRef, PrimaryMap, SecondaryMap};
use cranelift_codegen::ir;
use cranelift_codegen::ir::condcodes::IntCC;
use cranelift_codegen::ir::immediates::Offset32;
use cranelift_codegen::ir::InstBuilder;
use cranelift_codegen::isa::{CallConv, TargetFrontendConfig, TargetIsa};
use cranelift_codegen::packed_option::PackedOption;
@@ -34,7 +33,7 @@ use cranelift_wasm::{
SignatureIndex, TableIndex, TargetEnvironment, WasmError, WasmResult,
};

use crate::bindings::{self, SymbolicAddress};
use crate::bindings::{self, SymbolicAddress, GlobalDesc};
use crate::compile::{symbolic_function_name, wasm_function_name};
use crate::isa::POINTER_SIZE;
