How to target `Wizard` #54

ejrgilbert · 2024-06-12T18:14:10Z

Progress Tracker

Overview

The emitter that targets Wizard for instrumentation could emit a Wasm module that specifies where a probe should be attached to a program by placing the pattern matching rule in the name of a function export.
All exported functions are these pattern matching rules, e.g.:

(export "match_rule / pred_func(pred_args) / (probe_args)" (func 0))

Wizard will read this in and run the pattern matching rules to find all places in the program to call func 0. Note that in whamm a predicate can include static AND dynamic data. This means that the static portion of the predicate will reside in the match rule's predicate logic and the dynamic portion will be "pushed down" into the probe's function (in this case func 0).
This means that the predicate needs to be split out to have static predication in the export name and dynamic predication in the func 0 instructions!
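A minimal sketch of that split, written in Python purely for illustration. It assumes the predicate is a top-level conjunction of simple comparison terms, and that `fid` and `pc` are known at match time while everything else (such as `arg0`) is dynamic. The `STATIC_VARS` set, the tuple encoding, and `split_predicate` are hypothetical names, not part of whamm:

```python
# Hypothetical classification: variables whose values are known at match time.
STATIC_VARS = {"fid", "pc"}


def split_predicate(conjuncts):
    """Partition (variable, operator, constant) terms of a conjunction.

    Returns (static_terms, dynamic_terms). Static terms would go into the
    predicate function named in the export; dynamic terms would be pushed
    down into the probe function body.
    """
    static, dynamic = [], []
    for term in conjuncts:
        var, _op, _const = term
        if var in STATIC_VARS:
            static.append(term)
        else:
            dynamic.append(term)
    return static, dynamic


# fid == 3 && pc != 2 && arg0 == 1
static_terms, dynamic_terms = split_predicate(
    [("fid", "==", 3), ("pc", "!=", 2), ("arg0", "==", 1)]
)
```

A real implementation would also have to handle disjunctions and terms that mix static and dynamic operands, which this sketch ignores.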

Breaking Down the Export Name

match_rule / pred_func(pred_args) / (probe_args)

`match_rule`

Purpose: This is used to specify where to attach the probe in the application module.

This will basically just be the normal probe specification, but without the 'mode' portion, e.g.:
"wasm:opcode:call"

`pred_func`

Purpose: This function is used to further predicate on what constitutes a match site at match time.

This will be a pointer to the function in mon.wasm that contains the static predication logic. It could either be a function name OR a function ID preceded by $, e.g.:
"$call_predicate", or "$1"

`pred_args`

Purpose: This is used to list the arguments that the engine will need to pass when invoking the predicate function.

The ordering of this is important! The predicate function expects the same ordering as requested in the match rule. An example:
"fid, pc"

`probe_args`

Purpose: This is used to list the arguments that the engine will need to pass when invoking the probe function.

The ordering of this is important! The probe function expects the same ordering as requested in the match rule. An example:
"fid, pc, arg0"
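To make the four parts concrete, here is a rough Python sketch of how an export name in this format could be taken apart. The regular expression, the `MatchExport` type, and `parse_export_name` are illustrative assumptions; this is not Wizard's actual parser, and the real grammar may be looser or stricter:

```python
import re
from typing import List, NamedTuple, Optional


class MatchExport(NamedTuple):
    match_rule: str
    pred_func: str
    pred_args: List[str]
    probe_args: List[str]


# match_rule / $pred_func(pred_args) / (probe_args)
_EXPORT_RE = re.compile(
    r"^\s*(?P<rule>[^/]+?)\s*/\s*"
    r"(?P<pred>\$[\w.]+)\((?P<pred_args>[^)]*)\)\s*/\s*"
    r"\((?P<probe_args>[^)]*)\)\s*$"
)


def _split_args(text: str) -> List[str]:
    # Argument order is significant, so the list order is preserved.
    return [arg.strip() for arg in text.split(",") if arg.strip()]


def parse_export_name(name: str) -> Optional[MatchExport]:
    """Return the parsed parts, or None if the export is not a match rule."""
    m = _EXPORT_RE.match(name)
    if m is None:
        return None
    return MatchExport(
        match_rule=m.group("rule"),
        pred_func=m.group("pred"),
        pred_args=_split_args(m.group("pred_args")),
        probe_args=_split_args(m.group("probe_args")),
    )


parsed = parse_export_name("wasm:opcode:call / $1(fid, pc) / (arg0)")
```

Exports that do not follow the pattern (for example a plain `get_count` export) come back as `None` in this sketch, so they can be skipped.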

The provided globals that will need to be supported by Wizard

To see this, use the info utility of whamm:

cargo run -- info -fg --spec "wasm:opcode:*"

Note: There is still some work to be done; this list is not exhaustive.

An Example

The whamm script:

i32 count;

wasm:opcode:call:before / (fid == 3 && pc != 2) / {
   if (arg0 == 1) {
       count++;
   }
}

The resulting mon.wasm:

(module
  (type (;0;) (func (result i32)))
  (type (;1;) (func (param i32 i32) (result i32)))
  (type (;2;) (func (param i32)))
  (func $get_count (;0;) (type 0) (result i32)
    global.get 0
  )
  (func (;1;) (type 1) (param i32 i32) (result i32)
    local.get 0
    i32.const 3
    i32.eq
    local.get 1
    i32.const 2
    i32.ne
    i32.and
  )
  (func (;2;) (type 2) (param i32)
    local.get 0
    i32.const 1
    i32.eq
    if ;; label = @1
      global.get 0
      i32.const 1
      i32.add
      global.set 0
    end
  )
  (global (;0;) (mut i32) i32.const 0)
  (export "get_count" (func $get_count))
  (export "wasm:opcode:call / $1(fid, pc) / (arg0)" (func 2))
)
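One way to read the intended division of labor in this example is the following Python model. The functions mirror func 1 (the static predicate) and func 2 (the probe body) of the module above; the list of call sites and the execution trace are made-up inputs, and the two-phase loop is only an assumption about how an engine might use the export, not a description of Wizard's implementation:

```python
count = 0  # mirrors global 0 in mon.wasm


def pred(fid, pc):
    # Mirrors func 1: (fid == 3) & (pc != 2), returned as an i32-style flag.
    return int(fid == 3) & int(pc != 2)


def probe(arg0):
    # Mirrors func 2: the dynamic part of the predicate lives in the probe body.
    global count
    if arg0 == 1:
        count += 1


# Hypothetical `call` sites in the application, as (fid, pc) pairs.
call_sites = [(3, 0), (3, 2), (4, 7)]

# Match time: keep only the sites accepted by the static predicate.
attached = [site for site in call_sites if pred(*site)]

# Run time: hypothetical trace of (site, arg0) pairs reaching each call.
trace = [((3, 0), 1), ((3, 0), 5), ((3, 2), 1), ((4, 7), 1), ((3, 0), 1)]
for site, arg0 in trace:
    if site in attached:
        probe(arg0)
```

In this model only the site `(3, 0)` is instrumented, and the probe increments `count` only on the executions where `arg0 == 1`.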


ahuoguo · 2024-07-02T00:24:56Z

A more detailed writeup of what we are doing here

Instead of emitting Virgil source code and loading it into src/monitors, we are going to emit Wasm code (whamm.wasm) and run it with wizeng --monitors=whamm.wasm test/monitors/app.wasm.

When the option provided to --monitors= ends with .wasm, the engine automatically loads a WhammMonitor, which defines how we want to interface with the Wizard engine.

For now, we can create new whamm.wasm files by running v3c-dev -target=wasm test/whamm/PcTracer.v3. The probe action corresponds to trace_pc, and the export defines where the probe fires. It only supports wasm:bytecode:<pattern>:before for now (note that the before mode is implicit).

Discussion of dynamic info

For now, we only support probes that are determined statically. To support probes that fire dynamically, e.g.:

i32 i;
wasm:bytecode:call:before /
    arg0 == 1
/ {
    i++;
}

we want to define a placeholder externref to access the FrameAccessor. The following is an example PcTracer.wat that tries to get the real pc. We currently need to extend WhammMonitor.v3 to support this.

(module
 (type (;0;) (func (param i32 i32)))
 (type (;1;) (func (param i32)))
 (type (;2;) (func))
 (type (;3;) (func (param i32 i32 i32)))
 (import "wizeng" "puts" (func $wizeng.puts (type 0)))
 (import "wizeng" "puti" (func $wizeng.puti (type 1)))
 (import "wizeng:frame_accessor" "get_pc" (func $get_pc (param externref) (result i32)))
 (func $PcTracer.main (type 2)
  return)
 (func $PcTracer.trace_pc (param externref)
  i32.const 65536
  call $PcTracer.puts
  (call $get_pc (local.get 0))
  call $wizeng.puti
  i32.const 65552
  call $PcTracer.puts
  return)
 (func $PcTracer.puts (type 1) (param i32) ... )
 (func (;5;) (type 0) (param i32 i32)
  local.get 1
  call $PcTracer.puts)
 (func (;6;) (type 0) (param i32 i32)
  local.get 1
  call $wizeng.puti)
 (func (;7;) (type 3) (param i32 i32 i32)
  local.get 1
  local.get 2
  call $wizeng.puts)
 (table (;0;) 4 4 funcref)
 (memory (;0;) 3 3)
 (export "main" (func $PcTracer.main))
 (export "memory" (memory 0))
 (export "wasm:bytecode:loop" (func $PcTracer.trace_pc))
 (elem (;0;) (i32.const 1) func 5 6 7)
 (data (;0;) (i32.const 65536) "\05\00\00\00\08\00\00\00hello @ \05\00\00\00\01\00\00\00\0a\00\00\00"))

ejrgilbert · 2024-07-30T18:16:35Z

ahuoguo · 2024-07-31T15:26:47Z

ejrgilbert · 2024-11-21T20:49:42Z

Report Variables on Wizard

Report variables will be flushed at the end of program execution, on program exit (which is observable by the engine itself). At that point, functions that flush the data will be called.

The functions

There will be one function per report variable type used in the script. So, if the script only uses i32 report variables, there will be one function called to flush the final state.

Each of these functions will contain the logic to handle its respective datatype: one for i32, one for map, etc.

There will be 2 global variables used per data type as well:

- one global that points to the memory location of the first allocated report variable of this type
- one global that points to the most-recently allocated report variable of this type

The global variables are necessary since they will be used both during variable allocation and during variable flush.

Memory layout

We will have a linked list in memory of the report variables. One linked list per type.

| next | value |
| --- | --- |
| mem_offset: i32 | var_data: datatype_len |

Each variable location will have two pieces of information:

- mem_offset: i32: The memory offset to the next report variable of this type, measured from the current memory location.
- var_data: datatype_len: The actual value of the variable; its type and length depend on the type of the variable. Since a function is called that iterates over each datatype's linked list of report variables, it will know how to parse the variable contents!

Variable allocation

On the first allocation for a datatype, the global that points to the first memory location is updated to point to the memory address.

When a new variable is allocated in memory, the global containing the memory address of the most-recently allocated variable of that type is used to find that previous variable, whose next pointer is then set to the difference between the current and the previous memory address (the offset). Then, that global is updated to the current memory address.

The value of the variable is then placed in the var_data slot.
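The allocation steps above can be modeled in Python over a `bytearray` standing in for linear memory. This is only a sketch under assumptions that the text does not state: a `next` value of 0 means "no next variable", the caller supplies the address of each new record (the allocator itself is not described here), and Python's `None` stands in for "nothing allocated yet". The class and method names are hypothetical:

```python
import struct


class ReportVarsI32:
    """Models the linked list of i32 report variables in linear memory."""

    def __init__(self, mem: bytearray):
        self.mem = mem
        self.first = None  # models the global pointing at the first variable
        self.last = None   # models the global pointing at the newest variable

    def alloc(self, addr: int, value: int) -> None:
        # Record layout: [next: i32][value: i32], little-endian as in Wasm.
        # next starts out as 0, which this sketch treats as NULL.
        struct.pack_into("<ii", self.mem, addr, 0, value)
        if self.first is None:
            # First allocation for this datatype.
            self.first = addr
        else:
            # Patch the previous record's next field with the relative offset.
            struct.pack_into("<i", self.mem, self.last, addr - self.last)
        self.last = addr


mem = bytearray(64)
report_vars = ReportVarsI32(mem)
report_vars.alloc(8, 10)
report_vars.alloc(24, 20)
report_vars.alloc(40, 30)
```

After these three calls the record at address 8 holds a `next` of 16 (pointing at 24), the record at 24 holds 16 (pointing at 40), and the record at 40 holds 0, marking the end of the list.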

Flush function logic

For each datatype, there will be a function that does something like the following pseudocode:

func report_DT() {
    // Start at the address of the first variable for this datatype
    let addr = FIRST_VAR;

    while true {
        let curr_var = mem.get(addr);
        flush_DT(curr_var.val);

        // Stop once there is no next variable
        if curr_var.next == NULL { break; }

        // next is an offset from the current memory location
        addr = addr + curr_var.next;
    }
}

func flush_DT(val: DT) {
    // logic to flush this value
}

Note that this logic can't be factored out into a single shared function because Wasm function calls are strictly typed. Consider the following WAT, which fleshes some of this out:

(module
    ;; ... $alloc function that allocates memory for i32s

    (memory 1)
    ;; Address of the first allocated i32 report variable (set during allocation)
    (global $i32_start (mut i32) (i32.const 0))
    (func $iter_i32s
        (local $addr i32)   ;; address of the current variable
        (local $next i32)   ;; offset from $addr to the next variable (0 = NULL)
        (local $offset i32)
        (local.set $offset (i32.const 4)) ;; offset to skip over the next field
        (local.set $addr (global.get $i32_start))

        (block $finished
            (loop $continue
                ;; Load the offset to the next variable
                (local.set $next (i32.load (local.get $addr)))
                ;; Load and flush the current value (specific to the i32 DT)
                (call $flush_i32 (i32.load (i32.add (local.get $offset) (local.get $addr))))

                ;; If $next is null, we're finished!
                (br_if $finished (i32.eqz (local.get $next)))

                ;; Otherwise, move to the next variable and keep iterating
                (local.set $addr (i32.add (local.get $addr) (local.get $next)))
                (br $continue)
            )
        )
    )
    (func $flush_i32 (param $val i32)
        ;; logic that flushes an i32 report variable
    )

    (func $on_end
        (call $iter_i32s)
    )
)
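The traversal can also be checked outside of Wasm with a small Python model. It assumes, as in the Memory layout section, that `next` is an offset relative to the current record, and additionally that 0 marks the end of the list. The function name and the hand-built memory contents are hypothetical:

```python
import struct


def iter_i32_report_vars(mem: bytearray, first_addr: int):
    """Yield each i32 report variable value, following relative next offsets."""
    addr = first_addr
    while True:
        # Record layout: [next: i32][value: i32], little-endian as in Wasm.
        next_off, value = struct.unpack_from("<ii", mem, addr)
        yield value
        if next_off == 0:  # assumed NULL marker: no further variables
            return
        addr += next_off


# Hand-built memory with three records at addresses 8, 24, and 40.
mem = bytearray(64)
struct.pack_into("<ii", mem, 8, 16, 10)
struct.pack_into("<ii", mem, 24, 16, 20)
struct.pack_into("<ii", mem, 40, 0, 30)

flushed = list(iter_i32_report_vars(mem, 8))
```

Like the WAT, this model assumes at least one variable of the type has been allocated; an empty list would need a separate check before the loop.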

ejrgilbert added the enhancement New feature or request label Jun 12, 2024

ejrgilbert self-assigned this Oct 8, 2024

Repository owner deleted a comment from ahuoguo Oct 8, 2024

ejrgilbert mentioned this issue Oct 8, 2024

Get targeting wizard to work (basic functionality) #157

Merged

3 tasks

ejrgilbert mentioned this issue Nov 21, 2024

Support report vars on Wizard #176

Draft

5 tasks
