update docs to angr-simuvex merge
zardus committed Jun 13, 2017
1 parent f633b6d commit e7ed02f
Showing 12 changed files with 91 additions and 91 deletions.
2 changes: 1 addition & 1 deletion docs/claripy.md
@@ -7,7 +7,7 @@ angr's solver engine is called Claripy. Claripy exposes the following:

Internally, Claripy seamlessly mediates the co-operation of multiple disparate backends -- concrete bitvectors, VSA constructs, and SAT solvers. It is pretty badass.

-Most users of angr will not need to interact directly with Claripy (except for, maybe, claripy AST objects, which represent symbolic expressions) -- SimuVEX handles most interactions with Claripy internally.
+Most users of angr will not need to interact directly with Claripy (except for, maybe, claripy AST objects, which represent symbolic expressions) -- angr handles most interactions with Claripy internally.
However, for dealing with expressions, an understanding of Claripy might be useful.

## Claripy ASTs
2 changes: 1 addition & 1 deletion docs/faq.md
@@ -67,7 +67,7 @@ Searching around the internet, the major choices were:
- BAP was another possibility. When we started work on angr, BAP only supported lifting x86 code, and up-to-date versions of BAP were only available to academic collaborators of the BAP authors. These were two deal-breakers. BAP has since become open, but it still only supports x86_64, x86, and ARM.
- VEX was the only choice that offered an open library and support for many architectures. As a bonus, it is very well documented and designed specifically for program analysis, making it very easy to use in angr.

-While angr uses VEX now, there's no fundamental reason that multiple IRs cannot be used. There are two parts of angr, outside of the `simuvex.vex` package, that are VEX-specific:
+While angr uses VEX now, there's no fundamental reason that multiple IRs cannot be used. There are two parts of angr, outside of the `angr.engines.vex` package, that are VEX-specific:

- the jump labels (i.e., the `Ijk_Ret` for returns, `Ijk_Call` for calls, and so forth) are VEX enums.
- VEX treats registers as a memory space, and so does angr. While we provide accesses to `state.regs.rax` and friends, on the backend, this does `state.registers.load(8, 8)`, where the first `8` is a VEX-defined offset for `rax` to the register file.
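
The "registers as a memory space" idea in the last bullet can be sketched with a toy model. This is not angr's implementation: `ToyRegisterFile`, `ToyRegs`, and the offset table are hypothetical names invented for illustration, and the offsets are placeholders rather than real VEX values.

```python
# Toy model only. Names, offsets and sizes are assumptions for illustration.
REGISTER_OFFSETS = {"rax": (8, 8), "rbx": (16, 8)}  # name -> (offset, size in bytes)


class ToyRegisterFile:
    """A flat byte array addressed by offset, like a tiny memory space."""

    def __init__(self, size=64):
        self._data = bytearray(size)

    def store(self, offset, value, size):
        self._data[offset:offset + size] = value.to_bytes(size, "little")

    def load(self, offset, size):
        return int.from_bytes(self._data[offset:offset + size], "little")


class ToyRegs:
    """Name-based view that resolves every access to an offset-based load or store."""

    def __init__(self, regfile):
        # Bypass our own __setattr__, which only accepts register names.
        object.__setattr__(self, "_regfile", regfile)

    def __getattr__(self, name):
        try:
            offset, size = REGISTER_OFFSETS[name]
        except KeyError:
            raise AttributeError(name)
        return self._regfile.load(offset, size)

    def __setattr__(self, name, value):
        offset, size = REGISTER_OFFSETS[name]
        self._regfile.store(offset, value, size)


registers = ToyRegisterFile()
regs = ToyRegs(registers)
regs.rax = 0x1122334455667788  # name-based write lands at the offset for "rax"
```

In this sketch, `regs.rax` and `registers.load(8, 8)` read the same bytes, which mirrors the relationship the bullet describes between `state.regs.rax` and `state.registers.load(8, 8)`.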
2 changes: 1 addition & 1 deletion docs/overview.md
@@ -47,7 +47,7 @@ angr exposes information about what the paths execute and *do*.

A powerful feature of angr is the ability to represent basic blocks in terms of their effects on a program state.
In other words, angr can reason about what basic blocks *do*, not just what they *are*.
-This is accomplished by a module named SimuVEX, further described [here](./simuvex.md).
+This is accomplished by a code simulation engine, further described [here](./simulation.md).

## Symbolic Execution

7 changes: 4 additions & 3 deletions docs/paths.md
@@ -1,9 +1,10 @@
Program Paths - Controlling Execution
=====================================

-SimuVEX provides an incredibly awkward interface for performing symbolic execution. Paths are angr's primary interface to provide an abstraction to control execution, and are used in most interactions with angr and its analyses.
+Dealing with SimStates and SimEngines directly provides an incredibly awkward interface for performing symbolic execution.
+Paths are angr's primary interface to provide an abstraction to control execution, and are used in most interactions with angr and its analyses.

-A path through a program is, at its core, a sequence of basic blocks (actually, individual executions of a `simuvex.SimEngine`) representing what was executed since the program started.
+A path through a program is, at its core, a sequence of basic blocks (actually, individual executions of a `angr.SimEngine`) representing what was executed since the program started.
These blocks in the paths can repeat (in the case of loops) and a program can have a near-infinite amount of paths (for example, a program with a single branch will have two paths, a program with two branches nested within each other will have 4, and so on).
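
To make "a path is a sequence of basic blocks" concrete, here is a minimal sketch that enumerates paths through a toy control-flow graph. It does not use angr; the block names and the `enumerate_paths` helper are made up for illustration, and the graph contains two two-way branches in sequence.

```python
# Toy control-flow graph: block name -> successor blocks (all names are invented).
TOY_CFG = {
    "entry": ["check_a"],
    "check_a": ["a_true", "a_false"],
    "a_true": ["check_b"],
    "a_false": ["check_b"],
    "check_b": ["b_true", "b_false"],
    "b_true": ["exit"],
    "b_false": ["exit"],
    "exit": [],
}


def enumerate_paths(cfg, start, max_len=32):
    """Yield every block sequence from `start` to a block with no successors.

    `max_len` bounds the search so that a graph with loops cannot run forever.
    """
    stack = [[start]]
    while stack:
        path = stack.pop()
        successors = cfg[path[-1]]
        if not successors:
            yield path
        elif len(path) < max_len:
            for nxt in successors:
                stack.append(path + [nxt])


paths = list(enumerate_paths(TOY_CFG, "entry"))
```

Each additional two-way branch placed in sequence doubles the number of paths, which is why the count grows so quickly on real programs.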

To create an empty path at the program's entry point, do:
@@ -172,7 +173,7 @@ At this point, all memory, registers, and so forth of the path are blank. In a n

## SimActions Redux

-The SimActions from deep within simuvex are exported for much easier access through the Path. Actions are part of the path's history (Path.actions), so the same rules as the other history items about iterating over them still apply.
+The SimActions from deep within the simulation engine are exported for much easier access through the Path. Actions are part of the path's history (Path.actions), so the same rules as the other history items about iterating over them still apply.

When paths grow long, stored SimActions can be a serious source of memory consumption. Because of this, by default all but the most recent SimActions are discarded. To disable this behavior, enable the `TRACK_ACTION_HISTORY` state option.

38 changes: 19 additions & 19 deletions docs/pipeline.md
@@ -121,7 +121,7 @@ It works by making a call into `SimOS` to retrieve the SimProcedure that should
`SimEngineHook` provides the hooking functionality in angr.
It is used when a state is at an address that is hooked, and the previous jumpkind is *not* `Ijk_NoHook`.
It simply looks up the given hook, calls `hook.instantiate()` on it in order to retrieve a `SimProcedure` instance, and then runs that procedure.
-This class is a thin subclass of the `SimEngineProcedure` class present in SimuVEX, for obvious reasons.
+This class is a thin subclass of the `SimEngineProcedure` class, specialized for hooking.
It takes the parameter `procedure`, which will cause `check` to always succeed, and this procedure will be used instead of the SimProcedure that would be obtained from a hook.

`SimEngineUnicorn` performs concrete execution with the Unicorn Engine.
@@ -130,7 +130,7 @@ It is used when the state option `o.UNICORN` is enabled, and a myriad of other c
`SimEngineVEX` is the big fellow.
It is used whenever any of the previous can't be used.
It attempts to lift bytes from the current address into an IRSB, and then executes that IRSB symbolically.
-There are a huge number of parameters that can control this process, so I will merely link to the [API reference](http://angr.io/api-doc/simuvex.html#simuvex.engines.vex.engine.SimEngineVEX.process) describing them.
+There are a huge number of parameters that can control this process, so I will merely link to the [API reference](http://angr.io/api-doc/angr.html#angr.engines.vex.engine.SimEngineVEX.process) describing them.

The exact process by which SimEngineVEX digs into an IRSB and executes it deserves some documentation as well.
At time of writing I'm not sure if this exists anywhere but it really should.
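
The "try each engine in turn, fall back to VEX" behavior described in this hunk can be sketched as a chain of `check` calls. Everything below is a toy model under stated assumptions: the `Toy*` classes and `select_engine` are hypothetical, only the three engines named in the visible text are modeled, and the Unicorn check looks at a single option even though the source says many other conditions apply.

```python
class ToyState:
    """Minimal stand-in for a state: an address, a jumpkind, and a set of options."""

    def __init__(self, addr, jumpkind="Ijk_Boring", options=()):
        self.addr = addr
        self.jumpkind = jumpkind
        self.options = set(options)


class ToyHookEngine:
    def __init__(self, hooked_addrs):
        self.hooked_addrs = set(hooked_addrs)

    def check(self, state):
        # Used when the address is hooked and the previous jumpkind is not Ijk_NoHook.
        return state.addr in self.hooked_addrs and state.jumpkind != "Ijk_NoHook"


class ToyUnicornEngine:
    def check(self, state):
        # Simplification: the real engine has many more preconditions.
        return "UNICORN" in state.options


class ToyVEXEngine:
    def check(self, state):
        # The fallback: used whenever none of the previous engines applies.
        return True


def select_engine(engines, state):
    """Return the first engine whose check() accepts the state."""
    for engine in engines:
        if engine.check(state):
            return engine
    raise RuntimeError("no engine accepted the state")


ENGINES = [ToyHookEngine({0x400100}), ToyUnicornEngine(), ToyVEXEngine()]
```

Ordering the list from most specific to most general keeps the fallback logic in one place: the VEX stand-in never needs to know why the earlier engines declined.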
@@ -154,10 +154,10 @@ unicorn = { UNICORN, UNICORN_SYM_REGS_SUPPORT, INITIALIZE_ZERO_REGISTERS, UNICOR
These will enable some additional functionalities and defaults which will greatly enhance your experience.
Additionally, there are a lot of options you can tune on the `state.unicorn` plugin.

-A good way to understand how unicorn works is by examining the logging output (`logging.getLogger('simuvex.engines.unicorn_engine').setLevel('DEBUG'); logging.getLogger('simuvex.plugins.unicorn_engine').setLevel('DEBUG')` from a sample run of unicorn.
+A good way to understand how unicorn works is by examining the logging output (`logging.getLogger('angr.engines.unicorn_engine').setLevel('DEBUG'); logging.getLogger('angr.state_plugins.unicorn_engine').setLevel('DEBUG')` from a sample run of unicorn.

```
-INFO | 2017-02-25 08:19:48,012 | simuvex.plugins.unicorn | started emulation at 0x4012f9 (1000000 steps)
+INFO | 2017-02-25 08:19:48,012 | angr.state_plugins.unicorn | started emulation at 0x4012f9 (1000000 steps)
```

Here, angr diverts to unicorn engine, beginning with the basic block at 0x4012f9.
@@ -166,41 +166,41 @@ This is to avoid hanging in an infinite loop.
The block count is configurable via the `state.unicorn.max_steps` variable.

```
-INFO | 2017-02-25 08:19:48,014 | simuvex.plugins.unicorn | mmap [0x401000, 0x401fff], 5 (symbolic)
-INFO | 2017-02-25 08:19:48,016 | simuvex.plugins.unicorn | mmap [0x7fffffffffe0000, 0x7fffffffffeffff], 3 (symbolic)
-INFO | 2017-02-25 08:19:48,019 | simuvex.plugins.unicorn | mmap [0x6010000, 0x601ffff], 3
-INFO | 2017-02-25 08:19:48,022 | simuvex.plugins.unicorn | mmap [0x602000, 0x602fff], 3 (symbolic)
-INFO | 2017-02-25 08:19:48,023 | simuvex.plugins.unicorn | mmap [0x400000, 0x400fff], 5
-INFO | 2017-02-25 08:19:48,025 | simuvex.plugins.unicorn | mmap [0x7000000, 0x7000fff], 5
+INFO | 2017-02-25 08:19:48,014 | angr.state_plugins.unicorn | mmap [0x401000, 0x401fff], 5 (symbolic)
+INFO | 2017-02-25 08:19:48,016 | angr.state_plugins.unicorn | mmap [0x7fffffffffe0000, 0x7fffffffffeffff], 3 (symbolic)
+INFO | 2017-02-25 08:19:48,019 | angr.state_plugins.unicorn | mmap [0x6010000, 0x601ffff], 3
+INFO | 2017-02-25 08:19:48,022 | angr.state_plugins.unicorn | mmap [0x602000, 0x602fff], 3 (symbolic)
+INFO | 2017-02-25 08:19:48,023 | angr.state_plugins.unicorn | mmap [0x400000, 0x400fff], 5
+INFO | 2017-02-25 08:19:48,025 | angr.state_plugins.unicorn | mmap [0x7000000, 0x7000fff], 5
```

angr performs lazy mapping of data that is accessed by unicorn engine, as it is accessed. 0x401000 is the page of instructions that it is executing, 0x7fffffffffe0000 is the stack, and so on. Some of these pages are symbolic, meaning that they contain at least some data that, when accessed, will cause execution to abort out of Unicorn.

```
-INFO | 2017-02-25 08:19:48,037 | simuvex.plugins.unicorn | finished emulation at 0x7000080 after 3 steps: STOP_STOPPOINT
+INFO | 2017-02-25 08:19:48,037 | angr.state_plugins.unicorn | finished emulation at 0x7000080 after 3 steps: STOP_STOPPOINT
```

Execution stays in Unicorn for 3 basic blocks (a computational waste, considering the required setup), after which it reaches a simprocedure location and jumps out to execute the simproc in angr.

```
-INFO | 2017-02-25 08:19:48,076 | simuvex.plugins.unicorn | started emulation at 0x40175d (1000000 steps)
-INFO | 2017-02-25 08:19:48,077 | simuvex.plugins.unicorn | mmap [0x401000, 0x401fff], 5 (symbolic)
-INFO | 2017-02-25 08:19:48,079 | simuvex.plugins.unicorn | mmap [0x7fffffffffe0000, 0x7fffffffffeffff], 3 (symbolic)
-INFO | 2017-02-25 08:19:48,081 | simuvex.plugins.unicorn | mmap [0x6010000, 0x601ffff], 3
+INFO | 2017-02-25 08:19:48,076 | angr.state_plugins.unicorn | started emulation at 0x40175d (1000000 steps)
+INFO | 2017-02-25 08:19:48,077 | angr.state_plugins.unicorn | mmap [0x401000, 0x401fff], 5 (symbolic)
+INFO | 2017-02-25 08:19:48,079 | angr.state_plugins.unicorn | mmap [0x7fffffffffe0000, 0x7fffffffffeffff], 3 (symbolic)
+INFO | 2017-02-25 08:19:48,081 | angr.state_plugins.unicorn | mmap [0x6010000, 0x601ffff], 3
```

After the simprocedure, execution jumps back into Unicorn.

```
-WARNING | 2017-02-25 08:19:48,082 | simuvex.plugins.unicorn | fetching empty page [0x0, 0xfff]
-INFO | 2017-02-25 08:19:48,103 | simuvex.plugins.unicorn | finished emulation at 0x401777 after 1 steps: STOP_EXECNONE
+WARNING | 2017-02-25 08:19:48,082 | angr.state_plugins.unicorn | fetching empty page [0x0, 0xfff]
+INFO | 2017-02-25 08:19:48,103 | angr.state_plugins.unicorn | finished emulation at 0x401777 after 1 steps: STOP_EXECNONE
```

Execution bounces out of Unicorn almost right away because the binary accessed the zero-page.
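
When reading longer logs, it can help to pull the stop address, step count, and stop reason out of the "finished emulation" lines. The following is a small sketch using only the standard library; `parse_finished` is a hypothetical helper, and the regular expression assumes the message layout shown in the sample output of this hunk.

```python
import re

# Matches the message part of a "finished emulation" log line.
FINISHED_RE = re.compile(
    r"finished emulation at (?P<addr>0x[0-9a-f]+) "
    r"after (?P<steps>\d+) steps: (?P<reason>\w+)"
)


def parse_finished(line):
    """Return (address, steps, stop_reason) or None if the line does not match."""
    match = FINISHED_RE.search(line)
    if match is None:
        return None
    return int(match.group("addr"), 16), int(match.group("steps")), match.group("reason")


# The two "finished emulation" lines quoted in this hunk.
SAMPLE = [
    "INFO | 2017-02-25 08:19:48,037 | angr.state_plugins.unicorn | "
    "finished emulation at 0x7000080 after 3 steps: STOP_STOPPOINT",
    "INFO | 2017-02-25 08:19:48,103 | angr.state_plugins.unicorn | "
    "finished emulation at 0x401777 after 1 steps: STOP_EXECNONE",
]
results = [parse_finished(line) for line in SAMPLE]
```

Runs that end after very few steps, like the second sample line, are the ones worth investigating, since each entry into Unicorn carries setup cost.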

```
-INFO | 2017-02-25 08:19:48,120 | simuvex.engines.unicorn_engine | not enough runs since last unicorn (100)
-INFO | 2017-02-25 08:19:48,125 | simuvex.engines.unicorn_engine | not enough runs since last unicorn (99)
+INFO | 2017-02-25 08:19:48,120 | angr.engines.unicorn_engine | not enough runs since last unicorn (100)
+INFO | 2017-02-25 08:19:48,125 | angr.engines.unicorn_engine | not enough runs since last unicorn (99)
```

To avoid thrashing in and out of Unicorn (which is expensive), we have cooldowns (attributes of the `state.unicorn` plugin) that wait for certain conditions to hold (i.e., no symbolic memory accesses for X blocks) before jumping back into unicorn when a unicorn run is aborted due to anything but a simprocedure or syscall.
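
The one-line logger setup quoted earlier in this hunk can be spelled out as a short script. This sketch uses only the standard `logging` module and the post-merge logger names from the changed line; the format string is an assumption that approximates the layout of the sample output. Note that the sample output shows the name `angr.state_plugins.unicorn`, so the names may need adjusting to whatever the installed version actually emits.

```python
import logging

# Assumed format, chosen to resemble the sample log lines in this hunk.
logging.basicConfig(format="%(levelname)-7s | %(asctime)s | %(name)s | %(message)s")

# Logger names as given in the updated documentation line.
UNICORN_LOGGERS = (
    "angr.engines.unicorn_engine",
    "angr.state_plugins.unicorn_engine",
)

for name in UNICORN_LOGGERS:
    # logging.DEBUG is equivalent to the 'DEBUG' string used in the docs.
    logging.getLogger(name).setLevel(logging.DEBUG)
```

Setting the level on named loggers, rather than on the root logger, keeps the rest of the output at its default verbosity.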
7 changes: 3 additions & 4 deletions docs/simprocedures.md
@@ -11,8 +11,7 @@ This chapter should serve as a guide when programming SimProcedures.
Here's an example that will remove all bugs from any program:

```python
->>> from simuvex import SimProcedure
->>> from angr import Hook, Project
+>>> from angr import Hook, Project, SimProcedure
>>> project = Project('examples/fauxware/fauxware')

>>> class BugFree(SimProcedure):
@@ -45,7 +44,7 @@ More on that later.

We've been using the words Hook and SimProcedure sort of interchangeably. Let's fix that.

-- `SimProcedure` is a simuvex class that describes a set of actions to take on a state.
+- `SimProcedure` is a class that describes a set of actions to take on a state.
Its crux is the `run()` method.
- `Hook` is an angr class that holds a SimProcedure along with information about how to instantiate it.
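
The split between the two classes can be sketched without angr. The toy below assumes a hook stores a procedure class plus keyword arguments and builds a fresh instance on each `instantiate()` call; `ToyProcedure`, `ReturnConstant`, and `ToyHook` are hypothetical names, and the real `run()` signature differs.

```python
class ToyProcedure:
    """Stand-in for a procedure: its behavior lives in run()."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def run(self):
        raise NotImplementedError


class ReturnConstant(ToyProcedure):
    def run(self):
        # Return the configured value, defaulting to 0.
        return self.kwargs.get("value", 0)


class ToyHook:
    """Holds a procedure class together with the arguments needed to build it."""

    def __init__(self, procedure_cls, **kwargs):
        self.procedure_cls = procedure_cls
        self.kwargs = kwargs

    def instantiate(self):
        # A new instance per call, so runs do not share state.
        return self.procedure_cls(**self.kwargs)


hook = ToyHook(ReturnConstant, value=42)
first = hook.instantiate()
second = hook.instantiate()
```

Keeping construction details in the hook means one procedure class can be reused at many addresses with different arguments.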

@@ -102,7 +101,7 @@ We'll get there after a quick detour...
What if we want to add a conditional branch out of a SimProcedure?
In order to do that, you'll need to work directly with the SimSuccessors object for the current execution step.

-The interface for this is [`self.successors.add_successor(state, addr, guard, jumpkind)`](http://angr.io/api-doc/simuvex.html#simuvex.engines.successors.SimSuccessors.add_successor).
+The interface for this is [`self.successors.add_successor(state, addr, guard, jumpkind)`](http://angr.io/api-doc/angr.html#angr.engines.successors.SimSuccessors.add_successor).
All of these parameters should have an obvious meaning if you've followed along so far.
Keep in mind that the state you pass in will NOT be copied, so be sure to make a copy if you want to use it again!
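
The warning about the state not being copied is the usual aliasing pitfall. The sketch below reproduces it with toy classes that only mimic the `add_successor(state, addr, guard, jumpkind)` parameter order; `ToyState` and `ToySuccessors` are hypothetical and say nothing about how angr stores successors internally.

```python
import copy


class ToyState:
    def __init__(self, rax=0):
        self.regs = {"rax": rax}

    def copy(self):
        return copy.deepcopy(self)


class ToySuccessors:
    def __init__(self):
        self.successors = []

    def add_successor(self, state, addr, guard, jumpkind):
        # Stores a reference to the caller's state; no copy is made.
        self.successors.append((state, addr, guard, jumpkind))


# Pitfall: reusing the state after handing it over changes the stored successor.
state = ToyState(rax=1)
aliased = ToySuccessors()
aliased.add_successor(state, 0x1000, True, "Ijk_Boring")
state.regs["rax"] = 2

# Safer: pass a copy when the original will be modified afterwards.
other = ToyState(rax=1)
safe = ToySuccessors()
safe.add_successor(other.copy(), 0x1000, True, "Ijk_Boring")
other.regs["rax"] = 2
```

In the first case the recorded successor sees the later write; in the second it keeps the value it had when it was added.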
