The most important control interface in angr is the SimulationManager, which allows you to control symbolic execution over groups of states simultaneously, applying search strategies to explore a program's state space. Here, you'll learn how to use it.
Simulation managers let you wrangle multiple states in a slick way.
States are organized into “stashes”, which you can step forward, filter, merge, and move around as you wish.
This allows you to, for example, step two different stashes of states at different rates, then merge them together.
The default stash for most operations is the active
stash, which is where your states get put when you initialize a new simulation manager.
The most basic capability of a simulation manager is to step forward all states in a given stash by one basic block.
You do this with .step()
.
>>> import angr
>>> proj = angr.Project('examples/fauxware/fauxware', auto_load_libs=False)
>>> state = proj.factory.entry_state()
>>> simgr = proj.factory.simgr(state)
>>> simgr.active
[<SimState @ 0x400580>]
>>> simgr.step()
>>> simgr.active
[<SimState @ 0x400540>]
Of course, the real power of the stash model is that when a state encounters a symbolic branch condition, both of the successor states appear in the stash, and you can step both of them in sync.
When you don't really care about controlling analysis very carefully and you just want to step until there's nothing left to step, you can just use the .run()
method.
# Step until the first symbolic branch
>>> while len(simgr.active) == 1:
... simgr.step()
>>> simgr
<SimulationManager with 2 active>
>>> simgr.active
[<SimState @ 0x400692>, <SimState @ 0x400699>]
# Step until everything terminates
>>> simgr.run()
>>> simgr
<SimulationManager with 3 deadended>
We now have 3 deadended states!
When a state fails to produce any successors during execution, for example, because it reached an exit
syscall, it is removed from the active stash and placed in the deadended
stash.
Let's see how to work with other stashes.
To move states between stashes, use .move()
, which takes from_stash
(optional, default "active"), to_stash
, and filter_func
(optional, default is to move everything).
For example, let's move everything that has a certain string in its output:
>>> simgr.move(from_stash='deadended', to_stash='authenticated', filter_func=lambda s: 'Welcome' in s.posix.dumps(1))
>>> simgr
<SimulationManager with 2 authenticated, 1 deadended>
We were able to just create a new stash named "authenticated" just by asking for states to be moved to it. All the states in this stash have "Welcome" in their stdout, which is a fine metric for now.
Each stash is just a list, and you can index into or iterate over the list to access each of the individual states, but there are some alternate methods to access the states too.
If you prepend the name of a stash with one_
, you will be given the first state in the stash.
If you prepend the name of a stash with mp_
, you will be given a mulpyplexed version of the stash.
>>> for s in simgr.deadended + simgr.authenticated:
... print hex(s.addr)
0x1000030
0x1000078
0x1000078
>>> simgr.one_deadended
<SimState @ 0x1000030>
>>> simgr.mp_authenticated
MP([<SimState @ 0x1000078>, <SimState @ 0x1000078>])
>>> simgr.mp_authenticated.posix.dumps(0)
MP(['\x00\x00\x00\x00\x00\x00\x00\x00\x00SOSNEAKY\x00',
'\x00\x00\x00\x00\x00\x00\x00\x00\x00S\x80\x80\x80\x80@\x80@\x00'])
Of course, step
, run
, and any other method that operates on a single stash of paths can take a stash
argument, specifying which stash to operate on.
There are lots of fun tools that the simulation manager provides you for managing your stashes. We won't go into the rest of them for now, but you should check out the API documentation. TODO: link
You can use stashes for whatever you like, but there are a few stashes that will be used to categorize some special kinds of states. These are:
Stash | Description |
---|---|
active | This stash contains the states that will be stepped by default, unless an alternate stash is specified. |
deadended | A state goes to the deadended stash when it cannot continue the execution for some reason, including no more valid instructions, unsat state of all of its successors, or an invalid instruction pointer. |
pruned | When using LAZY_SOLVES , states are not checked for satisfiability unless absolutely necessary. When a state is found to be unsat in the presence of LAZY_SOLVES , the state hierarchy is traversed to identify when, in its history, it initially became unsat. All states that are descendants of that point (which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash. |
unconstrained | If the save_unconstrained option is provided to the SimulationManager constructor, states that are determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some other source of symbolic data) are placed here. |
unsat | If the save_unsat option is provided to the SimulationManager constructor, states that are determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input having to be both "AAAA" and "BBBB" at the same time) are placed here. |
There is another list of states that is not a stash: errored
.
If, during execution, an error is raised, then the state will be wrapped in an ErrorRecord
object, which contains the state and the error it raised, and then the record will be inserted into errored
.
You can get at the state as it was at the beginning of the execution tick that caused the error with record.state
, you can see the error that was raised with record.error
, and you can launch a debug shell at the site of the error with record.debug()
.
This is an invaluable debugging tool!
An extremely common operation in symbolic execution is to find a state that reaches a certain address, while discarding all states that go through another address.
Simulation manager has a shortcut for this pattern, the .explore()
method.
When launching .explore()
with a find
argument, execution will run until a state is found that matches the find condition, which can be the address of an instruction to stop at, a list of addresses to stop at, or a function which takes a state and returns whether it meets some criteria.
When any of the states in the active stash match the find
condition, they are placed in the found
stash, and execution terminates.
You can then explore the found state, or decide to discard it and continue with the other ones.
You can also specify an avoid
condition in the same format as find
.
When a state matches the avoid condition, it is put in the avoided
stash, and execution continues.
Finally, the num_find
argument controls the number of states that should be found before returning, with a default of 1.
Of course, if you run out of states in the active stash before finding this many solutions, execution will stop anyway.
Let's look at a simple crackme example:
First, we load the binary.
>>> proj = angr.Project('examples/CSCI-4968-MBE/challenges/crackme0x00a/crackme0x00a')
Next, we create a SimulationManager.
>>> simgr = proj.factory.simgr()
Now, we symbolically execute until we find a state that matches our condition (i.e., the "win" condition).
>>> simgr.explore(find=lambda s: "Congrats" in s.posix.dumps(1))
<SimulationManager with 1 active, 1 found>
Now, we can get the flag out of that state!
>>> s = simgr.found[0]
>>> print s.posix.dumps(1)
Enter password: Congrats!
>>> flag = s.posix.dumps(0)
>>> print(flag)
g00dJ0B!
Pretty simple, isn't it?
Other examples can be found by browsing the examples.
angr ships with several pieces of canned functionality that let you customize the behavior of a simulation manager, called exploration techniques.
The archetypical exploration technique is depth-first search, which puts all active paths except one in a stash called deferred
, and whenever active
goes empty, pops a state out of deferred
and continues.
To use an exploration technique, call simgr.use_technique(tech)
, where tech is an instance of an ExplorationTechnique subclass.
angr's built-in exploration techniques can be found under angr.exploration_techniques
.
Here's a quick overview of some of the built-in ones:
- TODO
You can also write your own exploration techniques! This will be covered in a later chapter.