SiliFuzz is a system that finds CPU defects by fuzzing software proxies, like CPU simulators or disassemblers, and then executing the accumulated test inputs (known as the corpus) on actual CPUs on a large scale. SiliFuzz is a work in progress, please refer to the paper for details.
Fuzzing is a technique of testing a target (an application, or an API) with a large number of test inputs generated on the fly. The goal is to make these inputs as interesting and as diverse as possible in order to trigger corner cases. In other words, fuzzing aims to maximize the combined code coverage. Code coverage may have different meanings, e.g. which basic blocks are executed or which paths are taken in the program.
For the purposes of SiliFuzz, a proxy is any software or hardware system that behaves similar to some aspects of the target CPU. For example, a CPU emulator or a disassembler. A proxy is needed when we cannot directly collect coverage information from the target.
By applying fuzzing techniques to a proxy, we can generate a series of test inputs (corpus) that produce interesting behavior in the proxy. Our underlying assumption is that this translates into similarly interesting behavior in the target.
A collection of inputs used for testing the target is known as corpus.
A reasonably large corpus contains millions of inputs and is usually split into multiple non-overlapping chunks called shards.
A SiliFuzz snapshot describes a short sequence of CPU instructions plus an
initial state of CPU registers and memory to deterministically execute that
sequence. A typical snapshot contains less than 100 bytes of code and runs in
microseconds, but it can be arbitrarily large. Snapshots are stored as
silifuzz.proto.Snapshot
protocol buffers.
Snapshots are typically created from the inputs generated by a fuzzing engine. For CPU testing purposes these inputs are filtered to eliminate all non-deterministic snapshots.
An end state describes the contents of registers and memory expected to exist at the end of a Snapshot execution. If the snapshot executes differently on different CPU microarchitectures it will have multiple expected end states.
A Snap is an in-memory representation of a Snapshot that can be easily loaded and executed by the Runner. Snaps can be compiled in an executable program (also called baked-in runner) or loaded from disk by a reading runner. In the latter case the Snap on-disk format is essentially the same as the in-memory one except that native pointers are replaced with offsets. See this header for details. This format is often referred to as relocatable. Each Snap contains exactly one expected end state i.e. Snaps are microarchitecture-specific.
Runner is a binary for testing a single CPU core. A runner consumes a corpus shard, repeatedly executes random Snaps in it and checks that the expected end state is reached. Runner is a single-threaded process.
The orchestrator is a process that drives multiple runners. In a typical setting the orchestrator will continuously execute one runner per logical CPU core, and accumulate and report any failures produced by the individual runner processes.
We have extensively tested SiliFuzz on the following x86_64
microarchitectures
running recent Linux kernels:
intel-skylake / intel-cascadelake
intel-haswell / intel-broadwell
intel-ivybridge
amd-rome
amd-milan
Other x86_64
microarchitectures should work, too, as long as there is the
compiler and OS support for them.
We are actively working on AArch64
support.
A non-exhaustive list of bugs and defects SiliFuzz has found.
A logic bug is an invalid CPU behavior inherent to a particular CPU microarchitecture or stepping. SiliFuzz has identified the following bugs:
An (electrical) defect is an invalid CPU behavior that happens only on one or several chips. SiliFuzz has found the following defects we described in the paper
- F2XM1 Defect. Paper / Appendix A
- Overshoot of an Illegal Instruction. Paper / Appendix B
- FCOS Miscomputes. Paper / Appendix C
- Missing x87 Data Pointer Update. Paper / Appendix D
- Centipede is a fuzzing engine developed at Google for fuzzing large and slow targets like CPU emulators.
git clone https://github.com/google/silifuzz.git && cd silifuzz
SILIFUZZ_SRC_DIR=`pwd`
./install_build_dependencies.sh # Currently, works for the latest Debian and Ubuntu only
bazel build -c opt @silifuzz//tools:{snap_corpus_tool,fuzz_filter_tool,snap_tool,silifuzz_platform_id,simple_fix_tool_main} \
@silifuzz//runner:reading_runner_main_nolibc \
@silifuzz//orchestrator:silifuzz_orchestrator_main
SILIFUZZ_BIN_DIR=`pwd`/bazel-bin/
cd "${SILIFUZZ_BIN_DIR}"
NOTE: You can use a Docker container to avoid polluting the host system: docker run -it --tty --security-opt seccomp=unconfined --mount type=bind,source=${SILIFUZZ_SRC_DIR},target=/app debian /bin/bash -c "cd /app && ./install_build_dependencies.sh && bazel build ... && bazel test ..."
cd "${SILIFUZZ_SRC_DIR}"
COV_FLAGS_FILE="$(bazel info output_base)/external/centipede/clang-flags.txt"
bazel build -c opt --copt=-UNDEBUG --dynamic_mode=off \
--per_file_copt=unicorn/.*@$(xargs < "${COV_FLAGS_FILE}" |sed -e 's/,/\\,/g' -e 's/ /,/g') @//proxies:unicorn_x86_64
bazel build -c opt @centipede//:centipede
mkdir -p /tmp/wd
# Fuzz the Unicorn proxy under Centipede with parallelism of 30.
"${SILIFUZZ_BIN_DIR}/external/centipede/centipede" \
--binary="${SILIFUZZ_BIN_DIR}/proxies/unicorn_x86_64" \
--workdir=/tmp/wd \
-j=30
NOTE: Please refer to Centipede documentation on how to efficiently run the fuzzing engine.
You can download Centipede corpus files that have been generated using the instructions
above here.
The data can be processed by simple_fix_tool
or used to seed future fuzzing runs.
This helper tool is to check that the machine you're running on is supported.
$ ${SILIFUZZ_BIN_DIR}/tools/silifuzz_platform_id
intel-skylake
NOTE: SiliFuzz CPU detection logic does not account for certain desktop variants of otherwise supported CPUs. The tool will report "Unsupported platform" in such cases.
The fuzz_filter_tool converts raw instructions into Snap-compatible Snapshots. It returns 0 when the conversion is possible and 1 otherwise. This interface is compatible with Centipede input_filter
fuzz_filter_tool raw_input_sequence [optional output proto]
The raw_input_sequence
file contains x86_64
instructions which will be
converted into the Snapshot format using
InstructionsToSnapshot_X86_64
When the second argument is provided, the tool will also output the resulting Snapshot proto.
NOTE: the result proto is best-effort i.e. for Snapshots that are not Snap-compatible the output will contain some intermediate Snapshot value.
Sample usage:
# NOP
echo -en '\x90' | ./tools/fuzz_filter_tool /dev/stdin /tmp/nop.pb
# INC EAX
echo -en '\xFF\xC0' | ./tools/fuzz_filter_tool /dev/stdin /tmp/inc_eax.pb
snap_tool
examines and manipulates binary Snapshot protos (e.g. ones produced
by fuzz_filter_tool
at the previous step).
./tools/snap_tool print /tmp/inc_eax.pb
Metadata:
Id: inc_eax
Architecture: x86_64 Linux
Compleness: complete
Registers:
gregs (non-0 only)
rax = 0x20000000
....
The simple fix tool takes fuzzing results from Centipede, converts raw instructions into a snapshots with no end states, adds end states to snapshots and finally packages snapshots into a sharded relocatable snap corpus.
Currently this runs as a non-restartable process on a single host and everything is put into memory so the size of corpus that can be handled is limited by available memory of the host. Since end states are generated on the host, the resulting corpus is single architecture.
The rest of the document is organized in a How-to manner with each question describing a typical use case. The progression of the questions represents the increasing complexity of the task one is trying to accomplish. Each step typically requires either understanding or the artifacts (sometimes both) obtained at the previous step.
NOTE: The document assumes an x86_64
host/target CPU. The exact output may
vary depending on the CPU vendor/stepping/etc and the environment (e.g.
Docker/KVM).
WARNING: many of the instructions below execute arbitrary binary code with the privileges of the running user. The tool does its best to sandbox the code with seccomp(2). Use at your own risk.
# INC EAX
$ echo -en '\xFF\xC0' > /tmp/inc_eax
$ ./tools/fuzz_filter_tool /tmp/inc_eax /tmp/inc_eax.pb
# CPUID
$ echo -en '\x0F\xA2' | ./tools/fuzz_filter_tool /dev/stdin /tmp/cpuid.pb
<error log omitted>
failed: Non-deterministic insn CPUID
NOTE: To avoid non-deterministic results, various parts of SiliFuzz exclude certain classes of instructions, e.g. CPUID above.
$ ./tools/snap_tool print /tmp/inc_eax.pb
Metadata:
Id: inc_eax
Architecture: x86_64 Linux
Compleness: complete
Registers:
gregs (non-0 only):
rax = 0x20000000
rip = 0xeb85c12b000
<omitted>
End states (1):
Endpoint:
Instruction address: 0xeb85c12b002
Platforms:
intel-skylake
Registers (diff vs snapshot's initial values):
gregs (modified only):
rax = 0x20000001
rip = 0xeb85c12b002
<omitted>
Notice how the end state value of the RAX
register is 0x20000001
(0x20000000+1
which is exactly what INC EAX
does). Also notice that the
RIP
value is the original +2
which is the size of the INC EAX
instruction.
$ ./tools/snap_tool play /tmp/inc_eax.pb
Snapshot played successfully.
Note: You will need to specify a target platform in order to generate a corpus. In this example, the resulting corpus targets the platform it was generated on.
$ cd "${SILIFUZZ_BIN_DIR}"
$ PLATFORM_ID=$(./tools/silifuzz_platform_id)
$ ./tools/snap_tool generate_corpus /tmp/inc_eax.pb \
--target_platform="${PLATFORM_ID}" > /tmp/inc_eax.corpus
# Will play the same "INC EAX" snapshot 1M times
$ ./runner/reading_runner_main_nolibc /tmp/inc_eax.corpus
You can inspect the process with gdb:
$ gdb ./runner/reading_runner_main_nolibc
(gdb) b RestoreUContextNoSyscalls
(gdb) run /tmp/inc_eax.corpus
Starting program: .../reading_runner_main_nolibc /tmp/inc_eax.corpus
Breakpoint 1, 0x0000456700010598 in RestoreUContextNoSyscalls ()
(gdb) x/i 0xeb85c12b000 # same as the rip value produced by snap_tool print above
0xeb85c12b000: inc %eax
NOTE: This step relies on the "fuzzing Unicorn target" step described earlier.
Convert fuzzing result corpus.* into a 10-shard runnable corpus for the current architecture.
cd "${SILIFUZZ_BIN_DIR}"
"${SILIFUZZ_BIN_DIR}/tools/simple_fix_tool_main" \
--num_output_shards=10 \
--output_path_prefix=/tmp/wd/runnable-corpus \
/tmp/wd/corpus.*
The corpus shards will be at /tmp/wd/runnable-corpus.*
$ ./tools/snap_corpus_tool list_snaps /tmp/inc_eax.corpus
...
I0000 00:00:1661887744.019079 4074672 snap_corpus_tool.cc:155] inc_eax
NOTE: This is a very basic tool at the moment offering just a few commands.
# Will play the same "INC EAX" snapshot on CPU#1 10k times.
$ ./runner/reading_runner_main_nolibc \
--cpu=1 --num_iterations=10000 /tmp/inc_eax.corpus
# Will repeatedly run the corpus on all available CPU cores for 30s.
$ ./orchestrator/silifuzz_orchestrator_main --duration=30s \
--runner=./runner/reading_runner_main_nolibc \
/tmp/inc_eax.corpus
You can pass multiple corpus shards to the orchestrator. It'll cycle through them in random order:
# Will repeatedly run the corpus on all available CPU cores for 30s using
# /tmp/wd/runnable-corpus.* selected randomly.
$ ./orchestrator/silifuzz_orchestrator_main --duration=30s \
--runner=./runner/reading_runner_main_nolibc \
/tmp/wd/runnable-corpus.*
NOTE: The orchestrator can also use XZ-compressed corpus shards.