From 8c6b3fc92765716d47c89a0610b05927e0538a2e Mon Sep 17 00:00:00 2001
From: Ivan Gotovchits
Date: Tue, 21 Apr 2015 12:46:27 -0400
Subject: [PATCH] Plugins can now be loaded with findlib. Also, fixed project application, added `CHANGES.md`, updated `README.md` and bumped version in `_oasis`.
---
 CHANGES.md             | 180 +++++++++++++++++++++++++++++++++++++++++
 README.md              | 158 +++++++++++++++++++++---------------
 _oasis                 |   2 +-
 lib/bap/bap_plugins.ml |   6 +-
 opam                   |   2 +-
 src/readbin/readbin.ml |  74 +++++++++--------
 6 files changed, 318 insertions(+), 104 deletions(-)
 create mode 100644 CHANGES.md

diff --git a/CHANGES.md b/CHANGES.md
new file mode 100644
index 000000000..be839458a
--- /dev/null
+++ b/CHANGES.md
@@ -0,0 +1,180 @@
+0.9.6
+=====
+
+1. New loader backed with LLVM
+   BAP now has another loader (image reader), that
+   supports MACH-O, ELF, COFF, PE. This loader is
+   backed with LLVM library.
+
+2. Online plugin system
+
+   New extension point is added - "bap.project". Plugins marked with
+   this plugin system will not be loaded automatically when
+   `Plugins.load` is called, instead, they can be loaded dynamically
+   (or online, hence the title), by using `-l` option to the `bap`
+   utility. After being loaded the plugin is applied to a `project`
+   data structure that contains all information about disassembled
+   binary. Plugin can functionally update this data structure, to
+   push information to other plugins or back to the `bap` utility.
+
+   In addition to a common way of creating plugins with `oasis`, we
+   extended `bapbuild` utility with a new rule that will produce a
+   `plugin` file. This is just a shared library underneath the hood,
+   and you can load a plugin, created with this method directly,
+   without installing it anywhere. `bap` utility will try to find the
+   plugin, specified with `-l` option in the current folder, then in all
+   folders specified in `BAP_PLUGIN_PATH` environment variable, and,
+   finally in the system, using `ocamlfind`.
+
+   In order to provide a typesafe way of interacting between plugins,
+   we added extensible variants to BAP. But instead of using one from
+   the 4.02, we're using universal types, based on that one, that Core
+   library provides. First of all this is more portable, second it is
+   more explicit and a little bit more safe.
+
+3. New ABI and CPU interfaces
+
+   Modules that implement the `CPU` interface are used to describe
+   particular CPU in BIL terminology, e.g., it tells which variable
+   corresponds to which register, flag, etc. To obtain such module,
+   one should use `target_of_cpu` function.
+
+   ABI is used to capture the procedure abstraction, starting from
+   calling conventions and stack frame structure and ending with special
+   function handling and support for different data-types.
+
+   See d5cab1a5e122719b4a3b1ece2b1bc44f3f93095a for more information
+   and examples.
+
+4. Bap-objdump renamed to bap
+
+   bap-objdump has outgrown its name. Actually it was never really a
+   bap-objdump at all. From now on, it is just an entry point to the `bap` as
+   platform. We will later unite `bap` with other utilities, to make them
+   subcommands, e.g. `bap byteweight`.
+
+5. Cleanup of BIL modules
+
+   Now there is a separation between BIL fur uns, and BIL fur
+   OCaml. For writing BIL programs (as EDSL in OCaml) one should use
+   `Bil` module, e.g. `Bil.(x = y)` will evaluate to a BIL
+   expression. For using BIL entities as OCaml values, one should use
+   corresponding module, e.g. `Exp.(x = y)` will compare two expressions
+   and evaluate to a value of type `bool`.
+
+6. Enhanced IDA integration
+
+   IDA integration is now more robust. We switched to `IDA-32` by default,
+   since 64-bit version doesn't support decompiler. Also `bap` utility
+   can now output IDA python scripts. And `bap` plugins can annotate project
+   with `python` commands, that later will be dumped into the script.
+
+7. In ARM switched to ARMv7 by default
+8.
Introduce LNF algorithm and Sema library
+
+   A new layer of BAP is started in this release. This would be a third pass
+   of decompilation, where the semantic model of program will be built. Currently,
+   there is nothing really interesting here, e.g., an implementation of the
+   Loop nesting forest, that is not very usable right now. But the next release
+   will be dedicated to this layer. So, stay tuned.
+
+9. Add support for OCamlGraph
+
+   Now we provide helper utilities for those who would like to use
+   ocamlgraph library for analysis.
+
+10. Extended bap-mc utility
+
+    `bap-mc` utility now prints results in a plethora of formats,
+    including protocol buffers, from the piqi library, that was revived
+    by Kenneth Miller.
+
+11. Interval trees, aka memory maps
+
+    For working with arbitrary overlapping memory regions we now have a
+    memory map data structure, aka interval trees, segment trees, etc. It
+    is based on AVL trees, and performs logarithmic searches.
+
+12. Simplified CI
+
+    We put Travis on a diet. Now only 4 machines with 20 ETA for all test
+    suites to pass. (Instead of 8 * 40).
+
+
+0.9.5
+=====
+
+1. removed tag warnings from the ocamlbuild
+2. fixed #114
+3. moved Bap_plugins out of Bap library
+4. plugin library can now load arbitrary files
+5. bap-objdump is now pluggable
+6. added new extension point in the plugin system
+7. updated BAP LICENSE, baptop is now QPLed
+8. IDA can now work in a headless mode
+9. enhanced symbol resolution algorithm
+10. cleaned up image backend interface
+11. constraint OPAM file
+
+
+0.9.4
+=====
+
+1. x86 and x86_64 lifter #106
+2. New byteweight implementation #99
+3. Intra-procedure CFG reconstruction #102
+4. IDA integration #103
+5. Binary release #108
+6. Man pages and documentation #107
+7. Unconstraint opam file and extended it with system dependents #109
+
+0.9.3
+=====
+
+1. Bitvector (aka Word, aka Addr) now provides all Integer
+interface without any monads right at the toplevel of the module.
+In other words, now you can write: Word.(x + y).
+
+2. Bitvector.Int is renamed to Bitvector.Int_exn so that it doesn't
+clobber the real Int module
+
+3. All BIL is now consolidated in one module named Bil. This module
+contains everything, including constructors for statements, expressions,
+casts, binary and unary operations. It also includes functional
+constructors, that are now written by hand and, thus, don't suffer from
+syntactic clashes with keywords. There are also plenty of other
+functions and new operators, available from the new Bap_helpers
+module, see later. Old modules, like Expr, Stmt, etc are still
+available, they implement Regular interface for corresponding types.
+
+4. New feature: visitor classes to traverse and transform the AST.
+Writing a pattern matching code every time you need to traverse or map
+the BIL AST is error prone and time-consuming. These visitors do all the
+traversing for you, allowing you to override default behavior. Some
+handy algorithms, that use visitors are provided in an internal
+Bap_helpers module, that is included into resulting Bil
+module. Several optimizations were added to bap-objdump utility, like
+constant propagation, inlining, pruning unused variables and resolving
+addresses to symbols.
+
+5. Insn interface now provides predicates to query insn classes, these
+predicates use BIL if available.
+
+6. Disasm interface now provides linear_sweep function.
+
+
+0.9.2
+=====
+
+1. Recursive descent disassembler
+2. High-level simple to use interface to BAP
+3. New utility `bap-objdump`
+4. Enhanced pretty-printing
+5. Lots of small fixes and new handy functions
+6. Automatically generated documentation.
+
+
+0.9.1
+=====
+
+First release of a new BAP.
diff --git a/README.md b/README.md
index 47b4a3e4e..c7e62be67 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,8 @@
 
 [![Build Status](https://travis-ci.org/BinaryAnalysisPlatform/bap.svg?branch=master)](https://travis-ci.org/BinaryAnalysisPlatform/bap)
 
-`Bap` library provides basic facilities for performing binary analysis
-in OCaml and other languages.
+BAP is a platform for binary analysis. It is written in OCaml, but can
+be used from other languages, for example, from Python.
 
 # Installation
 
@@ -27,23 +27,86 @@ $ pip install git+git://github.com/BinaryAnalysisPlatform/bap.git
 
 ### Installing system dependencies
 
-There are few system libraries that bap depends on, namely `llvm-3.4` and `clang` compiler.
-We provide a file `apt.deps` that contains package names as they are in Ubuntu
-Trusty. Depending on your OS and distribution, you may need to adjust
-this names. But, on most Debian-based Linux distribution, this should work:
+BAP uses clang compiler and llvm library as major backend. We also
+need curl, zip and gmp packages for our tools. A complete up-to-date
+list of packages can be found in `opam` file, under `depexts` section.
+
+You can query for the external (system) dependencies with
 
 ```bash
-$ sudo apt-get install $(cat apt.deps)
+$ opam install bap -e ubuntu
 ```
 
+To install dependencies, using this method, try the following:
+
+```bash
+$ sudo apt-get install $(opam install bap -e ubuntu)
+```
+
+If you're not using Ubuntu, then you need to adapt package names
+according to your system package manager preferences.
+
 
 # Usage
 
+## Using from OCaml
+
+There are two ways to use BAP. Compile your own application, and use
+BAP library, or write a plugin, that can still use the library, but
+will also get access to the decompiled binary.
For the latter, write
+your plugin in OCaml using your
+[favorite text editor](https://github.com/BinaryAnalysisPlatform/bap/wiki/Emacs)
+:
+```sh
+$ cat mycode.ml
+open Bap.Std
+let main project = print_endline "Hello, World"
+let () = Project.register_plugin' main
+```
+
+Next, build it with our `bapbuild` tool:
+
+```sh
+$ bapbuild mycode.plugin
+```
+
+After this you can load your plugin with `-l` command line option, and
+get immediate access to the decompiled binary:
+
+```sh
+$ bap /bin/ls -lmycode
+```
+
+`bapbuild` can compile standalone applications, not only plugins. In
+fact, `bapbuild` underneath the hood is an `ocamlbuild` utility extended
+with our rules and flags. To compile a standalone binary, run:
+
+```bash
+$ bapbuild mycoolprog.native
+```
+
+If `bapbuild` complains that something is missing, make sure that you
+didn't skip the [Installation](#Installation) phase. You can add your
+own dependencies with the `-pkg` or `-pkgs` command line options:
+
+```bash
+$ bapbuild -pkg lwt mycoolprog.native
+```
+
+If you use your own build environment, please make sure that you have
+added `bap` as a dependency. We install our libraries using
+`ocamlfind` and you just need to add `bap` to your project. For
+example, if you use `oasis`, then you should add `bap` to the
+`BuildDepends` field. If you are using `ocamlbuild` with the
+`ocamlfind` plugin, then you should add `package(bap)` or `pkg_bap` to
+your `_tags` file.
+
+
 ## Using from top-level
 
-It is a good idea to learn how to use our library by playing in an OCaml
-top-level. If you have installed `utop`, then you can just use our `baptop`
-script to run `utop` with `bap` extensions:
+It may be a good idea to learn how to use our library by playing in an
+OCaml top-level. If you have installed `utop`, then you can just use
+our `baptop` script to run `utop` with `bap` extensions:
 
 ```bash
 $ baptop
@@ -79,22 +142,8 @@ dependencies, install top-level printers, etc.
 
 ## Using from Python
 
-You can install `bap` python bindings with `pip`.
-
-```bash
-$ pip install git+git://github.com/BinaryAnalysisPlatform/bap.git
-```
-
-
-Instead of git path you can also use a local one. Adjust it according
-to your setup. Also, you may need to use `sudo` or to activate your
-`virtualenv` if you're using one.
-
-If you don't like `pip`, then you can just go to `bap/python` folder
-and copy-paste the contents to whatever place you like, and use it as
-desired.
-
-After bindings are properly installed, you can start to use it:
+After BAP and python bindings are properly installed, you can start to
+use it:
 
 ```python
 >>> import bap
@@ -124,9 +173,9 @@ For more information, read builtin documentation, for example with
 
 ## Using from shell
 
-Bap is shipped with `bap-objdump` utility that can disassemble files,
-and printout dumps in different formats, including plain text, json,
-dot, html. The example of `bap-objdump` output is:
+Bap is shipped with `bap` utility that can disassemble files, and
+printout dumps in different formats, including plain text, json, dot,
+html. The example of `bap` output is:
 
 ```asm
 begin(to_uchar)
@@ -162,8 +211,14 @@ dot, html. The example of `bap-objdump` output is:
 }
 ```
 
-Also we're shipping a `bap-mc` executable that can disassemble arbitrary
-strings. Read `bap-mc --help` for more information.
+Also we're shipping a `bap-mc` executable that can disassemble
+arbitrary strings and output them in a plethora of formats. Read
+`bap-mc --help` for more information. `bap-byteweight` utility can be
+used to evaluate our `byteweight` algorithm for finding symbols inside
+the binary. It is also a supporting toolkit for byteweight
+infrastructure, it can download, create and install binary signatures,
+used for identification.
+
 
 ## Using from other languages
 
@@ -175,49 +230,26 @@ shipped with bap by default. You can talk with server using `HTTP`
 protocol, or extend it with any other transporting protocol you would
 like.
 
-
-## Compiling your program with `bap`
-
-Similar to the top-level, you can use our `bapbuild` script to compile a program
-that uses `bap` without tackling with the build system. For example, if your
-program is `mycoolprog.ml`, then you can execute:
-
-```bash
-$ bapbuild mycoolprog.native
-```
-
-and you will obtain `mycoolprog.native`. If `bapbuild` complains that something
-is missing, make sure that you didn't skip the [Installation](#Installation)
-phase. You can add your own dependencies with a `-package` command line option.
-
-If you have other dependencies, you can compile it using `pkg` flag, like this
-
-```bash
-$ bapbuild -pkg lwt mycoolprog.native
-```
-
-If you use your own build environment, please make sure that you have added
-`bap` as a dependency. We install our libraries using `ocamlfind` and you just
-need to add `bap` to your project. For example, if you use `oasis`, then you
-should add `bap` to the `BuildDepends` field. If you are using `ocamlbuild` with
-the `ocamlfind` plugin, then you should add `package(bap)` or `pkg_bap` to your
-`_tags` file.
-
 ## Extending BAP
 
-BAP can be extended using plugin system. That means, that you can use
-`bap` library, to extend the `bap` library! See our
+We always welcome contributions. If you want to add new
+code, or fix a bug, feel free to clone us, and create a pull request.
+
+But BAP can also be extended in a non-invasive way, using the plugin
+system. That means, that you can use `bap` library, to extend the
+`bap` library! See our
 [blog](http://binaryanalysisplatform.github.io/bap_plugins/) for more
 information.
 
-
 ## Learning BAP
 
 The best source of information about BAP is it's source code, that is
 well-documented. There are also
 [blog](http://binaryanalysisplatform.github.io/bap_plugins/) and
 [wiki](https://github.com/BinaryAnalysisPlatform/bap/wiki/), where you
-can find some useful information.
+can find some useful information.
Also, we have a permanently manned
+chat in case of emergency. Look at the badge on top of the README file,
+and feel free to join.
 
 # License
 
diff --git a/_oasis b/_oasis
index 3181d7d40..cea3ed520 100644
--- a/_oasis
+++ b/_oasis
@@ -1,6 +1,6 @@
 OASISFormat: 0.4
 Name: bap
-Version: 0.9.5
+Version: 0.9.6
 Synopsis: BAP Core Library
 Authors: BAP Team
 Maintainers: Ivan Gotovchits
diff --git a/lib/bap/bap_plugins.ml b/lib/bap/bap_plugins.ml
index 5c6746d63..560b27d59 100644
--- a/lib/bap/bap_plugins.ml
+++ b/lib/bap/bap_plugins.ml
@@ -5,8 +5,8 @@ module Std = struct
   type plugin = Plugin.t
 
   let systems = [
-    "bap.image";
-    "bap.disasm"
+    "bap.loader";
+    "bap.disasm";
   ]
 
   let internal = [
@@ -32,8 +32,6 @@ module Std = struct
                (Bap_plugin.name pkg) system
                (Error.to_string_hum err)))
 
-
-
   let all () =
     List.map systems ~f:(fun s -> Bap_plugin.find_all ~system:s) |>
     List.concat
diff --git a/opam b/opam
index 8668e655e..f12a999b7 100644
--- a/opam
+++ b/opam
@@ -11,7 +11,6 @@ build: [
   ["./configure"
     "--prefix=%{prefix}%"
     "--with-cxx=`which clang++`"
-    "--enable-docs"
     "--docdir=%{doc}%/bap"
     "--mandir=%{man}%"]
   [make]
@@ -19,6 +18,7 @@ build: [
 ]
 install: [
   [make "install"]
+  ["bap-byteweight" "update"]
 ]
 
 remove: [
diff --git a/src/readbin/readbin.ml b/src/readbin/readbin.ml
index a48e17ecf..542b87d05 100644
--- a/src/readbin/readbin.ml
+++ b/src/readbin/readbin.ml
@@ -13,7 +13,35 @@ module Program(Conf : Options.Provider) = struct
     try Sys.getenv "BAP_PLUGIN_PATH" |> String.split ~on:':'
     with Not_found -> []
 
+  (** [create_plugin system name file] if file is not [None]
+      then create a plugin targeting this file, otherwise
+      search for the plugin with a given [system] and [name]
+      in the system. Return an error, if nothing found.
*)
+  let create_plugin system name = function
+    | Some file -> Ok (Plugin.create ~system file)
+    | None ->
+      Plugin.find_all ~system |>
+      List.filter ~f:(fun p -> Plugin.name p = name) |> function
+      | [] ->
+        errorf "Failed to find plugin in path or in system, \
+                try to use -L option or set \
+                BAP_PLUGIN_PATH environment variable"
+      | _ :: _ :: _ ->
+        errorf "The plugin name is ambiguous, as I found more \
+                than one plugin named '%s' in your system." name
+      | [p] -> Ok p
+
+  (** [load_plugin name] if [name] or [name.plugin] points to a
+      file then load it, otherwise search for the plugin of a
+      system "bap.project" with the given [name] using findlib.
+      Bail-out with error if nothing found.
+      Once plugin is loaded it is checked that it registered itself
+      under the system. *)
   let load_plugin name =
+    let system = "bap.project" in
+    let name =
+      if Filename.check_suffix name ".plugin"
+      then name else name ^ ".plugin" in
     let before = Project.plugins () |> List.length in
     let paths = [
       [FileUtil.pwd ()]; paths_of_env (); options.load_path
@@ -21,11 +49,7 @@ module Program(Conf : Options.Provider) = struct
     List.find_map paths ~f:(fun dir ->
         let path = Filename.concat dir name in
         Option.some_if (Sys.file_exists path) path) |>
-    Result.of_option
-      ~error:(Error.of_string "Failed to find plugin in path, \
-                               try to use -L option or set \
-                               BAP_PLUGIN_PATH environment variable")
-    >>| Plugin.create ~system:"program" >>= Plugin.load >>= fun () ->
+    create_plugin system name >>= Plugin.load >>= fun () ->
     if List.length (Project.plugins ()) = before
     then errorf "Plugin %s didn't register itself" name
     else return ()
@@ -38,8 +62,6 @@ module Program(Conf : Options.Provider) = struct
         | None -> None
         | Some arg -> Some ("--" ^ arg))
 
-
-
   type bound = [`min | `max] with sexp
   type spec = [`name | bound] with sexp
 
@@ -52,7 +74,6 @@ module Program(Conf : Options.Provider) = struct
     | `bil
   ] with sexp
 
-
   let subst_of_string = function
     | "region" | "region_name" -> Some (`region `name)
     |
"region_addr" | "region_min_addr" -> Some (`region `min)
@@ -228,49 +249,32 @@ module Program(Conf : Options.Provider) = struct
       merge_syms img_syms |>
       rename_symbols usr_syms |>
       merge_syms usr_syms in
-    let annots =
-      Option.value_map img ~default:Memmap.empty ~f: Image.memory in
+    let memory =
+      Option.value_map img ~default:Memmap.empty ~f:Image.memory in
+    let memory =
+      Table.foldi syms ~init:memory ~f:(fun mem sym map ->
+          Memmap.add map mem (Tag.create Image.symbol sym)) in
     List.iter options.plugins ~f:(fun name ->
-        let name = if Filename.check_suffix name ".plugin" then
-            name else (name ^ ".plugin") in
         match load_plugin name with
         | Ok () -> ()
         | Error err ->
           let msg = asprintf "Failed to load plugin %s"
               (Filename.basename name) in
           Error.raise (Error.tag err msg));
 
-    let module Target = (val target_of_arch arch) in
-
-    let make_project memory symbols =
-      let module H = Helpers.Make(struct
-          let options = options
-          let cfg = Disasm.blocks disasm
-          let base = mem
-          let syms = syms
-          let arch = arch
-          module Target = Target
-        end) in Project.({
-          memory;
-          storage = String.Map.empty;
-          symbols;
-          arch; base = mem;
-          disasm;
-        }) in
 
     let project =
-      List.fold2_exn ~init:(make_project annots syms)
-        options.plugins
-        (Project.plugins ())
-        ~f:(fun p name visit ->
-            let argv = prepare_args Sys.argv name in
-            visit argv (make_project p.memory p.symbols)) |>
+      List.fold2_exn options.plugins (Project.plugins ()) ~init:{
+        arch; disasm; memory; storage = String.Map.empty;
+        symbols = syms; base = mem
+      } ~f:(fun p name f -> f (prepare_args Sys.argv name) p) |>
       substitute in
 
     Option.iter options.emit_ida_script (fun dst ->
         Out_channel.write_all dst
           ~data:(Idapy.extract_script project.memory));
 
+    let module Target = (val target_of_arch arch) in
     let module Env = struct
       let options = options
       let cfg = Disasm.blocks project.disasm