On-boarding

This page provides learning resources for new GREASE developers.

Binary analysis

General background

Try it out!

  • Use bintools
    • Use llvm-dwarfdump
  • Download Ghidra, look at an ELF program

LLVM

Try it out!

  • Compile a program to .ll with clang -fno-discard-value-names -emit-llvm -grecord-gcc-switches -S -O1 test.c
  • Do the same thing using compiler explorer
  • Assemble it to .bc with llvm-as test.bc, disassemble it again with llvm-dis

Under-constrained symbolic execution

Dependently-typed Haskell

GREASE and its underlying libraries are programmed using depedently-typed Haskell.

Singletons

This is the name of the approach of combining GADTs and type families to achieve depedently-typed programming. The main paper is Dependently typed programming with singletons.

The singletons library uses Template Haskell where parameterized-utils encourages a more explicit style.

parameterized-utils

This is our core dependently-typed programming library.

Idioms

See Fancy Haskell idioms for some examples of idioms that arise when doing "fancy" Haskell programming.

Try it out!

Exercises

Libraries

GREASE makes use of several libraries for binary analysis and symbolic execution.

Macaw

Macaw performs code discovery: given some set of entrypoints (generally addresses of functions, generally from the symbol table), Macaw recovers the CFGs (bounds of and edges between basic blocks) and translates them into its own IR. Macaw CFGs can then be turned into Crucible CFGs for symbolic execution.

The closest thing out there to the Macaw/Crucible combo is angr.

Try it out!

Prepare to build Macaw (see the latest build instructions in the README):

git clone https://github.com/GaloisInc/macaw
cd macaw
git submodule update --init
ln -s cabal.project.dist cabal.project

Then print the Macaw CFGs in a test ELF file:

cabal run exe:macaw-x86-dump -- discover x86_symbolic/tests/pass/identity.opt.exe

Pass --crucible to see the Crucible CFGs too.

Crucible

Crucible is an imperative IR. It has an SSA format with basic blocks. It uses dependently-typed programming to enforce well-typedness and the SSA invariant. It comes with a symbolic execution engine. There are many frontends that translate other languages into the Crucible IR, e.g., LLVM (e.g., from C and C++ via Clang), JVM (e.g., from Java), MIR (e.g., from Rust), WebAssembly, and crucially for us, machine code (via Macaw).

Try it out!

First, install Yices and Z3. Then, Prepare to build Crucible (see the latest build instructions in the README):

git clone https://github.com/GaloisInc/crucible
cd crucible
git submodule update --init

The run Crucible on some example IR files:

cabal run exe:crucible -- simulate crucible-cli/test-data/simulate/assert.cbl

What4

What4 is our internal representation of symbolic terms. What4 values are manipulated by Crucible programs. What4 wraps a variety of SMT solvers to make queries about terms.