MATE Python Notebooks¶
MATE has a Python API for querying the CPG and exposes browser-based, interactive Jupyter notebooks with this query interface pre-loaded. These notebooks can be used to write complex, recursive, whole-program queries that answer detailed questions like “What sequences of function calls can lead from point A to point B in this program?” or “Can user input flow into a memory location with a specific struct type, and from there to some particular function without passing through one of these three sanitization routines?” These notebooks can be used for one-off explorations, or as a platform for users to build reusable apps on the MATE platform (such as UsageFinder).
The MATE notebook server is exposed via web interface at http://localhost:8889/.
See Notebook Tutorial for a hands-on guide to finding a bug with MATE notebooks.
Create a notebook¶
Navigate to the MATE notebook server at http://localhost:8889/ and use the “New” dropdown to create a new Python3 notebook.
Optional: click the notebook name (initially “Untitled”) to give it a more descriptive name.
Load the desired code property graph¶
Within a Python notebook, you need to identify the Code Property Graph you wish to query. You’ll need the Build Id for the target you’re interested in. You can copy it from the MATE dashboard: http://localhost:8050/
Now, in your MATE notebook enter the following into the first cell, replacing the placeholder Build Id with the one copied from the MATE dashboard:
session = db.new_session()
## TODO: replace the build ID in the next line with the ID from the dashboard
b = session.query(db.Build).get("fd60a24c857647a4b6707fea56a69db8")
g = session.graph_from_build(b)
print(f"Graph loaded with {session.query(g.Node).count()} Nodes and {session.query(g.Edge).count()} Edges")
You’ll know it’s working if you get nonzero number of nodes and edges as output.
Query the code property graph¶
The MATE notebook uses
SQLAlchemy to expose the CPG as Python objects.
See CPG Query API for more information and API documentation for a complete reference. The reference documentation is
also available inside Python via the help
function.
Below are some examples queries.
Each assume session
, b
, and g
have been initialized as described above.
Print every external (e.g. located in libc
) function, each followed by a list of all the application functions that invoke it:
for f in session.query(g.Function).filter_by(is_declaration=True).all():
print("### '", f.name, "' is invoked by:")
for c in f.callers:
print("*", c.name)
CPGs are made of nodes and edges. Some useful utility functions:
# print node IDs <-> llvm for a Function object
def show_llvm(f):
for b in f.blocks:
print(f"### {b} ###")
for i in b.instructions:
print(f"{i} {i.attributes['pretty_string']}")
# print node IDs <-> llvm for a Function given a function name
def show_llvm_fname(fname):
show_llvm(session.query(g.Function).filter_by(demangled_name=fname).one())
# helper: turn a node UUID into the corresponding Node object
def nid(uuid):
return session.query(g.Node).filter_by(uuid=str(uuid)).one()