Transforming Term Values

The three primary functions of SAW are extracting models (Term values) from programs, transforming those models, and proving properties about models using external provers. We’ve seen how to construct Term values from a range of sources. Now we show how to use the various term transformation features available in SAW.

Rewriting

Rewriting a Term consists of applying one or more rewrite rules to it, resulting in a new Term. A rewrite rule in SAW can be specified in multiple ways:

as the definition of a function that can be unfolded,
as a term of Boolean type (or a function returning a Boolean) that is an equality statement, and
as a term of equality type with a body that encodes a proof that the equality in the type is valid.

In each case the term logically consists of two sides and describes a way to transform the left side into the right side. Each side may contain variables (bound by enclosing lambda expressions) and is therefore a pattern which can match any term in which each variable represents an arbitrary sub-term. The left-hand pattern describes a term to match (which may be a sub-term of the full term being rewritten), and the right-hand pattern describes a term to replace it with. Any variable in the right-hand pattern must also appear in the left-hand pattern and will be instantiated with whatever sub-term matched that variable in the original term.

For example, say we have the following Cryptol function:

 \(x:[8]) -> (x * 2) + 1

We might for some reason want to replace multiplication by a power of two with a shift. We can describe this replacement using an equality statement in Cryptol (a rule of form 2 above):

\(y:[8]) -> (y * 2) == (y << 1)

Interpreting this as a rewrite rule, it says that for any 8-bit vector (call it y for now), we can replace y * 2 with y << 1. Using this rule to rewrite the earlier expression would then yield:

 \(x:[8]) -> (x << 1) + 1

The general philosophy of rewriting is that the left and right patterns, while syntactically different, should be semantically equivalent. Therefore, applying a set of rewrite rules should not change the fundamental meaning of the term being rewritten. SAW is particularly focused on the task of proving that some logical statement expressed as a Term is always true. If that is in fact the case, then the entire term can be replaced by the term True without changing its meaning. The rewriting process can in some cases, by repeatedly applying rules that themselves are known to be valid, reduce a complex term entirely to True, which constitutes a proof of the original statement. In other cases, rewriting can simplify terms before sending them to external automated provers that can then finish the job. Sometimes this simplification can help the automated provers run more quickly, and sometimes it can help them prove things they would otherwise be unable to prove by applying reasoning steps (rewrite rules) that are not available to the automated provers.

In practical use, rewrite rules can be aggregated into Simpset values in SAWScript. A few pre-defined Simpset values exist:

empty_ss : Simpset is the empty set of rules. Rewriting with it should have no effect, but it is useful as an argument to some of the functions that construct larger Simpset values.
basic_ss : Simpset is a collection of rules that are useful in most proof scripts.
cryptol_ss : () -> Simpset includes a collection of Cryptol-specific rules. Some of these simplify away the abstractions introduced in the translation from Cryptol to SAWCore, which can be useful when proving equivalence between Cryptol and non-Cryptol code. Leaving these abstractions in place is appropriate when comparing only Cryptol code, however, so cryptol_ss is not included in basic_ss.

The following function can apply a Simpset:

rewrite : Simpset -> Term -> Term applies a Simpset to an existing Term to produce a new Term.

To make this more concrete, we examine how the rewriting example sketched above, to convert multiplication into shift, can work in practice. We simplify everything with cryptol_ss as we go along so that the Terms don’t get too cluttered. First, we declare the term to be transformed:

sawscript> let term = rewrite (cryptol_ss ()) {{ \(x:[8]) -> (x * 2) + 1 }}
sawscript> print_term term
\(x : Prelude.Vec 8 Prelude.Bool) ->
  Prelude.bvAdd 8 (Prelude.bvMul 8 x (Prelude.bvNat 8 2))
    (Prelude.bvNat 8 1)

Next, we declare the rewrite rule:

sawscript> let rule = rewrite (cryptol_ss ()) {{ \(y:[8]) -> (y * 2) == (y << 1) }}
sawscript> print_term rule
let { x@1 = Prelude.Vec 8 Prelude.Bool
    }
 in \(y : x@1) ->
      Cryptol.ecEq x@1 (Cryptol.PCmpWord 8)
        (Prelude.bvMul 8 y (Prelude.bvNat 8 2))
        (Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False y
           (Prelude.bvNat 1 1))

The primary interface to rewriting uses the Theorem type instead of the Term type, as shown in the signatures for addsimp and addsimps.

addsimp : Theorem -> Simpset -> Simpset adds a single Theorem to a Simpset.
addsimps : [Theorem] -> Simpset -> Simpset adds several Theorem values to a Simpset.

A Theorem is essentially a Term that is proven correct in some way. In general, a Theorem can be any statement, and may not be useful as a rewrite rule. However, if it has an appropriate shape it can be used for rewriting. In the “Proofs about Terms” section, we’ll describe how to construct Theorem values from Term values.

For the time being, we’ll assume we’ve proved our rule term correct in some way, and have a Theorem named rule_thm.

Finally, we apply the rule to the target term:

sawscript> let result = rewrite (addsimp rule_thm empty_ss) term
sawscript> print_term result
\(x : Prelude.Vec 8 Prelude.Bool) ->
  Prelude.bvAdd 8
    (Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False x
       (Prelude.bvNat 1 1))
    (Prelude.bvNat 8 1)

In the absence of user-constructed Theorem values, there are some additional built-in rules that are not included in either basic_ss and cryptol_ss because they are not always beneficial, but that can sometimes be helpful or essential. The cryptol_ss simpset includes rewrite rules to unfold all definitions in the Cryptol SAWCore module, but does not include any of the terms of equality type.

add_cryptol_defs : [String] -> Simpset -> Simpset adds unfolding rules for functions with the given names from the SAWCore Cryptol module to the given Simpset.
add_cryptol_eqs : [String] -> Simpset -> Simpset adds the terms of equality type with the given names from the SAWCore Cryptol module to the given Simpset.
add_prelude_defs : [String] -> Simpset -> Simpset adds unfolding rules from the SAWCore Prelude module to a Simpset.
add_prelude_eqs : [String] -> Simpset -> Simpset adds equality-typed terms from the SAWCore Prelude module to a Simpset.

Finally, it’s possible to construct a theorem from an arbitrary SAWCore expression (rather than a Cryptol expression), using the core_axiom function.

core_axiom : String -> Theorem creates a Theorem from a String in SAWCore syntax. Any Theorem introduced by this function is assumed to be correct, so use it with caution.

Folding and Unfolding

A SAWCore term can be given a name using the define function, and is then by default printed as that name alone. A named subterm can be “unfolded” so that the original definition appears again.

define : String -> Term -> TopLevel Term
unfold_term : [String] -> Term -> Term

For example:

sawscript> let t = {{ 0x22 }}
sawscript> print_term t
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
  (Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
sawscript> t' <- define "t" t
sawscript> print_term t'
t
sawscript> let t'' = unfold_term ["t"] t'
sawscript> print_term t''
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
  (Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))

This process of folding and unfolding is useful both to make large terms easier for humans to work with and to make automated proofs more tractable. We’ll describe the latter in more detail when we discuss interacting with external provers.

In some cases, folding happens automatically when constructing Cryptol expressions. Consider the following example:

sawscript> let t = {{ 0x22 }}
sawscript> print_term t
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
  (Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
sawscript> let {{ t' = 0x22 }}
sawscript> print_term {{ t' }}
t'

This illustrates that a bare expression in Cryptol braces gets translated directly to a SAWCore term. However, a Cryptol definition gets translated into a folded SAWCore term. In addition, because the second definition of t occurs at the Cryptol level, rather than the SAWScript level, it is visible only inside Cryptol braces. Definitions imported from Cryptol source files are also initially folded and can be unfolded as needed.

Other Built-in Transformation and Inspection Functions

In addition to the Term transformation functions described so far, a variety of others also exist.

beta_reduce_term : Term -> Term takes any sub-expression of the form (\x -> t) v in the given Term and replaces it with a transformed version of t in which all instances of x are replaced by v.
replace : Term -> Term -> Term -> TopLevel Term replaces arbitrary subterms. A call to replace x y t replaces any instance of x inside t with y.

Assessing the size of a term can be particularly useful during benchmarking. SAWScript provides two mechanisms for this.

term_size : Term -> Int calculates the number of nodes in the Directed Acyclic Graph (DAG) representation of a Term used internally by SAW. This is the most appropriate way of determining the resource use of a particular term.
term_tree_size : Term -> Int calculates how large a Term would be if it were represented by a tree instead of a DAG. This can, in general, be much, much larger than the number returned by term_size, and serves primarily as a way of assessing, for a specific term, how much benefit there is to the term sharing used by the DAG representation.

Finally, there are a few commands related to the internal SAWCore type of a Term.

check_term : Term -> TopLevel () checks that the internal structure of a Term is well-formed and that it passes all of the rules of the SAWCore type checker.
type : Term -> Type returns the type of a particular Term, which can then be used to, for example, construct a new fresh variable with fresh_symbolic.

Loading and Storing Terms

Most frequently, Term values in SAWScript come from Cryptol, JVM, or LLVM programs, or some transformation thereof. However, it is also possible to obtain them from various other sources.

parse_core : String -> Term parses a String containing a term in SAWCore syntax, returning a Term.
read_core : String -> TopLevel Term is like parse_core, but obtains the text from the given file and expects it to be in the simpler SAWCore external representation format, rather than the human-readable syntax shown so far.
read_aig : String -> TopLevel Term returns a Term representation of an And-Inverter-Graph (AIG) file in AIGER format.
read_bytes : String -> TopLevel Term reads a constant sequence of bytes from a file and represents it as a Term. Its result will always have Cryptol type [n][8] for some n.

It is also possible to write Term values into files in various formats, including: AIGER (write_aig), CNF (write_cnf), SAWCore external representation (write_core), and SMT-Lib version 2 (write_smtlib2).

write_aig : String -> Term -> TopLevel ()
write_cnf : String -> Term -> TopLevel ()
write_core : String -> Term -> TopLevel ()
write_smtlib2 : String -> Term -> TopLevel ()