Transforming Term Values
The three primary functions of SAW are extracting models (Term
values) from programs, transforming those models, and proving
properties about models using external provers. We’ve seen how to construct
Term
values from a range of sources. Now we show how
to use the various term transformation features available in SAW.
Rewriting
Rewriting a Term
consists of applying one or more rewrite rules to
it, resulting in a new Term
. A rewrite rule in SAW can be specified in
multiple ways:
as the definition of a function that can be unfolded,
as a term of Boolean type (or a function returning a Boolean) that is an equality statement, and
as a term of equality type with a body that encodes a proof that the equality in the type is valid.
In each case the term logically consists of two sides and describes a way to transform the left side into the right side. Each side may contain variables (bound by enclosing lambda expressions) and is therefore a pattern which can match any term in which each variable represents an arbitrary sub-term. The left-hand pattern describes a term to match (which may be a sub-term of the full term being rewritten), and the right-hand pattern describes a term to replace it with. Any variable in the right-hand pattern must also appear in the left-hand pattern and will be instantiated with whatever sub-term matched that variable in the original term.
For example, say we have the following Cryptol function:
\(x:[8]) -> (x * 2) + 1
We might for some reason want to replace multiplication by a power of two with a shift. We can describe this replacement using an equality statement in Cryptol (a rule of form 2 above):
\(y:[8]) -> (y * 2) == (y << 1)
Interpreting this as a rewrite rule, it says that for any 8-bit vector
(call it y
for now), we can replace y * 2
with y << 1
. Using this
rule to rewrite the earlier expression would then yield:
\(x:[8]) -> (x << 1) + 1
The general philosophy of rewriting is that the left and right patterns,
while syntactically different, should be semantically equivalent.
Therefore, applying a set of rewrite rules should not change the
fundamental meaning of the term being rewritten. SAW is particularly
focused on the task of proving that some logical statement expressed as
a Term
is always true. If that is in fact the case, then the entire
term can be replaced by the term True
without changing its meaning. The
rewriting process can in some cases, by repeatedly applying rules that
themselves are known to be valid, reduce a complex term entirely to
True
, which constitutes a proof of the original statement. In other
cases, rewriting can simplify terms before sending them to external
automated provers that can then finish the job. Sometimes this
simplification can help the automated provers run more quickly, and
sometimes it can help them prove things they would otherwise be unable
to prove by applying reasoning steps (rewrite rules) that are not
available to the automated provers.
In practical use, rewrite rules can be aggregated into Simpset
values in SAWScript. A few pre-defined Simpset
values exist:
empty_ss : Simpset
is the empty set of rules. Rewriting with it should have no effect, but it is useful as an argument to some of the functions that construct largerSimpset
values.basic_ss : Simpset
is a collection of rules that are useful in most proof scripts.cryptol_ss : () -> Simpset
includes a collection of Cryptol-specific rules. Some of these simplify away the abstractions introduced in the translation from Cryptol to SAWCore, which can be useful when proving equivalence between Cryptol and non-Cryptol code. Leaving these abstractions in place is appropriate when comparing only Cryptol code, however, socryptol_ss
is not included inbasic_ss
.
The following function can apply a Simpset
:
rewrite : Simpset -> Term -> Term
applies aSimpset
to an existingTerm
to produce a newTerm
.
To make this more concrete, we examine how the rewriting example
sketched above, to convert multiplication into shift, can work in
practice. We simplify everything with cryptol_ss
as we go along so
that the Term
s don’t get too cluttered. First, we declare the term
to be transformed:
sawscript> let term = rewrite (cryptol_ss ()) {{ \(x:[8]) -> (x * 2) + 1 }}
sawscript> print_term term
\(x : Prelude.Vec 8 Prelude.Bool) ->
Prelude.bvAdd 8 (Prelude.bvMul 8 x (Prelude.bvNat 8 2))
(Prelude.bvNat 8 1)
Next, we declare the rewrite rule:
sawscript> let rule = rewrite (cryptol_ss ()) {{ \(y:[8]) -> (y * 2) == (y << 1) }}
sawscript> print_term rule
let { x@1 = Prelude.Vec 8 Prelude.Bool
}
in \(y : x@1) ->
Cryptol.ecEq x@1 (Cryptol.PCmpWord 8)
(Prelude.bvMul 8 y (Prelude.bvNat 8 2))
(Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False y
(Prelude.bvNat 1 1))
The primary interface to rewriting uses the Theorem
type instead of
the Term
type, as shown in the signatures for addsimp
and
addsimps
.
addsimp : Theorem -> Simpset -> Simpset
adds a singleTheorem
to aSimpset
.addsimps : [Theorem] -> Simpset -> Simpset
adds severalTheorem
values to aSimpset
.
A Theorem
is essentially a Term
that is proven correct in some way.
In general, a Theorem
can be any statement, and may not be useful as a
rewrite rule. However, if it has an appropriate shape it can be used for
rewriting. In the “Proofs about Terms” section,
we’ll describe how to construct Theorem
values from Term
values.
For the time being, we’ll assume we’ve proved our rule
term correct in
some way, and have a Theorem
named rule_thm
.
Finally, we apply the rule to the target term:
sawscript> let result = rewrite (addsimp rule_thm empty_ss) term
sawscript> print_term result
\(x : Prelude.Vec 8 Prelude.Bool) ->
Prelude.bvAdd 8
(Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False x
(Prelude.bvNat 1 1))
(Prelude.bvNat 8 1)
In the absence of user-constructed Theorem
values, there are some
additional built-in rules that are not included in either basic_ss
and
cryptol_ss
because they are not always beneficial, but that can
sometimes be helpful or essential. The cryptol_ss
simpset includes
rewrite rules to unfold all definitions in the Cryptol
SAWCore module,
but does not include any of the terms of equality type.
add_cryptol_defs : [String] -> Simpset -> Simpset
adds unfolding rules for functions with the given names from the SAWCoreCryptol
module to the givenSimpset
.add_cryptol_eqs : [String] -> Simpset -> Simpset
adds the terms of equality type with the given names from the SAWCoreCryptol
module to the givenSimpset
.add_prelude_defs : [String] -> Simpset -> Simpset
adds unfolding rules from the SAWCorePrelude
module to aSimpset
.add_prelude_eqs : [String] -> Simpset -> Simpset
adds equality-typed terms from the SAWCorePrelude
module to aSimpset
.
Finally, it’s possible to construct a theorem from an arbitrary SAWCore
expression (rather than a Cryptol expression), using the core_axiom
function.
core_axiom : String -> Theorem
creates aTheorem
from aString
in SAWCore syntax. AnyTheorem
introduced by this function is assumed to be correct, so use it with caution.
Folding and Unfolding
A SAWCore term can be given a name using the define
function, and is
then by default printed as that name alone. A named subterm can be
“unfolded” so that the original definition appears again.
define : String -> Term -> TopLevel Term
unfold_term : [String] -> Term -> Term
For example:
sawscript> let t = {{ 0x22 }}
sawscript> print_term t
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
sawscript> t' <- define "t" t
sawscript> print_term t'
t
sawscript> let t'' = unfold_term ["t"] t'
sawscript> print_term t''
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
This process of folding and unfolding is useful both to make large terms easier for humans to work with and to make automated proofs more tractable. We’ll describe the latter in more detail when we discuss interacting with external provers.
In some cases, folding happens automatically when constructing Cryptol expressions. Consider the following example:
sawscript> let t = {{ 0x22 }}
sawscript> print_term t
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
sawscript> let {{ t' = 0x22 }}
sawscript> print_term {{ t' }}
t'
This illustrates that a bare expression in Cryptol braces gets
translated directly to a SAWCore term. However, a Cryptol definition
gets translated into a folded SAWCore term. In addition, because the
second definition of t
occurs at the Cryptol level, rather than the
SAWScript level, it is visible only inside Cryptol braces. Definitions
imported from Cryptol source files are also initially folded and can be
unfolded as needed.
Other Built-in Transformation and Inspection Functions
In addition to the Term
transformation functions described so far, a
variety of others also exist.
beta_reduce_term : Term -> Term
takes any sub-expression of the form(\x -> t) v
in the givenTerm
and replaces it with a transformed version oft
in which all instances ofx
are replaced byv
.replace : Term -> Term -> Term -> TopLevel Term
replaces arbitrary subterms. A call toreplace x y t
replaces any instance ofx
insidet
withy
.
Assessing the size of a term can be particularly useful during benchmarking. SAWScript provides two mechanisms for this.
term_size : Term -> Int
calculates the number of nodes in the Directed Acyclic Graph (DAG) representation of aTerm
used internally by SAW. This is the most appropriate way of determining the resource use of a particular term.term_tree_size : Term -> Int
calculates how large aTerm
would be if it were represented by a tree instead of a DAG. This can, in general, be much, much larger than the number returned byterm_size
, and serves primarily as a way of assessing, for a specific term, how much benefit there is to the term sharing used by the DAG representation.
Finally, there are a few commands related to the internal SAWCore type of a
Term
.
check_term : Term -> TopLevel ()
checks that the internal structure of aTerm
is well-formed and that it passes all of the rules of the SAWCore type checker.type : Term -> Type
returns the type of a particularTerm
, which can then be used to, for example, construct a new fresh variable withfresh_symbolic
.
Loading and Storing Terms
Most frequently, Term
values in SAWScript come from Cryptol, JVM, or
LLVM programs, or some transformation thereof. However, it is also
possible to obtain them from various other sources.
parse_core : String -> Term
parses aString
containing a term in SAWCore syntax, returning aTerm
.read_core : String -> TopLevel Term
is likeparse_core
, but obtains the text from the given file and expects it to be in the simpler SAWCore external representation format, rather than the human-readable syntax shown so far.read_aig : String -> TopLevel Term
returns aTerm
representation of an And-Inverter-Graph (AIG) file in AIGER format.read_bytes : String -> TopLevel Term
reads a constant sequence of bytes from a file and represents it as aTerm
. Its result will always have Cryptol type[n][8]
for somen
.
It is also possible to write Term
values into files in various
formats, including: AIGER (write_aig
), CNF (write_cnf
), SAWCore
external representation (write_core
), and SMT-Lib version 2
(write_smtlib2
).
write_aig : String -> Term -> TopLevel ()
write_cnf : String -> Term -> TopLevel ()
write_core : String -> Term -> TopLevel ()
write_smtlib2 : String -> Term -> TopLevel ()