Structure of SAWScript
A SAWScript program consists, at the top level, of a sequence of
commands to be executed in order. Each command is terminated with a
semicolon. For example, the print
command displays a textual
representation of its argument. Suppose the following text is stored in
the file print.saw
:
print 3;
The command saw print.saw
will then yield output similar to the
following:
Loading module Cryptol
Loading file "print.saw"
3
The same code can be run from the interactive REPL:
sawscript> print 3;
3
At the REPL, terminating semicolons can be omitted:
sawscript> print 3
3
To make common use cases simpler, bare values at the REPL are treated as
if they were arguments to print
:
sawscript> 3
3
One SAWScript file can be included in another using the include
command, which takes the name of the file to be included as an argument.
For example:
sawscript> include "print.saw"
Loading file "print.saw"
3
Typically, included files are used to import definitions, not perform side effects like printing. However, as you can see, if any commands with side effects occur at the top level of the imported file, those side effects will occur during import.
Parts of a SAW Script
Warning
This section is under construction!
A First Simple Example
Warning
This section is under construction!
To get started with SAW, let’s see what it takes to verify simple programs that do not use pointers (or that use them only internally). Consider, for instance the C program that adds its two arguments together:
#include <stdint.h>
uint32_t add(uint32_t x, uint32_t y) {
return x + y;
}
We can specify this function’s expected behavior as follows:
let add_setup = do {
x <- llvm_fresh_var "x" (llvm_int 32);
y <- llvm_fresh_var "y" (llvm_int 32);
llvm_execute_func [llvm_term x, llvm_term y];
llvm_return (llvm_term {{ x + y : [32] }});
};
We can then compile the C file add.c
into the bitcode file add.bc
and verify it with ABC:
m <- llvm_load_module "add.bc";
add_ms <- llvm_verify m "add" [] false add_setup abc;
Syntax
The syntax of SAWScript is reminiscent of functional languages such as
Cryptol, Haskell and ML. In particular, functions are applied by
writing them next to their arguments rather than by using parentheses
and commas. Rather than writing f(x, y)
, write f x y
.
Comments are written as in C, Java, and Rust (among many other languages). All
text from //
until the end of a line is ignored. Additionally, all
text between /*
and */
is ignored, regardless of whether the line
ends.
Basic Types and Values
All values in SAWScript have types, and these types are determined and checked before a program runs (that is, SAWScript is statically typed). The basic types available are similar to those in many other languages.
The
Int
type represents unbounded mathematical integers. Integer constants can be written in decimal notation (e.g.,42
), hexadecimal notation (0x2a
), and binary (0b00101010
). However, unlike many languages, integers in SAWScript are used primarily as constants. Arithmetic is usually encoded in Cryptol, as discussed in the next section.The Boolean type,
Bool
, contains the valuestrue
andfalse
, like in many other languages. As with integers, computations on Boolean values usually occur in Cryptol.Values of any type can be aggregated into tuples. For example, the value
(true, 10)
has the type(Bool, Int)
.Values of any type can also be aggregated into records, which are exactly like tuples except that their components have names. For example, the value
{ b = true, n = 10 }
has the type{ b : Bool, n : Int }
.A sequence of values of the same type can be stored in a list. For example, the value
[true, false, true]
has the type[Bool]
.Strings of textual characters can be represented in the
String
type. For example, the value"example"
has typeString
.The “unit” type, written
()
, is essentially a placeholder, similar tovoid
in languages like C and Java. It has only one value, also written()
. Values of type()
convey no information. We will show in later sections several cases where this is useful.Functions are given types that indicate what type they consume and what type they produce. For example, the type
Int -> Bool
indicates a function that takes anInt
as input and produces aBool
as output. Functions with multiple arguments use multiple arrows. For example, the typeInt -> String -> Bool
indicates a function in which the first argument is anInt
, the second is aString
, and the result is aBool
. It is possible, but not necessary, to group arguments in tuples, as well, so the type(Int, String) -> Bool
describes a function that takes one argument, a pair of anInt
and aString
, and returns aBool
.
SAWScript also includes some more specialized types that do not have straightforward counterparts in most other languages. These will appear in later sections.
Basic Expression Forms
One of the key forms of top-level command in SAWScript is a binding,
introduced with the let
keyword, which gives a name to a value. For
example:
sawscript> let x = 5
sawscript> x
5
Bindings can have parameters, in which case they define functions. For instance, the following function takes one parameter and constructs a list containing that parameter as its single element.
sawscript> let f x = [x]
sawscript> f "text"
["text"]
Functions themselves are values and have types. The type of a function
that takes an argument of type a
and returns a result of type b
is
a -> b
.
Function types are typically inferred, as in the example f
above. In this case, because f
only creates a list with the given
argument, and because it is possible to create a list of any element
type, f
can be applied to an argument of any type. We say, therefore,
that f
is polymorphic. Concretely, we write the type of f
as {a} a -> [a]
, meaning it takes a value of any type (denoted a
) and
returns a list containing elements of that same type. This means we can
also apply f
to 10
:
sawscript> f 10
[10]
However, we may want to specify that a function has a more
specific type. In this case, we could restrict f
to operate only on
Int
parameters.
sawscript> let f (x : Int) = [x]
This will work identically to the original f
on an Int
parameter:
sawscript> f 10
[10]
However, it will fail for a String
parameter:
sawscript> f "text"
type mismatch: String -> t.0 and Int -> [Int]
at "_" (REPL)
mismatched type constructors: String and Int
Type annotations can be applied to any expression. The notation (e : t)
indicates that expression e
is expected to have type t
and
that it is an error for e
to have a different type. Most types in
SAWScript are inferred automatically, but specifying them explicitly can
sometimes enhance readability.
Because functions are values, functions can return other functions. We
make use of this feature when writing functions of multiple arguments.
Consider the function g
, similar to f
but with two arguments:
sawscript> let g x y = [x, y]
Like f
, g
is polymorphic. Its type is {a} a -> a -> [a]
. This
means it takes an argument of type a
and returns a function that
takes an argument of the same type a
and returns a list of a
values.
We can therefore apply g
to any two arguments of the same type:
sawscript> g 2 3
[2,3]
sawscript> g true false
[true,false]
But type checking will fail if we apply it to two values of different types:
sawscript> g 2 false
type mismatch: Bool -> t.0 and Int -> [Int]
at "_" (REPL)
mismatched type constructors: Bool and Int
So far we have used two related terms, function and command, and we
take these to mean slightly different things. A function is any value
with a function type (e.g., Int -> [Int]
). A command is any value with
a special command type (e.g. TopLevel ()
, as shown below). These
special types allow us to restrict command usage to specific contexts,
and are also parameterized (like the list type). Most but not all
commands are also functions.
The most important command type is the TopLevel
type, indicating a
command that can run at the top level (directly at the REPL, or as one
of the top level commands in a script file). The print
command
has the type {a} a -> TopLevel ()
, where TopLevel ()
means that it
is a command that runs in the TopLevel
context and returns a value of
type ()
(that is, no useful information). In other words, it has a
side effect (printing some text to the screen) but doesn’t produce any
information to use in the rest of the SAWScript program. This is the
primary usage of the ()
type.
It can sometimes be useful to bind a sequence of commands together. This
can be accomplished with the do { ... }
construct. For example:
sawscript> let print_two = do { print "first"; print "second"; }
sawscript> print_two
first
second
The bound value, print_two
, has type TopLevel ()
, since that is the
type of its last command.
Note that in the previous example the printing doesn’t occur until
print_two
directly appears at the REPL. The let
expression does not
cause those commands to run. The construct that runs a command is
written using the <-
operator. This operator works like let
except
that it says to run the command listed on the right hand side and bind
the result, rather than binding the variable to the command itself.
Using <-
instead of let
in the previous example yields:
sawscript> print_two <- do { print "first"; print "second"; }
first
second
sawscript> print print_two
()
Here, the print
commands run first, and then print_two
gets the
value returned by the second print
command, namely ()
. Any command
run without using <-
at either the top level of a script or within a
do
block discards its result. However, the REPL prints the result of
any command run without using the <-
operator.
In some cases it can be useful to have more control over the value
returned by a do
block. The return
command allows us to do this. For
example, say we wanted to write a function that would print a message
before and after running some arbitrary command and then return the
result of that command. We could write:
let run_with_message msg c =
do {
print "Starting.";
print msg;
res <- c;
print "Done.";
return res;
};
x <- run_with_message "Hello" (return 3);
print x;
If we put this script in run.saw
and run it with saw
, we get
something like:
Loading module Cryptol
Loading file "run.saw"
Starting.
Hello
Done.
3
Note that it ran the first print
command, then the caller-specified
command, then the second print
command. The result stored in x
at
the end is the result of the return
command passed in as an argument.
Other Basic Functions
Aside from the functions we have listed so far, there are a number of other operations for working with basic data structures and interacting with the operating system.
The following functions work on lists:
concat : {a} [a] -> [a] -> [a]
takes two lists and returns the concatenation of the two.head : {a} [a] -> a
returns the first element of a list.tail : {a} [a] -> [a]
returns everything except the first element.length : {a} [a] -> Int
counts the number of elements in a list.null : {a} [a] -> Bool
indicates whether a list is empty (has zero elements).nth : {a} [a] -> Int -> a
returns the element at the given position, withnth l 0
being equivalent tohead l
.for : {m, a, b} [a] -> (a -> m b) -> m [b]
takes a list and a function that runs in some command context. The passed command will be called once for every element of the list, in order. Returns a list of all of the results produced by the command.
For interacting with the operating system, we have:
get_opt : Int -> String
returns the command-line argument tosaw
at the given index. Argument 0 is always the name of thesaw
executable itself, and higher indices represent later arguments.exec : String -> [String] -> String -> TopLevel String
runs an external program given, respectively, an executable name, a list of arguments, and a string to send to the standard input of the program. Theexec
command returns the standard output from the program it executes and prints standard error to the screen.exit : Int -> TopLevel ()
stops execution of the current script and returns the given exit code to the operating system.
Finally, there are a few miscellaneous functions and commands:
show : {a} a -> String
computes the textual representation of its argument in the same way asprint
, but instead of displaying the value it returns it as aString
value for later use in the program. This can be useful for constructing more detailed messages later.str_concat : String -> String -> String
concatenates twoString
values, and can also be useful withshow
.time : {a} TopLevel a -> TopLevel a
runs any otherTopLevel
command and prints out the time it took to execute.with_time : {a} TopLevel a -> TopLevel (Int, a)
returns both the original result of the timed command and the time taken to execute it (in milliseconds), without printing anything in the process.
REPL Actions
There is an additional class of things that one may type at the REPL for interactive use:
:cd
changes the REPL’s current directory.:pwd
prints the REPL’s current directory.:env
displays the values and types of all currently bound variables, including built-in functions and commands.:search
with one or more types (complex types go in parentheses) searches the currently bound variables, including built-in functions and commands, and prints those that mention all the types cited. You can use_
as a wildcard. Free type variables are treated as pattern constraints; use forall-bound type variables using the{a}
syntax to search specifically for forall-bound types.:tenv
displays the expansions of all currently defined type aliases, including those that are built in.:type
or:t
checks and prints the type of an arbitrary SAWScript expression:
sawscript> :t show
{a.0} a.0 -> String
:help
or:h
prints the help text for a built-in function or command:
sawscript> :h show
Description
-----------
show : {a} a -> String
Convert the value of the given expression to a string.
:quit
or:q
exits the program.
Further built-in functions and commands
SAW contains many built-in operations, referred to as “primitives.” These appear in SAWScript as built-in functions and commands. The following sections of the manual will introduce many of these.
Experimental and deprecated functions and commands
Some of the primitives available in SAW at any given time are
experimental.
These may be incomplete, unfinished, use at your own risk, etc.
The functions and commands associated with these are unavailable
by default; they can be made visible with the enable_experimental
command.
Other primitives are considered deprecated. Some of these, as the deprecation process proceeds, are unavailable by default.
They can be made visible with the enable_deprecated
command.