(sawscript-intro)=
# Quick Introduction to the SAWScript Language

SAWScript is the metalanguage for the SAW system: it is the framework
used to load specifications, state theorems, set up proofs, and solve
proof goals.
SAW's functionality is, in general, accessed by executing SAWScript
builtins from SAWScript scripts.

One can program in SAWScript; however, ordinary usage does not really
require it; it's generally sufficient to write sequences of builtin
invocations according to the manual and sometimes group them into
blocks.
However, there's enough language involved that we should stop briefly
for a quick introduction and tour.

As discussed in the previous chapter, the most common way to run SAW
(when using SAWScript at all) is on a file of SAWScript code,
although there's also a REPL.
In this chapter we'll outline what goes in those files and just enough
of the syntax and semantics to get started.
A fuller treatment can be found in [a later chapter](#sawscript).

A SAWScript file is a sequence of SAWScript statements executed in
order, top to bottom, each terminated with a semicolon.
Most of the statements you will use are calls to SAWScript builtin
functions, so the first topic is how to call a function.

This script:

:::{code-block} sawscript
print 3;
:::

calls the `print` builtin with the argument `3`, and will print
the following when run:

:::{code-block} console
Loading file "print.saw"
3
:::

Note that there are no parentheses around argument lists like there
are in C and Rust.
The syntax for function calls is like ML and Haskell: you just put
the argument next to the function name.

## Types

The basic types in SAWScript are:

| Name                          | Description    | Example values  |
| ----------------------------- | -------------- | --------------- |
| `Bool`                        | truth values   | `true`, `false` |
| `Int`                         | integers       | `3`             |
| `String`                      | strings        | `"abc"`         |
| `[`_`a`_`]`                   | lists of _`a`_ | `[1, 2, 3]`     |
| `()`                          | unit           | `()`            |
| `(`_`a`_`,` _`b`_`, ...)`     | tuples         | `(1, "abc")`    |
| _`a`_ `->` _`b`_              | functions      | `print`         |

There are many other types, most of them abstract builtin types for
working with different kinds of verification tasks.
These will be introduced in later chapters.

Functions are curried; that means a function of two arguments
is the same type as a function of one argument that returns a
function taking the other argument.

## Looking up Builtins

To look up a builtin, start the REPL and use the `:h` (help) command
with the builtin name:

:::{code-block} console
sawscript> :h print
Description
-----------

    print : {a} a -> TopLevel ()

Print the value of the given expression.
sawscript> 
:::

This gives both the type and the builtin's documentation.
You can use `:t` instead to get just the type.

Here the type is polymorphic: print will take a value of any type
(the syntax `{a}` is a forall-binding, like in Cryptol).

The `TopLevel` annotation in the return type is a monad.

(intro-monads)=
## Monads

SAWScript is a monadic language.
If you have seen programming-language monads (from Haskell or
elsewhere) before, this will be familiar.
If not, don't worry: there's no need to understand how it works under
the covers or at any sort of deep level, and no experience with Haskell
(or for that matter category theory) is needed.

You can think of SAWScript's monads as contexts for code to run in.
There are five of them, all hard-wired, and no facilities for creating
more or doing anything fancy with them.

There are, accordingly, two kinds of builtin.

Builtins that belong to a monad have a return type of the form _`M`_
_`ty`_, where _`M`_ is one of the monads, and the type _`ty`_ is the
function's actual return type.
Thus the `print` builtin shown above executes in the `TopLevel` monad
and returns the unit value `()`.

Builtins that have an ordinary return type are not in any monad
("pure") and can be used anywhere, but are invoked in a slightly
different way, as we'll see in a moment.

For example, the `length` builtin that returns the length of a list:

:::{code-block} console
sawscript> :h length
Description
-----------

    length : {a} [a] -> Int

Compute the length of a list.
sawscript> 
:::

The five SAWScript monads are:

- `TopLevel` is the main monad.
  The syntactic top level, that is, the statements in SAWScript files
  that aren't nested in something else, as well as statements typed
  into the REPL, run in `TopLevel`.

- `ProofScript` is a context for proof tactics.
  Writing your own proof scripts is an advanced topic; see the
  [Interactive Proofs](#interactive-proofs) chapter.

- `LLVMSetup` is a context for assembling specifications for
  LLVM code.

- `JVMSetup` is a context for assembling specifications for
  JVM code.

- `MIRSetup` is a context for assembling specifications for
  MIR (Rust) code.

The tools available in the latter three monads are discussed in the
[Verifying Code](#verifying-code) chapter.
For now, we'll concentrate on `TopLevel`.
The other monads use the same syntax and work in the same way.

Because of the monads there are two ways to call a function.

A pure function like `length` that isn't in a monad is called by
let-binding a variable to hold its return value, then passing it the
right number of arguments:

:::{code-block} sawscript
let x = length [2, 4, 6];
:::

This will bind a new variable `x` and store 3 in it.

A pure "function" that doesn't take any arguments is just a value.
The boolean constants `true` and `false` are examples.

Meanwhile, a function in a monad is called by using `<-` instead of
`let`:

:::{code-block} sawscript
thm <- prove_print z3 {{ 1 < 2 }};
:::

This calls the solver z3 to prove that 1 is less than 2 (more on this
soon) and returns a theorem value, which we save in a new variable
`thm`.

If you don't want the return value, because you don't need it or
because it's `()`, you can throw it away by binding it to `_`.
You can also leave off the `<-` entirely and just call the function:

:::{code-block} sawscript
_ <- print 3;
print 3;
:::

You can think of `<-` as executing the monad function on its right
hand side.
(What it actually does is give it the current context to run in.
Thus, you can't use `<-` with functions in other monads, only the
one you're currently in.
So you can only use `TopLevel` functions at the top level, and you
can't use `TopLevel` functions when assembling an LLVM specification
in `LLVMSetup`.)

A monad function that takes no arguments does nothing until it's
executed with `<-`.

## `return`

The `return` builtin is a monad function that takes an argument and
hands it back when executed.
Thus these statements are equivalent:

:::{code-block} sawscript
let x = 3;
x <- return 3;
:::

Note though that, confusingly, `return` is _not_ a control-flow
operator; it just creates a monad function out of an ordinary
value.
Blame Haskell (where it originated) for the confusion.

## `do` Blocks

One can group a series of statements into a nested block.
The syntax for this is `do {` _`statement...`_ `}`; hence these
are called `do` blocks.
A do-block is a monad function that takes no arguments.
The last entry in a `do` block must be an expression without a
variable binding; it produces a result value for the block.
It will often be the `return` builtin; this is how the name
originated.

The statements in a `do` block must all be in the same monad, but
it can be any monad.
To write an LLVM specification using the `LLVMSetup` monad, write a
`do` block using the `LLVMSetup` builtins (as noted before, these are
introduced in the
[Verifying Code](#verifying-code) chapter)
and, to avoid trying to execute it in `TopLevel`, let-bind it
rather than using `<-`.

Thus for example:

:::{code-block} sawscript
let spec = do {
   llvm_execute_func [];
   return ();
};
:::

The last statement produces `()` as the result value for the block, which
is what's expected for these specifications.

Here `spec` gets bound to a no-arguments monad function of type
`LLVMSetup ()`, which can then be passed in for verification.
The verification backend creates and applies the `LLVMSetup` context
needed to execute it.

This works the same way for the other setup monads.

## Expressions

The arguments to functions are expressions, not statements.
The right-hand side of a `let` statement is also an expression.
Thus calls to pure functions can appear as arguments to functions
(you'll need parentheses to group them).
Calls to monadic functions cannot.
You must execute the monadic function with `<-` and bind the result to
a new variable, then pass the variable as the argument expression.

There is one exception to this: when the argument is itself supposed to
be a monadic function.
In this case you should pass the monadic function itself without executing
it with `<-`.
This comes up with the `ProofScript` arguments to verification functions
as well as the `LLVMSetup`/`JVMSetup`/`MIRSetup` specification blocks.
Builtins like these that take monadic functions directly execute them
themselves at the proper point.

SAWScript also has no infix operators; expressions are basically either
literals of various kinds or function calls.
Computations are done in Cryptol.

## Common Pitfalls

Because functions are curried, missing arguments are not an error; instead
you get a function value back.
If you didn't intend this it will cause potentially confusing type errors
downstream.

Another common problem is to accidentally leave off the semicolon at
the end of a statement, especially one that's a function call.
If the next line is also a function call, it and its arguments will be
interpreted as excess arguments to the first function, also producing
potentially confusing type errors.

## The REPL

The REPL accepts statements in `TopLevel`.
(There is also a `ProofScript` REPL, but that's an advanced topic.)
You may leave off the semicolons at the ends of statements.

Currently it does not accept pure expressions.
This is a shortcoming related to ongoing cleanup of the internals and
will eventually be fixed.
As a workaround, pure expressions can be evaluated at the REPL by
passing them to `return` or `print`:

:::{code-block} sawscript
sawscript> return (length [2, 4, 6])
3
sawscript> 
:::

## Experimental and Deprecated Builtins

Some of the primitives available in SAW at any given time are
experimental.
These may be incomplete, unfinished, use at your own risk, etc.
They may also be unstable and change type or behavior in future
releases.
The functions and commands associated with these are unavailable
by default; they can be made visible with the `enable_experimental`
command.

Other primitives are considered deprecated.
Some of these, as the
[deprecation process](appendices/formal-deprecation-process.md) proceeds, are
unavailable by default.

They can be made visible with the `enable_deprecated` command.


## Haskell-Relative Notes

Haskell veterans will want to know that SAWScript is eagerly evaluated,
not lazily, even though it's monadic.
For non-Haskell folks, this basically means that execution works the
way you expect and things never happen out of order.