Extraction to the Coq theorem prover
In addition to the (semi-)automatic and compositional proof modes already discussed above, SAW has experimental support for exporting Cryptol and SAWCore values as terms to the Coq proof assistant[5]. This is intended to support more manual proof efforts for properties that go beyond what SAW can support (for example, proofs requiring induction) or for connecting to preexisting formalizations in Coq of useful algorithms (e.g. the fiat crypto library[6]).
This support consists of two related pieces. The first piece is a library of formalizations of the primitives underlying Cryptol and SAWCore and various supporting concepts that help bridge the conceptual gap between SAW and Coq. The second piece is a term translation that maps the syntactic forms of SAWCore onto corresponding concepts in Coq syntax, designed to dovetail with the concepts defined in the support library. SAWCore is a quite similar language to the core calculus underlying Coq, so much of this translation is quite straightforward; however, the languages are not exactly equivalent, and there are some tricky cases that mostly arise from Cryptol code that can only be partially supported. We will note these restrictions later in the manual.
We expect this extraction process to work with a fairly wide range of Coq versions, as we are not using bleeding-edge Coq features. It has been most fully tested with Coq version 8.13.2.
Support Library
In order to make use of SAW’s extraction capabilities, one must first
compile the support library using Coq so that the included definitions
and theorems can be referenced by the extracted code. From the top
directory of the SAW source tree, the source code for this support
library can be found in the saw-core-coq/coq
subdirectory.
In this subdirectory you will find a _CoqProject
and a Makefile
.
A simple make
invocation should be enough to compile
all the necessary files, assuming Coq is installed and coqc
is
available in the user’s PATH
. HTML documentation for the support
library can also be generated by make html
from the same directory.
Once the library is compiled, the recommended way to import
it into your subsequent development is by adding the following
lines to your _CoqProject
file:
-Q <SAWDIR>/saw-core-coq/coq/generated/CryptolToCoq CryptolToCoq
-Q <SAWDIR>/saw-core-coq/coq/handwritten/CryptolToCoq CryptolToCoq
Here <SAWDIR>
refers to the location on your system where the
SAWScript source tree is checked out. This will add the relevant
library files to the CryptolToCoq
namespace, where the extraction
process will expect to find them.
The support library for extraction is broken into two parts: those
files which are handwritten, versus those that are automatically
generated. The handwritten files are generally fairly readable and
are reasonable for human inspection; they define most of the
interesting pipe-fitting that allows Cryptol and SAWCore definitions
to connect to corresponding Coq concepts. In particular the
file SAWCoreScaffolding.v
file defines most of the bindings of base
types to Coq types, and the SAWCoreVectorsAsCoqVectors.v
defines the
core bitvector operations. The automatically generated files are direct
translations of the SAWCore source files
(saw-core/prelude/Prelude.sawcore
and
cryptol-saw-core/saw/Cryptol.sawcore
) that correspond to the
standard libraries for SAWCore and Cryptol, respectively.
The autogenerated files are intended to be kept up-to-date with
changes in the corresponding sawcore
files, and end users should
not need to generate them. Nonetheless, if they are out of sync for some
reason, these files may be regenerated using the saw
executable
by running (cd saw-core-coq; saw saw/generate_scaffolding.saw)
from the top-level of the SAW source directory before compiling them
with Coq as described above.
You may also note some additional files and concepts in the standard
library, such as CompM.v
, and a variety of lemmas and definitions
related to it. These definitions are related to the “heapster” system,
which form a separate use-case for the SAWCore to Coq translation.
These definitions will not be used for code extracted from Cryptol.
Cryptol module extraction
There are several modes of use for the SAW to Coq term extraction facility, but the easiest to use is whole Cryptol module extraction. This will extract all the definitions in the given Cryptol module, together with it’s transitive dependencies, into a single Coq module which can then be compiled and pulled into subsequence developments.
Suppose we have a Cryptol source file named source.cry
and we want
to generate a Coq file named output.v
. We can accomplish this by
running the following commands in saw (either directly from the saw
command prompt, or via a script file)
enable_experimental;
write_coq_cryptol_module "source.cry" "output.v" [] [];
In this default mode, identifiers in the Cryptol source will be
directly translated into identifiers in Coq. This may occasionally
cause problems if source identifiers clash with Coq keywords or
preexisting definitions. The third argument to
write_coq_cryptol_module
can be used to remap such names if
necessary by giving a list of (in,out)
pairs of names. The fourth
argument is a list of source identifiers to skip translating, if
desired. Authoritative online documentation for this command can be
obtained directly from the saw
executable via :help write_coq_cryptol_module
after enable_experimental
.
The resulting “output.v” file will have some of the usual hallmarks of computer-generated code; it will have poor formatting and, explicit parenthesis and fully-qualified names. Thankfully, once loaded into Coq, the Coq pretty-printer will do a much better job of rendering these terms in a somewhat human-readable way.
Proofs involving uninterpreted functions
It is possible to write a Cryptol module that references uninterpreted
functions by using the primitive
keyword to declare them in your
Cryptol source. Primitive Cryptol declarations will be translated into
Coq section variables; as usual in Coq, uses of these section
variables will be translated into additional parameters to the
definitions from outside the section. In this way, consumers of
the translated module can instantiate the declared Cryptol functions
with corresponding terms in subsequent Coq developments.
Although the Cryptol interpreter itself will not be able to compute
with declared but undefined functions of this sort, they can be used
both to provide specifications for functions to be verified with
llvm_verify
or jvm_verify
and also for Coq extraction.
For example, if I write the following Cryptol source file:
primitive f : Integer -> Integer
g : Integer -> Bool
g x = f (f x) > 0
After extraction, the generated term g
will have Coq type:
(Integer -> Integer) -> Integer -> Bool
Translation limitations and caveats
Translation from Cryptol to Coq has a number of fundamental
limitations that must be addressed. The most severe of these is that
Cryptol is a fully general-recursive language, and may exhibit runtime
errors directly via calls to the error
primitive, or via partial
operations (such as indexing a sequence out-of-bounds). The internal
language of Coq, by contrast, is a strongly-normalizing language of
total functions. As such, our translation is unable to extract all
Cryptol programs.
Recursive programs
The most severe of the currently limitations for our system
is that the translation is unable to translate any recursive Cryptol
program. Doing this would require us to attempt to find some
termination argument for the recursive function sufficient to satisfy
Coq; for now, no attempt is made to do so. if you attempt to
extract a recursive function, SAW will produce an error
about a “malformed term” with Prelude.fix
as the head symbol.
Certain limited kinds of recursion are available via the
foldl
Cryptol primitive operation, which is translated directly
into a fold operation in Coq. This is sufficient for many basic
iterative algorithms.
Type coercions
Another limitation of the translation system is that Cryptol uses SMT
solvers during its typechecking process and uses the results of solver
proofs to justify some of its typing judgments. When extracting these
terms to Coq, type equality coercions must be generated. Currently, we
do not have a good way to transport the reasoning done inside
Cryptol’s typechecker into Coq, so we just supply a very simple Ltac
tactic to discharge these coercions (see solveUnsafeAssert
in
CryptolPrimitivesForSAWCoreExtra.v
). This tactic is able to
discover simple coercions, but for anything nontrivial it may
fail. The result will be a typechecking failure when compiling the
generated code in Coq when the tactic fails. If you encounter this
problem, it may be possible to enhance the solveUnsafeAssert
tactic
to cover your use case.
Error terms
A final caveat that is worth mentioning is that Cryptol can sometimes
produce runtime errors. These can arise from explicit calls to the
error
primitive, or from partially defined operations (e.g.,
division by zero or sequence access out of bounds). Such instances
are translated to occurrences of an unrealized Coq axiom named error
.
In order to avoid introducing an inconsistent environment, the error
axiom is restricted to apply only to inhabited types. All the types
arising from Cryptol programs are inhabited, so this is no problem
in principle. However, collecting and passing this information around
on the Coq side is a little tricky.
The main technical device we use here is the Inhabited
type class;
it simply asserts that a type has some distinguished inhabitant. We
provide instances for the base types and type constructors arising
from Cryptol, so the necessary instances ought to be automatically
constructed when needed. However, polymorphic Cryptol functions add
some complications, as type arguments must also come together with
evidence that they are inhabited. The translation process takes care
to add the necessary Inhabited
arguments, so everything ought to
work out. However, if Coq typechecking of generated code fails with
errors about Inhabited
class instances, it likely represents some
problem with this aspect of the translation.