Getting Started
DaeDaLus is a domain-specific programming language for specifying parsers. It supports data-dependent parsing, meaning that the behavior of a parser can be affected by the semantic values parsed from other parts of the input. This makes DaeDaLus extremely well-suited for the concise and precise specification of both text and binary formats.
DaeDaLus parser specifications may be directly applied to inputs via the daedalus interpreter to produce serialized representations of the resulting semantic values, or the specifications may be compiled to either Haskell or C++ sources for parsers to be used in larger software ecosystems.
Note
This tutorial assumes that you have a working knowledge of common programming concepts and are comfortable working at the command line. Familiarity with functional programming will be particularly helpful but is not a strict requirement; any concepts needed will be covered in the relevant sections.
Installation
DaeDaLus currently has no binary releases available for installation, but it can be easily built and installed from source.
Since DaeDaLus is implemented in Haskell, you will need a suitable Haskell environment; the easiest way to set this up is with ghcup. Follow the ghcup installation instructions, and you will be able to install the components needed to build DaeDaLus.
At the time of this writing, we recommend that you use ghcup to install GHC 8.10.7 and Cabal 3.6.2.0:
> ghcup install ghc 8.10.7 && ghcup set ghc 8.10.7
> ghcup install cabal 3.6.2.0 && ghcup set cabal 3.6.2.0
If you prefer, you can run ghcup tui to interactively install and set default GHC and Cabal versions.
With Haskell installed, clone the DaeDaLus repository. Once cloned, cd into the repository root and run:
> cabal install exe:daedalus --installdir=DIR --overwrite-policy=always
This will build and install the daedalus executable, placing links to it in the directory DIR. We recommend choosing an installation directory that is on your PATH, so that the interpreter can be invoked simply as daedalus at the command line; the remainder of this tutorial assumes this setup for brevity. The --overwrite-policy=always flag ensures that, if you later build a newer version of daedalus, this installation step overwrites any existing version of the executable installed on the system.
The DaeDaLus Command-Line Interface
Once DaeDaLus has been compiled and installed, we recommend reading
The Command-Line Tools section to become familiar with how to run
the DaeDaLus tool. For the purposes of this tutorial, it will be helpful
to be familiar with the show-types command. This command asks DaeDaLus
to show the types of the elements of your specification, which both
checks that the specification is valid and serves as a learning aid
while going through the tutorial.
DaeDaLus Syntax Highlighting / Editing Modes
The DaeDaLus repository ships with support for editing DaeDaLus
specifications in a few popular editors. The files can be found in the
syntax-highlight subdirectory of the DaeDaLus repository. Please see
your editor’s documentation for details on how to install the files for
your editor of choice.
Downloading The Sample Specifications
For convenience, the two full specifications we look at in this tutorial, for the PPM and PNG image formats, are provided as a compressed TAR archive. Follow along with the plain-ppm.ddl specification for the first part, and fill out png-template.ddl as you complete the exercises. png.ddl is the full solution to all the PNG exercises.
Your First DaeDaLus Specification
In order to give a feel for what DaeDaLus specifications look like, we now present a well-known image format, PPM, and a DaeDaLus parser for it. This example will be broken down in detail in the following sections of the tutorial as a means of exploring the available language features.
The Portable PixMap Format
PPM is a simple image format designed to make exchange between different platforms easy. For the purposes of this introduction, we’ll be looking specifically at the ASCII PPM format, which describes color RGB images in a human-readable format. Informally, this format consists of:
A magic number identifying the file type (for ASCII PPM, this is P3)
The dimensions of the image (width then height)
The maximum color value
A ‘matrix’ of RGB triples for each pixel defined in row-major order
The format also allows for single-line comments, but we will ignore these for now.
An Example PPM Image
Here is a small example of a PPM image:
P3
4 4
15
0 0 0 0 0 0 0 0 0 15 0 15
0 0 0 0 15 7 0 0 0 0 0 0
0 0 0 0 0 0 0 15 7 0 0 0
15 0 15 0 0 0 0 0 0 0 0 0
We can match this up with the format description given above:
The magic number is P3, indicating an ASCII RGB image
The width and height are both 4
The maximum color value is 15
There is a four-by-four grid of triples, one triple per pixel
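To make the correspondence concrete, here is a minimal hand-written sketch in Python that reads the example image above by splitting on whitespace. It is only an illustration of the format description, not part of DaeDaLus: the names Image and parse_ascii_ppm are hypothetical, comments are not handled, and Python's str.split accepts a slightly different set of whitespace characters than a real PPM reader might.

```python
from dataclasses import dataclass

# The example image from this section, verbatim.
EXAMPLE = """P3
4 4
15
0 0 0 0 0 0 0 0 0 15 0 15
0 0 0 0 15 7 0 0 0 0 0 0
0 0 0 0 0 0 0 15 7 0 0 0
15 0 15 0 0 0 0 0 0 0 0 0
"""


@dataclass
class Image:
    # Hypothetical container for the parsed fields.
    width: int
    height: int
    max_val: int
    rows: list  # height rows, each a list of width (red, green, blue) tuples


def parse_ascii_ppm(text: str) -> Image:
    """Hypothetical helper: read an ASCII PPM that contains no comments."""
    tokens = text.split()
    if len(tokens) < 4 or tokens[0] != "P3":
        raise ValueError("expected magic number P3 followed by a header")
    # Header: width, then height, then the maximum color value.
    width, height, max_val = (int(t) for t in tokens[1:4])
    samples = [int(t) for t in tokens[4:]]
    if len(samples) != 3 * width * height:
        raise ValueError("wrong number of color samples")
    # Group samples into RGB triples, then triples into rows (row-major order).
    pixels = [tuple(samples[i:i + 3]) for i in range(0, len(samples), 3)]
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return Image(width, height, max_val, rows)
```

Calling parse_ascii_ppm(EXAMPLE) yields a 4-by-4 image with a maximum color value of 15; the last pixel of the first row is (15, 0, 15).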
A DaeDaLus PPM Specification
Our goal now is to provide a DaeDaLus specification for this format, so that we may parse well-formed PPM images into semantic values for further processing in Haskell or C++ (you might imagine we are writing a program to transform images represented in this PPM format). Here it is:
def Main =
  block
    $$ = PPM

def Token P =
  block
    $$ = P
    Many (1..) WS

def PPM =
  block
    Match "P"
    let version = Token Natural
    version == 3 is true
    width  = Token Natural
    height = Token Natural
    maxVal = Token Natural
    data   = Many height (Many width RGB)

def RGB =
  block
    red   = Token Natural
    green = Token Natural
    blue  = Token Natural

def WS = $[0 | 9 | 12 | 32 | '\n' | '\r']

def Natural =
  block
    let ds = Many (1..) Digit
    ^ for (val = 0; d in ds) (addDigit val d)

def Digit =
  block
    let d = $['0' .. '9']
    ^ d - '0'

def addDigit val d =
  10 * val + (d as uint 64)
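For readers more familiar with general-purpose languages, the digit accumulation performed by Natural, Digit, and addDigit can be mirrored roughly in Python as a left fold. This is a sketch for comparison only: the names natural and add_digit are hypothetical, and Python integers are unbounded, whereas the specification converts each digit to a 64-bit unsigned value.

```python
from functools import reduce


def add_digit(val: int, d: int) -> int:
    # Mirrors addDigit: shift the accumulator one decimal place, then add d.
    return 10 * val + d


def natural(digits: str) -> int:
    """Hypothetical mirror of Natural for a non-empty run of ASCII digits."""
    if not digits or any(c not in "0123456789" for c in digits):
        raise ValueError("expected one or more ASCII digits")
    # Mirrors Digit: each character becomes its numeric value (d - '0').
    ds = [ord(c) - ord("0") for c in digits]
    # Mirrors the for loop: start at 0 and fold add_digit over the digits.
    return reduce(add_digit, ds, 0)
```

For the input "15", the fold computes add_digit(add_digit(0, 1), 5), which is 10 * 1 + 5 = 15.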
Note that this specification only describes the format's data layout and does not perform any validation of the image data (such as checking that the color values do not exceed the declared maximum value). Later, we'll discuss the pros and cons of including validation in parsers and some strategies for deciding whether that is best left to other parts of the application consuming the formatted data. For now, let's break down this example to understand the building blocks of parser specifications.
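As a point of reference for the validation mentioned above, here is one way such a check could live in the consuming application rather than in the parser. This is a sketch under assumptions: the function name out_of_range_samples is hypothetical, and it assumes the parsed image has already been converted into rows of (red, green, blue) integer tuples.

```python
def out_of_range_samples(max_val, rows):
    """Return (row, column, channel, value) for every sample above max_val."""
    problems = []
    for r, row in enumerate(rows):
        for c, pixel in enumerate(row):
            for channel, value in zip(("red", "green", "blue"), pixel):
                if value > max_val:
                    problems.append((r, c, channel, value))
    return problems
```

An empty result means every sample respects the declared maximum. Returning the offending positions instead of raising an error lets the caller decide whether to reject the image or repair it.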