Control Structures

If-then-else

Booleans may be used to choose between one of two parsers:

block
  let i = $[ '0'..'9' ]
  if (i - '0') > 5
    then Match 'X'
    else ^ 7

The parser above parses a decimal digit and if it is larger than 5 it will try to match 'X' from the input, otherwise it will succeed with semantic value 7.

Guards

Guards provide one way to examine a semantic value, and their general form is:

expression is shape

A guard is a parser that will succeed if the expression has the required shape.

Boolean Guards

Perhaps the most common guard is on boolean semantic values, which may be used to control whether parsing should continue. For example, the following parser uses the guard (i - '0') > 5 is true to continue parsing (on the given alternative) only for digits larger than 5.

block
  let i = $[ '0'..'9' ]
  First
    block
      (i - '0') > 5 is true
      ^ "input gt 5"
    ^ "input leq 5"

So, if p is a boolean value, then p is true is a parser that succeeds without consuming input if p holds, and fails otherwise. Similarly, p is false is a parser that would succeed only if p is false.

Guards on maybe

The type maybe also supports guards, with two shapes: just and nothing. For example e is just is a parser that will succeed only if e is of the shape just x form some x. In that case the result of the parser would be the value x. Guards that have no interesting result (e.g., e is true) simply return the trivial value {}.

Guards on Tagged Unions

The same notation may be used to examine values of user-defined union types (see Tagged Unions)

block
  let res = Choose
              good = $[ 'G' ]
              bad  = $[ 'B' ]

  Choose

    block
      res is good
      ^ "Success!"

    block
      res is bad
      ^ "Failure!"

Case

The case construct provides an alternative method for examining semantic values. The body of a case expression consists of a list of matches with the syntax pattern -> result. For example, the following expression has the same functionality as the previous example, but avoids the need for backtracking.

block
  let res = Choose
              good = $[ 'G' ]
              bad  = $[ 'B' ]
  case res of
    good -> ^ "Success!"
    bad  -> ^ "Failure!"

A case expression can extract the value from a tagged union. In this case, the match should have the form pattern var -> result.

block
  let res = Choose
              number = $[ '0'..'9' ]
              letter = $[ 'a'..'z' ]
              other  = UInt8
  case res of
    number n -> ^ (n - '0')
    letter l -> ^ (l - 'a')
    _        -> Fail "Something went wrong"

Here the special pattern _ -> result serves as a default, which matches against any value. Similarly, a pattern of the form pattern _ -> result indicates that the value will not be used in the result.

In a parser expression, case need not be total (i.e. cover all possible patterns) as any omitted matches will implicitly result in failure and backtracking. In non-parser contexts, all case expressions are required to be total.

Todo

It should be true that guards are just syntactic sugar for case

for loops

The for construct can be used to iterate over collections (arrays and dictionaries). A for-loop declares a local variable representing the accumulated result of the computation, and a variable that is bound to the elements of the collection. The body may be a parser, or a semantic value. For example, the following expression sums the values in an array of integers:

for (val = 0 : int; v in [1,2,3])
  val + v

Here, val is initially bound to 0. Each iteration of the loop binds v to the current element of the sequence, then computes the value of the body, val + v. This returned value is the updated value of val.

Another way to understand how this works is to see the following expression, which is the result of one step of evaluation:

for (val = 1; v in [2, 3])
  val + v

for supports an alternative form which binds both the index and value of a collection. For example, the following loop multiplies each element in the sequence by its index:

for (val = 0; i,v in [1,2,3])
  val + (i * v)

This construct is also useful when iterating over the contents of dictionaries, where the index is bound to the key. The following loop is a parser which fails when the value is less than the key:

for (val = 0; k,v in d)
  k <= v is true

Traversing with map

DaeDaLus supports another iteration construct, map. This performs an operation on each element of a sequence, resulting in a sequence of results. For example, the following code doubles each element in an array:

map (x in [1:int, 2, 3])
  2 * x

The map construct can be used to parse a sequence of blocks, based on a sequence of values. For example the following code parses blocks of the form 0AAA..., with the number of 'A' characters dictated by the input sequence.

map (x in [1, 2, 3])
  block
    $[ '0' ]
    Many x $[ 'A' ]

Just as with for, the map construct has an alternative form that includes both sequence indexes and values:

map (i,x in [5, 2, 1])
  block
    $[ '0' ]
    len       = ^ { index = i, elem = x }
    something = Many x $['A']

many loops

many loops are similar to for loops and the Many combinator.

  • The loop stops when the parser in the loop’s body fails, which is just like the Many combinator.

  • The body is parameterized by a piece of state, which is initialized at the start of the loop, and is updated by the result of the body on each iteration.

For example, this is how one might parse a base 10 number:

many (s = 0) (10 * s + Digit)

In the example above, the state variable is s and is initialized to 0. The loop body is evaluated, and if it succeeds, its value replaces s, which in the example above means that the loop body’s result is assigned to s, i.e., s = 10 * s + Digit. The result of the many is the result of the body of its last successful loop iteration.

There is also a variant called many? which is similar to Many? in that the loop repeats just enough times to make the parser succeed.

Warning

It is quite easy to write inefficient or ambiguous parser when using many? so it should be avoided, if possible.

Commit

Warning

commit is an unstable experimental feature and its behavior may change or it may be removed entirely.

Normally, at the point a parser fails, DaeDaLus will backtrack to a choice point and try an alternative parser. The commit guard acts as a cut-point and prevents backtracking. For example, the following code cannot parse the string "AC" because parsing 'A' and the subsequent commit will prevent backtracking reaching the alternative branch.

First
  { $['A']; commit; $[ 'B' ] }
  { $['A'];         $[ 'C' ] }  -- Can't happen

The try construct converts commit failure into parser failure. A commit failure will propagate until it hits an enclosing try construct, or until it escapes the top-level definition.