Extended Exercise: Defining Helpful Utilities

In this section we’ll get started writing a DaeDaLus specification for the PNG format. While you shouldn’t need to read much of the PNG layout specification for this tutorial, we provide it as a reference. In this tutorial, we’ll present any relevant information within the exercises themselves. If you want to write a full PNG implementation at some point, though, the full PNG specification has everything you’ll need.

To begin, open your favorite text editor and create a new file named png.ddl. This is where you’ll write the format specification. The first line of your file, after any comments you wish to add to introduce the module, should be:

import Daedalus

This will load all of the standard library features covered in the previous section, which you’ll start using right away.

Domain-Specific Parser Names

For total clarity in code, it is often useful to give new names to things that are otherwise quite generic (e.g. to represent lengths, we may be using floating-point numbers, but the type name length is more informative than float, so we might want to alias these names). In PNG, there are a number of examples of this being useful since unsigned 32-bit words are used for a number of very different components.

Exercise: Define two parsers, Length and Crc, that each parse a big-endian, 32-bit, unsigned integer.

Hint

You may want to review the section aside: the daedalus standard library if you’re not sure how to deal with the endianness.

Solution
def Length = BEUInt32
def Crc = BEUInt32

Exercise: There are a couple of places in the PNG specification where we have 1-bit flags, indicating whether some option is enabled or not. Write a parser FLAG that matches a byte that is 0 or 1.

Solution

We use a character class to write this very succinctly:

def FLAG = $[0 .. 1]

For brevity, we’ll prefer this syntax in the other provided solutions.

Null-terminated Strings

A null-terminated string is an array of characters terminated with the null character (ASCII value 0). These are also known as C strings since this is the representation used in C’s standard library string manipulation functions.

PNG makes use of null-terminated strings in a few places, so we need to be able to parse them. In particular, we need to be able to parse strings within a specific range of sizes and strings of unbounded length.

Exercise: Define a parser, NullChar, that parses the ASCII null byte.

Solution
def NullChar    = $[0]

Exercise: Define a parser, NonNullChar, that parses one non-null ASCII byte.

Solution
def NonNullChar = $[1 .. 255]

Exercise (Challenging): Define a parser, OMany, that behaves like Many, but takes the integer arguments as maybe values. Your parser should satisfy the following laws:

  • OMany nothing nothing       P = Many P

  • OMany nothing (just max)    P = Many (..max) P

  • OMany (just min) nothing    P = Many (min..) P

  • OMany (just min) (just max) P = Many (min..max) P

Hint

The above equations are the hard part! Your job is to write the ‘glue’ to bring it all together.

Solution
def OMany (omin:maybe (uint 64)) (omax:maybe (uint 64)) P =
  case omin of
    nothing  -> case omax of
                  nothing  -> Many P
                  just max -> Many (..max) P
    just min -> case omax of
                  nothing  -> Many (min..) P
                  just max -> Many (min..max) P

Note that the right-hand sides of each case arm is the right-hand side of one of the laws; this form of algebraic specification is very useful when writing functional code, as all that is left once we have established the laws is writing the appropriate pattern-matching code to cover each one.

Exercise (Challenging): Define a parser, NTString, that parses a null-terminated string between min and max characters in length, if bounds are provided (i.e. the bounds should be maybe values). The null character should not be included in the character count.

Hint

The OMany parser you wrote in the previous exercise should be extremely helpful.

Solution
def NTString (omin:maybe (uint 64)) (omax:maybe (uint 64)) =
  block
    $$ = OMany omin omax NonNullChar
    NullChar

Common Structures

Time and RGB color values are both critical parts of the PNG specification. The latter shouldn’t be surprising given that this is an image format, but the former is perhaps a little surprising: by the format’s definition, PNG chunks can carry last-modified time data.

Exercise: Define a parser RGB that consumes three bytes and returns them in a structure with fields red, green, and blue. The parsed bytes should be assigned to those fields in that order (i.e. red is the first byte, green the second, and blue the third).

Solution
def RGB =
  block
    red = UInt8
    green = UInt8
    blue = UInt8

While the order of fields in a structure doesn’t matter in general, when writing DaeDaLus specifications, we must be sure to write the fields in the order in which things will be parsed. If there is a clearer order to write the structure itself in, you can feel free to use the de-sugared form of sequence parsing discussed in an earlier section.

Exercise: The time format used by PNG is given by this table:

PNG Time Stamps

Year

Month

Day

Hour

Minute

Second

2 bytes

1 byte (1 - 12)

1 byte (1 - 31)

1 byte (0 - 23)

1 byte (0 - 59)

1 byte (0 - 60)

Write a parser UTCTime that produces a structure with these fields with the given value constraints. The order of columns in the table is the order the fields should be parsed and the year should be parsed in big-endian order.

Hint

Remember that you can use the syntax $[n .. m] to parse one byte with a value between n and m, inclusive.

Solution
def UTCTime =
  block
    year = BEUInt16
    month = $[1 .. 12]
    day = $[1 .. 31]
    hour = $[0 .. 23]
    minute = $[0 .. 59]
    second = $[0 .. 60]

In case you’re wondering why the second field allows for a value of 60: It’s to allow for leap-seconds, according to the PNG specification.