Extended Exercise: PNG Chunks

Now, we’ll use our utilities and more of the knowledge we’ve developed to build up parsers for PNG chunks themselves, culminating in a top-level parser that can consume full PNG images.

The central consideration in this part of the tutorial will be the PNG chunk type. In the PNG format, chunk types are given by special 4-byte sequences identifying the kind of data that is carried by the chunk. The possible chunk types are given explicitly in the PNG specification. PNG chunk types will require us to look at specific bits in the 4-byte sequence. Although tagged sums are close to sufficient for this purpose, they are not sufficient on their own. To handle chunk types properly, we’ll need to be able to handle specific bit arrangements in the PNG data. The next section introduces bitdata, a DaeDaLus feature that we’ll use for this purpose.

Introducing bitdata

In general, a bitdata specifies how bytes should be broken into groups of bits, which are then combined into a tagged sum.

Here is a small example of using bitdata to inspect the first nybble of a byte:

bitdata ChooseOption where
  opt1 = 0x0 : uint 4
  opt2 = 0x1

bitdata OptionData where
  OptionData = { opt : ChooseOption, val : uint 4 }

We use the syntax bitdata ... where ... to declare a new bitdata type. In this example, we’ve declared two bitdata types: ChooseOption and OptionData. Notice that we’ve added a type annotation of uint 4 to the opt1 alternative for ChooseOption; since all alternatives in the bitdata must have the same time, this type annotation effectively assigns a type to opt2 as well as to opt1.

Despite having capitalized names, ChooseOption and OptionData are not parsers. A bitdata declaration introduces a type that can only be used in conversions; it doesn’t do anything else at all until we use something like as?. When using a bitdata to coerce a value, it’s important to know that a bitdata has a fixed bit endianness meaning that fields defined further to the left correspond to more significant (leftmost) bits. Endianness applies only to parsing; once a value has been parsed, we can use a bitdata to interpret the parsed value’s bits. For example:

block
  let odata = UInt8 as? OptionData
  case odata of
    OptionData x ->
      case x.opt of
        opt1 -> ^ x.val
        _    -> Fail "Wrong option!"

We use the . accessor notation to access the fields of the OptionData structure. In the bitdata OptionData, we have one variant with two fields, opt and val, each 4 bits in length. Those fields can be accessed with x.opt and x.val in this example.

A bitdata declaration can also specify specific numeric values. This allows us to match bitdata alternatives on a value only when the value’s bits match that of the expected numeric value. For example:

bitdata SpecificNums where
  opt1 = { 55 : uint 8; 4: uint 8 }
  opt2 = { 65 : uint 8; 9: uint 8 }

PNG Chunk Types

To get started using bitdata, you’ll first declare one that specifies the different type tags that a PNG chunk can carry.

Every type tag is a 32-bit unsigned integer consisting of four bytes. The following table gives the four bytes for each type. Don’t worry about what the tag names mean; if you’re curious, you can check the PNG specification for details, but we won’t need that information for parsing.

PNG Chunk Types

plte

80 76 84 69

idat

73 68 65 84

trns

116 82 78 83

chrm

99 72 82 77

gama

103 65 77 65

iccp

105 67 67 80

sbit

115 66 73 84

srgb

115 82 71 66

text

116 69 88 116

itxt

105 84 88 116

ztxt

122 84 88 116

bkgd

98 75 71 68

hist

104 73 83 84

phys

112 72 89 115

splt

115 80 76 84

time

116 73 77 69

Exercise: Write a bitdata called ChunkType with 16 variants (one for each name in the above table). Name the variants using the names in the table, and be sure to specify the type of each byte (use uint 8).

Solution
bitdata ChunkType where
  plte = {  80 : uint 8 ;  76 : uint 8 ;  84 : uint 8 ;  69 : uint 8 }
  idat = {  73 : uint 8 ;  68 : uint 8 ;  65 : uint 8 ;  84 : uint 8 }
  trns = { 116 : uint 8 ;  82 : uint 8 ;  78 : uint 8 ;  83 : uint 8 }
  chrm = {  99 : uint 8 ;  72 : uint 8 ;  82 : uint 8 ;  77 : uint 8 }
  gama = { 103 : uint 8 ;  65 : uint 8 ;  77 : uint 8 ;  65 : uint 8 }
  iccp = { 105 : uint 8 ;  67 : uint 8 ;  67 : uint 8 ;  80 : uint 8 }
  sbit = { 115 : uint 8 ;  66 : uint 8 ;  73 : uint 8 ;  84 : uint 8 }
  srgb = { 115 : uint 8 ;  82 : uint 8 ;  71 : uint 8 ;  66 : uint 8 }
  text = { 116 : uint 8 ;  69 : uint 8 ;  88 : uint 8 ; 116 : uint 8 }
  itxt = { 105 : uint 8 ;  84 : uint 8 ;  88 : uint 8 ; 116 : uint 8 }
  ztxt = { 122 : uint 8 ;  84 : uint 8 ;  88 : uint 8 ; 116 : uint 8 }
  bkgd = {  98 : uint 8 ;  75 : uint 8 ;  71 : uint 8 ;  68 : uint 8 }
  hist = { 104 : uint 8 ;  73 : uint 8 ;  83 : uint 8 ;  84 : uint 8 }
  phys = { 112 : uint 8 ;  72 : uint 8 ;  89 : uint 8 ; 115 : uint 8 }
  splt = { 115 : uint 8 ;  80 : uint 8 ;  76 : uint 8 ;  84 : uint 8 }
  time = { 116 : uint 8 ;  73 : uint 8 ;  77 : uint 8 ;  69 : uint 8 }

PNG Chunk Data

Now, we’ll build parsers for each type of chunk data. We’ll then combine those into a single ChunkData parser, which will allow us to parse any type of chunk with a single parser and, more importantly, parse many chunks in sequence.

PLTE

The PLTE chunk contains between 1 and 256 palette entries, which are simply RGB structures as we defined in the section Extended Exercise: Defining Helpful Utilities.

Exercise: Write a parser PLTEChunkData that parses between 1 and 256 RGB structures.

Solution
def PLTEChunkData = Many (1..256) RGB

IDAT

The IDAT chunk contains image data as output by the PNG compression algorithm; it consists merely of a bunch of bytes.

Exercise: Write a parser IDATChunkData that parses any number of bytes.

Solution
def IDATChunkData = Many UInt8

tRNS

Transparency chunks are significantly more complex than the others we’ve looked at so far in that they have a different data shape depending on data provided in the PNG signature that we have not yet defined. We’ll delay full discussion of that until later; for now, all that matters is that the signature is a structure with a field named colour_type that we can pattern-match on in order to guide value construction for different PNG modes.

Transparency data is valid for colour_type values 0, 2, and 3; any other value should cause transparency data parsing to fail.

In the case of mode 0, the transparency data is a single big-endian 2-byte value which we will call grey_sample_value.

In mode 2, the transparency data is a sequence of three big-endian 2-byte values which we call red_sample_value, blue_sample_value, and green_sample_value, respectively. This is the order in which the values should be parsed.

Finally, in mode 3, there is a sequence of bytes, one for each entry in the PLTE chunk (so, between 1 and 256). You are not required to check that there are an appropriate number of bytes; in fact, you should not put bounds on how many are parsed. We call these bytes alpha_for_palette.

Exercise: Define three parsers, TRNSData0, TRNSData2, and TRNSData3, that carry the data as described above. Use the names provided as field names (so, make sure these parsers all produce structures).

Solution
def TRNSData0 =
  block
    grey_sample_value = BEUInt16

def TRNSData2 =
  block
    red_sample_value   = BEUInt16
    blue_sample_value  = BEUInt16
    green_sample_value = BEUInt16

def TRNSData3 =
  block
    alpha_for_palette = Many UInt8

Exercise: Now, define a parser TRNSChunkData that takes a single argument, sig, and uses the colour_type field of that argument to produce an appropriate value; you can use pattern-matching and integer literals for this. In all other cases, the parser should fail with a message indicating that the transparency chunk cannot appear for any other color mode.

Hint

Remember that you can return values of a tagged sum using the special ‘barbed wire’ brackets, {| tag = ... |}.

Solution
def TRNSChunkData sig =
  case sig.colour_type of
    0 -> {| trns_colour_type_0 = TRNSData0 |}
    2 -> {| trns_colour_type_2 = TRNSData2 |}
    3 -> {| trns_colour_type_3 = TRNSData3 |}
    _ -> Fail "tRNS chunk shall not appear for other colour types"

cHRM

The chromaticities and white point chunk is made of eight big-endian 4-byte values: x and y respectively for each of the white point, red, green, and blue. You should use the names white_point_x, white_point_y, red_x, red_y, green_x, green_y, blue_x, and blue_y for these fields in the next exercise.

Exercise: Define a parser CHRMChunkData that parses the fields described above.

Solution
def CHRMChunkData =
  block
    white_point_x = BEUInt32
    white_point_y = BEUInt32
    red_x         = BEUInt32
    red_y         = BEUInt32
    green_x       = BEUInt32
    green_y       = BEUInt32
    blue_x        = BEUInt32
    blue_y        = BEUInt32

gAMA

The gamma chunk relates image samples to desired output intensity. It is simply a big-endian 4-byte integer.

Exercise: Define a parser GAMAChunkData that parses a single field, image_gamma, as specified in the above description.

Solution
def GAMAChunkData =
  block
    image_gamma = BEUInt32

iCCP

The embedded ICC profile chunk contains information related to the International Color Consortium. It consists of three fields:

  1. A null-terminated string between 1 and 79 characters in length, which we call profile_name

  2. A byte defining the compression_method

  3. Many bytes giving the compressed_profile

Exercise: Define a parser ICCPChunkData that parses a structure with the three fields described above.

Hint

Remember the null-terminated string parser we built previously!

Solution
def ICCPChunkData =
  block
    profile_name       = NTString (just 1) (just 79)
    compression_method = UInt8
    compressed_profile = Many UInt8

sBIT

The significant bits chunk allows lossless data recovery even if the sample depth isn’t supported by PNG. The supported color modes each have a number of significant bits to store that allows this to be done.

Like the transparency chunk described above, the significant bits chunk behaves differently depending on the PNG color mode being used.

In mode 0, the significant bits data is a single byte which we call significant_greyscale_bits.

In modes 2 and 3, the data is stored in three bytes, called significant_red_bits, significant_green_bits, and significant_blue_bits, respectively.

In mode 4, there are two bytes: significant_greyscale_bits and significant_alpha_bits.

Finally, in mode 6, there are four bytes: significant_red_bits, significant_green_bits, significant_blue_bits, and significant_alpha_bits.

Exercise: Define four parsers, SBITData0, SBITData2or3, SBITData4, and SBITData6, that carry the data as described above. Name the fields using the names we provided (so make sure all parsers return structures).

Solution
def SBITData0 =
  block
    significant_greyscale_bits = UInt8

def SBITData2or3 =
  block
    significant_red_bits   = UInt8
    significant_green_bits = UInt8
    significant_blue_bits  = UInt8

def SBITData4 =
  block
    significant_greyscale_bits = UInt8
    significant_alpha_bits     = UInt8

def SBITData6 =
  block
    significant_red_bits   = UInt8
    significant_green_bits = UInt8
    significant_blue_bits  = UInt8
    significant_alpha_bits = UInt8

Exercise: Now, define a parser SBITChunkData that takes a single argument, sig, and uses the colour_type field of that argument to produce an appropriate value; you can use pattern-matching and integer literals for this. You can ignore all other color modes.

Solution
def SBITChunkData sig =
    case sig.colour_type of
      0 -> {| sbit_colour_type_0 = SBITData0 |}
      2 -> {| sbit_colour_type_2 = SBITData2or3 |}
      3 -> {| sbit_colour_type_3 = SBITData2or3 |}
      4 -> {| sbit_colour_type_4 = SBITData4 |}
      6 -> {| sbit_colour_type_6 = SBITData6 |}

sRGB

The standard RGB color space chunk defines a rendering intent, defined by the ICC. It contains one byte, which is a value between 0 and 3 (inclusive) giving the intent.

Exercise: Define a parser, SRGBChunkData, that parses a structure with a single field, rendering_intent, from exactly one byte with a value between 0 and 3 (inclusive).

Solution
def SRGBChunkData =
  block
    rendering_intent = $[0 .. 3]

tEXt

PNG provides textual data blocks to store information associated with images such as descriptions and copyright notices. There are three types of textual chunk, the most basic of which is the tEXt chunk we’ll define first.

The standard tEXt chunk is comprised of two pieces of data: a null-terminated string giving a keyword (some of which are recognized by the PNG specification) and zero or more bytes giving the text associated with the keyword. Note that this character string (the data following the keyword) is not null-terminated.

Exercise: Define a parser TEXTChunkData that returns a structure with two fields, keyword and text_string. The keyword field should be a null-terminated string between 1 and 79 characters in length and text_string field should be an arbitrarily large number of bytes.

Solution
def TEXTChunkData =
  block
    keyword = NTString (just 1) (just 79)
    text_string = Many UInt8

zTXt

The zTXt chunk is semantically the same as tEXt, but with compressed data; it is recommended for use with large blocks of text.

Exercise: Define a parser ZTXTChunkData that returns a structure with three fields: keyword, compression_method, and compressed_text_datastream. The first and last of these are exactly as they were for TEXTChunkData, and compression_method is a single byte.

Solution
def ZTXTChunkData =
  block
    keyword = NTString (just 1) (just 79)
    compression_method = UInt8
    compressed_text_datastream = Many UInt8

iTXt

The final type of textual data chunk is iTXt. This chunk type is used for international text. It consists of the following fields:

  • keyword: Exactly the same as the previous two chunk types

  • compression_flag: A FLAG indicating whether the text data is compressed

  • compression_method: The same as in zTXt

  • language_tag: An arbitrarily long null-terminated string indicating the language used for the text

  • translated_keyword: An arbitrarily long null-terminated string giving the translation of the keyword into the language of the text

  • text: The text data, given in the same way as both tEXt and zTXt

Exercise: Define a parser ITXTChunkData that returns a structure with the six fields as described above.

Solution
def ITXTChunkData =
  block
    keyword = NTString (just 1) (just 79)
    compression_flag = FLAG
    compression_method = UInt8
    language_tag = NTString nothing nothing
    translated_keyword = NTString nothing nothing
    text = Many UInt8

bKGD

The background color chunk specifies the default background color against which to present the image. Different color modes use different data in this chunk to achieve the specified effects.

For color modes 0 and 4, a single two-byte value greyscale controls the background color.

For modes 2 and 6, three two-byte values, red, green, and blue are used for the background color.

Finally, for color mode 3, a single byte, palette_index, is used. You do not need to check that this index is within range for the palette provided.

Exercise: Define three parsers, BKGDData0or4, BKGDData2or6, and BKGDData3 that return structures with the fields described above.

Solution
def BKGDData0or4 =
  block
    greyscale = BEUInt16

def BKGDData2or6 =
  block
    red   = BEUInt16
    green = BEUInt16
    blue  = BEUInt16

def BKGDData3 =
  block
    palette_index = UInt8

Exercise: Now, define a parser BKGDChunkData that takes a single argument, sig, and uses the colour_type field of that argument to prouce an appropriate value; you can use pattern-matching and integer literals for this. You can ignore all other color modes.

Solution
def BKGDChunkData sig =
  case sig.colour_type of
    0 -> {| bkgd_colour_type_0 = BKGDData0or4 |}
    4 -> {| bkgd_colour_type_4 = BKGDData0or4 |}
    2 -> {| bkgd_colour_type_2 = BKGDData2or6 |}
    6 -> {| bkgd_colour_type_6 = BKGDData2or6 |}
    3 -> {| bkgd_colour_type_3 = BKGDData3 |}

hIST

The image histogram chunk gives approximate usage frequencies for all colors in the palette. Each frequency is a big-endian two-byte integer.

Exercise: Define a parser HISTChunkData that returns a structure with a single field, frequencies, which is an array of frequencies as described above. Don’t worry about checking that the correct number of frequencies are provided.

Solution
def HISTChunkData =
  block
    frequencies = Many BEUInt16

pHYs

The physical pixel dimensions chunk describes the intended aspect ratio for image display. It’s comprised of:

  • pixels_per_unit_x_axis, a 4-byte unsigned integer

  • pixels_per_unit_y_axis, another 4-byte unsigned integer

  • unit_specifier, a FLAG indicating whether to use unitless aspect ratio or specify actual size

Exercise: Define a parser PHYSChunkData that returns a structure with the fields as described above.

Solution
def PHYSChunkData =
  block
    pixels_per_unit_x_axis = BEUInt32
    pixels_per_unit_y_axis = BEUInt32
    unit_specifier         = FLAG

sPLT

The suggested palette chunk is semantically fairly complex; if you want to know the details, we once again refer you to the full PNG specification linked earlier in this section. As it turns out, it is also likely the most challenging exercise on this page!

The chunk consists of the following data:

  • palette_name, a null-terminated string between 1 and 79 characters in length

  • sample_depth, a single byte that must be either 8 or 16

  • palette, a series of 6 or 10 byte structures depending on the sample depth. In both cases, the data are:

    • red

    • green

    • blue

    • alpha

    • frequency

    In all cases, the frequency is given by a 2-byte unsigned integer. If the sample depth is 8, the other four samples are one byte; if it is 16, they are two bytes (this is where we get a total of 6 or 10, since the frequency is always given in two bytes).

To put this in higher-level terms, we now need to define a data-dependent parser since the sample depth will control how we parse the remaining bytes in the input.

Let’s build this up one small step at a time.

Exercise: Define two parsers, SPLTSample8 and SPLTSample16, that parse the five sample values as described above.

Solution
def SPLTSample8 =
  block
    red       = UInt8
    green     = UInt8
    blue      = UInt8
    alpha     = UInt8
    frequency = BEUInt16

def SPLTSample16 =
  block
    red       = BEUInt16
    green     = BEUInt16
    blue      = BEUInt16
    alpha     = BEUInt16
    frequency = BEUInt16

Exercise: Using the two parsers you just defined, define a new parser SPLTSample that takes a single argument, depth : uint 8, and decides which to use. The result should be a tagged sum with two variants named splt_sample_depth_8 and splt_sample_depth_16. You do not need to write anything to handle sample depths other than 8 and 16.

Solution
def SPLTSample (depth : uint 8) =
  case depth of
    8  -> {| splt_sample_depth_8  = SPLTSample8 |}
    16 -> {| splt_sample_depth_16 = SPLTSample16 |}

Exercise (Challenging): Finally, define a parser SPLTChunkData that returns a structure containing fields palette_name, sample_depth, and palette as described earlier.

Hint

Remember that you can parse alternatives using the operators <| and | - the former is for biased choice, the latter for unbiased.

Also remember that DaeDaLus allows for data-dependent parsing: If you store the result of running a parser in a local variable or structure field, you may use that value later in the sequence.

Solution
def SPLTChunkData =
  block
    palette_name = NTString (just 1) (just 79)
    sample_depth = $[ 8 ] <| $[ 16 ]
    Many (SPLTSample sample_depth)

tIME

With that tricky one out of the way, there’s only one chunk type left, and it’s very straightforward: the last one left is the tIME chunk that carries along the last-modified time for the image.

Exercise: For consistency with our other chunk parser naming, define a ‘new’ parser TIMEChunkData that returns a UTCTime.

Solution
def TIMEChunkData = UTCTime

Generic Chunk Data Parsing

Now that we know how to parse each type of chunk, we’ll combine all of those possibilities into a single tagged sum type so that we will be able to parse many chunks in sequence. To do this, we’ll use pattern-matching on the bitdata we defined earlier and all of the parsers defined so far in this section.

Exercise: Write a parser ChunkData that takes two arguments, sig and type : ChunkType, and produces a tagged sum covering all possible chunk types defined by the type parameter. The tags should be named, for example, plte_data for the plte case.

Solution
def ChunkData sig (type : ChunkType) =
  case type of
    plte -> {| plte_data = PLTEChunkData     |}
    idat -> {| idat_data = IDATChunkData     |}
    trns -> {| trns_data = TRNSChunkData sig |}
    chrm -> {| chrm_data = CHRMChunkData     |}
    gama -> {| gama_data = GAMAChunkData     |}
    iccp -> {| iccp_data = ICCPChunkData     |}
    sbit -> {| sbit_data = SBITChunkData sig |}
    srgb -> {| srgb_data = SRGBChunkData     |}
    text -> {| text_data = TEXTChunkData     |}
    itxt -> {| itxt_data = ITXTChunkData     |}
    ztxt -> {| ztxt_data = ZTXTChunkData     |}
    bkgd -> {| bkgd_data = BKGDChunkData sig |}
    hist -> {| hist_data = HISTChunkData     |}
    phys -> {| phys_data = PHYSChunkData     |}
    splt -> {| splt_data = SPLTChunkData     |}
    time -> {| time_data = TIMEChunkData     |}

PNG Chunks

Now that we can successfully parse any type of chunk data, we need only add in the metadata necessary for a complete chunk. A complete PNG chunk consists of the chunk’s length (as a 4-byte integer), the type of the chunk (which we need to interpret as a ChunkType), the chunk data, and a CRC value.

Exercise: Define a parser PNGChunk that takes a single argument, sig, and returns a structure with the following fields:

  • type, an instance of bitdata ChunkType

  • data, a ChunkData that results from parsing exactly as many bytes as specified by the length (which must be parsed before type but does not need to be stored in the resulting structure)

  • crc, a Crc

Hint
  1. Remember that you use a bitdata by coercing a parsed value; take care to use the appropriate coercion method!

  2. To make sure a particular number of bytes are consumed, recall the Chunk parser in the standard library. Note that this expects a uint 64 argument for the number of bytes to parse.

Solution
def PNGChunk sig =
  block
    let len = Length as uint 64
    type = BEUInt32 as? ChunkType
    data = Chunk len (ChunkData sig type)
    crc = Crc

Header / Trailer

Two special chunks we haven’t discussed are the header and trailer for PNG. These are the first and last chunks in a PNG datastream. The sig parameter we’ve left undiscussed is the data in the header.

Like the PNGChunk variants, these will both need length, type, and CRC fields. Since they always have the same length (13 and 0 bytes respectively), we can hardcode these fields in our parsers. Note that the length, type, and CRC fields do not count towards the length of the chunk.

The header chunk additionally contains the following fields:

  • width, a 4-byte unsigned integer giving the width of the image

  • height, a 4-byte unsigned integer giving the height of the image

  • bit_depth, a single byte giving the number of bits per sample

  • colour_type, a single byte that defines the image type. This must be one of 0, 2, 3, 4, or 6

  • compression_method, a single byte giving the compression method

  • filter_method, a single byte indicating the preprocessing method applied before compression

  • interlace_method, a single byte indicating transmission order of image data

Exercise: Define a parser IHDRChunk that returns a structure with the above-described fields. The type identifier for the header chunk is the four bytes 73, 72, 68, and 82.

Solution
def IHDRChunk =
  block
    Match [ 0; 0; 0; 13]
    Match [ 73; 72; 68; 82]
    width              = BEUInt32
    height             = BEUInt32
    bit_depth          = UInt8
    colour_type        = UInt8
    compression_method = UInt8
    filter_method      = UInt8
    interlace_method   = UInt8
    crc                = Crc

Exercise: Define a parser IENDChunk that returns a structure matching the description above. The type identifier for the trailer chunk is the four bytes 73, 69, 78, and 68.

Solution
def IENDChunk =
  block
    Match [0; 0; 0; 0]
    Match [73; 69; 78; 68]
    crc = Crc

Full PNG

We have almost all of the components needed to parse full PNG images now. The only thing missing is the PNG header, a byte sequence that starts every PNG image ever encoded:

def PNGHeader =
  Match [ 137; 80; 78; 71; 13; 10; 26; 10]

With this here is your final exercise:

Exercise: Write the Main parser to parse a full PNG image. That is: The PNG header, the signature/header chunk, the data chunks, and finally the trailer chunk. Make sure Main is defined so that it only succeeds if it consumes the entire input.

Solution
def Main =
  block
    PNGHeader
    signature = IHDRChunk
    chunks = Many (PNGChunk signature)
    IENDChunk
    END

Conclusion

Congratulations on making it through! The full PNG specification can be found on the following page; notice how DaeDaLus has made it possible to write such a short specification compared to the PNG specification upon which it is based.

Note that in a few places (as the exercises note), we fail to do some of the validation included in the specification; in fact, there are a number of places where we deliberately left out the restrictions for simplicity (e.g. the bit_depth field only has a few allowed values, and these values are also constrained by the colour_type). As discussed, it is often better to leave such validations for post-parsing stages of your applications, such as type-checking and other static analysis. Where it was natural to do so, we built the specification’s restrictions into our parser to catch problems early.

We recommend reading over the rest of the DaeDaLus user guide, which has some extra detail on concepts we covered (and some more advanced topics we skipped).