Breaking down PPM: Declarations
Starting from the very beginning of the PPM specification, we have:
def Main =
  block
    $$ = PPM
This is a declaration for the name Main. Declarations in general are
indicated by the keyword def, followed by the name being declared,
any formal parameters, an equals sign, and finally the definition itself.
As in most programming languages, declarations allow us to name certain
entities that we may wish to refer to later, often with the intent of reducing
duplication and helping readers understand the ideas we’re trying to express.
(In the immortal words of Guido van Rossum, creator of Python: code is read
more often than it is written.) In this particular case, the name Main
indicates to the DaeDaLus interpreter and backend that this is the entry point
of the parser being defined. All layout specifications must declare a Main
symbol.
Note
In DaeDaLus, declarations may behave differently from what you are used to if you are unfamiliar with pure functional programming. In pure functional languages, variables are immutable: once defined, they cannot be redefined. While this may seem limiting at first, it makes the behavior of code far easier to reason about, since each name always refers to the same thing. Anywhere we see the name used, we can substitute its one precise definition without changing the meaning of the program or worrying about changes to global state.
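The substitution property described in this note can be sketched in Python. This is only an analogy, not DaeDaLus code: Python does not enforce immutability, so the sketch assumes by convention that the name is never reassigned, and the function double is a hypothetical example.

```python
# Hypothetical pure function: its result depends only on its argument.
def double(n: int) -> int:
    return 2 * n


# By convention (not enforced by Python), `total` is defined once and never reassigned.
total = double(21)

# Using the name twice...
with_name = total + total

# ...means the same as substituting its definition at each use.
substituted = double(21) + double(21)

assert with_name == substituted
```

In a pure language this equivalence holds for every declaration, which is what makes it safe to read a name as shorthand for its definition.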
Deducing Types From Names
Though it’s probably not obvious, there’s actually another piece of information
we can deduce from this declaration: what type of entity Main refers to.
In DaeDaLus, there are three sorts of entities that may be declared: parsers, semantic values, and character classes.
Parsers always have a name starting with an uppercase letter; so, in the
example declaration we’re currently looking at, Main is a parser.
In contrast, semantic values always have a name starting with a lowercase
letter. Looking ahead in the PPM example, by this convention, the declaration
of addDigit specifies a semantic value; in particular, a function from two
semantic values to another semantic value.
Finally, character classes always have a name starting with $. The PPM
example does not specify any character classes, but we’ll have more to say
about them later.
Note
In summary:
Parser names begin with uppercase letters
Semantic value names begin with lowercase letters
Character class names begin with $
Keeping these rules in mind will save you a lot of trouble debugging in the future!
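The naming rules can be summarized as a small classifier. The following is a minimal Python sketch, not part of DaeDaLus itself: the function classify_name and the sample name $digit are hypothetical, and the sketch looks only at the first character of a name.

```python
def classify_name(name: str) -> str:
    """Classify a declared name by its first character, following the rules above."""
    if not name:
        raise ValueError("a declaration needs a non-empty name")
    first = name[0]
    if first == "$":
        return "character class"
    if first.isupper():
        return "parser"
    if first.islower():
        return "semantic value"
    raise ValueError(f"name {name!r} does not match any of the three rules")


# Main, Token, P, and addDigit appear in the PPM example; $digit is a made-up name.
for example in ["Main", "Token", "P", "addDigit", "$digit"]:
    print(example, "->", classify_name(example))
```

The same check applies to parameter names, which is how a reader can tell what kind of argument a parameterized declaration expects.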
Parameterized Declarations
The next declaration in the PPM specification shows that a declaration may be parameterized:
def Token P =
  block
    $$ = P
    Many (1..) WS
Parameter names follow the same rules outlined above: uppercase names indicate
parser parameters, lowercase names indicate semantic value parameters, and
names starting with $ indicate character class parameters. In this example,
since P is capitalized, we know it is a parser parameter.
The Token parser also demonstrates similarities between DaeDaLus and
parser combinator libraries such as parsec for the Haskell programming
language. Rather than writing complex parsing algorithms from scratch, you
build parsers from a library of primitive parsers and higher-order combining
operations. If you’re already familiar with these sorts of
parsing libraries, you’re well on your way to being a productive DaeDaLus user!
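For readers who have not used a parser combinator library, the idea behind Token can be sketched in Python. This is an analogy under stated assumptions, not the DaeDaLus implementation: a parser is modeled as a function from an input string and a position to either a (value, new position) pair or None, the helpers match1, many1, and token are hypothetical names, and the set of whitespace characters is assumed rather than taken from the PPM specification.

```python
from typing import Any, Callable, List, Optional, Tuple

# A parser returns (value, new position) on success, or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def match1(allowed: str) -> Parser:
    """Primitive parser: accept exactly one character from `allowed`."""
    def parse(text: str, pos: int) -> Optional[Tuple[Any, int]]:
        if pos < len(text) and text[pos] in allowed:
            return text[pos], pos + 1
        return None
    return parse


def many1(p: Parser) -> Parser:
    """Combinator: run `p` one or more times and collect the results."""
    def parse(text: str, pos: int) -> Optional[Tuple[Any, int]]:
        values: List[Any] = []
        while True:
            result = p(text, pos)
            if result is None:
                break
            value, new_pos = result
            values.append(value)
            if new_pos == pos:  # guard against parsers that consume nothing
                break
            pos = new_pos
        return (values, pos) if values else None
    return parse


# Assumed whitespace set, for illustration only.
ws = match1(" \t\r\n")


def token(p: Parser) -> Parser:
    """Run `p`, then require one or more whitespace characters; keep p's result."""
    def parse(text: str, pos: int) -> Optional[Tuple[Any, int]]:
        result = p(text, pos)
        if result is None:
            return None
        value, pos = result
        trailing = many1(ws)(text, pos)
        if trailing is None:
            return None
        _, pos = trailing
        return value, pos
    return parse


digit = match1("0123456789")
number = many1(digit)
```

As a usage example, token(number)("42  rest", 0) succeeds with the two digit characters and stops at the r, while token(number)("42", 0) fails because no whitespace follows the digits. Like Token P, the token function takes another parser as its argument, which is what makes it a higher-order combining operation.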