G52CMP: Lecture 6 - cs.nott.ac.ukpsznhn/G52CMP-Obsolete/LectureNotes-2010/lecture06.pdf · 6–20)....

G52CMP: Lecture 6

Defining Programming Languages II

Henrik Nilsson

University of Nottingham, UK

G52CMP: Lecture 6 – p.1/30


MiniTriangle

In part II of the coursework, we are going to use a language called MiniTriangle:

• Originates from Watt & Brown (defined at pp. 6–20).

• Our version has evolved and is now quite different in some respects.

• We use MiniTriangle in this lecture to:
  - Illustrate the ideas of concrete and abstract syntax
  - Introduce you to the language


This Lecture

• Concrete Syntax
  - Lexical syntax for MiniTriangle
  - Context-free syntax for MiniTriangle

• Abstract Syntax
  - Abstract syntax for MiniTriangle

• Representing Abstract Syntax Trees (ASTs)


A MiniTriangle Program

This is an example of a valid MiniTriangle program:

    let
        var y: Integer := 0
    in
        begin
            y := y + 1;
            putint(y)
        end


Concrete Syntax

The Concrete Syntax, or surface syntax, of a language is usually defined at two levels:

• The Lexical syntax: the syntax of
  - language symbols or tokens
  - white space
  - comments

• The Context-Free syntax.


Regular Grammars

• Lexical syntax is usually defined as a Regular Language (RL).

• A regular language can be described by
  - a Regular Expression
  - a Context-Free Grammar (as the RLs are a proper subset of the CFLs)

• If a grammar G is left-linear or right-linear, then G is a regular grammar and L(G) is a regular language.

• Regular languages are easy to recognise (DFA).


Right-linear Grammar

A CFG G = (N, T, P, S) is right-linear if all its productions are of the forms

    A → wB
    A → w

where A, B ∈ N and w ∈ T*.

Example: The regular language 0(10)* is generated by the right-linear grammar

    S → 0A
    A → 10A | ε
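A right-linear grammar translates almost mechanically into a recogniser: one function per non-terminal, each consuming the terminal prefix of a production and then continuing with the trailing non-terminal. The following is a minimal sketch for the example grammar only; the function names recS, recA and accepts are made up for illustration and are not part of the course code.

```haskell
-- Recogniser for the language 0(10)*, following the right-linear grammar
--   S -> 0A
--   A -> 10A | epsilon
-- Each non-terminal becomes one function (names are illustrative only).

-- Non-terminal S: the input must start with the terminal '0'.
recS :: String -> Bool
recS ('0' : rest) = recA rest
recS _            = False

-- Non-terminal A: either "10" followed by A again, or the empty string.
recA :: String -> Bool
recA ('1' : '0' : rest) = recA rest
recA []                 = True
recA _                  = False

-- Does the whole string belong to 0(10)*?
accepts :: String -> Bool
accepts = recS
```

Because every production ends in at most one non-terminal, the recogniser never needs to remember anything beyond "which function am I in", which is exactly the state of a finite automaton.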


Left-linear Grammar

A CFG G = (N, T, P, S) is left-linear if all its productions are of the forms

    A → Bw
    A → w

where A, B ∈ N and w ∈ T*.

Example: The regular language 0(10)* is generated by the left-linear grammar

    S → S10 | 0


MiniTriangle Lexical Syntax (1)

    Program        → (Token | Separator)*
    Token          → Keyword | Identifier | IntegerLiteral | Operator
                   | , | ; | : | := | = | ( | ) | eot
    Keyword        → begin | const | do | else | end | if | in
                   | let | then | var | while
    Identifier     → Letter | Identifier Letter | Identifier Digit
                     except Keyword
    IntegerLiteral → Digit | IntegerLiteral Digit
    Operator       → + | - | * | / | < | <= | == | != | >= | > | && | || | !
    Separator      → Comment | space | eol
    Comment        → // (any character except eol)* eol


MiniTriangle Lexical Syntax (2)

Notes:

• Essentially a left-linear grammar.

• Not completely formal (e.g. the use of “except” for excluding keywords from identifiers).

• Note! Each individual character of a terminal is actually a terminal symbol! I.e., really:
      Keyword → b e g i n | c o n s t | . . .

• Special characters (e.g. eot, eol, space) are written by name. Note! They are single terminal symbols!


MiniTriangle: Tokens

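Some valid MiniTriangle tokens:
• const3 (Identifier)
• const (Keyword)
• 42 (Integer-Literal)
• + (Operator)

Q: Is const3 really a single token? The grammar is ambiguous! (const3 could also be read as the Keyword const followed by the Integer-Literal 3.)

A: An implicit “maximal munch rule” is used to disambiguate: the scanner always takes the longest possible token.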
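Maximal munch can be sketched in a few lines: read the longest run of identifier characters first, and only then decide whether the lexeme is a keyword. This is an illustrative sketch, not the lab scanner. The Token type and the names scanWord, isLetterChar and isIdChar are made up here, and Letter is assumed to mean an ASCII letter, since the slides do not define it.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Illustrative token type covering only the two classes discussed here.
data Token = Keyword String | Identifier String
  deriving (Eq, Show)

-- Keyword list taken from the lexical grammar.
keywords :: [String]
keywords =
  ["begin", "const", "do", "else", "end", "if", "in", "let", "then", "var", "while"]

-- Assumption: Letter means an ASCII letter.
isLetterChar :: Char -> Bool
isLetterChar c = isAsciiLower c || isAsciiUpper c

-- Characters that may continue an identifier.
isIdChar :: Char -> Bool
isIdChar c = isLetterChar c || isDigit c

-- Scan one keyword or identifier from the front of the input.
-- 'span' consumes the longest possible prefix (maximal munch); the
-- keyword check happens only after the whole lexeme has been read.
scanWord :: String -> Maybe (Token, String)
scanWord s@(c : _)
  | isLetterChar c =
      let (lexeme, rest) = span isIdChar s
          tok | lexeme `elem` keywords = Keyword lexeme
              | otherwise              = Identifier lexeme
      in  Just (tok, rest)
scanWord _ = Nothing
```

With this ordering, const3 is read as one lexeme and classified as an Identifier, while const on its own is classified as a Keyword.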


MiniTriangle: Non Tokens

Some non-tokens:
• 123abc (two tokens: Integer-Literal 123 and Identifier abc)
• put_x (Identifier put, illegal character “_”, Identifier x)
• 3.14 (Integer-Literal 3, illegal character “.”, Integer-Literal 14)
• 3e8 (two tokens: Integer-Literal 3 and Identifier e8)
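The splits listed above can be reproduced with a deliberately simplified scanner. The sketch below handles only integer literals, identifiers and white space, and reports every other character as illegal. Keywords, operators, punctuation and comments are left out on purpose, so it is not the MiniTriangle scanner. The Lexeme type and the name scan are made up for illustration, and Letter is assumed to mean an ASCII letter.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit, isSpace)

-- Illustrative result type: just enough to show how the input is split.
data Lexeme
  = IntLit Integer
  | Ident String
  | Illegal Char
  deriving (Eq, Show)

-- Assumption: Letter means an ASCII letter.
isLetterChar :: Char -> Bool
isLetterChar c = isAsciiLower c || isAsciiUpper c

-- Split the input into lexemes, always taking the longest match.
scan :: String -> [Lexeme]
scan [] = []
scan s@(c : cs)
  | isSpace c      = scan cs                      -- skip white space
  | isDigit c      = let (ds, rest) = span isDigit s
                     in  IntLit (read ds) : scan rest
  | isLetterChar c = let (w, rest) = span (\x -> isLetterChar x || isDigit x) s
                     in  Ident w : scan rest
  | otherwise      = Illegal c : scan cs          -- anything else is rejected
```

An integer literal stops at the first non-digit and an identifier must start with a letter, which is why 123abc and 3e8 each fall apart into two tokens.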


MiniTriangle Context-Free Syntax (1)

    Program  → Command
    Commands → Command
             | Command ; Commands
    Command  → VarExpression := Expression
             | VarExpression ( Expressions )
             | if Expression then Command else Command
             | while Expression do Command
             | let Declarations in Command
             | begin Commands end


MiniTriangle Context-Free Syntax (2)

    Expressions       → Expression
                      | Expression , Expressions
    Expression        → PrimaryExpression
                      | Expression Operator PrimaryExpression
    PrimaryExpression → IntegerLiteral
                      | VarExpression
                      | Operator PrimaryExpression
                      | ( Expression )
    VarExpression     → Identifier


MiniTriangle Context-Free Syntax (3)

    Declarations → Declaration
                 | Declaration ; Declarations
    Declaration  → const Identifier : TypeDenoter = Expression
                 | var Identifier : TypeDenoter
                 | var Identifier : TypeDenoter := Expression
    TypeDenoter  → Identifier


Another MiniTriangle Program

The following is a syntactically valid MiniTriangle program (slightly changed from earlier to save some space):

    let
        var y: Integer
    in
        begin
            y := y + 1;
            putint(y)
        end


Parse Tree for the Program

(Children are indented below their parent.)

    Program
      Command
        let
        Declarations
          Declaration
            var
            Identifier
              y
            :
            TypeDenoter
              Identifier
                Integer
        in
        Command
          begin
          Commands
            Command
              VarExpression
                Identifier
                  y
              :=
              Expression
                Expression
                  PrimaryExpression
                    VarExpression
                      Identifier
                        y
                Operator
                  +
                PrimaryExpression
                  IntegerLiteral
                    1
            ;
            Commands
              Command
                VarExpression
                  Identifier
                    putint
                (
                Expressions
                  Expression
                    PrimaryExpression
                      VarExpression
                        Identifier
                          y
                )
          end


Exercise 1

Draw the parse tree for the following MiniTriangle program:

    while b do
        n := 0


Why a Lexical Grammar? (1)

Together, the lexical grammar and the context-free grammar specify the concrete syntax.

In our case, both grammars are expressed in (E)BNF and look similar.

So . . .
• Why not join them?
• Why not do away with scanning, and just do parsing?


Why a Lexical Grammar? (2)

Answer:

• Simplicity: dealing with white space and comments in the context-free grammar becomes extremely complicated. (Try it!)

• Efficiency:
  - Working on classified groups of characters (tokens) facilitates parsing: it may be possible to use a simpler parsing algorithm.
  - Grouping and classifying characters by as simple means as possible increases efficiency.


MiniTriangle Abstract Syntax (1)

This grammar specifies the phrase structure of MiniTriangle. In addition, it gives node labels to be used when drawing Abstract Syntax Trees.

    Program → Command                                  Program
    Command → Expression := Expression                 CmdAssign
            | Expression ( Expression* )               CmdCall
            | Command*                                 CmdSeq
            | if Expression then Command else Command  CmdIf
            | while Expression do Command              CmdWhile
            | let Declaration* in Command              CmdLet


MiniTriangle Abstract Syntax (2)

    Expression  → IntegerLiteral                                ExpLitInt
                | Name                                          ExpVar
                | Expression ( Expression* )                    ExpApp
    Declaration → const Name : TypeDenoter = Expression         DeclConst
                | var Name : TypeDenoter (:= Expression | ε)    DeclVar
    TypeDenoter → Name                                          TDBaseType

Note: Keywords and other fixed-spelling terminals serve only to make the connection with the concrete syntax clear.

Identifier ⊆ Name, Operator ⊆ Name
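One consequence of Operator ⊆ Name is that the abstract syntax needs no separate node for operator expressions: y + 1 is simply the application of the name + to two arguments. A minimal sketch in Haskell, anticipating the datatype representation introduced later in the lecture; the helper names binApp and unApp are made up for illustration.

```haskell
type Name = String

-- Expression type mirroring the labels of the abstract syntax.
data Expression
  = ExpLitInt Integer
  | ExpVar Name
  | ExpApp Expression [Expression]
  deriving (Eq, Show)

-- Binary operator application, e.g. e1 + e2: the operator is just a
-- variable expression applied to two arguments.
binApp :: Name -> Expression -> Expression -> Expression
binApp op e1 e2 = ExpApp (ExpVar op) [e1, e2]

-- Unary (prefix) operator application, e.g. -e.
unApp :: Name -> Expression -> Expression
unApp op e = ExpApp (ExpVar op) [e]
```

For example, binApp "+" (ExpVar "y") (ExpLitInt 1) builds the node for y + 1.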


Abstract Syntax Tree for the Program

(Children are indented below their parent.)

    Program
      CmdLet
        DeclVar
          Name
            y
          TDBaseType
            Name
              Integer
        CmdSeq
          CmdAssign
            ExpVar
              Name
                y
            ExpApp
              ExpVar
                Name
                  +
              ExpVar
                Name
                  y
              ExpLitInt
                IntegerLiteral
                  1
          CmdCall
            ExpVar
              Name
                putint
            ExpVar
              Name
                y

Note: fixed-spelling terminals are omitted because they are implied by the node labels.


Exercise 2

Draw the Abstract Syntax Tree for the following MiniTriangle program:

    while b do
        n := 0


Concrete AST Representation

Mapping of abstract syntax to algebraic datatypes:

• Each non-terminal is mapped to a type.
• Each label is mapped to a constructor for the corresponding type.
• The constructors get one argument for each non-terminal and “variable” terminal in the RHS of the production.
• Sequences are represented by lists.
• Options are represented by values of type Maybe.
• “Literal” terminals are ignored.


Concrete AST Representation (2)

    data Command
        = CmdAssign Expression Expression
        | CmdCall   Expression [Expression]
        | CmdSeq    [Command]
        | CmdIf     Expression Command Command
        | CmdWhile  Expression Command
        | CmdLet    [Declaration] Command


Concrete AST Representation (3)

    data Expression
        = ExpLitInt Integer
        | ExpVar    Name
        | ExpApp    Expression [Expression]

    data Declaration
        = DeclConst Name TypeDenoter Expression
        | DeclVar   Name TypeDenoter (Maybe Expression)


Concrete AST Representation (4)

In fact, the lab code uses labelled fields:

    data Command
        = CmdAssign {
              caVar     :: Expression,
              caVal     :: Expression,
              cmdSrcPos :: SrcPos
          }
        | CmdCall {
              ccProc    :: Expression,
              ccArgs    :: [Expression],
              cmdSrcPos :: SrcPos
          }
        ...


Haskell Representation of the Program

    CmdLet
        [DeclVar "y" (TDBaseType "Integer") Nothing]
        (CmdSeq [CmdAssign (ExpVar "y")
                           (ExpApp (ExpVar "+")
                                   [ExpVar "y",
                                    ExpLitInt 1]),
                 CmdCall (ExpVar "putint")
                         [ExpVar "y"]])

Assumption: type Name = String
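To see the pieces fit together, the positional datatypes from the earlier slides can be collected into one module and the program value type-checked against them. This sketch makes a few assumptions. TypeDenoter is not defined on the slides, so it is given a single constructor TDBaseType, named after the label in the abstract syntax. The declaration is wrapped in a list because CmdLet takes [Declaration]. The traversal functions expNames, declNames and cmdNames are made up for illustration.

```haskell
type Name = String

data Command
  = CmdAssign Expression Expression
  | CmdCall Expression [Expression]
  | CmdSeq [Command]
  | CmdIf Expression Command Command
  | CmdWhile Expression Command
  | CmdLet [Declaration] Command
  deriving (Eq, Show)

data Expression
  = ExpLitInt Integer
  | ExpVar Name
  | ExpApp Expression [Expression]
  deriving (Eq, Show)

data Declaration
  = DeclConst Name TypeDenoter Expression
  | DeclVar Name TypeDenoter (Maybe Expression)
  deriving (Eq, Show)

-- Assumption: not shown on the slides; named after the abstract syntax label.
data TypeDenoter = TDBaseType Name
  deriving (Eq, Show)

-- let var y: Integer in begin y := y + 1; putint(y) end
example :: Command
example =
  CmdLet
    [DeclVar "y" (TDBaseType "Integer") Nothing]
    (CmdSeq
      [ CmdAssign (ExpVar "y") (ExpApp (ExpVar "+") [ExpVar "y", ExpLitInt 1])
      , CmdCall (ExpVar "putint") [ExpVar "y"]
      ])

-- Illustrative traversal: all names occurring in expressions, left to right.
expNames :: Expression -> [Name]
expNames (ExpLitInt _)   = []
expNames (ExpVar n)      = [n]
expNames (ExpApp f args) = expNames f ++ concatMap expNames args

declNames :: Declaration -> [Name]
declNames (DeclConst _ _ e) = expNames e
declNames (DeclVar _ _ me)  = maybe [] expNames me

cmdNames :: Command -> [Name]
cmdNames (CmdAssign v e)  = expNames v ++ expNames e
cmdNames (CmdCall p args) = expNames p ++ concatMap expNames args
cmdNames (CmdSeq cs)      = concatMap cmdNames cs
cmdNames (CmdIf c t e)    = expNames c ++ cmdNames t ++ cmdNames e
cmdNames (CmdWhile c b)   = expNames c ++ cmdNames b
cmdNames (CmdLet ds c)    = concatMap declNames ds ++ cmdNames c
```

Because every phrase class is its own type, a traversal such as cmdNames needs exactly one equation per constructor, and the compiler can warn when a case is missing.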


Exercise 3

Provide the Haskell representation of the following MiniTriangle fragment:

    while b do
        n := 0
