Grammar - courses.cs.washington.edu · Grammar for D, a little language program::= function-def |...
22-November-2002 cse413-19-Grammar © 2002 University of Washington 1
Grammar
CSE 413, Autumn 2002
Programming Languages
http://www.cs.washington.edu/education/courses/413/02au/
Recall: Programming Language Specs
• Syntax of every significant programming language is specified by a formal grammar
  » BNF or some variation thereon
• As language engineering has developed, formal methods have improved for defining useful grammars and tools for processing them
Productions
• The rules of a grammar are called productions
• Rules contain
  » Nonterminal symbols: grammar variables (program, statement, id, etc.)
  » Terminal symbols: concrete syntax that appears in programs: a, b, c, 0, 1, if, (, …
• Meaning of nonterminal ::= <sequence of terminals and nonterminals>
  » In a derivation, an instance of nonterminal can be replaced by the sequence of terminals and nonterminals on the right of the production
• Often, there are two or more productions for a single nonterminal; can use either at different times
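As a rough illustration of "replace an instance of a nonterminal by the right side of a production", here is a minimal Python sketch. The tiny two-rule grammar, the `PRODUCTIONS` table, and the `apply_production` helper are assumptions made for this example and are not part of the slides.

```python
# Productions as a mapping: nonterminal -> list of alternative right-hand sides.
# This small grammar is illustrative only; "id" and "int" are treated as tokens.
PRODUCTIONS = {
    "stmt": [["id", "=", "exp", ";"], ["return", "exp", ";"]],
    "exp": [["id"], ["int"]],
}

def apply_production(form, index, rhs):
    """Replace the nonterminal at form[index] with the right-hand side rhs."""
    nonterminal = form[index]
    if rhs not in PRODUCTIONS.get(nonterminal, []):
        raise ValueError(f"no production {nonterminal} ::= {' '.join(rhs)}")
    return form[:index] + rhs + form[index + 1:]

form = ["stmt"]
form = apply_production(form, 0, ["return", "exp", ";"])  # pick one alternative for stmt
form = apply_production(form, 1, ["int"])                 # then one alternative for exp
print(" ".join(form))  # return int ;
```

Because `stmt` has two alternatives, a different call sequence could just as well produce `id = id ;`.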
Grammar for D, a little language

program ::= function-def | program function-def
function-def ::= int id ( ) { statements }
    | int id ( parameters ) { statements }
    | int id ( ) { declarations statements }
    | int id ( parameters ) { declarations statements }
parameters ::= parameter | parameters , parameter
parameter ::= int id
declarations ::= declaration | declarations declaration
declaration ::= int id ;
statements ::= statement | statements statement
statement ::= id = exp ;
    | return exp ;
    | { statements }
    | if ( bool-exp ) statement
    | if ( bool-exp ) statement else statement
    | while ( bool-exp ) statement
bool-exp ::= rel-exp | ! ( rel-exp )
rel-exp ::= exp == exp | exp > exp
exp ::= term | exp + term | exp - term
term ::= factor | term * factor
factor ::= id | int | ( exp ) | id ( ) | id ( exps )
exps ::= exp | exps , exp
Grammar for Java, a big language
• The Java™ Language Specification, Second Edition
  » Entire document
      500+ pages
      Grammar productions with explanatory text
  » Chapter 18, Syntax
      8 pages of grammar productions, presented in "BNF-style"
Parsing
• Parsing: reconstruct the derivation (syntactic structure) of a program
• In principle, a single recognizer could work directly from the concrete, character-by-character grammar
  » In practice this is never done
Parsing & Scanning
• In real compilers the recognizer is split into two phases
  » Scanner: translate input characters to tokens
      Also, report lexical errors like illegal characters and illegal symbols
  » Parser: read token stream and reconstruct the derivation

[Figure: source → Scanner → tokens → Parser]
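The scanner half of this split can be sketched in a few lines of Python. This is only a sketch under assumptions: the token classes, the keyword set, and the `scan` function are invented here for a small C-like language and are not taken from the course materials.

```python
import re

# Token classes for a small C-like language; the exact set is an assumption.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"==|[-+*/=<>!(){};,]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"int", "if", "else", "while", "return"}

def scan(source):
    """Translate a string of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # Lexical error: no token class matches at this position.
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        if kind == "ID" and text in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":  # whitespace separates tokens but is not passed on
            tokens.append((kind, text))
        pos = match.end()
    return tokens

print(scan("if (a == 1) b = 2;"))
```

The parser then works on the resulting token list and never sees individual characters or whitespace.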
Parsing
• The syntax of most programming languages can be specified by a context-free grammar (CFG)
• Parsing
  » Given a grammar G and a sentence w in L(G), traverse the derivation (parse tree) for w in some standard order and do something useful at each node
  » The tree might not be produced explicitly, but the control flow of a parser corresponds to a traversal
Parse Tree Example

w:
    a = 1 ; if ( a + 1 ) b = 2 ;

G:
    program ::= statement | program statement
    statement ::= assignStmt | ifStmt
    assignStmt ::= id = expr ;
    ifStmt ::= if ( expr ) statement
    expr ::= id | int | expr + expr
    id ::= a | b | c | i | j | k | n | x | y | z
    int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

[Figure: parse tree for w under G. The root program expands to program statement. The inner program derives a statement that is the assignStmt a = 1 ; and the second statement is an ifStmt whose expr is a + 1 and whose body is the assignStmt b = 2 ;]
“Standard Order”
• For practical reasons we want the parser to be deterministic (no backtracking), and we want to examine the source program from left to right.
  » parse the program in linear time in the order it appears in the source file
Common Orderings
• Top-down
  » Start with the root
  » Traverse the parse tree depth-first, left-to-right (leftmost derivation)
  » LL(k)
• Bottom-up
  » Start at leaves and build up to the root
      Effectively a rightmost derivation in reverse
  » LR(k) and subsets (LALR(k), SLR(k), etc.)
“Something Useful”
• At each point (node) in the traversal, perform some semantic action
  » Construct nodes of full parse tree (rare)
  » Construct abstract syntax tree (common)
  » Construct linear, lower-level representation (more common in later parts of a modern compiler)
  » Generate target code on the fly (1-pass compiler; not common in production compilers, since one pass can't generate very good code)
Context-Free Grammars
• Formally, a grammar G is a tuple <N, Σ, P, S> where
  » N is a finite set of non-terminal symbols
  » Σ is a finite set of terminal symbols
  » P is a finite set of productions
      A subset of N × (N ∪ Σ)*
  » S is the start symbol, a distinguished element of N
      If not specified otherwise, this is usually assumed to be the non-terminal on the left of the first production
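One way to make the tuple concrete is to write it down as a data structure and check the side conditions. The sketch below is an assumption-laden illustration: the `Grammar` type, its field names, and `is_well_formed` are hypothetical, and the check that N and Σ are disjoint is the usual convention rather than something stated on the slide.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    nonterminals: frozenset   # N
    terminals: frozenset      # Sigma
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

def is_well_formed(g):
    """Check S in N, N and Sigma disjoint (assumed), and P a subset of N x (N u Sigma)*."""
    symbols = g.nonterminals | g.terminals
    return (
        g.start in g.nonterminals
        and not (g.nonterminals & g.terminals)
        and all(
            lhs in g.nonterminals and all(s in symbols for s in rhs)
            for lhs, rhs in g.productions
        )
    )

G = Grammar(
    nonterminals=frozenset({"expr"}),
    terminals=frozenset({"+", "int"}),
    productions=(("expr", ("expr", "+", "expr")), ("expr", ("int",))),
    start="expr",
)
print(is_well_formed(G))  # True
```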
Standard Notations
a, b, c             elements of Σ           terminals
w, x, y, z          elements of Σ*          strings of terminals
A, B, C             elements of N           non-terminals
X, Y, Z             elements of N ∪ Σ       grammar symbols
α, β, γ             elements of (N ∪ Σ)*    strings of symbols
A → α or A ::= α    if <A, α> in P          "non-terminal A can take the form α"
Derivation Relations
• α A γ => α β γ iff A ::= β in P
  » "=>" is read "derives"
• A =>* w if there is a chain of productions starting with A that generates w
  » "=>*" is the reflexive, transitive closure of "=>"
Derivation Relations
• w A γ =>lm w β γ iff A ::= β in P
  » derives leftmost
• α A w =>rm α β w iff A ::= β in P
  » derives rightmost
• We will only be interested in leftmost and rightmost derivations, not random orderings
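The difference between =>lm and =>rm is only which nonterminal gets rewritten at each step. The following Python sketch makes that visible; the small grammar and the helper names `step` and `derive` are assumptions chosen for the example.

```python
# Illustrative grammar; "int" and "+" are terminals here.
GRAMMAR = {
    "expr": [["expr", "+", "term"], ["term"]],
    "term": [["int"]],
}

def step(form, rhs, leftmost=True):
    """Rewrite the leftmost (or rightmost) nonterminal of form using rhs."""
    indices = range(len(form)) if leftmost else reversed(range(len(form)))
    for i in indices:
        if form[i] in GRAMMAR:
            if rhs not in GRAMMAR[form[i]]:
                raise ValueError(f"{form[i]} ::= {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

def derive(choices, leftmost=True):
    """Apply the chosen right-hand sides in order, printing each sentential form."""
    form = ["expr"]
    print(" ".join(form))
    for rhs in choices:
        form = step(form, rhs, leftmost)
        print("=>", " ".join(form))
    return form

plus, to_term, to_int = ["expr", "+", "term"], ["term"], ["int"]
# Leftmost:  expr => expr + term => term + term => int + term => int + int
derive([plus, to_term, to_int, to_int], leftmost=True)
# Rightmost: expr => expr + term => expr + int => term + int => int + int
derive([plus, to_int, to_term, to_int], leftmost=False)
```

Both runs end in the same sentence and describe the same parse tree; only the order in which its nodes are expanded differs.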
Languages
• For A in N, L(A) = { w | A =>* w }
• If S is the start symbol of grammar G, define L(G) = L(S)
  » The language derived by G is the language derived by the start symbol S
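For a small grammar, a finite slice of L(G) can be listed by exhaustively applying productions. The sketch below is a hedged illustration: `language_up_to` and the example grammar S ::= a S b | a b are assumptions for this example, and the length-based pruning is only valid when no production has an empty right-hand side.

```python
from collections import deque

def language_up_to(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    Assumes no production has an empty right-hand side, so sentential forms
    never shrink and any form longer than max_len can be discarded.
    """
    seen = {(start,)}
    queue = deque(seen)
    sentences = set()
    while queue:
        form = queue.popleft()
        for i, symbol in enumerate(form):
            if symbol in grammar:
                for rhs in grammar[symbol]:
                    new = form[:i] + tuple(rhs) + form[i + 1:]
                    if len(new) <= max_len and new not in seen:
                        seen.add(new)
                        queue.append(new)
                break  # expanding only the leftmost nonterminal is enough
        else:
            # No nonterminal left: the form is a sentence of the language.
            sentences.add(" ".join(form))
    return sentences

G = {"S": [["a", "S", "b"], ["a", "b"]]}
print(sorted(language_up_to(G, "S", 6), key=len))
# ['a b', 'a a b b', 'a a a b b b']
```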
Reduced Grammars
• Grammar G is reduced iff for every production A ::= α in G there is a derivation
      S =>* x A z => x α z =>* xyz
  » i.e., no production is useless
• Convention: we will use only reduced grammars
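Whether a grammar is reduced can be checked mechanically. The sketch below uses the common two-pass approach (first find nonterminals that can derive a terminal string, then find those reachable from the start symbol); it is one possible method, not necessarily the one used in the course, and `useful_productions` and the sample grammar are hypothetical.

```python
def useful_productions(grammar, start):
    """Return the productions (A, rhs) usable in some complete derivation from start."""
    # Pass 1: productive nonterminals, i.e. those that derive some terminal string.
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs not in productive and any(
                all(s not in grammar or s in productive for s in rhs)
                for rhs in alternatives
            ):
                productive.add(lhs)
                changed = True

    def rhs_ok(rhs):
        # Every symbol is a terminal or a productive nonterminal.
        return all(s not in grammar or s in productive for s in rhs)

    # Pass 2: nonterminals reachable from start through productive right-hand sides.
    reachable = set()
    worklist = [start] if start in productive else []
    while worklist:
        a = worklist.pop()
        if a in reachable:
            continue
        reachable.add(a)
        for rhs in grammar[a]:
            if rhs_ok(rhs):
                worklist.extend(s for s in rhs if s in grammar)

    return [(a, rhs) for a in grammar if a in reachable
            for rhs in grammar[a] if rhs_ok(rhs)]

# S ::= A b | B    A ::= a    B ::= B c    C ::= d
G = {"S": [["A", "b"], ["B"]], "A": [["a"]], "B": [["B", "c"]], "C": [["d"]]}
print(useful_productions(G, "S"))  # [('S', ['A', 'b']), ('A', ['a'])]
```

Here S ::= B and B ::= B c are useless because B never derives a terminal string, and C ::= d is useless because C cannot be reached from S, so this sample grammar is not reduced.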
Ambiguity
• Grammar G is unambiguous iff every w in L(G) has a unique leftmost (or rightmost) derivation
  » Fact: unique leftmost or unique rightmost implies the other
• A grammar without this property is ambiguous
  » Note that other grammars that generate the same language may be unambiguous
• We need unambiguous grammars for parsing
Ambiguous Grammar for Expressions
expr ::= expr + expr | expr - expr | expr * expr | expr / expr | int
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

• Show that this is ambiguous
  » How? Show two different leftmost or rightmost derivations for the same string
  » Equivalently: show two different parse trees for the same string
Example Derivation
Give a leftmost derivation of 2+3*4 and show the parse tree
expr ::= expr + expr | expr - expr | expr * expr | expr / expr | int
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Another Derivation
Give a different leftmost derivation of 2+3*4 and show the parse tree

expr ::= expr + expr | expr - expr | expr * expr | expr / expr | int
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Another Example
Give two different derivations of 5+6+7

expr ::= expr + expr | expr - expr | expr * expr | expr / expr | int
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
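The ambiguity in these exercises can also be confirmed by counting parse trees. The sketch below is specific to this one grammar (every tree either is a single digit or splits at some operator) and assumes one-character tokens; `count_parse_trees` is a name made up for the example.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}
DIGITS = set("0123456789")

def count_parse_trees(tokens):
    """Count parse trees for tokens under expr ::= expr op expr | int."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways expr derives tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] in DIGITS else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in OPERATORS:
                # tokens[k] is the operator at the root of the tree.
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("2+3*4"))  # 2
print(count_parse_trees("5+6+7"))  # 2
```

A count above 1 for any string is enough to show the grammar is ambiguous. For 2+3*4 the two trees correspond to the groupings 2+(3*4) and (2+3)*4.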
What’s going on here?
• The grammar has no notion of precedence or associativity
• Solution
  » Create a non-terminal for each level of precedence
  » Isolate the corresponding part of the grammar
  » Force the parser to recognize higher precedence subexpressions first
Classic Expression Grammar
expr ::= expr + term | expr - term | term
term ::= term * factor | term / factor | factor
factor ::= int | ( expr )
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
expr ::= expr + term | expr - term | term
term ::= term * factor | term / factor | factor
factor ::= int | ( expr )
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Derive 2 + 3 * 4
expr ::= expr + term | expr - term | term
term ::= term * factor | term / factor | factor
factor ::= int | ( expr )
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Derive 5 + 6 + 7
expr ::= expr + term | expr - term | term
term ::= term * factor | term / factor | factor
factor ::= int | ( expr )
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Derive 5 + (6 + 7)
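The grouping this grammar imposes on the three exercise strings can be checked with a small hand-written parser. This is a sketch under assumptions: the left-recursive rules are rewritten as loops (a standard trick for recursive descent that the slides do not cover), tokens are single characters, any decimal digit is accepted as an int, and `parse_expr` is a hypothetical name.

```python
DIGITS = set("0123456789")

def parse_expr(tokens):
    """Parse single-character tokens; return a fully parenthesised string."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        # expr ::= expr + term | expr - term | term, written as a loop
        left = term()
        while peek() in ("+", "-"):
            op = advance()
            right = term()
            left = f"({left} {op} {right})"  # left-associative grouping
        return left

    def term():
        # term ::= term * factor | term / factor | factor, written as a loop
        left = factor()
        while peek() in ("*", "/"):
            op = advance()
            right = factor()
            left = f"({left} {op} {right})"
        return left

    def factor():
        # factor ::= int | ( expr )
        token = peek()
        if token in DIGITS:
            return advance()
        if token == "(":
            advance()
            inner = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        raise SyntaxError(f"unexpected token {token!r}")

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

print(parse_expr(list("2+3*4")))    # (2 + (3 * 4))
print(parse_expr(list("5+6+7")))    # ((5 + 6) + 7)
print(parse_expr(list("5+(6+7)")))  # (5 + (6 + 7))
```

Multiplication binds tighter because `term` is reached only through `expr`, and repeated additions group to the left because each new `term` is attached to everything parsed so far.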
Another Classic Example
• Grammar for conditional statements
      ifStmt ::= if ( cond ) stmt
          | if ( cond ) stmt else stmt
  » Exercise: show that this is ambiguous
      How?
One Derivation
if ( cond ) if ( cond ) stmt else stmt

ifStmt ::= if ( cond ) stmt
    | if ( cond ) stmt else stmt
Another Derivation
if ( cond ) if ( cond ) stmt else stmt

ifStmt ::= if ( cond ) stmt
    | if ( cond ) stmt else stmt
Solving if Ambiguity
• Fix the grammar to separate if statements with else clause and if statements with no else
  » Done in Java reference grammar
  » Adds lots of non-terminals
• Use some ad-hoc rule in parser
  » “else matches closest unpaired if”
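The ad-hoc rule falls out naturally in a recursive-descent parser that consumes an else as soon as it sees one. The sketch below is an illustration only: `parse_statement` is a hypothetical helper, and `cond` and `stmt` are treated as opaque tokens, as in the grammar above.

```python
def parse_statement(tokens, pos=0):
    """Return (tree, next_pos); a tree is 'stmt' or ('if', then_tree, else_tree_or_None)."""
    if pos < len(tokens) and tokens[pos] == "stmt":
        return "stmt", pos + 1
    if tokens[pos:pos + 4] != ["if", "(", "cond", ")"]:
        raise SyntaxError(f"unexpected input at token {pos}")
    then_tree, pos = parse_statement(tokens, pos + 4)
    else_tree = None
    # Greedy choice: an 'else' seen here is claimed by this if, which is the
    # closest one still without an else ("else matches closest unpaired if").
    if pos < len(tokens) and tokens[pos] == "else":
        else_tree, pos = parse_statement(tokens, pos + 1)
    return ("if", then_tree, else_tree), pos

tokens = "if ( cond ) if ( cond ) stmt else stmt".split()
tree, end = parse_statement(tokens)
print(tree)  # ('if', ('if', 'stmt', 'stmt'), None)
```

The else ends up attached to the inner if; the outer if has no else branch. The other parse tree allowed by the ambiguous grammar is simply never produced.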