Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation
![Page 1: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/1.jpg)
Theory of Computation
Finite Automata, Context-Free Grammars, & Compilation
Vladimir Kulyukin
![Page 2: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/2.jpg)
Outline
Programming Language L
Finite Automata, CFGs, and Compilation
Tokenization
Syntactic Analysis
Recursive-Descent Parsing
![Page 3: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/3.jpg)
Programming Language L
![Page 4: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/4.jpg)
L’s Tokens
● Input variables: X1, X2, X3, ...
● Local variables: Z1, Z2, Z3, ...
● Output variable: Y
● Labels: A1, B1, C1, D1, E1, A2, ...
● If the subscript is omitted, it is assumed to be 1. For example, X is the same as X1, Z is the same as Z1, and A is the same as A1.
![Page 5: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/5.jpg)
L’s Basic Instructions (Primitives)
1. V ← V + 1 (increment)
2. V ← V - 1 (decrement)
3. V ← V (no-op)
4. IF V ≠ 0 GOTO L (cond. branch)
NOTE: In instructions 1, 2, 3 the variables on the left-hand side and the right-hand side are the same.
![Page 6: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/6.jpg)
L’s Labeled Primitives
1. [L] V ← V + 1 (increment)
2. [L] V ← V - 1 (decrement)
3. [L] V ← V (no-op)
4. [L] IF V ≠ 0 GOTO L (cond. branch)
NOTE: At the beginning of the line the label is in square brackets. However, in conditional dispatches the square brackets are dropped after GOTO.
![Page 7: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/7.jpg)
Labeled Primitives: Examples
● [A1] X1 ← X1 + 1
● [B1] X23 ← X23 - 1
● [C10] Z12 ← Z12 + 1
● [E1] Y ← Y
● [D101] IF X1 != 0 GOTO E1
![Page 8: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/8.jpg)
The Output Value of L’s Program
● The output value of an L program is the value of the Y variable
● If an L program goes into an infinite loop, the value is undefined
● Thus, an L program implements a function that maps the values of the input variables into the value of Y
![Page 9: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/9.jpg)
Exit Label E
● We will assume that each L program has a unique exit label E (or E1)
● If a conditional dispatch with GOTO E or GOTO E1 is executed, control exits the program and its execution terminates
● If we want to be explicit about this, we can assume that the implicit last statement of every L-program is [E1] return Y
![Page 10: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/10.jpg)
Example
$$f(x) = \begin{cases} 1 & \text{if } x = 0 \\ x & \text{otherwise} \end{cases}$$
![Page 11: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/11.jpg)
Implementing f(x) in L
[A1] X1 ← X1 - 1
Y ← Y + 1
IF X1 ≠ 0 GOTO A1
Or, if we do not want to use subscripts:
[A] X ← X - 1
Y ← Y + 1
IF X ≠ 0 GOTO A
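A minimal interpreter sketch for checking this program; it is not part of the slides. It uses the concrete syntax introduced later in the deck (<= for assignment, != for inequality). It assumes, as is conventional for this kind of language but not stated here, that all variables other than the inputs start at 0 and that decrementing 0 leaves 0. Without the second assumption the program would loop forever on x = 0 instead of returning 1. The names run_l_program and max_steps are hypothetical, and labels with an omitted subscript are not normalized.

```python
def run_l_program(lines, inputs, max_steps=10_000):
    """Interpret a list of L instructions and return the final value of Y."""
    env = dict(inputs)          # assumption: every other variable starts at 0
    labels = {}                 # label name -> index of the instruction it marks
    program = []
    for line in lines:
        toks = line.split()
        if toks[0].startswith("["):
            labels.setdefault(toks[0][1:-1], len(program))
            toks = toks[1:]
        program.append(toks)

    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            # A real infinite loop cannot be detected; this is only a budget.
            raise RuntimeError("step budget exhausted; output treated as undefined")
        toks = program[pc]
        if toks[0] == "IF":                      # IF V != 0 GOTO L
            var, target = toks[1], toks[5]
            if env.get(var, 0) != 0:
                if target not in labels:         # e.g. the exit label E1
                    break
                pc = labels[target]
                continue
        elif len(toks) == 5:                     # V <= V + 1  or  V <= V - 1
            var, op = toks[0], toks[3]
            if op == "+":
                env[var] = env.get(var, 0) + 1
            else:
                # assumption: decrementing 0 leaves 0
                env[var] = max(0, env.get(var, 0) - 1)
        # a three-token instruction is the no-op V <= V
        pc += 1
    return env.get("Y", 0)


F_PROGRAM = [
    "[A1] X1 <= X1 - 1",
    "Y <= Y + 1",
    "IF X1 != 0 GOTO A1",
]
```

Running run_l_program(F_PROGRAM, {"X1": 0}) gives 1, and for any x > 0 the loop body runs x times, so Y ends at x.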
![Page 12: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/12.jpg)
Three Stages of Compilation
● Syntactic Analysis: The source program is processed to determine its conformity to the language grammar and its structure
● Contextual Analysis: The output of the syntactic analysis (a parse tree) is checked for its conformity to the language’s contextual constraints
● Code Generation: The checked parse tree is used to generate the target code, e.g. Java byte code or assembly or some other target language
![Page 13: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/13.jpg)
Components of Syntactic Analysis
● Syntactic Analysis consists of Tokenization and Parsing
● Tokenization: We have to define a set of FA’s (regular expressions) to tokenize input statements (primitive instructions)
● Parsing: We have to define a CFG to map tokenized input statements (primitive instructions) into parse trees.
![Page 14: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/14.jpg)
Tokenization: Two Basic Design Principles
● Zero Token Ambiguity: Each sequence of non-white-space characters must be mapped to at most one token
● Zero Statement (Instruction) Ambiguity: Each sequence of tokens recognized in between the beginning of a line and a newline character must have at most one parse tree
![Page 15: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/15.jpg)
Tokenization of Programming Language L
![Page 16: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/16.jpg)
Sample L Program
Here is a sample program in L:
[A1] X1 <= X1 - 1
Y <= Y + 1
IF X1 != 0 GOTO A1
![Page 17: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/17.jpg)
Tokenization: Input Variables (InputVarToken)
Input variables are tokens of the form X1, X2, X3, etc. In general, an input variable is Xk, where k is a natural number greater than 0. An NFA is as follows:
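(NFA diagram: the start state moves on X to a second state, which moves on any digit in [1-9] to the accepting state; the accepting state loops on any digit in [0-9]. Equivalent regular expression: X[1-9][0-9]*)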
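A small sketch of how such an automaton can be simulated by hand (not from the slides). The state names and the function name accepts_input_var are hypothetical. It assumes the usual reading of the diagram: X, then one digit from 1 to 9, then any number of digits from 0 to 9.

```python
def accepts_input_var(s):
    """Simulate a three-state automaton for input variables such as X1 or X23."""
    state = "start"
    for ch in s:
        if state == "start" and ch == "X":
            state = "seen_X"
        elif state == "seen_X" and ch in "123456789":
            state = "accept"
        elif state == "accept" and ch in "0123456789":
            state = "accept"          # self-loop on further digits
        else:
            return False              # no transition available: reject
    return state == "accept"
```

The same skeleton works for local variables by replacing X with Z.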
![Page 18: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/18.jpg)
Tokenization: Output Variables (OutputVarToken)
L has only one output variable: Y. Here is an NFA:
(NFA diagram: a single transition on Y from the start state to the accepting state.)
![Page 19: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/19.jpg)
Tokenization: Local Variables (LocalVarToken)
Local variables are tokens of the form Z1, Z2, Z3, etc. In general, a local variable is Zk, where k is a natural number greater than 0. An NFA is as follows:
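(NFA diagram: the start state moves on Z to a second state, which moves on any digit in [1-9] to the accepting state; the accepting state loops on any digit in [0-9]. Equivalent regular expression: Z[1-9][0-9]*)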
![Page 20: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/20.jpg)
Tokenization: Labels
● There are two places where a label can occur in a primitive instruction: at the beginning of a line and at the end of a line
● At the beginning of a line a label is bracketed; at the end of a line it is not
● Furthermore, labels that start with A, B, C, D are non-exit labels; labels that start with E are exit labels
![Page 21: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/21.jpg)
Tokenization: Non-Exit Non-Bracketed Labels (NELblToken)
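Non-exit labels that occur at the end of a line are tokens of the form Λ1, Λ2, Λ3, etc. In general, a label is Λk, where k is a natural number greater than 0 and Λ is in {A, B, C, D}. An NFA is as follows:
(NFA diagram: the start state moves on A, B, C, or D to a second state, which moves on any digit in [1-9] to the accepting state; the accepting state loops on any digit in [0-9]. Equivalent regular expression: [A-D][1-9][0-9]*)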
![Page 22: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/22.jpg)
Tokenization: Non-Exit Bracketed Labels (NEBrLblToken)
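Non-exit labels that occur at the beginning of a line are tokens of the form [Λ1], [Λ2], [Λ3], etc. In general, a label is [Λk], where k is a natural number greater than 0 and Λ is in {A, B, C, D}. An NFA is as follows:
(NFA diagram: a transition on the opening bracket [, then on A, B, C, or D, then on any digit in [1-9] to a state that loops on any digit in [0-9], then a transition on the closing bracket ] to the accepting state.)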
![Page 23: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/23.jpg)
Tokenization: Exit Non-Bracketed Label (ELblToken)
Every L program has a unique exit label (E1). If the exit label occurs at the end of a line, it is not bracketed. An NFA is as follows:
(NFA diagram: a transition on E followed by a transition on 1 to the accepting state.)
![Page 24: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/24.jpg)
Tokenization: Exit Bracketed Label (EBrLblToken)
Every L program has a unique exit label (E1). If the exit label occurs at the beginning of a line, it is bracketed. An NFA is as follows:
(NFA diagram: transitions on [, E, 1, and ] in that order, ending in the accepting state.)
![Page 25: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/25.jpg)
Tokenization: Operators
There are four operator tokens in L: <=, +, -, != . Here are possible NFAs for the operators:
● AssignOperToken: < followed by =
● NotEqOperToken: ! followed by =
● PlusOperToken: +
● MinusOperToken: -
![Page 26: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/26.jpg)
Tokenization: Keywords
L has two keywords: IF and GOTO. Two possible NFAs:
● IFToken: I followed by F
● GOTOToken: G, O, T, O in that order
![Page 27: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/27.jpg)
Tokenization: Literals
L has 2 literals: 0 and 1. Two possible NFAs:
● ZeroLitToken: 0
● OneLitToken: 1
![Page 28: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/28.jpg)
Complete List of Tokens
1. InputVarToken
2. OutputVarToken
3. LocalVarToken
4. NELblToken
5. ELblToken
6. NEBrLblToken
7. EBrLblToken
8. AssignOperToken
9. NotEqOperToken
10. PlusOperToken
11. MinusOperToken
12. IFToken
13. GOTOToken
14. ZeroLitToken
15. OneLitToken
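One way to write the fifteen recognizers down compactly is as regular expressions, one per token. This is a sketch, not the slides' implementation: the patterns are my reading of the NFAs described on the preceding slides, and TOKEN_PATTERNS and matching_tokens are hypothetical names.

```python
import re

# One regular expression per token type, in the order of the list above.
TOKEN_PATTERNS = {
    "InputVarToken":   r"X[1-9][0-9]*",
    "OutputVarToken":  r"Y",
    "LocalVarToken":   r"Z[1-9][0-9]*",
    "NELblToken":      r"[A-D][1-9][0-9]*",
    "ELblToken":       r"E1",
    "NEBrLblToken":    r"\[[A-D][1-9][0-9]*\]",
    "EBrLblToken":     r"\[E1\]",
    "AssignOperToken": r"<=",
    "NotEqOperToken":  r"!=",
    "PlusOperToken":   r"\+",
    "MinusOperToken":  r"-",
    "IFToken":         r"IF",
    "GOTOToken":       r"GOTO",
    "ZeroLitToken":    r"0",
    "OneLitToken":     r"1",
}


def matching_tokens(substring):
    """Return the names of all token types whose pattern matches the whole substring."""
    return [name for name, pattern in TOKEN_PATTERNS.items()
            if re.fullmatch(pattern, substring)]
```

Under the Zero Token Ambiguity principle, matching_tokens should never return more than one name for any substring.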
![Page 29: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/29.jpg)
Tokenization Algorithm: Outline
● Read in a line of text
● Partition the line into substrings on white space
● Run each substring through all possible NFAs
● Each substring can be recognized by at most one NFA
● If a substring is not recognized by any NFA, report an error; otherwise, create the appropriate token, depending on which NFA recognized the substring
● The output is a sequence of tokens
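The outline can be sketched in a few lines. This is an illustration under assumptions, not the course's implementation: regular expressions stand in for the NFAs, tokens are represented as (token type, substring) pairs, and the names PATTERNS and tokenize_line are hypothetical. Unlike the worked example in the slides, the brackets of a bracketed label are kept in the substring.

```python
import re

# Regular expressions standing in for the fifteen NFAs (an assumption).
PATTERNS = [
    ("InputVarToken", r"X[1-9][0-9]*"), ("OutputVarToken", r"Y"),
    ("LocalVarToken", r"Z[1-9][0-9]*"), ("NELblToken", r"[A-D][1-9][0-9]*"),
    ("ELblToken", r"E1"), ("NEBrLblToken", r"\[[A-D][1-9][0-9]*\]"),
    ("EBrLblToken", r"\[E1\]"), ("AssignOperToken", r"<="),
    ("NotEqOperToken", r"!="), ("PlusOperToken", r"\+"),
    ("MinusOperToken", r"-"), ("IFToken", r"IF"), ("GOTOToken", r"GOTO"),
    ("ZeroLitToken", r"0"), ("OneLitToken", r"1"),
]


def tokenize_line(line):
    """Split a line on white space and map each substring to exactly one token."""
    tokens = []
    for substring in line.split():
        names = [name for name, pattern in PATTERNS
                 if re.fullmatch(pattern, substring)]
        if len(names) != 1:
            # zero matches: unknown substring; more than one: ambiguous token set
            raise ValueError(f"cannot tokenize {substring!r}: matches {names}")
        tokens.append((names[0], substring))
    return tokens
```

For example, tokenize_line("Y <= Y + 1") returns five pairs, starting with ("OutputVarToken", "Y").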
![Page 30: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/30.jpg)
Tokenization Algorithm: Details
● Activate all Lazy NFAs for token recognition
● Read the file character by character; when a non-white-space character is read, go into the token recognition mode
● In the token recognition mode, when a character is read, feed it to every NFA so that all NFAs that recognize it make their transitions; if no NFA can transition, fail
● When a white-space character is read, switch off the token recognition mode and check if any NFAs accepted the sequence of non-white space characters
– if yes, construct the appropriate token and reset each NFA back to its start state
– if none of the NFAs accepted or more than one accepted, fail
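A sketch of the character-by-character variant, with all automata stepped in parallel. Everything here is an assumption made for illustration: the slides do not define "Lazy NFA", so each automaton is simply a transition table built by a hypothetical helper chain, and AUTOMATA, step, and tokenize are hypothetical names.

```python
def chain(*steps):
    """Build a linear automaton. A plain string is a transition to the next state
    on any of its characters; a tuple ("*", chars) is a self-loop on the current state."""
    trans, state = {}, 0
    for s in steps:
        if isinstance(s, tuple):
            trans.setdefault(state, []).append((s[1], state))
        else:
            trans.setdefault(state, []).append((s, state + 1))
            state += 1
    return trans, state                      # the last state is the accepting state


D19, D09 = "123456789", "0123456789"
AUTOMATA = {
    "InputVarToken": chain("X", D19, ("*", D09)),
    "OutputVarToken": chain("Y"),
    "LocalVarToken": chain("Z", D19, ("*", D09)),
    "NELblToken": chain("ABCD", D19, ("*", D09)),
    "ELblToken": chain("E", "1"),
    "NEBrLblToken": chain("[", "ABCD", D19, ("*", D09), "]"),
    "EBrLblToken": chain("[", "E", "1", "]"),
    "AssignOperToken": chain("<", "="),
    "NotEqOperToken": chain("!", "="),
    "PlusOperToken": chain("+"),
    "MinusOperToken": chain("-"),
    "IFToken": chain("I", "F"),
    "GOTOToken": chain("G", "O", "T", "O"),
    "ZeroLitToken": chain("0"),
    "OneLitToken": chain("1"),
}


def step(trans, states, ch):
    """Advance a set of current states on one character."""
    return {nxt for s in states for chars, nxt in trans.get(s, []) if ch in chars}


def tokenize(text):
    tokens, live, lexeme = [], None, ""      # live is None outside recognition mode
    for ch in text + "\n":                   # trailing white space flushes the last token
        if ch.isspace():
            if live is not None:
                accepted = [n for n, st in live.items() if AUTOMATA[n][1] in st]
                if len(accepted) != 1:
                    raise ValueError(f"{lexeme!r} accepted by {accepted}")
                tokens.append((accepted[0], lexeme))
                live, lexeme = None, ""      # reset every automaton
            continue
        if live is None:                     # enter token recognition mode
            live = {n: {0} for n in AUTOMATA}
        live = {n: step(AUTOMATA[n][0], st, ch) for n, st in live.items()}
        lexeme += ch
        if not any(live.values()):           # no automaton can transition
            raise ValueError(f"no token starts with {lexeme!r}")
    return tokens
```

Compared with the white-space-partitioning outline, this version can fail as soon as no automaton has a live state, without reading the rest of the substring.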
![Page 31: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/31.jpg)
Tokenization Example: Line 1
● [A1] X1 <= X1 - 1
● White space partitioning gives us the following substrings: “[A1]”, “X1”, “<=”, “X1”, “-”, “1”
● “[A1]” is recognized by the Non-Exit Bracketed Label NFA; so create NEBrLblToken(“A1”)
● “X1” is recognized by the Input Variable NFA; so create InputVarToken(“X1”)
● “<=” is recognized by the Assignment Operator NFA; so create AssignOperToken(“<=”)
● “X1” is recognized by the Input Variable NFA; so create InputVarToken(“X1”)
● “-” is recognized by the Minus Operator NFA; so create MinusOperToken(“-”)
● “1” is recognized by the One Literal NFA; so create OneLitToken(“1”)
● The output is:
– <NEBrLblToken(“A1”), InputVarToken(“X1”), AssignOperToken(“<=”), InputVarToken(“X1”), MinusOperToken(“-”), OneLitToken(“1”)>
![Page 32: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/32.jpg)
Tokenization Example: Line 1
The line [A1] X1 <= X1 - 1 gives us the following sequence of tokens:
● NEBrLblToken: “A1”
● InputVarToken: “X1”
● AssignOperToken: “<=”
● InputVarToken: “X1”
● MinusOperToken: “-”
● OneLitToken: “1”
![Page 33: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/33.jpg)
Tokenization Example: Line 2
The line Y <= Y + 1 gives us the following sequence of tokens:
● OutputVarToken: “Y”
● AssignOperToken: “<=”
● OutputVarToken: “Y”
● PlusOperToken: “+”
● OneLitToken: “1”
![Page 34: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/34.jpg)
Tokenization Example: Line 3
The line IF X1 != 0 GOTO A1 gives us the following sequence of tokens:
● IFToken: “IF”
● InputVarToken: “X1”
● NotEqOperToken: “!=”
● ZeroLitToken: “0”
● GOTOToken: “GOTO”
● NELblToken: “A1”
![Page 35: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/35.jpg)
Parsing
![Page 36: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/36.jpg)
Recursive Descent Parsing
● Recursive Descent Parsing is an algorithm that should be considered for any unambiguous CF grammar
● All programming languages are specified either with unambiguous CF grammars or with ambiguous CF grammars where ambiguity can be easily handled
● The basic step in designing an RDP parser is to design a parsing procedure parseN for every non-terminal symbol N in the grammar
![Page 37: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/37.jpg)
Developing Recursive-Descent Parser for L
● To develop a recursive-descent parser for L we need to accomplish three tasks:
– Develop a CFG G for L
– Derive a set of RD parsing procedures from G
– Implement the parsing procedures in a programming language (Java, C/C++, C#, Structured COBOL, etc.)
![Page 38: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/38.jpg)
A CFG Grammar for L
![Page 39: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/39.jpg)
A CFG Grammar for L
● Incrmnt → VarToken AssignOperToken VarToken PlusOperToken OneLitToken
– Note: this rule is simplified because, technically speaking, VarToken is not present in the list of tokens. So, we have to write additional productions of the form:
VarToken → InputVarToken | OutputVarToken | LocalVarToken
● Decrmnt → VarToken AssignOperToken VarToken MinusOperToken OneLitToken
● NOP → VarToken AssignOperToken VarToken
● CDisp → IFToken VarToken NotEqOperToken ZeroLitToken GOTOToken DispLBL
● DispLBL → NELblToken | ELblToken
![Page 40: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/40.jpg)
Top-Level CFG Productions
● LProgram → LInstructSEQ
– To recognize an L program is to recognize a sequence of L instructions
● LInstructSEQ → ε
– A sequence of L instructions can be empty
● LInstructSEQ → LInstruct LInstructSEQ
– A non-empty sequence of L instructions starts with an L instruction and is followed by a sequence of L instructions
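As a worked illustration (a sketch, not taken from the slides), a program of three instructions is derived by applying the recursive production three times and then the empty production once:

```latex
\begin{aligned}
\text{LProgram} &\Rightarrow \text{LInstructSEQ} \\
&\Rightarrow \text{LInstruct}\ \text{LInstructSEQ} \\
&\Rightarrow \text{LInstruct}\ \text{LInstruct}\ \text{LInstructSEQ} \\
&\Rightarrow \text{LInstruct}\ \text{LInstruct}\ \text{LInstruct}\ \text{LInstructSEQ} \\
&\Rightarrow \text{LInstruct}\ \text{LInstruct}\ \text{LInstruct}
\end{aligned}
```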
![Page 41: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/41.jpg)
Recursive-Descent Parsing Procedures
![Page 42: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/42.jpg)
Parsing Procedures for L
● Let us agree that each parsing procedure returns a ParseTree data structure (the base class)
● Consider the first rule in our grammar: LProgram → LInstructSEQ
    ParseTree parseLProgram(input, start_pos) {
        // LProgram has a single production, so delegate to the sequence parser
        ParseTree progTree = parseLInstructSEQ(input, start_pos);
        return progTree;
    }
![Page 43: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/43.jpg)
parseLInstructSEQ Procedure
● There are 2 productions: LInstructSEQ → ε | LInstruct LInstructSEQ
    ParseTree parseLInstructSEQ(input, start_pos) {
        if ( start_pos is past the last token of input )   // LInstructSEQ → ε
            return the empty LInstructSEQ;
        else {
            ParseTree firstInstruct = parseLInstruct(input, start_pos);
            if ( firstInstruct == null ) return null;       // syntax error
            ParseTree restInstructs = parseLInstructSEQ(input, firstInstruct.getNextPos());
            return new LInstructSEQ(firstInstruct, restInstructs);
        }
    }
![Page 44: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/44.jpg)
parseLInstruct Procedure
● Two productions for LInstruct: LInstruct → LblStmnt | Stmnt
    ParseTree parseLInstruct(input, start_pos) {
        // try the labeled form first; fall back to an unlabeled statement
        ParseTree lblSt = parseLblStmnt(input, start_pos);
        if ( lblSt == null )
            return parseStmnt(input, start_pos);
        else
            return lblSt;
    }
![Page 45: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/45.jpg)
parseLblStmnt Procedure
● There is one production for LblStmnt: LblStmnt → BrLbl Stmnt
    ParseTree parseLblStmnt(input, start_pos) {
        ParseTree brLbl = parseBrLbl(input, start_pos);
        if ( brLbl == null ) return null;
        else {
            // the statement starts right after the bracketed label
            ParseTree stmnt = parseStmnt(input, brLbl.getNextPos());
            if ( stmnt == null ) return null;
            else
                return new LblStmnt(brLbl, stmnt);
        }
    }
![Page 46: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/46.jpg)
parseBrLbl Procedure
● G has two productions for BrLbl:
BrLbl → NEBrLblToken | EBrLblToken
● Note that both right-hand sides consist of tokens. Remember that tokens are terminals to the parser
● So, in this case, instead of parsing we only have to make sure that one of these terminals is in the input
![Page 47: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/47.jpg)
parseBrLbl Procedure
    ParseTree parseBrLbl(input, start_pos) {
        // both alternatives are single tokens, so matching is a type check
        if ( input[start_pos] is a NEBrLblToken )
            return new BrLbl(input[start_pos]);
        else if ( input[start_pos] is an EBrLblToken )
            return new BrLbl(input[start_pos]);
        else
            return null;
    }
![Page 48: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/48.jpg)
parseIncrmnt Procedure
● The rest of the parsing procedures can be derived in a similar fashion
● There is one rule for Incrmnt:
Incrmnt → VarToken AssignOperToken VarToken PlusOperToken OneLitToken
● This rule does not require any parsing; it requires only matching of tokens
![Page 49: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/49.jpg)
parseIncrmnt Procedure
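    ParseTree parseIncrmnt(input, start_pos) {
        // VarToken stands for any of InputVarToken, OutputVarToken, LocalVarToken
        if ( fewer than 5 tokens remain at start_pos )
            return null;
        else if ( input[start_pos] is not a VarToken )
            return null;
        else if ( input[start_pos+1] is not an AssignOperToken )
            return null;
        else if ( input[start_pos+2] is not a VarToken )
            return null;
        else if ( input[start_pos+3] is not a PlusOperToken )
            return null;
        else if ( input[start_pos+4] is not a OneLitToken )
            return null;
        else
            // build the node from the matched tokens, not from the token types
            return new Incrmnt(input[start_pos], input[start_pos+1],
                               input[start_pos+2], input[start_pos+3], input[start_pos+4]);
    }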
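A runnable sketch of the same matching idea. The representation is an assumption: tokens are (token type, text) pairs, a parse result is a plain tuple carrying the matched tokens and the next position, and parse_incrmnt and VAR_KINDS are hypothetical names.

```python
# VarToken is shorthand for the three variable token types.
VAR_KINDS = {"InputVarToken", "OutputVarToken", "LocalVarToken"}


def parse_incrmnt(tokens, start_pos):
    """Match Incrmnt -> VarToken AssignOperToken VarToken PlusOperToken OneLitToken.

    Returns ("Incrmnt", matched_tokens, next_pos) on success and None otherwise.
    """
    expected = [VAR_KINDS, {"AssignOperToken"}, VAR_KINDS,
                {"PlusOperToken"}, {"OneLitToken"}]
    window = tokens[start_pos:start_pos + len(expected)]
    if len(window) < len(expected):          # not enough tokens left
        return None
    for (kind, _text), allowed in zip(window, expected):
        if kind not in allowed:
            return None
    return ("Incrmnt", window, start_pos + len(expected))
```

Note that the production does not force the two variables to be the same, although the NOTE on basic instructions requires it; that check would presumably belong to contextual analysis.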
![Page 50: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/50.jpg)
Parsing Example
Let us parse the following L program:
[A1] X1 <= X1 - 1
Y <= Y + 1
IF X1 != 0 GOTO A1
![Page 51: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/51.jpg)
Parsing Example: Line 1 Tokenized
The line [A1] X1 <= X1 - 1 gives us the following sequence of tokens:
● NEBrLblToken: “A1”
● InputVarToken: “X1”
● AssignOperToken: “<=”
● InputVarToken: “X1”
● MinusOperToken: “-”
● OneLitToken: “1”
![Page 52: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/52.jpg)
Parsing Example: Line 1 ParseTree
LInstruct
    LblStmnt
        BrLbl
            NEBrLblToken “[A1]”
        Stmnt
            Decrmnt
                InputVarToken “X1”
                AssignOperToken “<=”
                InputVarToken “X1”
                MinusOperToken “-”
                OneLitToken “1”
![Page 53: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/53.jpg)
Parsing Example: Line 2 Tokenized
The line Y <= Y + 1 gives us the following sequence of tokens:
● OutputVarToken: “Y”
● AssignOperToken: “<=”
● OutputVarToken: “Y”
● PlusOperToken: “+”
● OneLitToken: “1”
![Page 54: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/54.jpg)
Parsing Example: Line 2 ParseTree
LInstruct
    Stmnt
        Incrmnt
            OutputVarToken “Y”
            AssignOperToken “<=”
            OutputVarToken “Y”
            PlusOperToken “+”
            OneLitToken “1”
![Page 55: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/55.jpg)
Parsing Example: Line 3 Tokenized
The line IF X1 != 0 GOTO A1 gives us the following sequence of tokens:
● IFToken: “IF”
● InputVarToken: “X1”
● NotEqOperToken: “!=”
● ZeroLitToken: “0”
● GOTOToken: “GOTO”
● NELblToken: “A1”
![Page 56: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/56.jpg)
Parsing Example: Line 3 ParseTree
LInstruct
    Stmnt
        CDisp
            IFToken “IF”
            InputVarToken “X1”
            NotEqOperToken “!=”
            ZeroLitToken “0”
            GOTOToken “GOTO”
            NELblToken “A1”
![Page 57: Theory of Computation (Fall 2014): Finite State Automata, Context-Free Grammars, & Compilation](https://reader033.fdocuments.in/reader033/viewer/2022051412/548f97bdb479590d2b8b5158/html5/thumbnails/57.jpg)
Parsing Example: LProgram ParseTree
LProgram
    LInstructSEQ
        LInstruct “[A1] X1 <= X1 - 1”
        LInstruct “Y <= Y + 1”
        LInstruct “IF X1 != 0 GOTO A1”
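The whole example can be reproduced with a compact recursive-descent sketch. Several things here are assumptions rather than the slides' design: tokens are (token type, text) pairs, parse trees are nested tuples, the production Stmnt → Incrmnt | Decrmnt | NOP | CDisp is inferred from the parse trees above, DispLBL is folded into the CDisp rule, and all function and variable names are hypothetical. NOP is tried last because it is a prefix of Incrmnt and Decrmnt.

```python
VAR = {"InputVarToken", "OutputVarToken", "LocalVarToken"}
ASSIGN, ONE = {"AssignOperToken"}, {"OneLitToken"}

STMNT_RULES = [
    ("Incrmnt", [VAR, ASSIGN, VAR, {"PlusOperToken"}, ONE]),
    ("Decrmnt", [VAR, ASSIGN, VAR, {"MinusOperToken"}, ONE]),
    ("CDisp", [{"IFToken"}, VAR, {"NotEqOperToken"}, {"ZeroLitToken"},
               {"GOTOToken"}, {"NELblToken", "ELblToken"}]),
    ("NOP", [VAR, ASSIGN, VAR]),
]


def match(tokens, pos, expected):
    """Return the matched token slice, or None if the token types do not fit."""
    window = tokens[pos:pos + len(expected)]
    if len(window) == len(expected) and all(
            kind in allowed for (kind, _), allowed in zip(window, expected)):
        return window
    return None


def parse_stmnt(tokens, pos):
    for name, expected in STMNT_RULES:
        window = match(tokens, pos, expected)
        if window is not None:
            return ("Stmnt", (name, window)), pos + len(expected)
    return None


def parse_linstruct(tokens, pos):
    label = match(tokens, pos, [{"NEBrLblToken", "EBrLblToken"}])
    result = parse_stmnt(tokens, pos + 1 if label else pos)
    if result is None:
        return None
    stmnt, nxt = result
    if label:
        return ("LInstruct", ("LblStmnt", ("BrLbl", label[0]), stmnt)), nxt
    return ("LInstruct", stmnt), nxt


def parse_linstruct_seq(tokens, pos):
    if pos == len(tokens):                      # LInstructSEQ -> epsilon
        return ("LInstructSEQ",)
    result = parse_linstruct(tokens, pos)
    if result is None:
        raise SyntaxError(f"no instruction starts at token {pos}")
    first, nxt = result
    return ("LInstructSEQ", first, parse_linstruct_seq(tokens, nxt))


def parse_lprogram(tokens):
    return ("LProgram", parse_linstruct_seq(tokens, 0))


def statement_names(seq):
    """Walk an LInstructSEQ tree and list the kind of each statement."""
    names = []
    while len(seq) == 3:
        node = seq[1][1]                        # LblStmnt or Stmnt
        stmnt = node[2] if node[0] == "LblStmnt" else node
        names.append(stmnt[1][0])
        seq = seq[2]
    return names


PROGRAM_TOKENS = [
    ("NEBrLblToken", "[A1]"), ("InputVarToken", "X1"), ("AssignOperToken", "<="),
    ("InputVarToken", "X1"), ("MinusOperToken", "-"), ("OneLitToken", "1"),
    ("OutputVarToken", "Y"), ("AssignOperToken", "<="), ("OutputVarToken", "Y"),
    ("PlusOperToken", "+"), ("OneLitToken", "1"),
    ("IFToken", "IF"), ("InputVarToken", "X1"), ("NotEqOperToken", "!="),
    ("ZeroLitToken", "0"), ("GOTOToken", "GOTO"), ("NELblToken", "A1"),
]
```

Calling statement_names(parse_lprogram(PROGRAM_TOKENS)[1]) lists the three statement kinds in program order: Decrmnt, Incrmnt, CDisp.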