Uploaded by carmel-reed
1
Lex
2
Lex is a lexical analyzer generator: the analyzer it produces splits the input text into tokens.

Input:
    Var = 12 + 9;
    if (test > 20) temp = 0;
    else while (a < 20) temp++;

Output:
    Ident: Var
    Integer: 12
    Oper: +
    Integer: 9
    Semicolon: ;
    Keyword: if
    Paren: (
    Ident: test
    Oper: >
    ....
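The input/output pair above can be imitated with a few lines of Python. This is only a sketch of what a generated scanner does, not how Lex itself works: the token names come from the slide, while the patterns, the `TOKEN_SPEC` table and the `tokenize` function are assumptions made for illustration.

```python
import re

# Token classes named as on the slide; the patterns are assumptions of this sketch.
TOKEN_SPEC = [
    ("Keyword",   r"\b(?:if|else|while)\b"),  # listed first so keywords win over identifiers
    ("Ident",     r"[A-Za-z]+"),
    ("Integer",   r"[0-9]+"),
    ("Oper",      r"\+\+|[-+=<>]"),
    ("Paren",     r"[()]"),
    ("Semicolon", r";"),
    ("Skip",      r"\s+"),                    # whitespace produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token class, matched text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "Skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    source = "Var = 12 + 9;\nif (test > 20) temp = 0;\nelse while (a < 20) temp++;"
    for kind, lexeme in tokenize(source):
        print(f"{kind}: {lexeme}")
```

Running the script prints one `Kind: lexeme` line per token, starting with `Ident: Var`.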
3
For each kind of string there is a regular expression:

    "if" "then"      /* keywords */
    "+" "-" "="      /* operators */

The regular expressions are the input to Lex.
4
Regular expressions given to Lex:

    (0|1|2|3|4|5|6|7|8|9)+      /* integers */
    (a|b|..|z|A|B|...|Z)+       /* identifiers */
5
Integers: (0|1|2|3|4|5|6|7|8|9)+ can be written more compactly as [0-9]+
6
Identifiers: (a|b|..|z|A|B|...|Z)+ can be written more compactly as [a-zA-Z]+
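A quick way to convince yourself that the long and the short form describe the same strings is to compare them on a handful of samples. The sketch below uses Python's `re` module, whose syntax agrees with Lex for these two patterns; the helper name `same_language_on` and the sample list are made up for illustration, and agreeing on samples is a spot check, not a proof.

```python
import re

LONG_FORM = re.compile(r"(0|1|2|3|4|5|6|7|8|9)+")
SHORT_FORM = re.compile(r"[0-9]+")


def same_language_on(samples):
    """True if both patterns accept exactly the same strings among the samples."""
    return all(
        bool(LONG_FORM.fullmatch(s)) == bool(SHORT_FORM.fullmatch(s))
        for s in samples
    )


if __name__ == "__main__":
    print(same_language_on(["0", "12", "9800", "", "12a", "abc"]))
```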
7
Each regular expression has an action.
Examples:

    Regular expression    Action
    \n                    linenum++;
    [a-zA-Z]+             printf("identifier");
    [0-9]+                printf("integer");
8
Default action: ECHO;
It prints the matched string to the output. Lex applies it to any input that no rule matches.
9
A small program

%%
[a-zA-Z]+   printf("Identifier\n");
[0-9]+      printf("Integer\n");
[ \t\n]     ; /* skip spaces */
10
Input:
    1234 test
    var 566 78
    9800

Output:
    Integer
    Identifier
    Identifier
    Integer
    Integer
    Integer
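The behaviour of the small program can be reproduced in Python to check the output by hand. This is a sketch under assumptions: the `RULES` table mirrors the three Lex rules, while `scan` and its return value (a list of the printed words instead of real `printf` calls) are invented for the example.

```python
import re

# One entry per Lex rule: (pattern, word to print, or None to print nothing).
RULES = [
    (re.compile(r"[a-zA-Z]+"), "Identifier"),
    (re.compile(r"[0-9]+"), "Integer"),
    (re.compile(r"[ \t\n]"), None),  # skip spaces
]


def scan(text):
    """Return the words the scanner would print, in order."""
    printed = []
    pos = 0
    while pos < len(text):
        best = None  # (match length, label) of the longest match found so far
        for pattern, label in RULES:
            m = pattern.match(text, pos)
            # Strict ">" keeps the earlier rule when two matches have equal length.
            if m and (best is None or len(m.group()) > best[0]):
                best = (len(m.group()), label)
        if best is None:
            printed.append(text[pos])  # default action: ECHO the character
            pos += 1
            continue
        length, label = best
        if label is not None:
            printed.append(label)
        pos += length
    return printed


if __name__ == "__main__":
    print("\n".join(scan("1234 test\nvar 566 78\n9800\n")))
```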
11
Another program

%{
int linenum = 1;
%}
%%
[a-zA-Z]+   printf("Identifier\n");
[0-9]+      printf("Integer\n");
[ \t]       ; /* skip spaces */
\n          linenum++;
.           printf("Error in line: %d\n", linenum);
12
Input:
    1234 test
    var 566 78
    9800 +
    temp

Output:
    Integer
    Identifier
    Identifier
    Integer
    Integer
    Integer
    Error in line: 3
    Identifier
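To see where the error message and its line number come from, the second program can be mimicked in Python with one callable per action. As before this is a sketch: `scan_with_line_numbers` and the `state` dictionary standing in for the C variable `linenum` are assumptions, and messages are collected in a list instead of being printed.

```python
import re


def scan_with_line_numbers(text):
    """Return the messages the scanner would print, in order."""
    state = {"linenum": 1}  # plays the role of "int linenum = 1;"
    printed = []
    rules = [
        (re.compile(r"[a-zA-Z]+"), lambda: printed.append("Identifier")),
        (re.compile(r"[0-9]+"), lambda: printed.append("Integer")),
        (re.compile(r"[ \t]"), lambda: None),  # skip spaces
        (re.compile(r"\n"), lambda: state.update(linenum=state["linenum"] + 1)),
        # "." matches any single character except a newline.
        (re.compile(r"."), lambda: printed.append(f"Error in line: {state['linenum']}")),
    ]
    pos = 0
    while pos < len(text):
        best_len, best_action = 0, None
        for pattern, action in rules:
            m = pattern.match(text, pos)
            # Longest match wins; on a tie the earlier rule is kept.
            if m and len(m.group()) > best_len:
                best_len, best_action = len(m.group()), action
        # Every character matches "\n" or ".", so an action is always found.
        best_action()
        pos += best_len
    return printed


if __name__ == "__main__":
    print("\n".join(scan_with_line_numbers("1234 test\nvar 566 78\n9800 +\ntemp\n")))
```

The `+` on the third input line matches only the catch-all rule, which is why the message carries line number 3.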
13
Lex matches the longest input string

Regular expressions: "if" "ifend"

    Input:    ifend      if      ifn
    Matches:  "ifend"    "if"    "if" (the trailing n matches no rule)
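The longest-match rule is easy to state in code. The helper `longest_match` below is hypothetical and handles only literal patterns such as "if" and "ifend"; it reports the longest rule that matches at the current position, so for the input ifn it reports the prefix if, because neither rule covers the whole word. The last lines show that Python's own `re` alternation does not behave this way: it takes the first alternative that matches.

```python
import re


def longest_match(literals, text, pos=0):
    """Return the longest literal that matches text at pos, or None."""
    best = None
    for lit in literals:
        if text.startswith(lit, pos) and (best is None or len(lit) > len(best)):
            best = lit
    return best


if __name__ == "__main__":
    rules = ["if", "ifend"]
    for word in ["ifend", "if", "ifn"]:
        print(word, "->", longest_match(rules, word))
    # Contrast: ordered alternation stops at the first alternative that matches.
    print(re.match(r"if|ifend", "ifend").group())
```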
14
Internal Structure of Lex

Regular expressions -> NFA -> DFA -> Minimal DFA

Lex performs these conversions. The final states of the DFA are associated with actions.
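What "final states associated with actions" means can be shown with a tiny hand-written DFA for the two token classes used earlier. Nothing here is generated by Lex: the state names, the `TRANSITIONS` and `ACTIONS` tables and the `run_dfa` function are assumptions chosen for the sketch. The automaton runs as far as it can and remembers the last final state it passed through, which yields the longest match.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


# (state, input class) -> next state; missing entries mean "no transition".
TRANSITIONS = {
    ("start", "letter"): "in_ident",
    ("in_ident", "letter"): "in_ident",
    ("start", "digit"): "in_int",
    ("in_int", "digit"): "in_int",
}

# Final states and the action (here just a label) attached to each.
ACTIONS = {"in_ident": "Identifier", "in_int": "Integer"}


def run_dfa(text, pos):
    """Return (end offset, action) of the longest match at pos, or None."""
    state = "start"
    last_accept = None
    i = pos
    while i < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[i])))
        if nxt is None:
            break
        state = nxt
        i += 1
        if state in ACTIONS:
            last_accept = (i, ACTIONS[state])
    return last_accept


if __name__ == "__main__":
    print(run_dfa("abc12", 0))
    print(run_dfa("abc12", 3))
```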
15
Compilers
16
Program -> Compiler -> Machine Code

Program:
    v = 5;
    if (v>5)
      x = 12 + v;
    while (x !=3) {
      x = x - 3;
      v = 10;
    }
    ......

Machine Code:
    Add v,v,0
    cmp v,5
    jmplt ELSE
    THEN: add x, 12,v
    ELSE:
    WHILE:
    cmp x,3
    ...
17
The compiler consists of a lexical analyzer followed by a parser:

program -> [ lexical analyzer -> parser ] -> machine code
18
The parser knows the grammar of the programming language
19
Parser

    PROGRAM    -> STMT_LIST
    STMT_LIST  -> STMT STMT_LIST | STMT;
    STMT       -> EXPR ; | IF_STMT | WHILE_STMT | { STMT_LIST }
    EXPR       -> EXPR + EXPR | EXPR - EXPR | ID
    IF_STMT    -> if (EXPR) then STMT | if (EXPR) then STMT else STMT
    WHILE_STMT -> while (EXPR) do STMT
20
The parser constructs the derivation for the particular input program

    Grammar:     E -> E + E | E * E | INT
    Input:       10 + 2 * 5
    Derivation:  E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5
21
Derivation:
    E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5

Derivation tree:

            E
          / | \
         E  +  E
         |   / | \
        10  E  *  E
            |     |
            2     5
22
Derivation tree:

            E
          / | \
         E  +  E
         |   / | \
        10  E  *  E
            |     |
            2     5

Machine code:
    mult t1, 2, 5
    add  t2, 10, t1
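The step from derivation tree to machine code can be sketched as a post-order walk: generate code for the subtrees first, then one instruction for the operator at the node. The tuple encoding of the tree, the `MNEMONIC` table and the `generate` function are assumptions made for this example; only the instruction names `mult` and `add` and the temporaries `t1`, `t2` come from the slide. For the tree of 10 + 2 * 5 the multiplication node has the operands 2 and 5.

```python
# A tree node is either an int (a leaf) or a tuple (operator, left, right).
MNEMONIC = {"+": "add", "*": "mult"}


def generate(tree):
    """Return three-address instructions for the tree, one string per instruction."""
    code = []
    counter = 0

    def visit(node):
        nonlocal counter
        if isinstance(node, int):
            return str(node)
        op, left, right = node
        left_operand = visit(left)    # code for the subtrees is emitted first
        right_operand = visit(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{MNEMONIC[op]} {temp}, {left_operand}, {right_operand}")
        return temp

    visit(tree)
    return code


if __name__ == "__main__":
    print("\n".join(generate(("+", 10, ("*", 2, 5)))))
```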
23
Parsing
24
grammar + input string -> Parser -> derivation
25
Example:

Grammar:
    S -> SS
    S -> aSb
    S -> bSa
    S -> λ

Input: aabb
Derivation: ?
26
Exhaustive Search

Grammar: S -> SS | aSb | bSa | λ
Input: aabb

Phase 1:
    S => SS
    S => aSb
    S => bSa
    S => λ
27
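Phase 1, after discarding the derivations that cannot lead to aabb:
    S => SS
    S => aSb
    S => bSa      (discarded)
    S => λ        (discarded)

Input: aabb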
28
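Grammar: S -> SS | aSb | bSa | λ
Input: aabb

Phase 1 (surviving derivations):
    S => SS
    S => aSb

Phase 2:
    S => SS => SSS
    S => SS => aSbS
    S => SS => bSaS
    S => SS => S
    S => aSb => aSSb
    S => aSb => aaSbb
    S => aSb => abSab
    S => aSb => ab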
29
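Grammar: S -> SS | aSb | bSa | λ
Input: aabb

Phase 1 (surviving derivations):
    S => SS
    S => aSb

Phase 2, after discarding the derivations that cannot lead to aabb:
    S => SS => SSS
    S => SS => aSbS
    S => SS => bSaS      (discarded)
    S => SS => S
    S => aSb => aSSb
    S => aSb => aaSbb
    S => aSb => abSab    (discarded)
    S => aSb => ab       (discarded)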
30
Phase 2 (surviving derivations):
    S => SS => SSS
    S => SS => aSbS
    S => SS => S
    S => aSb => aSSb
    S => aSb => aaSbb

Phase 3:
    S => aSb => aaSbb => aabb
31
Final result of exhaustive search (top-down parsing)

Grammar:
    S -> SS
    S -> aSb
    S -> bSa
    S -> λ

Input: aabb
Derivation found by the parser: S => aSb => aaSbb => aabb
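The phase-by-phase search can be written as a breadth-first loop over sentential forms. The sketch below makes several assumptions that are not on the slides: variables are upper-case letters, only the leftmost variable is expanded in each step, the empty string stands for λ, and `max_phases` is a cut-off that guarantees termination (needed here because the grammar contains S -> λ). The names `GRAMMAR`, `can_still_derive` and `exhaustive_parse` are invented for the example.

```python
GRAMMAR = {"S": ["SS", "aSb", "bSa", ""]}  # "" stands for lambda


def can_still_derive(form, target):
    """Cheap pruning test for a sentential form that is not the target itself."""
    terminals = [c for c in form if not c.isupper()]
    if len(terminals) > len(target):
        return False  # terminals never disappear again
    first_var = next((i for i, c in enumerate(form) if c.isupper()), len(form))
    if not target.startswith(form[:first_var]):
        return False  # the terminal prefix is already wrong
    # A form without variables cannot change any more, so it is a dead end.
    return first_var < len(form)


def exhaustive_parse(target, start="S", max_phases=6):
    """Return a derivation of target as a list of sentential forms, or None."""
    frontier = [[start]]
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            i = next(idx for idx, c in enumerate(form) if c.isupper())
            for rhs in GRAMMAR[form[i]]:
                new_form = form[:i] + rhs + form[i + 1:]
                if new_form == target:
                    return derivation + [new_form]
                if can_still_derive(new_form, target):
                    next_frontier.append(derivation + [new_form])
        frontier = next_frontier
    return None


if __name__ == "__main__":
    print(" => ".join(exhaustive_parse("aabb")))
```

For the input aabb the function discards the same forms as the slides (bSa and λ in phase 1, then bSaS, abSab and ab in phase 2) and returns the derivation found in phase 3.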
32
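Time complexity of exhaustive search

Suppose there are no productions of the form
    A -> λ
    A -> B

Number of phases for string $w$: at most $2|w|$, since every derivation step then increases either the length of the sentential form or the number of terminals in it.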
33
For a grammar with $k$ rules:

Time for phase 1: $k$ ($k$ possible derivations)
34
Time for phase 2: $k^2$ ($k^2$ possible derivations)
35
Time for phase $2|w|$: $k^{2|w|}$ ($k^{2|w|}$ possible derivations)
36
Total time needed for string $w$:

$k + k^2 + \cdots + k^{2|w|}$ (phase 1 + phase 2 + ... + phase $2|w|$)

Extremely bad!!!
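To see how bad, the sum can be put in closed form as a geometric series. The worked numbers below ($k = 4$ rules and an input of length $|w| = 4$) are chosen only for illustration.

```latex
k + k^2 + \cdots + k^{2|w|}
  \;=\; \sum_{i=1}^{2|w|} k^i
  \;=\; \frac{k^{2|w|+1} - k}{k - 1}
  \qquad (k > 1)

% Example with k = 4 and |w| = 4:
\sum_{i=1}^{8} 4^i \;=\; \frac{4^{9} - 4}{3} \;=\; \frac{262140}{3} \;=\; 87380
```

The total is dominated by the last term $k^{2|w|}$, so the running time grows exponentially with the length of the input.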
37
There exist faster algorithms for specialized grammars

S-grammar: every production has the form A -> ax, where a is a symbol (terminal) and x is a string of variables, and each pair (A, a) appears at most once
38
S-grammar example:
    S -> aS
    S -> bSS
    S -> c

    S => aS => abSS => abcS => abcc

Each string has a unique derivation
39
For S-grammars:
In the exhaustive search parsing there is only one choice in each phase
Time for a phase: 1
Total time for parsing string $w$: $|w|$
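Because the pair (variable, next input symbol) selects at most one production, an S-grammar can be parsed with a single pass over the input and a stack of pending variables. The sketch below uses the example grammar S -> aS | bSS | c; the table `RULES` and the function `parse_s_grammar` are hypothetical names, and the function returns the derivation as a list of sentential forms or `None` if the string is rejected.

```python
# (variable, terminal) -> the string of variables that follows the terminal.
RULES = {
    ("S", "a"): "S",    # S -> aS
    ("S", "b"): "SS",   # S -> bSS
    ("S", "c"): "",     # S -> c
}


def parse_s_grammar(w, start="S"):
    """Return the unique derivation of w as a list of forms, or None."""
    stack = [start]          # pending variables, leftmost one on top (end of list)
    forms = [start]
    for consumed, ch in enumerate(w, start=1):
        if not stack:
            return None      # input left over but nothing to expand
        variable = stack.pop()
        rhs_vars = RULES.get((variable, ch))
        if rhs_vars is None:
            return None      # no production for this (variable, symbol) pair
        stack.extend(reversed(rhs_vars))
        forms.append(w[:consumed] + "".join(reversed(stack)))
    return forms if not stack else None


if __name__ == "__main__":
    print(" => ".join(parse_s_grammar("abcc")))
```

Each input symbol triggers exactly one table lookup, which is where the $|w|$ bound comes from.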
40
For general context-free grammars:
There exists a parsing algorithm that parses a string $w$ in time $|w|^3$
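The slide does not name the algorithm. One well-known cubic-time method is the CYK algorithm, which requires the grammar to be in Chomsky normal form; the sketch below assumes that algorithm and a small hypothetical grammar for the language a^n b^n (n >= 1), neither of which appears in the slides. The three nested loops over length, start position and split point give the $|w|^3$ behaviour for a fixed grammar.

```python
# Hypothetical grammar in Chomsky normal form for { a^n b^n : n >= 1 }:
#   S -> AB | AT,  T -> SB,  A -> a,  B -> b
UNIT_RULES = [("A", "a"), ("B", "b")]
PAIR_RULES = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]


def cyk(word, unit_rules=UNIT_RULES, pair_rules=PAIR_RULES, start="S"):
    """Return True if the grammar derives word (the empty word is rejected)."""
    n = len(word)
    if n == 0:
        return False
    # table[i][length - 1] = variables that derive word[i : i + length]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {var for var, terminal in unit_rules if terminal == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for var, (first, second) in pair_rules:
                    if first in left and second in right:
                        table[i][length - 1].add(var)
    return start in table[0][n - 1]


if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "abab"]:
        print(w, cyk(w))
```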