SCRIBE SUBMISSION GROUP 8 Date: 7/8/2013 By – IKHAR SUSHRUT MEGHSHYAM 11CS10017 Lexical Analyser...
SCRIBE SUBMISSION GROUP 8
Date: 7/8/2013
By – IKHAR SUSHRUT MEGHSHYAM 11CS10017
Index
• Lexical Analyser
• Constructing Tokens
• State-Transition Diagram
• S-T Diagrams of Operators, Variables, Digits
Topic covered : Detecting lexemes in a stream of characters, given a set of patterns
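Lexical Analyser
[Figure: the lexical analyser drawn as a black box. Input: stream of characters. Inside: lexemes are matched against patterns and a token is constructed for each match. Each pattern is described by a regular expression, which defines a regular language. Output: tokens, sent to the parser.]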
• Describe patterns using regular expressions
• For a specific pattern we can define a regular expression corresponding to a regular language
Construct Tokens for specified set of patterns
Tokens for some patterns:
1. Keywords : if → <if>, else → <else>, while → <while>, then → <then>, do → <do>
2. Operators : > → <op , GT>, >= → <op , GE>, < → <op , LT>, <= → <op , LE>, = → <op , EQ>
3. Variables : start with a letter, followed by letters/digits/underscores → <id , pointer to symbol table>
4. Numbers : whole numbers and floating-point numbers → <number , pointer to constant table>
5. Whitespace : tab/newline/blank space → no token is created
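The five token classes above can be sketched as a small regular-expression tokenizer. This is only an illustration under stated assumptions: the names `TOKEN_SPEC`, `OP_NAMES` and `tokenize` are hypothetical, the "pointer" into the symbol or constant table is modelled as a list index, and the number pattern (digits with an optional fractional part) is one plausible reading of "whole numbers and floating-point numbers".

```python
import re

# Hypothetical token specification mirroring the five pattern classes.
# Order matters: keywords are tried before identifiers, and >= before >.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|else|while|then|do)\b"),
    ("NUMBER",  r"[0-9]+(?:\.[0-9]+)?"),
    ("ID",      r"[A-Za-z][A-Za-z0-9_]*"),
    ("OP",      r"<=|>=|<|>|="),
    ("WS",      r"[ \t\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))
OP_NAMES = {">": "GT", ">=": "GE", "<": "LT", "<=": "LE", "=": "EQ"}


def tokenize(text):
    """Return (tokens, symbol_table, constant_table) for the input text."""
    tokens, symbols, constants = [], [], []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"no pattern matches at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue                      # whitespace produces no token
        if kind == "KEYWORD":
            tokens.append((lexeme,))      # e.g. <if>
        elif kind == "OP":
            tokens.append(("op", OP_NAMES[lexeme]))
        elif kind == "ID":
            if lexeme not in symbols:
                symbols.append(lexeme)
            tokens.append(("id", symbols.index(lexeme)))
        else:                             # NUMBER
            if lexeme not in constants:
                constants.append(lexeme)
            tokens.append(("number", constants.index(lexeme)))
    return tokens, symbols, constants
```

For example, `tokenize("if x1 >= 10 then y = 2.5")` yields the keyword tokens for `if` and `then`, two `id` tokens indexing `x1` and `y`, the operator tokens `GE` and `EQ`, and two `number` tokens indexing `10` and `2.5`.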
STATE - TRANSITION DIAGRAM
An S-T diagram is a directed graph whose nodes are states and whose directed edges are transitions from one state to another.
[Figure: example S-T diagram with a marked start state, final states, and edges labelled with the input symbols a and b]
∑ = {a, b}
For an input string X, if a final state is reached after reading all of X, then X is accepted by the machine M defined over the alphabet ∑.
L(M) denotes the set of all strings accepted by the machine M.
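The acceptance rule can be sketched as a table-driven simulation. The machine below is a hypothetical example, not the one drawn on the slide: it is assumed to accept the strings over ∑ = {a, b} that end in a, and the names `TRANSITIONS`, `START`, `FINAL` and `accepts` are illustrative.

```python
# Hypothetical machine M over the alphabet {a, b}: accepts strings ending in 'a'.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q0",
}
START = "q0"
FINAL = {"q1"}


def accepts(x):
    """Return True if the input string x is in L(M)."""
    state = START
    for ch in x:
        key = (state, ch)
        if key not in TRANSITIONS:   # symbol outside the alphabet: reject
            return False
        state = TRANSITIONS[key]
    return state in FINAL            # accepted only if we end in a final state
```

With this machine, `accepts("aba")` is true and `accepts("ab")` is false.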
S-T Diagrams for some patterns
1. S-T Diagram for ‘while’
[Figure: start → w → h → i → l → e → \0 → final state, with backtracking]
Token : <while>
2. S-T Diagram for ‘digits’
[Figure: S-T diagram for numbers. Edge labels: digit, ‘.’, other symbol. Three states are marked *.]
digit : [0-9]
digits : {digit}*
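A number recognizer in the spirit of this diagram can be sketched as follows. The state names and the exact edges are assumptions, since the figure itself is not recoverable: the sketch reads an integer part, then optionally a ‘.’ and a fractional part, and gives back any character that does not belong to the number. The function name `scan_number` is hypothetical.

```python
DIGITS = "0123456789"


def scan_number(text, pos=0):
    """Return (lexeme, next_pos) for a number starting at pos, or None."""
    state = "start"
    i = pos
    last_accept = pos                      # end of the longest number seen so far
    while True:
        ch = text[i] if i < len(text) else "\0"   # "\0" marks end of input
        if state == "start":
            if ch not in DIGITS:
                return None
            state = "int"
        elif state == "int":
            if ch == ".":
                last_accept = i            # remember the whole-number lexeme
                state = "dot"
            elif ch not in DIGITS:
                return text[pos:i], i      # other symbol: give it back
        elif state == "dot":
            if ch in DIGITS:
                state = "frac"
            else:
                # no digit after '.': back up to the whole number
                return text[pos:last_accept], last_accept
        else:                              # state == "frac"
            if ch not in DIGITS:
                return text[pos:i], i      # other symbol: give it back
        i += 1
```

Under these assumptions, `scan_number("3.14+")` returns `("3.14", 4)`, and `scan_number("7.x")` backs up past the ‘.’ and returns `("7", 1)`.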
3. S-T Diagram for ‘operators’
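[Figure: S-T diagram for operators, branching from the start state on <, > and =]
< then = : <OP,LE>
< then \0 / other symbol : <OP,LT> *
> then = : <OP,GE>
> then \0 / other symbol : <OP,GT> *
= : <OP,EQ>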
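The operator transitions can be sketched with one character of lookahead. The function name `scan_operator` and the use of "\0" as an end-of-input marker are assumptions made for illustration.

```python
def scan_operator(text, pos=0):
    """Return (token, next_pos) for an operator starting at pos, or None."""
    ch = text[pos] if pos < len(text) else "\0"
    nxt = text[pos + 1] if pos + 1 < len(text) else "\0"
    if ch == "<":
        if nxt == "=":
            return ("op", "LE"), pos + 2
        return ("op", "LT"), pos + 1   # other symbol: it is not consumed
    if ch == ">":
        if nxt == "=":
            return ("op", "GE"), pos + 2
        return ("op", "GT"), pos + 1   # other symbol: it is not consumed
    if ch == "=":
        return ("op", "EQ"), pos + 1
    return None
```

Here `scan_operator("<=5")` consumes two characters, while `scan_operator("<5")` consumes only the `<` and leaves `5` for the next token.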
4. S-T Diagram for ‘variables’
[Figure: start → (letters) → state with a self-loop on letters/digits/underscore → (else) → final state *, with backtracking]
To distinguish between keywords and variables, we try to maximize the length of the lexeme. For example, the input while1 is read as one variable rather than the keyword while followed by 1.
We apply “parallel simulation” to all the S-T diagrams above and choose the token whose lexeme has the maximum length.
letter : [A-Za-z]
letters : {letter}*
digit : [0-9]
digits : {digit}*
underscore : _ + ε
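The longest-lexeme rule for keywords and variables can be sketched as below. This is a simplification: instead of stepping all diagrams in parallel, each pattern is tried in turn at the same input position and the longest match wins, with ties going to the pattern listed first (an assumption that lets a keyword win over an identifier of the same length). The names `KEYWORDS`, `PATTERNS` and `longest_match` are hypothetical.

```python
import re

KEYWORDS = ["if", "else", "while", "then", "do"]

# Keywords are listed before the identifier pattern so that they win ties.
PATTERNS = [(kw, re.compile(re.escape(kw))) for kw in KEYWORDS]
PATTERNS.append(("id", re.compile(r"[A-Za-z][A-Za-z0-9_]*")))


def longest_match(text, pos=0):
    """Return (token_name, lexeme, next_pos) for the longest match, or None."""
    best = None
    for name, pattern in PATTERNS:
        m = pattern.match(text, pos)
        # Strictly longer matches replace the current best; ties keep the earlier one.
        if m and (best is None or m.end() > best[2]):
            best = (name, m.group(), m.end())
    return best
```

With this sketch, `longest_match("while")` reports the keyword `while`, whereas `longest_match("while1 ")` reports an `id` with lexeme `while1` because that lexeme is longer.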