SCRIBE SUBMISSION GROUP 8 Date: 7/8/2013 By – IKHAR SUSHRUT MEGHSHYAM 11CS10017 Lexical Analyser...


SCRIBE SUBMISSION GROUP 8

Date: 7/8/2013

By – IKHAR SUSHRUT MEGHSHYAM 11CS10017

Index

• Lexical Analyser
• Constructing Tokens
• State-Transition Diagram
• S-T Diagrams of Operators, Variables, Digits

Topic Covered : Detecting lexemes from a given set of patterns/stream of chars


Lexical Analyser

[Diagram: the lexical analyser drawn as a black box. Input: Stream of Characters. Inside the box: Lexeme, Pattern, Construct Token. Output: To Parser. Side chain: Pattern → Regular Expression → Regular Language.]

• Describe patterns using regular expressions.
• For a specific pattern we can define a regular expression corresponding to a regular language.


Construct Tokens for specified set of patterns

Tokens for some patterns:

1. Keywords : if <if>, else <else>, while <while>, then <then>, do <do>

2. Operators : > <op , GT>, >= <op , GE>, < <op , LT>, <= <op , LE>, = <op , EQ>

3. Variables : start with a letter, followed by letters/digits/underscores. Token : < id , pointer to symbol table>

4. Numbers : whole numbers and floating-point numbers. Token : < number , pointer to constant table>

5. Whitespace : tab/newline/space. No tokens are created.
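The five pattern classes above can be sketched as regular expressions. This is a minimal illustration, not the lecture's implementation: the names `KEYWORDS`, `OPERATORS`, `TOKEN_SPEC` and `tokenize` are hypothetical, and the lexeme itself stands in for the "pointer to symbol table" and "pointer to constant table".

```python
import re

# Hypothetical tables mirroring the pattern classes listed above.
KEYWORDS = {"if", "else", "while", "then", "do"}
OPERATORS = {">": "GT", ">=": "GE", "<": "LT", "<=": "LE", "=": "EQ"}

# One named group per pattern; two-character operators are listed first
# so that ">=" is not split into ">" followed by "=".
TOKEN_SPEC = [
    ("number", r"[0-9]+(?:\.[0-9]+)?"),
    ("id", r"[A-Za-z][A-Za-z0-9_]*"),
    ("op", r">=|<=|>|<|="),
    ("ws", r"[ \t\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of token tuples for text (assumption: lexeme replaces table pointer)."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue  # whitespace creates no token
        if kind == "id" and lexeme in KEYWORDS:
            tokens.append((lexeme,))  # keyword token such as <if>
        elif kind == "op":
            tokens.append(("op", OPERATORS[lexeme]))
        else:
            tokens.append((kind, lexeme))
    return tokens
```

For example, `tokenize("if x1 >= 10")` yields a keyword token, an id token, an operator token and a number token, with the spaces dropped.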


STATE - TRANSITION DIAGRAM

An S-T diagram is a directed graph whose nodes are the states and whose directed edges correspond to transitions from one state to another.

[Diagram: example state-transition diagram with a start state and final states marked; the edges are labelled with the input symbols a and b.]

∑ = {a, b}

For an input string X, if a final state is reached then X is accepted by the machine M defined over the alphabet ∑.

L(M) denotes the set of all strings accepted by machine M.
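The acceptance rule above can be sketched as a table-driven simulation. The machine below is a hypothetical example over {a, b} (it accepts strings ending in a) and is not the machine drawn on the slide; `START`, `FINAL`, `DELTA` and `accepts` are illustrative names.

```python
# Hypothetical machine M over the alphabet {a, b}: accepts strings ending in 'a'.
START = "q0"
FINAL = {"q1"}
DELTA = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q0",
}

def accepts(x):
    """Return True if the input string x drives M from START into a final state."""
    state = START
    for ch in x:
        if (state, ch) not in DELTA:
            return False  # symbol outside the alphabet: reject
        state = DELTA[(state, ch)]
    return state in FINAL
```

Here L(M) is the set of all strings `x` for which `accepts(x)` returns `True`.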


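S-T Diagrams for some patterns

1. S-T Diagram for ‘while’

[Diagram: from the start state, edges labelled w, h, i, l, e lead through successive states; a last edge labelled \0 reaches the final state, which is marked "backtracking".]

Token : <while>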

2. S-T Diagram for ‘digits’

[Diagram: from the start state, edges labelled digit and '.' with digit self-loops; edges labelled "other symbol" lead to states marked *.]

digit : [0-9]
digits : {digit}*
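A hand-coded version of such a diagram might look like the sketch below. The state names (`start`, `int`, `dot`, `frac`) and the function `scan_number` are assumptions made for illustration, and the sketch requires at least one digit before the '.', which the slide does not state. On an "other symbol" the scanner stops and falls back to the end of the last accepted lexeme.

```python
def scan_number(text, pos=0):
    """Return the end index of the longest number starting at pos, or None."""
    state = "start"
    i = pos
    last_accept = None  # end index of the longest lexeme accepted so far
    while i < len(text):
        ch = text[i]
        is_digit = "0" <= ch <= "9"
        if state in ("start", "int") and is_digit:
            state = "int"    # whole-number part
        elif state == "int" and ch == ".":
            state = "dot"    # seen '.', need at least one more digit
        elif state in ("dot", "frac") and is_digit:
            state = "frac"   # fractional part
        else:
            break            # "other symbol": stop scanning
        i += 1
        if state in ("int", "frac"):
            last_accept = i
    return last_accept
```

With the input `"7.x"` the scanner reads the '.', finds no digit after it, and backs up so that only `"7"` is taken as the lexeme.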


3. S-T Diagram for ‘operators’

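[Diagram: transitions from the start state]

start, <, then = : <OP,LE>
start, <, then \0 / other symbol : <OP,LT> *
start, >, then = : <OP,GE>
start, >, then \0 / other symbol : <OP,GT> *
start, = : <OP,EQ>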
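The operator diagram can be sketched with one character of lookahead. The function name `scan_operator` and its return shape are hypothetical; when the character after `<` or `>` is not `=`, it is left unread so the next scan can start there.

```python
def scan_operator(text, pos=0):
    """Return (token, next_pos) for an operator at pos, or None if there is none."""
    if pos >= len(text):
        return None
    ch = text[pos]
    # Lookahead character; "\0" stands in for end of input.
    nxt = text[pos + 1] if pos + 1 < len(text) else "\0"
    if ch == "<":
        return (("op", "LE"), pos + 2) if nxt == "=" else (("op", "LT"), pos + 1)
    if ch == ">":
        return (("op", "GE"), pos + 2) if nxt == "=" else (("op", "GT"), pos + 1)
    if ch == "=":
        return (("op", "EQ"), pos + 1)
    return None
```

For `"<x"` the lookahead `x` is inspected but not consumed, so the returned position points at `x`.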


4. S-T Diagram for ‘variables’

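[Diagram: from the start state, an edge labelled "letters" leads to a state with a self-loop on letters/digits/underscore; an edge labelled "else" leads to the final state, marked * with backtracking.]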

To distinguish between keywords and variables we try to maximize the length of the lexeme.

We apply “parallel simulation” to all the above S-T diagrams and choose the token for which the lexeme is of maximum length.

letter : [A-Za-z]
letters : {letter}*
digit : [0-9]
digits : {digit}*
underscore : _ + ε
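The longest-lexeme rule can be sketched by trying every recognizer at the same position and keeping the longest match. This is only an approximation of parallel simulation: compiled regular expressions stand in for the individual S-T diagrams, `RECOGNIZERS` and `longest_match` are hypothetical names, and breaking ties in favour of the recognizer listed first (so that `while` alone is a keyword) is an assumption not stated in the notes.

```python
import re

# Hypothetical recognizers, one per S-T diagram, in priority order.
RECOGNIZERS = [
    ("while", re.compile(r"while")),
    ("number", re.compile(r"[0-9]+(?:\.[0-9]+)?")),
    ("op", re.compile(r"<=|>=|<|>|=")),
    ("id", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
]

def longest_match(text, pos=0):
    """Return (token name, lexeme) for the longest lexeme starting at pos."""
    best_name, best_len = None, 0
    for name, rx in RECOGNIZERS:
        m = rx.match(text, pos)
        # Strict '>' keeps the earlier (higher-priority) recognizer on ties.
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name, text[pos:pos + best_len]
```

With the input `"while_1 = 2"` both the keyword and the variable recognizer match, but the variable lexeme `while_1` is longer, so the token is an id rather than `<while>`.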