CSC 3315: Lexical and Syntax Analysis

CSC3315 (Spring 2009)

Hamid Harroud

School of Science and Engineering, Akhawayn University

http://www.aui.ma/~H.Harroud/csc3315/


Constructing a Lexical Analyzer

    state = S                           // S is the start state
    repeat {
        k = next character from the input
        if k == EOF {                   // the end of input
            if state is a final state then accept
            else reject
        }
        state = T[state, k]             // T is the transition table
        if state == empty then reject   // got stuck
    }
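The loop above can be sketched in Java as follows. This is a minimal illustration, not the course's implementation: the two-state table recognizes identifiers of the form letter (letter | digit)*, and the names `TableDrivenDfa`, `classOf`, and `accepts` are assumptions introduced for the example. Characters are grouped into classes so that the table T needs only three columns.

```java
import java.util.Set;

/** Table-driven recognizer for identifiers: letter (letter | digit)*. */
final class TableDrivenDfa {
    static final int START = 0;   // S, the start state
    static final int IN_ID = 1;   // a letter followed by letters or digits
    static final int EMPTY = -1;  // "empty" table entry: no transition

    static final Set<Integer> FINAL_STATES = Set.of(IN_ID);

    // Character classes index the columns of T.
    static final int LETTER = 0, DIGIT = 1, OTHER = 2;

    // T[state][class] is the next state.
    static final int[][] T = {
        /* START */ { IN_ID, EMPTY, EMPTY },
        /* IN_ID */ { IN_ID, IN_ID, EMPTY },
    };

    static int classOf(char k) {
        if (Character.isLetter(k)) return LETTER;
        if (Character.isDigit(k)) return DIGIT;
        return OTHER;
    }

    static boolean accepts(String input) {
        int state = START;
        for (int i = 0; i < input.length(); i++) {
            state = T[state][classOf(input.charAt(i))];
            if (state == EMPTY) return false;   // got stuck
        }
        // End of input: accept only if we stopped in a final state.
        return FINAL_STATES.contains(state);
    }

    public static void main(String[] args) {
        System.out.println(accepts("x1"));  // true
        System.out.println(accepts("1x"));  // false
        System.out.println(accepts(""));    // false
    }
}
```

Changing the language being recognized only requires a different table and set of final states; the driver loop stays the same.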


Constructing a Lexical Analyzer

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            addChar();
            getChar();
            while (isLetter(nextChar) || isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return lookup(lexeme);
        } . . .

Constructing a Lexical Analyzer

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            . . .
        }
        else if (isDigit(nextChar)) {
            addChar();
            getChar();
            while (isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return INT_LIT;   // no break needed: return already leaves the function
        }
    }
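A self-contained Java version of this hand-written approach might look like the sketch below. It keeps the slide's `getChar` and `addChar` helpers, but everything else is an assumption made for illustration: the class name `HandScanner`, the token codes other than `INT_LIT`, the whitespace skipping, and the `UNKNOWN` fallback. The keyword `lookup` step is left out. One deliberate difference from the slide: the lookahead character is read once in the constructor rather than at the top of every call, so the character that ended one token is not lost before the next call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written scanner for identifiers and integer literals (illustrative sketch). */
final class HandScanner {
    // Token codes are hypothetical; the slides only name INT_LIT.
    static final int EOF = -1, IDENT = 1, INT_LIT = 2, UNKNOWN = 3;

    private final String input;
    private int pos = 0;
    private int nextChar;   // lookahead character, -1 at end of input
    private final StringBuilder lexeme = new StringBuilder();

    HandScanner(String input) {
        this.input = input;
        getChar();          // prime the lookahead once
    }

    private void getChar() {
        nextChar = pos < input.length() ? input.charAt(pos++) : -1;
    }

    private void addChar() {
        lexeme.append((char) nextChar);
    }

    private boolean isLetter() { return nextChar >= 0 && Character.isLetter(nextChar); }
    private boolean isDigit()  { return nextChar >= 0 && Character.isDigit(nextChar); }

    String lexeme() { return lexeme.toString(); }

    /** Returns the code of the next token; its text is available from lexeme(). */
    int lex() {
        lexeme.setLength(0);
        while (nextChar == ' ' || nextChar == '\t' || nextChar == '\n') getChar();
        if (nextChar < 0) return EOF;
        if (isLetter()) {
            addChar(); getChar();
            while (isLetter() || isDigit()) { addChar(); getChar(); }
            return IDENT;
        } else if (isDigit()) {
            addChar(); getChar();
            while (isDigit()) { addChar(); getChar(); }
            return INT_LIT;
        }
        addChar(); getChar();   // any other single character
        return UNKNOWN;
    }

    static List<String> tokenize(String source) {
        HandScanner s = new HandScanner(source);
        List<String> out = new ArrayList<>();
        for (int t = s.lex(); t != EOF; t = s.lex()) {
            String name = t == IDENT ? "IDENT" : t == INT_LIT ? "INT_LIT" : "UNKNOWN";
            out.add(name + " " + s.lexeme());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("sum1 = 42"));  // [IDENT sum1, UNKNOWN =, INT_LIT 42]
    }
}
```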

Lexical Errors

Consider the following two programs:


JLex: a scanner generator

[Figure: the JLex tool chain]

jlex specification (xxx.jlex) -> JLex.Main (java) -> generated scanner (xxx.jlex.java)

generated scanner (xxx.jlex.java) -> javac -> Yylex.class

input program (test.sim) -> P.main (java), using Yylex.class -> Output of P.main

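    import java.io.FileReader;
    import java.io.IOException;
    // Symbol, sym and IntLitTokenVal are assumed to be on the classpath
    // alongside the generated Yylex class.

    public class P {
        public static void main(String[] args) throws IOException {
            FileReader inFile = new FileReader(args[0]);
            Yylex scanner = new Yylex(inFile);

            // Print one line per token until the scanner reports end of file.
            Symbol token = scanner.next_token();
            while (token.sym != sym.EOF) {
                switch (token.sym) {
                case sym.INTLITERAL:
                    System.out.println("INTLITERAL ("
                        + ((IntLitTokenVal) token.value).intVal + ")");
                    break;
                …
                }
                token = scanner.next_token();
            }
        }
    }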

JLex: a scanner generator

Regular expression rules

    regular-expression { action }

    regular-expression: the pattern to be matched
    action: the code to be executed when the pattern is matched

When the next_token() method is called, it repeats:

    Find the longest sequence of characters in the input (starting with the current character) that matches a pattern.

    Perform the associated action.

until a return in an action is executed.

Matching rules

If several patterns match starting at the current character, the pattern that matches the longest sequence of characters is considered to be matched.

If several patterns match the same (longest) sequence of characters, then the first such pattern is considered to be matched, so the order of the patterns can be important!

If an input character is not matched by any pattern, the scanner throws an exception.
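These three rules can be tried out without JLex. The sketch below is only an illustration using `java.util.regex`, not how JLex works internally (JLex compiles the patterns into a single automaton). The rule names, the rule order, and the class name `LongestMatch` are assumptions made for the example. Each pattern is tried at the current position, the longest match wins, the earliest rule wins ties, and an unmatched character raises an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Longest match with first-rule tie-breaking (illustrative sketch). */
final class LongestMatch {
    // Rule order matters: IF is listed before ID so that it wins the tie on "if".
    static final String[] NAMES = { "IF", "ID", "INT", "ASSIGN", "EQUALS", "WS" };
    static final Pattern[] PATTERNS = {
        Pattern.compile("if"),
        Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"),
        Pattern.compile("[0-9]+"),
        Pattern.compile("="),
        Pattern.compile("=="),
        Pattern.compile("[ \\t\\n]+"),
    };

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int best = -1, bestLen = 0;
            for (int i = 0; i < PATTERNS.length; i++) {
                Matcher m = PATTERNS[i].matcher(input);
                m.region(pos, input.length());
                // Strictly longer only: on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    best = i;
                    bestLen = m.end() - pos;
                }
            }
            if (best < 0) {
                throw new IllegalStateException("unmatched character at position " + pos);
            }
            if (!NAMES[best].equals("WS")) {   // whitespace has an empty action
                tokens.add(NAMES[best] + " " + input.substring(pos, pos + bestLen));
            }
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // [IF if, ID iffy, EQUALS ==, ASSIGN =, INT 1]
        System.out.println(scan("if iffy == =1"));
    }
}
```

In the sample input, "if" matches both IF and ID with the same length, so the earlier rule (IF) is chosen, while "iffy" is an ID because that match is longer. Likewise "==" is EQUALS rather than two ASSIGN tokens.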

An Example

    %%

    DIGIT= [0-9]
    LETTER= [a-zA-Z]
    WHITESPACE= [ \t\n] // space, tab, newline

    %line

    %%

    {LETTER}({LETTER}|{DIGIT})* {System.out.println(yyline+1 + ": ID " + yytext());}
    {DIGIT}+ {System.out.println(yyline+1 + ": INT");}
    "=" {System.out.println(yyline+1 + ": ASSIGN");}
    "==" {System.out.println(yyline+1 + ": EQUALS");}
    {WHITESPACE}* { }
    . {System.out.println(yyline+1 + ": bad char");}