CH3.1 CS 345 Dr. Mohamed Ramadan Saady Algebraic Properties of Regular Expressions

CH3.1

CS 345

Dr. Mohamed Ramadan Saady

Algebraic Properties of Regular Expressions

AXIOM                              DESCRIPTION

r | s = s | r                      | is commutative
r | (s | t) = (r | s) | t          | is associative
(r s) t = r (s t)                  concatenation is associative
r ( s | t ) = r s | r t            concatenation distributes over |
( s | t ) r = s r | t r
εr = r                             ε is the identity element for concatenation
rε = r
r* = ( r | ε )*                    relation between * and ε
r** = r*                           * is idempotent
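The axioms above can be spot-checked mechanically. The following is a minimal sketch, not a proof: it enumerates all strings over a small alphabet up to a bounded length and compares the sets accepted by each side. The sample values for r, s and t, the alphabet "ab", the length bound and the helper names `language` and `g` are all assumptions made for illustration.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=4):
    """Set of strings over `alphabet` (length <= max_len) that `pattern` matches in full."""
    compiled = re.compile(pattern)
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(chars))
    }

def g(x):
    """Wrap a subexpression in a non-capturing group so precedence is explicit."""
    return f"(?:{x})"

# Hypothetical sample subexpressions; EPS is the empty regular expression (epsilon).
r, s, t = "a", "b", "ab"
EPS = ""

axioms = {
    "| is commutative": (g(r) + "|" + g(s), g(s) + "|" + g(r)),
    "| is associative": (g(r) + "|" + g(g(s) + "|" + g(t)),
                         g(g(r) + "|" + g(s)) + "|" + g(t)),
    "concatenation is associative": (g(g(r) + g(s)) + g(t),
                                     g(r) + g(g(s) + g(t))),
    "concatenation distributes over | (left)": (g(r) + g(g(s) + "|" + g(t)),
                                                g(r) + g(s) + "|" + g(r) + g(t)),
    "concatenation distributes over | (right)": (g(g(s) + "|" + g(t)) + g(r),
                                                 g(s) + g(r) + "|" + g(t) + g(r)),
    "epsilon is the identity for concatenation": (g(EPS) + g(r), g(r)),
    "relation between * and epsilon": (g(r) + "*", g(g(r) + "|" + g(EPS)) + "*"),
    "* is idempotent": (g(g(r) + "*") + "*", g(r) + "*"),
}

# True for an axiom when both sides accept exactly the same bounded set of strings.
results = {name: language(lhs) == language(rhs) for name, (lhs, rhs) in axioms.items()}

for name, same in results.items():
    print(f"{name}: {same}")
```

Agreement on a bounded sample only illustrates the identities; the axioms themselves hold for all regular expressions r, s and t.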

CH3.2

Regular Expression Examples

• All Strings that start with “tab” or end with “bat”:

tab{A,…,Z,a,…,z}*|{A,…,Z,a,…,z}*bat

• All Strings in Which Digits 1,2,3 exist in ascending numerical order:

{A,…,Z}*1 {A,…,Z}*2 {A,…,Z}*3 {A,…,Z}*
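Both examples translate directly into Python's `re` syntax. This is a sketch under the assumption that the set notation {A,…,Z} corresponds to the character class `[A-Z]`; the names `starts_tab_or_ends_bat`, `ascending_123` and `accepts` are illustrative only.

```python
import re

LETTER = "[A-Za-z]"  # stands for {A,...,Z,a,...,z}

# Strings that start with "tab" or end with "bat".
starts_tab_or_ends_bat = re.compile(rf"tab{LETTER}*|{LETTER}*bat")

# Strings of upper-case letters in which the digits 1, 2, 3 appear once each, in that order.
ascending_123 = re.compile(r"[A-Z]*1[A-Z]*2[A-Z]*3[A-Z]*")

def accepts(pattern, text):
    """True when the whole string belongs to the language of `pattern`."""
    return pattern.fullmatch(text) is not None

print(accepts(starts_tab_or_ends_bat, "table"))   # starts with "tab"
print(accepts(starts_tab_or_ends_bat, "combat"))  # ends with "bat"
print(accepts(ascending_123, "A1BC2D3"))          # 1, 2, 3 in ascending order
print(accepts(ascending_123, "A3B2C1"))           # wrong order
```

`fullmatch` is used rather than `search` because a regular expression in this setting describes whole strings, not substrings.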

CH3.3

Towards Token Definition

Regular Definitions: Associate names with Regular Expressions

For Example : PASCAL IDs

letter → A | B | C | … | Z | a | b | … | z
digit → 0 | 1 | 2 | … | 9
id → letter ( letter | digit )*

Shorthand Notation:

“+” : one or more        r* = r+ | ε        r+ = r r*
“?” : zero or one        r? = r | ε
[range] : set range of characters (replaces “|”)        [A-Z] = A | B | C | … | Z

Example Using Shorthand : PASCAL IDs

id → [A-Za-z][A-Za-z0-9]*
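Regular definitions can be mimicked in Python by binding names to pattern strings and composing them. The sketch below builds the PASCAL id pattern both ways (long form and shorthand) and checks that they agree on a few samples; the variable names and the sample strings are assumptions for illustration.

```python
import re

# Regular definitions: each name is bound to a regular expression,
# and later definitions may refer to earlier names.
letter = "[A-Za-z]"                              # letter -> A | B | ... | Z | a | ... | z
digit = "[0-9]"                                  # digit  -> 0 | 1 | ... | 9
ident = f"{letter}(?:{letter}|{digit})*"         # id     -> letter ( letter | digit )*

# The same token written directly with the shorthand notation.
ident_short = "[A-Za-z][A-Za-z0-9]*"

def is_id(text, pattern=ident):
    """True when `text` as a whole is a valid identifier under `pattern`."""
    return re.fullmatch(pattern, text) is not None

samples = ["x", "count1", "1count", "a_b", ""]
for sample in samples:
    print(repr(sample), is_id(sample), is_id(sample, ident_short))
```

The shorthand identities also carry over: `r+` behaves like `rr*`, and `r?` behaves like `(r|)` with an empty alternative standing in for ε.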

CH3.4

Token Recognition

How can we use concepts developed so far to assist in recognizing tokens of a source language ?

Assume Following Tokens:

if, then, else, relop, id, num

What language construct are they used for ?

Given Tokens, What are Patterns ?

if → if
then → then
else → else
relop → < | <= | > | >= | = | <>
id → letter ( letter | digit )*
num → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
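The num and relop patterns can be tried out with Python's `re` module. This is a sketch, with one caveat stated as an assumption: Python's alternation is ordered (first alternative that works wins), so the longer operators are listed before their prefixes to obtain the longest match. The names `num`, `relop`, `is_num` and `match_relop` are illustrative.

```python
import re

digit = "[0-9]"

# num -> digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
num = rf"{digit}+(?:\.{digit}+)?(?:E[+-]?{digit}+)?"

# relop -> < | <= | > | >= | = | <>
# Longer alternatives come first because Python tries alternatives left to right.
relop = r"<=|<>|<|>=|>|="

def is_num(text):
    """True when the whole string is a num lexeme."""
    return re.fullmatch(num, text) is not None

def match_relop(text, pos=0):
    """Return the relop lexeme starting at `pos`, or None."""
    m = re.compile(relop).match(text, pos)
    return m.group() if m else None

print(is_num("42"), is_num("3.14"), is_num("6.02E+23"), is_num("3."))
print(match_relop("<= y"), match_relop("<>"), match_relop("x"))
```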

What does this represent ? What is ε ?

Grammar:

stmt → if expr then stmt
     | if expr then stmt else stmt
     | ε
expr → term relop term | term
term → id | num

CH3.5

What Else Does Lexical Analyzer Do?

Scan away blanks (b), newlines (nl), and tabs

Can we Define Tokens For These?

blank → b
tab → ^T
newline → ^M
delim → blank | tab | newline
ws → delim+
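Scanning away whitespace with the ws definition might look like the sketch below. It assumes blank, tab and newline correspond to the Python characters " ", "\t" and "\n"; the helper name `skip_ws` is hypothetical.

```python
import re

# Assumed concrete characters for the three delimiters.
blank, tab, newline = " ", "\t", "\n"

delim = f"[{blank}{tab}{newline}]"   # delim -> blank | tab | newline
ws = re.compile(f"{delim}+")         # ws    -> delim+

def skip_ws(text, pos=0):
    """Return the position of the first non-whitespace character at or after `pos`."""
    m = ws.match(text, pos)
    return m.end() if m else pos

source = "  \t\nif x"
start = skip_ws(source)
print(start, repr(source[start:]))
```

No token is returned for ws: the lexical analyzer simply advances past it and resumes matching at the returned position.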

CH3.6

Overall

Regular Expression    Token    Attribute-Value

ws                    -        -
if                    if       -
then                  then     -
else                  else     -
id                    id       pointer to table entry
num                   num      pointer to table entry
<                     relop    LT
<=                    relop    LE
=                     relop    EQ
<>                    relop    NE
>                     relop    GT
>=                    relop    GE

Note: Each token has a unique token identifier to define category of lexemes
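The table can be turned into a small scanner that returns (token, attribute-value) pairs. This is a minimal sketch under several assumptions that are not in the slides: keywords are recognized by first matching the id pattern and then checking a reserved-word set, the "pointer to table entry" is modelled as an index into a Python list, and the names `TOKEN_SPEC`, `KEYWORDS`, `RELOP_ATTR` and `tokenize` are hypothetical.

```python
import re

# Patterns in priority order; inner groups are non-capturing so that
# m.lastgroup always names the top-level pattern that matched.
TOKEN_SPEC = [
    ("ws",    r"[ \t\n]+"),
    ("num",   r"[0-9]+(?:\.[0-9]+)?(?:E[+-]?[0-9]+)?"),
    ("id",    r"[A-Za-z][A-Za-z0-9]*"),
    ("relop", r"<=|<>|<|>=|>|="),
]
KEYWORDS = {"if", "then", "else"}
RELOP_ATTR = {"<": "LT", "<=": "LE", "=": "EQ", "<>": "NE", ">": "GT", ">=": "GE"}
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symtab); each token is a (token, attribute-value) pair."""
    symtab = []   # list index stands in for "pointer to table entry"
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue                                  # ws: no token, no attribute
        if kind == "id" and lexeme in KEYWORDS:
            tokens.append((lexeme, None))             # if / then / else: no attribute
        elif kind == "relop":
            tokens.append(("relop", RELOP_ATTR[lexeme]))
        else:
            if lexeme not in symtab:
                symtab.append(lexeme)
            tokens.append((kind, symtab.index(lexeme)))
    return tokens, symtab

tokens, symtab = tokenize("if x1 <= 42 then y")
print(tokens)
print(symtab)
```

Every relational operator shares the single token identifier relop; only the attribute value distinguishes the individual lexemes.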

CH3.7

Constructing Transition Diagrams for Tokens

• Transition Diagrams (TD) are used to represent the tokens

• As characters are read, the relevant TDs are used to attempt to match lexeme to a pattern

• Each TD has:

• States : Represented by Circles

• Actions : Represented by Arrows between states

• Start State : Beginning of a pattern (Arrowhead)

• Final State(s) : End of pattern (Concentric Circles)

• Each TD is Deterministic - No need to choose between 2 different actions !
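A deterministic transition diagram can be stored as a table keyed by (state, input class): because each key has at most one target state, there is never a choice between two actions. The sketch below encodes a simplified diagram for id → letter ( letter | digit )*; the state numbers, the helper names and the decision to accept only when the whole string is consumed are assumptions for illustration.

```python
import string

# (state, input class) -> next state. One entry per pair, so the TD is deterministic.
TRANSITIONS = {
    (0, "letter"): 1,
    (1, "letter"): 1,
    (1, "digit"): 1,
}
START = 0
FINAL = {1}

def classify(ch):
    """Map a single character to the input class used on the arrows."""
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    return "other"

def run_td(text):
    """Follow the arrows character by character; accept if we end in a final state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:          # no arrow for this character: reject
            return False
    return state in FINAL

for sample in ["x1", "count", "1x", "a_b", ""]:
    print(repr(sample), run_td(sample))
```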

CH3.8

Example TDs

>= :

start → (0) --[>]--> (6) --[=]--> ((7))          RTN(GE)
                     (6) --[other]--> ((8)) *    RTN(G)

We’ve accepted “>” and have read other char that must be unread.

CH3.9

Example : All RELOPs

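start → (0) --[<]--> (1) --[=]--> ((2))          return(relop, LE)
                     (1) --[>]--> ((3))          return(relop, NE)
                     (1) --[other]--> ((4)) *    return(relop, LT)
        (0) --[=]--> ((5))                       return(relop, EQ)
        (0) --[>]--> (6) --[=]--> ((7))          return(relop, GE)
                     (6) --[other]--> ((8)) *    return(relop, GT)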
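The relop diagram can be coded as a state-driven loop. In this sketch the starred states (4 and 8) return the position of the "other" character instead of the position after it, which is how the extra character gets unread. The function name `next_relop`, the use of an empty string for end of input, and the return shape are assumptions for illustration.

```python
def next_relop(text, pos=0):
    """Follow the relop transition diagram starting at `pos`.

    Returns ((token, attribute), new_pos), or None when no relop starts at `pos`.
    """
    def char_at(i):
        return text[i] if i < len(text) else ""   # "" marks end of input

    state = 0
    i = pos
    while True:
        c = char_at(i)
        if state == 0:
            if c == "<":
                state = 1
            elif c == "=":
                return ("relop", "EQ"), i + 1      # state 5
            elif c == ">":
                state = 6
            else:
                return None                        # not a relop at all
        elif state == 1:
            if c == "=":
                return ("relop", "LE"), i + 1      # state 2
            if c == ">":
                return ("relop", "NE"), i + 1      # state 3
            return ("relop", "LT"), i              # state 4 *: unread the other char
        elif state == 6:
            if c == "=":
                return ("relop", "GE"), i + 1      # state 7
            return ("relop", "GT"), i              # state 8 *: unread the other char
        i += 1

for sample in ["<=", "<>", "<a", "=", ">=", ">5", "a"]:
    print(repr(sample), next_relop(sample))
```

For the input "<a" the scanner reports LT with new_pos 1, so the character "a" is still available to the next transition diagram.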