Syntax Analysis


Page 1: Syntax Analysis

Syntax Analysis

The recognition problem: given a grammar G and a string w, is w ∈ L(G)?

The parsing problem: if G is a grammar and w ∈ L(G), how can w be derived in G?

Both of these problems are decidable for context-free grammars - that is, there are algorithms which will give a definite (correct) yes or no answer for any given instance of the problems.

Parsing is important, because understanding the derivation of a structure helps us to understand the meaning of the structure.
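To make the decidability of the recognition problem concrete, here is a minimal sketch in Python. It is an illustration, not an efficient parser. It assumes single-character symbols and a grammar with no empty right-hand sides, such as the expression grammar G0 used on the later slides (S -> S+S | S*S | (S) | a). Under that assumption sentential forms never shrink, so a search that discards every form longer than w must terminate. The names G0_RULES and recognizes are illustrative.

```python
from collections import deque

# Grammar G0: each nonterminal maps to its right-hand sides.
# Assumption: every symbol is a single character and no right-hand side is empty.
G0_RULES = {"S": ["S+S", "S*S", "(S)", "a"]}


def recognizes(grammar, start, w):
    """Decide whether w is in L(grammar) by exhaustive search over sentential forms."""
    if not w:
        return False  # with no empty right-hand sides, the empty string is never derived
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == w:
            return True
        # Expand only the leftmost nonterminal: every derivable string
        # has a leftmost derivation, so nothing is lost.
        for i, symbol in enumerate(form):
            if symbol in grammar:
                for rhs in grammar[symbol]:
                    new_form = form[:i] + rhs + form[i + 1:]
                    # Forms never shrink, so anything longer than w is useless.
                    if len(new_form) <= len(w) and new_form not in seen:
                        seen.add(new_form)
                        queue.append(new_form)
                break
    return False


print(recognizes(G0_RULES, "S", "a+(a*a)"))
print(recognizes(G0_RULES, "S", "a+(a"))
```

Only finitely many sentential forms have length at most |w|, and each is visited at most once, so the search always stops with a yes or no answer.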

Page 2: Syntax Analysis

Derivation Structure

Consider the following expression in the language of G0:

a + (a * a)

In order to process this expression, it helps to consider the (a*a) substring as a more significant sub-unit than a+(a, for example.

We can use the derivation of the string to identify such sub-units. The productions of G0 are:

1) S -> S + S
2) S -> S * S
3) S -> (S)
4) S -> a

Page 3: Syntax Analysis

Derivation Structure

The expression a + (a * a) has the following derivation in G0:

S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a).

The corresponding tree groups (a*a) as a single sub-unit:

        S
      / | \
     S  +  S
     |   / | \
     a  (  S  )
         / | \
        S  *  S
        |     |
        a     a

Page 4: Syntax Analysis

Derivation Trees

For any derivation, we can construct a derivation tree.

The root of the tree will be a node representing the start symbol.

Every time we apply a production A -> α, we add a subtree below A: A is the root, and there is a branch for every symbol of α, in the same left-to-right order in which they appear in α.

We read the string represented by the derivation tree by reading the "leaf" nodes in left-to-right order.

Note: "left-to-right" order means the "structural" order - the leftmost path, then the same path but with the next-to-left branch at the last node where there was a choice, etc. - and not any order which may appear in the sketch.
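The construction can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the productions are those of G0 numbered 1 to 4, upper-case letters are non-terminals, and each step of the derivation replaces the rightmost non-terminal (which is the case for the derivation of a+(a*a) used in these slides). The names Node, leaves, derive_rightmost and yield_of are illustrative.

```python
from dataclasses import dataclass, field

# Productions of G0, numbered as on the slides (left-hand side, right-hand side).
PRODUCTIONS = {1: ("S", "S+S"), 2: ("S", "S*S"), 3: ("S", "(S)"), 4: ("S", "a")}


@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)


def leaves(node):
    """Return the leaf nodes in structural left-to-right order."""
    if not node.children:
        return [node]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def derive_rightmost(start, rule_numbers):
    """Build the derivation tree, applying each production to the rightmost non-terminal leaf."""
    root = Node(start)
    for number in rule_numbers:
        lhs, rhs = PRODUCTIONS[number]
        # Assumption: non-terminals are upper-case letters.
        target = next(leaf for leaf in reversed(leaves(root)) if leaf.symbol.isupper())
        if target.symbol != lhs:
            raise ValueError(f"production {number} does not apply to {target.symbol}")
        # One branch per symbol of the right-hand side, in the same order.
        target.children = [Node(symbol) for symbol in rhs]
    return root


def yield_of(root):
    """Read the string represented by the tree from its leaves."""
    return "".join(leaf.symbol for leaf in leaves(root))


# S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a)
tree = derive_rightmost("S", [1, 3, 2, 4, 4, 4])
print(yield_of(tree))
```

Reading the leaves with leaves() follows the tree structure, not any drawing of it, which is the "structural" order described in the note above.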

Page 5: Syntax Analysis

S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a).

[Figure: the derivation tree after each step of the derivation above. It starts as the single node S and gains one subtree per step, ending with the complete tree for a+(a*a).]

Page 6: Syntax Analysis

Equivalent Derivations

Two different derivations can have the same derivation tree.

Example:

S => S+S => S+a => a+a

and

S => S+S => a+S => a+a

both produce the tree

     S
   / | \
  S  +  S
  |     |
  a     a

In CFGs, the order of applying productions is irrelevant to the resulting tree, as long as the same production is applied to the same occurrence of each symbol.
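A small Python check of this example, as a sketch: each derivation step is written as a pair (position of the leaf being replaced, right-hand side used), and the trees are plain nested dictionaries so that they can be compared with ==. The step encoding and the names new_node, leaves and build are assumptions made for the illustration.

```python
def new_node(symbol):
    return {"symbol": symbol, "children": []}


def leaves(node):
    """Leaf nodes in left-to-right order."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


def build(start, steps):
    """Apply each step (leaf position, right-hand side) and return the resulting tree."""
    root = new_node(start)
    for leaf_index, rhs in steps:
        target = leaves(root)[leaf_index]
        target["children"] = [new_node(symbol) for symbol in rhs]
    return root


# S => S+S => S+a => a+a  (right S replaced first)
first = build("S", [(0, "S+S"), (2, "a"), (0, "a")])
# S => S+S => a+S => a+a  (left S replaced first)
second = build("S", [(0, "S+S"), (0, "a"), (2, "a")])

print(first == second)
```

The two step sequences differ only in order, and the comparison confirms that they produce the same tree.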


Page 7: Syntax Analysis

Multiple Derivation Trees

Consider the two derivations below:

1. S => S+S => S+S*S => S+S*a => S+a*a => a+a*a

2. S => S*S => S*a => S+S*a => S+a*a => a+a*a

These give essentially different derivation trees for the same final sentence.

Tree for derivation 1:

     S
   / | \
  S  +  S
  |   / | \
  a  S  *  S
     |     |
     a     a

Tree for derivation 2:

        S
      / | \
     S  *  S
   / | \   |
  S  +  S  a
  |     |
  a     a

This causes problems for our attempt to understand a string by considering its derivation.


Page 8: Syntax Analysis

Ambiguous Grammars

A derivation in which at each step the rightmost non-terminal is replaced is a right derivation (also called a rightmost derivation).

In a right derivation, the order of symbols to be replaced is fixed.

A string has two different right derivations iff it has two different derivation trees.

A CFG is ambiguous if there is at least one string in L(G) having two or more different right derivations (or, equally, two or more different derivation trees).
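For the expression grammar G0 (S -> S+S | S*S | (S) | a), the number of derivation trees of a given string can be counted directly. The sketch below is specific to G0 and is not a general ambiguity test: it tries each production at the top of the tree and multiplies the counts for the two sides of a binary operator. The name count_trees is illustrative.

```python
from functools import lru_cache


def count_trees(w):
    """Count the derivation trees of w under G0: S -> S+S | S*S | (S) | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees deriving the substring w[i:j] from S.
        if i >= j:
            return 0
        total = 0
        if j - i == 1 and w[i] == "a":
            total += 1  # S -> a
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)  # S -> (S)
        for k in range(i + 1, j - 1):
            if w[k] in "+*":
                # S -> S+S or S -> S*S with the operator at position k
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(w))


for expression in ["a+a", "a+(a*a)", "a+a*a"]:
    print(expression, count_trees(expression))
```

A count of two or more for any single string is enough to show that the grammar is ambiguous; a+a*a is such a string for G0.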

Page 9: Syntax Analysis

The Problem With Ambiguity

By the previous example, the grammar of algebraic expressions, G0, is ambiguous.

Problem: 2+2*2 = ?

Under derivation 1., we get 2 + (2*2) = 6.

Under derivation 2., we get (2+2)*2 = 8.

Which do we select?

Why is this a problem?

Suppose we are attempting to analyse strings in the language of G0, in order to perform simple arithmetic - the structure of the derivation will tell us which operation to apply when.
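The effect on arithmetic can be reproduced with a short Python sketch. It assumes that single digits stand in for the symbol a, as in the problem 2+2*2 above, and it evaluates the expression once for every derivation tree that G0 allows. The name all_values is illustrative.

```python
def all_values(tokens):
    """Return one value per derivation tree of tokens under G0 (digits in place of a)."""
    if len(tokens) == 1 and tokens[0].isdigit():
        return [int(tokens[0])]  # S -> a
    values = []
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        values += all_values(tokens[1:-1])  # S -> (S)
    for k in range(1, len(tokens) - 1):
        if tokens[k] == "+":  # S -> S+S with this + at the root
            values += [left + right
                       for left in all_values(tokens[:k])
                       for right in all_values(tokens[k + 1:])]
        elif tokens[k] == "*":  # S -> S*S with this * at the root
            values += [left * right
                       for left in all_values(tokens[:k])
                       for right in all_values(tokens[k + 1:])]
    return values


print(all_values(list("2+2*2")))    # [6, 8]
print(all_values(list("2+(2*2)")))  # [6]
```

The unparenthesised string yields both 6 and 8, one per tree, while the parenthesised string has a single tree and a single value.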


Page 10: Syntax Analysis

Unambiguous Expressions

We are aiming to produce an unambiguous version of G0. Essentially, we want to assign priorities to the operators, and reflect this in the grammar. Also, although it makes no difference to the evaluated expression, we want a+a+a to be parsed as (a+a)+a.

We will do this by introducing new symbols - a term, T, will represent a product; a factor, F, will represent things that can be multiplied; and S will represent sums.

An expression can be a sum of an expression and a term, or simply a term. A term can be a product of a term and a factor, or simply a factor. A factor can be an expression (in parentheses), or simply a symbol.


Page 11: Syntax Analysis

Unambiguous Expressions


Example: Grammar G1.

S -> S + T | T
T -> T * F | F
F -> (S) | a
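A hand-written parser for G1 can be sketched in Python as below. Two assumptions are made for the illustration. First, the left-recursive rules S -> S + T and T -> T * F are read as "one T followed by any number of + T" and "one F followed by any number of * F", with the results grouped to the left, because a naive recursive function for a left-recursive rule would call itself forever. Second, the function returns a simplified tree of nested tuples (operator, left, right) that omits the parentheses and the single-child steps S -> T -> F; it is not the full derivation tree. The name parse_g1 is illustrative.

```python
def parse_g1(text):
    """Parse text with grammar G1 and return a nested-tuple syntax tree."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        # F -> (S) | a
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError(f"expected ')' at position {pos}")
            advance()
            return node
        if peek() == "a":
            advance()
            return "a"
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")

    def term():
        # T -> T * F | F, read as F (* F)* and grouped to the left
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # S -> S + T | T, read as T (+ T)* and grouped to the left
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


print(parse_g1("a+a*a"))  # ('+', 'a', ('*', 'a', 'a'))
print(parse_g1("a+a+a"))  # ('+', ('+', 'a', 'a'), 'a')
```

Each string now has exactly one tree: * binds more tightly than + because products are built inside term(), and a+a+a is grouped as (a+a)+a because each loop extends the tree on the left.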


Page 12: Syntax Analysis

Ambiguity and Decidability

The ambiguity we have seen so far has always been a property of the grammar, and not of the language. However, there exist languages for which every grammar defining them is ambiguous.

Example: { a^i b^j c^k : i = j or j = k }

A language for which every defining grammar is ambiguous is inherently ambiguous.

More importantly, there is no algorithm which will determine whether or not a given context-free grammar is ambiguous.