The CYK Parsing Method

The CYK Parsing Method Chiyo Hotani Tanya Petrova CL2 Parsing Course 28 November, 2007


Transcript of The CYK Parsing Method

Page 1: The CYK Parsing Method

The CYK Parsing Method

Chiyo Hotani

Tanya Petrova

CL2 Parsing Course

28 November, 2007

Page 2: The CYK Parsing Method

Overview

• CYK Recognition with CF grammar
    • Basic Algorithm
    • Problems: unit-rules, ε-rules
    • Recognition with a grammar in CNF

• CYK Parsing with CNF
    • Parsing with CNF Recognition Table

• Chart Parsing

• Summary
    • Advantages and Disadvantages
    • Other remarks

Page 3: The CYK Parsing Method

Basic Algorithm of CYK Recognition (1)

Example Grammar:

A grammar describing numbers in scientific notation

Input: 32.5e+1

Page 4: The CYK Parsing Method

Basic Algorithm of CYK Recognition (2)

Derivations of substrings of length 1:

Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Sign -> + | -

Page 5: The CYK Parsing Method

Basic Algorithm of CYK Recognition (3)

Derivations of substrings of length 1:

NumberS -> Integer | Real

Integer -> Digit | Integer Digit

Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Unit rule: a rule of the form A -> B, where A and B are non-terminals. A derivation can contain chains of unit rules.
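Chains of unit rules mean that one pass over the rules is not enough: after Digit is found for a single digit, Integer and then Number have to be added as well. The following is a minimal sketch of that closure step, assuming the three unit rules of the example grammar and writing the start symbol as plain `Number`; the function name `close_under_unit_rules` is made up for illustration.

```python
# Unit rules of the example grammar as (left-hand side, right-hand side) pairs.
# Assumption: the start symbol is written "Number" here.
UNIT_RULES = [
    ("Number", "Integer"),
    ("Number", "Real"),
    ("Integer", "Digit"),
]


def close_under_unit_rules(nonterminals):
    """Repeatedly add every A that has a unit rule A -> B with B already in the set."""
    result = set(nonterminals)
    changed = True
    while changed:  # keep going until a full pass adds nothing new
        changed = False
        for lhs, rhs in UNIT_RULES:
            if rhs in result and lhs not in result:
                result.add(lhs)
                changed = True
    return result


# The substring "3" is derived directly by Digit; the unit-rule chain
# Integer -> Digit and Number -> Integer adds the other two non-terminals.
derivers_of_3 = close_under_unit_rules({"Digit"})
```

The loop has to be repeated until nothing changes, which is exactly the extra work that unit rules cause in the basic algorithm.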

Page 6: The CYK Parsing Method

Basic Algorithm of CYK Recognition (4)

NumberS -> Integer | Real

Integer -> Digit | Integer Digit

Fraction -> . Integer

Scale -> e Sign Integer | Empty

Page 7: The CYK Parsing Method

Basic Algorithm of CYK Recognition (5)

NumberS -> Integer | Real

Real -> Integer Fraction Scale

Number does indeed derive 32.5e+1.

Page 8: The CYK Parsing Method

Basic Algorithm of CYK Recognition (6)

ε-rules
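To handle ε-rules, the recognizer first needs the set of non-terminals that can derive the empty string. The sketch below computes that set by iterating until nothing changes. It assumes the example grammar as collected from the slides, plus a rule `Empty -> ε` that the transcript does not show explicitly; the dictionary layout and the name `nullable_nonterminals` are illustrative choices.

```python
# Example grammar; each right-hand side is a tuple of symbols.
# The empty tuple stands for ε. The rule Empty -> ε is an assumption.
GRAMMAR = {
    "Number": [("Integer",), ("Real",)],
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Real": [("Integer", "Fraction", "Scale")],
    "Fraction": [(".", "Integer")],
    "Scale": [("e", "Sign", "Integer"), ("Empty",)],
    "Digit": [(d,) for d in "0123456789"],
    "Sign": [("+",), ("-",)],
    "Empty": [()],
}


def nullable_nonterminals(grammar):
    """Return the set of non-terminals that derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # A is nullable if some alternative consists only of nullable symbols
            # (this is trivially true for the empty right-hand side).
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


r_epsilon = nullable_nonterminals(GRAMMAR)
```

For this grammar the result contains Empty and Scale, and no other non-terminal.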

Page 9: The CYK Parsing Method

Basic Algorithm of CYK Recognition (7)

$R_\varepsilon$ = { Empty, Scale }

Sentence: $z = z_1 z_2 \ldots z_n$

$s_{i,l}$: the substring of $z$ starting at position $i$, of length $l$:

$s_{i,l} = z_i z_{i+1} \ldots z_{i+l-1}$

$R_{s_{i,l}}$: the set of non-terminals deriving the substring $s_{i,l}$

A graphical presentation of substrings
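The substring notation maps directly onto string slicing. The following small sketch uses 1-based start positions as in the slides; the helper names `substring` and `all_substrings` are made up for illustration.

```python
def substring(z, i, l):
    """Return s_{i,l}: the substring of z starting at 1-based position i, of length l."""
    if i < 1 or l < 1 or i + l - 1 > len(z):
        raise ValueError("substring out of range")
    return z[i - 1 : i - 1 + l]


def all_substrings(z):
    """Yield (i, l, s_{i,l}) shortest first, the order in which CYK fills its sets."""
    n = len(z)
    for l in range(1, n + 1):
        for i in range(1, n - l + 2):
            yield i, l, substring(z, i, l)


z = "32.5e+1"
s_2_4 = substring(z, 2, 4)            # the substring "2.5e"
all_cells = list(all_substrings(z))   # one entry per (i, l) pair
```

A sentence of length n has n(n+1)/2 such substrings, so the example input of 7 symbols gives 28 sets to compute.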

Page 10: The CYK Parsing Method

CYK recognition with a grammar in CNF

Required restrictions:

• Eliminate ε-rules and unit rules

• Limit the maximum length of the right-hand side of a rule to 2

CNF (Chomsky Normal Form):

• No ε-rules and no unit rules

• All rules have one of the following two forms:

A -> a

A -> B C
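The two permitted rule forms are easy to check mechanically. The following is a minimal sketch, assuming rules are given as a left-hand side plus a tuple of right-hand-side symbols and that the set of non-terminals is known; `is_cnf_rule` is a hypothetical helper name.

```python
def is_cnf_rule(lhs, rhs, nonterminals):
    """True if lhs -> rhs has the form A -> a or A -> B C."""
    if lhs not in nonterminals:
        return False
    if len(rhs) == 1:
        # A -> a: a single terminal (a single non-terminal would be a unit rule)
        return rhs[0] not in nonterminals
    if len(rhs) == 2:
        # A -> B C: exactly two non-terminals
        return all(sym in nonterminals for sym in rhs)
    # ε-rule (length 0) or right-hand side longer than 2
    return False


NONTERMINALS = {"Number", "Integer", "Real", "Fraction", "Scale",
                "Digit", "Sign", "Empty"}

unit_rule_ok = is_cnf_rule("Integer", ("Digit",), NONTERMINALS)
long_rule_ok = is_cnf_rule("Real", ("Integer", "Fraction", "Scale"), NONTERMINALS)
binary_rule_ok = is_cnf_rule("Integer", ("Integer", "Digit"), NONTERMINALS)
```

Applied to the original example grammar, the check rejects the unit rule `Integer -> Digit` and the three-symbol rule for Real, which is why the grammar has to be transformed first.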

Page 11: The CYK Parsing Method

Our example grammar in CNF

Page 12: The CYK Parsing Method

CYK Parsing with CNF

Building the recognition table

Input:

• Our example grammar in CNF

• Input sentence: 32.5e+1

Page 13: The CYK Parsing Method

CYK Parsing with the CNF

Bottom row: read directly from the grammar (rules of the form A -> a)
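Filling the bottom row can be sketched as a lookup of each input symbol in the terminal rules. The CNF grammar itself is not reproduced in this transcript, so the rules below are an assumption: `T2` is named on a later slide, `T1` is a hypothetical name for the non-terminal deriving ".", and Number and Integer are assumed to derive digits directly once the unit rules have been eliminated.

```python
# Assumed terminal rules (A -> a) of the CNF grammar, as non-terminal -> set of terminals.
# "T1" is a hypothetical name; "Number" stands for the start symbol.
TERMINAL_RULES = {
    "Number": set("0123456789"),
    "Integer": set("0123456789"),
    "Digit": set("0123456789"),
    "T1": {"."},
    "T2": {"e"},
    "Sign": {"+", "-"},
}


def bottom_row(sentence):
    """For each input symbol, collect the non-terminals A with a rule A -> symbol."""
    return [
        {lhs for lhs, terminals in TERMINAL_RULES.items() if symbol in terminals}
        for symbol in sentence
    ]


row = bottom_row("32.5e+1")  # row[0] belongs to position 1, row[6] to position 7
```

Each entry of `row` is the set for a substring of length 1, so no splitting is involved at this stage.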

Page 14: The CYK Parsing Method

Two ways to compute $R_{s_{i,l}}$:

1) Check each right-hand side

2) Compute possible right-hand sides from the recognition table

Page 15: The CYK Parsing Method

How this is done

Example: 2.5e ( = $s_{2,4}$)

1) Checking a right-hand side such as N1 Scale´:

• N1 is not in $R_{s_{2,1}}$ or $R_{s_{2,2}}$

• N1 is a member of $R_{s_{2,3}}$

• But Scale´ is not a member of $R_{s_{5,1}}$

2) $R_{s_{2,4}}$ is the set of non-terminals that have a right-hand side A B where either:

• A in $R_{s_{2,1}}$ and B in $R_{s_{3,3}}$

• A in $R_{s_{2,2}}$ and B in $R_{s_{4,2}}$

• A in $R_{s_{2,3}}$ and B in $R_{s_{5,1}}$

Possible combinations: N1 T2 or Number T2

In our grammar we do not have such a right-hand side, so nothing is added to $R_{s_{2,4}}$.
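The second method (combining entries already in the recognition table) can be sketched as a complete recognizer. The CNF grammar is not reproduced in this transcript, so the rules below are an assumption chosen to be consistent with the names N1, T2 and Scale´ used above; `T1` and `N2` are hypothetical helper non-terminals, Scale´ is written `Scale'` in the code, and `cyk_table` is an illustrative function name.

```python
# Assumed CNF grammar. Terminal rules A -> a as non-terminal -> set of terminals.
TERMINAL_RULES = {
    "Number": set("0123456789"),
    "Integer": set("0123456789"),
    "Digit": set("0123456789"),
    "T1": {"."},
    "T2": {"e"},
    "Sign": {"+", "-"},
}

# Binary rules A -> B C as (A, B, C) triples. T1 and N2 are hypothetical names.
BINARY_RULES = [
    ("Number", "Integer", "Digit"),
    ("Number", "N1", "Scale'"),
    ("Number", "Integer", "Fraction"),
    ("N1", "Integer", "Fraction"),
    ("Integer", "Integer", "Digit"),
    ("Fraction", "T1", "Integer"),
    ("Scale'", "N2", "Integer"),
    ("N2", "T2", "Sign"),
]


def cyk_table(sentence):
    """Return a dict mapping (i, l) to the set of non-terminals deriving s_{i,l}."""
    n = len(sentence)
    table = {}
    # Bottom row: substrings of length 1, read directly from the rules A -> a.
    for i in range(1, n + 1):
        symbol = sentence[i - 1]
        table[(i, 1)] = {a for a, ts in TERMINAL_RULES.items() if symbol in ts}
    # Longer substrings: try every split of s_{i,l} into s_{i,k} and s_{i+k,l-k}.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            cell = set()
            for k in range(1, l):
                left = table[(i, k)]
                right = table[(i + k, l - k)]
                for a, b, c in BINARY_RULES:
                    if b in left and c in right:
                        cell.add(a)
            table[(i, l)] = cell
    return table


table = cyk_table("32.5e+1")
recognized = "Number" in table[(1, 7)]
```

With these assumed rules, `table[(2, 4)]` comes out empty, matching the worked example for 2.5e, and the entry for the whole sentence contains Number. Each set is computed in a single pass because every right-hand side B C splits the substring into two strictly shorter parts whose sets are already known.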

Page 16: The CYK Parsing Method

Recognition table

(Figure: the recognition table, with axes l (substring length) and i (start position).)

Page 17: The CYK Parsing Method

As a result we find out that:

This process is much less complicated than the one we saw before

Page 18: The CYK Parsing Method

Reasons

• We do not have to repeat the process again and again until no new non-terminals are added to $R_{s_{i,l}}$ (the substrings we are dealing with are proper substrings and cannot be equal to the string we start with).

• We only have to find one place where the substring must be split into two, matching a rule A -> B C.

Page 19: The CYK Parsing Method

Chart Parsing

A chart is just a recognition table.

Page 20: The CYK Parsing Method

A short retrospective of CYK

First: recognition table using the original grammar.

Then: transforming grammar to CNF.

Page 21: The CYK Parsing Method

A short retrospective of CYK cont.

CNF is useful for improving the efficiency, but it is actually a bit too restrictive 

Disadvantage of CNF: the resulting recognition table lacks the information we need to construct a derivation using the original grammar!

Page 22: The CYK Parsing Method

A short retrospective of CYK cont.

In the transformation process, some non-terminals were thrown away (non-productive).

Missing information could be added.

Page 23: The CYK Parsing Method

A short retrospective of CYK cont.

Result:

• Almost the same recognition table

• Extra information on non-terminals

• Obtained in a simpler and much more efficient way

Page 24: The CYK Parsing Method

Thank you

for your attention!