CHR as grammar formalism: A first report

Henning Christiansen

Roskilde University, DENMARK

http://www.dat.ruc.dk/~henning

Idea:

• Propagation rules of CHR: a natural bottom-up evaluator

• A bottom-up analogy to Definite Clause Grammars

• ... and see what else CHR can offer to language processing

Example: ”Peter likes Mary”

np(N0,N1), verb(N1,N2), np(N2,N3) ==> sentence(N0,N3).

token(peter,N0,N1) ==> np(N0,N1).

token(mary, N0,N1) ==> np(N0,N1).

token(likes,N0,N1) ==> verb(N0,N1).

                 sentence(0,3)
               /       |       \
        np(0,1)    verb(1,2)    np(2,3)
           |           |           |
token(peter,0,1) token(likes,1,2) token(mary,2,3)
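The way the four propagation rules build this tree bottom-up can be imitated in a few lines of Python. This is only a sketch that simulates the rules by saturating a set of constraints; it is not CHR, and the function name `parse` and the tuple encoding of constraints are assumptions made for illustration.

```python
from itertools import product

def parse(words):
    """Saturate a constraint store with the four propagation rules of the example."""
    lexicon = {"peter": "np", "mary": "np", "likes": "verb"}
    # Initial store: one token constraint per word, word boundaries as positions.
    store = {("token", w, i, i + 1) for i, w in enumerate(words)}
    changed = True
    while changed:
        new = set()
        # token(W,N0,N1) ==> Cat(N0,N1)
        for c in store:
            if c[0] == "token" and c[1] in lexicon:
                new.add((lexicon[c[1]], c[2], c[3]))
        # np(N0,N1), verb(N1,N2), np(N2,N3) ==> sentence(N0,N3)
        nps = [c for c in store if c[0] == "np"]
        verbs = [c for c in store if c[0] == "verb"]
        for a, v, b in product(nps, verbs, nps):
            if a[2] == v[1] and v[2] == b[1]:
                new.add(("sentence", a[1], b[2]))
        # Stop when a full pass adds nothing new (fixpoint).
        changed = not new <= store
        store |= new
    return store
```

Calling `parse(["peter", "likes", "mary"])` yields a store containing `("sentence", 0, 3)`. An unknown word simply produces no category, so the remaining constituents are still recognized.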

• Principle obviously correct for grammars without empty productions and loops

• Robust against errors

• Attributes can be added as in DCG (of course)

• Ambiguity not a problem

• Naive, elegant, but is it useful??

A few facts about CHR

• Declarative language for writing constraint solvers [Frühwirth, 1995]

• Propagation rules add: ... ==> ...

• Simplification rules replace: ... <=> ...

+ variations and misc. tools

• Recognized as general purpose logic programming language, e.g.:

[Abdennadher, Schütz, 1998]: Combine bottom-up & top-down

[Abdennadher, Christiansen, 2000]: ”Natural embedding” of abduction and integrity constraints
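The difference between the two rule kinds can be shown on a store of ground constraints. The Python below is a minimal sketch, not CHR: the helper names `propagate` and `simplify` are hypothetical, and a plain set stands in for the constraint store (so duplicate constraints and the propagation history are ignored).

```python
def propagate(store, heads, body):
    """heads ==> body: if all heads are present, add the body; the heads stay."""
    if set(heads) <= store:
        return store | set(body)
    return store

def simplify(store, heads, body):
    """heads <=> body: if all heads are present, replace them by the body."""
    if set(heads) <= store:
        return (store - set(heads)) | set(body)
    return store
```

With the store `{"a", "b"}`, `propagate` with heads `a, b` and body `c` keeps `a` and `b` and adds `c`, while `simplify` leaves only `c`.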

This talk: Exercises using CHR for language processing

Time complexity of prop. rule parsers

• No backtracking => no combinatorial explosion in case of errors

• n³ for Chomsky Normal Form grammars [McAllester, 2000; Cocke-Younger-Kasami ...]; in general worse

• Special problem: local ambiguity => large number of (perhaps) useless constraints (e.g. A ::= a | A A)

Several ways to get rid of n³ ...
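The local ambiguity problem can be made concrete by counting what a pure propagation parser produces for the grammar A ::= a | A A on an input of n tokens `a`. The following Python sketch simulates the rule `A(I,K), A(K,J) ==> A(I,J)`; the function name `saturate` and the encoding of constraints as `(start, end)` pairs are assumptions for illustration.

```python
def saturate(n):
    """Return (number of A constraints, number of rule firings) for input a^n."""
    # token(a,I,J) ==> A(I,J) gives one constraint per token.
    store = {(i, i + 1) for i in range(n)}
    fired = set()  # plays the role of the propagation history: each match fires once
    changed = True
    while changed:
        changed = False
        for (i, k) in list(store):
            for (k2, j) in list(store):
                if k == k2 and (i, k, j) not in fired:
                    fired.add((i, k, j))
                    store.add((i, j))  # A(I,K), A(K,J) ==> A(I,J)
                    changed = True
    return len(store), len(fired)
```

For n = 10 this gives 55 constraints and 165 rule firings: the store grows quadratically and the number of firings cubically, in line with the n³ bound above.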

Improving time complexity

Example: (grammar for arithmetic expressions)

exp(N0,N1), token(+,N1,N2), exp(N2,N3) ==> exp(N0,N3).

• Make all but rightmost symbol passive

exp(N0,N1)#Id1, token(+,N1,N2)#Id2, exp(N2,N3) ==>
    exp(N0,N3)
    pragma passive(Id1), passive(Id2).

• Use look-ahead

exp(N0,N1)#Id1, token(+,N1,N2)#Id2, exp(N2,N3)#Id3, token(R,N3,N4) ==>
    member(R,[+,')',eof]) | exp(N0,N3)
    pragma passive(Id1), passive(Id2), passive(Id3).

• Use simplification/simpagation rules instead

token(R,N3,N4) \ exp(N0,N1)#Id1, token(+,N1,N2)#Id2, exp(N2,N3)#Id3 <=>
    member(R,[+,')',eof]) | exp(N0,N3)
    pragma passive(Id1), passive(Id2), passive(Id3).

Improving time complexity

• Make all but rightmost symbol passive
– does not change grammar; can be added even without grammar-writer’s attention

• Use look-ahead
– as above

• Use simplification/simpagation rules instead
– does not change unambiguous grammars
– a feature to enforce unambiguity
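The effect of the simpagation variant with look-ahead can be imitated in Python. This is a hedged sketch rather than CHR: it removes the matched `exp`, `+`, `exp` constraints and adds the combined `exp`, but only when the next token is in the look-ahead set. The names `reduce_all` and `find_match`, the tuple encoding, and the base case that turns an operand token `n` into an `exp` are assumptions that do not appear on the slides.

```python
FOLLOW = {"+", ")", "eof"}  # look-ahead set from the guard member(R,[+,')',eof])

def find_match(store):
    """Return (left, plus, right) for one applicable rule instance, or None."""
    exps = sorted(c for c in store if c[0] == "exp")  # try candidates left to right
    for left in exps:
        for right in exps:
            plus = ("token", "+", left[2], right[1])
            lookahead_ok = any(
                c[0] == "token" and c[2] == right[2] and c[1] in FOLLOW
                for c in store
            )
            if plus in store and lookahead_ok:
                return left, plus, right
    return None

def reduce_all(symbols):
    """Apply the simpagation rule until no instance is applicable."""
    store = set()
    for i, s in enumerate(symbols):
        # Assumed base case (not on the slides): an operand token n becomes an exp.
        store.add(("exp", i, i + 1) if s == "n" else ("token", s, i, i + 1))
    steps = 0
    while (match := find_match(store)) is not None:
        left, plus, right = match
        store -= {left, plus, right}           # heads after the backslash are removed
        store.add(("exp", left[1], right[2]))  # body: exp(N0,N3)
        steps += 1
    return store, steps
```

For `["n", "+", "n", "+", "n", "eof"]` the store shrinks to a single `exp` spanning positions 0 to 5 plus the `eof` token, after two rule applications. A pure propagation rule would instead keep every intermediate `exp`.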

Time complexity not a problem!

• ”Reasonable” grammars run almost linearly

• A variety of tools available for the competent CHR grammar writer

Fact: CHR is a perfect tool for passing hypotheses around

Example:

produce: ... ==> ... h(X) ...

apply: ... h(X) ... ==> ... X ...

Let’s try Assumption Grammars [Dahl, Tarau, Li, 1997]

• anaphora, coordination, etc.

Assumption Grammars

+h(a)   assert linear hypothesis h(a) for subsequent text

*h(a)   assert ”intuitionistic” hypothesis for subsequent text

-h(X)   consume/apply hypothesis

=+h(a), =*h(X), =-h(X)   as above but free-order

(NB: syntax changed slightly compared with [Dahl, Tarau, Li, 1997])

Assumption Grammars in CHR

Represent =+h(a) as =+(h,[a]), etc.

+h(a) as +(h,[a],position), etc.

Implement by the following CHR rules:

=+(P,A), =-(P,B) <=> true & A=B | true.

=*(P,A) \ =-(P,B) <=> true & A=B | true.

+(P,A,Z1), -(P,B,Z2) <=> Z1 < Z2 & A=B | true.

*(P,A,Z1)\ -(P,B,Z2) <=> Z1 < Z2 & A=B | true.

Problem with commitment and ambiguity

So assume no ambiguity for the moment
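The behaviour of the two position-sensitive rules can be sketched in Python: a linear hypothesis (+) is removed when it is consumed, an intuitionistic one (*) stays available. This is a simulation under assumptions, not the CHR implementation: the function `consume`, the dictionary layout, and the policy of preferring the most recent earlier hypothesis are all hypothetical (CHR itself simply commits to some applicable match).

```python
def consume(hypotheses, name, position, value=None):
    """Try to satisfy -(name, value, position) against a list of hypotheses.

    Each hypothesis is a dict with keys kind ('+' linear, '*' intuitionistic),
    name, value and position. value=None plays the role of an unbound variable.
    Returns the matched value, or None if nothing applies.
    """
    candidates = [
        h for h in hypotheses
        if h["name"] == name
        and h["position"] < position                  # guard Z1 < Z2
        and (value is None or h["value"] == value)    # guard A=B
    ]
    if not candidates:
        return None
    # Assumption: prefer the most recent earlier hypothesis.
    chosen = max(candidates, key=lambda h: h["position"])
    if chosen["kind"] == "+":
        hypotheses.remove(chosen)  # linear hypotheses are used up
    return chosen["value"]
```

An intuitionistic `active_individual` hypothesis asserted at position 0 can thus be consumed by several later pronouns, whereas a linear hypothesis answers only the first request.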

Example: Anaphora, ”Peter ... he ...”

token(he,N0,N1) <=> pronoun(masc,N0,N1).

pronoun(Gender,N0,N1) <=>

-(active_individual,[X,Gender],N0),

np(X,Gender,N0,N1).

token(peter,N0,N1) <=>

proper_name(peter,masc,N0,N1).

proper_name(X,Gender,N0,N1) <=>

*(active_individual,[X,Gender],N0),

np(X,Gender,N0,N1).

Example: Coordination (adapted from [Dahl, Tarau, Li, 1997])

”Mary likes • and Martha hates Peter”

Rule for sentence asking around for an object:

token(and,N3,N4) \ np(Sub,N1,N2), verb(V,N2,N3) <=>
    =-(ref_object,[Obj]), sent(V*(Sub,Obj),N1,N3).

The following rule instance is applied for the sample sentence:

=+(ref_object,[peter]), =-(ref_object,[Obj])
    <=> true & [peter]=[Obj] | true.

Rule for sentence offering its object to the context:

sent(S1,N1,N2), token(and,N2,N3), sent(V2*(Sub2,Obj2),N3,N4)
    <=> =+(ref_object,[Obj2]),
        sent(S1+V2*(Sub2,Obj2),N1,N4).

Problems with Assumption Grammars

• Better control of scope of hypotheses needed,

e.g. ”offered subjects” only in single period

• Priority of hypothesis, pruning

... heuristics, weights, fuzzy?

Claim: Can be approached in CHR

• suggestions for new mechanisms can be implemented in CHR

• ... or simply program ad hoc!

Abduction with integrity constraints (sketch)

Consider DCG rule:

A --> B1, B2, B3, {P}

Under suitable conditions equivalent with

B1(N0,N1), B2(N1,N2), B3(N2,N3) <=> P, A(N0,N3).

Notice: Our implementation of Assumption Grammars is an example of abduction with integrity constraints

Integrity constraints reject inconsistent hypothesis sets, e.g.

*(active_individual,[X,masc]), *(active_individual,[X,fem]) <=> fail.

fact(likes,X,Y), fact(hates,X,Y) <=> fail.
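What these two integrity constraints check can be written as a plain consistency test in Python. The sketch only mirrors the two example rules; the function name `consistent` and the tuple encoding of hypotheses are assumptions for illustration.

```python
def consistent(store):
    """Return False if the hypothesis set violates one of the two integrity constraints."""
    for c in store:
        # *(active_individual,[X,masc]), *(active_individual,[X,fem]) <=> fail.
        if (c[0] == "active_individual" and c[2] == "masc"
                and ("active_individual", c[1], "fem") in store):
            return False
        # fact(likes,X,Y), fact(hates,X,Y) <=> fail.
        if (c[0] == "fact" and c[1] == "likes"
                and ("fact", "hates", c[2], c[3]) in store):
            return False
    return True
```

A set containing both `("fact", "likes", "mary", "peter")` and `("fact", "hates", "mary", "peter")` is rejected, while likes(mary,peter) together with hates(martha,peter) is accepted.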

Problems with abduction + int. constr’s

• Only one out of several hypotheses chosen

• Inconsistent choices lead to total failure

• Attempt to handle ambiguity by propagation rules (==>) will mix up different interpretations of the text

Under development:

• Indexing techniques: Maintain different but shared hypothesis sets in same constraint store

• Replace ”... <=> fail” by ”... <=> vanish(index)”

Our point here: These problems can be approached in CHR!!

Conclusion

• Simple, naive and neat approach

• Efficient and robust bottom-up parsers

• Potential for powerful and flexible natural language processing methods

• Suggestive application of declarative constraint programming

... and fun to play with too!

Current work:

• Indexing and sharing techniques

• Recognition of NPs for ontology-based search in text DBs; in parallel with ”traditional approach” in the OntoQuery project

• Adapt grammar for large subset of Danish; with CST, Copenhagen, also in OntoQuery