Page 1: Augmenting WordNet for Deep Understanding of Text

Augmenting WordNet for Deep Understanding of Text

Peter Clark, Phil Harrison, Bill Murray, John Thompson (Boeing)
Christiane Fellbaum (Princeton Univ)
Jerry Hobbs (ISI/USC)

Page 2: Augmenting WordNet for Deep Understanding of Text

“Deep Understanding”
• Not (just) parsing + word senses
• Construction of a coherent representation of the scene the text describes
• Challenge: much of that representation is not in the text

“A soldier was killed in a gun battle”
→ “The soldier died”
→ “The soldier was shot”
→ “There was a fight”
→ …

Page 3: Augmenting WordNet for Deep Understanding of Text

“Deep Understanding”

“A soldier was killed in a gun battle”

→ “The soldier died”
→ “The soldier was shot”
→ “There was a fight”
→ …

Because:
• A battle involves a fight.
• Soldiers use guns.
• Guns shoot.
• Guns can kill.
• If you are killed, you are dead.
• …

How do we get this knowledge into the machine?
How do we exploit it?

Page 4: Augmenting WordNet for Deep Understanding of Text

Several partially useful resources exist.
WordNet is already used a lot… can we extend it?

Page 5: Augmenting WordNet for Deep Understanding of Text

The Initial Vision
• Our vision:
  – Rapidly expand WordNet to be more of a knowledge base
  – Question-answering software to demonstrate its use

Page 6: Augmenting WordNet for Deep Understanding of Text

The Evolution of WordNet

• v1.0 (1986)
  – synsets (concepts) + hypernym (isa) links
• v1.7 (2001)
  – add in additional relationships
    • has-part
    • causes
    • member-of
    • entails-doing (“subevent”)
• v2.0 (2003)
  – introduce the instance/class distinction
    • Paris isa Capital-City is-type-of City
  – add in some derivational links
    • explode related-to explosion
• …
• v10.0 (200?)
  – ?????

lexical resource → knowledge base?
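The relation types listed above can be pictured as labelled edges between concepts. The following is only a minimal sketch under stated assumptions: it is not WordNet's storage format or API, the relation labels are copied from the slide, and the `car`/`wheel` triple and all function names are hypothetical illustrations.

```python
from collections import defaultdict

# Toy triples using the relation labels from the slide. The entity names
# are illustrative only; "car has-part wheel" is an invented example.
TRIPLES = [
    ("Paris", "isa", "Capital-City"),          # instance -> class
    ("Capital-City", "is-type-of", "City"),    # class -> superclass
    ("explode", "related-to", "explosion"),    # derivational link
    ("car", "has-part", "wheel"),              # hypothetical has-part link
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subj, rel, obj in triples:
        index[subj][rel].add(obj)
    return index


def classes_of(index, entity):
    """Follow the instance link once, then class links transitively."""
    frontier = list(index[entity]["isa"])
    seen = set()
    while frontier:
        cls = frontier.pop()
        if cls not in seen:
            seen.add(cls)
            frontier.extend(index[cls]["is-type-of"])
    return seen


INDEX = build_index(TRIPLES)
print(sorted(classes_of(INDEX, "Paris")))  # ['Capital-City', 'City']
```

Keeping the instance link separate from the class link is what lets a query distinguish "Paris is a city" from "a capital city is a kind of city".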

Page 7: Augmenting WordNet for Deep Understanding of Text

Augmenting WordNet

• World Knowledge
  – Sense-disambiguate the glosses (by hand)
  – Convert the glosses to logic
    • Similar to LCC’s Extended WordNet attempt
  – Axiomatize “core theories”
• WordNet links
  – Morphosemantic links
  – Purpose links
• Experiments

Page 8: Augmenting WordNet for Deep Understanding of Text

Converting the Glosses to Logic

1. Convert gloss to form “word is gloss”
2. Parse (Charniak)
3. LFToolkit: Generate logical form fragments

Example: “ambition#n2: A strong drive for success”

Lexical output rules produce logical form fragments:
strong drive for success
strong(x1) & drive(x2) & for(x3,x4) & success(x5)

Page 9: Augmenting WordNet for Deep Understanding of Text

Converting the Glosses to Logic

1. Convert gloss to form “word is gloss”
2. Parse (Charniak)
3. LFToolkit: Generate logical form fragments
4. Identify equalities, add senses

Example: “ambition#n2: A strong drive for success”

Page 10: Augmenting WordNet for Deep Understanding of Text

Converting the Glosses to Logic

Identify equalities, add senses

A strong drive for success
strong(x1) & drive(x2) & for(x3,x4) & success(x5)

Lexical output rules produce logical form fragments.
Composition rules identify variables:
x1=x2
x2=x3
x4=x5
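The variable-identification step can be sketched as a small union-find over the fragment variables. This is a hedged illustration, not LFToolkit's implementation: the `find`, `unify`, and `render` helpers and the tuple representation of fragments are assumptions made for the example, and word senses are left out.

```python
def find(parent, v):
    """Return the representative of v's equality class."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]  # path halving
        v = parent[v]
    return v


def unify(fragments, equalities):
    """Merge variables declared equal, then renumber them x1, x2, ..."""
    parent = {a: a for _, args in fragments for a in args}
    for left, right in equalities:
        a, b = find(parent, left), find(parent, right)
        if a != b:
            parent[b] = a
    names = {}   # representative -> fresh name, in order of first appearance
    result = []
    for pred, args in fragments:
        new_args = []
        for a in args:
            rep = find(parent, a)
            names.setdefault(rep, f"x{len(names) + 1}")
            new_args.append(names[rep])
        result.append((pred, tuple(new_args)))
    return result


def render(fragments):
    """Print fragments in the slide's 'p(x) & q(x,y)' notation."""
    return " & ".join(f"{p}({','.join(args)})" for p, args in fragments)


# Fragments and equalities copied from the slide.
FRAGMENTS = [("strong", ("x1",)), ("drive", ("x2",)),
             ("for", ("x3", "x4")), ("success", ("x5",))]
EQUALITIES = [("x1", "x2"), ("x2", "x3"), ("x4", "x5")]

print(render(unify(FRAGMENTS, EQUALITIES)))
# strong(x1) & drive(x1) & for(x1,x2) & success(x2)
```

After merging, only two distinct variables remain: one for the drive and one for the success it is directed at.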

Page 11: Augmenting WordNet for Deep Understanding of Text

Converting the Glosses to Logic

1. Convert gloss to form “word is gloss”
2. Parse (Charniak)
3. LFToolkit: Generate logical form fragments
4. Identify equalities, add senses

Example: “ambition#n2: A strong drive for success”

ambition#n2(x1) → a(x1) & strong#a1(x1) & drive#n2(x1) & for(x1,x2) & success#a3(x2)

Page 12: Augmenting WordNet for Deep Understanding of Text

Converting the Glosses to Logic
• Sometimes works well!
• But often not. Primary problems:
  1. Errors in the language processing
  2. Only capture definitional knowledge
  3. “Flowery” language, many gaps, metonymy, ambiguity; if logic closely follows syntax → “logico-babble”

“hammer#n2: tool used to deliver an impulsive force by striking”
hammer#n2(x1) → tool#n1(x1) & use#v1(e1,x2,x1) & to(e1,e2) & deliver#v2(e2,x3) & driving#a1(x3) & force#n1(x3) & by(e3,e4) & strike#v3(e4,x4).

→ Hammers hit things??

Page 13: Augmenting WordNet for Deep Understanding of Text

Augmenting WordNet

• World Knowledge
  – Sense-disambiguate the glosses (by hand)
  – Convert the glosses to logic
  – Axiomatize “core theories”
• WordNet links
  – Morphosemantic links
  – Purpose links
• Experiments

Page 14: Augmenting WordNet for Deep Understanding of Text

Core Theories

• Many domain-specific facts are instantiations of more general, “core” knowledge
• By encoding this core knowledge, get leverage
  – e.g., 517 “vehicle” noun (senses), 185 “cover” verb (senses)
• Approach:
  – Analysis and grouping of words in Core WordNet
  – Identification and encoding of underlying theories

Page 15: Augmenting WordNet for Deep Understanding of Text

Core Theories

Composite Entities: perfect, empty, relative, secondary, similar, odd, ...
Scales: step, degree, level, intensify, high, major, considerable, ...
Events: constraint, secure, generate, fix, power, development, ...
Space: grade, inside, lot, top, list, direction, turn, enlarge, long, ...
Time: year, day, summer, recent, old, early, present, then, often, ...
Cognition: imagination, horror, rely, remind, matter, estimate, idea, ...
Communication: journal, poetry, announcement, gesture, charter, ...
Persons and their Activities: leisure, childhood, glance, cousin, jump, ...
Microsocial: virtue, separate, friendly, married, company, name, ...
Material World: smoke, shell, stick, carbon, blue, burn, dry, tough, ...
Geo: storm, moon, pole, world, peak, site, village, sea, island, ...
Artifacts: bell, button, van, shelf, machine, film, floor, glass, chair, ...
Food: cheese, potato, milk, break, cake, meat, beer, bake, spoil, ...
Macrosocial: architecture, airport, headquarters, prosecution, ...
Economic: import, money, policy, poverty, profit, venture, owe, ...

Page 17: Augmenting WordNet for Deep Understanding of Text

Morphosemantic Links
• Often need to cross part-of-speech

T: A council worker cleans up after Tuesday's violence in Budapest.
H: There were attacks in Budapest on Tuesday.

• Can solve with WN’s derivation links:

(“attack” →) attack_v3 ↔ aggression_n4 (← “violence”)
linked by the “aggress”/“aggression” derivation link

Page 18: Augmenting WordNet for Deep Understanding of Text

Morphosemantic Links
• But can go wrong!

T: Paying was slow
H1: The transaction was slow
H2: *The person was slow [NOT entailed]

(“pay”) pay_v1, “pay”/“payment” link: payment_n1 (→ “transaction”)
(“pay”) pay_v1, “pay”/“payer” link: payer_n1 (→ “person”)

Problem: The type of relation matters for derivatives! (Event? Agent? ...)

A pays B →
  the payment (event-noun) by A
  A is the payer (agent-noun) of B
  etc.
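The point that the link type matters can be made concrete with a typed lookup. The sketch below is an assumption-laden illustration, not the authors' system: the `TYPED_LINKS` table holds only the two "pay" links from the slide, and both function names are hypothetical.

```python
# (verb sense, noun sense) -> semantic type of the derivation link.
# The two entries follow the "pay" example on the slide.
TYPED_LINKS = {
    ("pay_v1", "payment_n1"): "event",
    ("pay_v1", "payer_n1"): "agent",
}


def untyped_link_exists(verb, noun):
    """What a plain derivation link offers: related or not."""
    return (verb, noun) in TYPED_LINKS


def noun_can_stand_for(verb, noun, required_type):
    """Accept the noun only if the link exists AND has the required type."""
    return TYPED_LINKS.get((verb, noun)) == required_type


# "Paying was slow": the subject is the paying event itself, so a noun
# may replace it only through an event-type link.
for noun in ("payment_n1", "payer_n1"):
    print(noun,
          untyped_link_exists("pay_v1", noun),
          noun_can_stand_for("pay_v1", noun, "event"))
# payment_n1 True True
# payer_n1 True False
```

With untyped links both hypotheses look supported; with typed links only the event reading (H1) survives, which matches the judgement on the slide.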

Page 19: Augmenting WordNet for Deep Understanding of Text

Morphosemantic Links
• Task: Classify the 22,000 links in WordNet:

Verb Synset    Noun Synset     Relationship
hammer_v1      hammer_n1       instrument
execute_v1     execution_n1    event (equal)
sign_v2        signatory_n1    agent

• Semi-automatic process
  – Exploit taxonomy and morphology
• 15 semantic types used
  – agent, undergoer, instrument, result, material, destination, location, result, by-means-of, event, uses, state, property, body-part, vehicle.

Page 20: Augmenting WordNet for Deep Understanding of Text

Experimentation

Page 21: Augmenting WordNet for Deep Understanding of Text

Task: Recognizing Entailment
• Experiment with WordNet, logical glosses, DIRT
• Text interpretation to logic using Boeing’s NLP system
• Entailment: T → H if:
  – T is subsumed by H (“cat eats mouse” → “animal was eaten”)
  – An elaboration of T using inference rules is subsumed by H
    • (“cat eats mouse” → “cat swallows mouse”)
• No statistical similarity metrics
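The subsumption test can be sketched with a toy taxonomy. This is a simplified illustration under stated assumptions, not Boeing's NLP system: the `HYPERNYMS` table, the triple format `(event, role, filler)`, and the function names are hypothetical, and the check ignores whether variables are bound consistently across triples.

```python
# Toy isa chain; a real system would consult the WordNet taxonomy.
HYPERNYMS = {"cat": "feline", "feline": "animal",
             "mouse": "rodent", "rodent": "animal"}


def generalizes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = HYPERNYMS.get(node)
    return False


def subsumed_by(text, hypothesis):
    """T is subsumed by H if every H triple generalizes some T triple."""
    return all(
        any(h_rel == t_rel
            and generalizes(h_head, t_head)
            and generalizes(h_arg, t_arg)
            for t_head, t_rel, t_arg in text)
        for h_head, h_rel, h_arg in hypothesis
    )


T = [("eat", "agent", "cat"), ("eat", "object", "mouse")]   # "cat eats mouse"
H = [("eat", "object", "animal")]                           # "animal was eaten"
H_BAD = [("eat", "object", "cat")]                          # "cat was eaten"

print(subsumed_by(T, H))      # True
print(subsumed_by(T, H_BAD))  # False
```

The hypothesis may drop information (the agent) or generalize it (mouse to animal), but it may not introduce anything the text does not support.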

“A soldier was killed in a gun battle”

Initial Logic:
“soldier”(soldier01),
“kill”(…..
object(kill01,soldier01),
“in”(kill01,battle01),
modifier(battle01,gun01).

Final Logic:
isa(soldier01,soldier_n1),
isa(……
object(kill01,soldier01)
during(kill01,battle01)
instrument(battle01,gun01)

Page 23: Augmenting WordNet for Deep Understanding of Text

Successful Examples with the Glosses
• Good example (14.H4)

T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain restricted workers from Bulgaria.

WN: limit_v1: “restrict”: place limits on.

Rewriting H with the gloss:
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain placed limits on workers from Bulgaria.

→ ENTAILED (correct)

Page 25: Augmenting WordNet for Deep Understanding of Text

Successful Examples with the Glosses
• Another (somewhat) good example (56.H3)

T: The administration managed to track down the perpetrators.
H: The perpetrators were being chased by the administration.

WN: hunt_v1: “hunt”, “track down”: pursue for food or sport

Rewriting T with the gloss:
T: The administration managed to pursue the perpetrators [for food or sport!].
H: The perpetrators were being chased by the administration.

→ ENTAILED (correct)

Page 27: Augmenting WordNet for Deep Understanding of Text

Unsuccessful examples with the glosses
• More common: Being “tantalizingly close” (16.H3)

T: Satomi Mitarai bled to death.
H: His blood flowed out of his body.

WordNet: bleed_v1: "shed blood", "bleed", "hemorrhage": lose blood from one's body

So close!
Need to also know: “lose liquid from container” → (usually) “liquid flows out of container”

Page 29: Augmenting WordNet for Deep Understanding of Text

Unsuccessful examples with the glosses
• More common: Being “tantalizingly close” (20.H2)

T: The National Philharmonic orchestra draws large crowds.
H: Large crowds were drawn to listen to the orchestra.

WordNet:
WN: orchestra = collection of musicians
WN: musician: plays musical instrument
WN: music = sound produced by musical instruments
WN: listen = hear = perceive sound

So close!

Page 30: Augmenting WordNet for Deep Understanding of Text

Success with Morphosemantic Links
• Good example (66.H100)

T: The Zoopraxiscope was invented by Mulbridge.
H*: Mulbridge was the invention of the Zoopraxiscope. [NOT entailed]

WordNet too permissive!:
(“invent” →) invent_v1 ↔ invention_n1 (← “invention”)
linked by the “invent”/“invention” derivation link

But need an agent (X verb Y → X is agent-noun of Y)
Got: result-noun (“invention” is result of “invent”)

So no entailment (correct!)

Page 31: Augmenting WordNet for Deep Understanding of Text

Successful Examples with DIRT
• Good example (54.H1)

T: The president visited Iraq in September.
H: The president traveled to Iraq.

DIRT: IF Y is visited by X THEN X flocks to Y
WordNet: "flock" is a type of "travel"

Entailed [correct]
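The two-step chain in this example (a paraphrase rule, then a taxonomy step) can be sketched as follows. It is a minimal illustration, not DIRT's rule format or the authors' reasoner: facts are simplified to hypothetical `(verb, subject, object)` tuples with prepositions dropped, and the `RULES` and `VERB_HYPERNYM` tables contain only what the slide states.

```python
# Paraphrase rule from the slide: IF Y is visited by X THEN X flocks to Y.
RULES = {"visit": ["flock"]}
# Taxonomy fact from the slide: "flock" is a type of "travel".
VERB_HYPERNYM = {"flock": "travel"}


def elaborate(fact):
    """Yield the fact plus every fact obtained by one paraphrase rule."""
    verb, subj, obj = fact
    yield fact
    for new_verb in RULES.get(verb, []):
        yield (new_verb, subj, obj)


def verb_generalizes(general, specific):
    """True if `general` equals `specific` or is one of its hypernyms."""
    while specific is not None:
        if specific == general:
            return True
        specific = VERB_HYPERNYM.get(specific)
    return False


def entails(text_fact, hyp_fact):
    """H follows if some elaboration of T is subsumed by H."""
    h_verb, h_subj, h_obj = hyp_fact
    return any(
        verb_generalizes(h_verb, verb) and (subj, obj) == (h_subj, h_obj)
        for verb, subj, obj in elaborate(text_fact)
    )


T_FACT = ("visit", "president", "Iraq")
H_FACT = ("travel", "president", "Iraq")
print(entails(T_FACT, H_FACT))  # True
```

Neither resource suffices alone here: without the rule there is no path from "visit" to "travel", and without the taxonomy "flock" never reaches "travel".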

Page 32: Augmenting WordNet for Deep Understanding of Text

Unsuccessful Examples with DIRT
• Bad rule (55.H100)

T: The US troops stayed in Iraq although the war was over.
H*: The US troops left Iraq when the war was over. [NOT entailed]

DIRT: IF Y stays in X THEN Y leaves X

Entailed [incorrect]

Page 33: Augmenting WordNet for Deep Understanding of Text

Overall Results
• Note: Eschewing statistics!
• BPI test suite (61%):

When H or ¬H is predicted by:          Correct  Incorrect  Assessment
Simple syntax manipulation                  11          3  “Straight-forward”
WordNet taxonomy + morphosemantics          14          1  Useful
WordNet logicalized glosses                  4          1  Occasionally useful
DIRT paraphrase rules                       27         20  Often useful but unreliable
When H or ¬H is not predicted:              97         72

• RTE3: 55%
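As a check on the arithmetic, the 61% figure for the BPI test suite can be recomputed from the counts in the table. The snippet below is just that calculation; the dictionary layout and variable names are illustrative.

```python
# (correct, incorrect) counts copied from the results table.
ROWS = {
    "Simple syntax manipulation": (11, 3),
    "WordNet taxonomy + morphosemantics": (14, 1),
    "WordNet logicalized glosses": (4, 1),
    "DIRT paraphrase rules": (27, 20),
    "Not predicted": (97, 72),
}

correct = sum(c for c, _ in ROWS.values())
incorrect = sum(i for _, i in ROWS.values())
accuracy = correct / (correct + incorrect)
print(correct, incorrect, f"{accuracy:.1%}")  # 153 97 61.2%

# Per-source precision: how often a source is right when it fires.
for name, (c, i) in ROWS.items():
    print(f"{name}: {c / (c + i):.1%}")
```

The per-source figures make the slide annotations concrete: the DIRT rules fire most often (47 cases) but are right in only 27 of them, whereas the taxonomy and morphosemantic links are right in 14 of 15.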

Page 37: Augmenting WordNet for Deep Understanding of Text

Summary

• “Understanding”
  – Constructing a coherent model of the scene being described
  – Much is implicit in text → Need lots of world knowledge
• Augmenting WordNet
  – Made some steps forward:
    • More connectivity
    • Logicalized glosses
  – But still need a lot more knowledge!

Page 38: Augmenting WordNet for Deep Understanding of Text

Thank you!