TEXT PROCESSING 1 Anaphora resolution: recent developments Massimo Poesio.


TEXT PROCESSING 1

Anaphora resolution: recent developments

Massimo Poesio

Outline

Analysis of errors made by mainstream systems

Joint Entity Detection and Tracking

Lexical and commonsense knowledge

AR tools

(Global models)

Error analysis (Soon et al)

Errors most affecting precision:

Prenominal modifiers identified as mentions, and other errors in mention identification

String match, but the noun phrases refer to different entities

Errors most affecting recall:

Errors in mention identification (11%)

Errors in SEMCLASS determination (10%)

Need for more features (63.3%)

Soon et al examples of errors:

Tarnoff, a former Carter administration official and president of the Council on Foreign Relations, is expected to be named [undersecretary] for political affairs … Former Sen. Tim Wirth is expected to get a newly created [undersecretary] post for global affairs

[Ms Washington and Mr. Dingell] have been considered [allies] of [the Securities exchanges], while [banks] and [future exchanges] often have fought with THEM

Mention detection errors in GUITAR (Kabadjov, 2007)

[The bow] (see detail, below right) is decorated with a complicated arrangement of horses and lions’ heads.

Above the lions’ heads are four sphinxes.

Three pairs of lions clamber up the section from the point where [the sheath and bow] are joined.

Joint Entity Detection and Tracking

Daumé and Marcu 2005: Mention identification, classification, and linking take place at the same time

Denis and Baldridge 2007: integer linear programming (ILP)

Lexical and commonsense knowledge

Nominals the most common type of anaphoric expression

Main source of errors with nominals: lack of commonsense knowledge

Lexical and commonsense knowledge: outline

Charniak, Sidner: the semantic network assumption

Formal models: Hobbs et al 1992, Poesio 1993, Lascarides & Asher 1998, Gardent & Kohlhase 1999, Kehler

Using information about semantic relations: Byron 2002, Kehler 2004, Ng, Yang

Ponzetto & Strube [Simone]

Pattern-based approaches (Poesio et al, Markert & Nissim, Versley)

Basic errors in GUITAR: synonyms & hyponyms

Toni Johnson pulls a tape measure across the front of what was once [a stately Victorian home].

…..

The remainder of [THE HOUSE] leans precariously against a sturdy oak tree.

Most of the 10 analysts polled last week by Dow Jones International News Service in Frankfurt .. .. expect [the US dollar] to ease only mildly in November

…..

Half of those polled see [THE CURRENCY] …

Early work: the semantic network assumption (Charniak, Sidner)

CAR ISA VEHICLE

CAR HAS WHEELS

I saw [a car] come in. … THE VEHICLE was moving very slowly … THE WHEELS were moving very slowly …
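The semantic network assumption can be illustrated with a minimal sketch. The two links are the ones on the slide; the dictionary layout and the function names `is_a` and `resolve` are assumptions made for illustration, not a reconstruction of Charniak's or Sidner's systems.

```python
# Toy semantic network holding only the two links from the slide.
ISA = {"car": "vehicle"}
HAS = {"car": {"wheels"}}


def is_a(word, ancestor):
    """Follow ISA links upward from `word` and report whether `ancestor` is reached."""
    while word in ISA:
        word = ISA[word]
        if word == ancestor:
            return True
    return False


def resolve(anaphor_head, candidate_heads):
    """Return (antecedent, relation) for the most recent compatible candidate, else None."""
    for candidate in reversed(candidate_heads):  # most recent mention first
        if candidate == anaphor_head or is_a(candidate, anaphor_head):
            return candidate, "identity"
        if anaphor_head in HAS.get(candidate, set()):
            return candidate, "part-of"  # bridging reference
    return None


print(resolve("vehicle", ["car"]))  # THE VEHICLE -> a car (identity)
print(resolve("wheels", ["car"]))   # THE WHEELS  -> a car (part-of)
```

With hand-coded links the lookup is trivial; the difficulty discussed in the rest of the lecture is obtaining such links with adequate coverage.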

Basic errors: NE

[Bach]’s air followed. Mr. Stolzman tied [the composer] in by proclaiming him the great improviser of the 18th century ….

[The FCC] …. [the agency]

Basic NE?

Modifiers

FALSE NEGATIVE:

A new incentive plan for advertisers …

…. The new ad plan ….

FALSE NEGATIVE:

The 80-year-old house

….

The Victorian house …

More complex cases of hyponymy

(Actual example of error from Soon et al on ACE)

<COREF ID="set_5">The quality</COREF> that 's coming out of <COREF ID="set_8">software</COREF> from <COREF ID="set_3">India</COREF> now is exceeding <COREF ID="set_5">the quality</COREF> coming out of <COREF ID="set_8">software</COREF> from the United States .
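A sketch of why string matching goes wrong here. The first function follows the general idea of the Soon et al string-match feature (compare the two noun phrases after dropping articles and demonstratives); the second is a hypothetical stricter variant, added only to show that the postmodifiers are what distinguish the two mentions.

```python
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


def strip_determiners(noun_phrase):
    """Lower-case the tokens of a noun phrase and drop articles and demonstratives."""
    return [t.lower() for t in noun_phrase.split() if t.lower() not in DETERMINERS]


def str_match(np1, np2):
    """String-match feature: the NPs are identical once determiners are removed."""
    return strip_determiners(np1) == strip_determiners(np2)


def str_match_with_postmodifiers(np1, post1, np2, post2):
    """Hypothetical variant: if both mentions carry a postmodifier, require them to match too."""
    if not str_match(np1, np2):
        return False
    if post1 and post2:
        return post1.lower() == post2.lower()
    return True


first = ("The quality", "that 's coming out of software from India")
second = ("the quality", "coming out of software from the United States")

print(str_match(first[0], second[0]))                 # plain feature fires
print(str_match_with_postmodifiers(*first, *second))  # stricter variant does not
```

Exact comparison of postmodifier strings is far too crude for real use; the point is only that the information needed to block the link lies outside the matched string.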

More complex cases of modification

The cabinet was actually made for [a Polish general, [Count Jan Klemens Branicki]]. An inventory of Count Branicki's possessions made at his death describes both the corner cupboard and the objects displayed on its shelves: a collection of mounted Chinese porcelain and clocks, some embellished with porcelain flowers. The drawing of the corner cupboard, or more probably an engraving of it, must have caught Branicki's attention.

Dubois was commissioned through a Warsaw dealer to construct the cabinet for THE POLISH ARISTOCRAT.

More advanced reasoning

Lexical & encyclopedic knowledge in anaphora resolution

Arguably the most important source of errors, especially for nominals

Importance of this sort of information perceived very early on (e.g., Charniak and Winograd’s use of frame-based representations to model simple inferences)

But methods used to deploy it too labour-intensive and hard to evaluate

Current challenge: to find ways of extracting and deploying this knowledge automatically

Formal models

To recover the preferred interpretation of “John hid Bill’s car keys. He was drunk” (Hobbs, 1979)

You need to (defeasibly) infer:

That utterances may be used to explain previous statements by providing a cause for certain events;

That somebody being drunk may cause other people not to want that person to drive;

That one way to prevent somebody driving is to hide that person’s car keys

These analyses plausible …

… and implementable (e.g., in Hobbs’ TACITUS, SRI’s Core Language Engine)

… But very brittle

Need theories of ‘robust inferencing’

Also: need to develop ways of acquiring such knowledge

Most work in the past ten years has focused on this.

Alternatives to hand-coding for a specific domain

Using an existing resource, even if hand-coded

E.g., WordNet: Harabagiu, Poesio & Vieira

Extracting knowledge from corpora

Extracting knowledge from resources such as lexica and encyclopedias

Recently: Wikipedia

Using WordNet

WordNet was the first publicly available repository of ‘semantic-network’-style knowledge

Good coverage of hyponymy relations (around 60%)

Not so good for meronymy. Also, knowledge in WordNet often hard to find.

The case of HOUSE

[Figure: fragment of the WordNet hierarchy around HOUSE, with the nodes ARTIFACT, HOUSING, BUILDING, HOUSE, HOME, ROOM, WALL and FLOOR connected by IS-A and PART-OF links]

Extracting lexical information from corpora for anaphora resolution

Using vector-based representations

Poesio et al 1998: LSA-style vector models for synonymy

Using patterns

UNSUPERVISED RELATION EXTRACTION USING PATTERNS

HEARST 1998: HYPONYMY

NP {, NP}* {,} or other NP

bruises …… broken bones, and other INJURIES

HYPONYM (bruise, injury)

POESIO ET AL, 2002: MERONYMY

the N of the N is ….

the WHEEL of the CAR is …

MERONYM (wheel, car)

Used for anaphora resolution by Markert & Nissim (2005), Versley (2006, 2007) (German)
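The two patterns above can be approximated with regular expressions over raw text. This is a minimal sketch under strong assumptions: no parsing, no lemmatization (so it yields `bruises`/`injuries` rather than `bruise`/`injury`), and the last word of each list item is taken as its head. The names `hyponym_pairs` and `meronym_pairs` are illustrative.

```python
import re

# "NP {, NP}* {,} and/or other NP", approximated on raw text
OTHER_PATTERN = re.compile(r"([\w ,]+?),? (?:and|or) other (\w+)", re.IGNORECASE)
# "the N of the N"
OF_PATTERN = re.compile(r"\bthe (\w+) of the (\w+)", re.IGNORECASE)


def hyponym_pairs(text):
    """Return (hyponym, hypernym) pairs proposed by the 'and/or other' pattern."""
    pairs = []
    for match in OTHER_PATTERN.finditer(text):
        hypernym = match.group(2).lower()
        for item in match.group(1).split(","):
            words = item.split()
            if words:  # crude head heuristic: last word of the list item
                pairs.append((words[-1].lower(), hypernym))
    return pairs


def meronym_pairs(text):
    """Return (part, whole) pairs proposed by the 'the N of the N' pattern."""
    return [(m.group(1).lower(), m.group(2).lower()) for m in OF_PATTERN.finditer(text)]


print(hyponym_pairs("bruises, broken bones, and other injuries"))
print(meronym_pairs("the WHEEL of the CAR is round"))
```

On a large corpus or the Web, the number of times a pair is matched can then serve as evidence for the relation, which is roughly how such patterns are used as features for resolving nominals.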

Experiment: adding Web patterns to Soon et al baseline

Versley, 2007 (Johns Hopkins workshop)

Using web patterns to resolve references to nominals

Methods

Results (qualitative)

Using information about semantic relations

(Bean and Riloff, NAACL 04)

Knowledge about semantic relations

Eckert & Strube 2001, Byron 2002: distinguish between abstract and concrete references

Dagan et al 1997, Kehler et al 2004: selectional restrictions for pronoun resolution

Encyclopedic knowledge in IDC

[The FCC] took [three specific actions] regarding [AT&T]. By a 4-0 vote, it allowed AT&T to continue offering special discount packages to big customers, called Tariff 12, rejecting appeals by AT&T competitors that the discounts were illegal. ….. …..[The agency] said that because MCI's offer had expired AT&T couldn't continue to offer its discount plan.

Why Wikipedia may help address the encyclopedic knowledge problem

http://en.wikipedia.org/wiki/FCC:

The Federal Communications Commission (FCC) is an independent United States government agency, created, directed, and empowered by Congressional statute (see 47 U.S.C. § 151 and 47 U.S.C. § 154).
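One way such a definition could be exploited is to read the head noun off the copular first sentence and compare it with the head of the anaphor (here, "agency"). This is a rough sketch only: the regular expression and the "last word before the first punctuation mark" heuristic are assumptions for illustration, and a postmodified definition ("a city in France") would need a parser.

```python
import re

# "X is a/an/the <description>" up to the first punctuation mark
DEFINITION = re.compile(r"\bis (?:an?|the) ([^,.;()]+)")


def definition_head(first_sentence):
    """Return the last word of the copular description, or None if there is none."""
    match = DEFINITION.search(first_sentence)
    if match is None:
        return None
    return match.group(1).split()[-1].lower()


def compatible(first_sentence, anaphor_head):
    """True if the anaphor's head noun matches the head of the definition."""
    return definition_head(first_sentence) == anaphor_head.lower()


FCC = ("The Federal Communications Commission (FCC) is an independent "
       "United States government agency, created, directed, and empowered "
       "by Congressional statute (see 47 U.S.C. § 151 and 47 U.S.C. § 154).")

print(definition_head(FCC))       # head noun of the definition
print(compatible(FCC, "agency"))  # [The FCC] ... [The agency]
```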

Using Wikipedia for anaphora resolution

Using category structure

Using disambiguation pages

Using links

Wikipedia: using categories

Problems with categories

Category structure is becoming unusable (too many categories)

Two solutions:

Find ways of pruning the category tree

Use other types of information in Wikipedia

Ponzetto & Strube 2007: pruning category tree


Other types of information

Redirects

Aliases

Lists

Gender

Results (ACE-02 BNews)

            Recall   Precision   F-Score
SoonEtAl    51.2%    69.9%       59.1%
WikiCats    56.8%    68.0%       61.9%
WikiOther   56.5%    67.9%       61.7%
All         60.7%    65.4%       63.0%
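The F-scores reported above are the harmonic mean of recall and precision, which can be checked with a few lines. The recall and precision figures are copied from the slide; the function and variable names are illustrative.

```python
def f_score(recall, precision):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (recall, precision) in percent, as reported for ACE-02 BNews
RESULTS = {
    "SoonEtAl": (51.2, 69.9),
    "WikiCats": (56.8, 68.0),
    "WikiOther": (56.5, 67.9),
    "All": (60.7, 65.4),
}

for system, (recall, precision) in RESULTS.items():
    print(f"{system:<10} F = {f_score(recall, precision):.1f}%")
```

The computed values agree with the reported column to one decimal place. Note the pattern in the rows: the Wikipedia features raise recall by several points at a smaller cost in precision.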

From mentions to entities

<ANAPHOR (j), ANTECEDENT (i)>

Soon et al: <mention (j), mention (i)>

Global models: <mention (j), entity (coref chain) (i)>

Entity features

Global models make it possible to use information about entities / coref chains to decide on antecedent

Culotta et al: first-order features

E.g., Agree in gender with all mentions in chain / most mentions in chain / some mention in chain

Finkel and Manning: transitivity
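The gender example of a first-order feature can be sketched as follows: instead of comparing the new mention with a single candidate mention, the feature quantifies over every mention already in the candidate chain. The feature names, the gender labels and the assumption of a non-empty chain are illustrative, not taken from Culotta et al.

```python
def gender_agreement_features(mention_gender, chain_genders):
    """Entity-level features: agreement with all / most / some mentions of a chain.

    Assumes `chain_genders` is non-empty.
    """
    agreements = [gender == mention_gender for gender in chain_genders]
    return {
        "agrees_with_all": all(agreements),
        "agrees_with_most": sum(agreements) > len(agreements) / 2,
        "agrees_with_some": any(agreements),
    }


# A chain with one mention whose gender was (perhaps wrongly) tagged differently
chain = ["fem", "fem", "masc"]
print(gender_agreement_features("fem", chain))
```

A mention-pair model comparing the new mention only with the `masc` mention would see a plain disagreement; the entity-level view shows that the chain as a whole is mostly compatible.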

Tools for AR

Java-RAP (pronouns)

GUITAR (Kabadjov, 2007)

BART

See labs

GuiTAR: A general-purpose tool

XML in / XML out

Implemented in Java

Data- and Application-level integration into NLP systems

New preprocessing modules / AR algorithms can easily be integrated

Version 3.03:

Mitkov’s pronoun resolution algorithm (1998)

Vieira and Poesio’s algorithm for definite description (DD) resolution (2000) + statistical discourse-new (DN) classifier

Bontcheva et al algorithm for NE resolution

MAS-XML EXAMPLE

<s> …
  <ne id="ne139" cat="the-np" per="per3" gen="neut" num="plur">
    <W P="DT">The</W>
    <mod id="m89" type="pre">
      <W P="JJ">fragile</W>
    </mod>
    <nphead>
      <W P="NNS">eggs</W>
    </nphead>
  </ne>
… </s>

<ante current="ne139" rel="ident">
  <anchor antecedent="ne112" />
</ante>
```
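A sketch of how an application might read this kind of markup with Python's standard library. The element and attribute names are those of the example above; the `<doc>` wrapper, the omission of the elided material and the function names are assumptions, and this is not GuiTAR's own API.

```python
import xml.etree.ElementTree as ET

# Self-contained sample modelled on the slide; <doc> is an assumed root element.
SAMPLE = """<doc>
<s><ne id="ne139" cat="the-np" per="per3" gen="neut" num="plur">
<W P="DT">The</W><mod id="m89" type="pre"><W P="JJ">fragile</W></mod>
<nphead><W P="NNS">eggs</W></nphead></ne></s>
<ante current="ne139" rel="ident"><anchor antecedent="ne112" /></ante>
</doc>"""


def markables(root):
    """Map each <ne> id to the text of the words it contains."""
    return {ne.get("id"): " ".join(w.text for w in ne.iter("W")) for ne in root.iter("ne")}


def links(root):
    """List (anaphor id, relation, antecedent id) triples from <ante>/<anchor>."""
    return [
        (ante.get("current"), ante.get("rel"), anchor.get("antecedent"))
        for ante in root.iter("ante")
        for anchor in ante.iter("anchor")
    ]


root = ET.fromstring(SAMPLE)
print(markables(root))
print(links(root))
```

Keeping the anaphoric links in separate `<ante>` elements that point at markable ids means the resolver's output can be added to a document without touching the mention markup.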

A modular architecture

A generic discourse model for anaphora resolution

GUITAR STRENGTHS / WEAKNESSES

Strengths

Well engineered

Modular

Nice discourse model

Interface to lexical knowledge

Weaknesses

Very basic learning architecture

No commonsense knowledge

Algorithms other than DD resolution do not achieve high performance

Readings and references

Johns Hopkins ELERFED workshop: www.clsp.jhu.edu/ws2007/groups/elerfed/

GUITAR: dces.essex.ac.uk/research/nle/GuiTAR/