The extension of the Anaphora Resolution Exercise (ARE)
to Spanish and Catalan
Constantin Orasan, University of Wolverhampton, UK
and
Marta Recasens, Universitat de Barcelona, Spain
Structure
1. Description of ARE2007
2. English corpus used in ARE2007
3. The AnCora corpora
4. Adapting the AnCora corpora for ARE2009
5. Plans for ARE2009
The Anaphora Resolution Exercises (AREs)
- the goal of ARE was to "develop discourse anaphora resolution methods and to evaluate them in a common and consistent manner"
- organised in conjunction with the DAARC conferences
- conceived as multilingual evaluations, not restricted to pronominal anaphora and NP coreference
Do we need a roadmap?
ARE2007
- organised in conjunction with DAARC2007
- only English texts
- very short time to organise it
- can be considered a dry-run for ARE2009
- focused on 4 tasks
- 3 participants, 8 runs submitted
- we used the NP4E corpus, a corpus of newswire texts
Task 1: Pronominal resolution on pre-annotated texts
- resolve pronouns to NPs
- participants received the pronouns to be resolved and the NP candidates
Input text
[Israeli-PLO relations]1 have hit [a new low]2
with [the Palestinian Authority]3 saying [Israel]5
is wrong to think [it]6 can treat [the Authority]7
like [a client militia]8.
Pronouns to resolve
6 = it
Output
(6, 5) = (Israel, it)
Evaluation method for task 1
success rate (accuracy) defined as the number of correctly resolved anaphoric pronouns divided by the total number of anaphoric pronouns
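The definition above is straightforward to compute; a minimal sketch in Python, where the mapping format (pronoun id → antecedent id, as in the example output) is an illustrative assumption rather than the official submission format:

```python
def success_rate(system, gold):
    """Success rate (accuracy): number of correctly resolved anaphoric
    pronouns divided by the total number of anaphoric pronouns."""
    correct = sum(1 for pron, ant in system.items() if gold.get(pron) == ant)
    return correct / len(gold)

# Toy example mirroring Task 1: pronoun 6 ("it") resolved to NP 5 ("Israel")
gold = {6: 5}
system = {6: 5}
print(success_rate(system, gold))  # 1.0
```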
Task 2: Coreferential chains resolution on pre-annotated texts
- assign NPs to chains
- participants received texts in which NPs belonging to a chain with at least two elements were annotated
Input text
[Israeli-PLO relations]1 have hit a new low
with [the Palestinian Authority]2 saying [Israel]3 is wrong to think [it]4 can treat [the Authority]5 like a client militia.
Output
Chain 1: 3 = Israel, 4 = it
Chain 2: 2 = the Palestinian Authority, 5 = the Authority
Evaluation method for Task 2
- precision and recall as defined by MUC
- only one system participated: precision 53.01%, recall 45.72%, F-measure 48.32%
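The MUC scorer referred to here is the link-based measure of Vilain et al. (1995). A minimal sketch, representing chains as sets of mention ids: recall counts, for each gold chain S, the links missing after partitioning S by the system chains; precision swaps the roles of gold and system.

```python
def muc_score(gold_chains, sys_chains):
    """MUC link-based scores. recall = sum(|S| - |p(S)|) / sum(|S| - 1)
    over gold chains S, where p(S) is the partition of S induced by the
    system chains; precision is the same with the roles swapped."""
    def link_score(key, response):
        num = den = 0
        for chain in key:
            covered, parts = set(), 0
            for resp in response:
                if chain & resp:
                    parts += 1
                    covered |= chain & resp
            parts += len(chain - covered)  # unresolved mentions are singleton parts
            num += len(chain) - parts
            den += len(chain) - 1
        return num / den if den else 0.0
    r = link_score(gold_chains, sys_chains)
    p = link_score(sys_chains, gold_chains)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

print(muc_score([{1, 2, 3}], [{1, 2}, {3}]))  # precision 1.0, recall 0.5
```

Splitting a three-mention gold chain into {1,2} and {3} loses one of its two links, hence recall 0.5, while every system link is correct, hence precision 1.0.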
Task 3: Pronominal resolution on raw texts
- unannotated texts were given to participants
- systems had to:
  - determine the referential pronouns
  - determine the NP candidates
  - resolve pronouns to NPs
Input text
Japan26 and27 Peru28 on29 Saturday30 took31 a32 tough33 stand34
…their45 accord46 was47 swiftly48 ...
Output text
(45 – 45, 26 – 28): (their, Japan and Peru)
Task 4: Coreferential chains resolution on raw texts
- unannotated texts were given to participants
- systems had to determine the coreferential NPs and assign them to chains
- the most popular task (3 runs submitted)
Input text
Japan26 and27 Peru28 on29 Saturday30 took31 a32 tough33 stand34
…their45 accord46 was47 swiftly48
Output text
(26 – 28 , 45 – 45, …): (Japan and Peru, their, …)
Evaluation of Tasks 3 and 4
- system output is matched against the gold standard using an overlap measure (presented graphically in the original slides)
- Task 4: MUC scores modified to use the overlap metric
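The slides do not give the overlap formula itself. As an illustration only, one common choice is a Jaccard-style score over token offsets, intersection over union of the two spans; this is an assumption, not necessarily ARE's actual metric:

```python
def span_overlap(a, b):
    """Overlap between two (start, end) inclusive token spans, scored as
    |intersection| / |union|. Assumed formula for illustration; the exact
    ARE metric is not specified in the slides."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union

# gold markable "Japan and Peru" = tokens 26-28, a system span of 26-27
print(span_overlap((26, 28), (26, 27)))  # 2/3
```

Such a measure gives partial credit when a system markable only partly matches the gold markable, instead of the score of 0 an exact-match comparison would give.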
Rationale for the tasks
- Tasks 1 and 3: evaluation of pronominal anaphora resolution
- Tasks 2 and 4: evaluation of coreference resolution
- Tasks 1 and 2: evaluation of algorithms
- Tasks 3 and 4: evaluation of fully automatic systems
Corpus used in ARE2007
- we used the NP4E corpus (Hasler et al., 2006): over 55,000 words of newswire texts
- five clusters of related documents
- annotation in two steps: identification of markables, then identification of relations between markables
- annotation done using PALinkA (Orasan, 2003)
Markables
- all NPs at all levels, regardless of whether they are coreferential or not
- include all modifiers (both pre- and post-modifiers)
- possessive pronouns and possessors
- no relative pronouns or relative clauses
- no NPs from fixed expressions (in town, on board, etc.)
COREF and UCOREF
- only nominal identity of reference
- direct anaphoric expressions
- relations marked: identity, synonymy, generalisation, specialisation
Coreferential links
- based on lexical choice rather than concept (e.g. the house… the door)
Coreferential links
- definite NPs in copular relation: [the blast] was [the worst attack on [civilians] on [U.S. soil]]
- definite appositives: [Zaire Airlines, [the main commercial airline in [Zaire]]]
- text in brackets
- I, you, we in speech coreferential to their antecedents
AnCora corpora
- ANnotated CORporA for Catalan and Spanish
- newspaper and newswire texts, 500,000 words each
- annotated with:
  - PoS tags and lemmas
  - constituents and functions
  - argument structures, thematic roles
  - named entities
  - nominal WordNet synsets
  - coreference relations
AnCora corpora
XML in-line annotation
Markables (syntactic nodes):
- NPs: <sn> ... </sn> + <sn elliptic="yes"/>
- clitics: <v> darles </v>
- clauses, sentences: <S> ... </S>
Attributes:
- entity="entity#"
- coreftype="ident/pred/dx"
Example: AnCora-Ca
"[L' aeroport] ha d' anar amb [compte] amb
[els sorolls]. [Ø] Ha de comportar -se com [un
bon veí]", va recomanar [Morlanes] ...
Malgrat [les diferències entre [AENA i veïns
de [Gavà_Mar]]]
Example: AnCora-Ca
<sn entity="entity3"> "L' aeroport </sn> ha d' anar
amb compte amb els sorolls. <sn elliptic="yes"
entity="entity3" coreftype="ident"/> Ø Ha de
comportar -se com un bon veí", va recomanar
Morlanes ... Malgrat les diferències entre <sn
entity="entity3" coreftype="ident"> AENA </sn> i
veïns de Gavà_Mar
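Because the annotation is in-line XML, the coreference chains can be recovered by grouping <sn> nodes by their entity attribute. A sketch over a small well-formed fragment modelled on the example above (real AnCora files carry many more attributes and nested structure):

```python
import xml.etree.ElementTree as ET

# Minimal fragment modelled on the AnCora-Ca example; attribute names
# entity / coreftype / elliptic are taken from the slides.
doc = """<doc>
  <sn entity="entity3">L' aeroport</sn> ha d' anar amb compte amb els
  sorolls. <sn elliptic="yes" entity="entity3" coreftype="ident"/> Ha de
  comportar-se com un bon veí, va recomanar Morlanes ... Malgrat les
  diferències entre <sn entity="entity3" coreftype="ident">AENA</sn> i
  veïns de Gavà_Mar
</doc>"""

chains = {}
for sn in ET.fromstring(doc).iter("sn"):
    ent = sn.get("entity")
    if ent:
        # elliptic subjects are empty elements; render them as Ø
        text = (sn.text or "").strip() or "Ø"
        chains.setdefault(ent, []).append(text)

print(chains)  # {'entity3': ["L' aeroport", 'Ø', 'AENA']}
```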
Identity of reference
Identity, synonymy
l’Ajuntament de Tarragona ... l’Ajuntament
los usuarios de la Red en EEUU ... los internautas estadounidenses
Generalisation, specialisation
los precios del café ... los precios
Metonymy
los conductores de camiones ... los camiones no hacen caso de los agentes
Identity of reference
- different scope of generics: las mujeres de España ... las mujeres
- place boundness: In Garraf, the unemployment rate ... it is higher in Lleida
- time boundness: el Festival de la Música Viva ... aquesta edició
- unrealized entities: Si hay [un fan de [Georgie_Fame], o de [Gary_Brooker] o de [Albert_Lee]], [Ø] puede estar a [dos metros de [él]]
AnCora vs. NP4E
Similarities
- all NPs (modifiers, embedded NPs, coordinated NPs), e.g.:
  [passengers on [a flight from [Moscow] to [Nigeria]]]
  [la existencia de [una fuerte división] en [esta institución]]
  [[McVeigh] and [Nichols]], [[el PP] y [el PSOE]]
  [[Barcelona] airport] vs. [l' aeroport de [Barcelona]]
- no NPs that are part of fixed expressions, e.g. came to power, subió al poder
- identity relation
- predicative relation: copular, apposition
- no identity of sense: China-org vs. China-loc
- no pleonastic pronouns
AnCora vs. NP4E
Differences
- zero elements: elliptical subjects
- clitic pronouns, e.g. give them = (Spanish) darles / (Catalan) donar-les
- relative pronouns
- discourse deixis
- no possessive pronouns
- split antecedents
Preparation of Catalan and Spanish data for ARE2009
- there are many similarities between the guidelines used for the AnCora and NP4E corpora
- features that are too specific will be discarded
- AnCora will be converted to the light XML annotation used in ARE2007
- we hope not to encounter major problems during the actual conversion
Lessons learnt from ARE2007
- if possible, more evaluation methods and more baselines
- a better overlap metric
- participants want more time (lots of interest, but the evaluation clashed with some major conferences)
- participants want to be able to publish
Better overlap metric
- no head/MIN attribute for the markables
- as a result, the same system obtained better results on Task 4 than on Task 2
- example (gold standard vs. output of the system): Task 2 score: 0; Task 4 score: 0.xxxx
- solution: insert a MIN attribute
Plans for ARE2009
- include 4 languages: Catalan, Dutch, English, and Spanish
- keep the 4 tasks
- include a multilingual task for pronominal anaphora resolution
- evaluate some preprocessing stages for anaphora resolution
- have a real-time task
Preprocessing tasks
1. Identification of pleonastic it pronouns in English texts
2. Identification of pleonastic het pronouns in Dutch
3. Identification of elliptical subjects in Spanish and Catalan
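As an illustration of the first preprocessing task, a toy pattern-based detector for pleonastic it in English. The pattern list is purely illustrative (weather verbs and extraposition cues); real systems would use richer patterns or a trained classifier:

```python
import re

# Toy heuristic: "it" followed by a copula/raising verb and either a
# weather predicate or an extraposition cue ("that"/"to" clause).
PLEONASTIC = re.compile(
    r"\bit\s+(?:is|was|seems?|appears?)\s+"
    r"(?:raining|snowing|(?:\w+\s+)?(?:that|to)\b)",
    re.IGNORECASE,
)

def looks_pleonastic(sentence):
    """Return True if the sentence matches a pleonastic-'it' pattern."""
    return bool(PLEONASTIC.search(sentence))

print(looks_pleonastic("It seems that the talks failed."))  # True
print(looks_pleonastic("Israel thinks it can treat the Authority."))  # False
```

Referential uses of it, as in the second sentence, do not match the patterns and would be passed on to the resolver.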
NP anaphora and NP coreference resolution tasks
         Catalan  Dutch  English  Spanish
Task 1   Yes      No     Yes      Yes
Task 2   Yes      No     Yes      Yes
Task 3   Yes      Yes    Yes      Yes
Task 4   Yes      Yes    Yes      Yes
Multilingual task for pronominal anaphora resolution
Is it possible to have a multilingual system?
- participants get a set of documents with paragraphs in Catalan, Dutch, English, and Spanish
- referential personal pronouns marked in all the texts
- candidate noun phrases not annotated
- a modified version of the success rate is used, which considers how correctly pronouns were resolved and in how many languages
- 350+50 pronouns per language
… but, more thinking necessary
Real time tasks
- invite DAARC participants to take part in a real-time exercise
- the same tasks as for the main ARE2009 exercise, but:
  - participants will need to bring their programs
  - they will have one hour to submit the results
  - the tasks may include some surprise texts
- … subject to interest from participants and the presence of the necessary infrastructure
Tentative timescale
14 Nov 2008 Preliminary call for participation
15 Jan 2009 Training data released
4 - 23 May 2009 Test data released (48 hours to submit results after downloading the test data)
30 May 2009 Results communicated back to participants
6 June 2009 4 page technical reports due from participants
20 June 2009 Reviews back to participants
1 July 2009 Final version of technical reports
5 - 6 Nov 2009 DAARC2009, Goa, India
Webpage: http://www.anaphora-and-coreference.info/ARE2009
Mailing list: [email protected]
Email: [email protected]
Thank you!