Inference Protocols for Coreference Resolution Kai-Wei Chang, Rajhans Samdani, Alla Rozovskaya, Nick Rizzolo, Mark Sammons, and Dan Roth
This research is supported by ARL, and by DARPA, under the Machine Reading Program
Coreference
Coreference Resolution is the task of grouping all the mentions of entities into equivalence classes so that each class represents a discourse entity. In the example below, the mentions are colour-coded to indicate which mentions are co-referent (overlapping mentions have been omitted for clarity).
An American official announced that American President Bill Clinton met his Russian counterpart, Vladimir Putin, today. The president said that Russia was a great country.
System Architecture
CoNLL Shared Task 2011
Experiments and Results

Results on DEV set with predicted mentions:

Method               MD     MUC    BCUB   CEAF   AVG
Best-Link            64.70  55.67  69.21  43.78  56.22
Best-Link w/ Const.  64.69  55.80  69.29  43.96  56.35
All-Link             63.30  54.56  68.50  42.15  55.07
All-Link w/ Const.   63.39  54.56  68.46  42.20  55.07

Results on DEV set with gold mentions:

Method               MUC    BCUB   CEAF   AVG
Best-Link            80.58  75.68  64.68  73.65
Best-Link w/ Const.  80.56  75.02  64.24  73.27
All-Link             77.72  73.65  59.17  70.18
All-Link w/ Const.   77.94  73.43  59.47  70.28

Official scores on TEST set:

Task                                                      MD     MUC    BCUB   CEAF   AVG
Pred. Mentions w/ Pred. Boundaries (Best-Link w/ Const.)  64.88  57.15  67.14  41.94  55.96
Pred. Mentions w/ Gold Boundaries (Best-Link w/ Const.)   67.92  59.75  68.65  41.42  56.62
Gold Mentions (Best-Link)                                 -      82.55  73.70  65.24  73.83
Discussion and Conclusions
Mention Detection
Coreference Resolution with Pairwise Coreference Model
• Inference Procedure
• Best-Link Strategy
• All-Link Strategy
• Knowledge-based Constraints
Post-Processing
We perform coreference resolution on the OntoNotes-4.0 data set. Our system is based on Bengtson and Roth (2008) and built on Learning Based Java (Rizzolo and Roth, 2010). We participated in the “closed” track of the shared task.
Compared to the ACE 2004 Corpus, the OntoNotes-4.0 data set has two main differences:
• Singleton mentions are not annotated.
• The largest logical span is taken to represent a mention.
We design a high-recall (~90%), low-precision (~35%) rule-based mention detection system.
As a post-processing step, we remove all predicted mentions that remain in singleton clusters after the inference stage.
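The singleton-removal step above amounts to filtering the clusters produced by inference. A minimal sketch, assuming clusters are represented as lists of mention indices (the representation here is illustrative, not the system's actual data layout):

```python
# Post-processing sketch: drop predicted mentions that remain in
# singleton clusters after inference (singletons are not annotated
# in OntoNotes-4.0, so they would only hurt precision).

def remove_singletons(clusters):
    """Keep only clusters containing two or more mentions."""
    return [c for c in clusters if len(c) >= 2]
```

For example, `remove_singletons([[0, 1], [2], [3, 4, 5]])` keeps only the two multi-mention clusters.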
The system achieves a mention detection F1 score of 64.88% on the TEST set.
Mention Detection
Pairwise Mention Score
Knowledge-based Constraints
Inference Procedure
The inference procedure takes as input a set of pairwise mention scores over a document and aggregates them into globally consistent cliques representing entities. We investigate two techniques: Best-Link and All-Link.
Best-Link Inference: For each mention, Best-Link considers the best mention on its left to connect to (according to the pairwise score) and creates a link between them if the pairwise score is above some threshold.
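The Best-Link strategy can be sketched as follows. This is a minimal illustration, assuming pairwise scores arrive as a dict keyed by mention-index pairs; the function names and data layout are not the system's actual implementation:

```python
# Best-Link inference sketch: each mention links to its highest-scoring
# preceding mention if that score clears the threshold; the resulting
# links are closed into entity clusters with union-find.

def best_link(n_mentions, scores, threshold=0.0):
    """scores: dict mapping (u, v) with u < v to a pairwise score.
    Returns entity clusters as sorted lists of mention indices."""
    parent = list(range(n_mentions))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for v in range(n_mentions):
        # Best preceding mention for v, if any.
        candidates = [(scores.get((u, v), float("-inf")), u) for u in range(v)]
        if not candidates:
            continue
        best_score, best_u = max(candidates)
        if best_score > threshold:
            union(best_u, v)

    clusters = {}
    for m in range(n_mentions):
        clusters.setdefault(find(m), []).append(m)
    return sorted(clusters.values())
```

For instance, with `scores = {(0, 1): 0.8, (1, 2): -0.5, (0, 3): 0.3}`, mentions 0, 1, and 3 form one entity while mention 2 stays alone.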
All-Link Inference: The All-Link inference approach scores a clustering of mentions by including all possible pairwise links in the score. It is also known as correlation clustering (Bansal et al., 2002).
ILP Formulation: Both Best-Link and All-Link can be written as an integer linear programming (ILP) problem.
Here w_uv is the compatibility score of a pair of mentions and y_uv is a binary variable indicating whether the pair is linked. For the All-Link clustering, we drop one of the three transitivity constraints for each triple of mention variables. Similar to Denis and Baldridge (2009), we observe that this improves accuracy.
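The two objectives can be written in a standard form consistent with the description of w_uv, y_uv, and the transitivity constraints above (the exact formulation used by the system may differ in details):

```latex
% Best-Link: each mention selects at most one preceding antecedent.
\max_{y} \sum_{u<v} w_{uv}\, y_{uv}
\quad \text{s.t.} \quad \sum_{u<v} y_{uv} \le 1 \;\; \forall v,
\qquad y_{uv} \in \{0,1\}

% All-Link: links must be transitive, so they form consistent clusters.
\max_{y} \sum_{u<v} w_{uv}\, y_{uv}
\quad \text{s.t.} \quad y_{uw} \ge y_{uv} + y_{vw} - 1 \;\; \forall u<v<w,
\qquad y_{uv} \in \{0,1\}
```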
The pairwise mention score w_uv = w · φ(u, v) + c(u, v) − t combines:
• a weight vector w learned from training data,
• extracted features φ(u, v),
• a compatibility score c(u, v) given by constraints, and
• a threshold parameter t (to be tuned).
Training Procedure
We use the same features as Bengtson and Roth (2008) with the knowledge extracted from OntoNotes-4.0.
We explored two types of learning strategies, which can be used to learn w in Best-Link and All-Link. The choice of a learning strategy depends on the inference procedure.
1. Binary Classification: Following Bengtson and Roth (2008), we learn the pairwise scoring function w on:
Positive examples: for each mention u, we construct a positive example (u, v), where v is the closest preceding mention in u’s equivalence class.
Negative examples: all mention pairs (u, v), where v is a preceding mention of u and u, v are in different classes.
As singleton mentions are not annotated, the sample distributions in the training and inference phases are inconsistent. We therefore apply the mention detector to the training set and train the classifier on the union of gold and predicted mentions.
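The example construction described above can be sketched as follows; the mention and cluster representations are illustrative assumptions, not the system's actual data structures:

```python
# Sketch of training-pair extraction for the binary classifier:
# positive = (u, closest preceding mention in u's gold class);
# negative = (u, v) for every preceding v in a different class.

def make_training_pairs(mentions, cluster_of):
    """mentions: mention indices in document order.
    cluster_of: dict mapping mention -> gold cluster id."""
    positives, negatives = [], []
    for i, u in enumerate(mentions):
        closest_coref = None
        for v in reversed(mentions[:i]):  # preceding mentions, nearest first
            same = (cluster_of.get(u) is not None
                    and cluster_of.get(u) == cluster_of.get(v))
            if same and closest_coref is None:
                closest_coref = v          # closest coreferent antecedent
            elif not same:
                negatives.append((u, v))   # different class -> negative
        if closest_coref is not None:
            positives.append((u, closest_coref))
    return positives, negatives
```

Note that same-class pairs other than the closest one yield neither positive nor negative examples, matching the sampling scheme of Bengtson and Roth (2008).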
2. Structured Learning (Structured Perceptron): We present a structured perceptron algorithm, similar to the supervised clustering algorithm of Finley and Joachims (2005), to learn w.
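A perceptron-style update of this kind can be sketched as follows. The feature extractor, inference routine, and data layout are illustrative assumptions; this is the generic scheme, not the system's exact algorithm:

```python
# Structured perceptron sketch (in the spirit of Finley and Joachims,
# 2005): run inference with the current w, then move w toward features
# of gold links and away from features of wrongly predicted links.

def structured_perceptron(docs, phi, infer, n_features, epochs=5, lr=1.0):
    """docs: list of (mentions, gold_links), links being sets of (u, v).
    phi(u, v) -> feature vector (list of floats).
    infer(mentions, w) -> predicted set of links under the current w."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for mentions, gold_links in docs:
            pred_links = infer(mentions, w)
            for (u, v) in gold_links - pred_links:   # missed links
                for j, x in enumerate(phi(u, v)):
                    w[j] += lr * x
            for (u, v) in pred_links - gold_links:   # spurious links
                for j, x in enumerate(phi(u, v)):
                    w[j] -= lr * x
    return w
```

Because the update is driven by the same inference procedure used at test time, the learned w is tailored to the chosen clustering strategy, which is why the choice of learning strategy depends on the inference procedure.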
We define three high precision constraints that improve recall on NPs with definite determiners and mentions whose heads are Named Entities. Examples of mention pairs that are correctly linked by the constraints are: [Governor Bush] and [Bush],[a crucial swing state, Florida] and [Florida], [Sony itself] and [Sony].
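One constraint of this flavour can be illustrated as follows; the predicate, mention representation, and matching rule are hypothetical simplifications, not the system's actual constraints:

```python
# Illustrative high-precision constraint: link a named-entity mention to
# an earlier named-entity mention when their heads match, as in
# [Governor Bush] -> [Bush] or [Sony itself] -> [Sony].

def ne_head_match(earlier, later):
    """earlier/later: dicts with 'head' (head word) and 'is_ne' flags
    (an assumed representation). Returns True when the constraint
    confidently links the pair."""
    return (earlier["is_ne"] and later["is_ne"]
            and earlier["head"].lower() == later["head"].lower())
```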
Contributions
• We investigate two inference methods: Best-Link and All-Link.
• We provide a flexible architecture for incorporating constraints.
• We compare and evaluate the two inference approaches and the contribution of constraints.
Best-Link outperforms All-Link. This raises a natural algorithmic question about which clustering style is inherently best suited to coreference resolution, and about possible ways of infusing more knowledge into different coreference clustering styles.
Constraints improve recall on a subset of mentions. Other common system errors might be fixed by additional constraints.
Our approach accommodates the infusion of knowledge via constraints. We have demonstrated its utility in an end-to-end coreference system.