Automatic recognition of discourse relations Lecture 3.


Can RST analysis be done automatically?

In the papers we’ll read, the question is really about local rhetorical relations

Part of the problem is the availability of training data for automatic labelling

Manual annotation is slow and expensive

Lots of data can be cleverly collected, but is it appropriate? (SL’07 paper)

ME’02

Discourse relations are often signaled by cue phrases

CONTRAST: but

EXPLANATION-EVIDENCE: because

But not always. In a manually annotated corpus, 25% of contrast and explanation-evidence relations are marked explicitly by a cue phrase

• Mary liked the play, John hated it

• He wakes up early every morning. There is a construction site opposite his building.

Cleverly labeling data through patterns with cue phrases

CONTRAST
  [BOS…EOS][BOS But…EOS]
  [BOS…][but…EOS]
  [BOS…][although…EOS]
  [BOS Although…,][…EOS]

CAUSE-EXPLANATION
  [BOS…][because…EOS]
  [BOS Because…,][…EOS]
  [BOS…EOS][Thus,…EOS]

Extraction patterns

CONDITION
  [BOS If…,][…EOS]
  [BOS If…][then…EOS]
  [BOS…][if…EOS]

ELABORATION
  [BOS…EOS][BOS…for example…EOS]
  [BOS…][which…EOS]

NO-RELATION-SAME-TEXT

NO-RELATION-DIFF-TEXT
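The cue-phrase patterns can be approximated with a few string and regex checks. The following is a minimal sketch, not the extraction code from the papers: it covers only some of the patterns, assumes each input string is already one sentence (standing in for BOS/EOS), and assumes the cue phrase is dropped from the extracted spans. The names `INTERNAL_CUES`, `INITIAL_CUES`, `label_within_sentence` and `label_across_sentences` are hypothetical.

```python
import re

# Cue phrase -> relation for sentence-internal patterns: [BOS ...][cue ... EOS]
INTERNAL_CUES = {
    "but": "CONTRAST",
    "although": "CONTRAST",
    "because": "CAUSE-EXPLANATION",
    "if": "CONDITION",
}

# Cue phrase -> relation for cross-sentence patterns: [BOS ... EOS][BOS Cue ... EOS]
INITIAL_CUES = {
    "But": "CONTRAST",
    "Thus,": "CAUSE-EXPLANATION",
}

# Left span, optional comma, a whole-word cue, right span.
INTERNAL_RE = re.compile(r"^(.+?),?\s+(but|although|because|if)\s+(.+)$")


def label_within_sentence(sentence):
    """Split one sentence at its first internal cue phrase (cue is dropped)."""
    match = INTERNAL_RE.match(sentence)
    if match is None:
        return None
    left, cue, right = match.groups()
    return INTERNAL_CUES[cue], left, right


def label_across_sentences(first, second):
    """Label two adjacent sentences when the second opens with a cue phrase."""
    for cue, relation in INITIAL_CUES.items():
        if second.startswith(cue + " "):
            return relation, first, second[len(cue):].strip()
    return None
```

Sentences that match no pattern return `None`; in this sketch they would simply not be collected as training examples.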

Main idea

Pairs of words can trigger a given relation

John is good in math and sciences. Paul fails almost every class he takes.

Word pair example: embargo / legally

Features for classification: the Cartesian product of the words in the two text spans being annotated

Probability of word pairs given a relation

Choose the relation RLk that maximizes

log P(W1, W2 | RLk) + log P(RLk)
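A word-pair Naïve Bayes classifier of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation from ME’02: the likelihood of a span pair is taken as the product of the probabilities of all word pairs in the Cartesian product of the two spans, and add-alpha smoothing is an assumption introduced here to handle unseen pairs. The function names `train`, `score` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def train(examples):
    """examples: iterable of (relation, span1_tokens, span2_tokens)."""
    pair_counts = defaultdict(Counter)  # relation -> counts of (w1, w2) pairs
    relation_counts = Counter()         # relation -> number of examples
    vocab = set()                       # all word pairs seen in training
    for relation, span1, span2 in examples:
        relation_counts[relation] += 1
        for pair in product(span1, span2):
            pair_counts[relation][pair] += 1
            vocab.add(pair)
    return pair_counts, relation_counts, vocab


def score(relation, span1, span2, model, alpha=1.0):
    """log P(W1, W2 | relation) + log P(relation), with add-alpha smoothing."""
    pair_counts, relation_counts, vocab = model
    total_pairs = sum(pair_counts[relation].values())
    denom = total_pairs + alpha * (len(vocab) + 1)  # +1 slot for unseen pairs
    log_prior = math.log(relation_counts[relation] / sum(relation_counts.values()))
    log_likelihood = sum(
        math.log((pair_counts[relation][pair] + alpha) / denom)
        for pair in product(span1, span2)
    )
    return log_likelihood + log_prior


def classify(span1, span2, model):
    """Return the relation with the highest score for the two spans."""
    _, relation_counts, _ = model
    return max(relation_counts, key=lambda r: score(r, span1, span2, model))
```

For example, after training on one CONTRAST example containing the pair (good, fails) and one example of another relation, `classify(["good"], ["fails"], model)` picks CONTRAST because that pair has a higher smoothed probability under it.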

Classification results are well above the baseline

Using only content words did not seem to be very helpful

Model does not perform that well on manually annotated examples

Discussion

Would be interesting to see the list of the most informative word-pairs per relation

Is there an intrinsic difference in clauses explicitly marked for a relation compared to those where the relation is implicit?

B-GMR’07: offers several improvements over ME’02

Tokenizing and stemming
  Improves accuracy
  Reduces model size

Vocabulary size limit / minimum frequency
  Using the 6,400 most frequent words is best

Using a stoplist
  Performance deteriorates (as in the original ME’02 paper!)

Topic segmentation for better example collection
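The vocabulary limit and minimum-frequency cutoff can be illustrated as a simple filtering step applied before word pairs are formed. This is a sketch under assumptions: the helper names `build_vocab` and `restrict` are hypothetical, the default of 6,400 comes from the slide, and how the original system handled out-of-vocabulary words is not stated here (this sketch just drops them).

```python
from collections import Counter


def build_vocab(token_lists, max_size=6400, min_freq=1):
    """Keep the max_size most frequent words that occur at least min_freq times."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return {word for word, count in counts.most_common(max_size) if count >= min_freq}


def restrict(tokens, vocab):
    """Drop tokens outside the vocabulary (assumed handling of rare words)."""
    return [tok for tok in tokens if tok in vocab]
```

Restricting both spans this way shrinks the set of possible word pairs, which is one way the model size could be reduced.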

SL’07

Using automatically labeled examples to classify rhetorical relations

Is it a good idea?

The answer is no, as already hinted by the other papers

Two classifiers

Word-pair based Naïve Bayes

Multi-feature (41) BoosTexter model
  Positional
  Length
  Lexical
  POS
  Temporal
  Cohesion (pronouns and ellipsis)

Explicit note here, not in the previous papers

The distribution of different relations in the automatically extracted corpus does not reflect the true distribution

In all studies data is downsampled
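Downsampling to a balanced training set might look like the sketch below. The exact scheme used in the papers is not given on the slide; sampling every relation down to the size of the rarest one is an assumption made here for illustration, and the function name `downsample` is hypothetical.

```python
import random
from collections import defaultdict


def downsample(examples, seed=0):
    """Sample each relation down to the size of the rarest relation.

    examples: list of tuples whose first element is the relation label.
    """
    by_relation = defaultdict(list)
    for example in examples:
        by_relation[example[0]].append(example)
    target = min(len(group) for group in by_relation.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for relation in sorted(by_relation):
        balanced.extend(rng.sample(by_relation[relation], target))
    return balanced
```

A classifier trained on such a balanced sample sees a uniform prior over relations, which differs from the distribution in naturally occurring text.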

Testing on explicit relations

Results deteriorate for both machine learning approaches

Still better than random

Natural data does not seem suitable for training

Does not generalize well to examples which occur naturally without unambiguous discourse markers

Training on manually labeled, unmarked data

Less training data is available

Worse for the Naïve Bayes classifier

Good for the BoosTexter model

Why? Semantic redundancy between discourse markers and the context they appear in?

Using the Penn Discourse Treebank

Implicit relations: performance is not that good

Explicit relations: performance is closer to that on the automatically collected test set

Cheap data collection for this task is probably not that good an idea after all!
