Post on 17-Dec-2015
What is discourse?
Any piece of text consisting of more than one sentence
Until now our lectures revolved mainly around topics concerning word-level or sentence-level analysis.
Discourse phenomena
Anaphora resolution
– The Tin Woodman went to Emerald City to see the Wizard of Oz and ask for a heart. After he asked for it, the Woodman waited for the Wizard’s response.
Types of noun phrases
– Indefinite: Julia has a cat. Some cat entered the house.
– Definite: The cat is brown.
– Pronoun: It doesn’t eat much.
Coherence
– John hid Bill’s car keys. [the reason he did this was that] He was drunk.
– ?? John hid Bill’s car keys. [How are these sentences related?] He likes spinach.
Coherence relations
– explanation or cause
– contrast or concession
Discourse connectives
Cue phrases, discourse markers
– Because, although, but, for example, yet, and
– John hid Bill’s car keys because he was drunk.
– [We can’t win] [but we must keep trying] (contrast)
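Spotting explicit cue phrases can be sketched as a simple lookup. This is a minimal illustration, assuming the short cue list from the slide and plain word-boundary matching; the function name `find_connectives` is hypothetical. A match only says the cue phrase occurs, not that it is used as a discourse connective ("and" may simply join two nouns).

```python
import re

# Cue phrases taken from the list above; a real inventory would be much larger.
CUE_PHRASES = ["because", "although", "but", "for example", "yet", "and"]

def find_connectives(sentence):
    """Return (cue phrase, character offset) pairs found in the sentence."""
    hits = []
    lowered = sentence.lower()
    for cue in CUE_PHRASES:
        # \b keeps "and" from matching inside words such as "band".
        for match in re.finditer(r"\b" + re.escape(cue) + r"\b", lowered):
            hits.append((cue, match.start()))
    # Report hits in the order they appear in the sentence.
    return sorted(hits, key=lambda hit: hit[1])

print(find_connectives("John hid Bill's car keys because he was drunk."))
```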
Implicit and explicit discourse relations
I took my umbrella this morning. [because] The forecast was rain in the afternoon.
She is never late for meetings. [but] He always arrives 10 minutes late.
She woke up early. [afterward] She had breakfast and went for a walk in the park.
Ambiguity of discourse connectives
They have not spoken to each other since they argued last fall. (Temporal)
I assumed you were not coming since you never replied to the invitation. (Causal)
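The ambiguity can be made concrete with a table of candidate senses per connective. The table below is a toy assumption built from the examples on these slides, not a real sense inventory, and `is_ambiguous` is a hypothetical helper. When a connective has more than one candidate, the surrounding context has to decide.

```python
# Candidate senses per connective; a toy table assumed for illustration.
CANDIDATE_SENSES = {
    "since": ["Temporal", "Causal"],
    "because": ["Causal"],
    "but": ["Contrast"],
}

def is_ambiguous(connective):
    """True when the connective alone does not determine the relation."""
    return len(CANDIDATE_SENSES.get(connective.lower(), [])) > 1

for word in ("since", "because"):
    print(word, CANDIDATE_SENSES[word], "ambiguous:", is_ambiguous(word))
```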
Penn Discourse Treebank (PDTB)
Annotated explicit and implicit discourse relations
Each relation is annotated with its sense
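One way to picture such an annotation is as a small record holding the two text spans, the connective, and the sense. The class and field names below are hypothetical and do not reflect the treebank's actual file format; the example reuses the umbrella sentence, where the connective is inferred rather than present in the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DiscourseRelation:
    arg1: str                  # first text span
    arg2: str                  # second text span
    sense: str                 # sense label assigned by the annotator
    connective: Optional[str]  # stated in the text, or inferred if implicit
    explicit: bool             # True if the connective appears in the text

# Implicit relation: "because" is supplied by the annotator, not the writer.
umbrella = DiscourseRelation(
    arg1="I took my umbrella this morning.",
    arg2="The forecast was rain in the afternoon.",
    sense="Causal",  # illustrative label only
    connective="because",
    explicit=False,
)
print(umbrella)
```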
In order to interpret (understand) discourse automatically, the problem of identification and disambiguation of discourse relations needs to be addressed.
What else?
Reference resolution
Victoria Chen, Chief Financial Officer of Megabucks Banking Corp since 2004, saw her pay jump 20%, to $1.3 million, as the 37-year-old also became the Denver-based financial-services company’s president. It has been ten years since she came to Megabucks from rival Lotsabucks.
Definitions
Reference: use of linguistic expressions (her, Chen) to denote an entity or individual
Reference resolution: the task of determining what entities are referred to by which linguistic expressions
A natural language expression used to perform reference is called a referring expression, and the entity that is referred to is called the referent.
Two referring expressions that are used to refer to the same entity are said to corefer.
Reference to an entity that has been previously introduced into the discourse is called anaphora.
Coreference resolution is the task of finding the referring expressions in a text that refer to the same entity and grouping them into coreference chains.
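Once pairwise coreference decisions are available, chains can be assembled by merging linked mentions. The sketch below assumes the links are given by hand (deciding them is the hard part) and uses a union-find structure, which is one possible choice rather than a prescribed method; `build_chains` is a hypothetical name. The mentions come from the Victoria Chen example.

```python
def build_chains(mentions, links):
    """Group mentions into chains given pairwise coreference links."""
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the representative, shortening the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two groups

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())

mentions = [
    "Victoria Chen", "her", "the 37-year-old", "she",
    "Megabucks Banking Corp",
    "the Denver-based financial-services company", "Megabucks",
    "Lotsabucks",
]
# Hand-specified links, assumed correct for the example.
links = [
    ("Victoria Chen", "her"),
    ("her", "the 37-year-old"),
    ("the 37-year-old", "she"),
    ("Megabucks Banking Corp", "the Denver-based financial-services company"),
    ("the Denver-based financial-services company", "Megabucks"),
]
for chain in build_chains(mentions, links):
    print(chain)
```

A mention with no links, such as "Lotsabucks" here, ends up in a chain of its own.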
Features for pronominal anaphora resolution
Number agreement
– John has a Ford Falcon. It is red.
– ?? John has a Ford Falcon. They are red.
– John has three cars. They are red.
– ?? John has three cars. It is red.
Person agreement
Gender agreement
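Agreement features work as a filter: a candidate antecedent is discarded when it clashes with the pronoun on any feature. The sketch below is a toy illustration; the feature names, their values, and the function `agrees` are assumptions, and a value of `None` means the feature is unspecified and therefore compatible with anything.

```python
FEATURES = ("number", "person", "gender")

def agrees(pronoun, antecedent):
    """True unless the two clash on a feature both of them specify."""
    for feature in FEATURES:
        p, a = pronoun.get(feature), antecedent.get(feature)
        if p is not None and a is not None and p != a:
            return False
    return True

# Hand-written feature bundles for the examples above.
IT = {"number": "sg", "person": 3, "gender": "neut"}
THEY = {"number": "pl", "person": 3, "gender": None}
FORD_FALCON = {"number": "sg", "person": 3, "gender": "neut"}
THREE_CARS = {"number": "pl", "person": 3, "gender": "neut"}

print(agrees(IT, FORD_FALCON))    # "It is red" after "a Ford Falcon"
print(agrees(THEY, FORD_FALCON))  # "They are red" after "a Ford Falcon"
```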
Preferences in pronoun interpretation
Salience
Recency
– pronoun antecedents have been mentioned nearby in the text.
Grammatical role
– typically entities mentioned in subject position are more salient than those mentioned in object position
Repeated mention
Selectional restrictions
– John parked his car in the garage after driving it around for hours.
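The salience preferences can be combined into a single score used to rank the candidates that survive agreement filtering. The sketch below is only an illustration: the weights, the `Candidate` fields, and the function names are invented for this example, and selectional restrictions are not modeled.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    sentences_back: int  # 0 = same sentence as the pronoun
    role: str            # "subject", "object", or "other"
    mentions: int        # times the entity has been mentioned so far

# Arbitrary weights, assumed for illustration: subjects outrank objects.
ROLE_WEIGHT = {"subject": 2.0, "object": 1.0, "other": 0.0}

def salience(c):
    recency = 1.0 / (1 + c.sentences_back)  # closer mentions score higher
    repeated = 0.5 * (c.mentions - 1)       # bonus for repeated mention
    return recency + ROLE_WEIGHT.get(c.role, 0.0) + repeated

def rank(candidates):
    """Order candidates from most to least salient."""
    return sorted(candidates, key=salience, reverse=True)

# Candidates for "it" in the example sentence ("John" fails gender agreement).
car = Candidate("his car", 0, "object", 1)
garage = Candidate("the garage", 0, "other", 1)
print([c.text for c in rank([garage, car])])
```

In this example the ranking puts "his car" ahead of "the garage", which is also the reading that the selectional restrictions of "drive" favor.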
Relation to summarization
Jahna C. Otterbacher, Dragomir R. Radev, Airong Luo (2002). Revisions that improve cohesion in multidocument summaries: a preliminary study. In Proceedings of the Workshop on Automatic Summarization.
Types of problems in manually edited summaries (15 multi-doc summaries)
Discourse – Concerns the relationships between the sentences in a summary, as well as those between individual sentences and the overall summary.
Identification of entities – Involves the resolution of referential expressions such that each entity mentioned in a summary can easily be identified by the reader.
Temporal – Concerns the establishment of the correct temporal relationships between events.
Grammar – Concerns the correction of grammatical problems, which may be the result of juxtaposing sentences from different sources, or due to the previous revisions that were made.
Location/setting – Involves establishing where each event in a summary takes place.