CS 544: Lecture 3.5 Discourse Coherence Jerry R. Hobbs USC/ISI Marina del Rey, CA.


Outline

Interpreting Adjacency

What Coherence Relations are there?

Definitions and Examples of Specific Coherence Relations

Discourse Structure

Interpretation

To understand our environment, we seek

the best explanation of the observable facts.

To understand a text, we seek the best explanation

of the "observable facts" that the text presents.

Interpreting Adjacency

Adjacency is one of the observable facts to be explained.

Environment: chair on table

Text: Two segments of text x and y together.

turpentine jar R = y's function is to contain x

oil sample R = y is sample of x

Compositional Semantics as Interpretation of Adjacency

oil sample

R = y is sample of x

men work

R = y is a working event by x

Syntax and compositional semantics are constraints on the interpretation of adjacency as predicate-argument relations.

Discourse Coherence

John can open Bill's safe. He knows the combination.

Interpreting text includes explaining the adjacency of clauses, sentences, and larger segments of discourse.

= Finding relation between adjacent segments

Discourse Coherence

[Figure: a Relation linking Segment1 and Segment2]

Interpret each segment, and find the relation between them.

Relations: cause; figure-ground and ground-figure; similarity and contrast

The Structure of Discourse

[Figure: tree over segments S1 S2 S3 S4 S5 with relation nodes R1, R2, R3, R4]

Back to the Boat

Environment: Boat in Tree by Sea

Explain Entities in Environment

Explain Relations in Environment (cause: Storm)

Utterance: “Help! Thief!”

Explain Words in Utterance

Explain Relations between Them (Why are they adjacent?)

Tasks of a Discourse Theory

1. What are the possible relations between adjacent discourse segments?

2. How are they recognized or characterized?

Interpreting Adjacent Sentences

[Figure: Sentence-1 and Sentence-2 each describe an Event; the Coherence Relation is the relation between the two Events]

Possible Relations: Cause, Similarity, Background, ...

Outline

Interpreting Adjacency

What Coherence Relations are there?

Definitions and Examples of Specific Coherence Relations

Discourse Structure

Coherence Relations

Causality: Cause, Explanation, Metatalk, ....

Change of State: Occasion

Figure-Ground: Background

Similarity: Parallelism, Contrast, Exemplification

Coarsening of Granularity: Elaboration, ....

Coherence Relations

Sentences and larger segments of text describe situations or eventualities.

What are the principal kinds of relations that can obtain between situations/eventualities?

Figure-ground or Ground-figure
    March Madness is happening. USC won on Sunday.

Interlocking change of state (occasion)
    He drives to the basket. He dunks it.

Causality and its violation
    USC played excellent defense. Texas only scored 68.
    Texas had the best player. USC won anyway.

Similarity and its negation (contrast)
    UCLA advanced. USC also advanced.
    UCLA won narrowly. USC won handily.

including the limiting case of Elaboration
    USC tromped Texas. We dominated the game.

Predicate-argument
    Duke lost! Again!

These are semantic relations (the information conveyed by adjacency), not rhetorical relations (what the speaker is trying to do by putting these together)

Functionality of Coherence Relations

Figure-ground or Ground-figure: The environment influences what happens to an entity in that environment.

Interlocking change of state; Causality and its violation: These allow us to predict what will happen next.

Similarity and its negation (contrast), including the limiting case of Elaboration: Similar things behave similarly.

Predicate-argument: The basic unit of information.

Formalizing the Tree Structure of Discourse

Logical form of sentence s --> Syn(s,e)

Syn(s,e) --> Segment(s,e)

Segment(s1,e1) & Segment(s2,e2) & CoRel(e1,e2,e) --> Segment(s1 s2, e)

Note: Syntactic composition rules are an instance of this rule, where relation is pred-arg.

To interpret text, prove: (∃ e) Segment(text, e)

Summary
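The composition rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Segment` class, the `compose` function, and the toy `CAUSES` table are hypothetical names, and the only coherence relation modeled is the explanation rule cause(e2,e1) --> CoRel(e1,e2,e1), applied to the safe example from earlier in the lecture.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Segment:
    text: str      # the string s
    summary: str   # the eventuality e that the segment conveys


# Hypothetical world knowledge: knowing the combination (e2) can cause
# being able to open the safe (e1).
CAUSES = {("e2", "e1")}


def explanation_corel(e1: str, e2: str) -> Optional[str]:
    """cause(e2,e1) --> CoRel(e1,e2,e1): return the summary e, or None."""
    return e1 if (e2, e1) in CAUSES else None


def compose(s1: Segment, s2: Segment,
            corel: Callable[[str, str], Optional[str]]) -> Optional[Segment]:
    """Segment(s1,e1) & Segment(s2,e2) & CoRel(e1,e2,e) --> Segment(s1 s2, e)."""
    e = corel(s1.summary, s2.summary)
    if e is None:
        return None  # no coherence relation found; the pair does not cohere
    return Segment(s1.text + " " + s2.text, e)


first = Segment("John can open Bill's safe.", "e1")
second = Segment("He knows the combination.", "e2")
whole = compose(first, second, explanation_corel)
```

Here `whole.summary` is `e1`, the eventuality of the first sentence, which is what would take part in any higher-level relation.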

Outline

Interpreting Adjacency

What Coherence Relations are there?

Definitions and Examples of Specific Coherence Relations

Discourse Structure

The Ground-Figure Relation

composite-entity(s) & relations-of(r,s) & member(e1,r) & p'(e1,x,y) & at'(e2,x,y) --> CoRel(e1,e2,e2)

S1 describes some aspect of a composite entity (the ground). S2 places an entity x (the figure) at some point within that system.

March Madness is happening. USC won on Sunday.

T is a pointer to the root of a binary tree. Set the variable P to T.

Change of State: Occasion

change(e,e1,e2) --> CoRel(e1,e,e)

change(e,e1,e2) --> CoRel(e,e2,e)

change(e4,e1,e2) & change(e5,e2,e3) & change(e6,e1,e3) --> CoRel(e4,e5,e6)

John walked to the door. He opened it. He stepped out.

Typically e6 is a higher-level, coarser-grained description of the sequence of changes.

Causality and Explanation

cause(e2,e1) --> CoRel(e1,e2)

A segment of discourse conveying e2 explains a segment conveying e1 if e2 could cause e1.

The police prohibited the women from demonstrating. They feared violence.

Segment1 <- explains - Segment2

describes describes

e1 <- causes - e2

Explanation: Example 1

The police prohibited the women from demonstrating. They feared violence.

Logical Form:

prohibit'(p1,p,d) & demonstrate'(d,w) & CoRel(p1,f,p1)

& fear'(f,y,v) & violent'(v,z) cause(f,p1)

Knowledge Base:

fear'(f,p,v) --> diswant'(d2,x,v) & cause(f,d2)

demonstrate'(d,w) --> cause(d,v) & violent'(v,z)

cause(d,v) & diswant'(d2,p,v) --> diswant'(d1,p,d) & cause(d2,d1)

diswant'(d1,p,d) & authority(p) --> prohibit'(p1,p,d) & cause(d1,p1)

cause(e1,e2) & cause(e2,e3) --> cause(e1,e3)

(Winograd)
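The last axiom in the knowledge base, transitivity of cause, is what links the fear f to the prohibition p1. A minimal sketch, assuming the direct causal links are just the cause(...) literals that appear in the consequents of the rules above, and ignoring all other predicates and the unification of arguments; `causal_closure` is a hypothetical helper name.

```python
def causal_closure(facts):
    """Close a set of (cause, effect) pairs under
    cause(e1,e2) & cause(e2,e3) --> cause(e1,e3)."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Direct links read off the knowledge base:
#   fear f causes the not-wanting of violence d2,
#   d2 causes the not-wanting of the demonstration d1,
#   d1 causes the prohibition p1,
#   demonstrating d causes violence v.
direct = {("f", "d2"), ("d2", "d1"), ("d1", "p1"), ("d", "v")}
closure = causal_closure(direct)
explains = ("f", "p1") in closure  # cause(f,p1), so CoRel(p1,f,p1) holds
```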

Causality: Example 2

cause(e2,e1) --> CoRel(e1,e2,e1)

Bush supports big business. He will veto Bill 1711.

Causality: Example 2, Required Knowledge

Bush supports big business. He will veto Bill 1711.

KB:

support'(e1,x,y) & bad-for(z,y) --> prevent'(e2,x,z) & cause(e1,e2)

prevent'(e2,x,z) & etc1(e2,x,z) --> veto'(e2,x,z)

Example 2: The Interpretation

Bush supports big business. He will veto Bill 1711.

LF:

support'(e1,Bush,BB)

& CoRel(e1,e2,e) & veto'(e2,x,1711)

cause(e1,e2)

prevent'(e2,x,1711)

x = Bush etc1(e2,x,1711)

bad-for(1711,BB)

e = e2

Causality: Example 3

Peter: Do you want to go to the cinema?

Mary: I'm tired.

Mary didn't want to go to the cinema. She was tired.

diswant'(e1,M,e2)

cause(e3,e1)

diswant'(e1,M,e2) & activity(e2)

go'(e2,M,c) cinema(c) CoRel(e1,e3) tired'(e3,x)

etc(e2,x)

x=M

x=M

Causality: Example 4

Ann: Why are you so happy?

Beth: I finally met a guy who is a bachelor.

Beth was so happy. She finally met a guy who was a bachelor.

happy'(e1,B) CoRel(e1,e2) meet'(e2,B,g) guy(g) bachelor(g)

poss'(e3,e5) & marry’(e5,B,g)

meet&date'(e4,B,g) & eligible(g) & bachelor(g)

cause(e4,e3)

cause(e3,e1)

cause(e4,e1)

Explanation: Example 5

I don’t own a TV set. I would watch it all the time.

Rexists(e1) own’(e2,i,t) CoRel(e1,e3,e1) Rexists(e3)

not’(e1,e2) tv(t) would’(e3,e4,c) watch’(e4,i,x)

not’(e1,e2) cause(e3,e1)

bad-for(e4,i) cause’(e3,e2,e4)

watch’(e4,i,t)

use’(e4,i,t) tv(t)

own’(e2,i,t)

c=e2

x=t

Owning causes using

To use TV is to watch it

Watching TV is bad

bad effect causes avoid cause

would (given C) if C causes

Explanation and Definite Reference

restaurant(a) canteen(b) prefer’(p,i,a,b) CoRel(p,e) capp(c) cheaper’(e,c,z)

cause(e,p)

sell(a,c) sell(b,z) capp(z)

I prefer the restaurant on the corner to the student canteen. The cappuccino is less expensive there.

(Matsui)

Restaurants sell cappuccino

Canteens sell cappuccino

I’m cheap

Coherence Relations Based on Similarity

            Specific -> Specific    Specific -> General    General -> Specific

Positive:   Parallel                Generalization         Exemplification (Elaboration)

Negative:   Contrast                --                     --

Question-Answer pairs

Similarity

Properties are similar, if they are or imply properties whose predicates are the same, and whose arguments are coreferential or similar.

Similar[ p’(e1,x1, ..., z1), p’(e2,x2, ..., z2) ] :
    Coref(x1,...,x2,...) OR Similar(x1,x2)
    ....
    Coref(z1,...,z2,...) OR Similar(z1,z2)

Arguments are similar, if their other inferentially independent properties are similar.

Similar[ x1,x2 ] :
    Similar[ p1(...,x1,...), p2(...,x2,...) ]
    ....
    Similar[ q1(...,x1,...), q2(...,x2,...) ]

Mapping is preserved as recursion progresses.

Inferential Independence: K, P =/=> Q; K, Q =/=> P

Similarity: Example

A ladder weighs 100 lb with its center of gravity 20 ft from the foot, and a 150 lb man is 10 ft from the top.

force(w1,L,d1,x1)
    w1: lb(w1,100)
    L: ladder(L)
    d1: Down(d1)
    x1: distance(x1,f, 20 ft)
    f: foot(f,L) ==> end(f,L)
    L:

force(w2,y,d2,x2)
    w2: lb(w2,150)
    y: ==> Coref(y,...,L,...)
    d2: Down(d2)
    x2: distance(x2,t, 10 ft)
    t: top(t,z) ==> end(t,z)
    z: ==> Coref(z,...,L,...)

Complicated to formalize, but easy for brains

Verb Phrase Ellipsis

John revised his paper before the teacher did.

before(e11,e21)

e11: revise’(e11,j,p1)
    j: John(j) ==> person(j)
    p1: paper(p1) Poss(x1,p1)
    x1: he(x1), Coref(x1,...,j,...)

e21: revise’(e21,t,p2)
    t: teacher(t) ==> person(t)
    p2: paper(p2) Poss(x2,p2)
    x2: Coref(x2,...,x1,...)
        he(x2), Coref(x2,...,t,...)

Strict: JJ

Sloppy: JT

Similarity or Semantic Parallelism

Blood probably contains the highest concentration of hepatitis B virus of any tissue except liver. Semen, vaginal secretions, and menstrual blood contain the agent and are infective. Saliva has lower concentrations than blood, and even hepatitis B surface antigen may be detectable in no more than half of infected individuals. Urine contains low concentrations at any given time.

BODY MATERIAL                                 CONTAINS         CONCENTRATION            AGENT

blood                                         contains         highest concentration    HBV

semen, vaginal secretions, menstrual blood    contain                                   agent

saliva                                        has              lower concentrations

(saliva of) infected individuals              detectable in    no more than half        HBsAg

urine                                         contains         low concentrations

Elaboration

Elaboration(e1,e2,e) --> CoherenceRel(e1,e2,e)

gen(e1,e) & gen(e2,e) --> Elaboration(e1,e2,e)

Go down First Street. Just follow First Street three blocks to A Street.

go(Agent: you, Goal: x, Path: First St., Measure: y)

go(Agent: you, Goal: A St., Path: First St., Measure: 3 blks)
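One way to read gen in this example is slot-wise subsumption between the two go frames. A minimal sketch, assuming frames are plain dictionaries with `None` for the unfilled slots x and y; the names `generalizes` and `elaboration` are hypothetical, and the real gen relation would be established by inference over the knowledge base, not by slot comparison.

```python
def generalizes(general, specific):
    """gen(general, specific): every slot filled in `general` has the same
    value in `specific`; unfilled slots (None) match anything."""
    return all(value is None or specific.get(slot) == value
               for slot, value in general.items())


def elaboration(e1, e2):
    """gen(e1,e) & gen(e2,e) --> Elaboration(e1,e2,e), trying e = e2."""
    if generalizes(e1, e2) and generalizes(e2, e2):
        return e2
    return None


first = {"Agent": "you", "Goal": None, "Path": "First St.", "Measure": None}
second = {"Agent": "you", "Goal": "A St.", "Path": "First St.", "Measure": "3 blks"}

summary = elaboration(first, second)
```

In this sketch the second, more specific frame becomes the summary of the pair; swapping the arguments yields no elaboration, because the specific frame does not generalize the vague one.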

Elaboration

Segment("Go .. A Street.", f)

CoherenceRel(g,f,f)

Segment("Go down 1st St.", g) Segment("Follow ... A St.", f)

Elaboration(g,f,f)

Syn("Go down 1st St.", g,-,-) Syn("Follow ... A St.", f,-,-)

gen(g,f) gen(f,f)

follow'(f,u,FS,AS)

go'(g,u,x,y) along(g,FS)

down(g,FS)

Contrast

p'(e1,x) & not'(e2,e3) & p'(e3,y) & q(x) & q(y) --> CoRel(e1,e2,e2)

x and y are similar by virtue of property q. S1 and S2 assert contrasting properties p and ~p of x and y (e1 and e2). Second segment is dominant.

Mary is graceful. John is an elephant.

Mary is graceful. John is an elephant.

rel(e3,e2)

CoRel(e1,e2) Syn("John is an elephant",e2,-,-)

graceful'(e1,m)

Mary(m)

Syn(" is an elephant",e2,j,-)Syn("John",j,-,-)

Syn("an elephant",e2,j,-)

Syn(" is",e2,j,-)

Syn("an elephant",e3,j,-)

not'(e2,e4) & graceful'(e4,j)

Contrast(e1,e2)

John(j)

elephant'(e3,j) --> clumsy'(e2,j) & imply(e3,e2)

Present(e2) person(m) person(j)

Metaphor via Contrast

Sentence's claim is John's clumsiness

Coercion protects from contradiction

This belief is source of metaphor

Search for coherence forces metaphor reading

AQUAINT-I: Question-Answering from Multiple Sources

Show me the region 100 km north of the capital of Afghanistan.

Question Decomposition via Logical Rules

What is the capital of Afghanistan? (CIA Fact Book)

What is the lat/long of Kabul? (Alexandrian Digital Library Gazetteer)

What is the lat/long 100 km north? (Geographical Formula)

Show that lat/long (Terravision)

Resources Attached to Reasoning Process

A Complex Query

What recent purchases of suspicious equipment has XYZ Corp or its subsidiaries or parent firm made in foreign countries?

subsidiary(x,y)

parent(y,x)

Subsidiaries:
    XYZ: ABC, ...
    DEF: ..., XYZ, ...

illegal

biowarfare

DB of bio-equip

Ask User not USA

Purchase: Agent: XYZ, ABC, DEF, ... Patient: anthrax, ... Date: since Jun05 Location: --

Prove Question from Answer

Q: “How did Adolf Hitler die?”

QLF: manner(e4) & Adolf(x10) & Hitler(x11) & nn(x12,x10,x11) & die’(e4,x12)

ALF: it(x14) & be’(e1,x14,x2) & Zhukov(x1) & ’s(x2,x1) & soldier(x2) & plant’(e2,x2,x3) & Soviet(x3) & flag(x3) & atop(e2,x4) & Reichstag(x4) & on(e2,x8) & May(x5) & 1(x6) & 1945(x7) & nn(x8,x5,x6,x7) & day(x9) & Adolf(x10) & Hitler(x11) & nn(x12,x10,x11) & commit’(e3,x12,e5) & suicide’(e5,x12)

A: “It was Zhukov’s soldiers who planted a Soviet flag atop the Reichstag on May 1, 1945, a day after Adolf Hitler committed suicide.”

“suicide” is troponym of “kill”: suicide’(e5,x12) --> kill’(e5,x12,x12) & manner(e5)

Gloss of “kill”: kill’(e5,x12,x12) <--> cause’(e5,x12,e4) & die’(e4,x12)

Gloss of “suicide”: suicide’(e5,x12) <--> kill’(e5,x12,x12)

e4=e5?

The Search Space Problem

120,000 glosses --> 120,000 axioms

Theorem proving would take forever.

Lexical chains / marker passing: Try to find paths between Answer Logical Form and Question Logical Form.

Ignore the arguments; look for links between predicates in XWN; it becomes a graph traversal problem (e.g., confuse “buy”, “sell”)

Observation: All proofs use chains of inference no longer than 4 steps

Carry out this marker passing only 4 levels out

Q: “What Spanish explorer discovered the Mississippi River?”

Candidate A: “Spanish explorer Hernando de Soto reached the Mississippi River in 1536.”

Lexical chain: discover-v#7 --GLOSS--> reach-v#1

Set of support strategy: Use only axioms that are on one of these paths. 120,000 axioms ==> several hundred axioms
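The marker-passing idea can be sketched as a depth-limited breadth-first search over a predicate graph. This is a toy illustration, not the LCC system: the `lexical_chain` function and the three-edge graph are assumptions, built only from the discover/reach link above and the suicide, kill, die glosses from the Hitler example.

```python
from collections import deque


def lexical_chain(graph, start, goal, max_depth=4):
    """Return the shortest path of predicates from start to goal that uses
    at most max_depth links, or None if there is no such path."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # markers are passed only max_depth levels out
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None


# Toy gloss/troponym graph; arguments are ignored, only predicates are linked.
graph = {
    "discover-v#7": ["reach-v#1"],
    "suicide": ["kill"],
    "kill": ["die"],
}

chain = lexical_chain(graph, "suicide", "die")
```

Under the set-of-support strategy, only the axioms whose predicates lie on a returned chain would then be handed to the theorem prover.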

Relaxation (Assumptions)

Rarely or never can the entire Question Logical Form be proved from the Answer Logical Form ==> We have to relax the Question Logical Form

“Do tall men succeed?”

Logical Form: tall’(e1,x1) & x1=x2 & man’(e2,x2) & x2=x3 & succeed’(e3,x3)

Remove these conjuncts from what has to be proved, one by one, in some order, and try to prove again.

E.g., we might find a mention of something tall and a statement that men succeed.

One limiting case: We find a mention of success.

Penalize proof for every relaxation, and pick the best proof.
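A small sketch of relaxation with penalties, under loud assumptions: the slide drops conjuncts one by one in some order, whereas this version simply tries every subset from largest to smallest, and the `provable` callback stands in for a real theorem prover. The `relax` function, the unit penalty, and the `supported` set are hypothetical.

```python
from itertools import combinations


def relax(conjuncts, provable, penalty=1.0):
    """Find a largest provable subset of the conjuncts.
    Returns (kept conjuncts, total penalty for the dropped ones)."""
    n = len(conjuncts)
    for dropped in range(n + 1):
        for kept in combinations(conjuncts, n - dropped):
            if provable(kept):
                return list(kept), dropped * penalty
    return [], n * penalty


question = ["tall'(e1,x1)", "x1=x2", "man'(e2,x2)", "x2=x3", "succeed'(e3,x3)"]

# Stand-in for the answer text: it mentions something tall and states that
# men succeed, but does not say that the tall thing is a man.
supported = {"tall'(e1,x1)", "man'(e2,x2)", "x2=x3", "succeed'(e3,x3)"}


def provable(kept):
    return all(conjunct in supported for conjunct in kept)


kept, cost = relax(question, provable)
```

Only the identity x1=x2 has to be given up here, so this candidate answer is penalized once.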

Abduction

Observable: Q

General principle: P --> Q

Conclusion, assumption, or explanation: P

Inference to the best explanation

In the LCC QA system:

The question is the observable: Hitler died

The XWN glosses and troponyms are the general principles: suicide --> kill --> die

The answer is the explanation: Hitler committed suicide

Relaxation is the assumptions you have to make to get the proof to go through.

Abduction: Try to prove Q the best you can; Make assumptions where you have to.
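The "prove what you can, assume what you must" idea can be sketched as a propositional backward chainer with a cost on assumptions. This is a simplified illustration: the `abduce` function, the uniform `assume_cost`, and the depth bound are assumptions, and predicate arguments are ignored entirely.

```python
def abduce(goal, facts, rules, assume_cost=1.0, depth=4):
    """Return (cost, assumptions) for the cheapest way to establish goal.
    rules is a list of (body, head) pairs, each read as body --> head."""
    if goal in facts:
        return 0.0, []
    best = (assume_cost, [goal])  # fallback: simply assume the goal
    if depth > 0:
        for body, head in rules:
            if head != goal:
                continue
            cost, assumed = 0.0, []
            for premise in body:
                c, a = abduce(premise, facts, rules, assume_cost, depth - 1)
                cost += c
                assumed = assumed + a
            if cost < best[0]:
                best = (cost, assumed)
    return best


rules = [(["suicide"], "kill"), (["kill"], "die")]

# The answer text supplies "suicide"; the question asks about "die".
with_answer = abduce("die", {"suicide"}, rules)
without_answer = abduce("die", set(), rules)
```

With the answer available the observable is explained at no cost; without it, the only option is to assume it outright.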

Coherence Relations between Embeddings

The model shows that the human immune system is only able to mount an effective response against HIV quasispecies whose diversity is below some threshold value;

once the population of viral strains exceeds this "diversity threshold" the immune system is no longer able to regulate viral replication.

The model shows that

^

Coherence Relations between Coercions

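John must be at home.

His car is in the driveway.

[Figure: the two sentences are coerced to "I believe John must be at home" and "I see that his car is in the driveway"; the CAUSE relation links the coerced eventualities]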

Do We Really Recognize Coherence Relations?

Recognizing coherence relation = recognizing sentences as part of one discourse

"We don't recognize coherence relations. We just find the best interpretation of the whole text."

"We don't parse sentences. We just figure out the predicate-argument relations."

Outline

Interpreting Adjacency

What Coherence Relations are there?

Definitions and Examples of Specific Coherence Relations

Discourse Structure

Tree Structure from Multiple Adjacencies

[Cancer Research] Institute

vs. Stanford [Research Institute]

John [believes [men work]]

The Principal Information in a Composite Segment

turpentine jar ==> jar

Stanford Research Institute ==> Institute

men work ==> work

John believes men work ==> believes

For full clause, the principal information is the assertion: main verb | top-level adverbials | high stress | new information | ....

The entity or eventuality that participates in higher-level structures.

Discourse Structure from Multiple Adjacencies

He was in a foul humor. He hadn't slept well. His electric blanket hadn't worked.

John got straight A's. He got a 1500 on his SATs. He is very intelligent.

The Principal Information in a Discourse Segment

1. John got straight A's.

2. He got a 1500 on his SATs.

3. He is very intelligent.

To relate 1-2 to 3, we need a characterization of the principal information conveyed by 1-2.

Need to compute an Assertion or Summary for composite segments of discourse.

That’s the eventuality that participates in higher-level structures.

Tasks of a Discourse Theory

1. What are the possible relations between adjacent discourse segments?

2. How are they recognized or characterized?

3. What are the assertions / summaries of the composite segments?

Formalizing the Tree Structure of Discourse

Syn(s,e) --> Segment(s,e)

Segment(s1,e1) & Segment(s2,e2) & CoRel(e1,e2,e) --> Segment(s1 s2, e)

Note: Syntactic composition rules are an instance of this rule, where relation is pred-arg.

To interpret text, prove: (∃ e) Segment(text, e)

Summary

Hypotactic and Paratactic Coherence Relations

Hypotactic:

CoRel(e1,e2,e1) (e1: Dominant, e2: Subordinate)

Paratactic:

CoRel(e1,e2,e) where e is derived somehow from e1 and e2

Tree Building

Coherence Structure: Static, after-the-fact

Flow Model: Dynamic, play-by-play

NOT A REAL DISTINCTION

The Tree-Building Operation:

[Figure: a node N1 and a relation R(N1,N2) ==> a tree with root R over the nodes N1 and N2]

Discourse Pivots

A: Let's see. An hour for Brian, an hour and fifteen minutes for me, thirty five minutes for Charles. That's almost exactly three hours.

B: But they're typically late on these things.

SUMMATION

DISAGREEMENT

(Elaboration)

(Contrast)

Discourse Pivot: In S1 S2 S3, S1 and S2 are related to each other by virtue of one part of the content of S2, and S2 and S3 are related to each other by virtue of another part of the content of S2.

Discourse Coherence

[Figure: a Relation linking Segment1 and Segment2]

Interpret each segment, and find the relation between them.

Relations: cause; figure-ground and ground-figure; similarity and contrast

The Structure of Discourse

[Figure: tree over segments S1 S2 S3 S4 S5 with relation nodes R1, R2, R3, R4]

Method for Analyzing Discourse

1. Find the major one or two breaks in the text, recursively, until single clauses.

2. Label the nonterminal nodes in the resulting tree with the coherence relations.

3. Make precise the knowledge that was used to justify this labelling.

4. Validate the hypothesized knowledge of Step 3 by finding other examples of the use of the same knowledge elsewhere in the corpus.

F(K,T) = I

Paragraph from Novel

1. The town itself is dreary;

2. not much is there except the cotton mill, the two-room houses where the workers live, a few peach trees, a church with two colored windows, and a miserable main street only a few hundred yards long.

3. On Saturdays the tenants from the near-by farms come in for a day of talk and trade.

4. Otherwise the town is lonesome, sad,

5. and like a place that is far off and estranged from all other places in the world.

6. The nearest train stop is Society City,

7. and the Greyhound and White Bus Lines use the Forks Falls Road which is three miles away.

8. The winters here are short and raw,

9. the summers white with glare and fiery hot.

--- Carson McCullers, The Ballad of the Sad Cafe, p. 1

Analysis of Paragraph from Novel

[Figure: coherence tree over clauses 1 to 9 of the paragraph. Node labels, in extraction order: Contrast, Parallel; Parallel; Parallel; Contrast; Exemplification; Parallel; Elaboration]

Fragment of Conversation

1. A: So, um, if I went first, let's say with for um, see I,

2. as I said, I need about an hour and fifteen minutes

3. I could do the, my reporting on the ongoing project, ah, for that first hour.

4. See if we total up all the time we need,

5. let's see an hour for Brian,

6. A: an hour and fifteen minute for me, B: So it's

7. A: thirty five minutes B: almost exactly ...

8. A: it's almost exactly correct. Three hours.

9. B: But we've got to take into account that they're typically late on these things.

10. B: All right, so we're gonna get squeezed someplace. A: Okay, right, right, okay. Um,

11. A: I think what I'd be willing to do is if we get squeezed on the, uh if I go first and if we get squeezed I'll I'll eat the ah the time that we lose.

Analysis of Fragment of Conversation

[Figure: coherence tree over utterances 1 to 11 of the conversation. Node labels, in extraction order: Parallel; Elaboration; Elaboration; Elaboration; Elaboration; Parallel; Contrast; Contrast: Problem-Solution; Cause]

Paragraph from Scientific Text

We propose that the genetic variability of HIV is not so much of a complication, as the key to understanding the development of AIDS.

(R1) In particular, we examine a mathematical model for viral multiplication

that explicitly describes the interplay between the total diversity of viral strains

(which in general will increase over time)

and the suppressing capacity of the immune system.

(R2) The model shows that the human immune system is only able to mount an effective response against HIV quasispecies

whose diversity is below some threshold value;

(R3) once the population of viral strains exceeds this "diversity threshold"

the immune system is no longer able to regulate viral replication,

with consequent destruction of CD4+ cells.

Structure of Scientific Text

[Figure: tree of the paragraph's clauses. Node labels, in extraction order: R1: in particular; propose; not so much ... as ...; complication; key; R2; examine; that; model describes; interplay; and; which; diversity increase; suppressing; shows; R3; able; once; whose; exceeds; with; quasispecies below; no longer able; destruction]

Analysis of Scientific Text

[Figure: the same tree with coherence relations on the connectives. Node labels, in extraction order: R1: in particular; propose; not so much ... as ... (CONTRAST); complication; key; R2 (ELAB); examine; that (ELAB); model describes; interplay; and (CAUSE); which; diversity increase; suppressing; shows; R3 (CONTRAST); able; once (CAUSE); whose; exceeds; with (CAUSE); quasispecies below; no longer able; destruction; (ELAB)]

Summary

Discourse structure arises from the use and interpretation of adjacency.

Recognition of discourse structure is naturally embedded in the abduction framework.

A small number of coherence relations probably suffice, in combination with general interpretive mechanisms.