Text Understanding through Probabilistic Reasoning about Actions
Transcript of Text Understanding through Probabilistic Reasoning about Actions
IBM T.J. Watson Research Center
© 2007 IBM Corporation
Text Understanding through Probabilistic Reasoning about Actions
Hannaneh Hajishirzi, Erik T. Mueller
IBM T.J. Watson Research Center
Problem
Understanding a text and answering questions
A fundamental problem in Natural Language Processing and Linguistics
Very hard to solve (especially by machines)
Imagine if computers could understand text
Outline: Problem + Use cases | Approaches | Our Approach: Representation, Inference | Future work
Help Desk
User text about the problem
Text Understanding System
Commonsense Reasoning
Solutions
Problem: I'm having trouble installing Notes. I got error message 1. How do I solve it?
Yes, you will get error message 1 if there is another Notes installed.
You must first uninstall Notes. Then, when you run setup, Notes will be installed.
Commanding a Robot
Go one block ahead. Then, turn right. Take the keys.
Open the door.
Query: Where is the robot? Is the door open?
Initial states
Question Answering Systems
Question: Where was President Bush two years ago?
Ask.com
President Bush said, "Two hundred thirty-one years ago, …..
Applications at IBM:
Playing Jeopardy
Natural language input to semantic engine
General Solution to Text Understanding Task
(1) Framework for representing the content of text
(2) Algorithms for reasoning (based on the representation)
Approaches to Natural Language Processing
Machine learning and statistical approaches (Manning & Schütze, 1999)
– Disadvantages: weak on semantics; require training data
Logical approaches (Alshawi, 1992; Hobbs, 1993)
– Disadvantage: unable to represent uncertainties in text/knowledge
Our approach
Our Approach
Represent sentences using a logical framework + probabilities
– Each sentence states properties or actions:
• Property: a statement about the world
• Action: a change in the world
– Probabilities: uncertainty and ambiguity
Algorithms for stochastic inference
Potential Open Problems
Translating text into a logical representation
– Represent sentences with actions
– Disambiguate sentences using probabilities
– Represent prior knowledge
Answer queries using probabilistic reasoning in logical framework
– Efficient algorithms
– Fill in missing actions
Representation
Text Level: …John woke up. He flipped the light switch. He had his breakfast. He went to work…
↓ Translation to actions
Action Level: WakeUp(John, Bed). SwitchLight(John). Eat(John, Food). Move(John, Work)
Elements in our Representation
Variables: object; agent: object; physobj: object; location; …
Constants: John: agent; Bedroom: room; HotelRoom: room; Work: location; …
Predicates: At(agent, location), Hungry(agent), Awake(agent), LightOn(room), …
World state example: At(John, Work), ¬LyingOn(John, Bed), Hungry(John), ¬OnLight(HotelRoom), …
Text: WakeUp(John, Bed). Switch(John, Light). Eat(John, Food). Move(John, Work)
Text Representation
Transition: stochastic choice of deterministic execution
[Figure: states s0–s3 linked by transitions; each transition carries an ambiguous action (move, make, memorize) that resolves to disambiguated actions (e.g., run, drive, walk; throw, cook, build; study, review, absorb)]
Action Declarations
Deterministic actions have preconditions and effects:
WakeUp(John, Bed)
  Pre: ¬Awake(John), LyingOn(John, Bed)
  Eff: Awake(John), ¬LyingOn(John, Bed)
Probabilistic actions are stochastic choices among primitives:
Move(John, Location1, Location2) (simplified)
  1. Walk(John, Location1, Location2)
  2. Drive(John, Location1, Location2)
– Assumption: the description (preconditions and effects) of basic primitives (e.g., walk) is known
– Goal: find the primitives related to a probabilistic action and disambiguate them via transition probabilities
Probabilistic Action
Determining each transition using WordNet (Fellbaum, 1998)
Assign transition probabilities by calculating the probability of each primitive from the context of the sentence:
– Go(John, Work) → Walk(John, Work) or Drive(John, Work)
– Compute P(Walk | work), P(Drive | work)
[Figure: WordNet troponym tree rooted at "go, move": drive (test drive), fly (soar, billow, hover), run (skitter, rush), walk (march, countermarch, step)]
Disambiguation Algorithm
Train set (Lillian Lee):
– (noun, verb, frequency) for the 1000 most popular nouns
Test set (SemCor, Senseval):
– (object, siblings of the verb); label: the verb in the sentence
Goal: P(sibling verb | object) for each sentence
– Compute freq(sibling verb, object) / freq(object)
– If "object" is not in the train set:
  • Replace object with hypernym(object), e.g. replace "lady" with "woman"
– If (object, candidate verb) is not in the train set:
  • Find nouns similar to "object" (Sims):
    P(verb | noun) = (1/Z) Σ_{sim ∈ Sims(object)} P(verb | sim) · Dist(sim, object)
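The frequency-ratio step with hypernym backoff can be sketched as follows. The toy (noun, verb, frequency) table and the hypernym map are hypothetical stand-ins for the real train set and a WordNet lookup, not the actual data.

```python
# Sketch of the disambiguation backoff (toy data, not the real train set):
# P(verb | object) = freq(verb, object) / freq(object), backing off to a
# hypernym when the object is unseen in the train set.
FREQ = {
    ("tea", "cook"): 42, ("tea", "make"): 37,
    ("tea", "throw"): 16, ("tea", "dip"): 5,
}
HYPERNYM = {"lady": "woman"}  # stand-in for a WordNet hypernym lookup

def p_verb_given_object(verb, obj):
    if not any(o == obj for (o, _) in FREQ):
        obj = HYPERNYM.get(obj, obj)          # back off: lady -> woman
    total = sum(f for (o, _), f in FREQ.items() if o == obj)
    return FREQ.get((obj, verb), 0) / total if total else 0.0
```

The similarity-weighted sum over Sims(object) would replace the zero fallback in a fuller version.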
Some Results
Probability distribution over candidate verbs:

noun / verb          verb: prob   verb: prob    verb: prob       verb: prob
tea / make           Cook: .42    Make: .37     Throw: .16       Dip: .05
pattern / memorize   Study: .35   Review: .22   Absorb: .18      Memorize: .17
lady / know          Know: .25    Feel: .25     Experience: .25  Catch: .25

Accuracy: Test set 1, Test set 2
Prior Knowledge
Knowledge base for state constraints:
– At(agent, location1), location1 ≠ location2 ⇒ ¬At(agent, location2)
– AtHand(agent, physobj) ⇒ ¬OnFloor(physobj)
Bayes net or Markov network to represent dependencies:
– P(Hungry(agent) | Eat(agent, food)) = .8
– P(Drive(agent, loc1, loc2) | distance(loc1, loc2) > 1m) = .7
Probabilistic Open Mind (Singh et al., 2002):
Open Mind: You can often find “Object” in “Location”
– "You can often find a bed in a bedroom"
– "You can often find a bed in a hotel room"
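The state constraints above can be sketched as simple consistency checks over a world state. The encoding of facts as tuples is a hypothetical illustration, not the system's actual representation.

```python
# Sketch (hypothetical encoding): a world state is a set of true facts,
# and each state constraint is a predicate that must hold in every state.
def unique_location(state):
    # At(agent, l1), l1 != l2  =>  not At(agent, l2)
    agents = [t[1] for t in state if t[0] == "At"]
    return len(agents) == len(set(agents))

def in_hand_not_on_floor(state):
    # AtHand(agent, obj)  =>  not OnFloor(obj)
    held = {t[2] for t in state if t[0] == "AtHand"}
    return not any(t[0] == "OnFloor" and t[1] in held for t in state)
```

A reasoner would reject any candidate state where either check fails.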
Acquisition of Object Location Probabilities
Open Mind: You often find "Object" in "Location"
Goal: P(object in location)
– P(bed in bedroom) > P(bed in hotelroom) > P(bed in hospital)
Method:
– Extract an objects list (1600 objects) and a locations list (2675 locations)
– Use a corpus of American literature stories (downloaded from Project Gutenberg)
– Compute correlations between objects and locations:
  • Probability: P(Near(object, location) | object) — we used this
– Cross-reference probabilities with Open Mind and normalize
(Some) results:
– P(bed in bedroom) = 0.5, P(bed in hotelroom) = 0.33, P(bed in hospital) = 0.17
– Add missing assertions to Open Mind suggested by the corpus
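The normalization step can be sketched as below; the near-counts are hypothetical toy numbers standing in for the corpus statistics, chosen only so the ratios match the reported 0.5 / 0.33 / 0.17 split.

```python
# Toy reconstruction of the normalization step: corpus near-counts are
# turned into P(object in location) over the candidate locations.
near_counts = {("bed", "bedroom"): 30, ("bed", "hotelroom"): 20,
               ("bed", "hospital"): 10}  # hypothetical corpus counts

def location_probs(obj):
    # P(object in location) ~ P(Near(object, location) | object), normalized
    pairs = {loc: c for (o, loc), c in near_counts.items() if o == obj}
    total = sum(pairs.values())
    return {loc: c / total for loc, c in pairs.items()}
```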
Our Approach
Represent sentences using event calculus (a logical framework) + probabilities
– Each sentence states properties or actions:
• Property: a statement about the world
• Action: a change in the world
– Probabilities: uncertainty and ambiguity
Algorithms for stochastic inference
Inference Algorithm
Goal: Answer a question related to text
– Question format: P(Query = true) = ?, where Query is a logical formula
Algorithm: Consider all possible paths from root to leaves.
[Figure: tree of disambiguated actions da_t^i, one branch per path i]
For each path:
1. Compute P(Path_i)
2. Compute P(Query | Path_i)
P(Query) = Σ_i P(Path_i) · P(Query | Path_i)
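The total-probability step at the heart of the algorithm can be sketched in a few lines; the two-path example numbers in the test are hypothetical.

```python
# Sketch of the total-probability step: P(Query) is accumulated over all
# root-to-leaf paths as sum_i P(Path_i) * P(Query | Path_i).
def p_query(paths):
    """paths: list of (P(Path_i), P(Query | Path_i)) pairs."""
    return sum(p_path * p_q for p_path, p_q in paths)
```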
Updating world states
Path1: WakeUp(John, Bed). SwitchLight(John).
Compute P(Query | Path_i):
– Updating world states forward through time 0, time 1, time 2
– Propagating information back
– Check for conflicts at each time
[Figure: state spaces at times 0–2, with Query 1 marked]
Query 1: Certain answer
Propagating back
Compute P(Query | Path_i):
[Figure: state spaces over time, with Query 1 and Query 2 marked; updating world states and propagating back]
Query 2: Regress Query 2 to time 0; use prior knowledge
Example: P(At(John, Bedroom)_0) = ? = P(In(Bed, Bedroom)) = … from prior knowledge
Efficiency of the Algorithm
Naïve algorithm (complete state):
– Truth assignment to all the possible predicates
Our algorithm (partial state):
– Truth assignment to only those predicates useful for understanding the text
[Figure: running time vs. number of predicates for the Partial State and Complete State algorithms]
Free variables and quantifiers
No need to enumerate all the possible cases
Example
– No need to enumerate all the possible permutations of objects inside the briefcase
MvWithObj(B, l1, l2):
  Pre: At(B, l1), ¬At(B, l2), ∀o: In(o)
  Eff: ¬At(B, l1), At(B, l2), At(o, l2)
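The quantified effect can be sketched as follows: every object o with In(o) moves along with the briefcase B, without enumerating subsets of contents up front. The tuple encoding of facts is a hypothetical illustration.

```python
# Sketch of applying the quantified effect without enumeration: facts are
# tuples, and every o with In(o) is moved to l2 along with B.
def mv_with_obj(state, l1, l2):
    assert ("At", "B", l1) in state and ("At", "B", l2) not in state
    inside = {t[1] for t in state if t[0] == "In"}   # all o with In(o)
    state = (state - {("At", "B", l1)}) | {("At", "B", l2)}
    return state | {("At", o, l2) for o in inside}
```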
Handling Free Variables
1. Store all the possible values that variable o can take
2. Add constraints when receive new information
– ¬P(o); new claim P(K): remove K from the possible values of o
– P(o); new claim ¬P(K): add (K ≠ o) to the knowledge
[Figure: time vs. sequence length, comparing Partial State w/ Quantifiers against Partial State]
Filling Missing Actions
Example: "Bob woke up. Bob took a shower."
Missing action (solution): "Bob went to the bathroom."
1. Build the tree representing the text before the missing action
2. Build the tree representing the text after the missing action
3. If a state of the left tree conflicts with the initial state of the right tree, find actions that remove the contradiction
Future work: rank the candidate actions.
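The conflict-resolution check can be sketched with a STRIPS-style action table; the action names, the (pre, add, delete) encoding, and the Bob example facts are all hypothetical illustrations of the idea, not the system's actual representation.

```python
# Sketch of the gap-filling check: keep candidate actions that are
# applicable after the left tree and whose effects satisfy the right
# tree's initial state. Actions are hypothetical (pre, add, delete) sets.
def candidates(left_state, right_init, actions):
    keep = []
    for name, (pre, add, delete) in actions.items():
        if pre <= left_state:                 # applicable after left tree
            result = (left_state - delete) | add
            if right_init <= result:          # no conflict with right tree
                keep.append(name)
    return keep
```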
Conclusions and Future Work
Done:
– Framework for representing a text and a reasoning algorithm for answering queries
– Approximate reasoning algorithm (sampling) (Hajishirzi & Amir, AAAI 07; UAI 08)
Future work:
– Comparing the performance of the whole system with other approaches
– Definition of deterministic actions (preconditions and effects)
– More accurate disambiguation technique
Thank You
Questions?
References
Alshawi, H. (1992). The Core Language Engine, Cambridge, MA: MIT Press.
Hobbs, J. R., Stickel, M. E., Appelt, D. E., & Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63, 69-142.
Manning, C. D. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Singh, P., Lin, T., Mueller, E. T., Lim, G., Perkins, T., & Zhu,W. L. (2002). Open Mind Common Sense: Knowledge acquisition from the general public. In Lecture Notes in Computer Science: Vol. 2519. On the Move to Meaningful Internet Systems. Berlin: Springer.
Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Action Declarations
WakeUp(John, Bed)
  Pre: ¬Awake(John), LyingOn(John, Bed)
  Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John)
  1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)
  2. Pre: OnLight(room), At(John, room); Eff: ¬OnLight(room)
Eat(John, Food)
  1. Pre: Hungry(John), At(John, room), At(Food, room); Eff: ¬Hungry(John), ¬At(Food, room)
  2. Pre: ¬Hungry(John), At(John, room), At(Food, room); Eff: ¬At(Food, room)
Move(John, Location1, Location2) (simplified)
  1. Walk(John, Location1, Location2)
  2. Drive(John, Location1, Location2)
Path1: WakeUp(John, Bed). SwitchLight(John). Eat(John, Food).

Action declarations used:
WakeUp(John, Bed): Pre: ¬Awake(John), LyingOn(John, Bed); Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John): 1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)
Eat(John, Food): 1. Pre: Hungry(John), At(John, room), At(Food, room); Eff: ¬Hungry(John), ¬At(Food, room)

Progression (forward) and regression (backward) yield the states along the path:
t0: ¬Awake(John), LyingOn(John, Bed), At(John, room), ¬LightOn(room), At(Food, room), Hungry(John)
t1: Awake(John), ¬LyingOn(John, Bed), At(John, room), ¬LightOn(room), At(Food, room), Hungry(John)
t2: Awake(John), ¬LyingOn(John, Bed), At(John, room), LightOn(room), At(Food, room), Hungry(John)
t3: Awake(John), ¬LyingOn(John, Bed), At(John, room), LightOn(room), ¬At(Food, room), ¬Hungry(John)
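The forward (progression) half of this update can be sketched as follows; literals are plain strings with "~" marking negation, a hypothetical encoding chosen only for the illustration.

```python
# Sketch of progression through one deterministic action: the action's
# preconditions are asserted into the partial state, then each effect
# literal replaces its negation.
def progress(state, pre, eff):
    state = set(state) | set(pre)      # preconditions must hold before
    for lit in eff:
        neg = lit[1:] if lit.startswith("~") else "~" + lit
        state.discard(neg)             # drop the contradicted literal
        state.add(lit)
    return state
```

Regression would run the analogous update backward from a query to time 0.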
Partial Grounding
Ground free variables only when necessary.
Text: AtHand(John, Glass). Move(John, Kitchen, Bedroom).
Move(John, Kitchen, Bedroom)
  Pre: At(John, Kitchen), AtHand(John, physobj), At(physobj, Kitchen)
  Eff: At(John, Bedroom), ¬At(John, Kitchen), At(physobj, Bedroom), ¬At(physobj, Kitchen)
Resulting state: AtHand(John, Glass), At(John, Kitchen), At(Glass, Kitchen), AtHand(John, ?), counter = ∞
If followed by "He took his wallet out of his pocket.": ? == Wallet, counter = counter − 1
Remove "?" when counter = 0
Our Specific Contributions
Understanding spatial texts
Understanding texts by combining logical and probabilistic representations of commonsense knowledge
– Representation of ambiguities and uncertainties in text
– Efficient path-based algorithm
Acquisition of object location probabilities
Path1: WakeUp(John, Bed). SwitchLight(John).

WakeUp(John, Bed): Pre: ¬Awake(John), LyingOn(John, Bed); Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John): 1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)

Updating world states and propagating back ("room" is a free variable):
t0: ¬Awake(John), LyingOn(John, Bed), At(John, room), ¬OnLight(room)
t1: Awake(John), ¬LyingOn(John, Bed), At(John, room), ¬OnLight(room)
t2: Awake(John), ¬LyingOn(John, Bed), At(John, room), OnLight(room)

Efficient:
1. Partial representation of states
2. Partial groundings of actions
P(Path_i)
[Figure: states s0–s3 connected by actions a1–a3, each action branching into disambiguated actions da_t^i]
Compute the probability of each transition:
Path_i = da_1^i, da_2^i, da_3^i
P(Path_i) = Π_t P(da_t^i | a_t, s_{t−1})
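The per-path product can be sketched in a few lines; the transition table below is a hypothetical stand-in for the WordNet-derived transition probabilities.

```python
# Sketch of the path-probability product: each step contributes
# P(da_t | a_t, s_{t-1}) looked up in a (hypothetical) transition table.
def p_path(transitions, table):
    prob = 1.0
    for da, a, s in transitions:       # (disambiguated action, action, state)
        prob *= table[(da, a, s)]
    return prob
```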