Text Understanding through Probabilistic Reasoning about Actions
Transcript of Text Understanding through Probabilistic Reasoning about Actions
IBM T.J. Watson Research Center
© 2007 IBM Corporation
Text Understanding through Probabilistic Reasoning about Actions
Hannaneh Hajishirzi, Erik T. Mueller
IBM T.J. Watson Research Center
Problem
Understanding a text and answering questions
A fundamental problem in Natural Language Processing and Linguistics
Very hard to solve (especially by machines)
Imagine if computers could understand text
Outline: Problem + Use cases | Approaches | Our Approach: Representation, Inference | Future work
Help Desk
User text about the problem
Text Understanding System
Commonsense Reasoning
Solutions
Problem: I'm having trouble installing Notes. I got error message 1. How do I solve it?
Yes, you will get error message 1 if there is another Notes installed.
You must first uninstall Notes. Then, when you run setup, Notes will be installed.
Commanding a Robot
Go one block ahead. Then, turn right. Take the keys.
Open the door.
Query: Where is the robot? Is the door open?
Initial states
Question Answering Systems
Question: Where was President Bush two years ago?
Ask.com
President Bush said, "Two hundred thirty-one years ago, …..
Applications at IBM:
Playing Jeopardy
Natural language input to semantic engine
General Solution to Text Understanding Task
(1) Framework for representing the content of text
(2) Algorithms for reasoning (based on the representation)
Approaches to Natural Language Processing
Machine learning and statistical approaches (Manning & Schütze, 1999)
– Disadvantages: weak on semantics; require training data
Logical approaches (Alshawi, 1992; Hobbs, 1993)
– Disadvantage: unable to represent uncertainties in text/knowledge
Our approach
Our Approach
Represent sentences using a logical framework + probabilities
– Each sentence states properties or actions:
• Property: a statement about the world
• Action: a change in the world
– Probabilities: uncertainty and ambiguity
Algorithms for stochastic inference
Potential Open Problems
Translating text into a logical representation
– Represent sentences with actions
– Disambiguate sentences using probabilities
– Represent prior knowledge
Answer queries using probabilistic reasoning in logical framework
– Efficient algorithms
– Fill in missing actions
Representation
Text Level: …John woke up. He flipped the light switch. He had his breakfast. He went to work…
↓ Translation to actions
Action Level: WakeUp(John, Bed). SwitchLight(John). Eat(John, Food). Move(John, Work)
Elements in our Representation
Variables: object; agent: object; physobj: object; location; …
Constants: John: agent; Bedroom: room; HotelRoom: room; Work: location; …
Predicates: At(agent, location), Hungry(agent), Awake(agent), LightOn(room), …
World state example: At(John, Work), ¬LyingOn(John, Bed), Hungry(John), ¬OnLight(HotelRoom), …
Text: WakeUp(John, Bed). Switch(John, Light). Eat(John, Food). Move(John, Work)
Text Representation
Transition: stochastic choice of deterministic execution
[Figure: states s0–s3 linked by transitions; each transition carries an ambiguous action (move, make, memorize) that resolves to disambiguated actions (e.g., run, drive, walk; throw, cook, build; study, review, absorb)]
Action Declarations
Deterministic actions have preconditions and effects:
WakeUp(John, Bed)
  Pre: ¬Awake(John), LyingOn(John, Bed)
  Eff: Awake(John), ¬LyingOn(John, Bed)
Probabilistic actions are stochastic choices among primitives:
Move(John, Location1, Location2) (simplified)
  1. Walk(John, Location1, Location2)
  2. Drive(John, Location1, Location2)
– Assumption: the description (preconditions and effects) of basic primitives (e.g., walk) is known
– Goal: find the primitives related to a probabilistic action and disambiguate them via transition probabilities
Probabilistic Action
Determining each transition using WordNet (Fellbaum, 1998)
Assign transition probabilities by calculating the probability of each primitive from the context of the sentence:
– Go(John, Work) → Walk(John, Work) or Drive(John, Work)
– Compute P(Walk | work), P(Drive | work)
[Figure: WordNet troponym tree rooted at "go, move": drive (test drive), fly (soar, billow, hover), run (skitter, rush), walk (march, countermarch, step)]
Disambiguation Algorithm
Train set (Lillian Lee):
– (noun, verb, frequency) for the 1000 most popular nouns
Test set (SemCor, Senseval):
– (object, siblings of the verb); label: the verb in the sentence
Goal: P(sibling verb | object) for each sentence
– Compute freq(sibling verb, object) / freq(object)
– If "object" is not in the train set:
  • Replace object with hypernym(object), e.g. replace "lady" with "woman"
– If (object, candidate verb) is not in the train set:
  • Find nouns similar to "object" (Sims):
    P(verb | noun) = (1/Z) Σ_{sim ∈ Sims(object)} P(verb | sim) · Dist(sim, object)
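The frequency-ratio step with hypernym backoff can be sketched as follows. The toy (noun, verb, frequency) table and the hypernym map are hypothetical stand-ins for the real train set and a WordNet lookup, not the actual data.

```python
# Sketch of the disambiguation backoff (toy data, not the real train set):
# P(verb | object) = freq(verb, object) / freq(object), backing off to a
# hypernym when the object is unseen in the train set.
FREQ = {
    ("tea", "cook"): 42, ("tea", "make"): 37,
    ("tea", "throw"): 16, ("tea", "dip"): 5,
}
HYPERNYM = {"lady": "woman"}  # stand-in for a WordNet hypernym lookup

def p_verb_given_object(verb, obj):
    if not any(o == obj for (o, _) in FREQ):
        obj = HYPERNYM.get(obj, obj)          # back off: lady -> woman
    total = sum(f for (o, _), f in FREQ.items() if o == obj)
    return FREQ.get((obj, verb), 0) / total if total else 0.0
```

The similarity-weighted sum over Sims(object) would replace the zero fallback in a fuller version.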
Some Results
Probability distribution over candidate verbs:

noun / verb          verb: prob   verb: prob    verb: prob       verb: prob
tea / make           Cook: .42    Make: .37     Throw: .16       Dip: .05
pattern / memorize   Study: .35   Review: .22   Absorb: .18      Memorize: .17
lady / know          Know: .25    Feel: .25     Experience: .25  Catch: .25

Accuracy: Test set 1, Test set 2
Prior Knowledge
Knowledge base for state constraints:
– At(agent, location1), location1 ≠ location2 ⇒ ¬At(agent, location2)
– AtHand(agent, physobj) ⇒ ¬OnFloor(physobj)
Bayes net or Markov network to represent dependencies:
– P(Hungry(agent) | Eat(agent, food)) = .8
– P(Drive(agent, loc1, loc2) | distance(loc1, loc2) > 1m) = .7
Probabilistic Open Mind (Singh et al., 2002):
Open Mind: You can often find “Object” in “Location”
– "You can often find a bed in a bedroom"
– "You can often find a bed in a hotel room"
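The state constraints above can be sketched as simple consistency checks over a world state. The encoding of facts as tuples is a hypothetical illustration, not the system's actual representation.

```python
# Sketch (hypothetical encoding): a world state is a set of true facts,
# and each state constraint is a predicate that must hold in every state.
def unique_location(state):
    # At(agent, l1), l1 != l2  =>  not At(agent, l2)
    agents = [t[1] for t in state if t[0] == "At"]
    return len(agents) == len(set(agents))

def in_hand_not_on_floor(state):
    # AtHand(agent, obj)  =>  not OnFloor(obj)
    held = {t[2] for t in state if t[0] == "AtHand"}
    return not any(t[0] == "OnFloor" and t[1] in held for t in state)
```

A reasoner would reject any candidate state where either check fails.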
Acquisition of Object Location Probabilities
Open Mind: You often find "Object" in "Location"
Goal: P(object in location)
– P(bed in bedroom) > P(bed in hotelroom) > P(bed in hospital)
Method:
– Extract an objects list (1600 objects) and a locations list (2675 locations)
– Use a corpus of American literature stories (downloaded from Project Gutenberg)
– Compute correlations between objects and locations:
  • Probability: P(Near(object, location) | object) — we used this
– Cross-reference probabilities with Open Mind and normalize
(Some) results:
– P(bed in bedroom) = 0.5, P(bed in hotelroom) = 0.33, P(bed in hospital) = 0.17
– Add missing assertions to Open Mind suggested by the corpus
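The normalization step can be sketched as below; the near-counts are hypothetical toy numbers standing in for the corpus statistics, chosen only so the ratios match the reported 0.5 / 0.33 / 0.17 split.

```python
# Toy reconstruction of the normalization step: corpus near-counts are
# turned into P(object in location) over the candidate locations.
near_counts = {("bed", "bedroom"): 30, ("bed", "hotelroom"): 20,
               ("bed", "hospital"): 10}  # hypothetical corpus counts

def location_probs(obj):
    # P(object in location) ~ P(Near(object, location) | object), normalized
    pairs = {loc: c for (o, loc), c in near_counts.items() if o == obj}
    total = sum(pairs.values())
    return {loc: c / total for loc, c in pairs.items()}
```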
Our Approach
Represent sentences using event calculus (a logical framework) + probabilities
– Each sentence states properties or actions:
• Property: a statement about the world
• Action: a change in the world
– Probabilities: uncertainty and ambiguity
Algorithms for stochastic inference
Inference Algorithm
Goal: Answer a question related to text
– Question format: P(Query = true) = ?, where Query is a logical formula
Algorithm: Consider all possible paths from root to leaves.
[Figure: tree of disambiguated actions da_t^i, one branch per path i]
For each path:
1. Compute P(Path_i)
2. Compute P(Query | Path_i)
P(Query) = Σ_i P(Path_i) · P(Query | Path_i)
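The total-probability step at the heart of the algorithm can be sketched in a few lines; the two-path example numbers in the test are hypothetical.

```python
# Sketch of the total-probability step: P(Query) is accumulated over all
# root-to-leaf paths as sum_i P(Path_i) * P(Query | Path_i).
def p_query(paths):
    """paths: list of (P(Path_i), P(Query | Path_i)) pairs."""
    return sum(p_path * p_q for p_path, p_q in paths)
```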
Updating world states
Path1: WakeUp(John, Bed). SwitchLight(John).
Compute P(Query | Path_i):
– Updating world states forward through time 0, time 1, time 2
– Propagating information back
– Check for conflicts at each time
[Figure: state spaces at times 0–2, with Query 1 marked]
Query 1: Certain answer
Propagating back
Compute P(Query | Path_i):
[Figure: state spaces over time, with Query 1 and Query 2 marked; updating world states and propagating back]
Query 2: Regress Query 2 to time 0; use prior knowledge
Example: P(At(John, Bedroom)_0) = ? = P(In(Bed, Bedroom)) = … from prior knowledge
Efficiency of the Algorithm
Naïve algorithm (complete state):
– Truth assignment to all the possible predicates
Our algorithm (partial state):
– Truth assignment to only those predicates useful for understanding the text
[Figure: running time vs. number of predicates for the Partial State and Complete State algorithms]
Free variables and quantifiers
No need to enumerate all the possible cases
Example
– No need to enumerate all the possible permutations of objects inside the briefcase
MvWithObj(B, l1, l2):
  Pre: At(B, l1), ¬At(B, l2), ∀o: In(o)
  Eff: ¬At(B, l1), At(B, l2), At(o, l2)
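The quantified effect can be sketched as follows: every object o with In(o) moves along with the briefcase B, without enumerating subsets of contents up front. The tuple encoding of facts is a hypothetical illustration.

```python
# Sketch of applying the quantified effect without enumeration: facts are
# tuples, and every o with In(o) is moved to l2 along with B.
def mv_with_obj(state, l1, l2):
    assert ("At", "B", l1) in state and ("At", "B", l2) not in state
    inside = {t[1] for t in state if t[0] == "In"}   # all o with In(o)
    state = (state - {("At", "B", l1)}) | {("At", "B", l2)}
    return state | {("At", o, l2) for o in inside}
```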
Handling Free Variables
1. Store all the possible values that variable o can take
2. Add constraints when receive new information
– ¬P(o); new claim P(K): remove K from the possible values of o
– P(o); new claim ¬P(K): add (K ≠ o) to the knowledge
[Figure: time vs. sequence length, comparing Partial State w/ Quantifiers against Partial State]
Filling Missing Actions
Example: "Bob woke up. Bob took a shower."
Missing action (solution): "Bob went to the bathroom."
1. Build the tree representing the text before the missing action
2. Build the tree representing the text after the missing action
3. If a state of the left tree conflicts with the initial state of the right tree, find actions that remove the contradiction
Future work: rank the candidate actions.
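The conflict-resolution check can be sketched with a STRIPS-style action table; the action names, the (pre, add, delete) encoding, and the Bob example facts are all hypothetical illustrations of the idea, not the system's actual representation.

```python
# Sketch of the gap-filling check: keep candidate actions that are
# applicable after the left tree and whose effects satisfy the right
# tree's initial state. Actions are hypothetical (pre, add, delete) sets.
def candidates(left_state, right_init, actions):
    keep = []
    for name, (pre, add, delete) in actions.items():
        if pre <= left_state:                 # applicable after left tree
            result = (left_state - delete) | add
            if right_init <= result:          # no conflict with right tree
                keep.append(name)
    return keep
```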
Conclusions and Future Work
Done:
– Framework for representing a text and a reasoning algorithm for answering queries
– Approximate reasoning algorithm (sampling) (Hajishirzi & Amir, AAAI 07; UAI 08)
Future work:
– Comparing the performance of the whole system with other approaches
– Definition of deterministic actions (preconditions and effects)
– More accurate disambiguation technique
Thank You
Questions?
References
Alshawi, H. (1992). The Core Language Engine, Cambridge, MA: MIT Press.
Hobbs, J. R., Stickel, M. E., Appelt, D. E., & Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63, 69-142.
Manning, C. D. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Singh, P., Lin, T., Mueller, E. T., Lim, G., Perkins, T., & Zhu,W. L. (2002). Open Mind Common Sense: Knowledge acquisition from the general public. In Lecture Notes in Computer Science: Vol. 2519. On the Move to Meaningful Internet Systems. Berlin: Springer.
Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Action Declarations
WakeUp(John, Bed)
  Pre: ¬Awake(John), LyingOn(John, Bed)
  Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John)
  1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)
  2. Pre: OnLight(room), At(John, room); Eff: ¬OnLight(room)
Eat(John, Food)
  1. Pre: Hungry(John), At(John, room), At(Food, room); Eff: ¬Hungry(John), ¬At(Food, room)
  2. Pre: ¬Hungry(John), At(John, room), At(Food, room); Eff: ¬At(Food, room)
Move(John, Location1, Location2) (simplified)
  1. Walk(John, Location1, Location2)
  2. Drive(John, Location1, Location2)
Path1: WakeUp(John, Bed). SwitchLight(John). Eat(John, Food).

Action declarations used:
WakeUp(John, Bed): Pre: ¬Awake(John), LyingOn(John, Bed); Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John): 1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)
Eat(John, Food): 1. Pre: Hungry(John), At(John, room), At(Food, room); Eff: ¬Hungry(John), ¬At(Food, room)

Progression (forward) and regression (backward) yield the states along the path:
t0: ¬Awake(John), LyingOn(John, Bed), At(John, room), ¬LightOn(room), At(Food, room), Hungry(John)
t1: Awake(John), ¬LyingOn(John, Bed), At(John, room), ¬LightOn(room), At(Food, room), Hungry(John)
t2: Awake(John), ¬LyingOn(John, Bed), At(John, room), LightOn(room), At(Food, room), Hungry(John)
t3: Awake(John), ¬LyingOn(John, Bed), At(John, room), LightOn(room), ¬At(Food, room), ¬Hungry(John)
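The forward (progression) half of this update can be sketched as follows; literals are plain strings with "~" marking negation, a hypothetical encoding chosen only for the illustration.

```python
# Sketch of progression through one deterministic action: the action's
# preconditions are asserted into the partial state, then each effect
# literal replaces its negation.
def progress(state, pre, eff):
    state = set(state) | set(pre)      # preconditions must hold before
    for lit in eff:
        neg = lit[1:] if lit.startswith("~") else "~" + lit
        state.discard(neg)             # drop the contradicted literal
        state.add(lit)
    return state
```

Regression would run the analogous update backward from a query to time 0.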
Partial Grounding
Ground free variables only when necessary.
Text: AtHand(John, Glass). Move(John, Kitchen, Bedroom).
Move(John, Kitchen, Bedroom)
  Pre: At(John, Kitchen), AtHand(John, physobj), At(physobj, Kitchen)
  Eff: At(John, Bedroom), ¬At(John, Kitchen), At(physobj, Bedroom), ¬At(physobj, Kitchen)
Resulting state: AtHand(John, Glass), At(John, Kitchen), At(Glass, Kitchen), AtHand(John, ?), counter = ∞
If followed by "He took his wallet out of his pocket.": ? == Wallet, counter = counter − 1
Remove "?" when counter = 0
Our Specific Contributions
Understanding spatial texts
Understanding texts by combining logical and probabilistic representations of commonsense knowledge
– Representation of ambiguities and uncertainties in text
– Efficient path-based algorithm
Acquisition of object location probabilities
Path1: WakeUp(John, Bed). SwitchLight(John).

WakeUp(John, Bed): Pre: ¬Awake(John), LyingOn(John, Bed); Eff: Awake(John), ¬LyingOn(John, Bed)
SwitchLight(John): 1. Pre: ¬OnLight(room), At(John, room); Eff: OnLight(room)

Updating world states and propagating back ("room" is a free variable):
t0: ¬Awake(John), LyingOn(John, Bed), At(John, room), ¬OnLight(room)
t1: Awake(John), ¬LyingOn(John, Bed), At(John, room), ¬OnLight(room)
t2: Awake(John), ¬LyingOn(John, Bed), At(John, room), OnLight(room)

Efficient:
1. Partial representation of states
2. Partial groundings of actions
P(Path_i)
[Figure: states s0–s3 connected by actions a1–a3, each action branching into disambiguated actions da_t^i]
Compute the probability of each transition:
Path_i = da_1^i, da_2^i, da_3^i
P(Path_i) = Π_t P(da_t^i | a_t, s_{t−1})
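The per-path product can be sketched in a few lines; the transition table below is a hypothetical stand-in for the WordNet-derived transition probabilities.

```python
# Sketch of the path-probability product: each step contributes
# P(da_t | a_t, s_{t-1}) looked up in a (hypothetical) transition table.
def p_path(transitions, table):
    prob = 1.0
    for da, a, s in transitions:       # (disambiguated action, action, state)
        prob *= table[(da, a, s)]
    return prob
```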